On Tue, Nov 29, 2016 at 08:45 PM, Don Ward wrote:

I should have been a bit clearer when I said my program didn’t work. It does, but only for fixed strings: there is none of the RE special character magic. And, I agree, the crucial question is how to construct a pattern from a string that treats the special characters as special characters, rather than just literals.

In passing, write( type( )) writes string, whereas write(type( )) writes pattern, which isn’t quite what I expected.

I had quite high hopes for Arbno(), but soon realised that it wanted a pattern for its argument, not a string and, even when I fed it with a variable that had the type of pattern, it still didn’t work how I might have expected it to. At that stage, I asked my original question.

If Clint’s “option #1” is “write a library procedure that parses the regex and builds the corresponding pattern”, I wonder whether Arbno() might be a suitable interface: i.e. if it’s a pattern already, do what it does now, otherwise turn the string into a pattern and then do it. Perhaps a separate procedure might be clearer.

The reader with no time for trivia may profitably skip the rest of this message ...

I may have found a use for Succeed: If I modify my program to be as below (additions in red: the reason for the strange comment at the end will be clear in a moment)

    procedure main(args)
       local f, line, re := pop(args) || Succeed()
       write(type(re))
       every f := open(!args, "r") do {
          every line := !f do {
             if ( line ?? re ) then write(line)
          }
       }
    end
    #[dne][edn][den]

If I use grep on this program source I get

    bash-3.2$ grep "[dne][edn][den]" gerp.icn
       local f, line, re := pop(args) || Succeed()
    end
    #[dne][edn][den]

as expected: grep has found “eed”, “end” and the regular expression itself in the final comment. Whereas, if I use the program on its own source code I get

    bash-3.2$ ./gerp "[dne][edn][den]" gerp.icn
    pattern
    #[dne][edn][den]

showing that although I have a pattern, it isn’t interpreting the special characters.

Don, this isn't run time regex. Regex literals are preprocessed before the compile. Look at unicon -E gerp.icn to see how it is the preprocessor phase that expands the special meanings, by generating Unicon function expressions.

The program patstr.icn

    procedure main(argv)
       local f, line, re := pop(argv)
       a :=
       write(type(a))
       b :=
       write(type(b), " ", string(b))
       p := pattern_concat("", re)
       write(type(p))
       every f := open(!argv, "r") do {
          every line := !f do {
             if line ?? p then write(line)
          }
       }
    end

expands to

    prompt$ unicon -E patstr.icn
    Parsing patstr.icn: .
    /home/btiffin/unicon/bin/icont -c -E -O patstr.icn /tmp/uni13423387
    patstr.icn:
    #line 0 "/tmp/uni13423387"
    #line 0 "patstr.icn"
    procedure main(argv );
    local f, line, re;
    #line 13 "patstr.icn"
    re := pop(argv)
    a := pattern_concat("abc", (Arbno('abc')));
    write(type(a));
    b := pattern_concat("abc", Any('abc'));
    write(type(b), " ", string(b));
    p := pattern_concat("", re);
    write(type(p));
    every f := open(!argv, "r") do {
    every line := !f do {
    if( "" ? pattern_match( line, p)) then write(line)
    }
    };
    end
    No errors

The Unicon VM never sees the literals; it sees the results of pattern_xxx constructors and functions, along with the use of Arbno() to handle the Kleene stars and character set square brackets.
unicon -E will show more clearly what is going on, and the separation of compile time and runtime behaviour, and what is going to be allowed with regex literals.

Cheers,
Brian

If I miss off the “|| Succeed()” from the initialisation of re and try again I get

    string
    #[dne][edn][den]

I still get a pattern match, even though it’s a string not a pattern, but it’s the literal string that is matching. Therefore Succeed() may be used to turn a string into a pattern! Unfortunately not in a useful way.

On 29 Nov 2016, at 22:27, Jeffery, Clint (jeffe...@uidaho.edu) wrote:

My thanks to Don, Jay, and anyone else who is trying out stuff related to patterns. I am on the road ATM but will work on improving the diagnostics related to Jay's experiments.

Regarding Don's original request and Jay's comments on it: backquotes in patterns is not a full "eval" interpreter that will take arbitrary Icon strings and turn them into code. Maybe we need that, and maybe someone will build it some day. In the meantime, after figuring out the best workarounds that may be available, you can judge for yourself whether the patterns are still useful, or whether they remain unfinished business.

The basic question is: given a regular expression supplied as string data s, how best should we construct a corresponding pattern. The answer sadly is not . The Unicon translator has a parser for regular expressions and emits pattern function calls for them, but we want to do it from the Unicon VM. Options include:

1. write a library procedure that parses the regex and builds the corresponding pattern;
2. write a library procedure that invokes the translator to do the work and use dynamic loading to get the code loaded;
3. extend the language with a new built-in that does the same or similar;
4. extend the backquotes operator to do what we want here; or
5. use another idea that you think up.

Don: great minds think alike.
When I started to update the Unicon book to talk about patterns, I immediately figured we needed to update the "grep" example to use patterns, and came up against the same issue you're asking about. I haven't implemented a solution yet, but perhaps we should do option #1 and see what that looks like.

Cheers,
Clint

------------------------------------
From: Jay Hammond
Sent: Tuesday, November 29, 2016 1:55:07 PM
To: Don Ward; unicon-group@lists.sourceforge.net
Subject: Re: [Unicon-group] Converting strings to patterns

Hi Don,

I tried running your program. To get it to do anything I had to change line 2, separate the local declaration and the assignment. To clarify, repat is a (new) variable that I intend to hold a pattern.

    procedure main(args)
       local f, line, re
       re := pop(args)
       write(re)
       repat := re
       every f := open(!args, "r") do {
          while ( line := read(f) ) do {
             if line ?? repat then write(line)
          }
       }
    end

I created qqq.txt with the lines

    QQQ
    qqq
    cqcqcq

and ran

    testpat QQQ qqq.txt

after compiling testpat.icn. Output was QQQ then the contents of qqq.txt, as if repat always matches. (it has the null value??)

I tried forcing repat to be a pattern (utr18 says that patterns are composed of strings concatenated or alternated) so I tried

    repat := re .| fail()
    repat := re .| re

but the pattern building process gave me node errors at compile time.

    dopat2.icn:6: # "re": syntax error (237;349)
    File dopat2.icn; Line 16 # traverse: undefined node type

line 16 is the line after end in main, i.e. program source end.

I tried using the -f s option at the compile step, so as to use unevaluated expressions in patterns like

    repat := < `re` >    # that syntax ought to force a pattern!

node traversal errors again. And the backquotes were not recognised. Perhaps I put the -f s option in the wrong place?
I also tried

    repat := < Q > || < Q > || < Q >

    dopat2.icn:6: # "repat": invalid argument in augmented assignment
    File dopat2.icn; Line 16 # traverse: undefined node type

so it is not considering || to be pattern concatenation

    repat := < Q || Q || Q >

gave the same error!

So although UTR18 seems to give options for converting strings to patterns I have not had any luck so far.

Jay

On 29/11/2016 14:33, Don Ward wrote:

Here is a very simple (and simple minded) grep program. The idea being to apply a Unicon regexp pattern to a series of files, just like grep

    procedure main(args)
       local f, line, re := pop(args)

       every f := open(!args, "r") do {
          every line := !f do {
             if line ?? re then write(line)
          }
       }
    end

Of course, it doesn’t work because in line 6 I have a string instead of a pattern. Is there any way to convert the contents of re from a string to a pattern?

After reading UTR18 again (and again), I’ve come to the conclusion that there isn’t any way to do it. The pertinent extract from UTR18 is in section 4.5.3 “Limitations due to lack of eval()”. But before I give up on the idea entirely, I thought I’d check to see if my understanding is correct.

Don

------------------------------------------------------------------------------
_______________________________________________
Unicon-group mailing list
Unicon-group@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/unicon-group
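
For readers following the thread, here is a rough, untested sketch of what Clint's option #1 might look like for a very small subset of regex syntax: literal text plus [...] character classes, which is enough for the "[dne][edn][den]" example. The procedure name str2pat is hypothetical and is not part of Unicon or its library. The sketch relies only on pattern_concat() and Any(), both of which appear in the unicon -E output quoted in the thread. It assumes that pattern_concat() also accepts a pattern as its first argument, which the thread does not confirm.

```unicon
# Hypothetical helper, not part of Unicon: build a pattern at run time
# from a string made of literal text and [...] character classes only.
# No *, +, ?, |, ranges such as a-z, or escapes are handled.
procedure str2pat(s)
   local p, chars
   p := ""
   s ? {
      while not pos(0) do {
         if ="[" then {
            # Everything up to the closing bracket becomes a cset for Any();
            # an unterminated class takes the rest of the string.
            chars := tab(find("]")) | tab(0)
            move(1)
            # Assumption: pattern_concat() accepts a pattern on the left.
            p := pattern_concat(p, Any(cset(chars)))
         }
         else
            p := pattern_concat(p, move(1))   # one literal character
      }
   }
   return p
end

procedure main(args)
   local f, line, p
   p := str2pat(pop(args)) | stop("usage: gerp regex file ...")
   every f := open(!args, "r") do {
      every line := !f do
         if line ?? p then write(line)
      close(f)
   }
end
```

Handling the Kleene star would presumably mean wrapping the previous piece in Arbno(), as the translator output does, but that needs a real parser that tracks the last piece separately and is left out here.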