On Tue, Nov 29, 2016 at 08:45 PM, Don Ward wrote:
I should have been a bit clearer when I said my program didn’t work. It does,
but only for fixed strings: there is none of the RE special character magic.
And, I agree, the crucial question is how to construct a pattern from a string
that treats the special characters as special characters, rather than just
literals.
In passing, write( type(  ))  writes string, whereas write(type(  )) writes 
pattern, which isn’t quite what I expected.
I had quite high hopes for Arbno(), but soon realised that it wanted a pattern 
for its argument, not a string and, even when I fed it with a variable that had 
the type of pattern, it still didn’t work how I might have expected it to. At 
that stage, I asked my original question. If Clint’s “option #1” is “write a 
library procedure that parses the regex and builds the corresponding pattern”, 
I wonder whether Arbno() might be a suitable interface: i.e. if it’s a pattern 
already, do what it does now, otherwise turn the string into a pattern and then 
do it. Perhaps a separate procedure might be clearer.
The reader with no time for trivia may profitably skip the rest of this message 
...
I may have found a use for Succeed(): if I modify my program to be as below 
(additions in red: the reason for the strange comment at the end will be clear 
in a moment)

procedure main(args)
    local f, line, re := pop(args) || Succeed()
    write(type(re))
    every f := open(!args, "r") do {
        every line := !f do {
            if ( line ?? re ) then write(line)
        }
    }
end
#[dne][edn][den]

If I use grep on this program source I get

bash-3.2$ grep "[dne][edn][den]" gerp.icn
    local f, line, re := pop(args) || Succeed()
end
#[dne][edn][den]

as expected: grep has found “eed”, “end” and the regular expression itself in
the final comment. Whereas, if I use the program on its own source code I get

bash-3.2$ ./gerp "[dne][edn][den]" gerp.icn
pattern
#[dne][edn][den]

showing that although I have a pattern, it isn’t interpreting the special
characters.

Don, this isn't run-time regex. Regex literals are preprocessed before the 
compile.

Look at the output of

unicon -E gerp.icn

to see how it is the preprocessor phase that expands the special meanings, by 
generating Unicon function expressions.

The program patstr.icn

procedure main(argv)
    local f, line, re := pop(argv)

    a := <abc[abc]*>
    write(type(a))

    b := <abc[abc]>
    write(type(b), " ", string(b))

    p := pattern_concat("", re)
    write(type(p))

    every f := open(!argv, "r") do {
        every line := !f do {
            if line ?? p then write(line)
        }
    }

end

Expands to

prompt$ unicon -E patstr.icn
Parsing patstr.icn: .
/home/btiffin/unicon/bin/icont -c  -E  -O patstr.icn /tmp/uni13423387
patstr.icn:
#line 0 "/tmp/uni13423387"
#line 0 "patstr.icn"
procedure main(argv    );
        local f, line, re;

#line 13 "patstr.icn"
                       re := pop(argv)

    a := pattern_concat("abc", (Arbno('abc')));
    write(type(a));

    b := pattern_concat("abc", Any('abc'));
    write(type(b), " ", string(b));

    p := pattern_concat("", re);
    write(type(p));

        every f := open(!argv, "r") do {
                every line := !f do {
                        if( "" ? pattern_match( line,    p))  then write(line)
                }
        };

end
No errors

The Unicon VM never sees the literals; it sees the results of the pattern_xxx 
constructors and functions, along with the use of Arbno() to handle the Kleene 
stars and character set square brackets.

unicon -E will show more clearly what is going on, and the separation of 
compile time and runtime behaviour, and what is going to be allowed with regex 
literals.

Cheers,
Brian
If I miss off the “|| Succeed()” from the initialisation of re and try again I 
get

string

#[dne][edn][den]

I still get a pattern match, even though it’s a string not a pattern, but it’s 
the literal string that is matching.
Therefore Succeed() may be used to turn a string into a pattern!  Unfortunately 
not in a useful way.
On 29 Nov 2016, at 22:27, Jeffery, Clint (jeffe...@uidaho.edu) wrote:

My thanks to Don, Jay, and anyone else who is trying out stuff related to 
patterns.  I am on the road ATM but will work on improving the diagnostics 
related to Jay's experiments.  Regarding Don's original request and Jay's 
comments on it: backquotes in patterns are not a full "eval" interpreter that 
will take arbitrary Icon strings and turn them into code.  Maybe we need that, 
and maybe someone will build it some day.  In the meantime, after figuring out 
the best workarounds that may be available, you can judge for yourself whether 
the patterns are still useful, or whether they remain unfinished business.
The basic question is: given a regular expression supplied as string data s, 
how best should we construct a corresponding pattern. The answer sadly is not . 
The Unicon translator has a parser for regular expressions and emits pattern 
function calls for them, but we want to do it from the Unicon VM. Options 
include: write a library procedure that parses the regex and builds the 
corresponding pattern; write a library procedure that invokes the translator to 
do the work and use dynamic loading to get the code loaded; extend the language 
with a new built-in that does the same or similar; extend the backquotes 
operator to do what we want here; or use another idea that you think up.
Don: great minds think alike. When I started to update the Unicon book to talk 
about patterns, I immediately figured we needed to update the "grep" example to 
use patterns, and came up against the same issue you're asking about.  I 
haven't implemented a solution yet, but perhaps we should do option #1 and see 
what that looks like.
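
Until something like option #1 exists, one possible stopgap is to sidestep
Unicon patterns altogether and use the regexp module from the Icon Program
Library, which parses a regular expression string at run time. This is a
swapped-in technique, not something proposed elsewhere in this thread. The
sketch below assumes that the IPL is installed and on the link path, that its
RePat() and ReFind() procedures behave as described in the IPL documentation,
and that the IPL regex dialect is close enough to grep's for the job at hand.

```unicon
link regexp      # IPL regular expression procedures (assumed to be available)

procedure main(args)
    local f, fname, line, re, pat

    re := pop(args) | stop("usage: gerp regex file ...")

    # RePat() parses the regex string once, at run time.
    # It fails if the string is not a syntactically valid regex.
    pat := RePat(re) | stop("gerp: bad regular expression: ", re)

    every fname := !args do {
        if not (f := open(fname, "r")) then {
            write(&errout, "gerp: cannot open ", fname)
            next
        }
        # ReFind() succeeds if pat matches anywhere in the line.
        while line := read(f) do
            if ReFind(pat, line) then write(line)
        close(f)
    }
end
```

Because the matching is done by ordinary procedures, neither the ?? operator
nor the pattern type is involved, so this says nothing about how a string
should be converted into a real Unicon pattern.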
Cheers,
Clint
------------------------------------
From: Jay Hammond 
Sent: Tuesday, November 29, 2016 1:55:07 PM
To: Don Ward; unicon-group@lists.sourceforge.net
Subject: Re: [Unicon-group] Converting strings to patterns
Hi Don,
I tried running your program. To get it to do anything I had to change line 2,
separating the local declaration and the assignment. To clarify, repat is a
(new) variable that I intend to hold a pattern.
procedure main(args)
    local f, line, re
    re := pop(args)

    write(re)

    repat := re 

    every f := open(!args, "r") do {
        while ( line := read(f) ) do {
            if line ?? repat then write(line)
        }
    }
end
I created qqq.txt with the lines
QQQ
qqq
cqcqcq
and ran testpat QQQ qqq.txt after compiling testpat.icn.
Output was
QQQ
then the contents of qqq.txt,
as if repat always matches. (It has the null value??)

I tried forcing repat to be a pattern (UTR18 says that patterns are composed of
strings concatenated or alternated), so I tried
    repat := re  .|  fail()
    repat := re  .|  re
but the pattern building process gave me node errors at compile time.
dopat2.icn:6: # "re": syntax error (237;349)
File dopat2.icn; Line 16 # traverse: undefined node type
Line 16 is the line after end in main, i.e. program source end.
I tried using the -f s option at the compile step, so as to use unevaluated 
expressions in patterns, like
    repat :=  <  `re` >    # that syntax ought to force a pattern!
Node traversal errors again. And the backquotes were not recognised. Perhaps I 
put the -f s option in the wrong place?
I also tried
    repat :=  < Q > ||   < Q > ||   < Q >
dopat2.icn:6: # "repat": invalid argument in augmented assignment
File dopat2.icn; Line 16 # traverse: undefined node type
so it is not considering || to be pattern concatenation.
    repat :=  <  Q  ||  Q ||  Q >
gave the same error!
So although UTR18 seems to give options for converting strings to patterns I 
have not had any luck so far.

Jay
On 29/11/2016 14:33, Don Ward wrote:
Here is a very simple (and simple minded) grep program. The idea being to apply 
a Unicon regexp pattern to a series of files, just like grep
procedure main(args)
    local f, line, re := pop(args)
    every f := open(!args, "r") do {
        every line := !f do {
            if line ?? re then write(line)
        }
    }
end
Of course, it doesn’t work because in line 6  I have a string instead of a 
pattern.
Is there any way to convert the contents of re from a string to a pattern?
After reading UTR18 again (and again), I’ve come to the conclusion that there 
isn’t any way to do it. 
The pertinent extract from UTR18 is in section 4.5.3 “Limitations due to lack 
of eval()”.
But before I give up on the idea entirely, I thought I’d check to see if my 
understanding is correct.
Don
        
------------------------------------------------------------------------------
_______________________________________________
Unicon-group mailing list
Unicon-group@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/unicon-group