Re: Regex surprises
Sean,

To follow up on what Brian said, you can also do the same thing on a single line with named captures:

    https://docs.raku.org/language/regexes#Named_captures

(This often isn't as good as Brian's method, since breaking things up as Brian showed often helps readability. But it can be a good option if you're trying to keep something concise.)

Using that syntax, your example goes from

    S:g[(x)|(y)] = $0 ?? x-replacement !! y-replacement

to

    S:g[$<x>=[x]|y] = $<x> ?? x-replacement !! y-replacement

which is pretty similar.

I hope that helps!

Best,
Daniel

--
Daniel Sockwell / codesections
Website: www.codesections.com
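As a rough, runnable illustration of the single-line named-capture form, here is a minimal sketch. The input string "xyxy" and the replacement strings "A" and "B" are made up for the example; they stand in for x-replacement and y-replacement.

```raku
# Hypothetical input and replacements, chosen only for illustration.
my $text = "xyxy";

# $<x> is only set when the first branch of the alternation matched,
# so its truthiness tells the two branches apart.
$text ~~ s:g[ $<x>=[x] | y ] = $<x> ?? "A" !! "B";

say $text;
```

This uses the mutating `s` form on a variable rather than `S ... given`, purely so the result is easy to inspect afterwards.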
Regex surprises
Hi Brian,

I would think this would become pretty easy. Consider:

    sub MAIN ( $replacement, :$x, :$y ) {
        my regex x { <{ $x }> }
        my regex y { <{ $y }> }
        S:g! <x> ! <{ $replacement }> ! if $x;
        S:g! <y> ! <{ $replacement }> ! if $y
    }

Is this what you are looking for or did I miss something?

> ...
> "y" ~~ /(x)|(y)/
>
> I would probably take advantage of the composability of Raku regexes, and do something like
>
>     my regex x { x }
>     my regex y { y }
>
> and then use
>
>     / <x> | <y> /
>
> and check for $<x> or $<y>
>
> e.g. this becomes
>
>     S:g[<x>|<y>] = $<x> ?? x-replacement !! y-replacement
Re: Regex surprises
> ...
> "y" ~~ /(x)|(y)/

I would probably take advantage of the composability of Raku regexes, and do something like

    my regex x { x }
    my regex y { y }

and then use

    / <x> | <y> /

and check for $<x> or $<y>

> S:g[(x)|(y)] = $0 ?? x-replacement !! y-replacement

e.g. this becomes

    S:g[<x>|<y>] = $<x> ?? x-replacement !! y-replacement

Brian
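The composed-regex approach can be sketched end to end as follows. This is only an illustration: the input "xyxy" and the replacement strings "A" and "B" are assumptions, not part of the original example.

```raku
# Named, lexically scoped regexes that can be reused inside larger patterns.
my regex x { x }
my regex y { y }

# Hypothetical input, chosen only for illustration.
my $text = "xyxy";

# <x> and <y> call the regexes above and capture under those names,
# so $<x> is defined only when the x branch matched.
$text ~~ s:g[ <x> | <y> ] = $<x> ?? "A" !! "B";

say $text;
```

The benefit over numbered groups is that the check in the replacement names the branch directly instead of relying on group positions.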
Re: Regex surprises
Raku removed all of the regex cruft that has accumulated over the years. (Much of that cruft was added by Perl.)

I'm not going to respond to the first part of your email, as I think it is an implementation artifact.

On Mon, Sep 12, 2022 at 3:06 PM Sean McAfee wrote:

> Hello--
>
> I stumbled across a couple of bits of surprising regex behavior today.
>
> First, consider:
>
>     S:g[ x { say 1 } ] = say 2 given "xx"
>
> I expected this to print 1, 2, 1, 2, but it prints 1, 1, 2, 2. So it
> looks like, in a global substitution like this, Raku doesn't look for
> successive matches and then evaluate the replacements as it goes, but finds
> all of the matches *first* and then works through the substitutions. In
> my actual problem I was mutating state in the regex code block, and then it
> didn't work because all of the mutations happened before even a single
> replacement was evaluated. Is it really meant to work this way?

Now the following is intentional.

Raku treats regexes as a domain-specific sublanguage. One of the ways it does that is by having each subexpression act as an independent subexpression.

Would you expect `/ (a) | (b) /` to act significantly differently to `if ($a) {$0 = ...} elsif ($b) {$0 = ...}`? (Where `$0` could be thought of as representing the first value on the stack, or similar.)

> Next, consider:
>
>     > "y" ~~ /(x)|(y)/
>     「y」
>      0 => 「y」
>
> y is in the second set of grouping parentheses, so I expected it to be in
> group 1, but it's in group 0. So it looks like the group index starts from
> 0 in every branch of an alternation. I do so much regex slinging I'm
> amazed it took me so long to discover this, if it's not a relatively recent
> change. I'm accustomed to being able to determine which alternation branch
> was matched by checking which group is defined (in other languages too, not
> just Raku). This kind of thing:
>
>     S:g[(x)|(y)] = $0 ?? x-replacement !! y-replacement
>
> I guess instead I need to do this:
>
>     S:g[x|y] = $/ eq 'x' ?? x-replacement !! y-replacement
>
> It seems very strange that I need to re-examine the match to know what
> matched. The match should be able to tell me what matched. Or is there
> perhaps some alternate way for me to tell which alternative matched?

Other languages, including Perl, have just added feature after feature to regexes without thinking about the regex language as a whole.

Raku started over from scratch. Larry then took the knowledge learned over decades of language design and applied it to regex.

Like I said, one of the things that Larry realized is that independent subexpressions should be independent.

If you really want to know how to determine which alternation matched, there are plenty of ways to do it.

    / $0 = (x) | $1 = (y) /

    / $<x> = x | $<y> = y /

    / x :my $*alternation = 0; | y :my $*alternation = 1; /

    / x :my $*replacement = ...; | y :my $*replacement = ...; /

That last one would allow you to remove the `??` `!!` from your code.

(I haven't been doing much with Raku for months, so there are likely some other methods I'm not thinking of.)
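One more way to drop the `??` `!!`, sketched here as a variation on the named-capture form rather than the dynamic-variable form listed above: give every branch its own name and look the replacement up by whichever name was captured. The `%replacement` hash, its values, and the input string are all hypothetical.

```raku
# Hypothetical lookup table: capture name => replacement text.
my %replacement = x => "A", y => "B";

# Hypothetical input, chosen only for illustration.
my $text = "xyxy";

# Exactly one named capture exists per match, so the first key of
# $/.hash names the branch that matched.
$text ~~ s:g[ $<x>=[x] | $<y>=[y] ] = %replacement{ $/.hash.keys[0] };

say $text;
```

This assumes every branch carries a named capture; a branch without one would leave `$/.hash` empty and the lookup undefined.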
Regex surprises
Hello--

I stumbled across a couple of bits of surprising regex behavior today.

First, consider:

    S:g[ x { say 1 } ] = say 2 given "xx"

I expected this to print 1, 2, 1, 2, but it prints 1, 1, 2, 2. So it looks like, in a global substitution like this, Raku doesn't look for successive matches and then evaluate the replacements as it goes, but finds all of the matches *first* and then works through the substitutions. In my actual problem I was mutating state in the regex code block, and then it didn't work because all of the mutations happened before even a single replacement was evaluated. Is it really meant to work this way?

Next, consider:

    > "y" ~~ /(x)|(y)/
    「y」
     0 => 「y」

y is in the second set of grouping parentheses, so I expected it to be in group 1, but it's in group 0. So it looks like the group index starts from 0 in every branch of an alternation. I do so much regex slinging I'm amazed it took me so long to discover this, if it's not a relatively recent change. I'm accustomed to being able to determine which alternation branch was matched by checking which group is defined (in other languages too, not just Raku). This kind of thing:

    S:g[(x)|(y)] = $0 ?? x-replacement !! y-replacement

I guess instead I need to do this:

    S:g[x|y] = $/ eq 'x' ?? x-replacement !! y-replacement

It seems very strange that I need to re-examine the match to know what matched. The match should be able to tell me what matched. Or is there perhaps some alternate way for me to tell which alternative matched?
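The capture-numbering behavior described in the second question can be reproduced outside the REPL with a short script. This is only a restatement of the REPL session above; the variable name `$m` is arbitrary.

```raku
# Match "y" against an alternation with one capture group per branch.
my $m = "y" ~~ / (x) | (y) /;

say $m[0];          # 「y」, even though (y) is the second pair of parentheses
say $m.list.elems;  # 1: this match has no group 1
```

Because numbering restarts in each branch, checking which numbered group is defined cannot distinguish the branches here.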