I opened a GitHub issue: https://github.com/rakudo/rakudo/issues/3728

On 5/26/20, Joseph Brenner <doom...@gmail.com> wrote:
> Hey Brad, thanks much for the explication:
>
>> 「<!before .>」 should probably also prevent the position from being at the
>> end.
>
>> It does work if you write it differently
>
>> 'abc' ~~ / b <!before( /./ )> /
>> Nil
>
> That's pretty interesting, though I can't say I understand at all
> what's going on there.
>
>> It does seem like there could be a bug here.
>
> That was my suspicion. I'll probably open an issue on it soon.
>
>> All of that said, I don't think it is useful to tell new Raku programmers
>> that you can use those features that way.
>
> Yes, certainly not. Just to be clear, I'm just messing around
> with after/before to get a better sense of what they do.
>
> I tried to avoid saying the two forms are equivalent, they just
> do roughly similar things.
>
>
>
> On 5/26/20, Brad Gilbert <b2gi...@gmail.com> wrote:
>> I'm not sure that is the best way to look at 「<before>」 and 「<after>」.
>>
>> > 'abcd123abcd' ~~ / <?before <digit>> .+ <?after <digit>> /
>> 「123」
>>
>> In the above code 「<?before <digit>>」 makes sure that the first thing
>> that 「.+」 matches is a 「<digit>」
>> And 「<?after <digit>>」 makes sure that the last thing 「.+」 matches is
>> also a 「<digit>」
>>
>> The 「<?before <digit>>」 is written in front of the 「.+」 so it starts at
>> that position
>>
>> It does the thing that 「<digit>」 would normally do.
>>
>> ' a b c d 1 2 3 a b c d '
>> ' _ _ _ _^1^_ _ _ _ _ _ '
>>
>> The thing is, 「<before>」 resets the position to what it was immediately
>> before the successful 「<digit>」 match.
>>
>> ' a b c d 1 2 3 a b c d '
>> ' _ _ _ _^_ _ _ _ _ _ _ '
>>
>> The 「.+」 then tries to grab everything
>>
>> ' a b c d 1 2 3 a b c d '
>> ' _ _ _ _^1 2 3 a b c d^'
>>
>> Then 「<?after <digit>>」 gets to tell it that it can't do that.
>>
>> The reason is that 「<after>」 looks backwards from the current position.
>> The current position is at the very end.
>> It obviously isn't a 「<digit>」, so 「.+」 has to keep giving up characters
>> until its last value is a 「<digit>」.
>>
>> ' a b c d 1 2 3 a b c d '
>> ' _ _ _ _^1 2 3^_ _ _ _ '
>>
>> ---
>>
>> You can use 「<after>」 to check that it is at the beginning.
>>
>> 'abc' ~~ / <!after .> b /
>> Nil
>>
>> The reason is that if the current position is anywhere other than the
>> beginning 「.」 would match.
>> Since we used 「!」 that won't fly.
>>
>> 「<!before .>」 should probably also prevent the position from being at the
>> end.
>>
>> It does work if you write it differently
>>
>> 'abc' ~~ / b <!before( /./ )> /
>> Nil
>>
>> Note that 「<before>」 and 「<after>」 are really just function calls.
>>
>> It does seem like there could be a bug here.
>>
>> ---
>>
>> All of that said, I don't think it is useful to tell new Raku programmers
>> that you can use those features that way.
>>
>> It makes them think that these two regexes are doing something similar.
>>
>> / ^ ... /
>> / <!after .> ... /
>>
>> They match the same three characters, but for entirely different reasons.
>>
>> The 「^」 version is basically the same as:
>>
>> / <?{ $/.pos == 0 }> ... /
>>
>> While the other one is something like:
>>
>> / <!{ try $/.orig.substr( $/.pos - 1, 1 ) ~~ /./ }> ... /
>>
>> (The 「try」 is needed because 「.substr( -1 )」 is a Failure.)
>>
>> So then these:
>>
>> / ... $ /
>> / ... <!before .> /
>>
>> Would be
>>
>> / ... <?{ $/.pos == $/.orig.chars }> /
>> / ... <!{ try $/.orig.substr( $/.pos, 1 ) ~~ /./ }> /
>>
>> ---
>>
>> What I think is happening is that the 「<!after .>」 works because the
>> 「.substr( -1, 1)」 creates a Failure.
>>
>> The thing is that 「'abc'.substr( 3, 1 )」 doesn't create a Failure, it
>> just gives you an empty Str.
>>
>> (The second argument is the maximum number of characters to return.)
>>
>> On Mon, May 25, 2020 at 4:10 PM Joseph Brenner <doom...@gmail.com> wrote:
>>
>>> Given this string:
>>> my $str = "Romp romp ROMP";
>>>
>>> We can match just the first or last by using the usual pinning
>>> features, '^' or '$':
>>>
>>> say $str ~~ m:i:g/^romp/; ## (「Romp」)
>>> say $str ~~ m:i:g/romp$/; ## (「ROMP」)
>>>
>>> Moritz Lenz (Section 3.8 of 'Parsing', p32) makes the point you
>>> can use 'after' to do something like '^' pinning:
>>>
>>> say $str ~~ m:i:g/ <!after .> romp /; ## (「Romp」)
>>>
>>> That makes sense: the BOL is "not after any character"
>>> So: I wondered if there was a way to use 'before' to do
>>> something like '$' pinning:
>>>
>>> say $str ~~ m:i:g/ romp <!before .> /; ## (「Romp」 「romp」)
>>>
>>> That was unexpected: it filters out the one I was trying to
>>> match for, though the logic seemed reasonable: the EOL is "not
>>> before any character".
>>>
>>> What if we flip this and do a positive before match?
>>>
>>> say $str ~~ m:i:g/ romp <?before .> /; ## (「Romp」 「romp」)
>>>
>>> That does exactly the same thing, but here the logic makes
>>> sense to me: the first two are "before some character",
>>> but the last one isn't.
>>>
>>
>
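
For anyone following along, the 「substr」 asymmetry Brad describes in the quoted message can be checked outside of a regex. This is only a minimal sketch of that one observation, not a diagnosis of the lookahead behavior itself; the variable names ($s, $at-end, $before-start) are mine and don't appear anywhere in the thread.

```raku
# Sketch only: $s, $at-end and $before-start are illustrative names,
# not taken from the thread.
my $s = 'abc';

# A start position equal to the string length is still in range,
# so this yields an empty Str rather than a Failure.
my $at-end = $s.substr(3, 1);
say $at-end.defined;        # True
say $at-end eq '';          # True

# A negative start position is out of range and yields a Failure.
# Calling .defined on it marks the Failure as handled, so nothing is thrown.
my $before-start = $s.substr(-1, 1);
say $before-start.defined;  # False
```

So the "one before the start" case and the "one past the end" case are not mirror images at the 「substr」 level, which is the difference Brad points to. Whether that is what actually drives the 「<!before .>」 result is still just his guess; the issue linked at the top is the place to track that.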