I'm not sure that is the best way to look at 「<before>」 and 「<after>」.

    > 'abcd123abcd' ~~ / <?before <digit>> .+ <?after <digit>> /
    「123」

In the above code 「<?before <digit>>」 makes sure that the first thing that
「.+」 matches is a 「<digit>」, and 「<?after <digit>>」 makes sure that the
last thing 「.+」 matches is also a 「<digit>」.

The 「<?before <digit>>」 is written in front of the 「.+」, so it is tried at
the position where 「.+」 is about to start.

It first does what a plain 「<digit>」 would normally do: it matches the digit.

    ' a b c d 1 2 3 a b c d '
    ' _ _ _ _^1^_ _ _ _ _ _ '

The thing is, 「<before>」 resets the position to what it was immediately
before the successful 「<digit>」 match.

    ' a b c d 1 2 3 a b c d '
    ' _ _ _ _^_ _ _ _ _ _ _ '

The 「.+」 then tries to grab everything

    ' a b c d 1 2 3 a b c d '
    ' _ _ _ _^1 2 3 a b c d^'

Then 「<?after <digit>>」 gets to tell it that it can't do that.

The reason is that 「<after>」 looks backwards from the current position, and
the current position is at the very end.
The character just before that position is 「d」, which obviously isn't a
「<digit>」, so 「.+」 has to keep giving up characters until the last one it
holds is a 「<digit>」.

    ' a b c d 1 2 3 a b c d '
    ' _ _ _ _^1 2 3^_ _ _ _ '
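
To check the walk-through above, something like the following sketch can be run. It only uses the same string and regex as the example, plus a bare 「.+」 for comparison. The variable names are mine.

```raku
my $str = 'abcd123abcd';

# Without any lookaround, a greedy .+ takes the whole string.
my $greedy = $str ~~ / .+ /;
say $greedy;          # 「abcd123abcd」

# With the lookarounds, .+ has to start on a digit and end on a digit.
my $m = $str ~~ / <?before <digit>> .+ <?after <digit>> /;
say $m;               # 「123」
say $m.from;          # 4  (the position just before the 「1」)
say $m.to;            # 7  (the position just after the 「3」)
```

The 「.from」 and 「.to」 values line up with the two 「^」 markers in the last diagram. The lookarounds themselves add nothing to the matched text because they are zero-width.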

---

You can use 「<!after .>」 to check that the current position is at the beginning.

     'abc' ~~ / <!after .> b /
     Nil

The reason is that if the current position is anywhere other than the
beginning 「.」 would match.
Since we used 「!」 that won't fly.
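
As a small sketch of that behaviour, here is the same regex tried against the first character and against a later one. The second line is the example from above; the first is my own addition for contrast.

```raku
# 「a」 is at position 0, so nothing is behind it and <!after .> succeeds.
say 'abc' ~~ / <!after .> a /;    # 「a」

# 「b」 has an 「a」 behind it, so 「.」 matches and the negated lookbehind fails.
say 'abc' ~~ / <!after .> b /;    # Nil
```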

「<!before .>」 should probably also prevent the position from being anywhere
other than the end.

It does work if you write it differently:

    'abc' ~~ / b <!before( /./ )> /
    Nil

Note that 「<before>」 and 「<after>」 are really just function calls.

It does seem like there could be a bug here.

---

All of that said, I don't think it is useful to tell new Raku programmers
that you can use those features that way.

It makes them think that these two regexes are doing something similar.

    / ^ ... /
    / <!after .> ... /

They match the same three characters, but for entirely different reasons.

The 「^」 version is basically the same as:

    / <?{ $/.pos == 0 }> ... /

While the other one is something like:

    / <!{ try $/.orig.substr( $/.pos - 1, 1 ) ~~ /./ }> ... /

(The 「try」 is needed because 「.substr( -1 )」 is a Failure.)

So then these:

    / ... $ /
    / ... <!before .> /

Would be

    / ... <?{ $/.pos == $/.orig.chars }> /
    / ... <!{ try $/.orig.substr( $/.pos, 1 ) ~~ /./ }> /

---

What I think is happening is that the 「<!after .>」 works because the
「.substr( -1, 1 )」 creates a Failure.

The thing is that 「'abc'.substr( 3, 1 )」 doesn't create a Failure, it just
gives you an empty Str.

(The second argument is the maximum number of characters to return.)
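
The difference between the two 「.substr」 calls can be seen directly. This is only a sketch of the 「Str.substr」 behaviour described above, with variable names of my own; it says nothing about how the regex engine actually implements 「<before>」 and 「<after>」.

```raku
my $s = 'abc';

# A start position of -1 is out of range, so substr returns a Failure.
my $before-start = $s.substr( -1, 1 );
say $before-start.defined;    # False (checking .defined also marks the Failure as handled)

# A start position of 3 is exactly the end of the string, which is allowed.
# The result is an empty Str rather than a Failure.
my $at-end = $s.substr( 3, 1 );
say $at-end.defined;          # True
say $at-end.chars;            # 0
```

So a check built on the Failure would notice "before the start", but would have nothing to notice at "the end".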

On Mon, May 25, 2020 at 4:10 PM Joseph Brenner <doom...@gmail.com> wrote:

> Given this string:
>    my $str = "Romp romp ROMP";
>
> We can match just the first or last by using the usual pinning
> features, '^' or '$':
>
>    say $str ~~ m:i:g/^romp/;               ## (「Romp」)
>    say $str ~~ m:i:g/romp$/;               ## (「ROMP」)
>
> Moritz Lenz (Section 3.8 of 'Parsing', p32) makes the point you
> can use 'after' to do something like '^' pinning:
>
>    say $str ~~ m:i:g/ <!after .> romp /;   ## (「Romp」)
>
> That makes sense:  the BOL is "not after any character"
> So: I wondered if there was a way to use 'before' to do
> something like '$' pinning:
>
>   say $str ~~ m:i:g/ romp <!before .> /;  ## (「Romp」 「romp」)
>
> That was unexpected: it filters out the one I was trying to
> match for, though the logic seemed reasonable: the EOL is "not
> before any character".
>
> What if we flip this and do a positive before match?
>
>   say $str ~~ m:i:g/ romp <?before .> /;  ## (「Romp」 「romp」)
>
> That does exactly the same thing, but here the logic makes
> sense to me: the first two are "before some character",
> but the last one isn't.
>
