On Fri, May 13, 2005 at 03:36:50PM +0000, Luke Palmer wrote:
> I'm basically saying that you should treat your:
> $str ~~ /abc :: def | ghi :: jkl | mn :: op/;
> As:
> $rule = rx/abc :: def | ghi :: jkl | mn :: op/;
> $str ~~ /^ .*? <$rule>/;
> Which means that you fail the rule, your .*? advances to the next
> character and tries the rule again.
Taking this explanation literally, this would mean that
$rule = rx/abc :: def | ghi :: jkl | mn :: op/;
$rule = rx/abc ::: def | ghi ::: jkl | mn ::: op/;
both succeed against "xyzabc---ghijkl". But even just considering
the :: instance, this interpretation doesn't match what you said
in your original message, namely that :: would fail the rule
without further advancing:
Pm> $rule = rx :w / plane :: (\d+) | train :: (\w+) | auto :: (\S+) / ;
Pm> "travel by plane jet train tgv today" ~~ $rule
LP> When you fail over the :: after plane, it skips out of the alternation
LP> looking for something to backtrack before it. Since there is nothing,
LP> the rule fails.
> Maybe I'm misunderstanding your interpretation (when in doubt, explain
> with code).
One of us is misunderstanding the other. I'll explain with code,
but first let's clarify the difference. I read your first message as
claiming that
$r1 = rx / abc :: def | ghi :: jkl | mn :: op /;
$r2 = rx / abc ::: def | ghi ::: jkl | mn ::: op /;
$r3 = rx / [ abc :: def | ghi :: jkl | mn :: op ] /;
are equivalent. I believe $r2 and $r3 are not equivalent.
To see why, let's first look at a slightly different example,
and let's avoid subrules, since they don't provide the auto-advance
of unanchored patterns that forms the crux of my question.
For illustration, let's use a variation like:
$q2 = rx / \w [ abc ::: def | ghi ::: jkl | mn ::: op ] /;
$q3 = rx / \w [ [ abc :: def | ghi :: jkl | mn :: op ] ]/;
"xyzabc---xyzghijklmno" ~~ $q2 # fails after seeing "zabc"
"xyzabc---xyzghijklmno" ~~ $q3 # matches "zghijkl"
The difference is precisely the difference between ::: and :: --
the former fails the rule entirely, while the latter simply fails
the current group (of alternations) and tries again.
With :::, an unanchored rule should also stop its process of
"advancing to the next character and trying again".
(Otherwise, "abefgh" ~~ rx / [ ab ::: cd | ef ::: gh ] / succeeds.)
So, by analogy
$r2 = rx / abc ::: def | ghi ::: jkl | mn ::: op /;
$r3 = rx / [ abc :: def | ghi :: jkl | mn :: op ] /;
"xyzabc---xyzghijklmno" ~~ $r2 # fails after seeing "abc"
"xyzabc---xyzghijklmno" ~~ $r3 # matches "ghijkl"
The :: in $r3 doesn't cause the entire rule to fail, just the
group, so the match is free to backtrack and continue its
"advance to the next character and try again". (What the "::"
in $r3 *does* do is to tell the matching engine to not bother
trying the remaining alternatives once it has seen an "abc" at
this point.)
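To make the $r2/$r3 distinction concrete, here is a rough sketch in
Python rather than in rules. It is only a toy model of the reading
described above, not how any actual engine works: every name in it
(search, match_group, RuleFailed, GROUP_CUT, RULE_CUT) is made up for
illustration, each alternative is reduced to a literal head and tail
with the cut between them, and the unanchored "advance and try again"
is a plain loop over start positions.

```python
# Toy model only: each alternative is (head, tail) with a cut between them.
GROUP_CUT = "::"    # assumed reading: fail only the enclosing group
RULE_CUT = ":::"    # assumed reading: fail the whole rule, stop advancing


class RuleFailed(Exception):
    """Raised when the match backtracks over a ::: cut."""


def match_group(text, pos, alternatives, cut):
    """Try the alternatives at pos; return the end offset or None."""
    for head, tail in alternatives:
        if text.startswith(head, pos):
            after_head = pos + len(head)
            if text.startswith(tail, after_head):
                return after_head + len(tail)
            # The tail failed, so we are backtracking over the cut.
            if cut == RULE_CUT:
                raise RuleFailed()
            # GROUP_CUT: give up on the group, skipping later alternatives.
            return None
    return None


def search(text, alternatives, cut):
    """Unanchored match: try each start position in turn."""
    try:
        for start in range(len(text) + 1):
            end = match_group(text, start, alternatives, cut)
            if end is not None:
                return text[start:end]
    except RuleFailed:
        # A ::: cut also ends the advance-and-retry loop.
        return None
    return None


ALTS = [("abc", "def"), ("ghi", "jkl"), ("mn", "op")]
```

Under these assumptions, search("xyzabc---xyzghijklmno", ALTS, RULE_CUT)
returns None (the $r2 reading), while the same call with GROUP_CUT
returns "ghijkl" (the $r3 reading).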
So, going back to the original
$r1 = rx / abc :: def | ghi :: jkl | mn :: op /;
does it work like $r2 or $r3? My gut feeling is that it should
work like $r2 -- i.e., that once we find an "abc" we'll fail the rule
if there's not a "def" following. This also accords with what
others have written in reply, when they say that all three of my
expressions fail in the same way (even though they do not).
However, *if* we say that :: at the top level fails the rule, that
means that as things currently stand
$z1 = rx :w /foo/;
$z2 = rx /:w::foo/;
$z3 = rx /[:w::foo]/;
can be a little surprising:
"hello foo" ~~ $z1 # matches "foo"
"hello foo" ~~ $z2 # fails immediately upon the 'h' != 'f'
"hello foo" ~~ $z3 # matches "foo"
which was the point of my original post. And as I said there, I don't
have a problem with this; I just wanted to make sure this result
doesn't surprise too many others.
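The $z2/$z3 difference can be modeled the same way. Again this is a
hedged Python sketch, not rule syntax: the function name and its
cut_fails_rule flag are hypothetical, :w is treated as a zero-width
modifier, and its whitespace handling is ignored entirely. The only
thing being modeled is whether a failure after the cut stops the
advance to the next character.

```python
def search_after_cut(text, word, cut_fails_rule):
    """Unanchored search for `word` preceded by a zero-width cut.

    cut_fails_rule=True models the $z2 reading (top-level :: fails
    the rule); False models the $z3 reading (the cut only fails the
    bracketed group, so the match may advance and retry).
    """
    for start in range(len(text) + 1):
        if text.startswith(word, start):
            return text[start:start + len(word)]
        # The cut was already passed at this position, then `word` failed.
        if cut_fails_rule:
            return None  # no further advancing
        # Otherwise only the group failed: move to the next position.
    return None
```

With these assumptions, search_after_cut("hello foo", "foo", True)
returns None, and the same call with False returns "foo".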
I hope this was clear enough -- if not, explain with counterexamples
in code. :-)
Pm