Re: C:: in rules

2005-05-14 Thread Patrick R. Michaud
On Fri, May 13, 2005 at 01:07:20PM -0700, Larry Wall wrote:
 On Fri, May 13, 2005 at 11:54:47AM -0500, Patrick R. Michaud wrote:
 : $r1 = rx / abc :: def | ghi :: jkl | mn :: op /;
 : $r2 = rx / abc ::: def | ghi ::: jkl | mn ::: op /;
 : $r3 = rx / [ abc :: def | ghi :: jkl | mn :: op ] /;
 
 I would prefer that $r1 work like $r3, not like $r2, for two reasons.

Now implemented as such in Parrot r8103.  And yes, it now means that

rx :w /foo/;
rx /:w::foo/;
rx /[:w::foo]/;

are all identical, which is very nice.  

 By the way, I still think of it as a group of alternatives even
 if there's only one alternative, and no |.  But I can see how that
 can be misread to imply at least two alternatives.  [...]
 And if there's no alternative, you only have one alternative.  
 Ain't English wonderful?

...and this last bit means we can strike the "It is illegal
to use C<::> outside of an alternation" from S05, since we're always
inside of an alternation (group of alternatives), even if there's 
only one alternative.  

That sentence has now been struck.  

Many thanks for the clarification and discussion.

Pm


Re: C:: in rules

2005-05-13 Thread TSa (Thomas Sandlaß)
Larry Wall wrote:
Speaking of which, it seems to me that :p and :c should allow an
argument that says where to start relative to the current position.
In other words, :p means :p(0) and :c means :c(0).  I could also see
uses for :p(-1) and :p(+1).
Isn't that slightly inconsistent with :p meaning :p(1) the so-called
real winner for passing boolean options of A12?
--
TSa


Re: C:: in rules

2005-05-13 Thread Markus Laire
TSa (Thomas Sandlaß) kirjoitti:
Larry Wall wrote:
Speaking of which, it seems to me that :p and :c should allow an
argument that says where to start relative to the current position.
In other words, :p means :p(0) and :c means :c(0).  I could also see
uses for :p(-1) and :p(+1).

Isn't that slightly inconsistent with :p meaning :p(1) the so-called
real winner for passing boolean options of A12?
Perhaps spec should be changed so that :p means :p(bool::true) or :p(?1) 
and not :p(1)

--
Markus Laire
Jam. 1:5-6


Re: C:: in rules

2005-05-13 Thread Juerd
Markus Laire skribis 2005-05-13 11:43 (+0300):
 Perhaps spec should be changed so that :p means :p(bool::true) or :p(?1) 
 and not :p(1)

<aol>
Agreed
</>


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: C:: in rules

2005-05-13 Thread Aaron Sherman
On Fri, 2005-05-13 at 00:26, Patrick R. Michaud wrote:
 On Thu, May 12, 2005 at 08:56:39PM -0700, Larry Wall wrote:
  On Thu, May 12, 2005 at 09:33:37AM -0500, Patrick R. Michaud wrote:
  : Also, A05 proposes incorrect alternatives to the above 
  : 
  : /[:w[]foo bar]/

  I would just like to point out that you are misreading those.

 I've been looking at patterns too long

You know, this is going to be a problem for a lot of people...

Think of this case:

/:w[foo bar|bar foo]/

I may be in the minority here, but I think we should try to avoid having
[] and () mean different things in different parts of a rule, especially
where one use is VERY common, and the other is obscure at best. I'd even
be ok with only allowing this inside our already highly magical :

/:w[foo bar|bar foo]/

and

/:p(false)/

and

/ :p5['ponie'] (?{die;}) /

I checked, and while ::... has a meaning in S05, :... does not, so
as long as we never allow a modifier called ::, this would work.

In fact, Larry, I think it's safe to say that  is actually more
sought-after than that : everyone wants ;-)

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback




Re: C:: in rules

2005-05-13 Thread Luke Palmer
On 5/12/05, Patrick R. Michaud [EMAIL PROTECTED] wrote:
 I have a couple of questions regarding C< :: > in perl 6 rules.
 First, a question of verification -- in
 
 $rule = rx :w / plane :: (\d+) | train :: (\w+) | auto :: (\S+) / ;
 
 "travel by plane jet train tgv today" ~~ $rule
 
 I think the match should fail outright, as opposed to matching "train tgv".
 In other words, it acts as though one had written
 
 $rule = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;
 
 and not
 
 $rule = rx :w /[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;

Those both do the same thing (which is the same as your example). 
When you fail over the :: after plane, it skips out of the alternation
looking for something to backtrack before it.  Since there is nothing,
the rule fails.

 Does this sound right?
 
 Next on my list, S05 says "It is illegal to use :: outside of
 an alternation", but A05 has
 
 /[:w::foo bar]/
 
 which leads me to believe that :: isn't illegal here even though there's
 no alternation.  I'd like to strike that sentence from S05

Yeah, I think using :: to break out of the innermost bracketing group
is helpful even without an alternation present.

 Also, A05 proposes incorrect alternatives to the above
 
 /[:w[]foo bar]/    # null pattern illegal, use <null>
 /[:w()foo bar]/    # null capture illegal, and probably undesirable
 /[:w\bfoo bar]/    # not exactly the same as above
 
 I'd like to remove those from A05, or at least put an "Update:"
 note there that doesn't lead people astray.  One option not
 mentioned in A05 that we can add there is
 
 /[:w<?null>foo bar]/
 
 which is admittedly ugly.
 
 So, now then, on to the item that got me here in the first place.
 The upshot of all of the above is that
 
 rx :w /foo bar/
 
 is not equivalent to
 
 rx /:w::foo bar/

Yeah, but it is.  So no problem.  :-)

 which may surprise a few people.  The :: at the beginning of
 the pattern effectively anchors the match to the beginning of
 the string or the current position -- i.e., it eliminates the
 implicit C< .*? > at the start of the match.

Ohhh, ohh.  There isn't an implicit .*? at the beginning of the match.
It's more like there's an implicit .*? followed by a rule call to the
match.  Think of it as that we're trying to match the pattern at any
position rather than there being an implicit .*?.

Luke


Re: C:: in rules

2005-05-13 Thread Luke Palmer
On 5/13/05, Patrick R. Michaud [EMAIL PROTECTED] wrote:
 To use the phrase from later in your message, there's still
 the implicit .*? followed by the rule call.  Since the rule
 itself hasn't failed (only the group failed), we're still free to
 try to match the pattern at later positions.

I'm basically saying that you should treat your:

$str ~~ /abc :: def | ghi :: jkl | mn :: op/;

As:

$rule = rx/abc :: def | ghi :: jkl | mn :: op/;
$str ~~ /^ .*? $rule/;

Which means that you fail the rule, your .*? advances to the next
character and tries the rule again.
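
To make that model concrete, here is a small sketch in Python (its re
module, not Perl 6 rules; the helper name search_via_lazy_prefix is made
up for illustration). It only shows that, for a plain pattern with no ::
or ::: in it, an anchored lazy .*? prefix followed by the pattern finds
the same span as an ordinary unanchored search. It says nothing about
how the cut operators should interact with that prefix.

```python
import re


def search_via_lazy_prefix(pattern, text):
    """Emulate an unanchored search as: anchored  .*?  followed by the pattern.

    The pattern is wrapped in a capturing group so that the skipped
    prefix can be excluded from the reported span.
    """
    wrapped = re.compile(r".*?(" + pattern + r")", re.DOTALL)
    m = wrapped.match(text)  # match() is anchored at the start of text
    return None if m is None else m.span(1)


if __name__ == "__main__":
    text = "xyzabc---ghijkl"
    pattern = "abcdef|ghijkl|mnop"
    print(search_via_lazy_prefix(pattern, text))
    print(re.search(pattern, text).span())
```

Both calls report the same span for this input.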

Maybe I'm misunderstanding your interpretation (when in doubt, explain
with code).

Luke


Re: C:: in rules

2005-05-13 Thread Patrick R. Michaud
On Fri, May 13, 2005 at 03:36:50PM +, Luke Palmer wrote:
 I'm basically saying that you should treat your:
 $str ~~ /abc :: def | ghi :: jkl | mn :: op/;
 As:
 $rule = rx/abc :: def | ghi :: jkl | mn :: op/;
 $str ~~ /^ .*? $rule/;
 Which means that you fail the rule, your .*? advances to the next
 character and tries the rule again.

Taking this explanation literally, this would mean that

$rule = rx/abc :: def | ghi :: jkl | mn :: op/;
$rule = rx/abc ::: def | ghi ::: jkl | mn ::: op/;

both succeed against "xyzabc---ghijkl".  But even just considering
the :: instance, this interpretation doesn't match what you said 
in your original message that :: would fail the rule without 
further advancing:

Pm> $rule = rx :w / plane :: (\d+) | train :: (\w+) | auto :: (\S+) / ;
Pm> "travel by plane jet train tgv today" ~~ $rule

LP> When you fail over the :: after plane, it skips out of the alternation
LP> looking for something to backtrack before it.  Since there is nothing,
LP> the rule fails.

 Maybe I'm misunderstanding your interpretation (when in doubt, explain
 with code).

One of us is misunderstanding the other.  I'll explain with code, 
but first let's clarify the difference.  I read your first message as 
claiming that

$r1 = rx / abc :: def | ghi :: jkl | mn :: op /;
$r2 = rx / abc ::: def | ghi ::: jkl | mn ::: op /;
$r3 = rx / [ abc :: def | ghi :: jkl | mn :: op ] /;

are equivalent.  I believe $r2 and $r3 are not equivalent.  
For comparison, let's first look at a slightly different example, 
and let's avoid subrules, since they don't provide the auto-advance
of unanchored patterns that forms the crux of my question.

First, I'm quite certain that $r2 and $r3 are different.  For
illustration, let's use a variation like:

$q2 = rx / \w [ abc ::: def | ghi ::: jkl | mn ::: op ] /;
$q3 = rx / \w [ [ abc :: def | ghi :: jkl | mn :: op ] ]/;

"xyzabc---xyzghijklmno" ~~ $q2    # fails after seeing "zabc"
"xyzabc---xyzghijklmno" ~~ $q3    # matches "zghijkl"

The difference is precisely the difference between ::: and :: --
the former fails the rule entirely, while the latter simply fails
the current group (of alternations) and tries again.  
With :::, an unanchored rule should also stop its process of 
advancing to the next character and trying again.  
(Otherwise, "abefgh" ~~ rx / [ ab ::: cd | ef ::: gh ] / succeeds.)

So, by analogy

$r2 = rx / abc ::: def | ghi ::: jkl | mn ::: op /;
$r3 = rx / [ abc :: def | ghi :: jkl | mn :: op ] /;

"xyzabc---xyzghijklmno" ~~ $r2    # fails after seeing "abc"
"xyzabc---xyzghijklmno" ~~ $r3    # matches "ghijkl"

The :: in $r3 doesn't cause the entire rule to fail, just the
group, so the match is free to backtrack and continue its
advance to the next character and try again.  (What the ::
in $r3 *does* do is to tell the matching engine to not bother 
trying the remaining alternatives once it has seen an abc at
this point.)
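
The distinction can be mimicked with a toy matcher. This is a Python
sketch, not PGE and not Perl 6: each alternative is reduced to a literal
(head, tail) pair standing for "head :: tail", and the function name
match_cut and its "cut" argument are invented for the illustration. It
encodes the reading given above, where backtracking over :: gives up on
the group at the current position only, while ::: gives up on the whole
match.

```python
def match_cut(text, alternatives, cut):
    """Toy model of  / [ head CUT tail | head CUT tail | ... ] /.

    alternatives: list of (head, tail) literal strings.
    cut: "::" fails only the group, ":::" fails the entire rule.
    Returns the matched substring, or None.
    """
    for start in range(len(text) + 1):      # unanchored: try each position
        for head, tail in alternatives:
            if not text.startswith(head, start):
                continue                    # cut not reached, next alternative
            if text.startswith(tail, start + len(head)):
                return text[start:start + len(head) + len(tail)]
            # The tail failed, so we are backtracking over the cut.
            if cut == ":::":
                return None                 # fail the rule, no further advance
            break                           # "::": skip remaining alternatives
    return None


if __name__ == "__main__":
    alts = [("abc", "def"), ("ghi", "jkl"), ("mn", "op")]
    text = "xyzabc---xyzghijklmno"
    print(match_cut(text, alts, ":::"))   # None
    print(match_cut(text, alts, "::"))    # ghijkl
```

With ::: the attempt ends at the "abc" that is not followed by "def";
with :: the matcher moves on and later finds "ghijkl".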

So, going back to the original

$r1 = rx / abc :: def | ghi :: jkl | mn :: op /;

does it work like $r2 or $r3?  My gut feeling is that it should 
work like $r2 -- i.e., that once we find an abc we'll fail the rule
if there's not a def following.  This also accords with what 
others have written in reply, when they say that all three of my
expressions fail in the same way (even though they do not).

However, *if* we say that :: at the top level fails the rule, that
means that as things currently stand

$z1 = rx :w /foo/;
$z2 = rx /:w::foo/;
$z3 = rx /[:w::foo]/;

can be a little surprising:

"hello foo" ~~ $z1    # matches "foo"
"hello foo" ~~ $z2    # fails immediately upon the 'h' != 'f'
"hello foo" ~~ $z3    # matches "foo"

which was the point of my original post.  And as I said there, I don't
have a problem with this, I just wanted to make sure this result didn't
surprise too many others.

I hope this was clear enough  -- if not, explain counter examples
in code.  :-)

Pm


Re: C:: in rules

2005-05-13 Thread Larry Wall
On Fri, May 13, 2005 at 11:43:42AM +0300, Markus Laire wrote:
: Perhaps spec should be changed so that :p means :p(bool::true) or :p(?1) 
: and not :p(1)

I'm still not sure I believe in booleans to that extent.  I suppose
we could go as far as to make it :p(0 but true).  Actually, it's more
like undef but true, if you want to be able to distinguish

sub foo (+$p = 0) { # no :p at all
say true if $p;   # :p with no argument
$p //= 42;  # :p with no argument
...
}

Or maybe it's something more like 1 but assumed.  In any event, it'd
be nice to be able to distinguish :p from :p(1) somehow.  Maybe the
Bool type is good enough for that.  The bool type probably isn't unless
we depend on autoboxing to turn it into a Bool consistently.
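
One way to picture "distinguish :p from :p(1)" is a sentinel value that
stands for "flag given, no argument". The following is a Python sketch
of that idea only; parse_adverb and ASSUMED are hypothetical names, the
integer-only argument is a simplification, and nothing here reflects how
Perl 6 pairs are actually parsed.

```python
import re

# Hypothetical sentinel meaning "the flag was present without an argument".
ASSUMED = object()


def parse_adverb(text):
    """Split ':name' or ':name(int)' into (name, value).

    A bare ':name' yields the ASSUMED sentinel rather than 1, so callers
    can tell it apart from an explicit ':name(1)'.
    """
    m = re.fullmatch(r":(\w+)(?:\(([^()]*)\))?", text)
    if m is None:
        raise ValueError("not an adverb: %r" % text)
    name, arg = m.group(1), m.group(2)
    return name, (ASSUMED if arg is None else int(arg))


if __name__ == "__main__":
    print(parse_adverb(":p")[1] is ASSUMED)   # True
    print(parse_adverb(":p(1)"))              # ('p', 1)
    print(parse_adverb(":p(-1)"))             # ('p', -1)
```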

Larry


Re: C:: in rules

2005-05-13 Thread Damian Conway
Larry wrote:
I'm still not sure I believe in booleans to that extent.  I suppose
we could go as far as to make it :p(0 but true).  Actually, it's more
like undef but true, if you want to be able to distinguish
sub foo (+$p = 0) { # no :p at all
say true if $p; # :p with no argument
$p //= 42;  # :p with no argument
...
}
Yes, I was thinking along the same lines. C<undef but true> as a default seems 
to be more accurate and useful than C<Bool::true>.

Damian


Re: C:: in rules

2005-05-13 Thread Luke Palmer
On 5/13/05, Patrick R. Michaud [EMAIL PROTECTED] wrote:
 First, I'm quite certain that $r2 and $r3 are different.  For
 illustration, let's use a variation like:
 
 $q2 = rx / \w [ abc ::: def | ghi ::: jkl | mn ::: op ] /;
 $q3 = rx / \w [ [ abc :: def | ghi :: jkl | mn :: op ] ]/;
 
 "xyzabc---xyzghijklmno" ~~ $q2    # fails after seeing "zabc"
 "xyzabc---xyzghijklmno" ~~ $q3    # matches "zghijkl"

Okay, I know where the misunderstanding is.  When we use these kinds
of examples, let's not rely on the implicit matching semantic.  I'm
saying that the above code is equivalent to:

# the following is a rule, so ::: backtracks out of it and no further
rule q2 { \w [ abc ::: def | ghi ::: jkl | mn ::: op ] }
rule q3 { \w [ [ abc :: def | ghi :: jkl | mn :: op ] ] }
"xyzabc---xyzghijklmno" ~~ /^ .*? <q2>/;   # ::: backtracks into the .*?
"xyzabc---xyzghijklmno" ~~ /^ .*? <q3>/;

The presence of the \w does nothing, because \w doesn't backtrack. 
Alternations and quantifiers backtrack when you fail beyond them, \w
just fails.  You never enter the same subpattern (meant in the most
general case: .* is a subpattern, for instance) in the same state. 
Something had to change behind you in order for a subpattern to be
re-entered.

I think the misunderstanding is rather simple.  You keep talking like
you prepend a .*? to the rule we're matching.  I think that's wrong
(and this is where I'm making a design call, so we can dispute on this
once we're clear that it's this that is being disputed).  I think
there is a special rule:

rule matchanywhere($rx) { .*? $rx }

Which makes a *subrule call* to the rule we're matching.  Therefore
::: just breaks out of that subrule, and backtracks into the .*?
again.

Because of this, I think there will be a difference between ::: and
<commit> at the top level, but not :: and :::.

Luke


Re: C:: in rules

2005-05-13 Thread Larry Wall
On Sat, May 14, 2005 at 01:15:36AM +, Luke Palmer wrote:
: I think the misunderstanding is rather simple.  You keep talking like
: you prepend a .*? to the rule we're matching.  I think that's wrong
: (and this is where I'm making a design call, so we can dispute on this
: once we're clear that it's this that is being disputed).  I think
: there is a special rule:
: 
: rule matchanywhere($rx) { .*? $rx }
: 
: Which makes a *subrule call* to the rule we're matching.  Therefore
: ::: just breaks out of that subrule, and backtracks into the .*?
: again.

I want ::: to break out of *that* dynamic scope (or the equivalent
matchrighthere scope), but not ::.

Larry


Re: C:: in rules

2005-05-13 Thread Luke Palmer
On 5/14/05, Larry Wall [EMAIL PROTECTED] wrote:
 On Sat, May 14, 2005 at 01:15:36AM +, Luke Palmer wrote:
 : I think the misunderstanding is rather simple.  You keep talking like
 : you prepend a .*? to the rule we're matching.  I think that's wrong
 : (and this is where I'm making a design call, so we can dispute on this
 : once we're clear that it's this that is being disputed).  I think
 : there is a special rule:
 :
 : rule matchanywhere($rx) { .*? $rx }
 :
 : Which makes a *subrule call* to the rule we're matching.  Therefore
 : ::: just breaks out of that subrule, and backtracks into the .*?
 : again.
 
 I want ::: to break out of *that* dynamic scope (or the equivalent
 matchrighthere scope), but not ::.

I'm not sure that's such a good idea.  When you say:

rule foo() { a* ::: b }

You know precisely where that ::: is going to take you: right out of
the rule.  That's the way it works in grammars, and there's no
implicit anything else that you're breaking out of.  But you're saying
that when we use a bare // matching a string, that's no longer the
case?  In other words, this:

$str ~~ / a* ::: b /

Is different from:

$str ~~ / <foo> /

That seems like a pretty obvious indirection, and a mistake to break
it.  There's nothing there except <foo>, how could it act differently?

Luke


Re: C:: in rules

2005-05-13 Thread Patrick R. Michaud
On Sat, May 14, 2005 at 04:26:44AM +, Luke Palmer wrote:
 On 5/14/05, Larry Wall [EMAIL PROTECTED] wrote:
  I want ::: to break out of *that* dynamic scope (or the equivalent
  matchrighthere scope), but not ::.
 
 I'm not sure that's such a good idea.  When you say:
 
 rule foo() { a* ::: b }
 
 You know precisely where that ::: is going to take you: right out of
 the rule.  [...] But you're saying that when we use a bare // 
 matching a string, that's no longer the case?  In other words, this:
 
 $str ~~ / a* ::: b /
 
 Is different from:
 
 $str ~~ / <foo> /
 
 That seems like a pretty obvious indirection, and a mistake to break
 it.  There's nothing there except <foo>, how could it act differently?

Because $str ~~ / <foo> / puts the ::: in a subrule, whereas
$str ~~ / a* ::: b / does not.  It's the same sort of difference
that one gets between

{ return if $a; }

and

sub foo() { return if $a; }

{ foo() }

It's clear that the C<return> in the first case affects control flow
in the current sub, while the nested C<return> of foo() in the second
case does not.
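
The same analogy carries over to most languages. As a rough Python
rendering (function names invented for the illustration): a return
written directly in a function body ends that function, while a return
inside a helper only ends the helper.

```python
def inline(a):
    # The return sits directly in this function's body, so it ends inline().
    if a:
        return "left early"
    return "ran to the end"


def helper(a):
    # This return can only end helper() itself.
    if a:
        return "helper left early"
    return "helper ran to the end"


def nested(a):
    helper(a)                 # helper's early return does not end nested()
    return "ran to the end"


if __name__ == "__main__":
    print(inline(True))   # left early
    print(nested(True))   # ran to the end
```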

Pm


C:: in rules

2005-05-12 Thread Patrick R. Michaud
I have a couple of questions regarding C< :: > in perl 6 rules.
First, a question of verification -- in

$rule = rx :w / plane :: (\d+) | train :: (\w+) | auto :: (\S+) / ;

"travel by plane jet train tgv today" ~~ $rule

I think the match should fail outright, as opposed to matching "train tgv".
In other words, it acts as though one had written

$rule = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;

and not

$rule = rx :w /[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;

Does this sound right?

Next on my list, S05 says "It is illegal to use :: outside of 
an alternation", but A05 has

/[:w::foo bar]/

which leads me to believe that :: isn't illegal here even though there's
no alternation.  I'd like to strike that sentence from S05.

Also, A05 proposes incorrect alternatives to the above 

/[:w[]foo bar]/    # null pattern illegal, use <null>
/[:w()foo bar]/    # null capture illegal, and probably undesirable
/[:w\bfoo bar]/    # not exactly the same as above

I'd like to remove those from A05, or at least put an "Update:"
note there that doesn't lead people astray.  One option not
mentioned in A05 that we can add there is

/[:w<?null>foo bar]/

which is admittedly ugly.

So, now then, on to the item that got me here in the first place.
The upshot of all of the above is that 

rx :w /foo bar/

is not equivalent to

rx /:w::foo bar/

which may surprise a few people.  The :: at the beginning of
the pattern effectively anchors the match to the beginning of
the string or the current position -- i.e., it eliminates the
implicit C< .*? > at the start of the match.  To put the :w
inside the rule (e.g., in a variable or subrule), one would
have to write it as

rx /[:w::foo bar]/
rx /:w<null>foo bar/

Now then, I don't have a problem at all with this outcome -- 
but I wanted to let p6l verify my interpretation of things and
make sure it's okay for me to adjust S05/A05 accordingly.

Pm


Re: C:: in rules

2005-05-12 Thread Aaron Sherman
My take, based on S05:

On Thu, 2005-05-12 at 10:33, Patrick R. Michaud wrote:
 I have a couple of questions regarding C< :: > in perl 6 rules.
 First, a question of verification -- in
 
 $rule = rx :w / plane :: (\d+) | train :: (\w+) | auto :: (\S+) / ;
 
 "travel by plane jet train tgv today" ~~ $rule
 
 I think the match should fail outright, as opposed to matching "train tgv".

Correct, that's the meaning of ::

S05: "Backtracking over a double colon causes the surrounding group of
alternations to immediately fail:"

Your surrounding group is the entire rule, and thus you fail at that
point.

 In other words, it acts as though one had written
 
 $rule = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;
 
 and not
 
 $rule = rx :w /[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;

Your two examples fail in the same way because of the fact that the
group IS the whole rule.

 Next on my list, S05 says "It is illegal to use :: outside of 
 an alternation", but A05 has
 
 /[:w::foo bar]/

I can't even figure out what that means. :w turns on word mode
(lexically scoped per S05) and :: is a group-level commit. What are we
committing exactly? Looks like a noop to me, which actually might not be
so bad. However, you're right: this is an error as there are no
alternations.

 which leads me to believe that :: isn't illegal here even though there's
 no alternation.  I'd like to strike that sentence from S05.

I don't think it should be removed. You can always use ::: if that's
what you wanted.

 Also, A05 proposes incorrect alternatives to the above 
 
 /[:w[]foo bar]/    # null pattern illegal, use <null>

Correct.

 /[:w()foo bar]/    # null capture illegal, and probably undesirable

Correct.

 /[:w\bfoo bar]/    # not exactly the same as above

No, I think that's exactly the same.

 So, now then, on to the item that got me here in the first place.
 The upshot of all of the above is that 
 
 rx :w /foo bar/
 
 is not equivalent to
 
 rx /:w::foo bar/

If we feel strongly, it could be special-cased, but your <null> solution
seems fine to me.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback




Re: C:: in rules

2005-05-12 Thread Jonathan Scott Duff
On Thu, May 12, 2005 at 12:53:46PM -0400, Aaron Sherman wrote:
 On Thu, 2005-05-12 at 10:33, Patrick R. Michaud wrote:
  Next on my list, S05 says "It is illegal to use :: outside of 
  an alternation", but A05 has
  
  /[:w::foo bar]/
 
 I can't even figure out what that means. :w turns on word mode
 (lexically scoped per S05) and :: is a group-level commit. What are we
 committing exactly? Looks like a noop to me, which actually might not be
 so bad. However, you're right: this is an error as there are no
 alternations.

I think the definition of :: needs to be changed slightly.  You even
used a phrase that isn't exactly true according to spec but would be
if :: meant what I think it should mean.   That phrase is ":: is a
group-level commit".  This isn't how I read S05 (and apparently how
you and others read it as well, hence your comment to Pm that there
are no alternations).  S05 says:

Backtracking over a double colon causes the surrounding group of
alternations to immediately fail:

I think it should simply read:

Backtracking over a double colon causes the surrounding group to
immediately fail:

In other words, the phrase "of alternations" is a red herring.

  which leads me to believe that :: isn't illegal here even though there's
  no alternation.  I'd like to strike that sentence from S05.
 
 I don't think it should be removed. You can always use ::: if that's
 what you wanted.

I too think it should be stricken.

  /[:w\bfoo bar]/    # not exactly the same as above
 
 No, I think that's exactly the same.

What does \b mean again?  I assume it's no longer backspace?

  So, now then, on to the item that got me here in the first place.
  The upshot of all of the above is that 
  
  rx :w /foo bar/
  
  is not equivalent to
  
  rx /:w::foo bar/
 
 If we feel strongly, it could be special-cased, but your <null> solution
 seems fine to me.

If :: were to fail the surrounding group we can say that a rule
without [] or () is an implicit group for :: purposes.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]


Re: C:: in rules

2005-05-12 Thread Patrick R. Michaud
On Thu, May 12, 2005 at 12:53:46PM -0400, Aaron Sherman wrote:
 My take, based on S05:
 
  In other words, it acts as though one had written
  
  $rule = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;
  
  and not
  
  $rule = rx :w /[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;
 
 Your two examples fail in the same way because of the fact that the
 group IS the whole rule.

False.  In the first case the group is the whole rule.  In the second
case the group would not include the (implied) '.*?' at the start of
the rule.  Perhaps it helps to see the difference if I write it this way:

$rule = rx :w /<null>[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/;

Note that the rule is *unanchored*, thus it tries at the first character,
if it fails then it goes to the second character, if that fails it goes
to the third, etc.  Thus, given:

  $rule1 = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;
  $rule2 = rx :w /<null>[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;

  "travel by plane jet train tgv today" ~~ $rule1;   # fails
  "travel by plane jet train tgv today" ~~ $rule2;   # matches "train tgv"

They're not equivalent.

  Next on my list, S05 says It is illegal to use :: outside of 
  an alternation, but A05 has
  
  /[:w::foo bar]/
 
 I can't even figure out what that means. :w turns on word mode
 (lexically scoped per S05) and :: is a group-level commit. What are we
 committing exactly? Looks like a noop to me, which actually might not be
 so bad. 

Yes, the point is that it's a no-op, because

/[:wfoo bar:]/

is something entirely different.

  /[:w\bfoo bar]/    # not exactly the same as above
 
 No, I think that's exactly the same.

Nope.  Consider:  

 $foo = rx /[:w::foo bar]/
 $baz = rx /[:w\bfoo bar]/

 "myfoo bar" ~~ $foo  # matches
 "myfoo bar" ~~ $baz  # fails, foo is not on a word boundary
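
The word-boundary half of this is easy to check in any regex engine
that has \b. Here is a quick Python check (Python's re module, so no :w
and no ::; the plain pattern merely stands in for the version without
\b):

```python
import re

text = "myfoo bar"

plain = re.search(r"foo bar", text)       # no boundary requirement
bounded = re.search(r"\bfoo bar", text)   # "foo" must start at a word boundary

print(plain is not None)   # True: "foo bar" is found inside "myfoo bar"
print(bounded is None)     # True: there is no boundary between "y" and "f"
```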

Pm


Re: C:: in rules

2005-05-12 Thread Patrick R. Michaud
On Thu, May 12, 2005 at 12:33:59PM -0500, Jonathan Scott Duff wrote:
 
   /[:w\bfoo bar]/    # not exactly the same as above
  
  No, I think that's exactly the same.
 
 What does \b mean again?  I assume it's no longer backspace?

For as long as I can remember \b has meant "word boundary" in
regular expressions.  :-) :-)

Pm


Re: C:: in rules

2005-05-12 Thread Uri Guttman
 PRM == Patrick R Michaud [EMAIL PROTECTED] writes:

  PRM On Thu, May 12, 2005 at 12:33:59PM -0500, Jonathan Scott Duff wrote:
   
 /[:w\bfoo bar]/# not exactly the same as above

No, I think that's exactly the same.
   
   What does \b mean again?  I assume it's no longer backspace?

  PRM For as long as I can remember \b has meant word boundary in
  PRM regular expressions.  :-) :-)

except in char classes where it gets its backspace meaning back.

:-)
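
The two meanings can be seen side by side in Python's re module, which
follows the same traditional convention (this says nothing about what
Perl 6 rules will do with \b inside a character class):

```python
import re

# Outside a character class, \b is a zero-width word-boundary assertion.
print(re.search(r"\bfoo", "a foo") is not None)   # True

# Inside a character class, \b is the backspace character (0x08).
print(re.fullmatch(r"[\b]", "\x08") is not None)  # True
print(re.search(r"[\b]", "a foo") is None)        # True: no backspace present
```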

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: C:: in rules

2005-05-12 Thread Aaron Sherman
On Thu, 2005-05-12 at 13:44, Patrick R. Michaud wrote:
 On Thu, May 12, 2005 at 12:53:46PM -0400, Aaron Sherman wrote:

   In other words, it acts as though one had written
   
   $rule = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;
   
   and not
   
   $rule = rx :w /[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;
  
  Your two examples fail in the same way because of the fact that the
  group IS the whole rule.
 
 False.  In the first case the group is the whole rule.  In the second
 case the group would not include the (implied) '.*?' at the start of
 the rule.

That cannot be true. If it were, then:

s/[a]//

and

s/a//

would replace different things, and they MUST NOT. If I've missed some
fundamental way in which rx:p5/(?:...)/ is different from rx/[...]/,
then please let me know. Otherwise, we can simply demonstrate this with
P5:

perl -le '"abcaabbcc" =~ /(?:aa)/; print $&'

and unshockingly, that prints "aa", not "abcaa"

 Note that the rule is *unanchored*, thus it tries at the first character,
 if it fails then it goes to the second character, if that fails it goes
 to the third, etc.  

Yes, you're correct, but when you step forward over input in order to
find a start for your unanchored expression, you do NOT consume that
input, grouping or not. To say:

$foo ~~ /unanchored/

is something like

for 0..length($foo)-1 -> $i {
    substr($foo,$i) ~~ /^unanchored/;
}

and always has been. Unless I'm unaware of some subtlety of [], it is
just the same as P5's (?:...), which behaves exactly this way.
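
For anyone who wants to poke at this, the stepping loop can be written
out in Python and compared against the engine's own unanchored search.
This is only a sketch with Python's re module; search_by_stepping is a
made-up helper, and it assumes a pattern without ^ or lookbehind, where
trying an anchored match at each offset and searching give the same
answer.

```python
import re


def search_by_stepping(pattern, text):
    """Try an anchored match at offset 0, then 1, then 2, and so on.

    Stepping forward does not consume input: the reported span covers
    only what the pattern itself matched.
    """
    anchored = re.compile(pattern)
    for start in range(len(text) + 1):
        m = anchored.match(text, start)   # must match exactly at 'start'
        if m is not None:
            return m.span()
    return None


if __name__ == "__main__":
    text = "abcaabbcc"
    print(search_by_stepping(r"(?:aa)", text))   # (3, 5)
    print(re.search(r"(?:aa)", text).span())     # (3, 5)
    print(re.search(r"aa", text).span())         # (3, 5), grouping or not
```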

I'll skip the rest of your post for now, except for the last bit, since
I think we need to resolve which universe we're in before we can give
each other street directions ;-)

   /[:w\bfoo bar]/    # not exactly the same as above
  
  No, I think that's exactly the same.
 
 Nope.  Consider:  
 
  $foo = rx /[:w::foo bar]/
  $baz = rx /[:w\bfoo bar]/
 
  "myfoo bar" ~~ $foo  # matches
  "myfoo bar" ~~ $baz  # fails, foo is not on a word boundary

You're correct, sorry about that.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback




Re: C:: in rules

2005-05-12 Thread Patrick R. Michaud
$rule = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;
$rule = rx :w /[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;

On Thu, May 12, 2005 at 02:29:24PM -0400, Aaron Sherman wrote:
 On Thu, 2005-05-12 at 13:44, Patrick R. Michaud wrote:
  On Thu, May 12, 2005 at 12:53:46PM -0400, Aaron Sherman wrote:
   Your two examples fail in the same way because of the fact that the
   group IS the whole rule.
  
  False.  In the first case the group is the whole rule.  In the second
  case the group would not include the (implied) '.*?' at the start of
  the rule.
 
 That cannot be true. If it were, then:
   s/[a]//
 and
   s/a//
 would replace different things, and they MUST NOT. 

No, /[a]/ is still the same as /a/ here -- I'm not discussing that at
all, nor am I implying any special [] or rule semantics.   I'm simply
referring to the fact that the rule is free to step across the
characters in the string, same as you pointed out.

Let me backtrack(!) and try a slightly different example, 
first using a group and (::)

$r1 = rx /[abc :: def | ghi :: jkl | mn :: op]/;

"abcdef" ~~ $r1          # matches "abcdef"
"xyzghijkl" ~~ $r1       # matches "ghijkl"
"xyzabcghijkl" ~~ $r1    # matches "ghijkl"

Why does the last one match?  Because it fails the group but
doesn't fail the rule -- i.e., the rule is still free to advance
its initial pointer to the next character and try again.  Contrast
this with:

$r2 = rx /abc ::: def | ghi ::: jkl | mn ::: op/;

"abcdef" ~~ $r2          # matches "abcdef"
"xyzghijkl" ~~ $r2       # matches "ghijkl"
"xyzabcghijkl" ~~ $r2    # fails!

This one fails, because once we match the "abc", we're committed
to completing the match or failing the rule altogether.

Does this work to convince you that the two expressions are indeed 
different?
   
Pm 


Re: C:: in rules

2005-05-12 Thread Aaron Sherman
On Thu, 2005-05-12 at 15:41, Patrick R. Michaud wrote:
 $rule = rx :w / plane ::: (\d+) | train ::: (\w+) | auto ::: (\S+) / ;
 $rule = rx :w /[ plane :: (\d+) | train :: (\w+) | auto :: (\S+) ]/ ;
 
 On Thu, May 12, 2005 at 02:29:24PM -0400, Aaron Sherman wrote:
  On Thu, 2005-05-12 at 13:44, Patrick R. Michaud wrote:

   False.  In the first case the group is the whole rule.  In the second
   case the group would not include the (implied) '.*?' at the start of
   the rule.

This was a very unfortunate choice of explanations, since an implied
.*? would change the semantics of the match deeply. However, your
later explanation:

 $r1 = rx /[abc :: def | ghi :: jkl | mn :: op]/;
 
 "abcdef" ~~ $r1          # matches "abcdef"
 "xyzghijkl" ~~ $r1       # matches "ghijkl"
 "xyzabcghijkl" ~~ $r1    # matches "ghijkl"
 
 Why does the last one match?  Because it fails the group but
 doesn't fail the rule -- i.e., the rule is still free to advance
 its initial pointer to the next character and try again. 

... is very understandable. Now I'm just left with a vague sense that I
never want to see anyone use :: :-)

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback




Re: C:: in rules

2005-05-12 Thread Patrick R. Michaud
On Thu, May 12, 2005 at 05:15:55PM -0400, Aaron Sherman wrote:
 On Thu, 2005-05-12 at 15:41, Patrick R. Michaud wrote:
  False.  In the first case the group is the whole rule.  In the second
  case the group would not include the (implied) '.*?' at the start of
  the rule.
 
 This was a very unfortunate choice of explanations, since an implied
 .*? would change the semantics of the match deeply. 

I agree, my wording on this wasn't all that clear--I haven't found
a good phrase for the stepping that takes place at the beginning
of an unanchored match.  And in earlier versions of PGE, the
stepping was actually performed by a '.*?' node at the beginning
of the expression tree that didn't participate in the captured
result.  

Anyway, we're in agreement as to what :: and ::: do, so I'll propose
changes to S05/A05 and we can go from there.  Thanks! :-)

Pm



Re: C:: in rules

2005-05-12 Thread Larry Wall
On Thu, May 12, 2005 at 09:33:37AM -0500, Patrick R. Michaud wrote:
: Also, A05 proposes incorrect alternatives to the above 
: 
: /[:w[]foo bar]/    # null pattern illegal, use <null>
: /[:w()foo bar]/    # null capture illegal, and probably undesirable
: /[:w\bfoo bar]/    # not exactly the same as above
: 
: I'd like to remove those from A05, or at least put an Update:
: note there that doesn't lead people astray.  One option not
: mentioned in A05 that we can add there is
: 
: /[:w<?null>foo bar]/
: 
: which is admittedly ugly.

I would just like to point out that you are misreading those.
The [] and () above are part of pair syntax, not rule syntax.
Likewise your :w<?null> should be taken to :w('?null').  We used to
try to distinguish modifiers like :w that don't take an argument,
but that's a bad plan.  All colon pairs parse alike wherever they
occur.  That's why we've required space before bracket delimiters
outside, but the same constraint holds inside rules.

Which means, of course, that we should probably try to figure
what :w($x) actually means...  :-)

Speaking of which, it seems to me that :p and :c should allow an
argument that says where to start relative to the current position.
In other words, :p means :p(0) and :c means :c(0).  I could also see
uses for :p(-1) and :p(+1).

We could also pass positions as opaque objects, which is another
reason not to consider positions as mere numbers.

Larry


Re: C:: in rules

2005-05-12 Thread Patrick R. Michaud
On Thu, May 12, 2005 at 08:56:39PM -0700, Larry Wall wrote:
 On Thu, May 12, 2005 at 09:33:37AM -0500, Patrick R. Michaud wrote:
 : Also, A05 proposes incorrect alternatives to the above 
 : 
 : /[:w[]foo bar]/    # null pattern illegal, use <null>
 : /[:w()foo bar]/    # null capture illegal, and probably undesirable
 : /[:w\bfoo bar]/    # not exactly the same as above
 : 
 
 I would just like to point out that you are misreading those.

Ouch, you're right!  I've been looking at patterns too long, I
guess -- thanks for the correction.  

 Speaking of which, it seems to me that :p and :c should allow an
 argument that says where to start relative to the current position.
 In other words, :p means :p(0) and :c means :c(0).  I could also see
 uses for :p(-1) and :p(+1).

Sounds good to me.

Pm