Re: Regex stuff...
Piers Cawley wrote:

> If I replace C<< ($key, $val) >> with
>
>     @ary = m/<$pattern>/
>
> and the match succeeds, how many elements are there in @ary?

Zero. No explicit captures in that pattern.

> Suppose you want to use a hypothetical variable to bind a name to
> a capture:
>
>     / (\S+) { let $x := $1 } /
>
> A shorthand for that is:
>
>     / $x:=(\S+) /
>
> The parens are number independently of any name, so $x is an alias
> for $1.
>
> And it's that last sentence that's important here. So, it looks like
> C<< +@ary >> is going to be 4.

No. Only explicit paren captures add to the array attribute of the
match object. Implicit or explicit named captures add to the *hash*
attribute of the match object.

>     m: w / $2:=(\S+) = $1:=(\S+) /
>
> Note that, left to their own devices, those grouping parentheses would
> generate the $1 and $2 in the order given.

> Now, assignment to hypotheticals doesn't happen all at once, it
> happens when the matching engine reaches that point in the
> string. Unless I'm very much mistaken, the order of execution will
> look like:
>
>     $2:=$1; $1:=$2;
>
> And it seems to me that we'll end up with $1 and $2 both bound to the
> same string; which isn't working quite how I expect. Or do we special
> case the occasions when we bind the result of a capturing group to a
> numeric match variable?

That's my understanding. If you *explicitly* bind a captured group to
a numbered hypothetical, then the capture doesn't also implicitly bind
to a numbered hypothetical.

Damian
Re: Regex stuff...
Piers Cawley wrote:

> Unless I'm very much mistaken, the order of execution will
> look like:
>
>     $2:=$1; $1:=$2;

You're not binding $2:=$1. You're binding $2 to the first capture.
By default $1 is also bound to the first capture. Assuming that
numbered variables aren't special, the order of execution is:

    $2:=$1:=first; $1:=$2:=second;

That doesn't make any sense though, so numbered variables must be
treated specially -- an explicit numbered binding replaces the
default numbered binding. So, the order of execution is really:

    $2:=first; $1:=second;

I think this solves both of your puzzles.

One last thing though. Binding might be done at compile-time because
it changes variables, not the values of variables. Thinking about
binding as a compile-time declaration might be easier than thinking
about run-time execution order. Thinking about binding as a
compile-time thing, the rule

    / $2:=(\S+) = $1:=(\S+) /

becomes

    / [\S+] = [\S+] /

- Ken
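
For anyone who wants to poke at the two readings Ken contrasts, here is a rough Python sketch. It is only a toy model of the bookkeeping, not how any Perl 6 engine works: the function names `bind_naive` and `bind_with_override` and the `(explicit_number, text)` tuples are invented for illustration, and plain dictionary assignment stands in for binding.

```python
def bind_naive(captures):
    """Hypothetical model: every group sets its default $N *and* any
    explicit numbered variable it was bound to."""
    slots = {}
    for position, (explicit, text) in enumerate(captures, start=1):
        slots[position] = text        # default numbered binding
        if explicit is not None:
            slots[explicit] = text    # explicit binding on top of it
    return slots


def bind_with_override(captures):
    """Hypothetical model: an explicit numbered binding replaces the
    default numbered binding for that group."""
    slots = {}
    for position, (explicit, text) in enumerate(captures, start=1):
        target = explicit if explicit is not None else position
        slots[target] = text
    return slots


# / $2:=(\S+) = $1:=(\S+) / matched against "key = val":
# the first group is bound to $2, the second to $1.
captures = [(2, "key"), (1, "val")]

naive = bind_naive(captures)             # {1: "val", 2: "val"}
special = bind_with_override(captures)   # {2: "key", 1: "val"}
```

Under the naive model both numbers end up holding the second capture, which is the puzzle Piers raised; under the override model $2 holds the first capture and $1 the second.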
Re: Regex stuff...
On 31 Aug 2002 at 10:26, Piers Cawley wrote:

> > my $pattern = rx:w / $1:=(\S+) = $2:=(\S+) |
> >                      $2:=(\S+) = $1:=(\S+) /;
>
> Count the capturing groups. Looks like there's 4 of 'em to me. $1, $2,
> $3 and $4 are automatic variables which, according to the Apocalypse
> get set for every capturing group independent of any named variable to
> which they might also be bound.

Not if those capturing groups have been renumbered.

From A5:

> You can reorder paren groups by naming them with numeric variables:
>
>     / $2:=(.*?), \h* $1:=(.*) /
>
> If you use a numeric variable, the
> numeric variables will start renumbering from that point, so
> subsequent captures can be of a known number (which clobbers any
> previous association with that number). So for instance you can
> reset the numbers for each alternative:
>
>     / $1 := (.*?) (\:) (.*) { process $1, $2, $3 }
>     | $1 := (.*?) (=\>) (.*) { process $1, $2, $3 }
>     | $1 := (.*?) (-\>) (.*) { process $1, $2, $3 }
>     /

So binding to $1 etc is a special case. Your example never captures to
$1..$4 but only to $1,$2 according to the renumbering.

Note that it's actually called 'reordering/renumbering' instead of
'binding' in A5 for numeric variables.

--
Markus Laire 'malaire' <[EMAIL PROTECTED]>
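
The renumbering rule quoted from A5 can be modelled in a few lines of Python. This is a sketch of one reading of that paragraph, not a statement of what an implementation does: `number_captures` is a made-up helper, and each pattern is reduced to a list with one entry per capturing group, holding the explicit number or `None` for a plain paren group.

```python
def number_captures(groups):
    """Return the number each capturing group captures into.

    Assumed rule: an explicit $N resets the counter to N, and each
    following unnamed group takes the next number after that.
    """
    numbers = []
    current = 0
    for explicit in groups:
        current = explicit if explicit is not None else current + 1
        numbers.append(current)
    return numbers


# / $2:=(.*?), \h* $1:=(.*) /
reordered = number_captures([2, 1])

# The three-alternative example: $1 := (...) (...) (...) in each branch.
alternatives = number_captures([1, None, None] * 3)

# Piers's pattern: $1 = $2 in the first branch, $2 = $1 in the second.
piers = number_captures([1, 2, 2, 1])
distinct = sorted(set(piers))
```

With this reading, Piers's four groups only ever capture into two distinct numbers, which matches the point that the example captures to $1 and $2 rather than $1..$4.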
Re: Regex stuff...
On 31 Aug 2002 at 0:17, Piers Cawley wrote:

> my $pattern = rx:w / $1:=(\S+) = $2:=(\S+) |
>                      $2:=(\S+) = $1:=(\S+) /;
>
> @ary = m/<$pattern>/
>
> how many elements are there in @ary? I can
> make a case for 4 quite happily. Certainly that's what A5 seems to
> imply:
>
> Suppose you want to use a hypothetical variable to bind a name to
> a capture:
>
>     / (\S+) { let $x := $1 } /
>
> A shorthand for that is:
>
>     / $x:=(\S+) /
>
> The parens are number independently of any name, so $x is an alias
> for $1.
>
> And it's that last sentence that's important here. So, it looks like
> C<< +@ary >> is going to be 4.

How could it be 4?

If the example had been

> my $pattern = rx:w / $a:=(\S+) = $b:=(\S+) |
>                      $b:=(\S+) = $a:=(\S+) /;

then there would be 4 variables to speak of ($1, $2, $a, $b), and the
question would arise of which of these are returned.

In the original example, however, we only have 2 variables ($1, $2), so
it can't really return anything other than those 2.

>     m: w / $2:=(\S+) = $1:=(\S+) /
>
> Now, assignment to hypotheticals doesn't happen all at once, it
> happens when the matching engine reaches that point in the
> string. Unless I'm very much mistaken, the order of execution will
> look like:
>
>     $2:=$1; $1:=$2;
>
> And it seems to me that we'll end up with $1 and $2 both bound to the
> same string; which isn't working quite how I expect. Or do we special
> case the occasions when we bind the result of a capturing group to a
> numeric match variable?

As I understand it, binding to $1 etc. is a special case.

Also, I don't see any problem in your example:

    m: w / $2:=(\S+) = $1:=(\S+) /

The first () is captured and assigned to $2 (instead of $1).
Then the second () is captured and assigned to $1.

--
Markus Laire 'malaire' <[EMAIL PROTECTED]>