Trey Harris wrote:
> On second reading, it occurs to me that this wouldn't work quite right,
> because the :w would imply a \s+ between <lt> and <identifier>, between
> the equals, and before the <gt>.
No. Under :w you get \s+ between literal sequences that are potential identifiers, and
\s* between anything else. So your:
> rule parsetag :w {
> <lt> $tagname := <identifier>
> %attrs := [ (<identifier>) =
> (<val>)
> ]*
> /?
> <gt>
> }
is really:
    rule parsetag :w {
        \s* <lt> \s* $tagname := <identifier>
        %attrs := [ \s* (<identifier>) \s* =
                    \s* (<val>)
                  ]*
        \s* /?
        \s* <gt>
    }
Which matches valid tags (and some invalid ones too, such as a tag with
whitespace between the < and the tag name).
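One way to see which strings the expanded rule accepts is to transcribe it into a Perl-5-style regex and try a few inputs. The sketch below does that in Python. The definitions of <identifier> and <val> are assumptions (the thread never gives them), <lt> and <gt> are taken to be the literal characters, and the sample tags are illustrative only.

```python
import re

# Assumed stand-ins for the subrules; the thread does not define them.
IDENT = r"[A-Za-z_]\w*"   # hypothetical <identifier>
VAL = r'"[^"]*"'          # hypothetical <val>: a double-quoted string

# Direct transcription of the expanded rule, one line per rule line.
PARSETAG_W = re.compile(
    rf"\s*<\s*(?P<tagname>{IDENT})"       # \s* <lt> \s* $tagname := <identifier>
    rf"(?:\s*({IDENT})\s*=\s*({VAL}))*"   # [ \s* (<identifier>) \s* = \s* (<val>) ]*
    rf"\s*/?"                             # \s* /?
    rf"\s*>"                              # \s* <gt>
)

for text in ['<a href="x">', '< a href = "x" / >', '<ahref="x">']:
    m = PARSETAG_W.fullmatch(text)
    print(repr(text), "->", m.group("tagname") if m else None)
```

The last two inputs are not well-formed tags, yet the pattern accepts them. The \s* after <lt> allows a space before the tag name, and the \s* before each attribute allows no space at all, so backtracking splits "ahref" into a tag name and an attribute.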
> Does an explicit space assertion in :w
> automatically suppress the implicit ones on either side?
Yes.
> I.e., would
>
> rule parsetag :w {
> <lt> \s* $tagname := <identifier>
> %attrs := [ (<identifier>) = <val> ]*
> \s* /?
> <gt>
> }
>
> Work? Or would I have to be explicit about everything:
To get the (lack-of-)spacing rules you probably desire, you'd have
to be explicit only where the default rules are inappropriate:
    rule parsetag {
        <lt>[$tagname:=<identifier>] \s+
        %attrs := [ (<identifier>)=(<val>) ]*
        /?<gt>
    }
> It strikes me that this is a problem crying out for a DWIMmy
> solution--something that could deal with whitespace in a common way, i.e.,
> required between tokens that can't otherwise be differentiated.... am I
> missing something?
Yes. You're missing:
        Another new modifier is :w, which causes an implicit match of
        whitespace wherever there's literal whitespace in a pattern. In
    --> other words, it replaces every sequence of actual whitespace in
    --> the pattern with a \s+ (between two identifiers) or a \s*
    --> (between anything else). So
            m:w/ foo bar \: ( baz )*/
                    ^
        really means (expressed in Perl 5 form):
            m:p5/\s*foo\s+bar\s*:(\s*baz\s*)*/
                       ^^^
        You can still control the handling of whitespace under :w,
        since we extend the rule to say that any explicit
        whitespace-matching token can't match whitespace implicitly on
        either side. So:
            m:w/ foo\ bar \h* \: (baz)*/
        really means (expressed in Perl 5 form):
            m:p5/\s*foo bar[\040\t\p{Zs}]*:\s*(baz)*/
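The two rules quoted above (replace each run of literal whitespace, but add nothing next to an explicit whitespace matcher) can be sketched as a small rewriting function. This is an illustration in Python, not how any Perl 6 implementation does it: expand_w, EXPLICIT_WS and QUANTIFIERS are hypothetical names, \h is approximated as [ \t] (leaving out \p{Zs}), and only single-character neighbours are inspected. It follows the prose literally, so it emits a \s* between \: and (, which the hand translation above omits. That extra \s* makes no difference to matches anchored only at the start of the string.

```python
import re

# Escapes that already match whitespace explicitly; under :w these
# suppress the implicit whitespace match on either side (assumed list).
EXPLICIT_WS = {r"\ ", r"\s", r"\h", r"\t", r"\n"}
QUANTIFIERS = {"*", "+", "?"}


def expand_w(pattern: str) -> str:
    """Rewrite a simplified :w pattern as an ordinary Python regex."""
    # Tokens: a backslash escape (including "\ "), a run of literal
    # whitespace, or any other single character.
    tokens = re.findall(r"\\.|\s+|\S", pattern)
    out = []
    for i, tok in enumerate(tokens):
        if not tok.isspace():
            # Python's re has no \h, so approximate horizontal whitespace.
            out.append(r"[ \t]" if tok == r"\h" else tok)
            continue
        # Left neighbour, skipping quantifiers so "\h* " still counts
        # as sitting next to an explicit whitespace matcher.
        j = i - 1
        while j >= 0 and tokens[j] in QUANTIFIERS:
            j -= 1
        prev = tokens[j] if j >= 0 else ""
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if prev in EXPLICIT_WS or nxt in EXPLICIT_WS:
            continue  # explicit matcher nearby: add nothing
        if re.fullmatch(r"\w", prev) and re.fullmatch(r"\w", nxt):
            out.append(r"\s+")  # between two identifiers
        else:
            out.append(r"\s*")  # between anything else
    return "".join(out)


first = expand_w(r" foo bar \: ( baz )*")
second = expand_w(r" foo\ bar \h* \: (baz)*")
print(first)   # \s*foo\s+bar\s*\:\s*(\s*baz\s*)*
print(second)  # \s*foo\ bar[ \t]*\:\s*(baz)*
```

With these, re.match(first, "foo bar: baz baz") succeeds while re.match(first, "foobar: baz") does not, because the whitespace between the two identifiers is required. Likewise re.match(second, "foo  bar:baz") fails, since the explicit "\ " matches exactly one space.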
Damian