On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
> On Sun, Nov 20, 2005 at 01:26:21AM +0100, Juerd wrote:
> : Ruud H.G. van Tol skribis 2005-11-20 1:19 (+0100):
> : > Maybe
> : > "\x{123a 123b 123c}"
> : > is a nice alternative of
> : > "\x{123a} \x{123b} \x{123c}".
>
> We already have, from A5, \x[0a;0d], so you can supposedly say
> "\x[123a;123b;123c]"

Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not
in S05, so I should probably add it, or whatever syntax we end up
adopting.

(BTW, we haven't announced it on p6l yet, but there's a new version of
S05 available.)

> [...]
> But I see that the semicolon is rather cluttery, mainly because it's
> too tall. I'm not sure going all the way to space is good, but we
> might have
> "\x[123a,123b,123c]"
> just to get a little visual space along with the separator.

Just to verify: with this syntax, would we expect
\x[123a,123b,123c]+
to be the same as
[\x123a \x123b \x123c]+
(the quantifier applies to the whole sequence) and not
"\x123a \x123b \x123c+" (the quantifier applies only to the
last character)?

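The two readings differ only in what the trailing + binds to. As a rough analogy in Python's re module (this is not Perl 6 rule syntax, and all names below are made up for illustration), a quantified non-capturing group stands in for the first reading and a bare trailing + for the second:

```python
import re

# The three code points from the example, written as Python escapes.
SEQ = "\u123a\u123b\u123c"

# Reading 1: the + applies to the whole three-character sequence,
# analogous to [\x123a \x123b \x123c]+ in the rule syntax.
whole_sequence = re.compile("(?:\u123a\u123b\u123c)+")

# Reading 2: the + applies only to the last character,
# analogous to \x123a \x123b \x123c+ in the rule syntax.
last_char_only = re.compile("\u123a\u123b\u123c+")

# Two inputs that tell the readings apart.
repeated_sequence = SEQ * 2              # the full sequence, twice
repeated_last = SEQ + "\u123c\u123c"     # the sequence, then the last char twice more


def matches_whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None
```

Under the first reading only the doubled sequence matches in full; under the second only the string with the repeated final character does.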
> It occurs to me that we didn't spec whether character classes ignore
> whitespace. They probably should, just so you can chunk things:
>
> / <[ a..z A..Z 0..9 _ ]> /
>
> Then the question arises about whether <[ \ ]> is an escaped space
> or a backslash, or illegal

I vote that it's an escaped space.  A backslash is nearly always \\
(or should be imho).

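For comparison only: Python's re.VERBOSE mode makes the opposite choice inside a character class (whitespace there stays significant), but it does read backslash-space as an escaped space. A minimal sketch, assuming Python's semantics rather than anything specified for Perl 6 rules, with illustrative variable names:

```python
import re

# In verbose mode, unescaped whitespace outside a class is ignored,
# so this pattern behaves like "ab".
ignored_outside = re.compile(r"a b", re.VERBOSE)

# Backslash-space is an escaped (literal) space; the unescaped
# whitespace around it is still ignored, so this behaves like "a b".
escaped_space = re.compile(r"a \  b", re.VERBOSE)

# Whitespace inside a character class remains significant even in
# verbose mode, so this class matches a space as well as a..z.
class_with_space = re.compile(r"[ a-z ]", re.VERBOSE)
```

So the "escaped space" reading of backslash-space has precedent elsewhere, while ignoring whitespace inside the class itself would be a departure from that precedent.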
> But if we make it match a backslash
> or illegal, then the minimal space matcher becomes \x20, I think,
> unless you graduate to \s. On the other hand, if we make it match
> a space, people aren't going to read that way unless they're pretty
> sophisticated...

There's also <sp>, unless someone redefines the <sp> subrule.
And in the general case that's a slightly more expensive mechanism
to get a space (it involves at least a subrule lookup).  Perhaps
we could also create a visible meta sequence for it, in the same
way that we have visible metas for \e, \f, \r, \t.  But I have
no idea what letter we might use there.

I don't think I like this, but perhaps C<< <> >> becomes <?null>
and C<< < > >> becomes <' '>?  Seems like not enough visual distinction
there...

Pm