On Sun, Nov 20, 2005 at 10:27:17AM -0600, Patrick R. Michaud wrote:
: On Sat, Nov 19, 2005 at 06:32:17PM -0800, Larry Wall wrote:
: > On Sun, Nov 20, 2005 at 01:26:21AM +0100, Juerd wrote:
: > : Ruud H.G. van Tol skribis 2005-11-20  1:19 (+0100):
: > : > Maybe 
: > : >     "\x{123a 123b 123c}" 
: > : > is a nice alternative to 
: > : >     "\x{123a} \x{123b} \x{123c}". 
: > 
: > We already have, from A5, \x[0a;0d], so you can supposedly say 
: >     "\x[123a;123b;123c]" 
: 
: Hmm, I hadn't caught that particular syntax in A05.  AFAIK it's not 
: in S05, so I should probably add it, or whatever syntax we end up 
: adopting.

Yes.

: (BTW, we haven't announced it on p6l yet, but there's a new version of
: S05 available.)

Indeed, there are new versions of most of the S's.  People who want the
latest should use svn.perl.org, which also makes it easy to do diff listings
with svn or svk.

: > [...]
: > But I see that the semicolon is rather cluttery, mainly because it's
: > too tall.  I'm not sure going all the way to space is good, but we
: > might have
: >     "\x[123a,123b,123c]" 
: > just to get a little visual space along with the separator.  
: 
: Just to verify, with this syntax would we expect
: 
:     \x[123a,123b,123c]+
: 
: to be the same as
: 
:     [\x123a \x123b \x123c]+
: 
: and not "\x123a \x123b \x123c+" ?

Yes.  I think the rule interpretation of \x is that it is a sequence to
be considered a single character regardless of its context.  Certainly
the square brackets we've mandated would tend to read as grouping anyway.
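
For anyone who wants to see the two readings side by side, here is a rough
sketch in Python rather than Perl 6, since the rule syntax is still in flux.
The names grouped and last_only are made up for illustration, and Python's
(?:...) stands in for our non-capturing [...]:

```python
import re

seq = "\u123a\u123b\u123c"

# Reading 1: the whole sequence is one quantified unit,
# like [\x123a \x123b \x123c]+
grouped = re.compile("(?:" + re.escape(seq) + ")+")

# Reading 2: only the last character is quantified,
# like \x123a \x123b \x123c+
last_only = re.compile(re.escape(seq[:2]) + re.escape(seq[2]) + "+")

repeated_sequence = seq * 2        # abcabc
repeated_last = seq + seq[2]       # abcc

grouped_hits = (grouped.fullmatch(repeated_sequence) is not None,
                grouped.fullmatch(repeated_last) is not None)
last_only_hits = (last_only.fullmatch(repeated_sequence) is not None,
                  last_only.fullmatch(repeated_last) is not None)
```

Under the first reading only the repeated sequence matches in full; under
the second only the string with the doubled final character does.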

Of course, the main point of the \x[a,b,c] notation is to allow
interpolation of sequences of hex characters into ordinary strings,
and those don't care about abstract character boundaries.
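
As a rough illustration of the string case, here is a minimal Python sketch
of expanding such an escape into plain characters.  The function name
expand_hex_list is hypothetical, and accepting both "," and ";" as separators
is an assumption made only because both spellings are on the table here:

```python
import re

# Matches \x[...] holding hex numbers separated by "," or ";".
HEX_LIST = re.compile(r"\\x\[([0-9A-Fa-f]+(?:[,;][0-9A-Fa-f]+)*)\]")

def expand_hex_list(text):
    # Replace each \x[a,b,c] escape with the characters it names.
    def replace(match):
        codes = re.split(r"[,;]", match.group(1))
        return "".join(chr(int(code, 16)) for code in codes)
    return HEX_LIST.sub(replace, text)

crlf_reversed = expand_hex_list(r"\x[0a;0d]")
three_chars = expand_hex_list(r"\x[123a,123b,123c]")
```

The result is just a string of three characters; nothing about the bracketed
form survives into it.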

: > It occurs to me that we didn't spec whether character classes ignore
: > whitespace.  They probably should, just so you can chunk things:
: > 
: >     / <[ a..z A..Z 0..9 _ ]> /
: > 
: > Then the question arises about whether <[ \ ]> is an escaped space
: > or a backslash, or illegal  
: 
: I vote that it's an escaped space.  A backslash is nearly always \\
: (or should be imho).
: 
: > But if we make it match a backslash
: > or illegal, then the minimal space matcher becomes \x20, I think,
: > unless you graduate to \s.  On the other hand, if we make it match
: > a space, people aren't going to read it that way unless they're pretty
: > sophisticated...
: 
: There's also <sp>, unless someone redefines the <sp> subrule.

But you can't use <sp> in a character class.  Well, that is, unless
you write it:

    <+[ a..z ]+<sp>>

or some such.  Maybe that's good enough.

: And in the general case that's a slightly more expensive mechanism 
: to get a space (it involves at least a subrule lookup).  Perhaps 
: we could also create a visible meta sequence for it, in the same 
: way that we have visible metas for \e, \f, \r, \t.  But I have 
: no idea what letter we might use there.

Something to be said for \_ in that regard.

: I don't think I like this, but perhaps  C<< <> >> becomes <?null> 
: and C<< < > >> becomes <' '>?  Seems like not enough visual distinction
: there...

<_> maybe.  I'm good with <> being <?null>, and <,> being element boundary
when matching lists.  But I'd like to reserve < > for delimiting what
is returned by $<>, the string officially matched:

    "foo bar baz" ~~ /:w foo < \w+ > baz/
    say $/;     # foo bar baz
    say $<>;    # bar

Or possibly

    "foo bar baz" ~~ /:w foo << \w+ >> baz/

but that should probably mean whatever

    "foo bar baz" ~~ /:w foo « \w+ » baz/

eventually means.  Which I haven't the foggiest.  But we should probably
reserve the brackets on general principle's sake, just because brackets
are so scarce.

I dunno.  If «...» in ordinary code does shell quoting, maybe «...» in
rules does filename globbing or some such.  I can see some issues with
anchoring semantics.  Makes more sense on a string as a whole, but maybe
can anchor on element boundaries if used on a list of filenames.
I suppose one could even go as far as

    rule jpeg :i « *.jp{e,}g »

or whatever the right glob syntax is.

Larry
