Re: A rule by any other name...

Audrey Tang Tue, 09 May 2006 18:34:08 -0700

Allison Randal wrote:
> More importantly, whitespace skipping isn't a very significant option in
> grammars in general, so creating two keywords that distinguish between
> skipping and no skipping is linguistically infelicitous. It's like
> creating two different words for "shirts with horizontal stripes" and
> "shirts with vertical stripes". Sure, they're different, but the
> difference isn't particularly significant, so it's better expressed by a
> modifier on "shirt" than by a different word.


This is not only "space" skipping; as we discussed, <ws> skips over
comments as well as spaces, because a language (such as Perl 6) can
define its own <ws> that serves as a valid separator. To wit:

    void main () {}
    void/* this also works */main () {}

Or, in Perl 6:

    say time;
    say#( this also works )time;

> From a practical perspective, both the Perl 6 and Punie grammars have
> ended up using 'token' in many places (for things that aren't tokens),
> because :words isn't really the semantics you want for parsing computer
> languages. (Though it is quite useful for parsing natural language and
> other things.) What you want is comment skipping, which isn't the same
> as :words.

Currently it's defined, and used, the same as :words.

I think the confusion arises from <ws> being read as "whitespace"
instead of as "word separator".  Maybe an explicit <wordsep> can fix
that, or maybe rename it to something else, but the token/rule
distinction of :words is very useful, because it's more usual for
languages to behave like C and Perl 6, instead of:

    ex/* this calls exit */it();

which is rarer, and can be handled with separate "token" rules rather than <ws>.
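
To make the distinction concrete, here is a minimal sketch in Python rather than in Perl 6 rules. It assumes a simplified separator (whitespace or a C-style comment) standing in for <ws>. The names SEP, rule and token are hypothetical helpers, chosen only to mirror the keywords under discussion. The sketch also makes the separator mandatory between atoms, which is a simplification.

```python
import re

# Assumption: a simplified stand-in for <ws>, matching one whitespace
# character or one C-style comment. A real language defines its own.
SEP = r"(?:\s|/\*.*?\*/)"


def rule(*atoms):
    """'rule'-like matcher: atoms are joined by one or more separators."""
    return re.compile((SEP + "+").join(atoms), re.DOTALL)


def token(*atoms):
    """'token'-like matcher: atoms must be directly adjacent."""
    return re.compile("".join(atoms), re.DOTALL)


decl = rule(r"void", r"main")
name = token(r"ex", r"it")

# A comment separates words just as a space does.
print(bool(decl.fullmatch("void main")))                      # True
print(bool(decl.fullmatch("void/* this also works */main")))  # True
print(bool(decl.fullmatch("voidmain")))                       # False

# Inside a token no separator is skipped, so the comment breaks the match.
print(bool(name.fullmatch("exit")))                           # True
print(bool(name.fullmatch("ex/* this calls exit */it")))      # False
```

A language that does want the last case to match would have to allow comments inside the token pattern itself, instead of relying on the separator.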

Audrey
