Re: Perl's parser and lexer will likely be in Perl (was Re: RFC 334 (v1) I'm {STILL} trying to understand this...)

Simon Cozens Tue, 17 Oct 2000 02:30:52 -0700

On Tue, Oct 17, 2000 at 03:56:20AM -0400, Adam Turoff wrote:
> > We could learn quite a bit by looking through the code from
> > Parse::RecDescent, switch.pm, and friends. Damian's done a lot of parsing
> > (including parsing Perl) with Perl, so this would be a good place to start.

It's time to drag out my quote of the week:

    Recursive-descent, or predictive, parsing ONLY works on grammars
    where the first terminal symbol of each subexpression provides
    enough information to choose which production to use.

(Appel, emphasis mine.)
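
To make the quote concrete, here is a minimal sketch of predictive parsing over a toy grammar. The grammar and every name in it (%production_for, parse_stmt, expect_token and so on) are invented for illustration; none of it comes from Parse::RecDescent or from perl itself. The point is only that parse_stmt picks a production by looking at the first token and nothing else.

```perl
use strict;
use warnings;

# Toy grammar, invented for this sketch:
#   stmt -> 'print' WORD ';'
#         | 'return' WORD ';'
# Each alternative begins with a different terminal, so the first
# token alone is enough to choose the production.
my %production_for = (
    'print'  => \&parse_print,
    'return' => \&parse_return,
);

# Consume one token and insist that it is the one we want.
sub expect_token {
    my ($tokens, $want) = @_;
    my $got = shift @$tokens;
    die "expected '$want'\n" unless defined $got && $got eq $want;
    return $got;
}

# Consume one identifier-like token.
sub parse_word {
    my ($tokens) = @_;
    my $got = shift @$tokens;
    die "expected a word\n"
        unless defined $got && $got =~ /\A[A-Za-z_]\w*\z/;
    return $got;
}

sub parse_print {
    my ($tokens) = @_;
    expect_token($tokens, 'print');
    my $word = parse_word($tokens);
    expect_token($tokens, ';');
    return ['print', $word];
}

sub parse_return {
    my ($tokens) = @_;
    expect_token($tokens, 'return');
    my $word = parse_word($tokens);
    expect_token($tokens, ';');
    return ['return', $word];
}

# The predictive step: peek at the first token, pick the production.
sub parse_stmt {
    my ($tokens) = @_;
    my $first   = $tokens->[0];
    my $handler = defined $first ? $production_for{$first} : undef;
    die "no production starts with this token\n" unless $handler;
    return $handler->($tokens);
}
```

Calling parse_stmt(['print', 'foo', ';']) returns ['print', 'foo']. As soon as two alternatives begin with the same terminal, the lookup table above no longer has a single answer, which is the restriction the quote is describing.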

> Gisle and I were talking about this tonight, and it *might* be possible
> to write the Perl tokenizer in a Perl[56] regex, which is more easily 
> parsable in C.  All of a sudden, toke.c is replaced by toke.re, which
> would be much more legible to this community (which is more of a strike
> against toke.c instead of a benefit of some toke.re).  That would certainly
> qualify as implementing the Perl grammar in Perl, and might even be
> achievable.   (*gasp!*)

This would have to take account of the fact that Perl's tokeniser is
aware of what's going on in the rest of perl. Consider

    print foo;

What should the tokeniser return for "foo"? Is it a bareword? Is it a
subroutine call? Is it a class? Is it - heaven forbid - a filehandle? 
Well, it could be any of these things. You have to choose.
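
A rough sketch of the problem, assuming a toy tokeniser: this is not how toke.c works, and the names tokenise, classify_bareword and the subs/handles/packages tables are all made up, as is the order in which they are consulted. The regex has no trouble splitting the input; what it cannot do on its own is decide what kind of token "foo" is. That answer comes from state held outside the tokeniser.

```perl
use strict;
use warnings;

# Hypothetical classifier: the lookup order (sub, then filehandle,
# then class) is an assumption of this sketch, not perl's real rule.
sub classify_bareword {
    my ($word, $state) = @_;
    return 'SUB_CALL'   if $state->{subs}{$word};
    return 'FILEHANDLE' if $state->{handles}{$word};
    return 'CLASS'      if $state->{packages}{$word};
    return 'BAREWORD';
}

# Split the source into [type, text] pairs. Only words and ';' are
# recognised; anything else simply stops the loop.
sub tokenise {
    my ($src, $state) = @_;
    my @tokens;
    while ($src =~ /\G\s*(?:([A-Za-z_]\w*)|(;))/gc) {
        if (defined $1) {
            my $word = $1;
            my $type = $word eq 'print'
                     ? 'KEYWORD'
                     : classify_bareword($word, $state);
            push @tokens, [$type, $word];
        }
        else {
            push @tokens, ['SEMI', $2];
        }
    }
    return \@tokens;
}

# Same source text, three different token streams.
for my $state ({ subs => { foo => 1 } },
               { handles => { foo => 1 } },
               {}) {
    print join(' ', map { $_->[0] } @{ tokenise('print foo;', $state) }), "\n";
}
# prints:
#   KEYWORD SUB_CALL SEMI
#   KEYWORD FILEHANDLE SEMI
#   KEYWORD BAREWORD SEMI
```

The source string never changes between the three runs; only the $state hash does. Any toke.re would need some equivalent of that hash fed in from the rest of the compiler.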

So, while I don't doubt that, with the state of Perl's regexes these
days, it's possible to create something with enough sentience to
tokenise Perl, I've really got to wonder whether it's sane.

-- 
BEWARE!  People acting under the influence of human nature.
