Re: Lexing requires execution (was Re: Will _anything_ be able to truly parse and understand perl?)

Matthew Walton Fri, 26 Nov 2004 05:55:43 -0800

Randal L. Schwartz wrote:

"Matthew" == Matthew Walton <[EMAIL PROTECTED]> writes:

Matthew> So you're saying that in Perl 6 it will be entirely impossible to
Matthew> determine if / appears as the division operator or as the beginning of
Matthew> a regex from a purely syntactic examination of the source code?

Yes.

Matthew> I'm finding that very, very hard to believe. Regexps aren't valid
Matthew> where /-the-operator is, after all.

And that's precisely why Perl can work as it does.  If an operator is
expected, / is divide.  If a term is expected, / is the beginning of a
regex.  This has been true since Perl1 (maybe 0).  There are a few
other characters that also work similarly, but / is the most frequent
and most troublesome.  And it got worse for Perl5, because of
user-defined prototypes, which as far as I can tell, are still present
in Perl6.
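
A minimal sketch of the term-versus-operator rule described above, written in Python for a toy token set rather than real Perl. The function name classify_slashes and its simplified character classes are assumptions made for illustration; nothing here is taken from perl's actual lexer.

```python
def classify_slashes(src):
    """Label each '/' in src as 'divide' or 'regex' by tracking whether
    a term or an operator is expected at that point (toy grammar only)."""
    results = []
    expect_term = True  # an expression starts where a term is expected
    i = 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch.isalnum() or ch == "_":
            # A number or bare word counts as a term here (an assumption).
            while i < len(src) and (src[i].isalnum() or src[i] == "_"):
                i += 1
            expect_term = False
        elif ch == "/":
            if expect_term:
                results.append("regex")
                end = src.find("/", i + 1)  # toy: ignores escaped slashes
                i = len(src) if end == -1 else end + 1
                expect_term = False  # the regex literal is itself a term
            else:
                results.append("divide")
                i += 1
                expect_term = True
        elif ch == ")":
            i += 1
            expect_term = False  # a closed group is a complete term
        else:
            # Any other punctuation (sigils, '=', '+', '(', ';') is treated
            # as something after which a term is expected.
            i += 1
            expect_term = True
    return results
```

With this sketch, classify_slashes("$x / 2") gives ["divide"] and classify_slashes("$y = /foo/") gives ["regex"]. It deliberately treats every bare word as a term, so it would get a slash wrong after a named function that takes arguments, which is the case that prototypes make hard.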

Perl 6 has formal parameters for subs, methods etc. I don't see any mention of Perl 5-style prototypes in S6, and I honestly can't see how they could possibly fit with formal parameters. Hopefully Larry or someone can clarify whether they still exist or not.

If they no longer exist, that eases the problem somewhat but, as I understand it, not entirely. Being able to call subs and methods without parentheses around the argument list still causes problems; a quick scan of the updated Synopses failed to turn up the rules for that in Perl 6.

Your impression is wrong.  In the presence of user-defined prototypes,
you *must* execute the code that might alter a prototype in order to
determine whether / is a divide (and therefore standalone token) or
the beginning of a regex (and therefore must locate the end of the
regex to properly be a token).

Since Perl 5-style prototypes don't appear to exist any more, this may be easier. I don't believe that the addition of the // operator compounds the problem any further, because by that point it should, hopefully, already be possible to tell that you are looking at an operator.
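
To make the prototype dependency concrete, here is a rough Python sketch under stated assumptions. The table declared_arity stands in for whatever the lexer knows about each sub's declared argument count, and the names slash_after_bare_sub, now and first_match are hypothetical, chosen only for illustration.

```python
def slash_after_bare_sub(name, declared_arity):
    """Classify a '/' that directly follows the bare sub name `name`.

    declared_arity maps sub names to the number of arguments each is
    declared to take (a stand-in for Perl 5 prototypes).
    """
    if name not in declared_arity:
        return "unknown"   # nothing is known about this sub yet
    if declared_arity[name] == 0:
        return "divide"    # the call is already a complete term
    return "regex"         # the sub still expects an argument, i.e. a term


# Hypothetical sub names, for illustration only.
declared_arity = {"now": 0, "first_match": 1}
```

With this table, a slash after now is classified as "divide" and a slash after first_match as "regex". Changing the entry for now to 1 flips its answer, so a lexer that can only learn the table's contents by running code cannot classify that slash from the source text alone.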

The Perlmonks article throws up a lot of very nasty cases. Not knowing the entire current language definition by heart, I can't say this with absolute certainty, but I retain the belief that Perl 6 is at least *easier* to deal with than Perl 5.

It is also possible that telling the difference between /-as-divide and /-as-regex becomes much easier if lookahead is employed in the tokeniser. Unfortunately, that makes the tokeniser much more complicated, and it's just a vague and random idea.
