Please don't assume that rakudo's idiosyncrasies and design fossils
are canonical.  STD does better namespace management in some respects,
particularly in accepting the approved predeclaration form:

    class Foo {...}

(and rakudo might now accept this too).
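
With stubs, for instance, two classes can refer to each other without
any guessing on the parser's part.  Something like this (an untested
sketch):

    class Bar {...}                 # stub: Bar is now a declared noun
    class Foo { has Bar $.bar }     # so Bar may already be mentioned here
    class Bar { has Foo $.foo }     # and this completes the stubbed Bar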

You don't want to use augment for this, because augment is only for
*cheating*, and will eventually require a 'use MONKEY_TYPING;' before it.
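
The cheating form would look roughly like this (a sketch, using the
pragma spelling mentioned above):

    use MONKEY_TYPING;
    augment class Int {
        method doubled { self * 2 }   # poking a new method into a closed class
    }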

You are correct that the one-pass parsing is non-negotiable; this is
how humans think, even when dealing with unknown names.  However, human
language has ways of telling the listener whether a new word is a noun
or a verb, and we generally choose not to require articles or verb
markers in Perl 6, so class names (nouns) are officially ambiguous
with function names (verbs).  We could, as has been suggested before,
have an optional noun marker for undeclared classes, but this tends
to insidiously creep into lots of places, when a single predeclaration
would be clearer and more efficient.
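
To make the ambiguity concrete (sketch):

    Dog  $x;         # Dog undeclared: parsed as the listop call Dog($x)
    class Dog {...}
    my Dog $y;       # Dog is a noun now, so this is a typed declaration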

The other reason we'll stick with one-pass parsing is that you can
only do multiple pass parsing if you know exactly which language
you're parsing all the time.  This works against extensibility.
Multiple passes introduce linguistic race conditions any time you're
not sure what language you're in; see source filters for a good bad
example of why you don't want to do multi-pass parsing.

Because type names are nouns in Perl 6, we want to parse them like
other nouns, such as variables and constants, not like verbs.
Knowing whether something is a noun or a verb is very important in
maintaining the two-terms-in-a-row prohibition, which is the
primary way in which Perl 6 expressions are self-clocking.  Break
that clock and your error messages will turn to mush.  Some of
the smartest error messages that STD emits are actually the
two-terms message transmogrified into something more specific
by context.

Therefore, a word that is declared as a sigilless noun must change
the parsing from there on down so that we know to expect an infix
after it, not a term, which a verb functioning as a listop would
expect.

    constant Int pi = 3;
    pi 'foo';   # syntax error

(Notice how the fact that Int is predeclared also allows us to
correctly parse the constant declaration, since it lets us
distinguish the existing type from the name to be declared.)

Now that we have basic philosophy out of the way, on to other
arguments.  You might take the predeclaration policy as subtle
encouragement to use the late binding provided by method calls,
which don't require predeclaration.  But Perl 6 is not religiously
OO to the exclusion of FP, so that's a weak argument, and we're not
trying to discourage people from doing cross-module function calls.
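
That is, a method name never has to be in scope at compile time,
while a sub call is checked against the declarations we've seen
(roughly):

    $critter.bark;      # late-bound; 'bark' needn't be declared anywhere yet
    bark($critter);     # parsed as a sub call; the sub must be declared or
                        # imported somewhere in this compilation unit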

You make an analogy between type recursion and functional recursion,
which is true up to a point, but types and other unsigilled names
tend to be rarer than function names, so it makes sense, if we see
an unknown name, to assume (if we choose to assume) that it is a
postdeclared verb rather than a postdeclared noun.  We don't want to
break function recursion in order to support type recursion.  We can't
guess it both ways since we wouldn't know what to expect next.  Perl 5
did lookahead and tried to guess in some cases, but this generally
turned out to be a mistake.  It was difficult to document and only
inspired mistrust in the parser.  Plus it degrades the two-term rule.
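
Under that assumption, forward calls between functions just work,
with no stub required (sketch):

    sub even($n) { $n == 0 or  odd $n - 1 }   # 'odd' not seen yet: assumed
    sub odd($n)  { $n != 0 and even $n - 1 }  # to be a postdeclared verb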

But also note that there are several other ways to predeclare
types implicitly.  The 'use', 'require', and 'need' declarations
all introduce a module name that is assumed to be a type name.
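
For example (made-up module name):

    need Some::Module;          # loaded; Some::Module is now a known noun
    my Some::Module $thing;     # so this parses as a typed declaration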

We can also take any use of Foo::Bar::baz to mean that
there exists a Foo::Bar type.  STD does this already, though
perhaps only as a lexically scoped name at the moment.  We might
assume it to be (at least potentially) global.

A 'use' also predeclares any types in its own space and makes the
public ones available as predeclarations.  From the standpoint of a
typical midsized application that is spread across multiple files,
most of the predeclarations should arrive like this, and not require
a yadayada predeclaration.  And in Perl 6 you shouldn't generally
be using types without a 'use' of the definitions of those types,
or you might not import some desirable multi into your lexical scope.
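
In other words, something like this (hypothetical module):

    # Critters.pm6
    class Critters::Dog { has $.name }

    # elsewhere, the 'use' brings Critters::Dog along as a known type
    use Critters;
    my Critters::Dog $d .= new(name => 'Rex');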

Of course, we have to deal with 'use' cycles.  Perl 5 simply assumes
that it doesn't need to load a module that it knows is already being
loaded, even if that loading is not complete or would result in
something not being defined that ought to be.  We could take the same
agnostic approach and require the user to break use cycles explicitly
with yada.
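
Breaking such a cycle by hand might look like this (sketch):

    # A.pm6
    use B;
    class A { has B $.b }

    # B.pm6 -- rather than 'use A' (which would cycle), stub A yourself:
    class A {...}
    class B { has A $.a }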

Or we could detect that exact situation and relax the rules slightly
to do a bit of dwimmery on type names that *start* with the
partially compiled module.  In that case, it might be reasonable,
if we've got a suspended 'use Foo', to assume:

    Foo::Bar::baz       # likely a function
    Foo::Bar::Baz       # likely a class name

and maybe even allow an assumed function to transmute into a known
class if we guessed wrong, as long as it hasn't been used as a
listop with args.

Other heuristics are also possible.  We could, in this case,
perhaps relax the one-pass rule and take a preliminary grep
of the suspended module for things that look like type or function
declarations.  Perhaps the user would choose a heuristic by pragma.

But I also think that type recursion is likelier to indicate a design
error than function recursion, so I'm not sure how far down this road
we want to go.  We could, for instance, create a new type name every
time we see:

    has Unknown::Name $.foo;

That would eliminate most of the circularity problem for has-a
relationships, at the expense of not catching typos till run time.
Hard to know where to balance that.  But in general, given that
you ought to have used the appropriate definitions in the first
place, I tend to bias in favor of catching typos no later than
CHECK time.

Hope this helps, or I just wasted a lot of time.  :-)

Larry
