On Sat, Apr 16, 2005 at 11:30:49AM -0700, Larry Wall wrote:
: The basic rule of thumb is that we pretend we're a top-down parser
: even if we aren't, and we only look for the trailing delimiter when
: we're not trying to parse something embedded that would naturally
: slurp up the trailing delimiter as part of the internal construct.
: Certainly any kind of bracketing structure hides anything inside it
: from the delimiter scanner, but so do tokens like identifiers.

I think I have to clarify what I mean by that last phrase.  Trailing
delimiters are hidden inside any token that has already been started,
but not at the start of a token (where token is taken to be fairly
restrictive).  Therefore these are errors:

    qq. $foo.bar() .
    qq: @foo::bar[] :

However

    qq/ &foobar( $a / $b ) /

is just fine, since (...) is looking for its own termination.
Basically we don't have to keep track of sets of terminators (unless
we want to use that info after a syntax error to make hypotheses and
explore alternate realities in the service of better error messages).
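
As a rough illustration of that point (a toy model in Python, not the actual
Perl 6 parser), the sketch below scans the body of a quote-like construct
for its trailing delimiter.  The names find_trailing_delimiter and
skip_bracketed are hypothetical, and the model assumes that every bracket
in the body opens an embedded construct.  Each bracket only looks for its
own closer, so no set of pending terminators is ever carried around.

```python
# Toy model: brackets hide the outer delimiter from the scanner.
# Assumption: every opener in the body starts an embedded construct.
BRACKETS = {"(": ")", "[": "]", "{": "}"}


def skip_bracketed(text, i):
    """text[i] is an opener; return the index just past its matching closer."""
    closer = BRACKETS[text[i]]
    i += 1
    while i < len(text):
        ch = text[i]
        if ch == closer:
            return i + 1
        if ch in BRACKETS:
            # A nested construct looks for its own closer, nothing else.
            i = skip_bracketed(text, i)
        else:
            i += 1
    raise SyntaxError("runaway bracketed construct")


def find_trailing_delimiter(text, delim):
    """Return the index of the first delimiter not hidden inside brackets."""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in BRACKETS:
            i = skip_bracketed(text, i)
        elif ch == delim:
            return i
        else:
            i += 1
    raise SyntaxError("runaway quoted string")


body = " &foobar( $a / $b ) /"   # everything after the opening "qq/"
end = find_trailing_delimiter(body, "/")
```

With this body, a naive search for the first "/" stops inside the
parentheses, while the scanner above stops at the final character.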

Given our plan of a hybrid parser with a bottom-up operator precedence
parser sandwiched between top-down parsers, and assuming that "."
is the tightest operator that the bottom-up expression parser treats as
an operator, it more or less comes down to the fact that anything the
expression parser pulls in as a single term is going to be treated
as a construct that ignores any outer delimiters because it's calling
out to a lower-level top-down parser at that point to parse the term
in question.

Hmm, I guess there's still a little ambiguity in there in the case of
lookahead.  And the fact is, a construct like

    qq. $foo.bar() .

either has to do some lookahead or some backtracking to determine that
the entire interpolated expression ends with a bracketed construct,
since we've said that

    " $foo.bar() "

interpolates $foo.bar(), while

    " $foo.bar "

interpolates only $foo.  (With similar constraints on array and hash
interpolation.)  So it's possible that

    qq. $foo.bar() .

could parse okay if we treat the () as a terminator that some grammatical
construct is looking ahead for.  But given that $foo is the one interpolator
that doesn't require trailing brackets, it seems like it's terribly
ambiguous in this case.  However, only dot has that problem, and with

    qq: @foo::bar[] :

you know it requires the [] to interpolate at all.  So I guess this is one
of those we can argue both ways.  The chance of someone writing

    qq:@foo::bar[]

when they mean

    qq:@foo: :bar[]

seems fairly remote.  So my best guess at this point is that we should
let the interpolative lookahead hide the trailing delimiter also, and
that is probably what the user expects in any event, since when they
were writing the expression, the nearby context is the preceding
term, but the distant context is the delimiter, which they've probably
just forgotten is potentially ambiguous.  So let's just resolve it
that way without telling them.
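
A minimal sketch of that resolution, again as a toy model in Python rather
than anything the real parser does: after a scalar, look ahead for
".name(" and, if it is there, pull the whole method call into the
interpolation so that its dot is never offered to the delimiter scanner.
The function name scalar_interpolation_end is hypothetical, and the model
assumes a single method call with plain parentheses.

```python
import re

# Assumption: simplified identifier syntax, one optional method call.
SCALAR = re.compile(r"\$[A-Za-z_]\w*")
METHOD_OPEN = re.compile(r"\.[A-Za-z_]\w*\(")


def scalar_interpolation_end(text, start):
    """Return the index just past the interpolated expression at start."""
    name = SCALAR.match(text, start)
    if name is None:
        raise ValueError("no scalar at this position")
    call = METHOD_OPEN.match(text, name.end())
    if call is None:
        # No trailing brackets: only the bare scalar interpolates, and a
        # following dot is left for the delimiter scanner.
        return name.end()
    # The lookahead saw ".name(", so the dot belongs to the interpolation.
    depth = 0
    for i in range(call.end() - 1, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    raise SyntaxError("runaway method call")


body = " $foo.bar() ."            # everything after the opening "qq."
end = scalar_interpolation_end(body, 1)
delimiter_at = body.index(".", end)
```

Here the interpolation covers "$foo.bar()" and the delimiter search resumes
after it, so the string closes at the final dot rather than at the dot
inside the method call.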

I guess this is the one place we're requiring arbitrarily long lookahead
to figure things out, since we interpolate

    " @foo::bar::baz::fee::fie::foe[] "

but not

    " @foo::bar::baz::fee::fie::foe "

under the current rules.  I think the lookahead doesn't have to parse
past the [ (or other opener), though.  All it has to decide is whether
the next : (or dot) is to be treated as part of the interpolation.  So
this is a syntax error (of the runaway "" variety, presumably):

    " @foo::bar::baz::fee::fie::foe[ "

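The array case can be sketched the same way (a toy Python model with
hypothetical names array_lookahead and interpolate_array; it assumes "[" is
the only opener and that subscripts do not nest).  The lookahead runs over
as many "::name" segments as there are, but stops at the opener.  Finding
the closer is left to the subscript parse, which is where the runaway case
fails.

```python
import re

# Assumption: simplified package-qualified array names.
QUALIFIED_ARRAY = re.compile(r"@[A-Za-z_]\w*(?:::[A-Za-z_]\w*)*")


def array_lookahead(text, start):
    """Return the index of '[' if the array at start interpolates, else None."""
    name = QUALIFIED_ARRAY.match(text, start)
    if name is None or name.end() >= len(text) or text[name.end()] != "[":
        return None
    # Deliberately does not look past the opener.
    return name.end()


def interpolate_array(text, start):
    """Return the interpolated source text, or None if nothing interpolates."""
    opener = array_lookahead(text, start)
    if opener is None:
        return None
    closer = text.find("]", opener)
    if closer == -1:
        # The lookahead already committed to interpolation, so this is an
        # error and not a fallback to literal text.
        raise SyntaxError("runaway subscript")
    return text[start:closer + 1]


with_brackets = interpolate_array(" @foo::bar::baz::fee::fie::foe[] ", 1)
without_brackets = interpolate_array(" @foo::bar::baz::fee::fie::foe ", 1)
```

In this model the unclosed "[" is reported by the subscript parse; a real
parser would presumably report it as a runaway string instead.
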
Larry
