OK. It turns out that I'm a dimwit. But y'all knew that, didn't you.

Suitably implemented combinator parsers can behave like GLR or GLL parsers.
They pursue all of the possible parses simultaneously and emerge with one
that produces a correct tree. In consequence, there is really no notion of
a shift/reduce or reduce/reduce conflict. There has been some good work on
implementing error diagnostics reasonably well in the combinator parser
world. The latest version of boost::spirit actually does a decent job of
this.

The remaining problem appears to be that combinator parsers remain somewhat
less efficient than conventional parsers, mainly because there is no
opportunity to precompile and optimize the grammar. Jeff Overby's work
manages to get within a factor of 5 (speed) and a factor of 1.5 (space)
relative to an *ad hoc* parser. Daniel Spiewak has shown that GLL parsers
can be implemented in combinator style, and that these run on O(n) time
when the input happens to satisfy LL(1), and in O(n^3) worst case, which is
encouraging. The other issue is realization of an aggressive lexer, though
I personally do not see that as a major problem.

If all of this holds, I'm still not sure that I want to bet on the
boost::spirit library, but the *notion* of a combinator parser for BitC
isn't inherently silly.

But in that case, there remains the question: should it be a goal (at all)
to keep the BitC grammar within some family that can be conventionally
implemented? Though I've been aware of GLR and PEG parsers for many years,
it has always been my sense that an LALR, LR(k) or LL(k) grammar is a
better thing to do in a programming language, mainly because these
approaches eliminate parse ambiguities very early and therefore admit a
very efficient implementation. I would have said that the ability to guard
against infinite recursion is also a factor in my thinking, but Spiewak's
work shows how that issue is resolved. It's actually a very nice trick,
similar in character to Thompson's RE machine.

The principal attractions of going to a combinator approach are that (a) it
eliminates an external tool (the parser generator), and (b) it exercises
the notion that implementing DSLs with local language extensions is
actually a good thing to do.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to