So now that we contemplate a typed AST, we run into a new problem: how to
type the code emitted by the parser generator.

If the parse algorithm is any of the "top down" family of algorithms (LL,
PEG, GLL, others), there's no problem. Each production can be implemented
as a procedure, and each procedure returns its result AST as a return
value. The reason this can be typed using a conventional static type system
is that we are't using a separate parse stack.

But if the parse algorithm is one of the "bottom up" family of algorithms
(LR, GLR, others), the typing seems more complicated, because all of the
shift-reduce algorithms use a separate parse stack.

The reason a separate parse stack is a bother is that a given "position" in
the parse stack can have distinct types over time as the parse tree is
constructed; we cannot assign any simple static type to a position.
Off-hand, I can't even see how to type the parse stack using dependent
types. I'm sure a way exists, and I'm pretty sure one of the type system
wizards on the list will promptly trot it out and say "it's obvious", but
I'm a simple-minded old-school hacker at heart, and it's not jumping out at
me. Sure, you can use a union type as your stack slot, but it seems
unfortunate to do that when the parse state already tells you what you need
to know.

Thankfully, there are many reasons to prefer recursive-descent techniques
when they will solve your problem. Most notably error handling. But one of
the places where the shift-reduce approach is actually *better* is in IDEs.
In an IDE, you need to be able to deal with fragmented and/or incomplete
input. Think of it as parse subtrees interspersed with junk. Under those
conditions, the top-down approach simply doesn't work.

You could, of course, make the parse stack element a union type, but then
you'd end up doing a lot of run-time type checking that you don't need to
do (because the parse state already encodes that).

It dawned on me this morning that this is why most textbook presentations
of ASTs use a weakly typed AST structure: doing so preserves the ability to
use a separate parse stack.


It seems curious that when relative strengths of parsing strategies are
discussed, nobody ever talks about static typing. Or at least I've never
*seen* anyone do so.


shap
_______________________________________________
bitc-dev mailing list
bitc-dev@coyotos.org
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to