On 12 Jun 2015 08:57, "Matt Oliveri" <atma...@gmail.com> wrote: > > (I think you're replying to me.) > > On Fri, Jun 12, 2015 at 2:43 AM, Keean Schupke <ke...@fry-it.com> wrote: > > What do you mean by a typed AST? > > Sorry. I mean having multiple AST types, approximately one per CFG nonterminal. > > > My point was that in the AST a function node accepts N polymorphic > > sub-expressions (one for each argument). You cannot statically type the > > number of arguments or the types of each argument as the language you are > > compiling is not the same as the compilers host language, it requires > > runtime checks. > > Well then we're talking about different ways of typing the AST. I'm > just saying we use static types to keep track of which nonterminal a > node is an instance of. There is no polymorphism in the AST types > themselves, because CFGs don't have polymorphism. (Well, I used > polymorphism to factor out common fields, but this can be statically > unfolded.) > > We (you and I) really shouldn't try to discuss enforcing anything > stronger till we reach common ground on refinement types, which are > the smart way to do it. > > In case we're understanding Shap differently again, I think he's just > pointing that even this low level of enforcement causes trouble for > typing a shift-reduce parser. > > > So all nodes in my AST are subclasses of a generic AST node class, this is > > no problem as the nodes need to be polymorphic anyway. We don't know until > > runtime if a an abstraction will be in an application node etc (we do know a > > var node is a leaf node). > > OK, I think we're also running into the ambiguity of "polymorphic" > now. You're talking about subtype polymorphism, and a form (like > application) would be a subclass of the AST class. I was thinking of > ML, and there's no parametric polymorphism or subtyping, just a number > of cases for each type. > > > For an LR parser the stack is just a vector of > > pointers to AST nodes (the base class of all AST nodes). Tree waking is done > > using the visitor pattern, type-safely recovering the full type of each node > > without casting. > > This just sounds like using one AST type. > > > In a functional language I would use a disjoint union for the node type (as > > the tag in the disjoint union encodes the necessary runtime polymorphism). > > To try and use types to encode this, where you are pushing runtime > > polymorphism into the type system, and then to try and ignore that type > > using a heterogeneous list (lime a list with existential type) seems > > mistaken to me, when the fact is this is necessarily runtime polymorphism, > > and that is most simply and directly encoded as a disjoint union type. You > > gain nothing by trying to be tagless, as the type system will have to insert > > hidden tags to recover the runtime polymorphism. It seems a classic example > > of obfuscation by making things more complex than they needs to be, when the > > simple solution is the right one. > > Keean, the goal here is not to improve performance over the > one-big-tagged-union option. It is indeed clear that you can't do > better, if you want AST nodes to be useful. The goal is to know, > statically, that the parsed AST will always conform to the constrained > structure allowed by the original CFG.
Okay, I see. Can we always encode the CFG in the compiler host type-system? What if well-typedness depends on intersection or linear types, or some other sub-structural type system that is not expressible in the host type-system? Can we not rely on being well-typed by construction? So if the parser only outputs well typed fragments, that's enough? Somewhere we need a specification of the grammar, if you are using parser combinators we could say the combinators are the grammar specification, so what they output is always well typed. Keean.
_______________________________________________ bitc-dev mailing list bitc-dev@coyotos.org http://www.coyotos.org/mailman/listinfo/bitc-dev