I'm not sure this idea is generally useful enough to warrant inclusion as a
language feature, but I want to describe it to get reactions. We actually
have *part* of this concept already embedded in BitC, but not all of it.

My personal use case derives from my ASTMaker tool. I don't remember any
more which little language prompted me to build astmaker. Originally I said
"oh, I'll just do the AST as a union type (in C)", but then I wanted a
consistency check to make sure that I had parent/child relations right. So
I started to build a tool that allowed me to describe the desired tree data
structure and emitted the corresponding C++ struct for me. The same tool
emits a recursive-descent sanity checking procedure that can be pointed at
any node and will determine whether the child notes have the right type tag
(the tool also emits a union of these type tags).

It turns out that the check cannot be expressed statically in any language
that I happen to know. To see why, consider that there are many places
where we want to express "constrained choice". So, for example, if we have
four-function arithmetic, we might want to say something like

AST mulexpr = expr expr

where expr can be any of the concrete AST types { mulexpr, divexpr,
addexpr, subexpr }.

But we can't express what we want using subtypes. That is, we can't model
this as

  mulexpr <: expr <: AST

this is because there are sometimes overlapping groupings. For example,
there might be places where we want to disallow "divexpr" but allow the
other three (it comes up in kind classes in habit).

I'm scratching my head for what to call this. We need something
pronounceable, but the best description might be to say that expr is a
"constrained union value". When "expr" appears as a type, we are saying
that the corresponding value is of type AST, but that the value is further
constrained in such a way that the leg type of the value must be one of
mulexpr, divexpr, addexpr, or subexpr. Knowing this is useful because it
absolves us from the requirement to build case statements that check leg
types that we know cannot be present.

expr can also be thought of as a subtype of AST - sort of. I say "sort of"
because two different constraints expr and notdivexpr might (intentionally)
overlap. This means, for example, that if we see a constructor of the form:

  mulexpr(e1, e2)

we can safely infer AST, but can't decide whether we are looking at an expr
or a notdivexpr.

It turns out that we already have something like the necessary inference
mechanism to deal with this, because we use it internally for the various
integer sizes. And in point of fact, "expr" has a lot of the flavor of
"int8", which is the subset of signed integers that fit in 8 bits. int8 and
int16 can be viewed as overlapping subunions. The main difference is that
we presently treat int8 as a constraint, where we'd sort of like to be able
to write int8 and expr as types.

We also have support in a different sense: a bitc union-case expression
allows legs to match more than one case. So, e.g., one could match mulexpr
and addexpr in a single union-case leg. What happens in this case is that
we construct an anonymous type that is the intersection of the leg types.
It contains only those fields whose type and position are identical in both
legs. This provides us with most of what we need to be able to dispatch
over subunions in a concise way in most of the practically useful cases.

Does this notion of subunions strike people as having enough utility to
warrant elevating it to a language feature? What other use-cases might
motivate it (if any)?


shap
_______________________________________________
bitc-dev mailing list
bitc-dev@coyotos.org
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to