I've hit a mixfix-related snag, and it's inducing me to consider a fairly
major change in the mixfix model.

In Haskell, the precedence of type annotation is very low. This has the
consequence that type annotation lacks a certain symmetry:

  a:int + b => (a:int) + b
  a+b:int => (a+b):int
  f a + b:int => ((f a) + b):int
  a + f b :int => (a + (f b)):int

In C, where function call has higher precedence than cast, but almost
everything else has lower precedence. Thus:

  a + f(b) : int => a + ((f(b)):int)

My opinion is that this was a bad decision in Haskell. It works out because
(a) all mixfix in haskell is infix, and (b) the majority of operators return
the same type as their rightmost argument, with the result that the parse
does not surprise the human if they mistakenly expect tighter binding of
annotation. If Haskell instead defined arithmetic as C does (that is: using
three types), it would generate a wealth of surprises. For example, the
following would not type check:

  a:int32 + b:int16 => ((a:int32) + b):int16   ;; has no match in arith3
type class

For BitC, my intuition is that type annotation wants to occupy roughly the
same precedence position as C-style cast, which is to say that application
binds more tightly than type annotation.


One reason that mixfix works cleanly in ML/Haskell is that the "band" of
precedence covered by mixfix is contiguous. I don't know if this is true in
Scala. By "contiguous", I mean all non-mixfix parse interpretation happens
at strictly higher precedence than the mixfix layer or at strictly lower
precedence than the mixfix layer.

Unfortunately, if type annotation binds less tightly than function
application in BitC, this simplification is lost. Pragmatically, this has
several implications:

  1. Our mixfix engine has to have a limited notion of syntactic categories
for mixfix holes.
  2. There is no longer a clean phase separation between mixfix type
processing and
      mixfix expression processing. They remain syntactically distinct, but
the respective
      trees get constructed by a combined engine pass.

My inclination here is to take a step in the direction of Coq (and Scala?)
and introduce syntactic categories into the BitC mixfix subsystem. As a
first cut, this goes as follows:

1. A mixfix declaration now specifies a syntactic category.

   old:  mixfix 10  _+_
   new: mixfix expr 10 _+_

The recognized categories are "expr" and "type".

2. The general form of a mixfix "hole" marker is now "#cM_", where 'c' is an
optional category (one of 'e', 't', 'k', or 'b' for expresssion, type, kind,
or binding) 'M' is an optional modifier (usage not yet defined, but
anticipating T for thunk and possibly S for sequence and/or B for "implied
block"). If 'c' is missing, it is assumed to match the syntactic category of
the mixfix declaration. If both 'c' and 'M' are omitted, the '#' can also be
omitted. Thus:

  mixfix expr 10 _+_   declares infix plus
  mixfix expr 126 _:#t_  declares type annotation
  mixfix type 8 _+_     type-level sum (for nat type)

The immediate purpose of this is to let type annotation be handled in the
mixfix layer, but it also anticipates the introduction of nat kind later.


The implementation of this is not conceptually difficult, but it amplifies
my concern about error recovery. ':', in particular, would have been one of
the places where an error recovery production was a natural thing to do. Is
anyone aware of any good work on error recovery in mixfix?

I'm also aware that this approach is readily generalized to named
sub-syntaxes in the style of coq (though perhaps our implementation should
be less buggy than theirs). I'ld like to go one step at a time on that, and
my current thought is that changes in expression sub-syntax should always be
signalled by explicit parenthesization of some sort.

Reactions and thoughts appreciated, as always.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev

Reply via email to