Pun intended.

It's clear from my most recent exchange with Matt that we actually *don't*
have a common understanding of how arity-concrete and arity-abstract
application interact. I think we need to start over completely, go way back
to basics, and work our way back up to something we can agree on. It's
clear to me that the mechanism I have in mind works, and that other
mechanisms may as well. I think what we are hung up on is notation and the
account of what that notation means. Let's start over. In fact, let's start by going all
the way back to HM.

Here are the statements that I think we can probably agree on:

In HM, we have the following facts:

1. The syntax

f a b c


is a convenience shorthand for

((f a) b) c


and the syntax

u->v->w->x


is a convenience shorthand for

u->(v->(w->x))

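The shorthand can be made concrete with a small Python sketch, using
one-argument lambdas to stand in for HM's curried functions (the function
f and its body are mine, purely for illustration):

```python
# Every HM function takes exactly one argument and returns either a value
# or another function; "f a b c" is sugar for "((f a) b) c".
f = lambda u: lambda v: lambda w: u + v * w   # type: u -> (v -> (w -> x))

assert f(1)(2)(3) == ((f(1))(2))(3)   # the two spellings are the same term
assert f(1)(2)(3) == 7
```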

2. An application must consume all actual parameters. This is ensured by
checking the arrow type. The arrow type must have **at least** as many
positions to the left of the arrow[s] as there are actual parameters. It
may have more; it may not have fewer.
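As a quick sanity check, this rule can be stated over a toy model in which
an arrow type is reduced to its count of parameter positions (a sketch of
mine, not HM's or BitC's formulation):

```python
# An arrow type u -> v -> w -> x has 3 positions to the left of arrows.
# An application is well-formed when the arrow offers at least as many
# positions as there are actual parameters.
def application_checks(n_actuals, n_positions):
    return n_positions >= n_actuals

assert application_checks(3, 3)       # f a b c against u -> v -> w -> x
assert application_checks(2, 3)       # fewer actuals: result has function type
assert not application_checks(4, 3)   # too many actuals: rejected
```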

3. An application may have *fewer* actual parameters than the arrow type
naively expects. We often refer to this as a partial application, but that
is a misnomer. Since all applications in HM take a single argument, there
really is no such thing as a partial application. There are complete
applications that return a value of function type.
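The same one-argument-lambda sketch (names mine) makes this visible: a
"partial application" is a complete application whose result happens to
have function type:

```python
add3 = lambda u: lambda v: lambda w: u + v + w   # u -> (v -> (w -> x))

g = add3(1)(2)     # a complete application; its value is a function
assert callable(g)
assert g(10) == 13
```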

4. Optimization of procedure application in HM is non-semantic, but it has
to preserve the rules that I have stated above. All arguments present must
be consumed. By the time of optimization, all type variables are fully
concretized, and when that is true we can justify rewriting the HM-style
arrow notation using the arity-concrete notation:

cfn u -> (cfn v -> (cfn w -> x))


Having done so, the optimizer is free to "combine" arities into larger
arities, provided (a) the total number of arguments consumed is not
altered, and (b) combining is only performed at positions where an actual
parameter is known to be present. What I mean by (b) is that given a
partial application

(f a) b


we have to look at our concrete type as:

cfn u -> (cfn v -> { cfn w -> x })


where the curly brackets indicate the "unapplied" portion of the type. That
part cannot be re-combined by the optimizer, or at least not without a lot
more information. That's the part that will get returned as a function
object that may (or may not) get applied elsewhere. Based on what we can
see in front of us at the application site, we don't know enough to
optimize the arity of that return value.

However, based on what we know at the application site, we *do* know that
two arguments are present to consume, and this justifies us in co-rewriting
the application and its type as

(f a b)
cfn u v -> { cfn w -> x }


respectively.
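Here is a small sketch of the combining rule in a toy model of my own (not
BitC notation), where a cfn chain is represented by its list of arity
groups, e.g. cfn u -> (cfn v -> (cfn w -> x)) is [1, 1, 1]:

```python
def combine(n_actuals, arities):
    """Merge the groups covering exactly n_actuals leading arguments into
    one group; the unapplied (bracketed) tail is left untouched."""
    if n_actuals == 0:
        return arities
    consumed, i = 0, 0
    while i < len(arities) and consumed < n_actuals:
        consumed += arities[i]
        i += 1
    if consumed != n_actuals:
        return None   # rule (b): cannot combine across a missing argument
    return [n_actuals] + arities[i:]

# (f a) b against cfn u -> (cfn v -> { cfn w -> x }):
assert combine(2, [1, 1, 1]) == [2, 1]    # cfn u v -> { cfn w -> x }
```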


So far so good. This much, I think, we probably all agree on.

I suspect we can also agree on the following two statements:

5. The additional restriction in BitC is that we disallow partial
applications (in the absence of explicit syntax). The consequence is that
the number of arguments consumed by the application (that is: up to, but
not including, the part of the type in the brackets) must exactly match the
number of actual parameters that are present.

6. Where two arrow types unify, the corresponding concrete function types
must unify. This is necessary to ensure (e.g.) assignment compatibility.
Unification of cfn types requires that arities at corresponding positions
must agree.
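As a sketch of that unification requirement (a toy representation of my
own: a cfn type as a list of arity groups plus an opaque result), positionwise
agreement is all that is demanded:

```python
def unify_cfn(t1, t2):
    """Unify two cfn types; arities at corresponding positions must agree."""
    (arities1, res1), (arities2, res2) = t1, t2
    if arities1 == arities2 and res1 == res2:
        return (arities1, res1)
    return None

assert unify_cfn(([2, 1], "x"), ([2, 1], "x")) == ([2, 1], "x")
assert unify_cfn(([2, 1], "x"), ([1, 1, 1], "x")) is None   # arity mismatch
```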

I have attempted to introduce an arity type variable to address this
requirement, and I have apparently created a fair bit of confusion. What I
think we *can* say about this is that *some* sort of unifiable thing
(presumably a variable) must exist to capture the notion that there can
only be a single concretization of a given type in a given specialization.
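One way to picture that "unifiable thing" is an ordinary unification
variable ranging over arities. The following is a minimal sketch of mine,
not a claim about BitC's actual mechanism:

```python
def unify_arity(a, b, subst):
    """Unify two arity terms (ints are concrete, strings are variables)
    under a substitution; return the extended substitution or None."""
    a, b = subst.get(a, a), subst.get(b, b)
    if a == b:
        return subst
    if isinstance(a, str):
        return {**subst, a: b}
    if isinstance(b, str):
        return {**subst, b: a}
    return None   # two distinct concrete arities cannot agree

s = unify_arity("k", 2, {})             # bind k := 2
assert s == {"k": 2}
assert unify_arity("k", 2, s) == s      # consistent with the binding
assert unify_arity("k", 3, s) is None   # one concretization per variable
```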



Up to this point in the discussion, I think that we are all fairly well in
agreement. The next step will be to figure out how to "lift" all of this
into the type system so that we can talk coherently about what happens when
arity is part of semantics. The two rules BitC introduces are:

B1) No hidden allocations
B2) No arguments based on "optimizer makes right"

My sense is that we are all in agreement about what can practically be done
in the presence of those two rules. What we seem to be struggling with is
how to express the specialization rule.


Before I go on, I want to confirm that we are actually in agreement about
the points above. PLEASE (just for now) restrict responses ONLY to the
points I have made above. A wandering discussion is very disruptive. By all
means ask other questions, or fork the thread to pursue side questions.
When you do so, CHANGE THE SUBJECT LINE. If I have missed an important
thing that we know, please bring it forward. We'll get to the lifting
discussion in the next step, but let us try to start from common ground
when we do.


shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
