Re: [bitc-dev] Arity inference at call (application) sites

Matt Oliveri Mon, 02 Mar 2015 13:45:48 -0800

On Mon, Mar 2, 2015 at 9:11 AM, Jonathan S. Shapiro <[email protected]> wrote:
> On Sun, Mar 1, 2015 at 9:58 AM, Matt Oliveri <[email protected]> wrote:
>>
>> On Sun, Mar 1, 2015 at 11:35 AM, Jonathan S. Shapiro <[email protected]>
>> wrote:
>> > Let's ignore arity for just a moment in order to ask this question more
>> > generally: if we are working with HM-style function types, can you nest
>> > *those* in the way that you ask? That is: is there any difference
>> > between
>> >
>> > fn 'a->'b->'c->'d->'e->'r
>> >
>> > and
>> >
>> > fn 'a->'b->'c->(fn 'd->'e->'r)
>> >
>> > I think the answer to this is obviously "no". Or alternatively, the
>> > answer
>> > to this is obviously "yes", and we should understand
>> >
>> > fn 'a->'b->'c->'d->'e->'r
>> >
>> > as a shorthand for:
>> >
>> > fn 'a->(fn 'b->(fn 'c->(fn 'd->(fn 'e->'r))))
>>
>> I think the answer is that "fn 'a->'b->'c->'d->'e->'r" is a syntax
>> error. HM function types always take one argument.
>
> And yet we very commonly see HM function types written as
>
>  'a->'b->'c->'d->'e->'r
>
> So ignore the 'fn' in what I wrote above, go back, and please answer my
> question. We cannot make progress without that starting point. Re-writing in
> something more recognizably HM, given the two types:
>
>  'a->'b->'c->'d->'e->'r
>  'a->'b->'c->('d->'e->'r)
>
> That is: that the parens here are solely for purposes of precedence
> breaking, and in this case they have no effect on the type and the two types
> are therefore identical? [Expected answer: they are the same.]


Yup. They are the same.

> Do we also agree that in a curried language, *all* arities are 1, and
> therefore we should understand the type
>
>  'a->'b->'c->'d->'e->'r
>
> as a shorthand for:
>
>  'a->('b->('c->('d->('e->'r))))

It sure is, courtesy of the parser. But you have to admit, adding back
the "fn" prevents the "shorthand" from being literally the same parse
tree as the longhand.

In other words, if anything,
fn 'a->'b->'c->'d->'e->'r
parses as
fn 'a->('b->('c->('d->('e->'r))))
but I thought we are not accepting ('e->'r) as a well-formed function
type under your proposal. So it doesn't actually parse that way.
There's only one function type here.

So adding back the "fn" is not entirely trivial. If you wanted
fn 'a->'b->'c->'d->'e->'r
to be shorthand for
fn 'a->(fn 'b->(fn 'c->(fn 'd->(fn 'e->'r))))
which, without arity abstraction, you would, then you have to
"ungroup" the function type, by constructing the nested function types
after parsing gives you the one function type with a list of argument
types.

Fine, that's perfectly doable. But _why_ did we add "fn"? It's only a
nuisance without arity abstraction, as I've just detailed. But with
arity abstraction, I figured "fn" marks the points at which we have
abstracted the arity. (Except once it concretizes, which is another
reason I don't like your notation.)

Whether it makes sense to ungroup the function types once arity
abstraction is in play is a completely separate question from whether
it makes sense without. Form follows function.

> That is: arguments are consumed one at a time until we run out of arguments,
> and any remaining portion of the function type survives as the return type
> of the returned value?

Why give such a complex explanation of "'a->'b->'c->'d->'e->'r"? It's
just some right-nested function types. In HM, there's technically no
such thing as a curried function. That's just an idiom. With arity
abstraction, we can abstract the difference between a "curried
function" and an actual multi-arg function, and that means we need to
start thinking about what a curried function really is. But only then.

>> > that is: the conventional arity notation is a convenience shorthand for
>> > the
>> > fully nested one. The only sense I can see in which this is *not* a good
>> > interpretation is that we don't know whether 'r will end up unifying
>> > with a
>> > function type, and our nesting may get deeper as a result. That seems a
>> > little odd for a convenience shorthand, but certainly not impossible.
>> > Merely
>> > odd.
>>
>> What is the "conventional arity notation"?
>> Is that "fn 'a->'b->'c->'d->'e->'r"?
>> Or "fn 'arity 'a->'b->'c->'d->'e->'r"?
>> Or something else?
>
> I'm very sorry. The word "arity" in that sentence should have read "arrow".
> So:
>
> The conventional _arrow_ notation is a convenience shorthand for the fully
> nested one. Does that make the proposition clearer?

Yes. But that kind of thing is below the radar for this discussion.
The parser sees them as nested. They are nested. The type system
doesn't need to address curried functions, as I explained above.

>> You might not follow the above because AFAIK you consider (fn 2
>> 'a->'b->'c) to be literally, syntactically the same as (fn 'arity
>> 'a->'b->'c), once you've substituted ('arity := 2)
>
> Well, no. Obviously I do not, because I keep saying that this substitution
> (at least in the general case) requires us to introduce a new arity
> variable, because 'c could still unify with a function type!

Oh yeah. We'll get back to arguing about that later.

>> How about, for the time being, we use (cfn 2 'a->'b->'c) and (afn
>> 'arity 'a->'b->'c)?
>
> Definitely not. If we are forced to that, then the arity discussion is over
> and BitC will not support currying at all. Let's work first to see if we can
> come to agreement about nesting, given my clarifications. Hmm. Maybe that's
> too strong. My intuition is that arrows need to be a single type, and that
> multiple arrow types are a bad idea.

afns would only have been an alternative notation. The collection of
concrete types would not semantically change.

> Take this in steps, Matt. The essential issue here is that you are
> struggling with the substitution rule that I have proposed, and I haven't
> answered your questions about it adequately yet. Don't jump the gun quite
> yet.

OK. From one of your replies below, I see that I was making a bad
assumption, and there doesn't need to be a cfn/afn distinction at this
point in the discussion. However, I don't agree that the essential
misunderstanding is limited to how to substitute into an arity
variable.

>> > So another way to look at this would be to write
>> > it as:
>> >
>> > f :: ('nargs >= 4) => fn 'nargs 'a->'b->'c-'d->'r
>>
>> Now you're getting weird. From your previous email, I thought the
>> point of ('arity < 5) is to prevent partial applications.
>
> Not at all! The ('arity < 5) notation means simply that there are 4
> arguments present, and therefore (a) the *first* concrete application may
> not consume more than 4 arguments nor less than 1, and (b) it's return type
> must be something that might account for the *rest* of the arguments (if any
> remain).

But that _does_ prevent partial applications, in this case! If 'r gets
instantiated before 'arity, we might forget the upper bound if we
hadn't made it explicit. At least that's the case if you allow
ungrouping.

> The ('nargs >=4) argument is coming from the opposite perspective, but on
> reflection I don't think that it works. What it was trying to capture is
> "since 4 arguments are present, f must be a composition of functions that
> will, when applied in sequence, consume 4 arguments without introduction of
> a hidden allocation". It could consume more because the return value of the
> function might be a function type.

As long as you're not going to use it, I don't have to ask about it anymore.

>> I don't
>> understand how ('nargs >= 4) prevents this. We don't need to say
>> anything additional to see that f can be applied to at least 4
>> arguments, because its arity-abstract type already lists 4 argument
>> types. We can safely apply more arguments than the native shallow
>> arity might turn out to be.
>
> But we *do* need a variable present to reflect the fact that the concrete
> specialization involves a particular sequence of concrete arities. I agree
> with you that an *abstract* arity variable (in either form) does not add new
> information w.r.t. the number of arguments required. It exists only for the
> purpose of ensuring that when two abstract-arity function types are unified,
> they unify in such a way that they end up with the same concretization at
> all points.

And also to let us impose upper bounds, maybe.

>> > with the intended interpretation that f is some [composition of]
>> > functions
>> > such that f takes at least four arguments, and may take more depending
>> > on
>> > how r unifies. Following the arity-free convenience syntax intuition
>> > above,
>> > this can be viewed as a shorthand for:
>> >
>> > f :: fn 1 'a -> fn 1 'b -> fn 1 'c -> fn 1 'd -> 'r
>>
>> Whoa, I didn't know that. I thought that type was arity-concrete (up
>> until you get an 'r, which may be an arity-abstract function).
>
> It is. What I'm writing here is how this would expand to a curry-style
> application, where all arities are concrete and all arities are 1.

But
('nargs >= 4) => fn 'nargs 'a->'b->'c-'d->'r
could concretize to
fn 4 int->int->int->int->bool
for example, and that's an incompatible concrete arity with
fn 1 int -> fn 1 int -> fn 1 int -> fn 1 int -> bool
assuming they are, in fact arity-concrete.

So you cannot say
('nargs >= 4) => fn 'nargs 'a->'b->'c-'d->'r
is shorthand for
fn 1 'a -> fn 1 'b -> fn 1 'c -> fn 1 'd -> 'r
because then they'd have to be equivalent types, but only the former
has the concrete-arity-4 instance.

>> > where 'r is completely free, an in particular may unify with fn 'nargs
>> > ...
>> > for any 'nargs, but simultaneously need not unify with a function type
>> > at
>> > all. And in *that* interpretation of things, we're saying that we are
>> > permitted [inductively] to combine smaller arities into higher arities
>> > and
>> > rewrite the application AST, such that (inductively)
>> >
>> >       fn 1 'a -> (fn 1 'b -> 'c)
>> >
>> > may be "composed" into
>> >
>> > fn 2 'a->'b->'c
>>
>> So how would you actually write a function type whose native shallow
>> arity is 2? I'm starting to think I actually have a quite-different
>> proposal from yours, and that we should start over, explaining our
>> proposals in more detail.
>
> We have used several different notations. In our current notation, I would
> write an arity-concrete function having arity 2 as:
>
> fn 2 'a->'b->'c
>
> which is what I wrote above.

Viewing that syntax as an arity-concrete function type seems at odds
with viewing "fn 'arity 'a->'b->'c->'d->'r" as shorthand. You _need_
to keep the argument types grouped, because you don't _know_ if it's
curried or not.

>> > which may in turn be combined further. So in this view we are starting
>> > with
>> > all arities equal to 1 (which is the curried view) and coercing the
>> > arities
>> > *upwards* within the type rule for /app/.
>> >
>> > This actually might be a more consistent and simpler way to think about
>> > things.
>>
>> But what drives specialization? How do you talk about the types of
>> defined functions, which are arity-concrete, and how they unify with
>> arity-abstract function types?
>
> Application drives it.

Never mind. That isn't a very informative answer, but I think I know
the answer if you still consider (fn 2 'a->'b->'c) to be
arity-concrete.

>> > I'm not sure whether this is helping or not,
>> > so I want to stop here to give
>> > you a chance to read and see if you can help refine either the question
>> > or my answer.
>>
>> It's helping by uncovering massive differences in our thinking.
>
> Well, I suppose that's progress.
>
>> > Haven't forgotten the other question, but I think that refining this one
>> > may
>> > possibly lead us to a different answer there, so let's tackle this one
>> > first.
>>
>> It seems you want afns to be nestable. But, unlike me, you think afns
>> can be "groupped" or "ungrouped" between nested afns. And you don't
>> seem to have cfns in your syntax at all. Yes, we probably should work
>> this out before (or at the same time as) we worry about how the arity
>> parameter of an afn is interpreted.
>
> It seems to me that we haven't come to a common understanding of whether fns
> of any sort are nestable, and that's where we need to start.

No, my latest thinking is that they should always be nestable.

We need to address whether they can be ungrouped freely, like you seem
to be doing with
('nargs >= 4) => fn 'nargs 'a->'b->'c-'d->'r
as shorthand for
fn 1 'a -> fn 1 'b -> fn 1 'c -> fn 1 'd -> 'r

To me, that's nonsense, and the nonsense is especially clear in this
case, since only the latter function type has concrete arity. In fact,
it was because this seems so obviously nonsensical to me that I
assumed that "fn 1 'a -> ..." was not intended to be arity-concrete,
so I introduced "cfn" and "afn" to disambiguate. But it seems that's
not what you meant.

This really needs an explanation. Perhaps you didn't really mean "shorthand".

> I think the
> afn/cfn distinction is very, very problematic.

It's ugly. I believe it's not problematic, but it's not actually
needed either, since I don't believe ungrouping is sensible.

>> For me, the grouping of nested afns, assuming it's allowed, reflects
>> the individual arity-abstract applications. We cannot regroup them
>> because that would change the requirements on the concrete instances.
>> Each arity-abstract application, individually, must not specialize to
>> involve an arity-concrete partial application. In other words, it must
>> specialize to one or more exactly-matching arity-concrete
>> applications.
>
> I'm with you up to this point, but then you write:
>
>> You cannot split an arity-concrete application between
>> two arity-abstract applications.
>
> But Matt, there *are* no arity-abstract applications. The only *real*
> applications are the arity-concrete applications!

But Shap, I said "arity-abstract applications" in the part just
before, that you understood! In case that was a fluke, by
arity-abstract application, I mean the syntactic form. All of the
programmer's applications are arity-abstract in your scheme. Obviously
there are no arity-abstract applications at runtime.

So to rephrase what I said, the types need to reflect that each
arity-abstract application that the programmer writes must,
individually, specialize to one or more arity-concrete applications.
For example, you cannot generally coalesce two separate arity-abstract
applications of 2 arguments into an arity-abstract application of 4
arguments in order to specialize it to a 4-application. In that case
there's no way to specialize the individual arity-abstract
applications. You have split a concrete 4-application between two
abstract 2-applications, in my earlier phrasing. (Oh and you can't
split a concrete application between _more than_ two abstract
applications either. :) )
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev

Re: [bitc-dev] Arity inference at call (application) sites

Reply via email to