[rust-dev] polymorphism

john skaller Mon, 16 May 2011 17:57:03 -0700

On 17/05/2011, at 2:54 AM, Graydon Hoare wrote:

BTW: I'm continuing on with this analysis because I think Rust
is really good. If it were just "crap" like Java I wouldn't bother.
This is a pretty long post because there are real examples in it
which show how hard it is to handle just one canonical family of multi-argument
functions, namely the tuple constructor.

The executive summary is: if it is this hard to handle just one family the 
handling
of a language where *all* functions are multi-argument is probably untenable
with very high level kinding.

> On 16/05/2011 8:13 AM, john skaller wrote:
> 
>> Sure just look at "binder1" and "binder2" in the C++ standard library and
>> wonder where "binder3" and "binder4" have gone. They haven't gone,
>> they're not defined. You need an infinite number of them. If you have only
>> one argument functions .. you only need one of them for completeness.
> 
> Here we're getting into (almost) a concrete argument. But I think you're 
> overdoing it.
> 
> Am I misunderstanding?

 I think you understand it, but you mightn't quite appreciate just
how bad it can get in a combinatorial situation.

It isn't just a matter of 1, 2, 3 ... oh, we need 5, well we can always write 
out
the 5 case. That's true, its rare, and the 5 case for one function can be coded
in a few (boring) minutes for a simple function. A bit longer for a more complex
function like say STL sort (which is polymophic). 

Before continuing a caveat: usually you would do the interfacing by using
an adaptor like binder2 etc. but in Rust that may not be possible without losing
the alias property .. if it were possible you wouldn't have the constraint that
you *need* multi-args to support aliases to support down-stack pointers
to support stack protocol memory management. I'm not sure about this.

But this is not the real problem; that comes when you have higher order stuff
and you're *combining* these N-ary things in more complex ways, and then
the 1.. N argument problem becomes a case not of N functions or even N *  M
function but more like 2^N functions, which is, of course untenable.

It's hard to illustrate because any illustration requires a massively large 
code example,
and these examples don't exist because .. well .. doing them is untenable :)

So I will actually give an illustration which is only lower order to give a feel
for it. As I mentioned I have this problem in Felix with tuples because they're 
the
canonical multi-arg function in ML and Felix. 

In Felix, I have Haskell style type classes (which, by the way, seem useful and
could be put into Rust). Here are some of them:

// equality
typeclass Eq[t] {
  virtual fun eq: t * t -> bool;
  virtual fun ne (x:t,y:t):bool => not (eq (x,y));
  axiom reflex(x:t): eq(x,x);
  axiom sym(x:t, y:t): eq(x,y) == eq(y,x);
  axiom trans(x:t, y:t, z:t): implies(eq(x,y) and eq(y,z), eq(x,z));
}

// total order
typeclass Tord[t]{
  inherit Eq[t];
  virtual fun lt: t * t -> bool;
  virtual fun gt(x:t,y:t):bool =>lt(y,x);
  virtual fun le(x:t,y:t):bool => not (gt(x,y));
  virtual fun ge(x:t,y:t):bool => not (lt(x,y));
  virtual fun max(x:t,y:t):t=> if lt(x,y) then y else x endif;
  virtual fun min(x:t,y:t):t => if lt(x,y) then x else y endif;
  axiom trans(x:t, y:t, z:t): implies(lt(x,y) and lt(y,z), lt(x,z));
  axiom antisym(x:t, y:t): lt(x,y) or lt(y,x) or eq(x,y);
  axiom reflex(x:t, y:t): implies(le(x,y) and le(y,x), eq(x,y));
}

// printable string rep of data type
typeclass Str [T] {
  virtual fun str: T -> string;
}

and now I'm going to bore you with the implementation for tuples.
Note that in Felix an array T ^ N is just a tuple with N components
where N is a type (NOT an integer but instead a sum of units)
and there's a conflict when you do things for arrays and tuples,
you get a duplicate function error if you define both in general,
so you have to resolve the conflict between 

        T * U and T ^ 2 with U set to T

by coding the overridding specialisation for T * T explicitly. So here
are the specialisations of just three functions for 2 to 5 component
tuples, that provide:

(a) equality of tuples
(b) lexicographic ordering of tuples
(c) stringisation of tuples to support things like:

println (1,2,3,4,5);

which is done by calling str function and implementing
print for just strings.

It took many hours to write out and check these things, because there
were some silly typos when I did it and the compiler error messages
were .. well .. less than helpful :)

///////////////////////////////////////////////

// two components
// string
instance[T,U] Str[T*U] {
   fun str (t:T, u:U) => "("+str t + ", " + str u+")";
}
instance[T] Str[T*T] {
   fun str (t1:T, t2:T) => "("+str t1 + ", " + str t2+")";
}
// equality
instance[t,u with Eq[t],Eq[u]] Eq[t*u] {
  fun eq: (t * u) * (t * u) -> bool =
  | (?x1,?y1),(?x2,?y2) => x1==x2 and y1 == y2
  ;
}
instance[t with Eq[t]] Eq[t*t] {
  fun eq: (t * t) * (t * t) -> bool =
  | (?x1,?y1),(?x2,?y2) => x1==x2 and y1 == y2
  ;
}
// total order
instance[t,u with Tord[t],Tord[u]] Tord[t*u] {
  fun lt: (t * u) * (t * u) -> bool =
  | (?x1,?y1),(?x2,?y2) => x1 <= x2 and y1 < y2
  ;
}
instance[t with Tord[t]] Tord[t*t] {
  fun lt: (t * t) * (t * t) -> bool =
  | (?x1,?y1),(?x2,?y2) => x1 <= x2 and y1 < y2
  ;
}
open[t,u with Tord[t], Tord[u]] Tord[t*u];

//---------------------------------------------------------------------------------
// three components
// string
instance[T,U,V] Str[T*U*V] {
  fun str (t:T, u:U, v:V) => "("+str t + ", " + str u + ", " + str v+")";
}
instance[T] Str[T*T*T] {
  fun str (t1:T, t2:T, t3:T) => "("+str t1 + ", " + str t2 + ", " + str t3+")";
}
// equality
instance[t,u,v with Eq[t], Eq[u], Eq[v]] Eq[t*u*v] {
  fun eq: (t * u * v) * (t * u * v) -> bool =
  | (?x1,?y1,?z1),(?x2,?y2,?z2) => x1==x2 and y1 == y2 and z1 == z2
  ;
}
instance[t with Eq[t]] Eq[t*t*t] {
  fun eq: (t * t * t) * (t * t * t) -> bool =
  | (?x1,?y1,?z1),(?x2,?y2,?z2) => x1==x2 and y1 == y2 and z1 == z2
  ;
}
// total order
instance[t,u,v with Tord[t],Tord[u],Tord[v]] Tord[t*u*v] {
  fun lt: (t * u * v) * (t * u * v) -> bool =
  | (?x1,?y1,?z1),(?x2,?y2,?z2) => x1 <= x2 and y1 <= y2 and z1 < z2
  ;
}
instance[t with Tord[t]] Tord[t*t*t] {
  fun lt: (t * t * t) * (t * t * t) -> bool =
  | (?x1,?y1,?z1),(?x2,?y2,?z2) => x1 <= x2 and y1 <= y2 and z1 < z2
  ;
}
open[t,u,v with Tord[t], Tord[u], Tord[v]] Tord[t*u*v];

//---------------------------------------------------------------------------------
// four components
// string
instance[T,U,V,W] Str[T*U*V*W] {
  fun str (t:T, u:U, v:V, w:W) => "("+str t + ", " + str u + ", " + str v + ", 
" + str w+")";
}
instance[T] Str[T*T*T*T] {
  fun str (t1:T, t2:T, t3:T, t4:T) => "("+str t1 + ", " + str t2 + ", " + str 
t3 + ", " + str t4+")";
}
// equality
instance[t,u,v,w with Eq[t], Eq[u], Eq[v], Eq[w]] Eq[t*u*v*w] {
  fun eq: (t * u * v * w) * (t * u * v * w) -> bool =
  | (?x1,?y1,?z1,?a1),(?x2,?y2,?z2,?a2) => x1==x2 and y1 == y2 and z1 == z2 and 
a1 == a2
  ;
}
instance[t with Eq[t]] Eq[t*t*t*t] {
  fun eq: (t * t * t * t) * (t * t * t * t) -> bool =
  | (?x1,?y1,?z1,?a1),(?x2,?y2,?z2,?a2) => x1==x2 and y1 == y2 and z1 == z2 and 
a1 == a2
  ;
}
// total order
instance[t,u,v,w with Tord[t],Tord[u],Tord[v],Tord[w]] Tord[t*u*v*w] {
  fun lt: (t * u * v *w) * (t * u * v * w) -> bool =
  | (?x1,?y1,?z1,?a1),(?x2,?y2,?z2,?a2) => x1 <= x2 and y1 <= y2 and z1 <= z2 
and a1 < a2
  ;
}
instance[t with Tord[t]] Tord[t*t*t*t] {
  fun lt: (t * t * t * t) * (t * t * t * t) -> bool =
  | (?x1,?y1,?z1,?a1),(?x2,?y2,?z2,?a2) => x1 <= x2 and y1 <= y2 and z1 <= z2 
and a1 < a2
  ;
}
open[t,u,v,w with Tord[t], Tord[u], Tord[v], Tord[w]] Tord[t*u*v*w];

//---------------------------------------------------------------------------------
// five components
// string
instance[T,U,V,W,X] Str[T*U*V*W*X] {
  fun str (t:T, u:U, v:V, w:W, x:X) => 
    "("+str t + ", " + str u + ", " + str v + ", " + str w+", "+ str x+")"
  ;
}
instance[T] Str[T*T*T*T*T] {
  fun str (t1:T, t2:T, t3:T, t4:T, t5:T) => 
    "("+str t1 + ", " + str t2 + ", " + str t3 + ", " + str t4+", "+str t5+")"
  ;
}
// equality
instance[t,u,v,w,x with Eq[t], Eq[u], Eq[v], Eq[w], Eq[x]] Eq[t*u*v*w*x] {
  fun eq: (t * u * v * w *x) * (t * u * v * w * x) -> bool =
  | (?x1,?y1,?z1,?a1,?b1),(?x2,?y2,?z2,?a2,?b2) => 
    x1==x2 and y1 == y2 and z1 == z2 and a1 == a2 and b1 == b2;
  ;
}
instance[t with Eq[t]] Eq[t*t*t*t*t] {
  fun eq: (t * t * t * t *t) * (t * t * t * t * t) -> bool =
  | (?x1,?y1,?z1,?a1,?b1),(?x2,?y2,?z2,?a2,?b2) => 
    x1==x2 and y1 == y2 and z1 == z2 and a1 == a2 and b1 == b2;
  ;
}
// total order
instance[t,u,v,w,x with Tord[t],Tord[u],Tord[v],Tord[w],Tord[x]] 
Tord[t*u*v*w*x] {
  fun lt: (t * u * v * w * x) * (t * u * v * w * x) -> bool =
  | (?x1,?y1,?z1,?a1,?b1),(?x2,?y2,?z2,?a2,?b2) => x1 <= x2 and y1 <= y2 and z1 
<= z2 and a1 <= a2 and b1 < b2
  ;
}
instance[t with Tord[t]] Tord[t*t*t*t*t] {
  fun lt: (t * t * t * t *t) * (t * t * t * t * t) -> bool =
  | (?x1,?y1,?z1,?a1,?b1),(?x2,?y2,?z2,?a2,?b2) => x1 <= x2 and y1 <= y2 and z1 
<= z2 and a1 <= a2 and b1 < b2
  ;
}
open[t,u,v,w,x with Tord[t], Tord[u], Tord[v], Tord[w], Tord[x]] 
Tord[t*u*v*w*x];

////////////////////////////

Now here is another module which also illustrates some problems:

//////////////

open module Categ 
{
  // note: in Felix, products are uniquely decomposable, but arrows
  // are not. So we cannot overload based on arrow factorisation.
  // for example, the curry functions can be overloaded but
  // the uncurry functions cannot be

  // Note: Felix is not powerful enough to generalise these
  // operation in user code, i.e. polyadic programming

  // change star into arrow
  fun curry[u,v,r] (f:u*v->r) => fun (x:u) (y:v) => f (x,y);
  fun curry[u,v,w,r] (f:u*v*w->r) => fun (x:u) (y:v) (z:w) => f (x,y,z);

  // change arrow into star
  fun uncurry2[u,v,r] (f:u->v->r) => fun (x:u,y:v) => f x y;
  fun uncurry3[u,v,w,r] (f:u->v->w->r) => fun (x:u,y:v,z:w) => f x y z;

  // argument order permutation
  fun twist[u,v,r] (f:u*v->r) => fun (x:v,y:u) => f (y,x);

  // projection
  fun proj1[u1,u2,r1,r2] (f:u1*u2->r1*r2) => 
    fun (x:u1*u2) => match f x with | ?a,_ => a endmatch;
  fun proj2[u1,u2,r1,r2] (f:u1*u2->r1*r2) => 
    fun (x:u1*u2) => match f x with | _,?b => b endmatch;

  // parallel composition
  fun ravel[u1,u2,r1,r2] (f1:u1->r1,f2:u2->r2) => fun (x1:u1,x2:u2) => f1 x1, 
f2 x2;

  // series composition
  fun compose[u,v,w] (f:v->w, g:u->v) => fun (x:u) => f (g x);
  fun rev_compose[u,v,w] (f:u->v, g:v->w) => fun (x:u) => g (f x);
}

//////////////////////

You can see in this module products causing problems again, needing
to write uncurry2 and uncurry3 out, same for many of these functions.

The pain here is that the Ocaml representation (in the compiler) of a
product is just a list, so it is trivial to implement these kinds of functions
*in the compiler* with recursion, maps, or folds. The operations are
categorical, meaning they're the building blocks of higher order operations:
they're fundamental to allow easy construction of powerful things, meaning,
to support code reuse by avoiding duplication of code.

So you see my pain: in Felix, I cannot handle tuples properly. I cannot provide
a way the USER can put things into the library I can do easily in the compiler,
but clearly I don't want to change the compiler for every new function that
should have gone into the library.

The point here is: this is "bleeding edge", this problem HAS to be solved to get
anywhere. The problem arises because tuples (and anonymous sums which
are their duals) are not associative so can't be trivially decomposed by using 
a fold and a binary function.

This may look esoteric, but it is impacts the "average programmer" and a large
fraction of the basic tutorial examples:

println$ "Hello World on ", gethostname();

Woops, printing a tuple. people *expect* that to work so I HAVE to support it.

The point of all this: I have a problem with ONE multi-argument function,
namely the tuple constructor, which by itself is so bad it threatens the whole
Felix project unless I can provide enough higher order polymorphism to fix the
problem of having to write out all the cases ...

in rust ALL the functions are multi-arg. So the problem occurs not just for
one function (after which the type system factors it away) but for all of them.

You can see why this looks like a major backward step in code reuse to me!
Just look at the pain for supporting just tuples! 

For interest, in Ocaml functors don't combine properly either.
Just to to make an Ocaml functor for a Ring (a set with addition
and multiplication) by composing it out of two functors on the
same data type which provide the addition group and the multiplication
semi-group. It can be done but it is really contorted, I had to get an expert
to show me how to do it.

Safety is all about reusability. As Graydon mentioned elsewhere the line between
statically assured correctness and run time checks for dynamics is fuzzy, 
meaning
you will *always* have to have some dynamics, and in those cases where you have
to actually test code to gain faith in the correctness of the code it is even
more important to have reusability because of the very high price of testing:
you don't want to have to test multiple implementations when you could have
had a single one.

In Rust the problem might go away if "aliases" were put into the type
system properly. Eg if you have a proper "weak pointer" type which could
be used anywhere. If that could be done you could get rid of multi-argument
functions by using tuples without losing the down stack pointer assurance.

However when I think about this I don't see how it can work, but I don't 
understand
Rust well enough. A simple "weak pointer" type probably doesn't work: if you 
have
a tuple containing weak pointer and copy it, the copy may persist if it is 
returned
from a function and this causes the stack frame containing the data pointed to 
by
the weak pointer to be popped off the stack.

What's actually needed is to take the weak pointer as a type and replace
the current artificial, non-type constraint (aliases can only be used in
function args) with a *type based* constraint, i.e. make the assurance
directly part of the type system. I do NOT know how to do this.

However .. if you provide curried functions clearly you can have multiple
arguments in the currified style and apply the "alias" property to those
arguments you want to (because those arguments are separate).
This is the same non-type constraint but now we have a standard type
system again. Again I don't know if this would even work (from a type
theoretic viewpoint) and I have no idea what the impact would be on
the memory model (so this is just an observation that there could be
many possible solutions).

Anyhow, we have probably talked enough about this: pragmatically
Rust is going to go ahead as it is, otherwise it will never get released
and that would be a great pity!

--
john skaller
[email protected]

_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

[rust-dev] polymorphism

Reply via email to