Re: [rust-dev] Interface / typeclass proposal

Niko Matsakis Sun, 27 Nov 2011 22:13:27 -0800

Hi Marijn,

Sorry for the late reply. Overall, I am a fan of this idea. In theprevious thread the idea of bounded polymorphism came up and it's beenrattling around in my head ever since: I'm glad to see it be sketchedout in more detail. To my mind, this need not conflict with theprevious proposals: basically it can be seen as a pure addition thatallows one to avoid constructing interface instances but still make useof dynamic dispatch.


Some comments:

- I don't see why every category must declare an associated interface.I like the idea of *allowing* categories to declare associatedinterfaces, as it will lead to better error reporting, but I think*requiring* it is too much. It implies that the main purpose of acategory is dynamic dispatch and I am not sure that will be the case.

- I also like the idea that an interface can be created after the factand make use of methods declared in categories or elsewhere that are notaware of the new interface. This is not possible in your proposalunless you declare a wrapper "instance." Maybe this is not a big dealin practice, or even a good thing, as you avoid the potential problem ofthe compiler assembling an interface out of various unrelated methodsthat happen to have the right names.

- I think the term instance ought to be reserved for runtime values:most languages that aren't Haskell use instance to refer to an instanceof a type, such as an object etc. I'm not sure why Haskell felt theneed to change that convention but I think it's confusing and there isno need for us to follow suit. Therefore, I will keep calling instancescategories :).

- I really like the idea of making copy, equality, and other such thingsinterface functions (not sure whether send will work or not). I am a bitconcerned about the definition of comparison and equality because theyreally want to have two receiver types. I guess that one could do:


  interface ord<T> {
    fn lt(other: T) -> bool;
  }

and now you can write a sort function:

  fn sort<S:ord<S>>(v: [S]) { ... }

assuming F-bounded polymorphism, where the bounds of the type variable Scan reference S. We have to make sure we get this right of course.

- It also raises some questions about static overloading. Assuming thatthere are multiple ord<> categories declared for the type of thevariable `x`, what happens when `x.lt(y)` is invoked? Under myproposal, I suppose, each category would have a unique name, and anexplicit category name would be required: `x.foo::lt(y)`. The moreJava-like solution would be to decide which `lt()` is invoked based onthe type of `y`, but this complicates type inference and is confusing toboot (static overloading vs dynamic dispatch).




Niko


On 11/25/11 8:06 AM, Marijn Haverbeke wrote:

Niko proposed categories [1] two weeks ago. I'm happy that we're
looking in this direction. Niko's proposal makes interfaces
structural. I'm going to argue that nominal interfaces have advantages
both in programmer ergonomics and in compilation-model complexity.

[1]:https://mail.mozilla.org/pipermail/rust-dev/2011-November/000941.html

I'll be sticking rather closely to Haskell's type class system, which
has proven itself in practice. If you aren't already enthusiastic
about Haskell's type classes, I recommend watching Simon Peyton Jones'
talk about them [2]. He goes over the way type classes can be
implemented, and shows a number of really cool applications.

[2]:http://channel9.msdn.com/posts/MDCC-TechTalk-Classes-Jim-but-not-as-we-know-them
(try to skip the first 3 minutes, they might spoil your appetite)

Context: I'd like to implement some minimum viable dynamic dispatch
system and get rid of our `obj` implementation before the first public
release. Something that can be extended later with classes and traits,
but for now just allows us to define vtables that accompany types.

To recap, Niko's categories look like this:

     category vec_seq([T]) {
       fn len() ->  uint { vec::len(self) }
       fn iter(f: block(T)) { for elt in self { f(elt); } }
     }

Having that, you can do

     import vec::vec_seq;
     assert [1].len() == 1u;
     [1, 2, 3].iter({|i| log i; })

Which is statically resolved. Dynamic dispatch (if I understand
correctly), would look like this:

     type iterable<T>  = iface { fn iter(block(T)); };
     fn append_to_vec<copy T>(x: iterable<T>, y: itable<T>) ->  [T] {
         let r = [];
         x.iter {|e| r += [e];}
         y.iter {|e| r += [e];}
         r
     }
     // Assuming there exists a category for lists that implements iter
     append_to_vec([1, 2, 3] as iterable, list::cons(4, list::nil) as iterable)

That causes the compiler to create two vtables, both containing an
`iter` method, and wrap the arguments in {vtable, @value} structures
when they are cast to `iterable` (they'll probably have to be boxed to
make sure the size of such a value is uniform, and cleanup is
straightforward).

Alternatively, my proposal looks like this: interfaces could be fixed
groups of methods, that are always implemented all at once.

     // Define an interface called `seq`
     interface seq<T>  {
         fn len() ->  uint;
         fn iter(f: block(T));
     }
     // Declare [T] to be an instance of seq
     instance [T] seq<T>  {
         fn len() ->  uint { vec::len(self) }
         fn iter(f: block(T)) { for elt in self { f(elt); } }
     }

The static way to use this would look the same as above. If you've
imported `vec::seq` (std::vec's implementation of seq), you can simply
say [1].len(). If there is any instance in scope that applies to type
[int] and has a method `len`, that instance's implementation is
called. If multiple interfaces are found, the one in the closes scope
is chosen. If they are in the same scope, or no interface is found,
you get a compiler error.

Dynamic dispatch works differently.

     // Declare T to be an instance of the seq interface
     fn total_len<T:seq>(seqs: [T]) ->  uint {
         let cnt = 0u;
         for s in seqs { count += s.len(); }
         count
     }

In this proposal, the seq vtable is not something that get attached to
the value by casting it to an interface, but rather acts as an
implicit parameter to the function. The cool thing is that we already
have these implicit parameters -- they map very closely to our type
descriptors, which we are implicitly passing to generics. What would
happen, for such a call, is that the compiler notices that the type
parameter has an interface bound, so that instead of passing just a
tydesc, it passes both a tydesc and the `seq` vtable that belongs to
that type. Inside the function, `s` is known to be of type `T:seq`, so
the `s.len` call looks up the `len` method in the vtable passed for
type parameter T.

You can also require type parameter to conform to multiple interfaces,
as in `fn foo<T: seq, print>(...)` -- that requires passing multiple
vtables. (Niko: this is the thing you asked about. Turns out it's not
hard to do.)

It should be noted that this has both advantages and disadvantages
compared to the 'wrap by casting to interface approach'. For one
thing, it doesn't allow this

     fn draw_all<T: drawable>(shapes: [T]) { for s in shapes { s.draw(); } }

.. or at least, it doesn't do what you want, because it requires all
arguments to be of the same type, and only passes a single vtable. An
extension can be implemented (at a later time), to support this
instead:

     fn draw_all(shapes: [drawable]) { ... }
     draw_all([my_circle as drawable, my_rectangle as drawable]);

The drawable interface, when used as a type instead of a type
parameter bound, would denote a boxed value paired with a vtable, just
like in Niko's proposal.

And the good part: In the case where the interface is used as a type
parameter bound, which should cover most use cases, things do not have
to be boxed to be handled generically, and the content of regular data
structures (such as `[int]`) can be approached generically. This is
fast, and it allows type classes to applied all over the place...

Everything that's currently an obj could become an instance. We'd get
static, super-fast dispatch when using them monomorphically, and be
able to decide on our own representation (rather than being locked
into a boxed representation, as objs are) for the values. Being able
to define methods on built-in types means that many things wouldn't
require defining a new representation at all.

The operations that are currently magically implemented by the
compiler and runtime, such as comparing and logging, could be methods
on interfaces (see Haskell's Eq, Cmp, and Show type classes). That'd
make them overridable and extendable.

With a Sufficiently Smart Inliner, we could even do arithmetic with
methods, and get operator overloading for free (see Haskell's Num type
class), and allow things like generic implementations of sum, average,
and similar numeric operations over sequences of `T: num` type.

Our type parameter bounds, `copy` and `send`, could become interfaces,
with suitable implementations in the standard library. Copying of
generic types would be forbidden, but `copy` would be a method of the
`copy` typeclass, so you could say `copy(elt)` if you declared your
type parameter with a `: copy` bound.

Implementing a system like this in its simple form is not terribly
hard, especially since we already have tydescs implemented. Making it
as powerful as Haskell's requires some extensions (most importantly:
default implementations of methods, and interfaces that are
parameterized with other interfaces), but those can be tackled
piecemeal. There's even a credible path towards multiple dispatch
(methods dispatching on the type of more than one argument), though it
requires a more complicated interface-dispatching mechanism.


I'd like to spend next week implementing this. Comments, additions,
and violently disagreeing flames are welcome.

Cheers,
Marijn
_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev



_______________________________________________
Rust-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/rust-dev

Re: [rust-dev] Interface / typeclass proposal

Reply via email to