RFC 256 (v2) Objects : Native support for multimethods

Perl6 RFC Librarian Mon, 25 Sep 2000 11:28:01 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Objects : Native support for multimethods

=head1 VERSION

  Maintainer: Damian Conway <[EMAIL PROTECTED]>
  Date: 18 September 2000
  Last Modified: 25 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 256
  Version: 2
  Status: Frozen

=head1 ABSTRACT

This RFC proposes that Perl natively support multiply dispatched
subroutines (multimethods).

=head1 DESCRIPTION

Multimethods solve a class of problems in which objects of two or more
hierarchies must interact polymorphically, but where the nature of the
interaction is determined by the run-time type of I<both> (I<all>) the
objects.

=head2 Theoretical model

It is proposed that calls to certain (explicitly marked) subroutines be
dispatched using a different mechanism from that used for regular Perl 5
subroutines, or that used for Perl 5 methods.

These explicitly marked subroutines would share the same name, but
provide unique parameter lists. Collectively all marked subroutines 
with the same name but different parameters lists would be known
as a I<multimethod>, with each individual subroutine known as a 
I<variant> of the multimethod.

The new dispatch mechanism would look at the classes or types of each
argument to the subroutine (by calling C<ref> on each) and determine
the "closest" matching variant of the multimethod, according to the
parameter types specified in the variants' parameter list.

The result would subsume the behaviour of function overloading (e.g. in
C++) but in a more sophisticated manner, since multimethods take the
run-time inheritance relationships of each argument into account.
Another way of thinking of the mechanism is that it would perform
polymorphic dispatch on every argument of a multimethod, not just on
the first.

=head2 Defining multimethods

A single variant of a multimethod would be defined by specifying a 
subroutine with a full parameter list (as per RFC 128), and
appending the attribute C<:multi>. Two or more such subroutines
could be defined with the same name, in the same package namespace, provide
they were both defined with the C<:multi> attribute, and their parameter
lists were not identical.

For example:

        package LargeNum;
        package LargeInt;    use base 'LargeNum';
        package LargeFloat;  use base 'LargeNum';

        package LargeMath;

        sub divide(LargeInt $a, LargeInt $b) : multi {
                ...
        }

        sub divide(LargeInt $a, LargeFloat $b) : multi {
                ...
        }

This creates a (single) multimethod called C<LargeMath::divide> with two
variants. If the multimethod is called with two references to LargeInt
objects as arguments:

        $x = LargeInt->new(1234567890);
        $y = LargeInt->new(9876543210);
        $quotient = divide($x, $y);

then the first variant is invoked. If the multimethod
is called with a LargeInt reference and a LargeFloat reference:

        $x = LargeInt->new(1234567890);
        $y = LargeFloat->new(9876543210.12345);
        $quotient = divide($x, $y);

then the second variant is called.

Calling the multimethod with any other combination of LargeNum-derived
reference arguments (e.g. a reference to a LargeFloat and a reference
to a LargeInt, or two LargeFloat references) results in an exception
being thrown, with the message: 

        No viable candidate for call to multimethod divide

To avoid this, a "catch-all" variant could be specified:

        sub divide(LargeNum $b, LargeNum $b) : multi {
                ...
        }

Now, calling C<divide> with (for example) a LargeFloat
reference and a LargeInt reference causes this third variant to be
invoked. That's because a LargeFloat I<is-a> LargeNum (so the first
argument is compatible with the first parameter), and a LargeInt I<is-a>
LargeNum too (so the second argument is compatible with the second
parameter). Note that adding this third variant doesn't affect calls
to the other two, since multiple dispatch always selects the nearest
match.


=head2 Finding the "nearest" multimethod

The usefulness of multiple dispatch depends on how intelligently the
dispatcher decides which variant of a multimethod is "nearest" to a
given set of arguments. That decision process is called I<dispatch
resolution>, and it is proposed that it be done (or I<appear> to be done,
modulo optimization) like this:

=over 4

=item 1.

If the types of the arguments given (as determined by applying C<ref>
to each in turn) exactly match the set of parameter types
specified in any variant of the multimethod, that variant is
immediately called.

=item 2.

Otherwise, the dispatch mechanism compiles a list of viable targets.
A viable target is a variant of the multimethod with the same
number of parameters as there were arguments passed. In addition,
for each parameter the specified parameter type must be a base
class of the actual type of the corresponding argument in the
actual call (i.e. each argument must belong to a subclass of the
corresponding parameter type).

=item 3.

If there is only one viable target, it is called immediately. If
there are no viable targets, an exception is thrown indicating
that the multimethod can't handle the specific set of arguments.
(but see L<Handling dispatch failure> below).

=item 4.

Otherwise, the dispatch mechanism examines each viable target and
computes its inheritance distance from the actual set of
arguments. The inheritance distance from a single argument to the
corresponding parameter is the number of inheritance steps between
their respective classes (working up the tree from argument to
parameter). If there's no inheritance path between them, the
distance is infinite. The inheritance distance for a set of
arguments is just the sum of their individual inheritance
distances.

=item 5.

The dispatch mechanism then chooses the viable target with the
smallest inheritance distance as the actual target. If more than
one viable target has the same smallest distance, the call is
ambiguous. In that case, the dispatch process fails and an
exception is thrown (but see L<Handling dispatch failure> below)
If there's only a single actual target, its identity is cached
(to accelerate subsequent dispatching), and then the actual target is
invoked.

=back


=head2 Handling built-in types

Because the multiple dispatch mechanism applies C<ref> to determine
the type of each argument, it is indifferent to whether those
arguments are references to blessed objects, or just regular Perl data
types. That means that multimethod variants can also be defined to process
built-in data types:

        sub stringify(ARRAY $a) : multi {
                my @arg = map { stringify $_ } @a;
                return  "[" . join(", ",@a) . "]";
        }

        sub stringify(HASH $h) : multi {
                my %arg = map { stringify $_ } %h;
                return  "{" . join(", ", map( "$_=>$h{$_}", keys %h) ) . "}";
        }

        sub stringify(CODE $c) : multi {
                return "sub {???}";
        }

        sub stringify(Regexp $r) : multi {
                return "qr/$r/";
        }

        sub stringify(UNIVERSAL $r) : multi {   # *any* type of object
                return $obj->stringify()
                        if $obj->can('stringify');
                return "???";
        }

        sub stringify($scalar) : multi {        # catch-all if ref() eq ""
                return qq{'$scalar'};
        }


        # and later...

        print stringify([1,2,3]);
        print stringify({a=>1,b=>[1,2,3],c=>MyClass->new(),d=>qr/#.*/});
        print stringify(42);

Notes:

=over 4

=item *
In the above example, 
C<stringify> is used recursively in processing the contents of
arrays and hashes. This ensures that nested components (as in the
second of the "and later..." examples) are correctly stringified.
This indicates some of the polymorphic power of multimethods, in that the
single C<map>-ped call to C<stringify> suffices to correctly stringify
every array element and hash entry, no matter what type that nested
element or entry is.

=item *

The parameter type C<UNIVERSAL> provides a means of writing a single
variant that captures references to I<any> blessed object, since every
blessed object automatically inherits from C<UNIVERSAL>.

=item *

A variant whose argument is untyped (i.e. the last variant in the above
example) represents a "catch-all" case, which in this particular
case would receive all calls to C<stringify> which had non-reference
scalar arguments. If the C<stringify(UNIVERSAL)> variant did not exist,
the untyped variant would also receive all calls with blessed objects 
as arguments.

=back


=head2 Handling dispatch failure

It's relatively easy to set up a multimethod such that particular
combinations of argument types cannot be correctly dispatched. For
example, consider the following variants of a multimethod called
C<put_peg>:

        package RoundPeg;     use base 'Peg';
        package SquareHole;   use base 'Hole';

        sub put_peg(RoundPeg,Hole) : multi {
                print "round peg in generic hole\n"
        }

        sub put_peg(Peg,SquareHole) : multi {
                print "generic peg in square hole\n"
        }

        sub put_peg(Peg,Hole) : multi {
                print "generic peg in generic hole\n"
        }

If C<put_peg> is called like this:

        my $peg  = RoundPeg->new();
        my $hole = SquareHole->new();

        put_peg($peg, $hole);

then the call can't be dispatched, because there's no way to
decide between the variants C<(RoundPeg,Hole)> and C<(Peg,SquareHole)>,
each of which is the same inheritance distance (i.e. 1 derivation) from
the actual arguments.

The response to this situation (proposed above) is to throw an exception
like this:

        Cannot resolve call to multimethod put_peg(RoundPeg,SquareHole).
        The multimethods:
                put_peg(RoundPeg,Hole)
                put_peg(Peg,SquareHole)
        are equally viable

This is the obvious behaviour, since the case truly I<is> ambiguous.

Some languages (e.g. CLOS) take a different approach -- breaking the
tie by a recount on the inheritance distance of each argument 
starting from the left. In this example, that would mean that
the call would be dispatched to C<put_peg(RoundPeg,Hole)>, since
the left-most parameter of that variant is a "closer" match than the
left-most parameter of the C<put_peg(Peg,SquareHole)> variant.

In the author's opinion, this approach is appalling, since it favours
one parameter above all others for no better reason than it comes first
in the argument list. But there is no reason why pegs should be more
significant than holes. Moreover, arbitrarily resolving the dispatch in
this way will often mask a fundamental design flaw in the multimethod.

However, experience indicates that sometimes the more specialized
variants of a multimethod are only provided as optimizations, and a more
general variant (in this case, the C<(Peg,Hole)> variant) would suffice
as a default where such an ambiguity exists. It is proposed that an
additional parameterized attribute -- C<:default(ambiguous)> -- be
provided so that one particular multimethod can be nominated as the
dispatch recipient in cases of ambiguity:

        sub put_peg(Peg,Hole) : multi default(ambiguous) {
                print "some kinda peg in some kinda hole\n"
        }

Now, whenever a call to C<put_peg> can't be dispatched because it's
ambiguous, this default variant will be called, rather than throwing an
exception.

Multiple dispatch can also fail if there are I<no> suitable variants
available to handle a particular call. For example:

        my $peg  = JPEG->new();
        my $hole = Loophole->new();

        put_peg($peg, $hole);

which, it is proposed above, would normally produce the exception:

        No viable candidate for call to multimethod put_peg(JPEG,Loophole)

The classes JPEG and Loophole aren't in the Peg and Hole hierarchies, so
there's no inheritance path back to a more general C<(Peg,Hole)> variant.

It is proposed that the C<:default> attribute might also take an
parameter C<failure>, which would indicate that the attributed variant
should be called when dispatch fails to find any suitable variant:

        sub put_peg(@args) : multi default(failure) {
                print "Unknown kinda peg in unknown kinda hole\n";
        }

It is further proposed that a single variant could cover I<both> types
of default behaviour, if it were specified with just the (unparameterized)
C<default> attribute:

        sub put_peg(@args) : multi default {
                print "Any kinda peg in any kinda hole\n";
        }

=head2 Multimethod redispatch

It is further proposed that the semantics of the proposed NEXT pseudoclass
(RFC 190) be extended to cover redispatch of multimethods.

Specifically, it is proposed that within a multimethod variant, any call
of the form C<&NEXT::foo;> to the same multimethod would cause the
dispatch mechanism to resume its dispatch process at the next most viable
candidate(s) and thus select the next closest variant.

For example, given the following multimethods:

        package Object;
        package HardObject;     use base 'Object';
        package SoftObject;     use base 'Object';

        package Simulation;

        sub collide(Object $obj1, Object $obj2) {
                print STDLOG "$obj1->{id} collided with $obj2->{id}";
        }

        sub collide(HardObject $obj1, HardObject $obj2) {
                print "crunch!";
                &NEXT::collide;         
        }

        sub collide(SoftObject $obj1, SoftObject $obj2) {
                print "gloop!";
                &NEXT::collide;
        }

In this case, if one of the two more specialized collision variants
is called, the more general variant that logs all collisions would
also be invoked (and passed the same arguments).


=head1 MIGRATION ISSUES

None.


=head1 IMPLEMENTATION

See the Class::Multimethods module.


=head1 REFERENCES

"Object Oriented Perl", Chapter 13.

RFC 128: Subroutines: Extend subroutine contexts to include name parameters and lazy 
arguments

RFC 190 : NEXT pseudoclass for method redispatch
RFC 256 (v2) Objects : Native support for multimethods

Reply via email to