RFC 95 (v1) Object Classes

Perl6 RFC Librarian Fri, 11 Aug 2000 10:43:20 -0700
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Object Classes

=head1 VERSION

    Maintainer: Andy Wardley <[EMAIL PROTECTED]>
    Date: 11 Aug 2000
    Version: 1
    Mailing List: [EMAIL PROTECTED]
    Number: 95

=head1 ABSTRACT

This RFC proposes a syntax and semantics for defining object classes
in Perl 6.  It introduces the C<class> keyword, which can be thought
of as a special kind of C<package>, and the C<.> dot operator which
overloads the required semantics for accessing attributes (variables
and methods) of the class and objects of the class.  This object class
mechanism should exist entirely independently of the existing Perl 5
concept of blessing references into objects.  It should be possible
and natural to use both techniques within any Perl program.

=head1 CAVEAT

This is a bold and widely encompassing proposal for a formal object
oriented mechanism in Perl 6, a very tender subject indeed.  I've
attempted to keep things as Perlian as possible but it will
undoubtedly cause outrage in a significant number of readers who will
consider it an utter abomination that should never be allowed within a
Camel's length of Perl.  This may prove to be an accurate assessment,
but please don't forget that at this stage it's just a Request For
Comments on "One Possible Way To Do It".  Please feel free to point
out all the hideous errors, glaring oversights, obvious omissions and
good, old-fashioned, stupid ideas that I've included (or not) in this
proposal.  Or if you prefer, just ignore it and let Larry be the
judge.  If it's a turkey then we'll roast it but please don't shoot
the farmer for rearing it.

=head1 DESCRIPTION

=head2 Defining a Class

The new C<class> keyword is proposed for defining an object class.  This
can be thought of as a special kind of C<package>.

    class User;
    # class definition

The class definition would continue until the next C<class> or
C<package> declaration, or until the end of the current file.

    class User;         # define 'User' class
    # class definition

    class Project;      # define 'Project' class
    # class definition

    package main;       # no longer defining any class
    # do something cool

Note that classes are not distinct from packages.  Classes are defined
B<within> a particular package, not B<instead> of a package.  Changing
packages must end a class declaration because a class can only be
defined in one package (but can be imported into others).  Thus, we
can say 'package main' to signal the end of a class definition, even
though we were already in the main package, and defining classes that
exist in the main package.

=head2 Class Variables

In its simplest form, a class can be thought of as a kind of
structure.  Lexical variables defined within the scope of a class
declaration become attributes (or data members) of that class.  The
existing C<my> keyword indicates per-instance variables while C<our>
is used to create class variables which are shared across all
instances of a given class.

    class User;

    our $nusers = 0;                # class variables
    our $domain = 'nowhere.com';

    my ($id, $name, $email);        # object (instance) variables
    my @projects;
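
For comparison, here is roughly how the same split between shared and
per-instance data is spelled in Perl 5 today.  This is only a sketch:
it assumes a blessed hash for the instance data and package variables
for the class data, and the C<new> constructor shown is illustrative
rather than part of this proposal.

```perl
use strict;
use warnings;

{
    package User;

    our $nusers = 0;                 # class variable: one copy, shared by all
    our $domain = 'nowhere.com';

    # Illustrative constructor: every object gets its own instance data.
    sub new {
        my ($class, %args) = @_;
        my $self = bless {
            id       => $args{id},
            name     => $args{name},
            email    => $args{email},
            projects => [],
        }, $class;
        $nusers++;                   # updates the single shared counter
        return $self;
    }
}

my $u1 = User->new(id => 'abw',   name => 'Andy Wardley');
my $u2 = User->new(id => 'lwall', name => 'Larry Wall');

push @{ $u1->{projects} }, 'RFC 95';   # touches only $u1's list

print "users: $User::nusers\n";
print "abw projects: ",   scalar @{ $u1->{projects} }, "\n";
print "lwall projects: ", scalar @{ $u2->{projects} }, "\n";
```

The proposed C<my>/C<our> declarations would express the same thing
without the hand-written hash.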

=head2 Class Methods

Subroutines defined within the scope of a class declaration become
methods of that object class.  Subroutines may be prefixed by C<my>
to indicate object methods or by C<our> to indicate class methods.

    class Foo;

    our sub bar {                       # class method
        ...
    }

    my sub baz {                        # object method
        ...
    }

The leading C<my> on subroutines should probably be optional.  All
subroutines not explicitly prefixed C<our> would then implicitly be
object methods.

    class Foo;

    our sub bar {                       # class method
        ...
    }

    sub baz {                           # object method
        ...
    }

Alternatively, we could add the C<class> attribute to class methods.
This is consistent with the existing proposals/implementation of
subroutine attributes, but it requires more typing and isn't
consistent with the proposed naming of class/object variables.

    sub bar : class {
    }

We could, of course, add C<class> as an attribute to class variables
as well.

    my $foo : class;                    # class variable
    my $bar;                            # object variable.

We'll stick with C<my> and C<our> for now.

=head2 Extending a Class

A class can be extended by simply adding new variable or subroutine
definitions within the particular class scope, just as additional
C<package> definitions can be added in Perl 5.

    class User;
    my ($id, $name, $email);

    class Project;
    my ($id, $title);

    ... code passes ...

    # add to 'User' class definition
    class User;
    my @projects;

=head2 Accessing Class Variables

The C<.> dot operator is proposed as a means of accessing class or
object attributes.  The primary justification for introducing a new
operator is to clearly disambiguate it from existing operators that
access blessed object methods (C<-E<gt>>,
e.g. C<$object-E<gt>method()>) or package variable delimiters (C<::>,
e.g. C<$Foo::bar>).  We intend to overload the semantics of the dot
operator to encompass aspects of both these existing operators.  We
hope to achieve a consistent syntax based around an operator which
will "Do The Right Thing" whenever possible.

It is unfortunate that this clashes with the existing string
concatenation operator, also C<.>.  However, it should be possible to
work around most, if not all of the conflicts that might arise (see
L<ISSUES|ISSUES>).  The dot operator is something of a de facto
standard in many languages and is the obvious choice (IMHO, of
course).

An alternative would be to use the existing C<-E<gt>> operator, but
we would have to overload it with new functionality.  This is not
insurmountable but is more likely to affect backwards compatibility and
limit forwards extensibility.  Other suggestions welcome.

Class variables are accessed in a similar way to package variables,
with the C<.> being substituted for C<::>.

    class Foo;

    our ($bar, @baz);

    package main;

    $Foo.bar = 10;          # set variable '$bar' in class 'Foo'
    push(@Foo.baz, 20);     # push onto list '@baz' in class 'Foo'

One important distinction would be that accessing or assigning to a
nonexistent attribute would raise an error, ideally at compile time.
(This should probably be enabled or disabled by a C<use/no strict
methods> pragma or something similar.)

    $Foo.bad = 30;          # error: 'bad: not a member of class Foo'
    print $Foo.dud;         # error: 'dud: not a member of class Foo'

This shouldn't prevent the class or individual objects from being
extended at some later time (by some other method), nor should it
restrict the types or multiplicity of data stored within the class or
its instantiated objects.  The primary concern here is to establish a
contract in the class declaration which defines a particular
interface which cannot be casually broken by mistyping an attribute
name or accidentally violating the calling protocol which the class
has established.  If you want fully adaptable objects then blessed
hashes should serve you admirably (or use an inner hash and define
the class get() and set() methods to marshal it, but more on that
later).
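
Something close to this contract can already be had in Perl 5 with
restricted hashes, although only at run time rather than at compile
time.  The sketch below assumes a blessed hash whose keys are locked
with C<lock_keys> from the core C<Hash::Util> module; the class and
attribute names are simply those from the example above.

```perl
use strict;
use warnings;
use Hash::Util ();

{
    package Foo;

    sub new {
        my ($class) = @_;
        my $self = bless { bar => undef, baz => [] }, $class;
        Hash::Util::lock_keys(%$self);   # only 'bar' and 'baz' are legal now
        return $self;
    }
}

my $foo = Foo->new;
$foo->{bar} = 10;                        # fine: 'bar' is a declared attribute

# Assigning to (or reading from) an undeclared key dies at run time.
my $ok = eval { $foo->{bad} = 30; 1 };
print "assignment rejected: $@" unless $ok;
```

The error only appears when the offending statement runs, which is
exactly the weakness a compile-time check would remove.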

In certain cases it may be desirable to mark a class or object as
"immutable" to prevent any attempts to extend it.  This is similar to
the use of the 'final' keyword in Java or the concept of 'sealing' in
Dylan (which has a significant optimisation benefit in allowing
attribute offsets to be calculated at compile time).  This is
probably the subject for another RFC.

=head2 Calling Class Methods

The dot operator can also be used to call class methods.

    class Foo;

    our sub bar {
        ...
    }

    package main;

    Foo.bar();

Here we call the method by prefixing the dot operator with the class
name.  To achieve equivalence with accessing class variables (see
L<Accessing Class Variables>), we should probably also allow the class
name to be prefixed by C<$>, C<@> or C<%> appropriately to indicate/coerce
the expected return type.

    class Foo;
    our ($x, @y);

    our sub z {
        my $i = 0;
        return map { $i++ . ": $_" } @y;
    }

    package main;
    print $Foo.x;
    print join(', ', @Foo.y);
    print join(', ', @Foo.z);

The key concept here is to think of the object as a series of 'slots'
(attributes) which we can put data into or retrieve data from.  Some slots
hold one 'thingy' (scalar attributes) and some hold multiple 'thingies'
(list/hash attributes).  We shouldn't know or care about what's going on
inside the object when we access or update these attributes.  It's
possibly best described as a "tied namespace" (to coin a phrase?).

=head2 Instantiating Objects of a Class

The existing C<new> keyword can be used to create new object instances
of a given class.

    class User;
    my ($id, $name, $email);

    package main;

    my $u = new User;

This could also be called as a class method (and should probably be the
recommended practice?).

    my $u = User.new();

We should probably treat this 'new' class attribute just like any other
and allow it to have a leading '$' (i.e. it's just a class attribute which
returns a single thingy).

    my $u = $User.new();

But this might look a little foreign, so possibly no prefix on a class
name should assume '$' so that 'User.new' is the same as '$User.new'.
Note that this would clash with an existing scalar named '$User'
(see L<ISSUES|ISSUES>).

The default behaviour for C<new>, in the absence of any constructor
method (see L<Constructor Methods>), will be to assign the parameters
passed to the object variables in the order defined within the class.

    class User;
    my ($id, $name, $email);

    package main;
    my $u = User.new('abw', 'Andy Wardley', '[EMAIL PROTECTED]');
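
The default behaviour described above can be sketched in Perl 5 as
follows.  This is an approximation only: the C<@FIELDS> list is an
assumption standing in for the order of the C<my> declarations, which
the proposed implementation would know for itself.

```perl
use strict;
use warnings;

{
    package User;

    # Attribute names in declaration order (hypothetical; the proposal
    # would derive this from the 'my' declarations in the class).
    my @FIELDS = qw(id name email);

    # Default constructor: the Nth argument fills the Nth declared attribute.
    sub new {
        my ($class, @args) = @_;
        my %self;
        @self{@FIELDS} = @args;      # missing trailing arguments become undef
        return bless \%self, $class;
    }
}

my $u = User->new('abw', 'Andy Wardley');
print "id: $u->{id}\n";
print "name: $u->{name}\n";
print "email is ", (defined $u->{email} ? 'set' : 'unset'), "\n";
```

Arguments that are left off the end simply leave the corresponding
attributes undefined.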

RFC 57, "Subroutine prototypes and parameters", proposes named
parameters which might be useful in this context.  The syntax is not
yet finalised and the proposal may never be officially adopted, but in
the case that it is, it might look similar to one of the following
examples:

    # assume assignments within argument list are in object scope
    my $u = User.new(  $id = 'abw',
                     $name = 'Andy Wardley',
                    $email = '[EMAIL PROTECTED]'  );

    # use '=>' and apply parser magic to grok named parameters
    my $u = User.new(  $id => 'abw',
                     $name => 'Andy Wardley',
                    $email => '[EMAIL PROTECTED]'  );

    # many other possible forms... suggestions to perl6-language-subs

This would allow construction of sparse objects (i.e. in which only
some attributes are set) and hopefully make it easier to remember
attributes by name rather than position.

Note that we have to consider how array or hash array attributes might
'gobble up' all remaining arguments.

    class Author;
    my ($name, @books);

    package main;
    my $dna = Author.new('Douglas Adams',
                         "The Hitchhiker's Guide to the Galaxy",
                         'The Restaurant at the End of the Universe',
                         ...);

This is effectively the same issue as for subroutine prototypes.  It is
discussed further in RFC 57.

=head2 Accessing Object Variables

Object variables are accessed or updated in the same way as class
variables, by using the dot operator and specifying the object
reference as the receiver (on the left of the dot).

    my $u = User.new('tom', 'Thomas Tank');

    $u.email = '[EMAIL PROTECTED]';

    print "Name: ", $u.name, "\nEmail: ", $u.email, "\n";

The object is effectively a fixed structure.  Unlike a hash array
which is inherently adaptable, the object is limited to a restricted
interface.  Any attempt to access undefined attributes will result in
a compile-time error (again, under control of the aforementioned
pragma or switch).

=head2 Calling Object Methods

Object methods are called using the same syntax as for accessing
attributes.

    class User;
    my ($id, $name, $email);

    sub about {
        # member variables are in scope
        return "Name: $name\nEmail: $email\n";
    }

    package main;

    my $u = User.new('dick', 'Dick Richards', '[EMAIL PROTECTED]');
    print $u.about;

It is entirely intentional that the syntax for accessing attributes is
the same as for calling methods.  To users of the class these should
be synonymous.  This allows the designer of the class to change the
implementation (for example by changing an attribute to a method to
enable some new magic) without requiring any change in the user's
code.

It should be possible to define both a variable and method of the same
name.  The dot operator should always call the method in preference to
accessing the attribute directly.  Thus we can extend a class by wrapping
an existing attribute in an accessor method and have all existing
calls in user code forwarded appropriately.

    class User;
    my ($id, $name, $email);
    our $domain = 'perl.org';

    sub email {
        # automatically generate $email if not defined
        $email ||= "$id\@$domain";
    }

    package main;
    my $u = User.new('lwall', 'Larry Wall');
    print $u.email;                     # prints "[EMAIL PROTECTED]"
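
The "method first, attribute second" lookup rule can be illustrated in
Perl 5.  In this sketch the C<dot> method is a hypothetical stand-in
for the proposed operator, and the object is an ordinary blessed hash;
neither is meant as the actual implementation.

```perl
use strict;
use warnings;

{
    package User;

    our $domain = 'perl.org';

    sub new {
        my ($class, $id, $name, $email) = @_;
        return bless { id => $id, name => $name, email => $email }, $class;
    }

    # Accessor method that masks the 'email' attribute.
    sub email {
        my ($self) = @_;
        return $self->{email} ||= "$self->{id}\@$domain";
    }

    # Hypothetical stand-in for the dot operator: call a method of that
    # name if the class defines one, otherwise read the attribute directly.
    sub dot {
        my ($self, $attr) = @_;
        if (my $method = $self->can($attr)) {
            return $self->$method();
        }
        die "$attr: not a member of class " . ref($self) . "\n"
            unless exists $self->{$attr};
        return $self->{$attr};
    }
}

my $u = User->new('lwall', 'Larry Wall');
print $u->dot('name'),  "\n";    # no 'name' method, so the attribute is read
print $u->dot('email'), "\n";    # the 'email' method runs and fills a default
```

Callers never need to know which of the two paths was taken, which is
the point of the rule.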

It should be possible to define methods as lvalue subs so that they can
be used on the left hand side of an C<=> assignment.  This achieves greater
equivalence with attributes.  For example:

    class User;
    my ($id, $name, $email);

    sub email : lvalue {
        $email = shift || $email;
        # do something clever
    }

    package main;
    my $u = User.new('lwall', 'Larry Wall');
    $u.email = '[EMAIL PROTECTED]';

=head2 Compound Dot Operations

It should be possible to chain multiple dot operations into a single
expression.

    $foo.bar.baz = 10;

=head2 Internal Variables

Within a class definition, all variables are lexically scoped.
Within an object method, for example, the object variables are clearly
visible unless masked by other lexical variables in a narrower scope.

    class User;
    my ($id, $name, $email);
    our $domain = 'nowhere.com';

    sub summary {
        # new $email lexical masks object variable
        my $email = $email || "$id\@$domain";

        # modify local copy
        $email = "<a href=\"mailto:$email\">$email</a>";

        return "Name: $name\nEmail: $email\n";
    }

The special read-only variable C<$me> (or C<$ME>, C<$self>, C<$this>,
etc.)  should be implicitly defined in the outermost object scope.
Object methods can explicitly reference their attributes and methods
through this variable.  It is automatically defined and does not need
to be passed to methods as a parameter as in Perl 5 blessed objects.

    class User;
    my ($id, $name, $email);

    sub email {
        my $email = shift || return $me.email;
        print "old: ", $me.email, "\n";
        print "new: ", $email, "\n";
        $me.email = $email;
    }

    package main;

    my $u = User.new('abw', 'Andy Wardley', '[EMAIL PROTECTED]');

    print $u.email, "\n";            # [EMAIL PROTECTED]

    $u.email('[EMAIL PROTECTED]');        # old: [EMAIL PROTECTED]
                                     # new: [EMAIL PROTECTED]

Object variables (C<my>) should probably be undefined or inaccessible
to class methods, along with the C<$me> variable.  Class variables
(C<our>) would be visible to all class and object methods.  The
special read-only class variable C<$class> should also be defined and
visible to class and object methods alike.  This should also be an
externally accessible attribute allowing the type of any object to be
easily determined.

    my $u = User.new('foo', 'Mr Foo');
    print $u.class;                 # prints "User"

=head2 Private Variables and Methods

Class or object members which are prefixed with '_' should be
considered private and not accessible from outside the class
definition.  This enforces the existing Perl 5 convention of
specifying "private" keys in blessed hashes with an underscore and
avoids the need for a new C<private> keyword or subroutine attribute
to achieve the same purpose.

    class User;
    our $_user_cache = { };         # private class variable
    my  $_password;                 # private object variable

    my ($id, $name, $email);        # public attributes

    sub _encrypt {                  # private object method
        ...
    }
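
Perl 5 treats the leading underscore purely as a convention.  To show
what enforcing it might involve, here is a rough run-time sketch.  The
generic C<get> reader and the C<check_password> method are hypothetical
names introduced for illustration, and the check relies on C<caller>,
which is only one of several ways it could be done.

```perl
use strict;
use warnings;

{
    package User;

    sub new {
        my ($class, $id, $password) = @_;
        return bless { id => $id, _password => $password }, $class;
    }

    # Hypothetical generic reader that enforces the underscore convention:
    # names starting with '_' are readable only from inside the class.
    sub get {
        my ($self, $attr) = @_;
        my $caller = caller;                   # package of the calling code
        die "$attr: private to class " . ref($self) . "\n"
            if $attr =~ /^_/ && !$caller->isa(ref $self);
        return $self->{$attr};
    }

    # Code inside the class may still use the private attribute.
    sub check_password {
        my ($self, $guess) = @_;
        return $guess eq $self->get('_password');
    }
}

my $u = User->new('abw', 'secret');
print "id: ", $u->get('id'), "\n";
print "password ok\n" if $u->check_password('secret');
print "blocked: $@" unless eval { $u->get('_password'); 1 };
```

A built-in rule would make the same decision without the per-call
check.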

=head2 Inheritance

OBSERVATION: Perl 6 classes should probably support single, linear
inheritance only for the simple reason that multiple inheritance is
generally more trouble than it's worth.  It's easy to inherit multiple
interfaces (i.e. specify that the object conforms to the interface of
other classes without actually inheriting the implementation) or
use composition rules (i.e. create internal objects and delegate to
them) to achieve much the same effect as MI with far fewer problems
(see L<ISSUES|ISSUES>).

The C<isa> keyword can be used to create a subclass of a base class.

    class Person;
    my ($name, $sex, $dob);

    class User isa Person;
    my ($id, $email);

    class Hacker isa User;
    my @cool_hacks;

Each subclass inherits the attributes of its parents in defined order
from most generic (super) to most specific (sub).  The Hacker class
above then contains the attributes as if written:

    class Hacker;
    my ($name, $sex, $dob);         # inherited from Person via User
    my ($id, $email);               # inherited from User
    my @cool_hacks;

If we decide to implement multiple inheritance then it might look like
this:

    class Hacker isa User, Employee;

In this case we can still assume that base class attributes are
inherited in order defined.  We must consider how to handle the
problem of inheriting multiple instances of the same base class
(i.e. inheritance diamond).  See L<ISSUES|ISSUES>.

Attributes in super classes can be redefined by subclasses.  The
special read-only variable C<$super> should be implicitly defined
within the object scope.  Through this, object methods can explicitly
access attributes of (one of) the parent class(es).

    class User;

    sub foo {
        print "User foo\n";
    }

    class Hacker isa User;

    sub foo {
        print "Hacker foo\n";
        $super.foo;
    }

    package main;

    my $h = Hacker.new();
    $h.foo();                       # prints: Hacker foo
                                    #         User foo
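
The nearest existing Perl 5 equivalent is the C<SUPER::> pseudo-class.
The following sketch mirrors the example above with blessed objects and
C<@ISA>; the C<@log> array is only there to make the call order easy to
see.

```perl
use strict;
use warnings;

my @log;    # collects output so the call order is easy to inspect

{
    package User;
    sub new { my ($class) = @_; return bless {}, $class }
    sub foo { push @log, "User foo" }
}

{
    package Hacker;
    our @ISA = ('User');             # Perl 5 spelling of 'class Hacker isa User'

    sub foo {
        my ($self) = @_;
        push @log, "Hacker foo";
        $self->SUPER::foo();         # roughly what '$super.foo' would mean
    }
}

my $h = Hacker->new;
$h->foo;
print "$_\n" for @log;
```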

It might also be useful to allow direct access to specific base class
parts, e.g.

    class Hacker isa User;

    sub foo {
        $super.User.foo();          # call foo() method on User base class
    }

The C<super> attribute should be accessible as an external attribute
returning the name of the immediate parent class.

    my $h = Hacker.new();
    print $h.class;                 # prints "Hacker"
    print $h.super;                 # prints "User"

The C<isa> attribute should return a list of the self and parent classes
in most-specific to most-general order when called without any arguments.
If a specific class name (or names?) is specified then it should return
a boolean result indicating if the object is a member of the class(es).

    class Person;
    class User isa Person;
    class Hacker isa User;

    my $h = Hacker.new();
    print join(', ', @h.isa);       # prints "Hacker, User, Person"
    if ($h.isa('Person')) {         # true
        ...
    }
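
Both forms have close Perl 5 counterparts, shown here as a sketch
rather than a specification.  The list form uses
C<mro::get_linear_isa> from the core C<mro> module (Perl 5.10 and
later) and the boolean form uses the existing C<isa> method.

```perl
use strict;
use warnings;
use mro;

{ package Person; sub new { return bless {}, shift } }
{ package User;   our @ISA = ('Person'); }
{ package Hacker; our @ISA = ('User');   }

my $h = Hacker->new;

# No-argument form: the linearised class list, most specific first.
my @classes = @{ mro::get_linear_isa(ref $h) };
print join(', ', @classes), "\n";

# One-argument form: a boolean membership test.
print "is a Person\n" if $h->isa('Person');
```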

Similarly, the C<can> method should return a list of attributes
(variables or methods) that the object supports, or a boolean result
for a specific test.  These should be returned in order defined.

    class Foo;
    my $foo;
    sub foo { ... };                # wrapper method masks variable
    sub bar { ... };

    class Bar isa Foo;
    my $baz;

    package main;
    my $b = Bar.new();
    print join(', ', @b.can);       # prints "foo, bar, baz"

Note that I've used C<@h.isa> and C<@b.can>.  Should this be C<@$h.isa>
and C<@$b.can> or C<@{$h.isa}> and C<@{$b.can}> or something else?
Highlander variables would make this problem go away (RFC 9).

All classes should probably be implicitly inherited from the 'Class'
base class (similar to the UNIVERSAL object in Perl 5).  It would then
be this class that implements the C<isa>, C<can> and other common methods.

    $obj.can('can');                # always true
    $obj.can('isa');                # ditto
    $obj.isa('Class');              # likewise

We might like to permit the inheritance of multiple interfaces from
other classes without inheriting the implementation.  The C<can>
keyword could be used to inherit interfaces, added after the class
name or superclass, or anywhere else within the class definition.

    class Person;
    can Walk, Talk;
    my ($name, $sex, $dob);

    class User isa Person can Login;
    my ($id, $_password);
    ...

    class Hacker isa User can HackPerl;
    ...

On the other hand, we might not.  Either way, this is venturing beyond
the initial scope of this RFC.  As an aside, readers might like to
frighten themselves by substituting 'extends' for 'isa' and
'implements' for 'can' in the above examples (see RFC <mumble>, "Perl
is not Java" :-)

=head2 Delegation, Aliasing and Mixins

It should be possible for objects to create internal "plumbing" to
help with delegation and interaction with other objects.  One possible
solution would be to allow variable attributes to contain references
to other class or object attributes that are then traversed
automatically when the attribute is accessed.

    class Foo;
    our $foo;
    my $bar;
    sub baz { ... }

    class Bar;
    my $_foo = Foo.new(...);        # private Foo object
    my $wiz = \$_foo.bar;           # alias $wiz to $_foo object var
    my $waz = \$_foo.baz;           # alias $waz to $_foo object method
    my $foobar = \$Foo.foo;         # alias $foobar to Foo class var $foo

    package main;
    my $bar = Bar.new();
    $bar.foobar;                    # -> $Foo.foo
    $bar.wiz;                       # -> $_foo.bar
    $bar.waz;                       # -> $_foo.baz()

It may be desirable to allow all attributes of another class or object to
be imported into another object namespace.  For example:

    class User;
    my ($name);
    sub welcome { return "Hello World\n" };

    class Hacker;
    import User;                    # equivalent to $class.import('User')?
    my $cool_hacks = [];

The 'Hacker' class is not derived from 'User' but contains a copy of the
declaration which is added to its own.  The User class is used as a 'Mixin',
so named because the definition literally gets mixed in to the enclosing
class.  Hacker is thus defined as if written:

    class Hacker;
    my ($name);
    sub welcome { return "Hello World\n" };
    my $cool_hacks = [];
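
For methods, at least, the effect can be imitated in Perl 5 by copying
subroutines between symbol tables.  The C<mixin> helper below is a
hypothetical function written for this sketch; it copies code only, so
attributes such as C<$name> are not covered.

```perl
use strict;
use warnings;

{
    package User;
    sub welcome { return "Hello World\n" }
}

{
    package Hacker;
    sub new { return bless { cool_hacks => [] }, shift }
}

# Hypothetical helper: copy every sub defined in $from into $into,
# without touching @ISA, so no inheritance relationship is created.
sub mixin {
    my ($into, $from) = @_;
    no strict 'refs';
    for my $name (keys %{"${from}::"}) {
        my $code = *{"${from}::${name}"}{CODE} or next;
        next if defined &{"${into}::${name}"};   # keep the class's own version
        *{"${into}::${name}"} = $code;
    }
}

mixin('Hacker', 'User');

my $h = Hacker->new;
print $h->welcome;
print "Hacker isa User? ", ($h->isa('User') ? 'yes' : 'no'), "\n";
```

The object responds to C<welcome> yet is not a C<User>, which is the
difference between mixing in and inheriting.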

Maybe C<mixin> or C<mix> would be a better keyword than C<import>?

    class Hacker;
    mixin User;

It should also be possible to mixin the attributes of a particular object,
rather than a class.

    class Helper;
    my $msg;
    sub help { return $msg };

    class Hacker;
    my $helper = Helper.new("Hello World\n");
    mixin $helper;                  # equiv. to: my $msg  = \$helper.msg
                                    #            my $help = \$helper.help
    package main;
    my $hacker = Hacker.new();
    print $hacker.help;             # -> $hacker.helper.help which prints
                                    #    "Hello World\n"

=head2 Constructor Methods

Perl 6 should automatically instantiate objects via the C<new> keyword
or class method.  Any initialisation method should then be called.

This RFC proposes that "special" methods such as these be defined in
UPPER CASE.  For example:

    class User;
    our $domain = 'perl.org';
    my ($id, $name, $email);

    # initialiser method; $me is already defined along with
    # any attributes specified as arguments
    sub NEW {
        die "User id not specified" unless defined $id;
        die "User name not specified" unless defined $name;
        $email ||= "$id\@$domain";
    }

    package main;
    my $u = User.new('lwall', 'Larry Wall');
    print $u.email;                 # prints "[EMAIL PROTECTED]"

The default action of the internal object constructor called by C<new>
is to instantiate an object of the required class and then fill its
public attribute slots in the order defined with any arguments passed.
Named parameters, if used, would allow the attributes to be filled in
a more specific manner.  If a NEW() method is defined then it should be
called at this point.

Base class NEW() constructors should be called in order.  Note that the
C<$class> variable should be correctly defined in the base class,
Person, to contain the name of the derived class, User.

    class Person;
    my ($name, $sex, $dob);

    sub NEW {
        die "$class name not specified" unless $name;
        $sex ||= 'unknown';
        $dob ||= 'unknown';
        print "new Person  (name: $name)\n";
    }

    class User isa Person;
    our $domain = 'perl.org';
    my ($id, $email);

    sub NEW {
        die "$class id not specified" unless defined $id;
        $email ||= "$id\@$domain";
        print "new User  (name: $name  email: $email)\n";
    }

    package main;

    # attributes are $name, $sex, $dob, $id, $email
    my $u = User.new('Larry Wall', undef, undef, 'lwall');
    print "Name: ", $u.name, "\n";
    print "DOB: ", $u.dob, "\n";
    print "Email: ", $u.email, "\n";

Output:

    new Person  (name: Larry Wall)
    new User  (name: Larry Wall  email: [EMAIL PROTECTED])
    Name: Larry Wall
    DOB: unknown
    Email: [EMAIL PROTECTED]
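
The calling order of the initialisers can be mimicked in Perl 5.  The
sketch below is an approximation under stated assumptions: the shared
C<new> walks the inheritance chain itself using C<mro::get_linear_isa>,
objects are blessed hashes, and argument passing is reduced to
name/value pairs so that only the order of the NEW() calls is in view.

```perl
use strict;
use warnings;
use mro;

my @trace;    # records which NEW() ran, in order

{
    package Person;

    # Hypothetical shared constructor: build the object, then run every
    # NEW() found in the inheritance chain, most general class first.
    sub new {
        my ($class, %attr) = @_;
        my $self = bless { %attr }, $class;
        for my $pkg (reverse @{ mro::get_linear_isa($class) }) {
            no strict 'refs';
            next unless defined &{"${pkg}::NEW"};
            &{"${pkg}::NEW"}($self);
        }
        return $self;
    }

    sub NEW {
        my ($self) = @_;
        # ref($self) is the derived class name, as $class would be
        die ref($self) . " name not specified\n" unless $self->{name};
        $self->{dob} ||= 'unknown';
        push @trace, 'Person';
    }
}

{
    package User;
    our @ISA    = ('Person');
    our $domain = 'perl.org';

    sub NEW {
        my ($self) = @_;
        die ref($self) . " id not specified\n" unless defined $self->{id};
        $self->{email} ||= "$self->{id}\@$domain";
        push @trace, 'User';
    }
}

my $u = User->new(name => 'Larry Wall', id => 'lwall');
print "NEW order: @trace\n";
print "DOB: $u->{dob}\n";
```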

Note one severe limitation of this model.  We must provide constructor
arguments in exactly the right order to satisfy base class (Person)
attributes first, followed by the derived class (User) attributes.
This makes our base classes exceptionally fragile to change.  If we
want to add an attribute to a class then we run the risk of breaking
any classes that are derived from it.  For this reason, some form of
named parameterisation would be preferred (see RFC 57).  e.g.

    # parameters are considered in the object scope
    my $u = User.new($id = 'lwall', $name = 'Larry Wall');

    # other options through various kinds of parsing magic
    my $u = User.new($id => 'lwall', $name => 'Larry Wall');
    my $u = User.new($id := 'lwall', $name := 'Larry Wall');
    my $u = User.new(id = 'lwall', name = 'Larry Wall');

This allows the structure of classes to be changed at any time without
affecting existing code.  Furthermore, we can specify only the attributes
that we care to define and in any order.

Remember that object methods mask object attributes.  Thus, it should be
possible to define an accessor method around an attribute and have it
implicitly called by the constructor to set an attribute.

    class User;
    our $domain = 'perl.org';
    my ($id, $name, $email);

    sub NEW {
        # perhaps these could be pre-conditions?
        die "$class id not specified" unless defined $id;
        die "$class name not specified" unless defined $name;

        # use $id as default $email
        $email ||= $id;
    }

    sub email : lvalue {
        my $new_email = shift || return $email;

        # append domain if not specified
        $new_email = "$new_email\@$domain"
            unless $new_email =~ /@/;

        $email = $new_email;
    }

    package main;

    my $larry = User.new('lwall', 'Larry Wall');

    # alternative using named parameter:
    # e.g.   ...User.new($id = 'lwall', $name = 'Larry Wall');

In this example, the C<new> constructor instantiates an object, sets
the $id and $name attributes and then calls NEW().  After some validity
checks, it defaults the $email attribute to $id.  This calls the email
sub which appends '@perl.org' to 'lwall' and sets the $email attribute.

In this next example, the $email attribute will be provided explicitly
to the constructor.

    # class User as above
    my $larry = User.new('lwall', 'Larry Wall', 'supreme_court');
       # or named parameter equivalent

The C<new> constructor sets the $id and $name attributes directly and
then calls the email accessor method to set the $email attribute, appending
'@perl.org' as required.  When the NEW() method is then called, $email is
already set and there's no need to default from the $id value.

One final idea is that we might use named method prototypes (RFC 57) to
intercept various constructor attributes.  When the internal C<new>
constructor is called it could first inspect the prototype for NEW()
(if defined) and attempt to map any arguments passed onto those parameters.
Any remaining arguments not satisfied by the prototype could then be
applied to object attributes, skipping over any that are being handled
by the NEW() initialiser.  The named prototype variables would be
new lexicals within the NEW() method.

    class User;
    our $domain = 'space.doubt.org';
    my ($id, $name, $email);

    sub NEW ($email, $honorific = 'Mr.') {
        $email ||= $id;
        $email = "$email\@$domain"
            unless $email =~ /@/;

        # now update object attribute
        $me.email = $email;

        # $name is the real object attribute, there's no local $name
        $name = "$honorific $name";
    }

    package main;

    # all parameters set directly, no args sent to NEW()
    my $u1 = User.new( $id = 'elrich',
                     $name = 'Elrich von Lichtenstein' );

    print $u1.name;             # Mr. Elrich von Lichtenstein
    print $u1.email;            # [EMAIL PROTECTED]

    # $honorific forwarded to NEW(), others set directly
    my $u2 = User.new( $id = 'elrich',
                     $name = 'Elrich von Lichtenstein',
                $honorific = 'Count' );

    print $u2.name;             # Count Elrich von Lichtenstein
    print $u2.email;            # [EMAIL PROTECTED]

    # $email and $honorific forwarded to NEW(), others set directly
    my $u3 = User.new( $id = 'elrich',
                     $name = 'Elrich von Lichtenstein',
                $honorific = 'Count',
                    $email = 'evonlich' );

    print $u3.name;             # Count Elrich von Lichtenstein
    print $u3.email;            # [EMAIL PROTECTED]


A further benefit of this approach is in allowing additional parameters
to be passed to the NEW() initialiser (e.g. $honorific) which aren't
object attributes and persist only within the scope of the NEW() method.

Without named parameters, we would have to be more careful about the
order in which we passed arguments.  Let's look at a simple example.

    class User;
    my ($id, $name, $email);

    sub NEW ($id) {
        ...
        $me.id = $id;
    }

The NEW() constructor expects the first parameter as C<$id>.  Any
remaining positional arguments would be assigned directly to C<$name>
and C<$email> in that order, skipping over C<$id> which has been
handled elsewhere.  This would allow us to monkey around with the
order of constructor parameters (although probably not advisable)
and have it do the right thing (for some definition of "right").

    class User;
    my ($id, $name, $email, $pass);

    sub NEW ($name, $email) {
        ...
    }

The constructor parameter order would now be C<($name, $email, $id, $pass)>.
Could get hairy, especially with multiple base classes.

=head2 Other Magical Methods

The OLD() method is proposed to complement the NEW() method, being
called immediately before the object is destroyed.  Each destructor
(we'll call them that for now) should be called in reverse order from
most specific (sub) class to most general (super).

Objects should probably implement general purpose get() and set()
attributes.

    class User;
    my $name;

    package main;
    my $user = User.new('Douglas Adams');
    print $user.get('name');            # same as $user.name

    my $key = 'name';
    print $user.get($key);              # likewise

    $user.set('name', 'Douglas Fir');   # $user.name = 'Douglas Fir'

It might be desirable to allow special GET() and SET() methods to be
defined to intercept all external accesses and updates to object
attributes.  When called from within an object definition, the get()
method would not be diverted.

    class Astronaut;
    my ($name, ...);

    # NOTE: we're using named prototypes again (see RFC 57), e.g.
    # sub foo ($bar) {         ===>       sub foo { my ($bar) = @_;
    #    ...                                  ...
    # }                                   }

    sub GET ($attr) {
        die "I'm sorry $name, I can't do that"
            if $attr eq 'open_doors';

        # anything not prohibited is allowed
        $me.get($attr);     # Go directly to get().  Do not pass GET().
    }                       # Do not collect $200.

    sub SET ($attr, @value) {
        # do whatever
        $me.set($attr, @value);
    }

Note that the SET() method might receive multiple arguments (@value)
when an attribute method is called as:

    $obj.method(10, 20, 30);
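
The split between GET()/SET() and get()/set() can be approximated in
Perl 5.  In this sketch C<AUTOLOAD> stands in for the proposed dot
operator, and storing multiple values as an array reference is my
assumption rather than anything the proposal specifies.

```perl
use strict;
use warnings;

{
    package Astronaut;

    sub new {
        my ($class, %attr) = @_;
        my $self = { %attr };
        return bless $self, $class;
    }
    sub DESTROY { }     # keep object destruction away from AUTOLOAD

    # get()/set(): direct access, never diverted
    sub get { my ($self, $attr) = @_; return $self->{$attr} }
    sub set {
        my ($self, $attr, @value) = @_;
        return $self->{$attr} = @value > 1 ? [@value] : $value[0];
    }

    # GET()/SET(): hooks through which external access is routed
    sub GET {
        my ($self, $attr) = @_;
        die "I'm sorry $self->{name}, I can't do that\n"
            if $attr eq 'open_doors';
        return $self->get($attr);
    }
    sub SET {
        my ($self, $attr, @value) = @_;
        return $self->set($attr, @value);
    }

    # Stand-in for the proposed '.' operator: any other method name is
    # treated as an attribute access from outside the class
    our $AUTOLOAD;
    sub AUTOLOAD {
        my ($self, @value) = @_;
        (my $attr = $AUTOLOAD) =~ s/.*:://;
        return @value ? $self->SET($attr, @value) : $self->GET($attr);
    }
}

my $dave = Astronaut->new(name => 'Dave');
$dave->name('David');                       # external update, via SET()
print $dave->name, "\n";                    # external access, via GET()
$dave->position(10, 20, 30);                # SET() receives three values
my $ok    = eval { $dave->open_doors; 1 };  # GET() refuses this one
my $error = $ok ? '' : $@;
print "refused: $error";
```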

The INSPECT() method (or something similar) should be called whenever
the object itself is evaluated rather than a specific attribute.  In
conjunction with Damian Conway's RFC 21, the proposed want() function
could be used to return a view of the object in many different
formats.  The default, implicit INSPECT() method might look something
like this:

    sub INSPECT {
        # I know, I should be using switch and currying.... :-)
        if (want('HASH')) {
            # return hash of current attributes and values
            return map { ( $_ => $me.get($_) ) } @me.can;
        }
        elsif (want('ARRAY')) {
            # return list of values
            return map { $me.get($_) } @me.can;
        }
        elsif (want('SCALAR')) {
            # return self for copy by reference
            return $me;
        }
        elsif (want('STRING')) {
            return "$class: " .
                    join(', ', map { "$_ => " . $me.get($_) } @me.can);
        }
        ...etc...
    }

This would allow an object reference ($object in these examples) to be used
in many different ways.  For example:

    my %objhash = %$object;     # copy attribs/values into hash
    my @values  = @$object;     # copy values into list
    my $obj2 = $object;         # copy by reference
    print "$object";            # stringification
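
Perl 5 can already get close to this with the C<overload> pragma
instead of want().  The sketch below is an approximation only: the
C<Record> class is hypothetical, and returning values in sorted
attribute-name order is my assumption.

```perl
use strict;
use warnings;

{
    package Record;

    # The object is a reference to a scalar that holds the real
    # attribute hash, leaving %$obj and @$obj free to be overloaded.
    sub new {
        my ($class, %attr) = @_;
        my $data = { %attr };
        return bless \$data, $class;
    }

    use overload
        # hash view: a copy of the attributes and values
        '%{}' => sub { my $attr = ${ $_[0] }; return { %$attr } },
        # array view: the values, ordered by attribute name
        '@{}' => sub {
            my $attr = ${ $_[0] };
            return [ map { $attr->{$_} } sort keys %$attr ];
        },
        # string view: "Class: name => value, ..."
        '""'  => sub {
            my $attr = ${ $_[0] };
            return ref($_[0]) . ': '
                 . join(', ', map { "$_ => $attr->{$_}" } sort keys %$attr);
        };
}

my $object  = Record->new(id => 'elrich', name => 'Elrich');
my %objhash = %$object;     # copy attribs/values into hash
my @values  = @$object;     # copy values into list
my $obj2    = $object;      # copy by reference
print "$object\n";          # Record: id => elrich, name => Elrich
```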

=head2 Loading External Classes

There should be a mechanism for loading external class definitions.  The
simplest implementation would be to load external modules with C<use>, as
in Perl 5.  Modules could thus contain package variables and subs (like
existing Perl 5 modules) and/or new class declarations.

    #!/usr/bin/perl6                # '-w' and 'use strict' by default :-)
    use DBI;

    my $dbh5 = DBI->connect(...);   # Perl 5 interface
    my $dbh6 = DBI.connect(...);    # Perl 6 interface

We may prefer to implement a new keyword for loading class modules,
such as 'import', 'load', etc.  This would allow us to apply different
heuristics for finding class modules, perhaps by naming them with a
different suffix (e.g. ".pc" instead of ".pm") or by storing them in a
different location to "regular" modules.

A class definition, either declared in or imported into a Perl program,
would effectively create a scalar variable with the same name as the
class in the current namespace (i.e. package or other class
definition).

    class User;
    my ($id, $name, $email);

    package main;
    print $User;        # e.g. prints "OBJECT(0x80cf15c)"
    print User;         # same, implicit leading '$'

This would clash with any existing variable called '$User'.  For this
reason, we would generally suggest that class names are capitalised
and variable names lower case.  We also have to consider possible
clashes with @User and %User, or adopt the "Highlander Variables"
proposed in RFC 9.

Loading an external class definition should have the same effect,
defining this $Classname variable.  This is akin to a package
exporting names by default and it can be a Bad Thing.  This is one
possible justification for a new C<load> (or other) keyword that
could be used like this:

    load User;             # creates $User, a reference to a class
    load User as MyUser;   # create $MyUser, as above

It might be preferable (or optional) to use leading '$' characters.  This
makes it more obvious that a variable is being created.

    load $User;
    load $User as $MyUser;

Hmm, quite icky.  Perhaps then, C<load> could load the class and return a
reference to the singleton class object (i.e. the one object representing
the class itself, rather than an instance of the class).  Then you could
assign it to your own class variable.

    my $Dude  = load User;          # $Dude -> User class object
    my $larry = $Dude.new(...);

When called in a void context, it could export the class object into the
caller's package under the default name.

    load User;                      # $User -> User class object
    my $larry = $User.new(...);

Alternatively, C<load> could be a method (er, attribute) of the universal
base class, Class.  That way we don't need a new C<load> keyword.

    Class.load('User');
    my $Dude = Class.load('User');
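
A Perl 5 analogue of the "load and return the class" idea is easy to
sketch, since a Perl 5 class is just a package name.  The C<load>
subroutine here is hypothetical, and C<IO::Handle> is used only
because it is a core module with a new() method.

```perl
use strict;
use warnings;

# Hypothetical load(): load a class by name and return something that
# can stand in for the class object.  In Perl 5 that is just the name.
sub load {
    my ($class) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;  # IO::Handle -> IO/Handle.pm
    require $file;
    return $class;
}

my $Handle = load('IO::Handle');    # $Handle -> IO::Handle class
my $fh     = $Handle->new;
print ref($fh), "\n";               # IO::Handle
```

Nothing is exported into the caller's namespace here, which sidesteps
the $Classname clash at the cost of one extra variable.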

=head1 ISSUES

This section covers a few of the outstanding issues.

=head2 Attribute visibility

I'm not sure about making all non-underscored attributes visible by
default.  It might be better to explicitly declare them in some way.

    our ($foo, $bar) : public;
    my  ($wiz, $waz) : public;
    my  ($id, $ssn)  : readonly;

    # everything else is private
    my ($a, $b, $c);

We can mark subs in the same way, but I think it would be tedious to
have to declare every one as 'public'.

    sub foobar : public {
    }

We could assume that an accessor method inherits the visibility of an
attribute of the same name, but I think this might be hazardous, with
too much happening "behind the scenes".  This is why it might be
better to default everything to public visibility and specifically
mark private items by attribute, leading underscore or some other
means.

Furthermore, I see a typical use of this construct being to create
simple fixed-structure records, and we want to make these simple
things simple, e.g.

    class Product;
    my ($id, $name, $price);

    package main;
    my $p = Product.new('xyz123', 'Carrots', 57);

Default public visibility, no special constructor methods, nice and
simple.  Let the more complex uses require more complex syntax!
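
As a rough illustration of "public by default, leading underscore for
private", here is a Perl 5 sketch.  It assumes accessors are
generated through C<AUTOLOAD> and that privacy is checked against the
calling package; the C<_cost> attribute and margin() method are
invented for the example.

```perl
use strict;
use warnings;

{
    package Product;

    sub new {
        my ($class, %attr) = @_;
        my $self = { %attr };
        return bless $self, $class;
    }
    sub DESTROY { }     # keep object destruction away from AUTOLOAD

    our $AUTOLOAD;
    sub AUTOLOAD {
        my $self = shift;
        (my $attr = $AUTOLOAD) =~ s/.*:://;
        my $caller = caller;                # package making the call
        die "$attr is private to " . __PACKAGE__ . "\n"
            if $attr =~ /^_/ && !$caller->isa(__PACKAGE__);
        return $self->{$attr};
    }

    # inside the class, private attributes remain reachable
    sub margin { my $self = shift; return $self->price - $self->_cost }
}

my $p = Product->new(id => 'xyz123', name => 'Carrots',
                     price => 57, _cost => 40);
print $p->name, "\n";               # Carrots
print $p->margin, "\n";             # 17
my $ok    = eval { $p->_cost; 1 };  # refused from outside the class
my $error = $ok ? '' : $@;
print "refused: $error";
```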

=head2 Type Checking

If type checking is to be implemented then it is proposed that data
and interfaces be typed but variables left untyped (although note that
Larry has already suggested that variables should be typable in Perl 6
so this may be a moot point).  Typed variables work something like
this:

    # typed variables
    my Dog $spot;               # $spot can only ever be a Dog
    my Cat $felix;              # ditto $felix a Cat

    $spot = $felix;             # error: $felix is not a Dog

Typed data and interfaces would work more like this:

    my $spot = Dog.new();       # $spot is a Dog for now
    my $felix = Cat.new();      # $felix just happens to be a Cat

    foo($spot);                 # OK, $spot.isa('Dog')
    foo($felix);                # NOT OK, ! $felix.isa('Dog')

    $felix = $spot;             # OK, variables are untyped
    foo($felix);                # OK, $felix.isa('Dog')

    sub foo (Dog $dog) {
        ...
    }

The second is more natural to Perl and should be much easier to
implement.  The only time that type checking is required is at the
point when a subroutine with a typed interface (prototype definition)
is called.  The big downside is that type checking is performed at
runtime, not compile time, which may be why Larry plans to implement
typable variables anyway.
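
The typed-interface behaviour can be imitated in Perl 5 with an
explicit check at the top of the subroutine.  This is a sketch of the
runtime check only; the error message is my own wording.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{ package Dog; sub new { return bless {}, shift } }
{ package Cat; sub new { return bless {}, shift } }

# Emulates 'sub foo (Dog $dog)': the check runs each time foo() is called
sub foo {
    my ($dog) = @_;
    die "foo() expects a Dog\n"
        unless blessed($dog) && $dog->isa('Dog');
    return 'woof';
}

my $spot  = Dog->new;
my $felix = Cat->new;

print foo($spot), "\n";                     # woof
my $result = eval { foo($felix) };          # dies: $felix is a Cat
my $error  = $@;
print "error: $error" unless defined $result;
$felix = $spot;                             # fine, variables are untyped
print foo($felix), "\n";                    # woof
```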

This is the subject for another RFC.

=head2 Multiple methods

We might want to consider how we can use prototype signatures and/or
attributes to resolve attribute calls against multiple defined
subs of the same name.  Ideally, this would implement multimethod
dispatch (Yet Another Conway RFC in the pipeline).

    class Foo;
    my $bar;

    sub bar() {
        # ...
        return $bar;
    }

    sub bar($b) : lvalue {
        $bar = $b;
    }

    package main;

    my $f = Foo.new(10);
    $f.bar(20);             # bar($b)
    $f.bar = 30;            # bar($b)
    print $f.bar;           # bar()

Typed interfaces:

    class Foo;
    my $bar;

    sub bar(Bar $b) {       # expects $b.isa('Bar')
        $bar = $b;
    }

    sub bar(Foo $f) {       # expects $f.isa('Foo')
        $bar = $f.bar;
    }

    sub bar($n) {           # any scalar
        $bar = Bar.new($n);
    }

    sub bar() {             # no arguments
        return $bar;
    }
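
Until real multimethod dispatch exists, the four variants above have
to be folded into one subroutine that inspects its arguments.  A
Perl 5 sketch, assuming a trivial C<Bar> class that simply wraps a
value:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
    package Bar;
    sub new { my ($class, $n) = @_; return bless { n => $n }, $class }
}
{
    package Foo;
    sub new {
        my ($class, $n) = @_;
        return bless { bar => Bar->new($n) }, $class;
    }

    # One Perl 5 sub standing in for the four proposed variants of bar()
    sub bar {
        my $self = shift;
        return $self->{bar} unless @_;                  # bar()
        my ($arg) = @_;
        if (blessed($arg) && $arg->isa('Bar')) {        # bar(Bar $b)
            $self->{bar} = $arg;
        }
        elsif (blessed($arg) && $arg->isa('Foo')) {     # bar(Foo $f)
            $self->{bar} = $arg->bar;
        }
        else {                                          # bar($n)
            $self->{bar} = Bar->new($arg);
        }
        return $self->{bar};
    }
}

my $f = Foo->new(10);
print $f->bar->{n}, "\n";       # 10
$f->bar(20);
print $f->bar->{n}, "\n";       # 20
$f->bar(Foo->new(30));
print $f->bar->{n}, "\n";       # 30
```

The order of the tests matters in the hand-written version, which is
exactly the kind of bookkeeping that signature-based dispatch would
take off our hands.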

=head2 Multiple Inheritance

Here's a typical inheritance diamond problem.

    class Person;
    class User isa Person;
    class Employee isa Person;
    class Hacker isa User, Employee;

                             +---------+
                             | Person  |
                             +---------+
                                  |
                                 /_\
                                  |
                        +---------+---------+
                        |                   |
                   +---------+         +----------+
                   |  User   |         | Employee |
                   +---------+         +----------+
                        |                   |
                        +---------+---------+
                                  |
                                 /_\
                                  |
                             +---------+
                             |  Hacker |
                             +---------+

This is bad for a number of reasons which are beyond the scope of this
RFC.

One possible solution is to linearise the class hierarchy and make it
monotonic.  That is, to re-arrange the base classes into a linear, single
inheritance graph and to ensure that each base class is defined only once.
This may prove restrictive but that may be a Good Thing.

From the previous example, Employee would inherit the same Person base
class as User.

                             +---------+
                             | Person  |
                             +---------+
                                  |
                                 /_\
                                  |
                             +---------+
                             |  User   |
                             +---------+
                                  |
                                 /_\
                                  |
                             +---------+
                             | Employee|
                             +---------+
                                  |
                                 /_\
                                  |
                             +---------+
                             | Hacker  |
                             +---------+

There are other ways to skin this particular cat and further
discussion is probably a subject for another RFC.  I'm sure Damian's
got one in the pipeline :-)
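
As one concrete example of a linearisation, Perl 5.10 and later ship
the C<mro> module, which can compute a C3 ordering for the diamond
above.  Note that C3 is a different scheme from the chain drawn in the
diagram: it places User before Employee, following the order in which
Hacker lists its base classes.

```perl
use strict;
use warnings;
use mro;    # core module providing mro::get_linear_isa()

{ package Person;   use mro 'c3'; }
{ package User;     use mro 'c3'; our @ISA = ('Person'); }
{ package Employee; use mro 'c3'; our @ISA = ('Person'); }
{ package Hacker;   use mro 'c3'; our @ISA = ('User', 'Employee'); }

# C3: every class appears once, and no base precedes its subclasses
my @c3 = @{ mro::get_linear_isa('Hacker') };
print join(' -> ', @c3), "\n";    # Hacker -> User -> Employee -> Person

# Perl 5's default depth-first order, for contrast
my @dfs = @{ mro::get_linear_isa('Hacker', 'dfs') };
print join(' -> ', @dfs), "\n";   # Hacker -> User -> Person -> Employee
```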

=head2 Dot Operator vs String Concatenation

The string concatenation problem might not prove too difficult to
avoid.  In the trivial case, the element following the period in
an object call will not be prefixed by any I<funny character>, or
be a quoted string or constant.

    $foo.$bar;          # string concatenation

    $foo.bar;           # object method

Note this would prevent us from dispatching a method by symbolic
reference.

    my $method = 'bar';
    $object.$method;            # Uh-oh, now there's a leading '$'

One workaround would be to explicitly use the object get() and set()
methods when the attribute name must be determined dynamically.

    $object.get($method);
    $object.set($method, $value);

To avoid conflict with subroutines of the same name we could require that
subroutine calls be prefixed by C<&> to distinguish them.

    sub bar {
        return "bar string";
    }

    $foo.bar();         # object method
    $foo.&bar();        # string concatenation of $foo and "bar string"

Furthermore, we could enable the lexer/parser to always interpret ' . ' (with
surrounding whitespace) as the concatenation operator and an unadorned '.'
as the object dot operator, unless followed by a I<funny character>.

    $foo . bar();       # string concatenation
    $foo.bar();         # object method

    $foo.$bar;          # string concatenation
    $foo.&bar();        # string concatenation

=head2 String Interpolation

We would also need to consider the case of string interpolated
variable references.  One solution would be to apply the dot operation
only when a variable is explicitly scoped with curly braces, but ignore
it otherwise.

    "$foo.bar.baz"    ==>   $foo . '.bar.baz'
    "${foo.bar}.baz"  ==>   $foo.bar() . '.baz'

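The interpolation rule can be mocked up in Perl 5 with a substitution
over a single-quoted template.  This is a sketch under assumptions:
C<interpolate> and C<call_method> are hypothetical helpers, and the
C<Thing> class stringifies to a fixed label so the output is
predictable.

```perl
use strict;
use warnings;

{
    package Thing;
    use overload '""' => sub { 'Thing' };   # stringify to a fixed label
    sub new { return bless {}, shift }
    sub bar { return 'BAR' }
}

# Hypothetical helper: call a method whose name is held in a string
sub call_method {
    my ($object, $method) = @_;
    return $object->$method();
}

# Hypothetical interpolator: $vars maps variable names to values.
# '${name.attr}' applies the dot operation; a plain '$name' does not.
sub interpolate {
    my ($template, $vars) = @_;
    $template =~ s/\$\{(\w+)\.(\w+)\}|\$(\w+)/defined $1 ? call_method($vars->{$1}, $2) : "$vars->{$3}"/ge;
    return $template;
}

my %vars = (foo => Thing->new);
print interpolate('$foo.bar.baz',   \%vars), "\n";   # Thing.bar.baz
print interpolate('${foo.bar}.baz', \%vars), "\n";   # BAR.baz
```
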
=head2 Immutable Classes

Sealing classes (see Dylan, I'll dig out some refs) may be a very good
idea for efficiency as well as for enforcing strict control on the
users of a class.  If it's possible to bind some or all of the object
attributes at compile time then we should be able to make accessing
object attributes and methods very efficient indeed.  We can calculate
the offsets of attributes in the class/object stash at compile time
making our opcodes relocatable against any object activation record.

(lots of hand waving)
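
To wave the hands slightly less: Perl 5 array-based objects give a
feel for fixed attribute offsets.  This is an analogy only, not the
proposed implementation; the slot names are taken from the Product
example earlier.

```perl
use strict;
use warnings;

{
    package Product;

    # Slot offsets are fixed when the class is compiled.  'use constant'
    # lets Perl inline them, so attributes are found by index rather
    # than by a hash lookup on the attribute name.
    use constant { ID => 0, NAME => 1, PRICE => 2 };

    sub new   { my $class = shift; return bless [@_], $class }
    sub id    { return $_[0][ID] }
    sub name  { return $_[0][NAME] }
    sub price { return $_[0][PRICE] }
}

my $p = Product->new('xyz123', 'Carrots', 57);
print $p->name, ' cost ', $p->price, "\n";      # Carrots cost 57
```

The trade-off is the one a sealed class makes: the layout cannot
change after compilation, so subclasses may only append slots.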

=head2 Standard Class Library

We might want to implement a new class hierarchy containing a
well-ordered and consistent set of class modules, similar to Java.
This would complement the more esoteric CPAN module collection.
Having many ways to do something is a Good Thing, but sometimes we'd
like to be told the one way that the authors of Perl think is a good
way.

Then again, we might not.

=head1 REFERENCES

RFC  9: Highlander variables

RFC 21: Replace C<wantarray> with a generic C<want> function

RFC 57: Subroutine prototypes and parameters

Gwydion Dylan programming language homepage: http://www.gwydiondylan.org/