This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Object Classes

=head1 VERSION

  Maintainer: Andy Wardley <[EMAIL PROTECTED]>
  Date: 11 Aug 2000
  Version: 1
  Mailing List: [EMAIL PROTECTED]
  Number: 95

=head1 ABSTRACT

This RFC proposes a syntax and semantics for defining object classes in Perl 6. It introduces the C<class> keyword, which can be thought of as a special kind of C<package>, and the C<.> dot operator, which is overloaded with the semantics required for accessing attributes (variables and methods) of the class and of objects of the class.

This object class mechanism should exist entirely independently of the existing Perl 5 concept of blessing references into objects. It should be possible and natural to use both techniques within any Perl program.

=head1 CAVEAT

This is a bold and widely encompassing proposal for a formal object oriented mechanism in Perl 6, a very tender subject indeed. I've attempted to keep things as Perlian as possible but it will undoubtedly cause outrage in a significant number of readers who will consider it an utter abomination that should never be allowed within a Camel's length of Perl.

This may prove to be an accurate assessment, but please don't forget that at this stage it's just a Request For Comments on "One Possible Way To Do It". Please feel free to point out all the hideous errors, glaring oversights, obvious omissions and good, old-fashioned, stupid ideas that I've included (or not) in this proposal. Or if you prefer, just ignore it and let Larry be the judge. If it's a turkey then we'll roast it but please don't shoot the farmer for rearing it.

=head1 DESCRIPTION

=head2 Defining a Class

The new C<class> keyword is proposed for defining an object class. This can be thought of as a special kind of C<package>.

    class User;
    # class definition

The class definition would continue until the next C<class> or C<package> declaration, or until the end of the current file.
    class User;             # define 'User' class
    # class definition

    class Project;          # define 'Project' class
    # class definition

    package main;           # no longer defining any class
    # do something cool

Note that classes are not distinct from packages. Classes are defined B<within> a particular package, not B<instead> of a package. Changing packages must end a class declaration because a class can only be defined in one package (but can be imported into others). Thus, we can say 'package main' to signal the end of a class definition, even though we were already in the main package and defining classes that exist in the main package.

=head2 Class Variables

In its simplest form, a class can be thought of as a kind of structure. Lexical variables defined within the scope of a class declaration become attributes (or data members) of that class. The existing C<my> keyword indicates per-instance variables while C<our> is used to create class variables which are shared across all instances of a given class.

    class User;
    our $nusers = 0;                # class variables
    our $domain = 'nowhere.com';
    my  ($id, $name, $email);       # object (instance) variables
    my  @projects;

=head2 Class Methods

Subroutines defined within the scope of a class declaration become methods of that object class. Subroutines may be prefixed by C<my> to indicate object methods or by C<our> to indicate class methods.

    class Foo;

    our sub bar {           # class method
        ...
    }

    my sub baz {            # object method
        ...
    }

The leading C<my> on subroutines should probably be optional. All subroutines not explicitly prefixed C<our> would then implicitly be object methods.

    class Foo;

    our sub bar {           # class method
        ...
    }

    sub baz {               # object method
        ...
    }

Alternatively, we could add the C<class> attribute to class methods. This is consistent with the existing proposals/implementation of subroutine attributes, but it requires more typing and isn't consistent with the proposed naming of class/object variables.
    sub bar : class {
    }

We could, of course, add C<class> as an attribute to class variables as well.

    my $foo : class;        # class variable
    my $bar;                # object variable

We'll stick with C<my> and C<our> for now.

=head2 Extending a Class

A class can be extended by simply adding new variable or subroutine definitions within the particular class scope, just as additional C<package> definitions can be added in Perl 5.

    class User;
    my ($id, $name, $email);

    class Project;
    my ($id, $title);

    ... code passes ...

    # add to 'User' class definition
    class User;
    my @projects;

=head2 Accessing Class Variables

The C<.> dot operator is proposed as a means of accessing class or object attributes. The primary justification for introducing a new operator is to clearly disambiguate it from existing operators that access blessed object methods (C<-E<gt>>, e.g. C<$object-E<gt>method()>) or package variable delimiters (C<::>, e.g. C<$Foo::bar>). We intend to overload the semantics of the dot operator to encompass aspects of both these existing operators. We hope to achieve a consistent syntax based around an operator which will "Do The Right Thing" whenever possible.

It is unfortunate that this clashes with the existing string concatenation operator, also C<.>. However, it should be possible to work around most, if not all, of the conflicts that might arise (see L<ISSUES|ISSUES>). The dot operator is something of a de facto standard in many languages and is the obvious choice (IMHO, of course).

An alternative would be to use the existing C<-E<gt>> operator, but we would have to overload it with new functionality. This is not insurmountable but is more likely to affect backwards compatibility and limit forwards extensibility. Other suggestions welcome.

Class variables are accessed in a similar way to package variables, with the C<.> being substituted for C<::>.
    class Foo;
    our ($bar, @baz);

    package main;
    $Foo.bar = 10;          # set variable '$bar' in class 'Foo'
    push(@Foo.baz, 20);     # push onto list '@baz' in class 'Foo'

One important distinction would be that accessing or assigning to a nonexistent attribute would raise an error, ideally at compile time. (This should probably be enabled/disabled by a C<use/no strict methods> pragma or something similar.)

    $Foo.bad = 30;          # error: 'bad: not a member of class Foo'
    print $Foo.dud;         # error: 'dud: not a member of class Foo'

This shouldn't prevent the class or individual objects from being extended at some later time (by some other method), nor should it restrict the types or multiplicity of data stored within the class or its instantiated objects. The primary concern here is to establish a contract in the class declaration which defines a particular interface which cannot be casually broken by mistyping an attribute name or accidentally violating the calling protocol which the class has established.

If you want fully adaptable objects then blessed hashes should serve you admirably (or use an inner hash and define the class get() and set() methods to marshal it, but more on that later).

In certain cases it may be desirable to mark a class or object as "immutable" to prevent any attempts to extend it. This is similar to the use of the 'final' keyword in Java or the concept of 'sealing' in Dylan (which has a significant optimisation benefit in allowing attribute offsets to be calculated at compile time). This is probably the subject for another RFC.

=head2 Calling Class Methods

The dot operator can also be used to call class methods.

    class Foo;

    our sub bar {
        ...
    }

    package main;
    Foo.bar();

Here we call the method by prefixing the dot operator with the class name. To achieve equivalence with accessing class variables (see L<Calling Object Methods>), we should probably also allow the class name to be prefixed by C<$>, C<@> or C<%> appropriately to indicate/coerce the expected return type.
    class Foo;
    our ($x, @y);

    our sub z {             # class method, so callable as @Foo.z
        my $i = 0;
        return map { $i++ . ": $_" } @y;
    }

    package main;
    print $Foo.x;
    print join(', ', @Foo.y);
    print join(', ', @Foo.z);

The key concept here is to think of the object as a series of 'slots' (attributes) which we can put data into or retrieve data from. Some slots hold one 'thingy' (scalar attributes) and some hold multiple 'thingies' (list/hash attributes). We shouldn't know or care about what's going on inside the object when we access or update these attributes. It's possibly best described as a "tied namespace" (to coin a phrase?).

=head2 Instantiating Objects of a Class

The existing C<new> keyword can be used to create new object instances of a given class.

    class User;
    my ($id, $name, $email);

    package main;
    my $u = new User;

This could also be called as a class method (and should probably be the recommended practice?).

    my $u = User.new();

We should probably treat this 'new' class attribute just like any other and allow it to have a leading '$' (i.e. it's just a class attribute which returns a single thingy).

    my $u = $User.new();

But this might look a little foreign, so possibly no prefix on a class name should assume '$' so that 'User.new' is the same as '$User.new'. Note that this would clash with an existing scalar named '$User' (see L<ISSUES|ISSUES>).

The default behaviour for C<new>, in the absence of any constructor method (see L<Constructor Methods>), will be to assign the parameters passed to the object variables in the order defined within the class.

    class User;
    my ($id, $name, $email);

    package main;
    my $u = User.new('abw', 'Andy Wardley', '[EMAIL PROTECTED]');

RFC 57, "Subroutine prototypes and parameters", proposes named parameters which might be useful in this context.
The syntax is not yet finalised and the proposal may never be officially adopted, but in the case that it is, it might look similar to one of the following examples:

    # assume assignments within argument list are in object scope
    my $u = User.new( $id    = 'abw',
                      $name  = 'Andy Wardley',
                      $email = '[EMAIL PROTECTED]' );

    # use '=>' and apply parser magic to grok named parameters
    my $u = User.new( $id    => 'abw',
                      $name  => 'Andy Wardley',
                      $email => '[EMAIL PROTECTED]' );

    # many other possible forms... suggestions to perl6-language-subs

This would allow construction of sparse objects (i.e. in which only some attributes are set) and hopefully make it easier to remember attributes by name rather than position.

Note that we have to consider how array or hash array attributes might 'gobble up' all remaining arguments.

    class Author;
    my ($name, @books);

    package main;
    my $dna = Author.new('Douglas Adams',
                         "The Hitchhiker's Guide to the Galaxy",
                         'The Restaurant at the End of the Universe',
                         ...);

This is effectively the same issue as for subroutine prototypes. It is discussed further in RFC 57.

=head2 Accessing Object Variables

Object variables are accessed or updated in the same way as class variables, by using the dot operator and specifying the object reference as the receiver (on the left of the dot).

    my $u = User.new('tom', 'Thomas Tank');
    $u.email = '[EMAIL PROTECTED]';
    print "Name: ", $u.name, "\nEmail: ", $u.email, "\n";

The object is effectively a fixed structure. Unlike a hash array, which is inherently adaptable, the object is limited to a restricted interface. Any attempt to access undefined attributes will result in a compile-time error (again, under control of the aforementioned pragma or switch).

=head2 Calling Object Methods

Object methods are called using the same syntax as for accessing attributes.
    class User;
    my ($id, $name, $email);

    sub about {
        # member variables are in scope
        return "Name: $name\nEmail: $email\n";
    }

    package main;
    my $u = User.new('dick', 'Dick Richards', '[EMAIL PROTECTED]');
    print $u.about;

It is entirely intentional that the syntax for accessing attributes is the same as for calling methods. To users of the class these should be synonymous. This allows the designer of the class to change the implementation (for example by changing an attribute to a method to enable some new magic) without requiring any change in the user's code.

It should be possible to define both a variable and a method of the same name. The dot operator should always call the method in preference to accessing the attribute directly. Thus we can extend a class by wrapping an existing attribute in an accessor method and have all existing calls in user code forwarded appropriately.

    class User;
    my ($id, $name, $email);
    our $domain = 'perl.org';

    sub email {
        # automatically generate $email if not defined
        $email ||= "$id\@$domain";
    }

    package main;
    my $u = User.new('lwall', 'Larry Wall');
    print $u.email;         # prints "[EMAIL PROTECTED]"

It should be possible to define methods as lvalue subs so that they can be used on the left hand side of an C<=> assignment. This achieves greater equivalence with attributes. For example:

    class User;
    my ($id, $name, $email);

    sub email : lvalue {
        $email = shift || $email;
        # do something clever
    }

    package main;
    my $u = User.new('lwall', 'Larry Wall');
    $u.email = '[EMAIL PROTECTED]';

=head2 Compound Dot Operations

It should be possible to chain multiple dot operations into a single expression.

    $foo.bar.baz = 10;

=head2 Internal Variables

Within a class definition, all variables are lexically scoped. Within an object method, for example, the object variables are clearly visible unless masked by other lexical variables in a narrower scope.
    class User;
    my ($id, $name, $email);
    our $domain = 'nowhere.com';

    sub summary {
        # new $email lexical masks object variable
        my $email = $email || "$id\@$domain";
        # modify local copy
        $email = "<a href=\"mailto:$email\">$email</a>";
        return "Name: $name\nEmail: $email\n";
    }

The special read-only variable C<$me> (or C<$ME>, C<$self>, C<$this>, etc.) should be implicitly defined in the outermost object scope. Object methods can explicitly reference their attributes and methods through this variable. It is automatically defined and does not need to be passed to methods as a parameter as in Perl 5 blessed objects.

    class User;
    my ($id, $name, $email);

    sub email {
        my $email = shift || return $me.email;
        print "old: ", $me.email, "\n";
        print "new: ", $email, "\n";
        $me.email = $email;
    }

    package main;
    my $u = User.new('abw', 'Andy Wardley', '[EMAIL PROTECTED]');
    print $u.email, "\n";           # [EMAIL PROTECTED]
    $u.email('[EMAIL PROTECTED]');  # old: [EMAIL PROTECTED]
                                    # new: [EMAIL PROTECTED]

Object variables (C<my>) should probably be undefined or inaccessible to class methods, along with the C<$me> variable. Class variables (C<our>) would be visible to all class and object methods.

The special read-only class variable C<$class> should also be defined and visible to class and object methods alike. This should also be an externally accessible attribute allowing the type of any object to be easily determined.

    my $u = User.new('foo', 'Mr Foo');
    print $u.class;         # prints "User"

=head2 Private Variables and Methods

Class or object members which are prefixed with '_' should be considered private and not accessible from outside the class definition. This enforces the existing Perl 5 convention of specifying "private" keys in blessed hashes with an underscore and avoids the need for a new C<private> keyword or subroutine attribute to achieve the same purpose.
    class User;
    our $_user_cache = { };         # private class variable
    my  $_password;                 # private object variable
    my  ($id, $name, $email);       # public attributes

    sub _encrypt {                  # private object method
        ...
    }

=head2 Inheritance

OBSERVATION: Perl 6 classes should probably support single, linear inheritance only for the simple reason that multiple inheritance is generally more trouble than it's worth. It's easy to inherit multiple interfaces (i.e. specify that the object conforms to the interface of other classes without actually inheriting the implementation) or use composition rules (i.e. create internal objects and delegate to them) to achieve much the same effect as MI with far fewer problems (see L<ISSUES|ISSUES>).

The C<isa> keyword can be used to create a subclass of a base class.

    class Person;
    my ($name, $sex, $dob);

    class User isa Person;
    my ($id, $email);

    class Hacker isa User;
    my @cool_hacks;

Each subclass inherits the attributes of its parents in defined order from most generic (super) to most specific (sub). The Hacker class above then contains the attributes as if written:

    class Hacker;
    my ($name, $sex, $dob);         # inherited from Person via User
    my ($id, $email);               # inherited from User
    my @cool_hacks;

If we decide to implement multiple inheritance then it might look like this:

    class Hacker isa User, Employee;

In this case we can still assume that base class attributes are inherited in order defined. We must consider how to handle the problem of inheriting multiple instances of the same base class (i.e. inheritance diamond). See L<ISSUES|ISSUES>.

Attributes in super classes can be redefined by subclasses. The special read-only variable C<$super> should be implicitly defined within the object scope. Through this, object methods can explicitly access attributes of (one of) the parent class(es).
    class User;

    sub foo {
        print "User foo\n";
    }

    class Hacker isa User;

    sub foo {
        print "Hacker foo\n";
        $super.foo;
    }

    package main;
    my $h = Hacker.new();
    $h.foo();               # prints: Hacker foo
                            #         User foo

It might also be useful to allow direct access to specific base class parts, e.g.

    class Hacker isa User;

    sub foo {
        $super.User.foo();  # call foo() method on User base class
    }

The C<super> attribute should be accessible as an external attribute returning the name of the immediate parent class.

    my $h = Hacker.new();
    print $h.class;         # prints "Hacker"
    print $h.super;         # prints "User"

The C<isa> attribute should return a list of the self and parent classes in most-specific to most-general order when called without any arguments. If a specific class name (or names?) is specified then it should return a boolean result indicating if the object is a member of the class(es).

    class Person;
    class User isa Person;
    class Hacker isa User;

    my $h = Hacker.new();
    print join(', ', @h.isa);       # prints "Hacker, User, Person"

    if ($h.isa('Person')) {         # true
        ...
    }

Similarly, the C<can> method should return a list of attributes (variables or methods) that the object supports, or a boolean result for a specific test. These should be returned in order defined.

    class Foo;
    my $foo;
    sub foo { ... };        # wrapper method masks variable
    sub bar { ... };

    class Bar isa Foo;
    my $baz;

    package main;
    my $b = Bar.new();
    print join(', ', @b.can);       # prints "foo, bar, baz"

Note that I've used C<@h.isa> and C<@b.can>. Should this be C<@$h.isa> and C<@$b.can> or C<@{$h.isa}> and C<@{$b.can}> or something else? Highlander variables would make this problem go away (RFC 9).

All classes should probably be implicitly inherited from the 'Class' base class (similar to the UNIVERSAL object in Perl 5). It would then be this class that implements the C<isa>, C<can> and other common methods.
    $obj.can('can');        # always true
    $obj.can('isa');        # ditto
    $obj.isa('Class');      # likewise

We might like to permit the inheritance of multiple interfaces from other classes without inheriting the implementation. The C<can> keyword could be used to inherit interfaces, added after the class name or superclass, or anywhere else within the class definition.

    class Person;
    can Walk, Talk;
    my ($name, $sex, $dob);

    class User isa Person can Login;
    my ($id, $_password);
    ...

    class Hacker isa User can HackPerl;
    ...

On the other hand, we might not. Either way, this is venturing beyond the initial scope of this RFC.

As an aside, readers might like to frighten themselves by substituting 'extends' for 'isa' and 'implements' for 'can' in the above examples (see RFC <mumble>, "Perl is not Java" :-)

=head2 Delegation, Aliasing and Mixins

It should be possible for objects to create internal "plumbing" to help with delegation and interaction with other objects.

One possible solution would be to allow variable attributes to contain references to other class or object attributes that are then traversed automatically when the attribute is accessed.

    class Foo;
    our $foo;
    my $bar;
    sub baz { ... }

    class Bar;
    my $_foo   = Foo.new(...);      # private Foo object
    my $wiz    = \$_foo.bar;        # alias $wiz to $_foo object var
    my $waz    = \$_foo.baz;        # alias $waz to $_foo object method
    my $foobar = \$Foo.foo;         # alias $foobar to Foo class var $foo

    package main;
    my $bar = Bar.new();
    $bar.foobar;            # -> $Foo.foo
    $bar.wiz;               # -> $_foo.bar
    $bar.waz;               # -> $_foo.baz()

It may be desirable to allow all attributes of another class or object to be imported into another object namespace. For example:

    class User;
    my ($name);
    sub welcome { return "Hello World\n" };

    class Hacker;
    import User;            # equivalent to $class.import('User')?
    my $cool_hacks = [];

The 'Hacker' class is not derived from 'User' but contains a copy of the declaration which is added to its own.
The User class is used as a 'Mixin', so named because the definition literally gets mixed in to the enclosing class. Hacker is thus defined as if written:

    class Hacker;
    my ($name);
    sub welcome { return "Hello World\n" };
    my $cool_hacks = [];

Maybe C<mixin> or C<mix> would be a better keyword than C<import>?

    class Hacker;
    mixin User;

It should also be possible to mix in the attributes of a particular object, rather than a class.

    class Helper;
    my $msg;
    sub help { return $msg };

    class Hacker;
    my $helper = Helper.new("Hello World\n");
    mixin $helper;          # equiv. to: my $msg  = \$helper.msg
                            #            my $help = \$helper.help

    package main;
    my $hacker = Hacker.new();
    print $hacker.help;     # -> $hacker.helper.help which prints
                            #    "Hello World\n"

=head2 Constructor Methods

Perl 6 should automatically instantiate objects via the C<new> keyword or class method. Any initialisation method should then be called. This RFC proposes that "special" methods such as these be defined in UPPER CASE. For example:

    class User;
    our $domain = 'perl.org';
    my ($id, $name, $email);

    # initialiser method; $me is already defined along with
    # any attributes specified as arguments
    sub NEW {
        die "User id not specified" unless defined $id;
        die "User name not specified" unless defined $name;
        $email ||= "$id\@$domain";
    }

    package main;
    my $u = User.new('lwall', 'Larry Wall');
    print $u.email;         # prints "[EMAIL PROTECTED]"

The default action of the internal object constructor called by C<new> is to instantiate an object of the required class and then fill its public attribute slots in the order defined with any arguments passed. Named parameters, if used, would allow the attributes to be filled in a more specific manner. If a NEW() method is defined then it should be called at this point.

Base class NEW() constructors should be called in order. Note that the C<$class> variable should be correctly defined in the base class, Person, to contain the name of the derived class, User.
    class Person;
    my ($name, $sex, $dob);

    sub NEW {
        die "$class name not specified" unless $name;
        $sex ||= 'unknown';
        $dob ||= 'unknown';
        print "new Person (name: $name)\n";
    }

    class User isa Person;
    our $domain = 'perl.org';
    my ($id, $email);

    sub NEW {
        die "$class id not specified" unless defined $id;
        $email ||= "$id\@$domain";
        print "new User (name: $name email: $email)\n";
    }

    package main;
    # attributes are $name, $sex, $dob, $id, $email
    my $u = User.new('Larry Wall', undef, undef, 'lwall');
    print "Name: ", $u.name, "\n";
    print "DOB: ", $u.dob, "\n";
    print "Email: ", $u.email, "\n";

Output:

    new Person (name: Larry Wall)
    new User (name: Larry Wall email: [EMAIL PROTECTED])
    Name: Larry Wall
    DOB: unknown
    Email: [EMAIL PROTECTED]

Note one severe limitation of this model. We must provide constructor arguments in exactly the right order to satisfy base class (Person) attributes first, followed by the derived class (User) attributes. This makes our base classes exceptionally fragile to change. If we want to add an attribute to a class then we run the risk of breaking any classes that are derived from it.

For this reason, some form of named parameterisation would be preferred (see RFC 57), e.g.

    # parameters are considered in the object scope
    my $u = User.new($id = 'lwall', $name = 'Larry Wall');

    # other options through various kinds of parsing magic
    my $u = User.new($id => 'lwall', $name => 'Larry Wall');
    my $u = User.new($id := 'lwall', $name := 'Larry Wall');
    my $u = User.new(id = 'lwall', name = 'Larry Wall');

This allows the structure of classes to be changed at any time without affecting existing code. Furthermore, we can specify only the attributes that we care to define and in any order.

Remember that object methods mask object attributes. Thus, it should be possible to define an accessor method around an attribute and have it implicitly called by the constructor to set an attribute.
    class User;
    our $domain = 'perl.org';
    my ($id, $name, $email);

    sub NEW {
        # perhaps these could be pre-conditions?
        die "$class id not specified" unless defined $id;
        die "$class name not specified" unless defined $name;

        # use $id as default $email
        $email ||= $id;
    }

    sub email : lvalue {
        my $new_email = shift || return $email;

        # append domain if not specified
        $new_email = "$new_email\@$domain"
            unless $new_email =~ /@/;

        $email = $new_email;
    }

    package main;
    my $larry = User.new('lwall', 'Larry Wall');

    # alternative using named parameter:
    # e.g. ...User.new($id = 'lwall', $name = 'Larry Wall');

In this example, the C<new> constructor instantiates an object, sets the $id and $name attributes and then calls NEW(). After some validity checks, it defaults the $email attribute to $id. This calls the email sub which appends '@perl.org' to 'lwall' and sets the $email attribute.

In this next example, the $email attribute will be provided explicitly to the constructor.

    # class User as above
    my $larry = User.new('lwall', 'Larry Wall', 'supreme_court');

    # or named parameter equivalent

The C<new> constructor sets the $id and $name attributes directly and then calls the email accessor method to set the $email attribute, appending '@perl.org' as required. When the NEW() method is then called, $email is already set and there's no need to default from the $id value.

One final idea is that we might use named method prototypes (RFC 57) to intercept various constructor attributes. When the internal C<new> constructor is called it could first inspect the prototype for NEW() (if defined) and attempt to map any arguments passed onto those parameters. Any remaining arguments not satisfied by the prototype could then be applied to object attributes, skipping over any that are being handled by the NEW() initialiser. The named prototype variables would be new lexicals within the NEW() method.
    class User;
    our $domain = 'space.doubt.org';
    my ($id, $name, $email);

    sub NEW ($email, $honorific = 'Mr.') {
        $email ||= $id;
        $email = "$email\@$domain"
            unless $email =~ /@/;

        # now update object attribute
        $me.email = $email;

        # $name is the real object attribute, there's no local $name
        $name = "$honorific $name";
    }

    package main;

    # all parameters set directly, no args sent to NEW()
    my $u1 = User.new( $id   = 'elrich',
                       $name = 'Elrich von Lichtenstein' );
    print $u1.name;         # Mr. Elrich von Lichtenstein
    print $u1.email;        # [EMAIL PROTECTED]

    # $honorific forwarded to NEW(), others set directly
    my $u2 = User.new( $id        = 'elrich',
                       $name      = 'Elrich von Lichtenstein',
                       $honorific = 'Count' );
    print $u2.name;         # Count Elrich von Lichtenstein
    print $u2.email;        # [EMAIL PROTECTED]

    # $email and $honorific forwarded to NEW(), others set directly
    my $u3 = User.new( $id        = 'elrich',
                       $name      = 'Elrich von Lichtenstein',
                       $honorific = 'Count',
                       $email     = 'evonlich' );
    print $u3.name;         # Count Elrich von Lichtenstein
    print $u3.email;        # [EMAIL PROTECTED]

A further benefit of this approach is in allowing additional parameters to be passed to the NEW() initialiser (e.g. $honorific) which aren't object attributes and persist only within the scope of the NEW() method.

Without named parameters, we would have to be more careful about the order in which we passed arguments. Let's look at a simple example.

    class User;
    my ($id, $name, $email);

    sub NEW ($id) {
        ...
        $me.id = $id;
    }

The NEW() constructor expects the first parameter as C<$id>. Any remaining positional arguments would be assigned directly to C<$name> and C<$email> in that order, skipping over C<$id> which has been handled elsewhere.

This would allow us to monkey around with the order of constructor parameters (although probably not advisable) and have it do the right thing (for some definition of "right").

    class User;
    my ($id, $name, $email, $pass);

    sub NEW ($name, $email) {
        ...
    }

The constructor parameter order would now be C<($name, $email, $id, $pass)>. Could get hairy, especially with multiple base classes.

=head2 Other Magical Methods

The OLD() method is proposed to complement the NEW() method, being called immediately before the object is destroyed. Each destructor (we'll call them that for now) should be called in reverse order from most specific (sub) class to most general (super).

Objects should probably implement general purpose get() and set() attributes.

    class Author;
    my $name;

    package main;
    my $user = Author.new('Douglas Adams');
    print $user.get('name');            # same as $user.name

    my $key = 'name';
    print $user.get($key);              # likewise

    $user.set('name', 'Douglas Fir');   # $user.name = 'Douglas Fir'

It might be desirable to allow special GET() and SET() methods to be defined to intercept all external accesses and updates to object attributes. When called from within an object definition, the get() method would not be diverted.

    class Astronaut;
    my ($name, ...);

    # NOTE: we're using named prototypes again (see RFC 57), e.g.
    #   sub foo ($bar) {    ===>    sub foo { my ($bar) = @_;
    #       ...                         ...
    #   }                           }

    sub GET ($attr) {
        die "I'm sorry $name, I can't do that"
            if $attr eq 'open_doors';

        # anything not prohibited is allowed
        $me.get($attr);     # Go directly to get(). Do not pass GET().
    }                       # Do not collect $200.

    sub SET ($attr, @value) {
        # do whatever
        $me.set($attr, @value);
    }

Note that the SET() method might receive multiple arguments (@value) when an attribute method is called as:

    $obj.method(10, 20, 30);

The INSPECT() method (or something similar) should be called whenever the object itself is evaluated rather than a specific attribute. In conjunction with Damian Conway's RFC 21, the proposed want() function could be used to return a view of the object in many different formats.

The default, implicit INSPECT() method might look something like this:

    sub INSPECT {
        # I know, I should be using switch and currying....
        # :-)
        if (want('HASH')) {
            # return hash of current attributes and values
            return map { ( $_ => $me.get($_) ) } @me.can;
        }
        elsif (want('ARRAY')) {
            # return list of values
            return map { $me.get($_) } @me.can;
        }
        elsif (want('SCALAR')) {
            # return self for copy by reference
            return $me;
        }
        elsif (want('STRING')) {
            return "$class: "
                 . join(', ', map { "$_ => " . $me.get($_) } @me.can);
        }
        ...etc...
    }

This would allow an object reference ($object in these examples) to be used in many different ways. For example:

    my %objhash = %$object;     # copy attribs/values into hash
    my @values  = @$object;     # copy values into list
    my $obj2    = $object;      # copy by reference
    print "$object";            # stringification

=head2 Loading External Classes

There should be a mechanism for loading external class definitions. The simplest implementation would be to load external modules with C<use>, as in Perl 5. Modules could thus contain package variables and subs (like existing Perl 5 modules) and/or new class declarations.

    #!/usr/bin/perl6
    # '-w' and 'use strict' by default :-)

    use DBI;
    my $dbh5 = DBI->connect(...);   # Perl 5 interface
    my $dbh6 = DBI.connect(...);    # Perl 6 interface

We may prefer to implement a new keyword for loading class modules, such as 'import', 'load', etc. This would allow us to apply different heuristics for finding class modules, perhaps by naming them with a different suffix (e.g. ".pc" instead of ".pm") or by storing them in a different location to "regular" modules.

A class definition, either declared in or imported into a Perl program, would effectively create a scalar variable with the same name as the class in the current namespace (i.e. package or other class definition).

    class User;
    my ($id, $name, $email);

    package main;
    print $User;            # e.g. prints "OBJECT(0x80cf15c)"
    print User;             # same, implicit leading '$'

This would clash with any existing variable called '$User'. For this reason, we would generally suggest that class names are capitalised and variable names lower case.
We also have to consider possible clashes with other variables such as
@User and %User, or adopt the "Highlander Variables" proposed in RFC 9.

Loading an external class definition should have the same effect,
defining this $Classname variable.  This is akin to a package exporting
names by default and it can be a Bad Thing.  This is one possible
justification for a new C<load> (or other) keyword that could be used
like this:

    load User;              # creates $User, a reference to a class
    load User as MyUser;    # creates $MyUser, as above

It might be preferable (or optional) to use leading '$' characters.
This makes it more obvious that a variable is being created.

    load $User;
    load $User as $MyUser;

Hmm, quite icky.  Perhaps then, C<load> could load the class and return
a reference to the singleton class object (i.e. the one object
representing the class itself, rather than an instance of the class).
Then you could assign it to your own class variable.

    my $Dude  = load User;      # $Dude -> User class object
    my $larry = $Dude.new(...);

When called in a void context, it could export the class object into
the caller's package under the default name.

    load User;                  # $User -> User class object
    my $larry = $User.new(...);

Alternately, C<load> could be a method (er, attribute) of the universal
base class, Class?  That way we don't need a new C<load> keyword.

    Class.load('User');
    my $Dude = Class.load('User');

=head1 ISSUES

This section covers a few of the outstanding issues.

=head2 Attribute visibility

I'm not sure about making all non-underscored attributes visible by
default.  It might be better to explicitly declare them in some way.

    our ($foo, $bar) : public;
    my  ($wiz, $waz) : public;
    my  ($id, $ssn)  : readonly;

    # everything else is private
    my ($a, $b, $c);

We can mark subs in the same way, but I think it would be tedious to
have to declare every one as 'public'.

    sub foobar : public {
    }

We could assume that an accessor method inherits the visibility of an
attribute of the same name, but I think this might be hazardous, with
too much happening "behind the scenes".  This is why it might be better
to default everything to public visibility and specifically mark
private items by attribute, leading underscore or some other method.

Furthermore, I see a typical use of this construct being to create
simple, fixed-structure records, and we want to make these simple
things simple, e.g.

    class Product;
    my ($id, $name, $price);

    package main;
    my $p = Product.new('xyz123', 'Carrots', 57);

Default public visibility, no special constructor methods, nice and
simple.  Let the more complex uses require more complex syntax!

=head2 Type Checking

If type checking is to be implemented then it is proposed that data and
interfaces be typed but variables left untyped (although note that
Larry has already suggested that variables should be typable in Perl 6
so this may be a moot point).

Typed variables work something like this:

    # typed variables
    my Dog $spot;       # $spot can only ever be a Dog
    my Cat $felix;      # ditto $felix a Cat

    $spot = $felix;     # error: $felix is not a Dog

Typed data and interfaces would work more like this:

    my $spot  = Dog.new();      # $spot is a Dog for now
    my $felix = Cat.new();      # $felix just happens to be a Cat

    foo($spot);                 # OK, $spot.isa('Dog')
    foo($felix);                # NOT OK, ! $felix.isa('Dog')

    $felix = $spot;             # OK, variables are untyped
    foo($felix);                # OK, $felix.isa('Dog')

    sub foo (Dog $dog) {
        ...
    }

The second is more natural to Perl and should be much easier to
implement.  The only time that type checking is required is at the
point when a subroutine with a typed interface (prototype definition)
is called.  The big downside is that type checking is performed at
runtime, not compile time, which may be why Larry plans to implement
typable variables anyway.  This is a subject for another RFC.
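
To make the "check at the call point" idea concrete, here is a rough
sketch in plain Perl 5, not the proposed Perl 6 syntax.  The C<typed>
helper and the one-line C<Dog> and C<Cat> classes are invented for this
illustration only; the helper merely stands in for what a typed
prototype such as C<sub foo (Dog $dog)> might do behind the scenes.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Minimal stand-in classes (assumed for the example only).
package Dog; sub new { return bless {}, shift }
package Cat; sub new { return bless {}, shift }

package main;

# Hypothetical helper: wrap a sub so that its first argument is
# checked against a class name every time the sub is called.
sub typed {
    my ($class, $code) = @_;
    return sub {
        my ($arg) = @_;
        die "argument is not a $class\n"
            unless blessed($arg) && $arg->isa($class);
        return $code->(@_);
    };
}

# Stands in for the RFC's `sub foo (Dog $dog) { ... }`.
my $foo = typed(Dog => sub { my ($dog) = @_; return ref $dog });

my $spot  = Dog->new;
my $felix = Cat->new;

print $foo->($spot), "\n";              # accepted: $spot isa Dog

my $ok = eval { $foo->($felix); 1 };    # rejected at call time
print "rejected: $@" unless $ok;

$felix = $spot;                         # variables stay untyped
print $foo->($felix), "\n";             # accepted now
```

Nothing is checked when C<$felix> is reassigned; the check only fires
when the wrapped sub is entered, which is exactly the runtime cost
noted above.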

=head2 Multiple methods

We might want to consider how we can use prototype signatures and/or
attributes to resolve attribute calls against multiple defined subs of
the same name.  Ideally, this would implement multimethod dispatch
(Yet Another Conway RFC in the pipeline).

    class Foo;
    my $bar;

    sub bar() {
        # ...
        return $bar;
    }

    sub bar($b) : lvalue {
        $bar = $b;
    }

    package main;
    my $f = Foo.new(10);
    $f.bar(20);         # bar($b)
    $f.bar = 30;        # bar($b)
    print $f.bar;       # bar()

Typed interfaces:

    class Foo;
    my $bar;

    sub bar(Bar $b) {       # expects $b.isa('Bar')
        $bar = $b;
    }

    sub bar(Foo $f) {       # expects $f.isa('Foo')
        $bar = $f.bar;
    }

    sub bar($n) {           # any scalar
        $bar = Bar.new($n);
    }

    sub bar() {             # no arguments
        return $bar;
    }

=head2 Multiple Inheritance

Here's a typical inheritance diamond problem.

    class Person;
    class User isa Person;
    class Employee isa Person;
    class Hacker isa User, Employee;

                    +---------+
                    | Person  |
                    +---------+
                         |
                        /_\
                         |
               +---------+---------+
               |                   |
          +---------+         +----------+
          |  User   |         | Employee |
          +---------+         +----------+
               |                   |
               +---------+---------+
                         |
                        /_\
                         |
                    +---------+
                    | Hacker  |
                    +---------+

This is bad for a number of reasons which are beyond the scope of this
RFC.  One possible solution is to linearise the class hierarchy and
make it monotonic.  That is, to re-arrange the base classes into a
linear, single inheritance graph and to ensure that each base class is
defined only once.  This may prove restrictive but that may be a Good
Thing.

From the previous example, Employee would inherit the same Person base
class as User.

                    +---------+
                    | Person  |
                    +---------+
                         |
                        /_\
                         |
                    +---------+
                    |  User   |
                    +---------+
                         |
                        /_\
                         |
                    +---------+
                    | Employee|
                    +---------+
                         |
                        /_\
                         |
                    +---------+
                    | Hacker  |
                    +---------+

There are other ways to skin this particular cat and further discussion
is probably a subject for another RFC.  I'm sure Damian's got one in
the pipeline :-)

=head2 Dot Operator vs String Concatenation

The string concatenation problem might not prove too difficult to avoid.
In the trivial case, the element following the period in an object call
will not be prefixed by any I<funny character>, or be a quoted string
or constant.

    $foo.$bar;          # string concatenation
    $foo.bar;           # object method

Note this would prevent us from dispatching a method by symbolic
reference.

    my $method = 'bar';
    $object.$method;    # Uh-oh, now there's a leading '$'

One workaround would be to explicitly use the object get() and set()
methods when the attribute name must be determined dynamically.

    $object.get($method);
    $object.set($method, $value);

To avoid conflict with subroutines of a given name we could require
that they be prefixed by C<&> to distinguish them.

    sub bar {
        return "bar string";
    }

    $foo.bar();         # object method
    $foo.&bar();        # string concatenation of $foo and "bar string"

Furthermore, we could enable the lexer/parser to always interpret ' . '
(with surrounding whitespace) as the concatenation operator and an
unadorned '.' as the object dot operator, unless followed by a
I<funny character>.

    $foo . bar();       # string concatenation
    $foo.bar();         # object method
    $foo.$bar;          # string concatenation
    $foo.&bar();        # string concatenation

=head2 String Interpolation

We would also need to consider the case of string interpolated variable
references.  One solution would be to apply the dot operation only when
a variable is explicitly scoped with curly braces, but ignore it
otherwise.

    "$foo.bar.baz"      ==>  $foo . '.bar.baz'
    "${foo.bar}.baz"    ==>  $foo.bar() . '.baz'

=head2 Immutable Classes

Sealing classes (see Dylan, I'll dig out some refs) may be a very good
idea for efficiency as well as for enforcing strict control on the
users of a class.  If it's possible to bind some or all of the object
attributes at compile time then we should be able to make accessing
object attributes and methods very efficient indeed.  We can calculate
the offsets of attributes in the class/object stash at compile time,
making our opcodes relocatable against any object activation record.
(lots of hand waving)

=head2 Standard Class Library

We might want to implement a new class hierarchy containing a
well-ordered and consistent set of class modules, similar to Java.
This would complement the more esoteric CPAN module collection.  Having
many ways to do something is a Good Thing, but sometimes we'd like to
be told the one way that the authors of Perl think is a good way.  Then
again, we might not.

=head1 REFERENCES

RFC 9: Highlander variables

RFC 21: Replace C<wantarray> with a generic C<want> function

RFC 57: Subroutine prototypes and parameters

Gwydion Dylan programming language homepage: http://www.gwydiondylan.org/
