Precedence table update
Here's the current precedence table as I see it, based mostly on what the, er, cabal came up with after the Perl conference. [Cabal members: note that I've demoted cmp and <=> from chaining relationals, and I've moved the pipe operators closer together. I've also generalized the two middle categories to chaining and non-chaining binaries, and stuffed "but" into the non-chaining, since we neglected to deal with it.]

    terms               42 "eek" $x /abc/ (1+2) a(1) :by(2) .meth listop(left)
    method postfix      .foo .+foo .?foo .*foo .+foo .() .[] .{} .«» .=foo
    autoincrement       ++ --
    exponentiation      **
    symbolic unary      ! + - ~ ? * ** +^ ~^ ?^ \
    multiplicative      * / % x xx +& +< +> ~& ~< ~>
    additive            + - ~ +| +^ ~| ~^
    junctive and (all)  &
    junctive or (any)   | ^
    named unary         rand sleep abs etc.
    nonchaining binary  => but does cmp <=> .. ^.. ..^ ^..^
    chaining binary     != == < > <= >= ~~ !~ eq ne lt le gt ge =:=
    tight and           &&
    tight or            || ^^ //
    ternary             ?? ::
    assignment          = := ::= += **= xx= etc. (but not .=)
    list item separator ,
    list op (rightward) <== print push any all true not etc.
    pipe forward        ==>
    loose and           and
    loose or            or xor err
    expr separator      ; {} as control block, statement modifiers

Some random comments:

Hyper ops always have the same precedence as their unhyper versions.

List operators are terms on their left side only, just as in Perl 5. Basically you can think of them kind of as an opening paren without a corresponding closing paren. (Instead they're closed by something of looser precedence.)

The .meth as a term is the one we call "unary dot" because it doesn't have an invocant. But really it's just parsed as a term, because you only get it when a term is expected. The method postfixes are when an operator is expected. Dot itself is not really an operator--it just introduces a postfix operator.
This is important to understand because, as a postfix operator, you only hyperize the left of it:

    @foo».bar

All the bitwise operators are considered to be doing boolean algebra, where 1 * 1 == 1, so they're just classified with the ordinary multiply and divide operations. The bitwise shifts are also multiplicative, but that's because they multiply or divide by two. The cabal felt the various bit operators weren't important enough to warrant separate precedence levels anymore. Even in C, people almost always parenthesize them because they're not sure...

However, even though they've merged with the arithmetic operators, you'll note that by the rules of boolean algebra, the ands are still one precedence level tighter than the ors. That is consistent across all the ands and ors of every type. (And the corresponding not is always tighter than either the and or the or.)

The xx operator is in there with the x operator only for mnemonic purposes. By rights it should probably be a lot looser if we want it to multiply lists, but we'll just force people to parenthesize their lists, as we do with list assignment.

Junctions deserve their own levels because they're actually constructing compound data values out of simple ones, so they need to be looser than the simple data operators but tighter than the comparison operators. Arguably they could be looser than named unaries, but it sort of feels right to keep all the generic unary and binary operators in the middle together. There is no binary nor operator...

The generic non-chaining binaries are mostly the sorts of operators that construct new objects (ranges, pairs, mixins) out of two related values. Though admittedly, the numbers -1, 0, and 1 are a bit of a stretch to think of that way in the case of cmp and <=>. I don't think there's much merit in defining a comparison result object that just happens to return -1, 0, or 1...

We haven't said much about associativity here. Mostly it's what you'd expect.
** and = are right associative, ++/-- are nonassociative. I think the main reason for lumping all the non-chaining binaries together is that we should call them non-associative, and you have to parenthesize to be clear whether you mean:

    ($a => $b) => $c
    ($a .. $b) .. $c
    ($a but $b) but $c

or

    $a => ($b => $c)
    $a .. ($b .. $c)
    $a but ($b but $c)

Assuming those mean anything at all. Certainly a pair can contain a pair as either its key or its value (or both).

Nothing much surprising down to the list ops. We decided to make C<true> and C<not> list ops just to save a level of precedence, though we'll probably actually make them able to process lists of values so you can say weird things like

    if all true $a, $b, $c {...}

and

    if all not $a, $b, $c {...}

though admittedly

    if not all $a, $b, $c {...}

reads a little better, and also presumably works. Or how about this?

    if not all true $a, $b, $c {...}

The == operator is not a list operator
Re: Precedence table update
On Aug 14, 2004, at 12:17 AM, Larry Wall wrote:
> Here's the current precedence table as I see it, based mostly on what the, er, cabal came up with after the Perl conference.

Okay, time to get out the quill and parchment and start work on revising the Periodic Table of the Operators

- Mark "hope the tallow candle lasts 'till dawn" Lentczner

Mark Lentczner
http://www.ozonehouse.com/mark/
[EMAIL PROTECTED]
Re: Precedence table update
On Sat, Aug 14, 2004 at 08:42:51AM -0700, Mark Lentczner wrote:
:
: On Aug 14, 2004, at 12:17 AM, Larry Wall wrote:
: Here's the current precedence table as I see it, based mostly
: on what the, er, cabal came up with after the Perl conference.
:
: Okay, time to get out the quill and parchment and start work on
: revising the Periodic Table of the Operators
:
: - Mark "hope the tallow candle lasts 'till dawn" Lentczner

You'll also want to make sure the zip operator (¥) gets in there, probably with the same precedence as == (unless we decide it's a scalar-only operator, in which case it can be tighter because it would only work on array refs). It also has self-associativity issues much like ^ and == and, if you squint, semicolon inside subscripts.

Larry
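The associativity rules discussed in this thread (right-associative ** and =, non-associative non-chaining binaries that must be parenthesized) can be sketched with a small precedence-climbing parser. This is only an illustration in Python, not Perl 6 and not how any Perl 6 parser is implemented: the `OPS` table is a hypothetical five-level subset of the table, the numeric precedence values are invented for the sketch, and `tokenize`, `parse`, `parse_term` and `parse_expr` are made-up helper names.

```python
import re

# Hypothetical subset of the table: operator -> (precedence, associativity).
# Higher numbers bind tighter; "non" marks a non-associative operator.
OPS = {
    "**": (5, "right"),
    "*": (4, "left"),
    "+": (3, "left"),
    "..": (2, "non"),
    "=>": (2, "non"),
    "but": (2, "non"),
    "=": (1, "right"),
}

TOKEN = re.compile(r"\s*(\*\*|=>|\.\.|[()*+=]|\$?\w+)")


def tokenize(src):
    """Split the source into operators, parentheses and simple terms."""
    src = src.strip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"bad input at {src[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_term(tokens, i):
    """Parse a variable, a number, or a parenthesized expression."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok == "(":
        tree, i = parse_expr(tokens, i + 1, 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("missing )")
        return tree, i + 1
    if tok in OPS or tok == ")":
        raise SyntaxError(f"unexpected {tok!r}")
    return tok, i + 1


def parse_expr(tokens, i, min_prec):
    """Precedence climbing: consume operators that bind at least as tightly as min_prec."""
    left, i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] in OPS:
        op = tokens[i]
        prec, assoc = OPS[op]
        if prec < min_prec:
            break
        # A right-associative operator lets its right operand reuse the same level.
        next_min = prec if assoc == "right" else prec + 1
        right, i = parse_expr(tokens, i + 1, next_min)
        left = (op, left, right)
        # A non-associative operator may not be followed by a peer at its own level.
        if (assoc == "non" and i < len(tokens)
                and tokens[i] in OPS and OPS[tokens[i]][0] == prec):
            raise SyntaxError(f"{op!r} is non-associative; add parentheses")
    return left, i


def parse(src):
    tokens = tokenize(src)
    tree, rest = parse_expr(tokens, 0, 1)
    if rest != len(tokens):
        raise SyntaxError(f"unexpected {tokens[rest]!r}")
    return tree


print(parse("$a = $b = $c"))      # ('=', '$a', ('=', '$b', '$c'))
print(parse("($a .. $b) .. $c"))  # ('..', ('..', '$a', '$b'), '$c')
```

Under these assumptions, `parse("$a .. $b .. $c")` raises a `SyntaxError`, while either parenthesized form is accepted, which is the behavior the message asks for from the non-chaining binaries.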
Synopsis 1 draft 1
=head1 Title

Synopsis 1: Overview

=head1 Author

Larry Wall [EMAIL PROTECTED]

=head1 Version

  Maintainer:
  Date:
  Last Modified:
  Number: 1
  Version: 0

This document summarizes Apocalypse 1, which covers the initial design concept. (These Synopses also contain updates to reflect the evolving design of Perl 6 over time, unlike the Apocalypses, which are frozen in time as historical documents. These updates are not marked--if a Synopsis disagrees with its Apocalypse, assume the Synopsis is correct.)

The other basic assumption is that if we don't talk about something in these Synopses, it's the same as it was in Perl 5.

=head1 Random Thoughts

=over 4

=item *

The word "apocalypse" historically meant merely "a revealing", and we're using it in that unexciting sense.

=item *

If you ask for RFCs from the general public, you get a lot of interesting but contradictory ideas, because people tend to stake out polar positions, and none of the ideas can build on each other.

=item *

Larry's First Law of Language Redesign: Everyone wants the colon.

=item *

RFCs are rated on "PSA": whether they point out a real Problem, whether they present a viable Solution, and whether that solution is likely to be Accepted as part of Perl 6.

=item *

Languages should be redesigned in roughly the same order as you would present the language to a new user.

=item *

Perl 6 should be malleable enough that it can evolve into the imaginary perfect language, Perl 7. This darwinian imperative implies support for multiple syntaxes above and multiple platforms below.

=item *

Many details may change, but the essence of Perl will remain unchanged. Perl will continue to be a multiparadigmatic, context-sensitive language. We are not turning Perl into any other existing language.

=item *

Migration is important. The perl interpreter will assume that it is being fed Perl 5 code unless the code starts with a class or module keyword, or you specifically tell it you're running Perl 6 code in some other way.

=item *

Scaling is one of those areas where Perl needs to be multiparadigmatic and context sensitive. Therefore your main code is allowed to be lax, while module and class code is (by default) required to be strict.

=item *

It must be possible to write policy metamodules that invoke other modules on the user's behalf.

=item *

If you want to treat everything as objects in Perl 6, Perl will help you do that. If you don't want to treat everything as objects, Perl will help you with that viewpoint as well.

=item *

Operators are just functions with funny names and syntax.

=item *

Language designers are still necessary to synthesize unrelated ideas into a coherent whole.

=back
Synopsis 2 draft 1
=head1 Title

Synopsis 2: Bits and Pieces

=head1 Author

Larry Wall [EMAIL PROTECTED]

=head1 Version

  Maintainer: your name here
  Date:
  Last Modified:
  Number: 2
  Version: 0

This document summarizes Apocalypse 2, which covers small-scale lexical items and typological issues. (These Synopses also contain updates to reflect the evolving design of Perl 6 over time, unlike the Apocalypses, which are frozen in time as historical documents. These updates are not marked--if a Synopsis disagrees with its Apocalypse, assume the Synopsis is correct.)

=head1 Atoms

=over 4

=item *

In the abstract, Perl is written in Unicode, and has consistent Unicode semantics regardless of the underlying text representations.

=item *

Perl can count Unicode line and paragraph separators as line markers, but that behavior had better be configurable so that Perl's idea of line numbers matches what your editor thinks about Unicode lines.

=back

=head1 Molecules

=over 4

=item *

Multiline comments will be provided by extending the syntax of POD to nest C<=begin COMMENT>/C<=end COMMENT> correctly without the need for C<=cut>. (Doesn't have to be "COMMENT"--any unrecognized POD stream will do to make it a comment. Bare C<=begin> and C<=end> probably aren't good enough though, unless you want all your comments to end up in the manpage...)

Probably we could have single paragraph comments with C<=for COMMENT> as well. That would let C<=for> keep its meaning as the equivalent of a C<=begin> and C<=end> combined.

=item *

Intra-line comments will not be supported in standard Perl (but it would be trivial to declare them as a macro).

=back

=head1 Built-In Data Types

=over 4

=item *

In support of OO encapsulation, there is a new fundamental datatype: "opaque". External access to opaque objects is always through method calls, even for attributes.

=item *

Perl 6 will have an optional type system that helps you write safer code that performs better.

=item *

Perl 6 will support the notion of "properties" on various kinds of objects.
Properties are like object attributes, except that they're managed by the individual object rather than by the object's class. According to A12, properties are actually implemented by a kind of mixin mechanism.

=item *

Properties applied to compile-time objects such as variables and classes are also called "traits". Traits are not expected to change at run time.

=item *

Perl 6 is an OO engine, but you're not generally required to think in OO when that's inconvenient. However, some built-in concepts such as filehandles will be more object-oriented in a user-visible way.

=item *

A variable's type is an interface contract indicating what sorts of values the variable may contain. More precisely, it's a promise that the object or objects contained in the variable are capable of responding to the methods of the indicated "role". See A12 for more about roles. A variable object may itself be bound to a container type that specifies how the container works without necessarily specifying what kinds of things it contains.

=item *

You'll be able to ask for the length of an array, but it won't be called that, because "length" does not specify units. So C<.elems> is the number of array elements. (You can also ask for the length of an array in bytes or codepoints or graphemes. Same for strings.)

=item *

C<my Dog $spot> by itself does not automatically call a C<Dog> constructor. The actual constructor syntax turns out to be C<my Dog $spot.=new;>, making use of the C<.=> mutator method-call syntax.

=item *

If you say

    my int @array is MyArray;

you are declaring that the elements of C<@array> are integers, but that the array itself is implemented by the C<MyArray> class. Untyped arrays and hashes are still perfectly acceptable, but have the same performance issues they have in Perl 5.

=item *

Built-in object types start with an uppercase letter: Int, Num, Str, Bit, Ref, Scalar, Array, Hash, Rule and Code]. Non-object (value) types are lowercase: int, num, str, bit, and ref.
Value types are primarily intended for declaring compact array storage. However, Perl will try to make those look like their corresponding uppercase types if you treat them that way.

=item *

Perl 6 will intrinsically support big integers and rationals through its system of type declarations. C<Int> automatically supports promotion to arbitrary precision. C<Rat> supports arbitrary precision rational arithmetic. Value types like C<int> and C<num> imply the natural machine representation for integers and floating-point numbers, respectively, and do not promote to arbitrary precision. Untyped scalars use C<Int> semantics rather than C<int>.

=item *

Perl 6 should by default make standard IEEE floating point concepts visible, such as C<Inf> (infinity) and C<NaN> (not a number). It should also be at least pragmatically possible to throw exceptions on overflow.

=item *

A C<str> is always a byte buffer, whereas a C<Str> is a Unicode string object of some sort.
Re: Synopsis 2 draft 1
@@ -165,7 +165,7 @@ =head1 Built-In Data Types
 =item *
 
 Built-in object types start with an uppercase letter: Int, Num, Str,
-Bit, Ref, Scalar, Array, Hash, Rule and Code]. Non-object (value) types
+Bit, Ref, Scalar, Array, Hash, Rule and Code. Non-object (value) types
 are lowercase: int, num, str, bit, and ref. Value types are primarily
 intended for declaring compact array storage. However, Perl will try
 to make those look like their corresponding uppercase types if
@@ -198,6 +198,8 @@ =head1 Built-In Data Types
 
 =head1 Variables
 
+=over 4
+
 =item *
 
 The C$pkg'var syntax is dead. Use C$pkg::var instead.
@@ -243,7 +245,7 @@ =head1 Variables
 the C.as('%03d') method to do an implicit sprintf on the value. To
 format an array value separated by commas, supply a second argument:
 C.as('%03d', ', '). To format a hash value or list of pairs, include
-formats for both key and value in the first string: C .as('%s: %s', \n).
+formats for both key and value in the first string: C .as('%s: %s', \n) .
 
 =item *
 
@@ -680,7 +682,7 @@ =head1 Context
 
 =back
 
-=head Lists
+=head1 Lists
 
 =over 4
 
@@ -759,6 +761,8 @@ =head Lists
 =back
 
 =head1 Files
+
+=over 4
 
 =item *
 
Re: Synopsis 2 draft 1
Larry Wall writes:

> Synopsis 2: Bits and Pieces

Nice. (Minor pod corrections sent as a diff under separate cover.)

> You may interpolate a package name into an identifier using C<::($expr)> where you'd ordinarily put the package name. The parens are required.
>
> XXX Actually, C<::{$expr}> might be made to work instead, given that that's how you treat a package symbol table as a hash, and inner packages are stored in their parent hash. And curlies would be more consistent with closure interpolation in strings. We'd just need to make sure C<$::{$foo}::bar> parses correctly as a single name token.

Using braces seems more intuitive, and hence easier to remember.

> To get a Perlish representation of any data value, use the C<.repr> method. This will put quotes around strings, square brackets around list values, curlies around hash values, etc., such that standard Perl could reparse the result.
>
> XXX .repr is what Python calls it, I think. Is there a better name?

Yes; I've no suggestions as to what it might be, but surely there's _got_ to be a better name than C<.repr>.

> To get a formatted representation of any scalar data value, use the C<.as('%03d')> method to do an implicit sprintf on the value. To format an array value separated by commas, supply a second argument: C<.as('%03d', ', ')>. To format a hash value or list of pairs, include formats for both key and value in the first string: C<< .as('%s: %s', "\n") >>.

Yay -- that sounds very useful!

> As with Perl 5 array interpolation, the elements are separated by a space. (Except that a space is not added if the element already ends in some kind of whitespace.

I like that exception; it means that if all your array elements end with line-breaks, you don't end up with all but the first one being indented (which confused me lots when I was just starting out with Perl, and I've seen many others do it since).

> A bare closure also interpolates in double-quotish context.
> It may not be followed by any dereferencers, since you can always put them inside the closure. ... The old disambiguation syntax ... is dead. Use closure curlies instead:
>
>     {$foo[$bar]}
>     {$foo}[$bar]

That last example seems to violate the previous stipulation about not following a closure by dereferencers.

> XXX We could yet replace <$foo> with $foo.more or $foo.iter or $foo.shift or some such (but not $foo.next or $foo.readline),

That sounds good to me -- C<< while (<$file>) >> is one of the least-intuitive bits of syntax to get across to people learning Perl; there doesn't seem to be any reason why this particular method call should get a purely symbolic name, especially when something much more common such as C<print> doesn't.

Something like C<.iter> isn't much more to type, and it doesn't involve pressing Shift (or possibly something even more exotic on international keyboards) to type the pointies.

> and steal the angles for something else.

If past performance is anything to go by, the main victim of freeing the pointies for another purpose would be Piers -- threads on this mailing list of people discussing operator syntax have a habit of getting quickly out of control.

For what it's worth, I'd be happy to use ordinary pointies instead of the guillemets for quoting words (and hash keys and the like), leaving the guillemets just for hyper ops. I still think your original analysis that word-quoting is more common than file-iterating is correct, that both of them are more common than hyper ops, and that there's some advantage to having the more complicated-looking (and -to-type) characters only being used for the more complicated operators.

Thank you again for coming up with this!

Smylers
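The whitespace exception quoted in this message (elements are separated by a space unless an element already ends in whitespace) can be sketched as a small joining function. This is a Python illustration of the rule as stated, not Perl 6 code; `interpolate_elements` is a hypothetical name, and treating an empty element like any other non-whitespace element is an assumption the draft does not address.

```python
def interpolate_elements(elements):
    """Join elements with single spaces, skipping the space after any
    element that already ends in whitespace (for example a newline)."""
    pieces = []
    last = len(elements) - 1
    for index, element in enumerate(elements):
        text = str(element)
        pieces.append(text)
        ends_in_space = bool(text) and text[-1].isspace()
        if index != last and not ends_in_space:
            pieces.append(" ")
    return "".join(pieces)


words = ["a", "b", "c"]
lines = ["one\n", "two\n", "three\n"]

print(repr(interpolate_elements(words)))  # 'a b c'
print(repr(interpolate_elements(lines)))  # 'one\ntwo\nthree\n'
print(repr(" ".join(lines)))              # 'one\n two\n three\n'
```

The last line shows the plain space-joined behavior for comparison: every line after the first picks up a leading space, which is the indentation surprise the message describes.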
Re: Synopsis 2 draft 1
On Sat, Aug 14, 2004 at 09:56:34PM +, Smylers wrote:
: A bare closure also interpolates in double-quotish context. It may
: not be followed by any dereferencers, since you can always put them
: inside the closure. ... The old disambiguation syntax ... is dead.
: Use closure curlies instead:
:
:     {$foo[$bar]}
:     {$foo}[$bar]
:
: That last example seems to violate the previous stipulation about not
: following a closure by dereferencers.

That's the point--it isn't a dereferencer. It's literal brackets. It's replacing the old Perl 1 distinction:

    ${foo[$bar]}
    ${foo}[$bar]

simply by putting the $ inside instead of outside, and relying on general closure interpolation, getting rid of two specific exceptions for the price of one generality. Seems like a win to me.

Larry
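For readers who know Python, f-strings give a rough analogy for the distinction drawn here: whatever sits inside the braces is evaluated as code, and brackets outside the braces are literal text. This is only an analogy under that assumption, not a statement about how Perl 6 interpolation is implemented; the variable names mirror the example in the message.

```python
foo = "abc"
bar = 1

# Subscript inside the braces: evaluated as part of the expression.
inside = f"{foo[bar]}"

# Braces close before the bracket: the brackets are literal characters,
# and bar is interpolated separately between them.
outside = f"{foo}[{bar}]"

print(inside)   # b
print(outside)  # abc[1]
```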
Re: Synopsis 2 draft 1
On Sat, Aug 14, 2004 at 09:56:34PM +, Smylers wrote:
: You may interpolate a package name into an identifier using
: C<::($expr)> where you'd ordinarily put the package name. The parens
: are required.
:
: XXX Actually, C<::{$expr}> might be made to work instead, given that
: that's how you treat a package symbol table as a hash, and inner
: packages are stored in their parent hash. And curlies would be more
: consistent with closure interpolation in strings. We'd just need to
: make sure C<$::{$foo}::bar> parses correctly as a single name token.
:
: Using braces seems more intuitive, and hence easier to remember.

Okay, that section now reads:

You may interpolate a string into a package or variable name using C<::{$expr}> where you'd ordinarily put a package or variable name. The string is allowed to contain additional instances of C<::>, which will be interpreted as package nesting. You may only interpolate entire names, since the construct starts with C<::>, and either ends immediately or is continued with another C<::> outside the curlies. All symbolic references are done with this notation:

    $foo = "Foo";
    $foobar = "Foo::Bar";
    $::{$foo}               # package-scoped $Foo
    $::{"MY::$foo"}         # lexically-scoped $Foo
    $::{"*::$foo"}          # global $Foo
    $::{$foobar}            # $Foo::Bar
    $::{$foobar}::baz       # $Foo::Bar::baz
    $::{$foo}::Bar::baz     # $Foo::Bar::baz
    $::{$foobar}baz         # ILLEGAL at compile time (no operator baz)
    ${$foobar}::baz         # ILLEGAL at run time (no hard ref in $foobar)

Note that unlike in Perl 5, initial C<::> doesn't imply global. Package names are searched for from inner lexical scopes to outer, then from inner packages to outer. The global namespace is the last place it looks. You must use the C<*> package to force the search to start in the global namespace.

: To get a formatted representation of any scalar data value, use the
: C<.as('%03d')> method to do an implicit sprintf on the value. To
: format an array value separated by commas, supply a second argument:
: C<.as('%03d', ', ')>.
: To format a hash value or list of pairs,
: include formats for both key and value in the first string: C<<
: .as('%s: %s', "\n") >>.
:
: Yay -- that sounds very useful!

The main problem with it is that there's no way to write the default behavior.

: As with Perl 5 array interpolation, the elements are separated by a
: space. (Except that a space is not added if the element already ends
: in some kind of whitespace.
:
: I like that exception; it means that if all your array elements end with
: line-breaks, you don't end up with all but the first one being indented
: (which confused me lots when I was just starting out with Perl, and I've
: seen many others do it since).

That's the default behavior you can't write with .as(). Takes a .map or some such.

: XXX We could yet replace <$foo> with $foo.more or $foo.iter or
: $foo.shift or some such (but not $foo.next or $foo.readline),
:
: That sounds good to me -- C<< while (<$file>) >> is one of the
: least-intuitive bits of syntax to get across to people learning Perl;
: there doesn't seem to be any reason why this particular method call should
: get a purely symbolic name, especially when something much more common
: such as C<print> doesn't.
:
: Something like C<.iter> isn't much more to type, and it doesn't involve
: pressing Shift (or possibly something even more exotic on international
: keyboards) to type the pointies.

But .iter is ugly, ugly, ugly. Worse than .repr in my opinion. Unfortunately, most of the good names are taken, like "all", "each". That's why we've tended to end up back with <$IN>.

: and steal the angles for something else.
:
: For what it's worth, I'd be happy to use ordinary pointies instead of
: the guillemets for quoting words (and hash keys and the like), leaving
: the guillemets just for hyper ops.
: I still think your original analysis
: that word-quoting is more common than file-iterating is correct, that
: both of them are more common than hyper ops, and that there's some
: advantage to having the more complicated-looking (and -to-type)
: characters only being used for the more complicated operators.

The problem with that is that, both visually and conceptually, it's more ambiguous with the infix:< operator. You'd get people writing things like $foo<2> and wondering why it blows up. $foo«2» is much more visually distinctive, and I think visual distinctions trump ease of typing.

I'm actually wondering whether we should leave <...> free for user-defined quoting purposes. Actually, we could leave <...> defaulting to meaning .iter (or whatever), but if the user overrode the meaning of <...>, they'd just have to use the method instead of

One is almost tempted to say that iterators should be declared as self-growing arrays, but then you get problems with the fact that @*IN in list context would not, in fact, read the array destructively, which it needs to do.

One is also tempted to play with syntax