Precedence table update

2004-08-14 Thread Larry Wall
Here's the current precedence table as I see it, based mostly
on what the, er, cabal came up with after the Perl conference.

[Cabal members: note that I've demoted cmp and <=> from chaining
relationals, and I've moved the pipe operators closer together.
I've also generalized the two middle categories to chaining and
non-chaining binaries, and stuffed but into the non-chaining,
since we neglected to deal with it.]

terms               42 "eek" $x /abc/ (1+2) a(1) :by(2) .meth listop(left)
method postfix      .foo .+foo .?foo .*foo .+foo .() .[] .{} .«» .=foo
autoincrement       ++ --
exponentiation      **
symbolic unary      ! + - ~ ? * ** +^ ~^ ?^ \
multiplicative      * / % x xx +& +< +> ~& ~< ~>
additive            + - ~ +| +^ ~| ~^
junctive and (all)  &
junctive or (any)   | ^
named unary         rand sleep abs etc.
nonchaining binary  => but does cmp <=> .. ^.. ..^ ^..^
chaining binary     != == < <= > >= ~~ !~ eq ne lt le gt ge =:=
tight and           &&
tight or            || ^^ //
ternary             ?? ::
assignment          = := ::= += **= xx= etc. (but not .=)
list item separator ,
list op (rightward) <== print push any all true not etc.
pipe forward        ==>
loose and           and
loose or            or xor err
expr separator      ; {} as control block, statement modifiers

Some random comments:

Hyper ops always have the same precedence as their unhyper versions.

List operators are terms on their left side only, just as in Perl 5.
Basically you can think of them kind of as an opening paren without a
corresponding closing paren.  (Instead they're closed by something
of looser precedence.)

The .meth as a term is the one we call unary dot because it doesn't
have an invocant.  But really it's just parsed as a term, because you
only get it when a term is expected.  The method postfixes are when
an operator is expected.  Dot itself is not really an operator--it
just introduces a postfix operator.  This is important to understand
because, as a postfix operator, you only hyperize the left of it:

@foo».bar

All the bitwise operators are considered to be doing boolean algebra,
where 1 * 1 == 1, so they're just classified with the ordinary multiply
and divide operations.  The bitwise shifts are also multiplicative,
but because they multiply or divide by two.  The cabal felt the various
bit operators weren't important enough to warrant separate precedence
levels anymore.  Even in C, people almost always parenthesize them
because they're not sure...

However, even though they've merged with the arithmetic operators,
you'll note that by the rules of boolean algebra, the ands are still
one precedence level tighter than the ors.  That is consistent
across all the ands and ors of every type.  (And the corresponding
not is always tighter than either the and or the or.)

The xx operator is in there with the x operator only for mnemonic
purposes.  By rights it should probably be a lot looser if we want
it to multiply lists, but we'll just force people to parenthesize
their lists, as we do with list assignment.

Junctions deserve their own levels because they're actually
constructing compound data values out of simple ones, so they need
to be looser than the simple data operators but tighter than the
comparison operators.  Arguably they could be looser than named
unaries, but it sort of feels right to keep all the generic unary
and binary operators in the middle together.

There is no binary nor operator...

The generic non-chaining binaries are mostly the sorts of operators
that construct new objects (ranges, pairs, mixins) out of two related
values.  Though admittedly, the numbers -1, 0, and 1 are a bit of a
stretch to think of that way in the case of cmp and <=>.  I don't
think there's much merit in defining a comparison result object
that just happens to return -1, 0, or 1...

We haven't said much about associativity here.  Mostly it's what you'd
expect.  ** and = are right associative, ++/-- are nonassociative.
I think the main reason for lumping all the non-chaining binaries
together is that we should call them non-associative, and you have
to parenthesize to be clear whether you mean:

($a => $b) => $c
($a .. $b) .. $c
($a but $b) but $c

or

$a => ($b => $c)
$a .. ($b .. $c)
$a but ($b but $c)

Assuming those mean anything at all.  Certainly a pair can contain
a pair as either its key or its value (or both).
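
To make the non-associativity idea concrete, here is a rough sketch in Python (not Perl 6) of a precedence-climbing parser over a tiny, hypothetical slice of the table. The OPS table, the tokenizer and the tuple-shaped parse tree are all assumptions made for illustration and are not part of the proposal; the only point is that two operators from the same non-associative level in a row are rejected unless parenthesized.

```python
import re

# Hypothetical mini-table: (precedence, associativity). Higher binds tighter.
OPS = {
    "**": (3, "right"),
    "+": (2, "left"),
    "=>": (1, "non"),
    "..": (1, "non"),
    "but": (1, "non"),
}

TOKEN = re.compile(r"\s*(\*\*|=>|\.\.|[+()]|\$?\w+)")


def tokenize(src):
    """Split the source string into a flat list of token strings."""
    src = src.strip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"bad input at {src[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(src):
    """Return a nested (op, left, right) tuple for the expression."""
    tokens = tokenize(src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def primary():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        tok = advance()
        if tok == "(":
            node = expr(0)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in OPS or tok == ")":
            raise SyntaxError(f"unexpected {tok!r}")
        return tok

    def expr(min_prec):
        left = primary()
        while peek() in OPS and OPS[peek()][0] >= min_prec:
            op = advance()
            prec, assoc = OPS[op]
            # Right-associative operators let the right operand absorb
            # further operators of the same level; the others do not.
            right = expr(prec if assoc == "right" else prec + 1)
            left = (op, left, right)
            # A second operator from the same non-associative level is an error.
            if assoc == "non" and peek() in OPS and OPS[peek()][0] == prec:
                raise SyntaxError(f"{op!r} is non-associative; add parentheses")
        return left

    tree = expr(0)
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree
```

With this sketch, parse("($a => $b) => $c") and parse("$a => ($b => $c)") both succeed and give differently nested trees, while the unparenthesized parse("$a => $b => $c") raises SyntaxError.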

Nothing much surprising down to the list ops.  We decided to make
C<true> and C<not> list ops just to save a level of precedence,
though we'll probably actually make them able to process lists of
values so you can say weird things like

if all true $a, $b, $c {...}

and

if all not $a, $b, $c {...}

though admittedly

if not all $a, $b, $c {...}

reads a little better, and also presumably works.  Or how about this?

if not all true $a, $b, $c {...}

The == operator is not a list operator 

Re: Precedence table update

2004-08-14 Thread Mark Lentczner
On Aug 14, 2004, at 12:17 AM, Larry Wall wrote:
Here's the current precedence table as I see it, based mostly
on what the, er, cabal came up with after the Perl conference.
Okay, time to get out the quill and parchment and start work on 
revising the Periodic Table of the Operators

- Mark "hope the tallow candle lasts 'till dawn" Lentczner
Mark Lentczner
http://www.ozonehouse.com/mark/


Re: Precedence table update

2004-08-14 Thread Larry Wall
On Sat, Aug 14, 2004 at 08:42:51AM -0700, Mark Lentczner wrote:
: 
: On Aug 14, 2004, at 12:17 AM, Larry Wall wrote:
: Here's the current precedence table as I see it, based mostly
: on what the, er, cabal came up with after the Perl conference.
: 
: Okay, time to get out the quill and parchment and start work on 
: revising the Periodic Table of the Operators
: 
:   - Mark "hope the tallow candle lasts 'till dawn" Lentczner

You'll also want to make sure the zip operator (¥) gets in there,
probably with the same precedence as == (unless we decide it's
a scalar-only operator, in which case it can be tighter because it
would only work on array refs).  It also has self-associativity issues
much like ^ and == and, if you squint, semicolon inside subscripts.

Larry


Synopsis 1 draft 1

2004-08-14 Thread Larry Wall
=head1 Title

Synopsis 1: Overview

=head1 Author

Larry Wall

=head1 Version

Maintainer:
Date:
Last Modified:
Number: 1
Version: 0

This document summarizes Apocalypse 1, which covers the initial
design concept.  (These Synopses also contain updates to reflect
the evolving design of Perl 6 over time, unlike the Apocalypses,
which are frozen in time as historical documents.  These updates are
not marked--if a Synopsis disagrees with its Apocalypse, assume the
Synopsis is correct.)

The other basic assumption is that if we don't talk about something in these
Synopses, it's the same as it was in Perl 5.

=head1 Random Thoughts

=over 4

=item *

The word apocalypse historically meant merely a revealing,
and we're using it in that unexciting sense.

=item *

If you ask for RFCs from the general public, you get a lot of
interesting but contradictory ideas, because people tend to stake
out polar positions, and none of the ideas can build on each other.

=item *

Larry's First Law of Language Redesign: Everyone wants the colon.

=item *

RFCs are rated on PSA: whether they point out a real Problem,
whether they present a viable Solution, and whether that solution is
likely to be Accepted as part of Perl 6.

=item *

Languages should be redesigned in roughly the same order as you would
present the language to a new user.

=item *

Perl 6 should be malleable enough that it can evolve into the imaginary
perfect language, Perl 7.  This darwinian imperative implies support
for multiple syntaxes above and multiple platforms below.

=item *

Many details may change, but the essence of Perl will remain unchanged.
Perl will continue to be a multiparadigmatic, context-sensitive
language.  We are not turning Perl into any other existing language.

=item *

Migration is important.  The perl interpreter will assume that it
is being fed Perl 5 code unless the code starts with a class or
module keyword, or you specifically tell it you're running Perl 6
code in some other way.

=item *

Scaling is one of those areas where Perl needs to be multiparadigmatic and
context sensitive.  Therefore your main code is allowed to be lax, while
module and class code is (by default) required to be strict.

=item *

It must be possible to write policy metamodules that invoke other
modules on the user's behalf.

=item *

If you want to treat everything as objects in Perl 6, Perl will help
you do that.  If you don't want to treat everything as objects, Perl
will help you with that viewpoint as well.

=item *

Operators are just functions with funny names and syntax.

=item *

Language designers are still necessary to synthesize unrelated ideas
into a coherent whole.

=back


Synopsis 2 draft 1

2004-08-14 Thread Larry Wall
=head1 Title

Synopsis 2: Bits and Pieces

=head1 Author

Larry Wall

=head1 Version

Maintainer: your name here
Date:
Last Modified:
Number: 2
Version: 0

This document summarizes Apocalypse 2, which covers small-scale
lexical items and typological issues.  (These Synopses also contain
updates to reflect the evolving design of Perl 6 over time, unlike the
Apocalypses, which are frozen in time as historical documents.
These updates are not marked--if a Synopsis disagrees with its
Apocalypse, assume the Synopsis is correct.)

=head1 Atoms

=over 4

=item *

In the abstract, Perl is written in Unicode, and has consistent Unicode
semantics regardless of the underlying text representations.

=item *

Perl can count Unicode line and paragraph separators as line markers,
but that behavior had better be configurable so that Perl's idea of
line numbers matches what your editor thinks about Unicode lines.

=back

=head1 Molecules

=over 4

=item *

Multiline comments will be provided by extending the syntax of POD
to nest C<=begin COMMENT>/C<=end COMMENT> correctly without the need
for C<=cut>.  (Doesn't have to be COMMENT--any unrecognized POD
stream will do to make it a comment.  Bare C<=begin> and C<=end>
probably aren't good enough though, unless you want all your comments
to end up in the manpage...)

Probably we could have single paragraph comments with C<=for COMMENT>
as well.  That would let C<=for> keep its meaning as the equivalent
of a C<=begin> and C<=end> combined.

=item *

Intra-line comments will not be supported in standard Perl (but it would
be trivial to declare them as a macro).

=back

=head1 Built-In Data Types

=over 4

=item *

In support of OO encapsulation, there is a new fundamental datatype:
opaque.  External access to opaque objects is always through method
calls, even for attributes.

=item *

Perl 6 will have an optional type system that helps you write safer
code that performs better.

=item *

Perl 6 will support the notion of properties on various kinds of
objects.  Properties are like object attributes, except that they're
managed by the individual object rather than by the object's class.
According to A12, properties are actually implemented by a
kind of mixin mechanism.

=item *

Properties applied to compile-time objects such as variables and
classes are also called traits.  Traits are not expected to change
at run time.

=item *

Perl 6 is an OO engine, but you're not generally required to think
in OO when that's inconvenient.  However, some built-in concepts such
as filehandles will be more object-oriented in a user-visible way.

=item *

A variable's type is an interface contract indicating what sorts
of values the variable may contain. More precisely, it's a promise
that the object or objects contained in the variable are capable of
responding to the methods of the indicated role.  See A12 for more
about roles.  A variable object may itself be bound to a container
type that specifies how the container works without necessarily
specifying what kinds of things it contains.

=item *

You'll be able to ask for the length of an array, but it won't be
called that, because length does not specify units.  So
C<.elems> is the number of array elements.  (You can also
ask for the length of an array in bytes or codepoints or graphemes.
Same for strings.)

=item *

C<my Dog $spot> by itself does not automatically call a C<Dog> constructor.
The actual constructor syntax turns out to be C<my Dog $spot.=new;>,
making use of the C<.=> mutator method-call syntax.

=item *

If you say

my int @array is MyArray;

you are declaring that the elements of C<@array> are integers,
but that the array itself is implemented by the C<MyArray> class.
Untyped arrays and hashes are still perfectly acceptable, but have
the same performance issues they have in Perl 5.

=item *

Built-in object types start with an uppercase letter: Int, Num, Str,
Bit, Ref, Scalar, Array, Hash, Rule and Code.  Non-object (value) types
are lowercase: int, num, str, bit, and ref.  Value types are primarily
intended for declaring compact array storage.  However, Perl will
try to make those look like their corresponding uppercase types if
you treat them that way.

=item *

Perl 6 will intrinsically support big integers and rationals through
its system of type declarations.  C<Int> automatically supports
promotion to arbitrary precision.  C<Rat> supports arbitrary precision
rational arithmetic.  Value types like C<int> and C<num> imply
the natural machine representation for integers and floating-point
numbers, respectively, and do not promote to arbitrary precision.
Untyped scalars use Int semantics rather than int.

=item *

Perl 6 should by default make standard IEEE floating point concepts
visible, such as C<Inf> (infinity) and C<NaN> (not a number).
It should also be at least pragmatically possible to throw exceptions
on overflow.

=item *

A C<str> is always a byte buffer, whereas a C<Str> is a Unicode string
object of some sort.  
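
As a loose analogy only, the split between the promoting object types (Int, Rat) and the fixed-width value types (int, num) described in the items above can be sketched in Python, whose built-in integers already promote to arbitrary precision. The helper wrap_int64 is hypothetical, and the 64-bit width and wrap-around behavior are assumptions: the draft says only that value types do not promote, not what happens on overflow.

```python
import math
from fractions import Fraction


def wrap_int64(n):
    """Reduce n to a signed 64-bit value (assumed width; wrap-around is an assumption)."""
    n &= (1 << 64) - 1
    return n - (1 << 64) if n >= (1 << 63) else n


big = 2 ** 63                  # one past the largest signed 64-bit integer

# Int-like: the result simply grows to whatever precision is needed.
promoted = big + 1

# int-like: the same arithmetic confined to 64 bits no longer gives big + 1.
native = wrap_int64(big + 1)

# Rat-like: exact rational arithmetic with no rounding.
third_plus_sixth = Fraction(1, 3) + Fraction(1, 6)

# num-like: machine floating point, including the IEEE special values.
inf = float("inf")
nan = float("nan")
is_nan = math.isnan(nan)
```

Here promoted is exactly 2**63 + 1, native comes out negative under the assumed wrap-around, and third_plus_sixth equals Fraction(1, 2) exactly.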

Re: Synopsis 2 draft 1

2004-08-14 Thread Smylers
@@ -165,7 +165,7 @@ =head1 Built-In Data Types
 =item *
 
 Built-in object types start with an uppercase letter: Int, Num, Str,
-Bit, Ref, Scalar, Array, Hash, Rule and Code].  Non-object (value) types
+Bit, Ref, Scalar, Array, Hash, Rule and Code.  Non-object (value) types
 are lowercase: int, num, str, bit, and ref.  Value types are primarily
 intended for declaring compact array storage.  However, Perl will
 try to make those look like their corresponding uppercase types if
@@ -198,6 +198,8 @@ =head1 Built-In Data Types
 
 =head1 Variables
 
+=over 4
+
 =item *
 
 The C<$pkg'var> syntax is dead.  Use C<$pkg::var> instead.
@@ -243,7 +245,7 @@ =head1 Variables
 the C<.as('%03d')> method to do an implicit sprintf on the value.
 To format an array value separated by commas, supply a second argument:
 C<.as('%03d', ', ')>.  To format a hash value or list of pairs, include
-formats for both key and value in the first string: C<< .as('%s: %s', "\n")>>.
+formats for both key and value in the first string: C<< .as('%s: %s', "\n") >>.
 
 =item *
 
@@ -680,7 +682,7 @@ =head1 Context
 
 =back
 
-=head Lists
+=head1 Lists
 
 =over 4
 
@@ -759,6 +761,8 @@ =head Lists
 =back
 
 =head1 Files
+
+=over 4
 
 =item *



Re: Synopsis 2 draft 1

2004-08-14 Thread Smylers
Larry Wall writes:

 Synopsis 2: Bits and Pieces

Nice.  (Minor pod corrections sent as a diff under separate cover.)

 You may interpolate a package name into an identifier using
 C<::($expr)> where you'd ordinarily put the package name.  The parens
 are required.
 
 XXX Actually, C<::{$expr}> might be made to work instead, given that
 that's how you treat a package symbol table as a hash, and inner
 packages are stored in their parent hash.  And curlies would be more
 consistent with closure interpolation in strings.  We'd just need to
 make sure C<$::{$foo}::bar> parses correctly as a single name token.

Using braces seems more intuitive, and hence easier to remember.

 To get a Perlish representation of any data value, use the C<.repr>
 method.  This will put quotes around strings, square brackets around
 list values, curlies around hash values, etc., such that standard Perl
 could reparse the result.  XXX .repr is what Python calls it, I think.
 Is there a better name?

Yes; I've no suggestions as to what it might be, but surely there's
_got_ to be a better name than C<.repr>.

 To get a formatted representation of any scalar data value, use the
 C<.as('%03d')> method to do an implicit sprintf on the value.  To
 format an array value separated by commas, supply a second argument:
 C<.as('%03d', ', ')>.  To format a hash value or list of pairs,
 include formats for both key and value in the first string: C<<
 .as('%s: %s', "\n") >>.

Yay -- that sounds very useful!

 As with Perl 5 array interpolation, the elements are separated by a
 space.  (Except that a space is not added if the element already ends
 in some kind of whitespace.  

I like that exception; it means that if all your array elements end with
line-breaks, you don't end up with all but the first one being indented
(which confused me lots when I was just starting out with Perl, and I've
seen many others do it since).

 A bare closure also interpolates in double-quotish context.  It may
 not be followed by any dereferencers, since you can always put them
 inside the closure. ...  The old disambiguation syntax ... is dead.
 Use closure curlies instead:
 
 {$foo[$bar]}
 {$foo}[$bar]

That last example seems to violate the previous stipulation about not
following a closure by dereferencers.

 XXX We could yet replace <$foo> with $foo.more or $foo.iter or
 $foo.shift or some such (but not $foo.next or $foo.readline),

That sounds good to me -- C<< while (<$file>) >> is one of the
least-intuitive bits of syntax to get across to people learning Perl;
there doesn't seem to be any reason why this particular method call should
get a purely symbolic name, especially when something much more common
such as C<print> doesn't.

Something like C<.iter> isn't much more to type, and it doesn't involve
pressing Shift (or possibly something even more exotic on international
keyboards) to type the pointies.

 and steal the angles for something else.

If past performance is anything to go by, the main victim of freeing the
pointies for another purpose would be Piers -- threads on this mailing
list of people discussing operator syntax have a habit of getting
quickly out of control.

For what it's worth, I'd be happy to use ordinary pointies instead of
the guillemets for quoting words (and hash keys and the like), leaving
the guillemets just for hyper ops.  I still think your original analysis
that word-quoting is more common than file-iterating is correct, that
both of them are more common than hyper ops, and that there's some
advantage to having the more complicated-looking (and -to-type)
characters only being used for the more complicated operators.

Thank you again for coming up with this!

Smylers



Re: Synopsis 2 draft 1

2004-08-14 Thread Larry Wall
On Sat, Aug 14, 2004 at 09:56:34PM +, Smylers wrote:
:  A bare closure also interpolates in double-quotish context.  It may
:  not be followed by any dereferencers, since you can always put them
:  inside the closure. ...  The old disambiguation syntax ... is dead.
:  Use closure curlies instead:
:  
:  {$foo[$bar]}
:  {$foo}[$bar]
: 
: That last example seems to violate the previous stipulation about not
: following a closure by dereferencers.

That's the point--it isn't a dereferencer.  It's literal brackets.  It's
replacing the old Perl 5 distinction:

${foo[$bar]}
${foo}[$bar]

simply by putting the $ inside instead of outside, and relying on
general closure interpolation, getting rid of two specific exceptions
for the price of one generality.  Seems like a win to me.

Larry


Re: Synopsis 2 draft 1

2004-08-14 Thread Larry Wall
On Sat, Aug 14, 2004 at 09:56:34PM +, Smylers wrote:
:  You may interpolate a package name into an identifier using
:  C<::($expr)> where you'd ordinarily put the package name.  The parens
:  are required.
:  
:  XXX Actually, C<::{$expr}> might be made to work instead, given that
:  that's how you treat a package symbol table as a hash, and inner
:  packages are stored in their parent hash.  And curlies would be more
:  consistent with closure interpolation in strings.  We'd just need to
:  make sure C<$::{$foo}::bar> parses correctly as a single name token.
: 
: Using braces seems more intuitive, and hence easier to remember.

Okay, that section now reads:

You may interpolate a string into a package or variable name using
C<::{$expr}> where you'd ordinarily put a package or variable name.
The string is allowed to contain additional instances of C<::>, which
will be interpreted as package nesting.  You may only interpolate
entire names, since the construct starts with C<::>, and either ends
immediately or is continued with another C<::> outside the curlies.
All symbolic references are done with this notation:

$foo = "Foo";
$foobar = "Foo::Bar";
$::{$foo}            # package-scoped $Foo
$::{"MY::$foo"}      # lexically-scoped $Foo
$::{"*::$foo"}       # global $Foo
$::{$foobar}         # $Foo::Bar
$::{$foobar}::baz    # $Foo::Bar::baz
$::{$foo}::Bar::baz  # $Foo::Bar::baz
$::{$foobar}baz      # ILLEGAL at compile time (no operator baz)
${$foobar}::baz      # ILLEGAL at run time (no hard ref in $foobar)

Note that unlike in Perl 5, initial C<::> doesn't imply global.
Package names are searched for from inner lexical scopes to outer,
then from inner packages to outer.  The global namespace is the last
place it looks.  You must use the C<*> package to force the search
to start in the global namespace.
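
The search order just described (inner lexical scopes to outer, then inner packages to outer, then the global namespace) might be sketched as follows. This is Python rather than Perl 6, and the resolve function, its arguments and the dict-based namespaces are all hypothetical stand-ins rather than anything the design specifies.

```python
def resolve(name, lexical_scopes, package_chain, global_ns):
    """Return (where, value) for the first namespace that defines name.

    lexical_scopes: list of dicts, innermost scope first
    package_chain:  list of (package_name, dict) pairs, innermost first
    global_ns:      dict that is searched last
    """
    # 1. Lexical scopes, from the innermost outward.
    for depth, scope in enumerate(lexical_scopes):
        if name in scope:
            return (f"lexical[{depth}]", scope[name])
    # 2. Packages, from the innermost outward.
    for pkg_name, symbols in package_chain:
        if name in symbols:
            return (pkg_name, symbols[name])
    # 3. The global namespace is the last place to look.
    if name in global_ns:
        return ("*", global_ns[name])
    raise NameError(name)


# Example data: "Foo" is visible both in the current package and globally.
lexicals = [{"x": 1}, {"y": 2}]
packages = [("Outer::Inner", {"Foo": "inner Foo"}), ("Outer", {"Bar": "outer Bar"})]
globals_ns = {"Foo": "global Foo", "Baz": "global Baz"}

found_foo = resolve("Foo", lexicals, packages, globals_ns)
found_baz = resolve("Baz", lexicals, packages, globals_ns)
```

In this sketch found_foo comes from Outer::Inner even though a global Foo exists, which mirrors the point that the global one is only reached by starting the search in the global namespace explicitly.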

:  To get a formatted representation of any scalar data value, use the
:  C<.as('%03d')> method to do an implicit sprintf on the value.  To
:  format an array value separated by commas, supply a second argument:
:  C<.as('%03d', ', ')>.  To format a hash value or list of pairs,
:  include formats for both key and value in the first string: C<<
:  .as('%s: %s', "\n") >>.
: 
: Yay -- that sounds very useful!

The main problem with it is that there's no way to write the default behavior.
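
For readers who want to see the three quoted forms side by side, here is a small Python approximation. The function name format_as, the type checks and the default separator of a single space are assumptions for illustration; the draft does not say how .as would be implemented.

```python
def format_as(value, fmt, sep=" "):
    """Apply a printf-style format to a scalar, a list, or a dict of pairs."""
    if isinstance(value, dict):
        # Hash or list of pairs: the format consumes key and value together.
        return sep.join(fmt % (k, v) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        # Array: format each element, then join with the second argument.
        return sep.join(fmt % (v,) for v in value)
    # Scalar: a plain implicit sprintf.
    return fmt % (value,)


scalar_text = format_as(7, "%03d")
array_text = format_as([1, 22, 333], "%03d", ", ")
hash_text = format_as({"a": 1, "b": 2}, "%s: %s", "\n")
```

Note that the separator here is inserted unconditionally between elements, so this sketch cannot express a rule that depends on what each element ends with.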

:  As with Perl 5 array interpolation, the elements are separated by a
:  space.  (Except that a space is not added if the element already ends
:  in some kind of whitespace.  
: 
: I like that exception; it means that if all your array elements end with
: line-breaks, you don't end up with all but the first one being indented
: (which confused me lots when I was just starting out with Perl, and I've
: seen many others do it since).

That's the default behavior you can't write with .as().  Takes a .map or
some such.
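
A minimal sketch of that default interpolation rule, written in Python with a hypothetical helper named interpolate: elements are joined with a space, except that no space is added after an element that already ends in whitespace.

```python
def interpolate(elements, sep=" "):
    """Join elements with sep, skipping sep after whitespace-terminated items."""
    parts = []
    last = len(elements) - 1
    for i, item in enumerate(elements):
        text = str(item)
        parts.append(text)
        # Add the separator only between elements, and only when the
        # element does not already end in some kind of whitespace.
        if i != last and not text[-1:].isspace():
            parts.append(sep)
    return "".join(parts)


words = interpolate(["a", "b", "c"])
lines = interpolate(["one\n", "two\n", "three\n"])
```

The per-element test is what makes this a job for a map-like loop rather than a fixed separator argument: in the lines example nothing is indented after the first line break.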

:  XXX We could yet replace <$foo> with $foo.more or $foo.iter or
:  $foo.shift or some such (but not $foo.next or $foo.readline),
: 
: That sounds good to me -- C<< while (<$file>) >> is one of the
: least-intuitive bits of syntax to get across to people learning Perl;
: there doesn't seem to be any reason why this particular method call should
: get a purely symbolic name, especially when something much more common
: such as C<print> doesn't.
: 
: Something like C<.iter> isn't much more to type, and it doesn't involve
: pressing Shift (or possibly something even more exotic on international
: keyboards) to type the pointies.

But .iter is ugly, ugly, ugly.  Worse than .repr in my opinion.
Unfortunately, most of the good names are taken, like all,
each.  That's why we've tended to end up back with <$IN>.

:  and steal the angles for something else.
: 
: For what it's worth, I'd be happy to use ordinary pointies instead of
: the guillemets for quoting words (and hash keys and the like), leaving
: the guillemets just for hyper ops.  I still think your original analysis
: that word-quoting is more common than file-iterating is correct, that
: both of them are more common than hyper ops, and that there's some
: advantage to having the more complicated-looking (and -to-type)
: characters only being used for the more complicated operators.

The problem with that is that, both visually and conceptually, it's
more ambiguous with the infix:< operator.  You'd get people writing
things like $foo<2 and wondering why it blows up.  $foo«2» is much
more visually distinctive, and I think visual distinctions trump
ease of typing.

I'm actually wondering whether we should leave <...> free for
user-defined quoting purposes.  Actually, we could leave <...>
defaulting to meaning .iter (or whatever), but if the user overrode the
meaning of <...>, they'd just have to use the method instead of 

One is almost tempted to say that iterators should be declared as
self-growing arrays, but then you get problems with the fact that
@*IN in list context would not, in fact, read the array destructively,
which it needs to do.

One is also tempted to play with syntax