Re: Sequence points, undefined behaviour

2005-09-18 Thread Larry Wall
On Sat, Sep 10, 2005 at 12:22:29PM +0100, Nicholas Clark wrote:
: I might have missed this somewhere in the documentation, but is Perl 6 going
: to have any documented notion of things like sequence points, undefined
: behaviour, etc?

I don't claim to be the expertest expert on sequence points, but
we've said a few things unofficially that should probably get baked
into Synopses at some point if they aren't already.  The general
approach has been that there are some places we guarantee sequence,
and several places we guarantee the lack of sequence, and everything
in the middle is negotiable until 6.0.0 comes out, which is probably
saying that the Perl 6 translation of Pugs (along with its test suite)
will end up being the de facto standard where we haven't been explicit.

But we can say a few things we know now.

: Is it going to mandate that function arguments are evaluated
: in any particular order (eg left to right)?

Depends on what you mean by evaluated, of course.  Anyone making a
call to a function has to marshall the arguments in a Lazy structure
of some kind for binding to the eventual formals, and presumably
that marshalling happens left-to-right.  But the actual evaluation
of the arguments might not happen at that point if what you're
marshalling is intrinsically lazy.

But Perl 6 makes a fundamental distinction between item parameters
and list parameters.  Item (scalar) bindings are not lazy by default,
so as soon as you know you're binding to a scalar, you can evaluate
completely (though not past the Ref level).  I've tried to define the
semantics of binding such that these scalar parameters can be bound
left to right, even in the presence of named parameters, by saying
that the named parameters are interrogated repeatedly each time a
positional parameter is being filled in but appears to be missing.
The named parameters are not required to be evaluated unless bound,
I suspect, but I could be wrong on that.

List parameters are *not* forced to evaluate even when bound, though
some of the arguments may be naturally eager and evaluated themselves
when marshalled into the Lazy.  The lazy pieces of the slurpy array are
forced to evaluate in the order they're demanded, which is typically
left to right, but not necessarily.  You can pop the slurpy array,
and even if there are infinities in it, if the innards of the Lazy
can figure out how to give you the last value, they will.

You can force left-to-right evaluation at call time via **.

Junctional and hyper arguments are specifically allowed to be evaluated in
any order, and it is erroneous for the program to depend on the order.

: Is it going to fix the behaviour if you modify a variable more than
: once in an expression?

Doubt it.  But it's hard to detect it in the general case (and no,
side-effect free languages are not the general case :-), so we just
probably call it erroneous, and encourage people to use a programming
style that limits side effects to one per statement, if not fewer.
Certainly I foresee a number of people adopting a pure-ish discipline
of only writing

my $x ::= stuff();

which can presumably can be treated as completely definitional
if stuff() can be presumed to have no side effects.  (And the ::=
implies that if there are side effects, they happen at compile time.)

Of course, for provably pure stuff you can get away with

my $x := stuff();

or even

my $x = stuff();

And maybe there's even a way to enforce it under use Monads, but I ain't
writing that module...

: Currently Perl 5 makes no written guarantees on many things, yet everyone has
: come to expect the current implementation's choices to be the way things must
: be. Is Perl 6 going to do the same, or will it be an explicit goal that
: anyone can implement a clean room Perl 6 compiler/runtime with only the
: written spec, without needing to match specific case behaviour encoded
: solely in regression tests?

We are not good enough spec writers to get away with that, and even
if we were, nobody's paying us enough to do it.  :-)

I suspect the validation suite will have to serve as the spec for at
least some of the behavior.  That being said, we can certainly do a
lot towards writing the tests in a sufficiently literate style that
they are almost as accessible as a spec, and maybe have more chance
of actually being right than a spec would have.

Maybe if we work the XP aspects right, the spec emerges from the tests.

: Perl 5 is also lets C's undefined behaviour poke through to the language
: level.  Should Perl 6 avoid this completely? What's 1  -1 in Perl 6?

Those sorts of things should be nailed down as much as possible.
We need a consistent semantics more than we need to squeeze every
last ounce of optimization out of the resulting assembly code.
Specialized code can do specialized things with pragmas and types,
but by default we need to be general whenever we can get away with it.

Larry


Junctions, patterns, and fmap again

2005-09-18 Thread Luke Palmer
Okay, due to some discussion on #perl6, I'll assume that the reason my
fmap proposal was Warnocked is because a fair number of people didn't
understand it.  Also, for the people who did understand, this message
includes a complete proposal for the workings of Junctions that will
fluster Damian again.  :-)

Part 1: fmap

De-symmetricalize hyper.  So, what used to be @foo »+« 1 is now @foo
»+ 1; the hyper marker is only on the side of the operator that is
being hypered over.  Of course, we still have @foo »+« @bar.

An object may do the role[1] Functor, in which case it defines a
method 'fmap' which is a generalization of map.  For instance, let's
try it with a tree.

class Tree does Functor {
method fmap (code) {...}
}
class Branch is Tree {
has Tree $.left;
has Tree $.right;
method fmap (code) {
# Return an identical tree with the leaves mapped with code
return Branch.new(
left  = $.left.fmap(code),
right = $.right.fmap(code),
);
}
}
class Leaf is Tree {
has $.data;
method fmap (code) {
# Just apply code to the value in the leaf
return Leaf.new( data = code($.data) )
}
}

Now if we have a $tree that looks like:

+---+---3
|   +---4
+---5

$tree.fmap:{ $_ * 2 } returns:

+---+---6
|   +---8
+---10

The formal signature of fmap is, if T is a Functor:

multi fmap (T[::U], code:(::U -- ::V) -- T[::V])

That is, it takes a T of some type, and a code that maps some type to
some other type, and returns a T of that other type.

Now, hypers are just syntactic sugar for various forms of fmap:

$x »+ $y# $x.fmap:{ $_ + $y }
$x +« $y# $y.fmap:{ $x + $_ }
foo(»$x«)   # $x.fmap:{ foo($_) }

I have a plan for the $x »+« $y form (and also foo(»$x«, »$y«, »$z«)),
but I don't want to go into that right now.  It basically involves
zipping the structures up into tuples and applying the function to the
tuples.

Part 2: Junctions

A Junction is a set of values together with a logical operation to be
applied when the Junction is in boolean context.  When you add to a
Junction, you add to each of its values.  When you pass a Junction to
a function, you call the function for each of its values and
reconstruct based on the return values.  All in all, a Junction sounds
like a perfect candidate to be a Functor.  Except all the fmapping is
implicit, which is what makes Junctions break the transitivity of ,
the orthogonality of the patterns 1 and 2, and all that other
stuff that people like me covet so.

So my proposal is to make a Junction into a plain old Functor.  So
what used to be:

if any(@values) == 4 {...}

Is now:

if any(@values) »== 4 {...}

And the only thing that makes junctions different from Sets (which are
also Functors) is their behavior in boolean context (and their ability
to be Patterns; see below).

Yeah, it's a little teeny bit longer, but I think it is pretty easy to
get used to.  And now junctinve threading looks like threading (just
like with hypers).  The best part is that now it is okay to pass
Junctions to functions since they don't screw with your logical
assumptions, and the signature (Junction $x) is no longer semantically
special; it is a type check just like any other typed signature (and
optional just like any other typed signature :-).

Part 3: Patterns

Like I've said before, a Pattern is a thing that can be on the right
side of a ~~ operator.  Most builtin things are Patterns: numbers,
strings, regexes, lists (maybe), booleans, closures, classes, and
junctions.  Pattern is really a role[2] that requires a match(Any --
Bool).  So then $x ~~ $y is equivalent to $y.match($x).

Here is a table of standard Patterns:

numbers, strings:  eqv
regexes:   match (note this gives us /foo/.match($str) for free)
lists: dunno
booleans:  boolean truth (ignores left side)
closures:  apply closure and test truth
classes:   .does
junctions: see below

A Junction of Patterns does Pattern under its logical operation.  So
$x ~~ any($y,$z) is equivalent to $x ~~ $y || $x ~~ $z.  This is the
only autothreading operation.  And that is so you can say:

given $value {
when 1 | 2 | 3   {...}
when /^foo/ | /^bar/ {...}
}

The signature of grep is:

sub grep (Pattern $pat, [EMAIL PROTECTED]) {...}

So then these all make sense:

grep /foo/, @values;
grep 1|2,   @values;
grep Node,  @objects

And the regular closure form of grep is only for straight boolean
tests against the argument:

grep { lc eq 'foo' } @strings;

This really doesn't give us anything new.  But in my opinion, it
solidifies what we already have greatly, and makes it much easier to
think about and work with (no more guessing :-).  It also trivializes
the smart match table in S04.

Luke

[1] Really a