Unified prelude, FFI, multiple runtimes

2005-09-12 Thread Yuval Kogman
Hi,

please point your browser to

"http://nothingmuch.woobling.org/compilation_of_circular_prelude."
~ any(<graffle vdx pdf png jpg>).pick;

My proposition:

1.  *the* perl 6 compiler should ship a reference implementation of
the prelude, which is circular. For example

multi *infix:<*> (Int $x, Int $y) { [+] $x xx $y }

is the reference implementation of multiplication on integers.

2.  each block of code has a cryptographic digest, which is the hash
of its body combined with the digests of all the functions it
calls. This digest is stable regardless of optimization, since it
is computed over the PIL structure.

3.  a foreign function interface must exist, and function somewhat
like the Inline:: modules do today (presumably with a
lower-level, zero-copy interface as an option).

4.  FFI definitions are defined using a uniform interface, whose
most basic interface is a constructor for a code object that
takes a runtime and a string body, e.g.

Code::Foreign.new(:language<C>, :body(int foo () { return 10 }));

(Some FFIs might only allow construction of foreign functions at
compile time.)

5.  Functions have a notion of equivalence. This is managed based on
the digest (see the sketch after this list). For example

my &c_int_mul = BEGIN { Code::Foreign.new(:language<C>, :body(
int foo (int x, int y) { return x * y }
)) };

multi infix:<*> (int $x, int $y --> int) {
[+] $x xx $y;
}

my $str = &infix:<*><int, int --> int>.digest; # must specify the variant

&c_int_mul.set_equiv($str); # could be in another file

# or, if they are maintained together

&c_int_mul.set_equiv(&infix:<*><int, int --> int>);

This equivalence is with respect to the semantics of the input
and output. The test suite can supposedly assure that these are
really equivalent by running the same tests against either
version.

6.  the runtime is just an FFI provider, one that happens to be the
default provider too

7.  When applying code in the runtime, the runtime is free to use
any equivalent function

8.  In order to run perl code at all, the runtime must provide
native equivalents for the basic operations, and for things that
can't have reference implementations, like IO calls (which are
implemented in the prelude as stubs).

9.  static analysis may be leveraged to compile direct calls to
native functions when compile time resolution is possible. In
the example graph, for instance, no eval or symbol table
assignments are made, so there is no way the code will ever
change. Hence the entire program can be pre-resolved. This
should be controlled via the 'optimize' pragma.
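
To make items 5-7 concrete, here is a minimal sketch of the
equivalence machinery, written in Perl 6 syntax as it runs today
(Raku). All the names (%equiv, register, set-equiv, apply) are
invented stand-ins rather than a proposed interface, and the digests
are passed in as literal strings where a real implementation would
compute them from PIL:

    # Hypothetical equivalence registry: the runtime may substitute
    # any implementation registered as equivalent to a given digest.
    my %equiv;      # digest => implementation currently chosen
    my %digest-of;  # function name => its digest

    # Register a pure-Perl reference implementation under its digest.
    sub register(Str $name, Str $digest, &ref-impl) {
        %digest-of{$name} = $digest;
        %equiv{$digest} //= &ref-impl;   # fall back to the reference
    }

    # A runtime (i.e. an FFI provider) claims a faster equivalent.
    sub set-equiv(Str $digest, &native) {
        %equiv{$digest} = &native;
    }

    # Application is free to use whichever equivalent is installed.
    sub apply(Str $name, |args) {
        %equiv{ %digest-of{$name} }.(|args);
    }

    register('infix:<*>', 'DIGEST-OF-INT-MUL',
             -> Int $x, Int $y { [+] $x xx $y });   # slow reference
    set-equiv('DIGEST-OF-INT-MUL',
              -> Int $x, Int $y { $x * $y });       # "native" version

    say apply('infix:<*>', 6, 7);   # 42, via the native equivalent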

The reasons for this will now be discussed.

A unified prelude with reference implementations for even the
simplest operations gives us several things:

* a circular structure that doesn't make sense when we try to
  run it, since the prelude depends on itself (bad)
    * we must break the circularity by going to native operations
* a reference spec that alternative runtimes can compare against
* a way to kickstart implementations
    * just provide a root set of native operations, enough to
      break the circularity at one single point
* demagicalization of the language

Since FFIs are going to be a core feature of perl 6, they can be
used to bootstrap the whole compilation process. In effect, the
primitive operations are now just FFI calls to the runtime we
happen to be executing on.

To promote code reuse and to simplify the model, the notion of
equivalence is introduced, letting the runtime pick which version of
a function (FFI or native) it uses.

To make things safe, when the prelude is bug-fixed and the runtime
is not yet updated, the cryptographic hash of the function changes,
so it is no longer equal to the native one based on the way they are
paired.

To make things modular, the pairing of FFI and pure Perl functions
is, thanks to the hashing scheme, orthogonal to where they are
defined.

This has some nice properties:

Modules like Template::Stash::XS are deprecated. Instead, an FFI
based Template::Stash can be automatically loaded from the
runtime library prefix, and it will be set as equivalent to the
pure perl Template::Stash.

Modules like DBD::your_db which rely on a certain library can be
stubbed in Perl 6 to tell the user that they must use a certain
runtime. The stubs are set as equal to the FFI calls into the
library.

WRT MMD, you can set the entire MM equivalent to
a certain foreign function, and you can also set any variant
individually. You can even set a single variant to be equivalent to
a multimethod to make the FFI implementation simpler. The compiler
simply presents the runtime with all the possible MMD choices, and
lets the runtime choose between conflicting ones.

Re: Unified prelude, FFI, multiple runtimes

2005-09-12 Thread Yuval Kogman
On Mon, Sep 12, 2005 at 13:15:33 +0300, Yuval Kogman wrote:
> To make things safe, when the prelude is bug-fixed and the runtime
> is not yet updated, the cryptographic hash of the function changes,
> so it is no longer equal to the native one based on the way they are
> paired.

It should be noted that this equivalence should be transitive. For
example, if the prelude's definition of foo is changed, but only
for readability or optimization, and its behavior did not change at
all, it can simply be marked as equal to the hash of the previous
version.

This way the runtimes' equality tables don't have to be updated
every time the prelude changes - only when its behavior changes.
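
A tiny sketch of what such a transitive table might look like, again
in today's Raku syntax with invented names (%same-as, declare-equal,
canonical):

    # Each behavior-preserving change declares its new digest equal
    # to the previous one; chasing the chain recovers the oldest
    # digest, which is what runtime equality tables are keyed on.
    my %same-as;

    sub declare-equal(Str $new, Str $old) { %same-as{$new} = $old }

    sub canonical(Str $digest) {
        my $d = $digest;
        $d = %same-as{$d} while %same-as{$d}:exists;
        $d;
    }

    declare-equal('HASH-v2', 'HASH-v1');   # readability-only rewrite
    declare-equal('HASH-v3', 'HASH-v2');   # another no-op change

    say canonical('HASH-v3');   # HASH-v1, so existing tables still apply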

-- 
 ()  Yuval Kogman [EMAIL PROTECTED] 0xEBD27418  perl hacker 
 /\  kung foo master: /me sneaks up from another MIME part: neeyah!





Re: Unified prelude, FFI, multiple runtimes

2005-09-12 Thread Yuval Kogman
On Mon, Sep 12, 2005 at 13:15:33 +0300, Yuval Kogman wrote:

The circularity issue was not made clear in the email or the
diagram. Here is what I meant:

The prelude operators are mutually recursive at some point, and
completely pure. A pathetic example:

multi infix:<-> (Int $x, Int $y) { $x + 1 - $y - 1 }
multi infix:<-> (Int $x, 0) { $x }

multi infix:<+> (Int $x, Int $y) { ($x + 1) + ($y - 1) }
multi infix:<+> (Int $x, 0) { $x }

multi infix:<+> (Int $x, 1) { ... } # really a stub, requires native increment
multi infix:<-> (Int $x, 1) { ... } # really a stub, requires native decrement

(I should note that as the language grows it becomes rather
circular, especially once control flow and combinators get
involved...)

This allows runtime authors to provide a minimal root set of
operations to break the circularity of the prelude (and even this
can be done incrementally, I guess), and get a slow but functional
system in very little time.

This root set is dynamic, and can be chosen by the runtime
implementor based on how well this fits into the target arch.

This opens a door to a variety of levels of target
implementations... The metamodel, for example, could be implemented
in the prelude. When it is not implemented natively it will be slow,
but will work. Serious runtime implementations can override the
parts of the metamodel that profiling has shown to be slow.

-- 
 ()  Yuval Kogman [EMAIL PROTECTED] 0xEBD27418  perl hacker 
 /\  kung foo master: /me does not drink tibetian laxative tea: neeyah!





Summary for the last 3 weeks

2005-09-12 Thread The Perl 6 Summarizer
The Perl 6 Summary from 2005-08-24 to 2005-09-11
It's been a while hasn't it?

We'll start as usual with perl6-compiler

This week in perl6-compiler
  Changed ?? :: to ?? !! in Pugs
Following discussion of the ternary operator in perl6-language, Benjamin
Smith altered Pugs to use the new ?? !! syntax.

http://xrl.us/hjb5

Meanwhile in perl6-internals
  Which 'English'?
Joshua Hoblitt posted a patch to intro.pod fixing a few typos and
wondered whether the docs should be in British or American English.
Apparently the Perl 5 style rule is that British authors shouldn't be
required to write American English, and vice versa, but that a
consistent style within a document is preferred. The consensus so far
seems to be "any documentation is good, so write what's comfortable for
you".

http://xrl.us/hjb6

  Meanwhile, at the Test
Warne just got Trescothick out. 109 for 4.

  Python PMCs
Sam Ruby, Leo and Chip had a discussion of how to implement Python
semantics for Parrot. I'm not sure I followed what was going on, but it
looked like good 'crunchy' stuff.

http://xrl.us/hjb7

  Zcode interpreter release
Amir Karger announced that he'd adopted the Zcode interpreter that Leo
posted in February (having, according to Amir, done the hard parts).
Apparently there are 41 opcodes to implement just to get version 3 of the
Z-machine working, and then there's the problem of making Z-ops into
loadable Parrot ops. He had a few problems with the test suite, which got
fixed in discussion.

http://xrl.us/hjb8

  Pirate refactoring report
Michal Wallace posted an update on Pirate (the python to parrot
compiler). He and Curtis Hall have been taking advantage of a Google
Summer of Code grant to refactor the (Curse! Now Flintoff's out. Caught
& bowled Warne for 8) current mess. Their first step was a generic
transformation module which has apparently made life easier for the
compiler module.

They've also produced a plan in code for how they hope they'll have
things working once the refactoring's finished and asked for comments.
So far comments have not been forthcoming.

http://xrl.us/hjb9

http://xrl.us/hjca

  Tcl in the leo-ctx5 branch
Will Coleda's been having a crack at getting ParTcl working with the
leo-ctx5 branch and had a few problems. It turns out that he'd tickled a
bug that Leo described as 'a bit non-trivial'. It took him a while, but
it got fixed eventually (over 10 days, but he did have the excuse of
being at YAPC::Europe for a chunk of that time).

http://xrl.us/hjcb

  Meanwhile at the Oval
They've gone in for lunch at 127 for 5. Hopefully I'll be able to get
down to some summary writing without being on the edge of my seat for a
while.

  Branch Review
Chip posted a review of the leo-ctx5 branch prior to its merge,
describing it as "a significant improvement". The body of the review
covers user-visible changes and a few niggles with the current state of
the branch. Leo replied with a few questions and explanations.

http://xrl.us/hjcc

  GMC release
Nattfodd announced the 'release' of GMC, the generational garbage
collector he's been working on as part of Google's Summer of Code. It's
not quite bug-free yet, but the SoC deadline was the 1st of September,
so that's when it got released. Discussion ensued, hopefully helping to
triangulate bugs.

http://xrl.us/hjcd

  Call for B0rked
Following a discussion with Chip and Leo, chromatic posted a call for
entries in a 'very specific TODO list': a list of things that should
work, but don't. He contributed a couple himself. Various other
suggestions were offered.

http://xrl.us/hjce

  Optimizer Documentation
Curtis Rawls spent part of his Summer of Code project wishing there was
better documentation of the Parrot optimizer. So he wrote some. Brent
Royal-Gordon converted it to POD format, and Leo asked for someone to
add it to the repository.

http://xrl.us/hjcf

  HLL Namespace Design
Matt Diephouse posted a list of namespace features that he thinks are
necessary to support all the target languages and asked for some
comments. He got several, including one from Larry.

http://xrl.us/hjcg

  Global Destruction
Nicholas Clark had some questions about finalization and destruction in
Parrot. In particular, he asked: "Does Parrot make any guarantee that
all objects will be rendered down to bare memory before program exit?"
Leo answered, and the answer was good enough for Ponie. Huzzah.

http://xrl.us/hjch

  Meanwhile, at the Oval
Ah... they're back on the pitch... I may be slowing down again...

Meanwhile, in perl6-language
  Demagicalizing Pairs
Discussion of Luke's proposal to demagicalize Pairs continued. It turns
out that it's actually a discussion of how to do named argument
calling...


Object Model Pictures

2005-09-12 Thread Stevan Little

Hello again.

In my never ending quest to implement the Perl 6 object model, I have  
started drawing pictures. Here is the latest version:


http://svn.openfoundry.org/pugs/perl5/Perl6-MetaModel2.0/docs/p6_object_model.jpg

(and for OmniGraffle users:
http://svn.openfoundry.org/pugs/perl5/Perl6-MetaModel2.0/docs/p6_object_model.graffle)


I would appreciate any comments/questions/suggestions anyone has to  
offer.


A few things to note:

- Roles are not yet included in this. I am of the opinion that the
core MetaModel should be role-less, with roles laid on top of the
core metamodel post-bootstrapping. This makes the most sense
currently; however, it may change down the road. Either way, I think
it is an implementation detail as long as it looks the same when
everything is loaded.


- the word "meta" is used to describe the three-circle items.
Please do not read too much into this terminology; "meta" simply
means you should not touch this unless you *really* know what you
are doing. Suggestions on better names are welcome.


- the reason Foo and $foo are smaller is that they are user-created
items.


Okay, that's all my caveats. Fire away, please.

Stevan


Re: Unified prelude, FFI, multiple runtimes

2005-09-12 Thread Luke Palmer
On 9/12/05, Yuval Kogman [EMAIL PROTECTED] wrote:
> Hi,

Hi.  These are superficial thoughts, before I've had time to really
think about the Big Picture.

> 2.  each block of code has a cryptographic digest, which is the hash
>     of its body combined with the digests of all the functions it
>     calls. This digest is stable regardless of optimization, since it
>     is computed over the PIL structure.

Okay, a little clarification needed here.  Is the digest of the code
itself a collection of digests, one for the lexical body, and one for
each function it calls (1)?  Or is it a hash that combines all of
those somehow?

How do you deal with recursive/mutually recursive functions?  (I'm
pretty sure there's a way, I just can't think of it)

What about temp &foo = sub {...} ?

> 5.  Functions have a notion of equivalence. This is managed based on
>     the digest. For example
>
>     my &c_int_mul = BEGIN { Code::Foreign.new(:language<C>, :body(
>         int foo (int x, int y) { return x * y }
>     )) };
>
>     multi infix:<*> (int $x, int $y --> int) {
>         [+] $x xx $y;
>     }
>
>     my $str = &infix:<*><int, int --> int>.digest; # must specify the variant

You mean:

my $str = &infix:<*>:(int, int --> int).digest;

Also, you said that the digest contains the digests of all called
functions.  How do you deal with multis there, which may depend on the
runtime types?

>     &c_int_mul.set_equiv($str); # could be in another file
>
>     # or, if they are maintained together
>
>     &c_int_mul.set_equiv(&infix:<*><int, int --> int>);
>
>     This equivalence is with respect to the semantics of the input
>     and output. The test suite can supposedly assure that these are
>     really equivalent by running the same tests against either
>     version.

Okay, good.  So you have a way of marking that two functions do the
same thing, without having the program try to figure that out (which
is impossible).  That's something that Haskell didn't figure out :-)

I suppose it would be nice to have subtype semantics, in that
function A does the same thing as function B for all arguments that
are valid for function B (but function A may have additional
functionality for arguments that are not valid for B).  Then you
specify equivalence by specifying subtype equivalence in both
directions (with a shortcut, of course).

> 9.  static analysis may be leveraged to compile direct calls to
>     native functions when compile time resolution is possible. In
>     the example graph, for instance, no eval or symbol table
>     assignments are made, so there is no way the code will ever
>     change. Hence the entire program can be pre-resolved. This
>     should be controlled via the 'optimize' pragma.

Rather than having the compiler try to infer for itself, which would
come up negative 99% of the time and just waste compile time.

> Since FFIs are going to be a core feature of perl 6, they can be
> used to bootstrap the whole compilation process. In effect, the
> primitive operations are now just FFI calls to the runtime we
> happen to be executing on.

And if you have a circular reference implementation, the implementors
can decide to implement whatever generating subset is easiest and get
a working Perl.  I like this.

Hmm, could we pull the idea of a generating subset out to
roles/theories?  That is, have a role specify that a and b are
implemented in terms of each other, so if you provide one, you fulfill
the role contract.  In Haskell, if you don't fulfill a class contract
that's mutually recursive, you get infinite loops.  It'd be nice to
catch that.

Then I guess we would have theory PerlCore.  Neat.

> To make things modular, the pairing of FFI and pure Perl functions
> is, thanks to the hashing scheme, orthogonal to where they are
> defined.

So as we're hashing PIL, we'd leave out line number information and whatnot.

> WRT MMD, you can set the entire MM equivalent to
> a certain foreign function, and you can also set any variant
> individually. You can even set a single variant to be equivalent to
> a multimethod to make the FFI implementation simpler. The compiler
> simply presents the runtime with all the possible MMD choices, and
> lets the runtime choose between conflicting ones.

Like a nice wrapping MMD scheme ought to.

All in all, I like the idea.  I hope there are no major technical
hurdles.  The hashing scheme scares me a little, because it's easy to
create equivalent code that does not look the same from PIL, but it
seems like you covered that base.

Luke


Re: Unified prelude, FFI, multiple runtimes

2005-09-12 Thread Yuval Kogman
A proof of concept is available here:

http://svn.openfoundry.org/pugs/docs/notes/circular_prelude_stuff.pl

And logs where I explain the guts to Luke are available here:

http://colabti.de/irclogger/irclogger_log/perl6?date=2005-09-12,Mon&sel=785#l1413

-- 
 ()  Yuval Kogman [EMAIL PROTECTED] 0xEBD27418  perl hacker 
 /\  kung foo master: /me supports the ASCII Ribbon Campaign: neeyah!!!





Re: Unified prelude, FFI, multiple runtimes

2005-09-12 Thread Yuval Kogman
On Mon, Sep 12, 2005 at 13:27:21 -0600, Luke Palmer wrote:
> On 9/12/05, Yuval Kogman [EMAIL PROTECTED] wrote:
> > Hi,
>
> Hi.  These are superficial thoughts, before I've had time to really
> think about the Big Picture.
 
> > 2.  each block of code has a cryptographic digest, which is the hash
> >     of its body combined with the digests of all the functions it
> >     calls. This digest is stable regardless of optimization, since it
> >     is computed over the PIL structure.
>
> Okay, a little clarification needed here.  Is the digest of the code
> itself a collection of digests, one for the lexical body, and one for
> each function it calls (1)?  Or is it a hash that combines all of
> those somehow?

Suppose there is a function x that calls another function y, and an
FFI function z that is declared the same as x (z refers to x's
hash). If y is changed, then the semantics of x may change, hence
x's hash must change (a call to z is no longer equal to a call to
x), unless the new y says it is equal to the old y by mentioning
its hash.

> How do you deal with recursive/mutually recursive functions?  (I'm
> pretty sure there's a way, I just can't think of it)

Right now I don't; I'm cheating.

In the future I'd like to hash the body for all functions, and then
hash the hash of the body and the hashes of all the called bodies to
make a function hash.
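
A sketch of that scheme, in today's Raku, with a toy (non-cryptographic)
hash function and an explicit call graph, both invented for
illustration; a real implementation would run something like SHA-1
over PIL:

    # H is NOT cryptographic; it stands in for a real digest function.
    sub H(*@parts --> Str) {
        @parts.join('|').comb.map(*.ord).sum.base(16)
    }

    my %body  = infix-mul => '[+] $x xx $y',
                infix-add => 'native stub';
    my %calls = infix-mul => <infix-add>,   # infix-mul calls infix-add
                infix-add => ();

    # A function's digest combines its body hash with the digests of
    # everything it calls, so a change in a callee changes the caller.
    # (Mutually recursive functions would loop here - the "cheating"
    # admitted to above.)
    sub digest(Str $f --> Str) {
        H( H(%body{$f}), |%calls{$f}.map(&digest) )
    }

    say digest('infix-mul');   # changes whenever infix-add's body does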

> What about temp &foo = sub {...} ?

This has an entirely different hash, so it doesn't go to the same
thing.

> > my $str = &infix:<*><int, int --> int>.digest; # must specify the variant
>
> You mean:
>
> my $str = &infix:<*>:(int, int --> int).digest;

uh, yeah... I keep forgetting the syntax.
 
> Also, you said that the digest contains the digests of all called
> functions.  How do you deal with multis there, which may depend on the
> runtime types?

Replacements are per variant, I guess, and the catch-all (completely
unqualified multi) has a hash built out of all its variants.

> I suppose it would be nice to have subtype semantics, in that
> function A does the same thing as function B for all arguments that
> are valid for function B (but function A may have additional
> functionality for arguments that are not valid for B).  Then you
> specify equivalence by specifying subtype equivalence in both
> directions (with a shortcut, of course).

I'll leave this part to you ;-)

> > assignments are made, so there is no way the code will ever
> > change. Hence the entire program can be pre-resolved. This
> > should be controlled via the 'optimize' pragma.
>
> Rather than having the compiler try to infer for itself, which would
> come up negative 99% of the time and just waste compile time.

I disagree... Most one-liners and small scripts could really benefit
from this, and it's an easy test - just set a flag the moment you
see either the eval opcode or the ::= opcode.
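
The test could be as simple as one walk over the compiled tree, e.g.
(a Raku sketch with an invented node format, where 'eval' and
'bind-readonly' stand in for the eval and ::= opcodes):

    # A PIL-ish tree: nested arrays of [op, args...].
    my $pil = [ 'app', ['var', '&say'],
                       ['app', ['var', '&infix:<*>'], 6, 7] ];

    sub can-preresolve($node --> Bool) {
        return True unless $node ~~ Positional;        # leaf: fine
        return False if $node[0] eq 'eval' | 'bind-readonly';
        so all $node[1..*].map({ can-preresolve($_) });
    }

    say can-preresolve($pil);   # True: safe to pre-resolve every call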

> And if you have a circular reference implementation, the implementors
> can decide to implement whatever generating subset is easiest and get
> a working Perl.  I like this.

Yes, this is why I brought it up =)


> Hmm, could we pull the idea of a generating subset out to
> roles/theories?  That is, have a role specify that a and b are
> implemented in terms of each other, so if you provide one, you fulfill
> the role contract.  In Haskell, if you don't fulfill a class contract
> that's mutually recursive, you get infinite loops.  It'd be nice to
> catch that.

it's too late for me to understand that... sorry...

> So as we're hashing PIL, we'd leave out line number information and whatnot.

Yeah, and this is also possible via the canonical representation. For
example, these two should hash identically:

sub foo ($x) {
$x ? 1 : 2;
}

sub foo ($x) { 
if ($x) {
return 1;
} else {
return 2;
}
}

> All in all, I like the idea.  I hope there are no major technical
> hurdles.  The hashing scheme scares me a little, because it's easy to
> create equivalent code that does not look the same from PIL, but it
> seems like you covered that base.

I don't expect a reference implementation to be refactored much,
but even if it is, you just say that this function is the same as
the function with the hash value x, where x is the hash it had
before the refactoring.

-- 
 ()  Yuval Kogman [EMAIL PROTECTED] 0xEBD27418  perl hacker 
 /\  kung foo master: /me tips over a cow: neeyah!!





Re: Object Model Pictures

2005-09-12 Thread Nathan Gray
On Mon, Sep 12, 2005 at 03:10:55PM -0400, Stevan Little wrote:
> In my never ending quest to implement the Perl 6 object model, I have
> started drawing pictures. Here is the latest version:
>
> http://svn.openfoundry.org/pugs/perl5/Perl6-MetaModel2.0/docs/p6_object_model.jpg

Awesome diagram.

> I would appreciate any comments/questions/suggestions anyone has to
> offer.

In my never ending quest to understand the Perl 6 object model, I think
it has become clearer each time I've discovered another document or
picture.

Thank you Stevan!

> A few things to note:
>
> - Roles are not yet included in this. I am of the opinion that the
> core MetaModel should be role-less, with roles laid on top of the
> core metamodel post-bootstrapping. This makes the most sense
> currently; however, it may change down the road. Either way, I think
> it is an implementation detail as long as it looks the same when
> everything is loaded.

Yep, someone needs to make a diagram about Roles, too.

-kolibrie