Re: Program lifecycle

2000-08-11 Thread Dan Sugalski

At 09:34 PM 8/10/00 -0400, Bradley M. Kuhn wrote:
> Three notes on the Syntax tree (which I would probably call Intermediate
> Representation, or IR, but the name is irrelevant :).

Yep. It's more a MI, with bytecodes being an LI. (And both are IRs) I've 
been browsing through compiler design books, as if you couldn't tell... :)

> First, I believe that it is completely reasonable and probably useful to
> consider allowing an optimization step that operates directly on the IR.
> Bytecodes are often harder to optimize than the IR.

Fair enough. I think we should pass the syntax tree on to the optimizer in 
addition to the bytecode stream for just that sort of thing.
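
To illustrate why a tree can be easier to optimize than a bytecode stream, here is a minimal sketch of constant folding over a toy syntax tree. It is not from the thread: the node types (`Num`, `Var`, `BinOp`) and the `fold` function are hypothetical names invented for this example, and a real Perl syntax tree would be far richer.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str          # "+" or "*" in this toy example
    left: "Node"
    right: "Node"


# Hypothetical node union for the toy tree.
Node = Union[Num, Var, BinOp]


def fold(node: Node) -> Node:
    """Fold constant subexpressions bottom-up; leave everything else alone."""
    if isinstance(node, BinOp):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Num) and isinstance(right, Num):
            if node.op == "+":
                return Num(left.value + right.value)
            if node.op == "*":
                return Num(left.value * right.value)
        return BinOp(node.op, left, right)
    return node


# x * (2 + 3): the constant subtree is visible as a single node.
tree = BinOp("*", Var("x"), BinOp("+", Num(2), Num(3)))
folded = fold(tree)
```

On the tree, the foldable subexpression is one node with two constant children; in a flat instruction stream the same pattern would have to be rediscovered by scanning for adjacent pushes and an operator.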

> Second, I think that it is terribly important that the format of the IR be
> well described in a document separate from the code.  I am willing to help
> maintain this document, but I think it is imperative that we don't let the
> "implementation be the reference".  If the code changes, the document must
> change to reflect it.

Oh, yes! Once things get a bit more solid on the language end, I expect 
we'll have syntax and bytecode working groups to hammer those out and nail 
'em down.

> Third, I am very glad that Dan has placed the execution engine very far from
> the IR.  Whether or not we want to have an execution engine (which I tend to
> call a VM :) that works directly on the IR or one that always goes through
> bytecode, or both, I think we must keep a high wall of abstraction between
> the IR and the VM.

Bytecode is an IR too, just with a different target.

I do want that heavy wall, though. That backend could be a perl2jvm 
translator, or a TIL version of the interpreter (if we don't go that way to 
start), or a frontend to a 'real' compiler like GCC or the DEC compilers. 
(Both GCC and the DEC compilers have one backend for a whole bunch of 
language front-ends. If they can front-end Fortran, C++, and COBOL, we 
ought to be able to do it with perl...)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Program lifecycle

2000-08-10 Thread Dan Sugalski

At 09:57 PM 8/9/00 -0700, Matthew Cline wrote:
> On Wed, 09 Aug 2000, Nathan Torkington wrote:
> > It seems to me that a perl5 program exists as several things:
> >  - pure source code (ASCII or Unicode)
> >  - a stream of tokens from the parser
> >  - a munged stream of tokens from the parser (e.g., use Foo has
> >    become BEGIN { require Foo; Foo->import })
> >  - an unthreaded and unoptimized optree
>
> Isn't there a tree of whatchamacallits between a token stream and
> the optree, and also a symbol table?  I'm not too up on compilers...

I think so. There are some thingamabobs in there too. :)

I think we'll see at least a syntax tree, a bytecode stream, and an optree 
in perl 6, depending on where you look. That's still sort of up in the air, 
though. (We might see machine code too, if I can convince myself that it 
can be done portably)

Dan





Re: Program lifecycle

2000-08-10 Thread Dan Sugalski

At 10:01 PM 8/9/00 -0600, Nathan Torkington wrote:
> Would it make sense for the parsing of a Perl program to be done as:
>   - tokenize without rewriting (e.g., use stays as it is)
>   - structure without rewriting (e.g., constant subs are unfolded)
>   - rewrite for optimizations and actual ops

The structure I've been thinking of looks like:

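         Program Text
              |
              |
              |
              V
      +----------------+
      |   Lex/parse    |
      +----------------+
              |
         Syntax tree
              |
              V
      +----------------+
      |   Bytecoder    |
      +----------------+
              |
          Bytecodes
              |
              V
      +----------------+
      |   Optimizer    |
      +----------------+
              |
          Optimized
          bytecodes
              |
              V
      +----------------+
      |   Execution    |
      |     Engine     |
      +----------------+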

With each box being replaceable, and the process being freezable between 
boxes. The lexer and parser probably ought to be separated, thinking about 
it, and we probably want to allow folks to wedge at least C code into each 
bit. (I'm not sure whether allowing you to write part of the optimizer in 
perl would be a win, but I suppose if it was saving the byte stream to disk...)
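
As a rough illustration of "each box being replaceable, and the process being freezable between boxes", the structure could be sketched as a list of named stages. Nothing here comes from the thread beyond the stage names: the toy stage bodies, `run`, `freeze`, and `thaw` are hypothetical, and `pickle` merely stands in for whatever on-disk format would actually be chosen.

```python
import pickle
from typing import Any, Callable, List, Optional, Tuple

# A stage is a (name, function) pair; any stage can be swapped for another
# function that accepts and returns the same kind of data.
Stage = Tuple[str, Callable[[Any], Any]]


def lex_parse(text: str) -> list:
    # Toy "syntax tree": a flat list of whitespace-separated tokens.
    return text.split()


def bytecoder(tree: list) -> list:
    # Toy "bytecode": one ("PUSH", token) instruction per token.
    return [("PUSH", tok) for tok in tree]


def optimizer(code: list) -> list:
    # Toy optimization: drop pushes of the placeholder token "nop".
    return [ins for ins in code if ins[1] != "nop"]


def execution_engine(code: list) -> list:
    # Toy execution: return the operands in the order they were pushed.
    return [operand for _, operand in code]


DEFAULT_STAGES: List[Stage] = [
    ("lex_parse", lex_parse),
    ("bytecoder", bytecoder),
    ("optimizer", optimizer),
    ("execution_engine", execution_engine),
]


def run(stages: List[Stage], data: Any, stop_after: Optional[str] = None) -> Any:
    """Feed data through the stages in order, optionally stopping early."""
    for name, stage in stages:
        data = stage(data)
        if name == stop_after:
            break
    return data


def freeze(data: Any) -> bytes:
    """Serialize the output of any stage so it can be saved between boxes."""
    return pickle.dumps(data)


def thaw(blob: bytes) -> Any:
    return pickle.loads(blob)


# Stop after the bytecoder, freeze the bytecode, then resume later.
frozen = freeze(run(DEFAULT_STAGES, "a nop b", stop_after="bytecoder"))
resumed = run(DEFAULT_STAGES[2:], thaw(frozen))

# Replace the optimizer box with a pass-through.
no_opt = [s if s[0] != "optimizer" else ("optimizer", lambda c: c)
          for s in DEFAULT_STAGES]
unoptimized = run(no_opt, "a nop b")
```

The only contract between boxes in this sketch is the shape of the data passed along, which is what makes both freezing and swapping possible.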

Dan





Re: Program lifecycle

2000-08-10 Thread Chaim Frenkel

You may also want to be able to short circuit some of the steps.
Especially where the startup time may outweigh the win of optimization.

And if there could be different execution engines. Machine level,
bytecode, (and perhaps straight out of the syntax tree.)

Hmm, might that make some debugging easier?

chaim

>>>>> "DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS> The structure I've been thinking of looks like:

DS>          Program Text
DS>               |
DS>               |
DS>               |
DS>               V
DS>       +----------------+
DS>       |   Lex/parse    |
DS>       +----------------+
DS>               |
DS>          Syntax tree
DS>               |
DS>               V
DS>       +----------------+
DS>       |   Bytecoder    |
DS>       +----------------+
DS>               |
DS>           Bytecodes
DS>               |
DS>               V
DS>       +----------------+
DS>       |   Optimizer    |
DS>       +----------------+
DS>               |
DS>           Optimized
DS>           bytecodes
DS>               |
DS>               V
DS>       +----------------+
DS>       |   Execution    |
DS>       |     Engine     |
DS>       +----------------+

DS> With each box being replaceable, and the process being freezable between 
DS> boxes. The lexer and parser probably ought to be separated, thinking about 
DS> it, and we probably want to allow folks to wedge at least C code into each 
DS> bit. (I'm not sure whether allowing you to write part of the optimizer in 
DS> perl would be a win, but I suppose if it was saving the byte stream to disk...)

-- 
Chaim Frenkel      Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: Program lifecycle

2000-08-10 Thread Chaim Frenkel

>>>>> "NT" == Nathan Torkington [EMAIL PROTECTED] writes:

NT>  - source filters munge the pure source code
NT>  - cpp-like macros would work with token streams
NT>  - pretty printers need unmunged tokens in an unoptimized tree, which
NT>    may well be unfeasible

I was thinking of macros as being passed some arguments, after which they
can either manipulate the raw source code or ask the lexer/parser for
parsed tokens.

chaim



Re: Program lifecycle

2000-08-10 Thread Dan Sugalski

At 03:36 PM 8/10/00 -0400, Chaim Frenkel wrote:
> You may also want to be able to short circuit some of the steps.
> Especially where the startup time may outweigh the win of optimization.

The only one that's skippable is the optimizer, really. I'd planned on 
having to pass it some indicator of how aggressive it should be.

> And if there could be different execution engines. Machine level,
> bytecode, (and perhaps straight out of the syntax tree.)

Yup. Hence the "replaceable" bit. :) The boxes would all have a fixed and 
well-defined interface, and the various streams (syntax tree and bytecode) 
would also be well-defined. If you wanted to build an execution box that 
instead dumped out java bytecodes, well, sounds like a good plan to me. :)
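
A minimal sketch of what a fixed interface for the last box might look like, together with an optimizer that takes an aggressiveness indicator. All names here (`Backend`, `Interpreter`, `Dumper`, `optimize`, the `ADD`/`MUL` opcodes) are hypothetical and chosen only for illustration; the thread does not define any of them.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Tuple

# Hypothetical bytecode format: (opcode, integer operand) pairs.
Instruction = Tuple[str, int]


def optimize(code: List[Instruction], level: int) -> List[Instruction]:
    """Level 0 skips optimization; level >= 1 removes ("ADD", 0) no-ops."""
    if level <= 0:
        return list(code)
    return [ins for ins in code if ins != ("ADD", 0)]


class Backend(ABC):
    """Fixed interface: every backend consumes the same bytecode stream."""

    @abstractmethod
    def consume(self, code: List[Instruction]) -> Any:
        ...


class Interpreter(Backend):
    """Executes the stream against a single accumulator starting at 0."""

    def consume(self, code: List[Instruction]) -> int:
        total = 0
        for op, arg in code:
            if op == "ADD":
                total += arg
            elif op == "MUL":
                total *= arg
            else:
                raise ValueError(f"unknown opcode: {op}")
        return total


class Dumper(Backend):
    """Emits a textual listing instead of executing anything."""

    def consume(self, code: List[Instruction]) -> str:
        return "\n".join(f"{op} {arg}" for op, arg in code)


code = [("ADD", 2), ("ADD", 0), ("MUL", 5)]
executed = Interpreter().consume(optimize(code, level=1))
listing = Dumper().consume(optimize(code, level=1))
```

Because both backends accept the same stream, swapping one for the other requires no change to the earlier boxes, and passing `level=0` skips the optimizer without altering the result.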

> Hmm, might that make some debugging easier?

Might. Hard to say, though; if we get them as black boxes, at least it'll 
make debugging more compartmentalized.

Dan
