On Mon, 18 Oct 2004 14:17:59 -0500 (CDT), Michel Pelletier
<[EMAIL PROTECTED]> wrote:

> Okay, note that the code I mentioned (the separation of core from
> core words) is not checked in right now, but the version in CVS does
> do NCG.
Noted.

> Using the direct threading model, this does 2000 global lookups and
> subroutine invokes, which in turn do the actual "work" of 1000
> multiplications and the associated stack traffic.  The lookups and
> invokes are pure inner-loop overhead.
>
> Using NCG this does 1000 multiplications and the associated stack
> traffic (which can be optimized out for the most part) with no
> lookups or invokes.
>
> The overhead of direct threading vs. NCG does not need to be
> benchmarked, it can be proven by argument: both methods execute the
> same code the same way, but the NCG method does 2000 fewer global
> lookups and invokes.

Indeed.  Pardon my ignorance; I hadn't thought things all the way
through.

> The "extra" compiler overhead is trivial, and it only applies to
> compile-time; generally when a program is started.  At run-time
> (when all those lookups and invokes are happening in the direct
> thread case) there is no additional compilation overhead because a
> word is compiled only once.

This still doesn't seem right.  The compilation from Forth to PIR
only happens once, yes.  But each time the defined word is used, its
injected PIR code must be compiled to bytecode again.  You said
earlier that:

> in direct threading this would result in the execution of:
>
>     find_global $P0, "dup"
>     invoke $P0
>     find_global $P0, "mul"
>     invoke $P0
>
> in NCG it would result in the execution of:
>
>     .POP
>     .NOS = .TOS
>     .PUSH2    # this can be optimized out
>     .POP2     # of NCG, but not direct threading
>     .TOS = .TOS * .NOS
>     .PUSH

The second PIR sequence is longer.  It will take IMCC more time to
compile than the first example, and the less trivial the words are,
the more true that becomes.  But like you said, this only happens
(a) at compile time or (b) at the interactive prompt.  And optimizing
out push/pop combos will speed things up more, though I'm not sure
how to implement that optimization using PIR (see the P.S. below for
what I think the optimized output should look like).

So programs may fall on either side of the fence on this issue.
Building words in terms of other words will give NCG an advantage,
while using relatively simple words many times will give direct
threading an advantage.  But I do believe you when you say that NCG
is fastest overall (read: for most programs).

Furthermore, our two models will behave differently when you redefine
a word.  Consider this Forth example:

    : inc 1 + ;
    : print+ inc . ;
    : inc 2 + ;

Should print+ increment by one or by two?  gforth increments by one.
I'd be interested in knowing which is the "correct" behavior.  (The
P.P.S. below sketches what I think each model would emit here.)

-- matt
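
P.S.  On the push/pop elimination: I still don't know how to write
the peephole pass itself in PIR, but just to check that I understand
the target, here is my guess at what your NCG sequence would look
like once the adjacent .PUSH2/.POP2 pair is cancelled out (this
reuses your macro names and is only a sketch, not tested code):

    .POP                  # pop the argument into the TOS register
    .NOS = .TOS           # dup: copy top-of-stack into next-on-stack
    .TOS = .TOS * .NOS    # mul: operate on the registers directly
    .PUSH                 # push the product back

The values never take the round trip through the stack between the
dup and the mul, which is exactly the traffic that can't be avoided
under direct threading.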
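
P.P.S.  On the redefinition question, here is why I expect the two
models to disagree (again only a sketch; I'm guessing at the macro
that pushes a literal).  Under direct threading, print+ compiles the
inc reference into a lookup that happens at run time:

    find_global $P0, "inc"
    invoke $P0

so if redefining inc rebinds that global, print+ picks up the new
definition and increments by two.  Under NCG, the body of inc as it
exists at compile time gets injected into print+, something like:

    # push the literal 1 (whatever macro that expands to)
    .POP2                 # pop the 1 and print+'s argument
    .TOS = .TOS + .NOS    # the + from inc's original body
    .PUSH                 # push the sum back

so print+ keeps the old definition and increments by one, which is
what gforth does.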