Re: JIT and platforms warning

2004-10-29 Thread Leopold Toetsch
Leopold Toetsch [EMAIL PROTECTED] wrote:

 arm, mips, and sun4 JIT platforms definitely need some work to even keep
 up with the current state of the JIT interface.

That's actually wrong - sorry. Sun4 JIT is fairly complete and up to
date, and shouldn't have been in the above sentence.

I mixed that up with the arm platform, which doesn't use register
mappings at all: every opcode reloads from and stores to Parrot
registers. Mips only has 3 JITted opcodes.

Sorry again Stéphane,
leo


Re: [PATCH] Re: JIT and platforms warning

2004-10-24 Thread Leopold Toetsch
Jeff Clites [EMAIL PROTECTED] wrote:
 On Oct 23, 2004, at 4:20 AM, Leopold Toetsch wrote:

 Jeff Clites [EMAIL PROTECTED] wrote:

 See attached the patch, plus the new asm.s file.

 Doesn't run, segfaults on even mops.pasm - please check.

 I can't reproduce that here; parrot -j works for me

Sorry, false alarm. I must have had indirect register access turned on.

Runs fine and gets applied.

leo


Re: [PATCH] Re: JIT and platforms warning

2004-10-24 Thread Leopold Toetsch
Jeff Clites [EMAIL PROTECTED] wrote:
 On Oct 23, 2004, at 3:42 AM, Leopold Toetsch wrote:

 We were allocating the volatile float registers first (or, only)--so
 set_s_sc was blowing away an N-register, even with only one in use.
 That's why I was surprised there weren't more failures.

Yes. As I said, i386 had the same problem and not a single test file broke.

I've now added a quick hack to the JIT compiler, which seems to do the
right thing, or better, it's a step towards that:

$extern = 1 if $asm =~ /call_func/;

This marks the function as external to the JIT, so the register-preserving
code is activated. It's not quite the best solution, because only the
volatiles actually in use have to be preserved and restored. More below ...

 Anyway, I think we need a more general solution.
...
 Solution: The JIT compiler/optimizer already calculates the register
 mapping. We have to use this information for JIT pro- and epilogs.

 That makes sense--I hadn't initially realized we were tracking this,
 but since we are, we should use it. Things may be a bit tricky
 fixup-wise, since the size of the prolog will depend on how many float
 registers we need to preserve.

Not really. When creating the prolog/epilog you know the exact
register mapping of each section. So we just have to find the maximum
for each register kind, then decide whether to use non-volatiles,
volatiles, or both, and emit the appropriate code.
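
The "find the maximum for each register kind" step could be sketched roughly as follows. This is only an illustration: the section_usage and usage_max types and the max_register_usage function are hypothetical names, not the optimizer's real data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-section summary of mapped registers; the real
 * optimizer keeps this information in its own structures. */
typedef struct {
    unsigned int_regs_used;
    unsigned float_regs_used;
} section_usage;

typedef struct {
    unsigned max_int;
    unsigned max_float;
} usage_max;

/* Return the largest number of mapped registers of each kind
 * found in any of the n sections. */
static usage_max max_register_usage(const section_usage *sections, size_t n)
{
    usage_max m = { 0, 0 };
    size_t i;

    assert(sections != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (sections[i].int_regs_used > m.max_int)
            m.max_int = sections[i].int_regs_used;
        if (sections[i].float_regs_used > m.max_float)
            m.max_float = sections[i].float_regs_used;
    }
    return m;
}
```

The prolog and epilog then only need to cover max_int and max_float registers, instead of a fixed worst case.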

We probably need some platform settings and tweakables that select which
allocation policy should be used, e.g.

  PARROT_JIT_PREFER_VOLATILES

and maybe a threshold for when to change allocation policy. But anyway, in
jit_info->optimizer you'll get the count of used registers. With that
information the platform code can calculate the stack storage needed and
emit the appropriate pro- and epilogs. E.g. from
"Mach-O Runtime Conventions for PowerPC", section "PowerPC Stack Structure":

  spaceToSave = linkageArea + params + localVars + 4 * nGPRS + 8 * nFPRS

rounded up to a multiple of 16. That's the same value that is currently a
hardcoded define.
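
As a worked version of that formula, here is a minimal sketch. The function names jit_frame_size and round_up are made up for illustration, and the linkage, parameter and local sizes are simply passed in by the caller rather than taken from any real platform header.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: stack bytes a JIT prolog has to reserve when it
 * saves n_gprs 4-byte integer registers and n_fprs 8-byte float
 * registers, following the spaceToSave formula quoted above. */
static size_t jit_frame_size(size_t linkage_area, size_t params,
                             size_t local_vars,
                             unsigned n_gprs, unsigned n_fprs)
{
    size_t space = linkage_area + params + local_vars
                 + 4u * n_gprs + 8u * n_fprs;
    return round_up(space, 16);
}
```

For instance, with an assumed 24-byte linkage area, 32 bytes of parameters, no locals, 2 GPRs and 3 FPRs, the raw sum is 88 bytes and the frame becomes 96.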

 2) JITed functions with calls into Parrot

 The jit2h JIT compiler needs a hint, that we call external code, e.g.

CALL_FUNCTION(string_copy)

So, back to external functions. With this syntax (and a needed jit_info
argument), we can do:

$extern = -1 if $asm =~ /CALL_FUNCTION/;

This activates the saving and restoring of used volatiles. I don't think
that we have to write back non-volatiles to Parrot registers, at least
not if we define that JITted code calling out like that will not see a
correct Parrot register set. If a function needs the Parrot registers,
just don't provide a JITted version for it.
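
The check itself lives in the Perl-based jit2h compiler. Purely to illustrate the idea, the same test can be sketched in C as a substring scan over an op's JIT body. The name body_calls_external is hypothetical and not part of Parrot; only the CALL_FUNCTION marker comes from the proposal.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of Parrot): report whether the JIT body
 * of an op contains the CALL_FUNCTION marker, i.e. whether it calls code
 * external to the JIT, so that used volatile registers must be saved. */
static int body_calls_external(const char *asm_body)
{
    assert(asm_body != NULL);
    return strstr(asm_body, "CALL_FUNCTION") != NULL;
}
```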

OTOH this could be problematic if the function throws an exception, and
if the exception handler is able to provide a Parrot core dump.

 One other question: Your P_ARITH optimization in jit/ppc/core.jit. I
 can't come up with a case where this kicks in,

It's almost only for examples/benchmarks/mops.pasm: +50% speed. Now
PPC JIT code is two instructions for the loop instead of three ;)

 Thanks,

 JEff

leo


[PATCH] Re: JIT and platforms warning

2004-10-23 Thread Jeff Clites
On Oct 22, 2004, at 3:57 AM, Leopold Toetsch wrote:
Jeff Clites wrote:
On Oct 22, 2004, at 1:01 AM, Leopold Toetsch wrote:
[JIT changes]
I just finished tracking down the source of a couple of JIT test 
failures on PPC--due to recent changes but only indirectly related, 
and pointing out things which needed fixing anyway (float register 
preservation issues). I'll send it in tomorrow after I've had a 
chance to clean it up and add some comments.
Please make sure to get a recent CVS copy and try to prepare the patch 
during my night ;)
Of course!
I changed the PPC float register allocation yesterday, because it 
looked bogus, i.e. the non-volatile FPRs were allocated but not 
saved. That should be fixed.
Yep, that was the core of the issue. There's no free lunch--if we use 
the nonvolatile registers, we need to preserve/restore them in 
begin/end, but if we use the volatile registers, we need to preserve 
them across function calls (incl. normal op calls). So I added code to 
do the appropriate save/restore, and use the non-volatile registers for 
mapping--that should be less asm than what we'd have to do to use the 
volatile registers. (The surprising thing was that we only got 2 
failures when using the volatile registers--I'll look into creating 
some tests that would detect problems with register preservation.)

The other tricky part was that saving/restoring the FP registers is one 
instruction per saved register, so saving all 18 was exceeding the asm 
size we allocate in src/jit.c (in some cases), since we emit Parrot_end 
for all restart ops. The fix for this was to pull this asm out into a 
utility routine, and just call that from the asm. (This is only done 
for restoring the registers so far--I should do it for the preservation 
step too, but right now that's just inline.)

The attached patch also contains some other small improvements I'd been 
working on, and a few more jitted ops to demonstrate calling a C 
function from a jitted op.

While you're investigating PPC JIT: JIT debugging via stabs doesn't 
work at all here. I get around the gdb message (missing data segment) 
by inserting .data\n.text\n in the stabs file, but that's all. After 
this change and regenerating file.o I can load it with 
add-symbol-file file.o without any complaint, but the access to the 
memory region the jitted code occupies is not allowed, disassemble 
doesn't work ...
I'll take a look and see if I can figure it out. I remember gdb being 
uncooperative about disassembling, frustratingly, and I've moved to the 
habit of using something like x/300i jit_code as a workaround. 
Clearly it can access the memory region, so it seems like a gdb bug.

See attached the patch, plus the new asm.s file.
JEff
config/gen/platform/darwin/asm.s:


asm.s
Description: application/applefile


asm.s
Description: application/text


ppc-jit-preserve-fp.patch
Description: application/text


Re: [PATCH] Re: JIT and platforms warning

2004-10-23 Thread Leopold Toetsch
Jeff Clites [EMAIL PROTECTED] wrote:

 See attached the patch, plus the new asm.s file.

Doesn't run, segfaults on even mops.pasm - please check.

 JEff

leo


Re: [PATCH] Re: JIT and platforms warning

2004-10-23 Thread Leopold Toetsch
Jeff Clites [EMAIL PROTECTED] wrote:

 Yep, that was the core of the issue. There's no free lunch--if we use
 the nonvolatile registers, we need to preserve/restore them in
 begin/end, but if we use the volatile registers, we need to preserve
 them across function calls (incl. normal op calls).

Good point, and JIT/i386 does it wrong in core.ops. But normal non-JITted
code is not the problem: the framework preserves mapped registers, or
rather, it has to copy all mapped registers to Parrot registers so that C
code is able to see the actual values.

And the framework also knows not to restore non-volatile registers, at
least if the platform code defines PRESERVED_type_REGISTERS and
arranges the registers correctly.

 ... So I added code to
 do the appropriate save/restore, and use the non-volatile registers for
 mapping--that should be less asm than what we'd have to do to use the
 volatile registers. (The surprising thing was that we only got 2
 failures when using the volatile registers--I'll look into creating
 some tests that would detect problems with register preservation.)

Well, allocation strategy on PPC (or i386) is to first use the
non-volatile registers. PPC has 14 usable registers. To provoke a
failure you'd need e.g. 15 different I-registers and then a JITted
function call like set_s_sc. But in normal cases you have more string
functions in that place and, as these are normally not JITted,
registers are saved and restored around the external function.

Oddly, JIT/i386 has that problem too, and there are now only 2
non-volatile registers. But although there are function calls, like
string_bool, the whole test suite passes.

Anyway, I think we need a more general solution. We have basically two
problems:

1) JIT startup and end code size and memory usage

Given: a bunch of small overloaded vtable functions. All of these are
called through runops_fromc_*(). So for every function PPC JIT now would
move ~ 2 * 300 bytes to and from the stack. That's too much and not
needed in that case.

Solution: The JIT compiler/optimizer already calculates the register
mapping. We have to use this information for JIT pro- and epilogs.

The allocation strategy should be adjusted and depend on code size and
register usage:
- big chunks of compiled code like whole modules should use the
  non-volatile registers first and then (if needed) the volatile
  registers.
- small chunks of code should use the volatile registers to reduce
  function startup and end sizes.
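
A rough sketch of such a size-dependent policy follows. The enum, the function names, the op-count threshold and the register numbers used with it are all assumptions made up for illustration; none of them are existing Parrot settings.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PREFER_VOLATILES, PREFER_NONVOLATILES } alloc_policy;

/* Small chunks start with volatiles (cheap startup and end code),
 * big chunks start with non-volatiles. */
static alloc_policy choose_policy(size_t n_ops, size_t threshold)
{
    return n_ops < threshold ? PREFER_VOLATILES : PREFER_NONVOLATILES;
}

/* Write the register allocation order for the given policy into out,
 * which must have room for n_nonvol + n_vol entries. Returns the count. */
static size_t build_alloc_order(alloc_policy policy,
                                const int *nonvol, size_t n_nonvol,
                                const int *vol, size_t n_vol,
                                int *out)
{
    const int *first  = (policy == PREFER_VOLATILES) ? vol : nonvol;
    const int *second = (policy == PREFER_VOLATILES) ? nonvol : vol;
    size_t n_first  = (policy == PREFER_VOLATILES) ? n_vol : n_nonvol;
    size_t n_second = (policy == PREFER_VOLATILES) ? n_nonvol : n_vol;
    size_t i, k = 0;

    assert(out != NULL);
    for (i = 0; i < n_first; i++)
        out[k++] = first[i];
    for (i = 0; i < n_second; i++)
        out[k++] = second[i];
    return k;
}
```

With this shape the platform code only supplies its two register lists, and the threshold stays a single tweakable.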

2) JITed functions with calls into Parrot

The jit2h JIT compiler needs a hint, that we call external code, e.g.

   CALL_FUNCTION(string_copy)

This notion is needed anyway for the EXEC core, which has to generate
fixup code for the executable. So with that information available, the
JIT compiler can surround such code with:

   PRESERVE_REGS();
   ...
   RESTORE_REGS();

Depending on the register mapping, the framework can now save and restore
volatile registers if needed.
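
To make "if needed" concrete: only registers that are both mapped in the current section and volatile have to be saved around the call. A minimal sketch with bitmasks, where bit r stands for register r; the function names and the mask layout are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Registers to save around an external call: those that are mapped in
 * the current section and are volatile on this platform. */
static uint32_t regs_to_preserve(uint32_t mapped_mask, uint32_t volatile_mask)
{
    return mapped_mask & volatile_mask;
}

/* Stack bytes needed to hold the registers in mask, with reg_size
 * bytes per register (4 for 32-bit integers, 8 for doubles). */
static unsigned preserve_area_bytes(uint32_t mask, unsigned reg_size)
{
    unsigned count = 0;

    assert(reg_size == 4 || reg_size == 8);
    for (; mask != 0; mask &= mask - 1)   /* clear lowest set bit */
        count++;
    return count * reg_size;
}
```

If no volatile register is mapped, the mask is empty and PRESERVE_REGS/RESTORE_REGS would emit nothing.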

 The other tricky part was that saving/restoring the FP registers is one
 instruction per saved register, so saving all 18 was exceeding the asm
 size we allocate in src/jit.c (in some cases), since we emit Parrot_end
 for all restart ops.

Yep, that's suboptimal. I've done that on i386 because it was just easy.
But you are right, the Parrot_end() code should really be there only
once.

 The attached patch also contains some other small improvements I'd been
 working on, and a few more jitted ops to demonstrate calling a C
 function from a jitted op.

I'll apply it, because it's obviously correct, albeit suboptimal ;) But
we can always improve things later.

 ... , frustratingly, and I've moved to the
 habit of using something like x/300i jit_code as a workaround.

Ah, yes - forgot that.

 Clearly it can access the memory region, so it seems like a gdb bug.

Yep.

 JEff

Thanks,
leo


Re: [PATCH] Re: JIT and platforms warning

2004-10-23 Thread Jeff Clites
On Oct 23, 2004, at 4:20 AM, Leopold Toetsch wrote:
Jeff Clites [EMAIL PROTECTED] wrote:
See attached the patch, plus the new asm.s file.
Doesn't run, segfaults on even mops.pasm - please check.
I can't reproduce that here; parrot -j works for me with 
examples/{benchmarks,assembly}/mops.pasm, and all 'make testj' tests 
pass. (They were passing before, and I updated to pick up changes since 
I sent the patch, and still all passes.) I also tried building against 
a system ICU (was building against the parrot-supplied version), in 
case there were issues with calling into shared lib code (long shot), 
and no difference.

Below is myconfig--I'm not doing any special configuration (just 
running 'perl Configure.pl', and not building optimized). Do you have 
any uncommitted local changes which might be involved? I'm up-to-date 
from CVS, and have no uncommitted changes except for those in the 
patch.

JEff
Summary of my parrot 0.1.1 configuration:
  configdate='Sat Oct 23 13:06:30 2004'
  Platform:
osname=darwin, archname=darwin
jitcapable=1, jitarchname=ppc-darwin,
jitosname=DARWIN, jitcpuarch=ppc
execcapable=1
perl=perl
  Compiler:
cc='cc', ccflags='-g -pipe -pipe -fno-common -no-cpp-precomp 
-DHAS_TELLDIR_PROTOTYPE  -pipe -fno-common -Wno-long-double  
-I/sw/include',
  Linker and Libraries:
ld='c++', ldflags='  -flat_namespace  -L/sw/lib',
cc_ldflags='',
libs='-lm -lgmp'
  Dynamic Linking:
share_ext='.dylib', ld_share_flags='-dynamiclib',
load_ext='.bundle', ld_load_flags='-bundle -undefined suppress'
  Types:
iv=long, intvalsize=4, intsize=4, opcode_t=long, opcode_t_size=4,
ptrsize=4, ptr_alignment=1 byteorder=4321,
nv=double, numvalsize=8, doublesize=8

% gcc -v
Reading specs from /usr/libexec/gcc/darwin/ppc/3.3/specs
Thread model: posix
gcc version 3.3 20030304 (Apple Computer, Inc. build 1495)


Re: [PATCH] Re: JIT and platforms warning

2004-10-23 Thread Jeff Clites
On Oct 23, 2004, at 3:42 AM, Leopold Toetsch wrote:
Jeff Clites [EMAIL PROTECTED] wrote:
Yep, that was the core of the issue. There's no free lunch--if we use
the nonvolatile registers, we need to preserve/restore them in
begin/end, but if we use the volatile registers, we need to preserve
them across function calls (incl. normal op calls).
Good point and JIT/i386 does it wrong in core.ops. But normal 
non-JITted
code is not the problem - the framework preserves mapped registers or
better - it has to copy all mapped registers to Parrot registers so 
that C
code is able to see the actual values.
Ah, good. The problem I was seeing was caused by set_s_sc, since my 
jit_emit_call_func was written with the assumption that we should be 
mapping only non-volatile registers (even though we were actually 
mapping volatiles, but at the end of the list, and even though we 
weren't yet preserving the appropriate float registers). But a recent 
check-in had us mapping the volatile float registers only, which 
exposed the problem.

... So I added code to
do the appropriate save/restore, and use the non-volatile registers 
for
mapping--that should be less asm than what we'd have to do to use the
volatile registers. (The surprising thing was that we only got 2
failures when using the volatile registers--I'll look into creating
some tests that would detect problems with register preservation.)
Well, allocation strategy on PPC (or I386) is to first use the
non-volatile registers. PPC has 14 usable registers. To provoke a
failure you'd need e.g. 15 different I-registers and then a JITted
function call like set_s_sc.
We were allocating the volatile float registers first (or, only)--so 
set_s_sc was blowing away an N-register, even with only one in use. 
That's why I was surprised there weren't more failures.

Anyway, I think we need a more general solution.
...
Solution: The JIT compiler/optimizer already calculates the register
mapping. We have to use this information for JIT pro- and epilogs.
That makes sense--I hadn't initially realized we were tracking this, 
but since we are, we should use it. Things may be a bit tricky 
fixup-wise, since the size of the prolog will depend on how many float 
registers we need to preserve. (We can save a variable number of int 
registers with just a single instruction in the PPC case, as long as we 
are using registers in order, or we don't mind saving a few extra if 
we are not.)
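
The single instruction in question is PPC's store-multiple, stmw rS, which stores rS through r31. A small sketch of the resulting trade-off, assuming bit r of a mask marks GPR r as mapped; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Lowest-numbered GPR present in used_mask (bit r set = GPR r mapped). */
static unsigned lowest_used_gpr(uint32_t used_mask)
{
    unsigned r = 0;

    assert(used_mask != 0);
    while (((used_mask >> r) & 1u) == 0)
        r++;
    return r;
}

/* stmw rS stores rS..r31, so the lowest used register decides how many
 * registers a single stmw ends up saving. */
static unsigned gprs_saved_by_stmw(uint32_t used_mask)
{
    return 32u - lowest_used_gpr(used_mask);
}

/* Registers saved although they are not mapped (gaps in the range). */
static unsigned extra_gprs_saved(uint32_t used_mask)
{
    unsigned used = 0;
    uint32_t m;

    for (m = used_mask; m != 0; m &= m - 1)
        used++;
    return gprs_saved_by_stmw(used_mask) - used;
}
```

Allocating downward from r31 without gaps keeps the extra count at zero; using r29 and r31 but not r30 costs one extra store slot.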

The allocation strategy should be adjusted and depend on code size and
register usage:
- big chunks of compiled code like whole modules should use the
  non-volatile registers first and then (if needed) the volatile
  registers.
- small chunks of code should use the volatile registers to reduce
  function startup and end sizes.
Yep, makes sense. And there may be a case for not mapping at all for 
very small chunks, but I'd have to experiment to know if that'd be a 
win.

2) JITed functions with calls into Parrot
The jit2h JIT compiler needs a hint, that we call external code, e.g.
   CALL_FUNCTION(string_copy)
This notion is needed anyway for the EXEC core, which has to generate
fixup code for the executable. So with that information available, the
JIT compiler can surround such code with:
   PRESERVE_REGS();
   ...
   RESTORE_REGS();
The framework can now depending on the register mapping save and 
restore
volatile registers if needed.
Sounds good.
The other tricky part was that saving/restoring the FP registers is 
one
instruction per saved register, so saving all 18 was exceeding the asm
size we allocate in src/jit.c (in some cases), since we emit 
Parrot_end
for all restart ops.
Yep, that's suboptimal. I've done that on i386 because it was just 
easy.
But you are right, the Parrot_end() code should really be there only
once.
Yep, instead of emitting it multiple times and conditionally jumping 
over it, we can emit it just once and conditionally jump to it.
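
A back-of-the-envelope cost model for that change, assuming one conditional branch per restart op in both layouts and the fixed 4-byte PPC instruction size. The function names and the instruction counts used with them are illustrative, not measured.

```c
#include <assert.h>
#include <stddef.h>

#define PPC_INSTR_BYTES 4u   /* PPC instructions are fixed-width */

/* Epilog emitted inline after every restart op, each copy guarded by
 * one conditional branch that jumps over it. */
static size_t inline_epilog_bytes(size_t n_restart_ops, size_t epilog_instrs)
{
    assert(epilog_instrs > 0);
    return n_restart_ops * (1 + epilog_instrs) * PPC_INSTR_BYTES;
}

/* Epilog emitted once; every restart op only gets one conditional
 * branch that jumps to the shared copy. */
static size_t shared_epilog_bytes(size_t n_restart_ops, size_t epilog_instrs)
{
    assert(epilog_instrs > 0);
    return (n_restart_ops + epilog_instrs) * PPC_INSTR_BYTES;
}
```

With 10 restart ops and a 20-instruction epilog, that is 840 bytes inline against 120 bytes shared.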

... , frustratingly, and I've moved to the
habit of using something like x/300i jit_code as a workaround.
Ah, yes - forgot that.
At least stepping does the right thing now, with your fix for the stabs 
file. Very nice to be able to do p I0 and such.

One other question: Your P_ARITH optimization in jit/ppc/core.jit. I 
can't come up with a case where this kicks in, since in the tests I've 
tried, prev_op is always NULL when JITting if_i_ic or unless_i_ic. If I 
set things to not map any int registers, then I don't hit the cases in 
jit_emit.h which set prev_op to 0, but build_asm is doing it--I can't 
come up with a small test case which isn't being treated as multiple 
sections, it seems. There's something there w.r.t. sections which I 
don't understand, but anyway I just wanted to know how to set things up 
so that I can see your optimization in action.

Thanks,
JEff


Re: JIT and platforms warning

2004-10-22 Thread Jeff Clites
On Oct 22, 2004, at 1:01 AM, Leopold Toetsch wrote:
[JIT changes]
I just finished tracking down the source of a couple of JIT test 
failures on PPC--due to recent changes but only indirectly related, and 
pointing out things which needed fixing anyway (float register 
preservation issues). I'll send it in tomorrow after I've had a chance 
to clean it up and add some comments.

JEff


Re: JIT and platforms warning

2004-10-22 Thread Leopold Toetsch
Jeff Clites wrote:
On Oct 22, 2004, at 1:01 AM, Leopold Toetsch wrote:
[JIT changes]

I just finished tracking down the source of a couple of JIT test 
failures on PPC--due to recent changes but only indirectly related, and 
pointing out things which needed fixing anyway (float register 
preservation issues). I'll send it in tomorrow after I've had a chance 
to clean it up and add some comments.
Please make sure to get a recent CVS copy and try to prepare the patch 
during my night ;) I changed the PPC float register allocation 
yesterday, because it looked bogus, i.e. the non-volatile FPRs were 
allocated but not saved. That should be fixed.

While you're investigating PPC JIT: JIT debugging via stabs doesn't work 
at all here. I get around the gdb message (missing data segment) by 
inserting .data\n.text\n in the stabs file, but that's all. After this 
change and regenerating file.o I can load it with add-symbol-file 
file.o without any complaint, but the access to the memory region the 
jitted code occupies is not allowed, disassemble doesn't work ...

JEff
leo