On Oct 23, 2004, at 3:42 AM, Leopold Toetsch wrote:

Jeff Clites <[EMAIL PROTECTED]> wrote:

Yep, that was the core of the issue. There's no free lunch--if we use
the nonvolatile registers, we need to preserve/restore them in
begin/end, but if we use the volatile registers, we need to preserve
them across function calls (incl. normal op calls).

Good point and JIT/i386 does it wrong in core.ops. But normal non-JITted
code is not the problem - the framework preserves mapped registers or
better - it has to copy all mapped registers to Parrot registers so that C
code is able to see the actual values.

Ah, good. The problem I was seeing was caused by C<set_s_sc>, since my jit_emit_call_func was written with the assumption that we should be mapping only non-volatile registers (even though we were actually mapping volatiles, but at the end of the list, and even though we weren't yet preserving the appropriate float registers). But a recent check-in had us mapping the volatile float registers only, which exposed the problem.


... So I added code to
do the appropriate save/restore, and use the non-volatile registers for
mapping--that should be less asm than what we'd have to do to use the
volatile registers. (The surprising thing was that we only got 2
failures when using the volatile registers--I'll look into creating
some tests that would detect problems with register preservation.)

Well, allocation strategy on PPC (or I386) is to first use the non-volatile registers. PPC has 14 usable registers. To provoke a failure you'd need e.g. 15 different I-registers and then a JITted function call like C<set_s_sc>.

We were allocating the volatile float registers first (or, only)--so C<set_s_sc> was blowing away an N-register, even with only one in use. That's why I was surprised there weren't more failures.


Anyway, I think we need a more general solution.
...
Solution: The JIT compiler/optimizer already calculates the register
mapping. We have to use this information for JIT pro- and epilogs.

That makes sense--I hadn't initially realized we were tracking this, but since we are, we should use it. Things may be a bit tricky fixup-wise, since the size of the prolog will depend on how many float registers we need to preserve. (We can save a variable number of int registers with just a single instruction in the PPC case, as long as we are using registers "in order", or we don't mind saving a few extra if we are not.)
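
To make the fixup concern concrete, here is a minimal sketch of how the prolog size could be computed from the register mapping. It assumes one stmw covers all mapped non-volatile int registers and one stfd is needed per float register; prolog_save_bytes is a hypothetical name, not an existing Parrot function.

```c
#include <assert.h>
#include <stddef.h>

/* Every PPC instruction is 4 bytes wide. */
enum { PPC_INSN_BYTES = 4 };

/*
 * Hypothetical helper (not part of Parrot): bytes of prolog code needed
 * to save the mapped non-volatile registers.
 */
static size_t
prolog_save_bytes(int n_int_mapped, int n_float_mapped)
{
    size_t insns = 0;

    assert(n_int_mapped >= 0 && n_float_mapped >= 0);

    /* One stmw stores rN..r31, so any non-zero count costs one instruction. */
    if (n_int_mapped > 0)
        insns += 1;

    /* Each float register needs its own stfd. */
    insns += (size_t)n_float_mapped;

    return insns * PPC_INSN_BYTES;
}
```

Since the result varies with the float count, any branch offsets computed before the prolog is sized would need fixing up afterwards.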


The allocation strategy should be adjusted and depend on code size and
register usage:
- big chunks of compiled code like whole modules should use the
  non-volatile registers first and then (if needed) the volatile
  registers.
- small chunks of code should use the volatile registers to reduce
  function startup and end sizes.

Yep, makes sense. And there may be a case for not mapping at all for very small chunks, but I'd have to experiment to know if that'd be a win.
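
A rough sketch of the size-based choice described above. The names and the threshold parameter are hypothetical; where the cutoff belongs would have to come from measurement.

```c
#include <stddef.h>

/* Hypothetical allocation orders for mapped registers. */
typedef enum {
    ALLOC_NONVOLATILE_FIRST,  /* big chunks: pay the prolog/epilog once */
    ALLOC_VOLATILE_FIRST      /* small chunks: keep startup and end short */
} alloc_order;

/*
 * Hypothetical helper: choose an order from the number of ops in the
 * chunk. small_chunk_ops is an assumed tuning knob, not a Parrot setting.
 */
static alloc_order
choose_alloc_order(size_t n_ops, size_t small_chunk_ops)
{
    return n_ops <= small_chunk_ops ? ALLOC_VOLATILE_FIRST
                                    : ALLOC_NONVOLATILE_FIRST;
}
```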


2) JITed functions with calls into Parrot

The jit2h JIT compiler needs a hint that we call external code, e.g.

   CALL_FUNCTION("string_copy")

This notion is needed anyway for the EXEC core, which has to generate
fixup code for the executable. So with that information available, the
JIT compiler can surround such code with:

   PRESERVE_REGS();
   ...
   RESTORE_REGS();

Depending on the register mapping, the framework can now save and restore
volatile registers if needed.

Sounds good.
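
One way to picture what PRESERVE_REGS()/RESTORE_REGS() would have to decide, assuming the mapping and the volatile set are both available as bitmasks (bit n standing for register n). Both function names are hypothetical, and the masks are supplied by the caller rather than taken from any real ABI table.

```c
#include <stdint.h>

/*
 * Hypothetical helper: registers that are both mapped and volatile are
 * the only ones that must be saved around a call into C code.
 */
static uint32_t
regs_to_preserve(uint32_t mapped_mask, uint32_t volatile_mask)
{
    return mapped_mask & volatile_mask;
}

/* Count set bits, i.e. how many save (and restore) instructions to emit. */
static int
count_regs(uint32_t mask)
{
    int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

If only non-volatile registers are mapped, the intersection is empty and nothing needs to be emitted around the call.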

The other tricky part was that saving/restoring the FP registers is one
instruction per saved register, so saving all 18 was exceeding the asm
size we allocate in src/jit.c (in some cases), since we emit Parrot_end
for all restart ops.

Yep, that's suboptimal. I've done that on i386 because it was just easy.
But you are right, the Parrot_end() code should really be there only
once.

Yep, instead of emitting it multiple times and conditionally jumping over it, we can emit it just once and conditionally jump to it.
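
A back-of-the-envelope sketch of the code-size difference between the two layouts. The function names are hypothetical and the sizes are passed in as parameters, since the real epilog length depends on how many registers get restored.

```c
#include <stddef.h>

/* Current layout: every restart op carries a branch plus its own epilog. */
static size_t
inline_epilog_bytes(size_t n_restart_ops, size_t epilog_bytes,
                    size_t branch_bytes)
{
    return n_restart_ops * (branch_bytes + epilog_bytes);
}

/* Proposed layout: one shared epilog, one conditional branch per restart op. */
static size_t
shared_epilog_bytes(size_t n_restart_ops, size_t epilog_bytes,
                    size_t branch_bytes)
{
    return n_restart_ops * branch_bytes + epilog_bytes;
}
```

With an assumed 80-byte epilog, 4-byte branches and 5 restart ops, that works out to 420 bytes versus 100.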


... , frustratingly, and I've moved to the
habit of using something like "x/300i jit_code" as a workaround.

Ah, yes - forgot that.

At least stepping does the right thing now, with your fix for the stabs file. Very nice to be able to do "p I0" and such.



One other question: Your "P_ARITH" optimization in jit/ppc/core.jit. I can't come up with a case where this kicks in, since in the tests I've tried, prev_op is always NULL when JITting if_i_ic or unless_i_ic. If I set things to not map any int registers, then I don't hit the cases in jit_emit.h which set prev_op to 0, but build_asm is doing it--I can't come up with a small test case which isn't being treated as multiple sections, it seems. There's something there w.r.t. sections which I don't understand, but anyway I just wanted to know how to set things up so that I can see your optimization in action.


Thanks,

JEff


