[JIT] bsr/ret in native code

2002-06-14 Thread Aldo Calpini


hello there,
in one of my endless tours inside the JIT world, I came up with this idea
which seems to give a major speed increase.

basically, I'm substituting the Parrot method for subroutines (push the
current address in the call stack and then jump) with a plain native
x86 ASM call instruction. and of course, the ret instruction is just a
plain native ret instruction.

that way I'm completely avoiding the call stack, just relying on the
CPU's own stack for this.

to make it work, I had to JIT all the 2-parameter eq/ne instructions
to perform a ret on successful comparison instead of a pop and goto.

this is of course a major change in the internal working of the interpreter
when using the -j option, so I'm not sure it is a Good Thing. you would
not be able, for example, to inspect the call stack from inside a Parrot
program anymore.

anyway, this is a little sample of the implementation:

  Parrot_bsr_ic {
      emit_call_op2(jit_info, *INT_CONST[1]);
  }

  Parrot_ret {
      emitm_ret(NATIVECODE);
  }

  Parrot_eq_i_i {
      emitm_movl_m_r(NATIVECODE, emit_EAX, emit_None, emit_None, emit_None,
                     INT_REG[1]);
      emitm_cmpl_r_m(NATIVECODE, emit_EAX, emit_None, emit_None, emit_None,
                     INT_REG[2]);
      /* operands differ: jump over the ret below */
      emitm_jxs(NATIVECODE, emitm_jne, +1);
      emitm_ret(NATIVECODE);
  }

there are of course a lot more eq_X_X and ne_X_X combinations, but they're
all similar to this.

the emit_call_op2 in jit.h is just a slight variant of emit_call_op which
only uses 32 bit displacement for backward calls (don't ask me why, but it 
seems to work like this):

  static void emit_call_op2(Parrot_jit_info *jit_info, opcode_t disp){
      long offset;
      opcode_t opcode;

      opcode = jit_info->op_i + disp;

      /* backward call: the target is already emitted, so its offset is known */
      if (opcode <= jit_info->op_i) {
          offset = jit_info->op_map[opcode].offset -
                   (jit_info->native_ptr - jit_info->arena_start);
          emitm_calll(jit_info->native_ptr, offset - 5);
          return;
      }

      /* forward call: emit a placeholder displacement and record a fixup */
      Parrot_jit_newfixup(jit_info);
      jit_info->fixups->type = JIT_X86JUMP;
      jit_info->fixups->param.opcode = opcode;
      emitm_calll(jit_info->native_ptr, 0xc0def00d);
  }

if anybody sees a problem with this approach, please let me know, otherwise
I'll go on with the patch.

cheers,
Aldo

__END__
$_=q,just perl,,s, , another ,,s,$, hacker,,print;




Re: [JIT] bsr/ret in native code

2002-06-14 Thread Dan Sugalski

At 9:54 AM +0200 6/14/02, Aldo Calpini wrote:
> you would
> not be able, for example, to inspect the call stack from inside a Parrot
> program anymore.

That, unfortunately, makes it untenable, since we need to be able to 
do this in the general case. Also, we'll fill up the thread stack 
pretty quickly. Not hugely fast, mind, but it's still an issue when 
we have a potentially small stack on hand. (20-40K won't be unusual, 
unfortunately)

Believe me, I'd love to get the speed this way, but it'll make some 
code untenable, and the lack of stack inspection may be a problem. 
(If it turns out later to not be a problem, well, we can do it then. 
I like the idea, I just think the limits'll be a problem. Hopefully 
I'm wrong :)
-- 
 Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: [JIT] bsr/ret in native code

2002-06-14 Thread Larry Wall

On Fri, 14 Jun 2002, Dan Sugalski wrote:
: At 9:54 AM +0200 6/14/02, Aldo Calpini wrote:
: > you would
: > not be able, for example, to inspect the call stack from inside a Parrot
: > program anymore.
: 
: That, unfortunately, makes it untenable, since we need to be able to 
: do this in the general case. Also, we'll fill up the thread stack 
: pretty quickly. Not hugely fast, mind, but it's still an issue when 
: we have a potentially small stack on hand. (20-40K won't be unusual, 
: unfortunately)
: 
: Believe me, I'd love to get the speed this way, but it'll make some 
: code untenable, and the lack of stack inspection may be a problem. 
: (If it turns out later to not be a problem, well, we can do it then. 
: I like the idea, I just think the limits'll be a problem. Hopefully 
: I'm wrong :)

Hmm.  The routines called from tight loops tend to be leaf nodes.
It might very well be useful to keep track of which routines don't
inspect the stack.  It might even be worthwhile to make a language
rule saying that any routine that uses C<caller> or C<want> must so
indicate in the declaration somehow, via a superpositional return
type or a property.

Larry




Re: [JIT] bsr/ret in native code

2002-06-14 Thread Larry Wall

On Fri, 14 Jun 2002, Nicholas Clark wrote:
: But surely a routine that calls another routine can potentially have its
: stack inspected by the caller?

Certainly.

: So it would only make sense for leaf nodes, and even then they might
: get inspected by overloaded values or methods on objects that were passed
: as parameters?

Yes.

: So is it possible to make it useful in a general case, or were you meaning
: that a subroutine can declare "I don't need to be on the stack", document
: itself as such, and then any indirect calls it makes don't get to see it
: (but at their own risk). It's still a form of action-at-a-distance, so
: is it that good?

Probably can't make the optimization unless we have the body and can
tell either that there are no indirect calls or that any indirect
calls made are known safe.  I can see some routines that could use
this optimization that couldn't use inlining (such as when we have
no guarantee against redefinition, except in that case you still have
to go indirect through the header).

: Or would the property of "I don't use caller or want" still be useful on a
: subroutine, because the run-time could determine that it would be
: inline-able (or whatever) inside a loop at run time, based on parameters
: passed to it? (and call it non-inline if the parameters were not base perl
: types)

Maybe.  I'm not an expert on run-time optimizations.  I just know that
the more info you have, the easier it is to know when you can get
away with a particular optimization.  And that there are advantages
and disadvantages to knowing anything at any particular stage.  And I
really like optional declarations, because then the programmer gets
to make the tradeoff.

Larry




Re: [JIT] bsr/ret in native code

2002-06-14 Thread Dan Sugalski

At 1:49 PM -0700 6/14/02, Larry Wall wrote:
> On Fri, 14 Jun 2002, Nicholas Clark wrote:
> : Or would the property of "I don't use caller or want" still be useful on a
> : subroutine, because the run-time could determine that it would be
> : inline-able (or whatever) inside a loop at run time, based on parameters
> : passed to it? (and call it non-inline if the parameters were not base perl
> : types)
>
> Maybe.  I'm not an expert on run-time optimizations.  I just know that
> the more info you have, the easier it is to know when you can get
> away with a particular optimization.  And that there are advantages
> and disadvantages to knowing anything at any particular stage.  And I
> really like optional declarations, because then the programmer gets
> to make the tradeoff.

There's also the problem of active data--does a variable's 
tie/overload functions have access to their calling stack? If so, 
it's doubly hard to figure out whether there's anything that may 
inspect the call stack.
-- 
 Dan
