Dan Sugalski <[EMAIL PROTECTED]> wrote:
> At 11:17 PM +0200 9/1/03, Leopold Toetsch wrote:

>>I don't see the point here especially why we would need a temporary PMC.
>>If we have an array of packed ints, I just need a pointer to the element
>>to work on it. This is very similar to the C<key> opcode I had in some of
>>my proposals.

> We can't do that. There's insufficient information about the
> variables at compiletime to do anything other than call into the PMC
> to do the operation, so the internal representation's irrelevant. As
> far as parrot is concerned, it's a PMC and it doesn't know anything
> about the inside.

We don't lose the abstraction with a special C<key> opcode. This C<key>
opcode is the abstraction: get a pointer to an item in the aggregate (or
prepare for a LHS). This is a vtable method of the aggregate. It splits
the 3-keyed operations into simpler parts, and that's it - no permutations.
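
A rough C sketch of that split, to make the idea concrete. Everything here
is an assumption for illustration: IntArray, its key slot, and add_keyed()
are hypothetical names and not Parrot's actual PMC or vtable layout. The
point is only that one "give me a pointer to the element" method per
aggregate plus one plain add covers the 3-keyed case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packed-int aggregate; NOT Parrot's real PMC layout. */
typedef struct IntArray {
    int    *data;
    size_t  len;
    /* "key" vtable slot: address of element idx, or NULL if out of range */
    int *(*key)(struct IntArray *self, size_t idx);
} IntArray;

static int *int_array_key(IntArray *self, size_t idx)
{
    return idx < self->len ? &self->data[idx] : NULL;
}

/* Wrap caller-owned storage in an aggregate (hypothetical helper). */
static IntArray int_array_make(int *data, size_t len)
{
    assert(data != NULL);
    IntArray a = { data, len, int_array_key };
    return a;
}

/* a[ia] = b[ib] + c[ic]: three key lookups, then one unkeyed add.
 * Returns 0 on success, -1 if any index is out of range. */
static int add_keyed(IntArray *a, size_t ia,
                     IntArray *b, size_t ib,
                     IntArray *c, size_t ic)
{
    int *dst = a->key(a, ia);
    int *lhs = b->key(b, ib);
    int *rhs = c->key(c, ic);

    if (dst == NULL || lhs == NULL || rhs == NULL)
        return -1;
    *dst = *lhs + *rhs;
    return 0;
}
```

The add itself never needs to know how any operand was addressed, so no
per-addressing-mode variants of it are required.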

>>Not when you need 64 opcodes for the keyed variants. 64:1 isn't
>>"somewhat balanced".

> Erm.... you need to redo your math.

So I'll try. We have 4 different "addressing" modes of one single keyed
argument:

 set_p_k
 set_p_kc
 set_p_ki
 set_p_kic

Now when we want to support 3-keyed operations there are 4*4*4 different
opcodes of an all-3-keyed operation (+ some more if one isn't keyed).
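
To spell out the arithmetic, here is a small C sketch that enumerates the
all-keyed variants of one 3-argument op from the four addressing modes
listed above. The function name and the name pattern it prints are my
assumptions, chosen only to mirror the set_p_k* names; this is not how the
ops preprocessor works.

```c
#include <assert.h>
#include <stdio.h>

/* The four keyed addressing modes listed above. */
static const char *const key_modes[] = { "k", "kc", "ki", "kic" };
enum { N_MODES = sizeof key_modes / sizeof key_modes[0] };

/* Count every all-keyed 3-argument variant of `op`.
 * If `out` is non-NULL, also print one opcode name per line. */
static int count_keyed_variants(const char *op, FILE *out)
{
    int n = 0;

    assert(op != NULL);
    for (int a = 0; a < N_MODES; a++)
        for (int b = 0; b < N_MODES; b++)
            for (int c = 0; c < N_MODES; c++) {
                if (out != NULL)
                    fprintf(out, "%s_p_%s_p_%s_p_%s\n", op,
                            key_modes[a], key_modes[b], key_modes[c]);
                n++;
            }
    return n;
}
```

Calling count_keyed_variants("add", stdout) lists names from
add_p_k_p_k_p_k through add_p_kic_p_kic_p_kic and returns 64.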

If we only have e.g. <op>_p_k_p_k_p_k, we are missing the nice
optimization of integer indices: we have to go through key_integer(),
have to check for NULL keys, and we need a full-fledged (albeit constant)
Key PMC per argument.
Further: you have to implement type morphing, range checking, and BIG*
promotion in the non-keyed variants and in the keyed vtable variants
too. This is error-prone and a waste of memory.

And finally, you are saying that these are most useful for aggregates
containing non-PMCs. What about

  @a[i] = @b[i] + 1;

or the equivalent Perl6 vector operations[1]? More permutations of opcodes
to implement these?

> ... Even *if* we went with a full set
> of keyed and unkeyed parameters (and I don't see a reason to do so,
> though we certainly can) it's not 64. At worst it's 16 for three-arg
> ops, and that's for both the keyed int, keyed normal, and nonkeyed
> version.

What am I missing here?

>>Implementation details wanted ;-)

> I'll go thump the ops preprocessor, then. There's no reason to
> actually write all the code for it, as it can be autogenerated.

But please with a Configure switch to turn it off ;-)

[1] I expect the biggest speed gain here, where we can optimize to
hell.

  @a >>=<< @b + 1       // @a, @b are known to be arrays of packed int

  n = @b.elements()
  c = n / CHUNK_SIZE
  r = n % CHUNK_SIZE
  for i (0..c-1)
    ptra = @a.make_chunk(i)
    ptrb = @b.get_chunk(i)
    for (1..CHUNK_SIZE)
      *ptra++ = *ptrb++ + 1
  // do rest ...
  @a.elements = @b.elements

This would be a sequence of some specialized opcodes (possibly with a
runtime check in front), and it's fully JITtable.
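
For reference, the same loop as plain C over raw int buffers, including
the "do rest" part. This is only a sketch under assumptions: the buffers
stand in for what make_chunk()/get_chunk() would hand back, CHUNK_SIZE is
an arbitrary illustrative value, and add_one_chunked() is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE 4   /* illustrative only; a real value would be tuned */

/* dst[i] = src[i] + 1 for n packed ints, processed chunk by chunk. */
static void add_one_chunked(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    size_t chunks = n / CHUNK_SIZE;
    size_t rest   = n % CHUNK_SIZE;

    for (size_t c = 0; c < chunks; c++) {
        int       *pa = dst + c * CHUNK_SIZE;   /* stands in for make_chunk(c) */
        const int *pb = src + c * CHUNK_SIZE;   /* stands in for get_chunk(c)  */

        for (size_t j = 0; j < CHUNK_SIZE; j++)
            *pa++ = *pb++ + 1;
    }

    /* do rest: the trailing n % CHUNK_SIZE elements */
    int       *pa = dst + chunks * CHUNK_SIZE;
    const int *pb = src + chunks * CHUNK_SIZE;

    for (size_t j = 0; j < rest; j++)
        *pa++ = *pb++ + 1;
}
```

The inner loop has a fixed trip count and touches no PMC at all, which is
what makes it a good candidate for the JIT.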

leo
