> >>We're talking bytecode. That will indeed be a case of "huge arrays of
> >>tightly packed integers".
> >
> >For bytecode, it's not a big problem, certainly not one I'm worried about.
> >Machines that want 64-bit ints have, likely speaking, more than enough
> >memory to handle the larger bytecode.
>
> I'm more worried about the cache. With 32 bit bytecodes, the same program
> will be only half the size it would be with 64 bit. Or: you can fit a
> twice as large program in the cache, or two of them (for multitasking).
> That will mean a speed-up, and likely a vast one for programs with sizes
> close enough to the cache size.
>
If this is the case, then why are we using 32-bit register identifiers?
Obviously it makes the code easier to write.  But at some point are we going
to compress the byte-code?  Per a previous email that suggested that
byte-code is only going to be valid on a given machine with a pre-compiled
parrot / perl6 core, the byte-code won't need to worry about the number of
registers, etc.  Most VM architectures use 8 or 16 bits for the op-code
(including the register map).

Here are the pros and cons as I see them:
Cons:
  o 8-bit op-codes dramatically limit the number of "macro-ops" or advanced
ops
  o 16-bit op-codes have potentially screwy alignment issues.
  o 4-bit register addresses have definitely screwy alignment issues
(they require masking to extract values, which takes more cpu-time)
  o sub-32-bit ops might be slower on non-x86 architectures (since more and
more are 64- or 32-bit only, and require special ops to munge sub-32-bit
data; the Alpha, notably)
  o If the constant-table index were 16 bits, we'd limit the number of
entries to 64K per code-segment (but who would ever use more than that
in p-code ;)
  o If the address were limited to 16 bits, we'd have a code-size limitation
of some 4K ops per code-segment.
  o making p-code size an issue dramatically increases the number of
op-codes early in development (file this as an enhancement request
instead?).  Namely we'd have inc16_i8_ic8, inc16_i8_ic16, inc16_i8_ic32,
add16_i8_i8_ic16, add16_i8_i8_ic32.
  o a larger number of op-codes means more C code, which means a trade-off
between D-cache (for the byte-code) and I-cache (for the extra C code).

Pros:
  o dramatically compressed op-code size (from 128 bits on average down to
32, 48, or 64).  Saves both disk space and cache space: tighter
inner-loops.
  o Potentially more highly optimized C code; 16-bit adds are somewhat
faster on some architectures, so we do what's needed when it's needed.
64-bit archs will upcast everything anyway.
  o If we eventually determined a max code length (of, say, 64 bits after
alignment), then we could just pad all ops to that size.  This would
dramatically reduce C-code overhead (no offset = DEFAULT_SIZE; offset +=
rel_address; return offset; .. code = offset;).  It would additionally
reduce the risk of jumping into the middle of an op-code.  Heck, we could
do this now: simply profile all op-codes to determine the max code size,
then pad all op-codes to that size.  Given that we're not into
dynamic op-codes, and most everything is being pushed into the constants
area, I don't see much danger in it.

Food for thought..

-Michael
