> >>We're talking bytecode. That will indeed be a case of "huge arrays of
> >>tightly packed integers".
> >
> >For bytecode, it's not a big problem, certainly not one I'm worried about.
> >Machines that want 64-bit ints have, likely speaking, more than enough
> >memory to handle the larger bytecode.
>
> I'm more worried about the cache. For 32 bit bytecodes, the same program
> will be only half the size it would be for 64 bit. Or: you can fit a
> twice as large program in the cache, or two of them (for multitasking).
> That will mean a speed-up, and likely a vast one for programs with sizes
> close to the cache size.
>
If this is the case then why are we using 32bit register identifiers?
Obviously it makes code easier to write. But at some point are we going
to compress the byte-code? Combined with a previous email suggesting
that byte-code is only going to be valid on a given machine with a
pre-compiled parrot / perl6 core, the bytecode format won't need to worry
about the number of registers, etc. Most VM architectures use 8 or 16
bits for the op-code (including the register map).
Here are the pros and cons as I see them:
Cons:
o 8bit op-codes dramatically limit the number of "macro-ops" or advanced
ops
o 16bit op-codes have potentially screwy alignment issues.
o 4bit register addresses definitely have screwy alignment issues
(extracting values requires masking, which takes more cpu-time)
o sub-32-bit ops might be slower on non-x86 architectures (since more
and more are 64 or 32bit only, and require special ops to munge
sub-32bit data; notably the alpha)
o If the constant-table index were 16bits, we'd limit the number of
entries to 64K per code-segment (but who would ever use more than that
in p-code? ;)
o If the address was limited to 16 bits, we'd have a code-size
limitation of some 4K ops per code-segment (64KB at the current
~128bit average op size).
o allowing p-code size to be an issue dramatically increases the number
of op-codes early on in development (write this up as a bug-enhancement
instead?). Namely, we'd have inc16_i8_ic8, inc16_i8_ic16, inc16_i8_ic32,
add16_i8_i8_ic16, add16_i8_i8_ic32.
o a larger number of op-codes means more c-code, which means a trade-off
between the D-cache (for byte-code) and the I-cache (for the extra
c-code).
Pros:
o dramatically compressed op-code size (from 128bits on average down to
32, 48, or 64). Saves both disk space and cache space - tighter
inner loops.
o Potentially more highly optimized c-code; 16bit adds are somewhat
faster on some architectures - do what's needed when it's needed. 64bit
archs will upcast everything anyway.
o If we eventually determined a max code-length (of, say, 64bits after
alignment), then we could just make all ops that big. This would
dramatically reduce c-code overhead (no offset = DEFAULT_SIZE; offset +=
rel_address; return offset; .. code = offset;). It would additionally
reduce the risk of jumping into the middle of an op-code. Heck, we could
do this now: simply profile all op-codes to determine the max code-size,
then pad all op-codes to that size. Given that we're not into
dynamic op-codes, and most everything is being pushed into the constants
area, I don't see much danger in it.
Food for thought..
-Michael