On Sun, 14 Aug 2011, Philippe Marschall wrote:
On 14.08.2011 22:00, Levente Uzonyi wrote:
On Sun, 14 Aug 2011, Philippe Marschall wrote:
Hi
In Seaside we get a lot of performance gains out of
primitiveFindFirstInString. One thing that always annoyed me a bit is
that it's not as optimized as it could be.
The inclusionMap you give it is 256 consecutive boolean values (0 or
1). There is no need for this to be a 256-element ByteArray when each
element can only be 0 or 1. We could just as well make it a 32-element
ByteArray with each byte holding eight one-bit values. Instead of using
the asciiValue to index directly into the inclusionMap, we would use the
top five bits to index into the inclusionMap and the bottom three bits
to "index into the byte".
Did that make any sense?
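
To make the proposed layout concrete, here is a minimal C sketch of both
lookups. It is not the actual primitiveFindFirstInString code: all
function names are hypothetical, the bit order within a byte (least
significant bit first) is an assumption, and the helper returns a 0-based
index with -1 for "not found", which may differ from the primitive's own
convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Current layout: one byte (0 or 1) per character code, 256 bytes total. */
static void byte_map_add(uint8_t map[256], uint8_t c) { map[c] = 1; }
static int byte_map_includes(const uint8_t map[256], uint8_t c) { return map[c] != 0; }

/* Proposed layout: one bit per character code, 32 bytes total.
 * The top five bits of c select the byte, the bottom three select the bit
 * (assumption: bit 0 is the least significant bit). */
static void bit_map_add(uint8_t map[32], uint8_t c) {
    map[c >> 3] |= (uint8_t)(1u << (c & 7));
}
static int bit_map_includes(const uint8_t map[32], uint8_t c) {
    return (map[c >> 3] >> (c & 7)) & 1;
}

/* Hypothetical scan over the bit-packed map: index of the first byte of
 * s[0..len) that is in the map, or -1 if there is none. */
static ptrdiff_t find_first_in_string_bits(const uint8_t *s, size_t len,
                                           const uint8_t map[32]) {
    for (size_t i = 0; i < len; i++)
        if (bit_map_includes(map, s[i])) return (ptrdiff_t)i;
    return -1;
}

/* Self-check: both layouts must agree for every character code. */
static void check_layouts_agree(const uint8_t bytes[256], const uint8_t bits[32]) {
    for (int c = 0; c < 256; c++)
        assert(byte_map_includes(bytes, (uint8_t)c) ==
               bit_map_includes(bits, (uint8_t)c));
}
```

A caller would fill both maps with the same characters (say '<' and '&'),
run check_layouts_agree on them, and then scan with either layout.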
Do you want to save space?
I'm trying to trade memory access (which is slow) for a bit shift and
two bit ands (which is fast).
256 bytes + object header easily fit into the L1 cache of any recent x86
CPU; even the old P4s had 8 kB of it. Moving fewer bytes around may make
some difference, but if you create your ByteArrays right before you use
them, then it's pretty likely that your object is already in the
cache.
Are you storing lots of inclusion maps (maybe
CharacterSets)? If not, then IMHO it's not worth adding this feature
to this primitive, because runtime performance will be worse on both the
image side and the VM side.
Why?
Filling a compact 32-element ByteArray in the image takes more time than
filling a 256-element ByteArray. On the VM side, memory access + shifts
+ bitands cost more than just memory access. But (as usual) a benchmark
can tell whether it's really worth doing or not.
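
As a rough idea of what such a benchmark could look like on the VM side,
here is a minimal C sketch that times repeated scans with each map
layout. Everything in it is an assumption made for illustration: the
function names are hypothetical, it is plain C rather than the real
primitive, and it uses clock() from the C standard library, which is
coarse. Any numbers it produces depend on the compiler, the flags and the
CPU, so it only shows the shape of the measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Scan with the 256-byte map: one load per character, no bit arithmetic. */
static ptrdiff_t scan_byte_map(const uint8_t *s, size_t len, const uint8_t map[256]) {
    for (size_t i = 0; i < len; i++)
        if (map[s[i]]) return (ptrdiff_t)i;
    return -1;
}

/* Scan with the 32-byte map: one load plus shifts and masks per character. */
static ptrdiff_t scan_bit_map(const uint8_t *s, size_t len, const uint8_t map[32]) {
    for (size_t i = 0; i < len; i++)
        if ((map[s[i] >> 3] >> (s[i] & 7)) & 1) return (ptrdiff_t)i;
    return -1;
}

/* Both scanners share this signature, so one timer can drive either. */
typedef ptrdiff_t (*scan_fn)(const uint8_t *, size_t, const uint8_t *);

/* CPU seconds spent on `rounds` scans of the same string. The volatile
 * sink keeps the compiler from discarding the scan results. */
static double time_scan(scan_fn scan, const uint8_t *s, size_t len,
                        const uint8_t *map, int rounds) {
    assert(rounds > 0);
    volatile ptrdiff_t sink = 0;
    clock_t start = clock();
    for (int r = 0; r < rounds; r++)
        sink += scan(s, len, map);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

To compare the two layouts, call time_scan once with scan_byte_map and a
256-byte map and once with scan_bit_map and the equivalent 32-byte map,
using the same input string and round count. Timing the map construction
separately would cover the image-side cost mentioned above.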
Levente
Cheers
Philippe