I've found this simple blog post:
http://journal.thobe.org/2011/02/better-support-for-short-strings-in.html

It suggests the creation of a ShortArray struct, that like arrays is 
size_t.sizeof*2, that's usable for short arrays and short strings.

In 64 bit mode that's 16 bytes long, just like a dynamic array. The higher bit 
of the struct may denote a "short array". If it's 0, then the the second nibble 
of the first byte represents the length (even if it's an array of bytes its 
length can't be more than 15), otherwise the struct contains a length/ptr pair, 
like a normal dynamic array. This is able to remove many heap allocations 
(without any compression or tricks), because in programs a good percentage of 
strings are short.

Every time an item of the array is accessed, the struct opIndex/opIndexAssign 
has to test the first bit, this lowers the performance, but there are no 
pointers to follow in other parts of the memory, so the overall performance is 
probably not much different, and if the array is in RAM instead of caches, it 
may be even faster.

The ptr is not always a pointer, so the GC needs to be aware of this and do not 
follow the pointer if the first bit of the struct is 0. To manage this cleanly, 
it's useful a standard method like onGCScan() I have suggested time ago, to 
specify the runtime semantics for the garbage collector.

Bye,
bearophile

Reply via email to