Doug,

On Thu, Jun 11, 2009 at 6:53 PM, Doug Lea <d...@cs.oswego.edu> wrote:
> * Lazily create all arrays

Done. (In the Scala version it was done from the very beginning.)

> * Separate the front and back of index array. (Note that
> you don't have any locality here to lose by doing so,
> since you jump to index h+hashlen.)

Initially these were 2 different arrays, 'firstIndex' and 'nextIndex',
but an extra array means one extra pointer in the HashMap object plus an
extra array object header, so overall 2 arrays = more memory. Also, the
contents of the second array are now tied to the elements - it is used
to store/check the deleted list, etc. - so making its allocation lazy is
not trivial.

> * Use default initial capacity of 4,

Done already.

> And further, the performance differences are very small across
> these load factors in a few tests I ran. This suggests a
> different sort of tactic for helping with indirections and
> prefetches. Suppose that index[h] was usually just h; that is,
> with the key in elements[2*h]. (This only works in current
> design if loadFactor==1, but can work with smaller values if
> elements array is sized at hashlen.) Under this scheme, you can
> optimistically read this element at the same time as the
> index.

I did a little test of this and saw no performance gain; I even posted a
link to the sources right after changing the subject from 'fast' to
'hybrid'. Do you have any idea why prefetching of the second array may
not happen?

> To make this work, you'd need a bitmap of available slots
> rather than a freelist, adding C/32 space overhead, plus some
> heuristic juggling around inside put and resize. It might be
> worth an experiment or two. Doing this requires yet further
> work/thought about how sizing varies with loadFactor
> arguments.

There is no need for an extra bitmap - it can be stored along with the
hash codes.

> The exact values depend of course on hash quality, insertion
> order, etc. But this also reflects fact that the collisions array
> (or back half of current index array) is typically less than 30%
> occupied. This suggests shrinking its default size and letting
> it independently resize, using the same sort of approach as
> with main hash: start out with say 25% size, by letting sets of
> 4 indices share an overflow slot, and expand as needed. Or
> maybe start with 50%, which will still be enough footprint savings
> to bother.

As we already discussed, this approach has a greater footprint, since we
always need to allocate 3*(hashtable size) words PLUS overflow. It would
only make sense if the prefetching trick with the same array index for
hash and key/value worked, but it did not in my experiment, and I don't
know why. Maybe I'll test it on different computers later, or you'll
help with some advice.

Alex
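
P.S. To make the single-index-array layout concrete, here is a rough
sketch of the idea (names like 'hashLen', 'index', 'keys' and 'values'
are illustrative, not the ones from the actual sources): the front half
of 'index' holds the bucket heads and the back half, starting at offset
hashLen, holds the collision links, so there is only one array object
and one pointer in the map.

    class CombinedIndexSketch {
        static final int EMPTY = -1;

        int hashLen = 16;                    // power-of-two table size
        int[] index = new int[2 * hashLen];  // front: bucket heads, back: next links
        Object[] keys = new Object[hashLen];
        Object[] values = new Object[hashLen];
        int size = 0;

        CombinedIndexSketch() {
            java.util.Arrays.fill(index, EMPTY);
        }

        Object get(Object key) {
            int h = key.hashCode() & (hashLen - 1);
            // head is in the front half, chain links in the back half
            for (int slot = index[h]; slot != EMPTY; slot = index[hashLen + slot]) {
                if (key.equals(keys[slot])) {
                    return values[slot];
                }
            }
            return null;
        }

        void put(Object key, Object value) {
            int h = key.hashCode() & (hashLen - 1);
            for (int slot = index[h]; slot != EMPTY; slot = index[hashLen + slot]) {
                if (key.equals(keys[slot])) {
                    values[slot] = value;    // overwrite existing mapping
                    return;
                }
            }
            // no resize or deleted-list handling here; the real map grows
            // and reuses slots, this only shows the index layout
            int slot = size++;
            keys[slot] = key;
            values[slot] = value;
            index[hashLen + slot] = index[h];  // prepend to the bucket's chain
            index[h] = slot;
        }
    }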
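
If I understand your optimistic-read idea correctly, the fast path would
look roughly like the method below (only a sketch: 'index', 'elements',
the negative empty marker and the method name are my assumptions, and
chain walking and resizing are left out). The speculative load of
elements[2*h] can be issued while index[h] is still being fetched, and
the extra indirection is only paid on the slow path.

    class OptimisticGetSketch {
        static Object get(Object key, int[] index, Object[] elements, int hashLen) {
            int h = key.hashCode() & (hashLen - 1);
            Object guessedKey = elements[2 * h];   // speculative read alongside index[h]
            int slot = index[h];
            if (slot == h && key.equals(guessedKey)) {
                return elements[2 * h + 1];        // fast path: entry sits at its hash slot
            }
            if (slot < 0) {
                return null;                       // empty bucket (negative marker assumed)
            }
            // slow path: entry was displaced, pay the indirection
            // (a full implementation would also walk the collision chain here)
            return key.equals(elements[2 * slot]) ? elements[2 * slot + 1] : null;
        }
    }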
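
And on "it can be stored along with the hash codes" - what I mean is
something like this (the bit layout and helper names are only for
illustration): if the cached hash codes already live in an int[], one
bit per entry can mark whether the slot is occupied, so no parallel
bitmap is needed.

    class HashOccupancySketch {
        static final int OCCUPIED_BIT = 0x80000000;  // top bit marks a live slot
        static final int HASH_MASK    = 0x7fffffff;  // remaining bits keep the cached hash

        static boolean isOccupied(int[] hashes, int slot) {
            return (hashes[slot] & OCCUPIED_BIT) != 0;
        }

        static void markOccupied(int[] hashes, int slot, int hash) {
            hashes[slot] = OCCUPIED_BIT | (hash & HASH_MASK);
        }

        static void markFree(int[] hashes, int slot) {
            hashes[slot] = 0;   // cleared top bit means the slot is free for reuse
        }
    }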