On Mon, Nov 12, 2012 at 09:58:14PM +0400, Dmitry Olshansky wrote: > 11/12/2012 9:44 PM, H. S. Teoh пишет: [...] > >For storing integer types, you could optimize for space by dividing > >your key space into blocks of, say, 32 entries each. Then your hash > >function will hash only the upper (n-5) bits of the key, and each > >hash slot will be a bit array of 32 bits, which are indexed by the > >lower 5 bits of the key. Assuming your keys are relatively densely > >populated, this will save quite a significant amount of space. In the > >sparse case, you probably don't waste that much space either, since > >each slot is only one 32-bit int wide. > > Problem is that keys will collide so each slot will have to have upper > bits of key *and* the small set. The bucket itself is still inside > intrusive linked list.
True. > Otherwise the idea is great. About 3 words of overhead per word-sized > set. > > > > >You can probably also increase the size of the slots to, say, 64 bits > >or more, if you want to maximize the usage to GC allocation block > >overhead ratio. Yes, each slot is only 1 int wide, but the GC does > >have to maintain bookkeeping information that's probably a lot more > >than that. > > 16 bytes. Though you'd have to know how bucket entry looks like > which is impossible with built-in hash map sadly. [...] Actually, the built-in AA's bucket struct is duplicated in object_.d; it's what the range-based API uses to walk the AA. (Yeah it's extremely ugly. It must be cleaned up.) T -- There is no gravity. The earth sucks.
