On Apr 19, 2013, at 17:31, Andi Kleen <a...@firstfloor.org> wrote:

> Later on I think it's better to either always use large hash tables
> (virtual memory is cheap) or to dynamically size them based on a
> estimate of the available types.
That logic doesn't really work for hash tables. Assuming the hash keys are close to random (as they should be), there is no locality of reference, so most or all of the hash table ends up in the working set: hash tables don't just use virtual memory, they use real memory.

A very sparsely populated hash table may end up wasting most of each VM page to store just a few hashed values. That is bad for locality, and bad for performance.

-Geert
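To put a rough number on that, here is a small sketch (my own, not from the thread) estimating how many VM pages of an oversized table end up resident. It assumes 4 KiB pages, 8-byte slots (e.g. one pointer each), and uniformly random hashing; the function name is made up for illustration.

```python
import math

PAGE_SIZE = 4096                          # bytes per VM page (typical)
SLOT_SIZE = 8                             # bytes per slot, e.g. one pointer
SLOTS_PER_PAGE = PAGE_SIZE // SLOT_SIZE   # 512 slots fit in one page

def expected_pages_touched(num_slots, num_entries):
    """Expected number of pages that hold at least one entry,
    assuming entries hash uniformly over the slots."""
    pages = num_slots // SLOTS_PER_PAGE
    # Probability that a given page receives none of the entries.
    p_empty = (1.0 - SLOTS_PER_PAGE / num_slots) ** num_entries
    return pages * (1.0 - p_empty)

# A 1M-slot (8 MiB) table holding only 4096 entries:
table_slots = 1 << 20
entries = 4096
touched = expected_pages_touched(table_slots, entries)
print(f"~{touched:.0f} of {table_slots // SLOTS_PER_PAGE} pages touched")
```

With these numbers, 4096 entries (only 32 KiB of actual slot data) are expected to dirty the large majority of the table's 2048 pages, each holding about two entries on average: the sparse table pays nearly the full real-memory cost of the dense one.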