On Sun, 28 Nov 2010 21:34:29 +0000 David Laight <[email protected]> wrote:
> On Sun, Nov 28, 2010 at 06:27:07PM +0100, Manuel Bouyer wrote: > > > Wouldn't a hash table work? > > > > I think it's too dependant on uid distribution, or even how much uid you > > have. a tree scales and adapts better. > > I agree, hash tables often degenerate into a linear search. > This means that it is still linear, just 'n' times faster that a > simple linear search. I don't think this is true. With a hash table that uses separate chaining, you might end up with a small number of links for some buckets, but the number of links will be very small. Hash tables are superior to trees when you know the size of your data set in advance, which means you don't need to resize hash table too often. I've had some issues with a hash table that was power of 2 in size, and most of the keys were multiples of 4, or 8 integers, for example. However making hash table size a prime number and using a good hash function, results in very good key distribution. This works especially well when hashing simple intergers, e.g. file descriptors, etc. Can you give a concrete example where you think hash table would be suboptimal?
