Re: hash tables

Glynn Clements Sat, 14 Nov 1998 03:23:23 -0500

Leon Breedt wrote:

> i've been hearing a lot about hash tables used in conjunction with linked
> lists...can anyone give me a good description of how one is implemented?
> i know the reason for them (speed).

A hash table is just a fixed-size array, where the index is calculated
using a hash of the data which is to be stored. The hash function
typicall depends upon the type of data to be stored, but the general
principle is that all hash values should be equally likely.

As it is possible for two data to have the same hash, you need a way
to deal with collisions.

One option is to have a rehash function. If you find that the slot is
already used, you use the rehash function to pick a different slot. If
that is used, then you rehash again, etc. This approach works best if
you can set an upper bound on the number of items in the table. You
generally need the array to be about 50% larger than than the total
number of items. If the table becomes too full, the number of
collisions increases, and access slows down.

The other option is to make each entry a linked list. This way, you
never need to rehash, and there is no limit on the number of items. 
The mean access time increases in proportion to to the ratio of the
number of items to the table size.

-- 
Glynn Clements <[EMAIL PROTECTED]>
Re: hash tables

Reply via email to