Re: hashtables

Thor Lancelot Simon Tue, 24 Jul 2018 18:09:01 -0700

On Tue, Jul 24, 2018 at 10:14:57PM +0000, co...@sdf.org wrote:
> 
> I don't have any opinions about hashtables - I just know that
> implementing one would take me really long and would probably still be
> buggy in the end, and I didn't find a similar API that claims to be a
> hashtable.
> 
> Is there some code I could use? I'm totally OK with not a hashtable too.


The basic approach used in our kernel is to declare an array of HEADs
for one of our queue.h datastructures (you usually only have to walk
in one direction, and hardly ever even that, so it's usually a LIST).

Each LIST is a "bucket".  The hash picks out the "bucket" (the LIST_HEAD)
and then you just insert at the head (if adding) or walk down the LIST
until you find the entry you're looking for (if searching).

As a hash function, if you've got things of a convenient size, you can
just use the % operator as your hash function (use an array of buckets
that's of a convenient prime size).  If what you have to hash is larger
there's a hash(9) API.

There are examples all over the kernel; the multicast and ifaddr hashes
in netinet/in.c are fairly simple, though they've had some locking added.
Maybe someone can suggest one even simpler.

This extends fairly naturally to other use cases -- you can put a
mutex at the head of each chain and make a datastructure that allows
parallel access to different buckets; you can sacrifice evenness of
the hash and use a power-of-2 hash size and shift/mask for the lookup,
which is faster most of the time but has a much worse worst case.

This kind of implementation doesn't lend itself to being expanded
so it's collision-free, but most else you might like to do is not too
hard.

Thor

Re: hashtables

Reply via email to