RE: Hash Tables / Indexes

Norman L. Smith Tue, 13 Nov 2001 05:55:20 -0800


> From: Karl J. Stubsjoen
> Hello,
>
> Can someone explain hash tables or hash indexes and if we can
> take advantage
> of them in MySQL?
>
> Thanks!
>
> Karl


A few words on a meaty subject...

Hashing is a search method based on an arithmetic or algorithmic
transformation of the key.

Hashing is a means to speed up table searching by calculating the position
for a record in the actual table (hash table) or in the index (hash index)
for a table based on the value of the key.  Hash tables or hash indexes are
best suited for applications where the number of records that will populate
a table is relatively small.  This technique is often used in compilers and
assemblers or in applications where the tables will completely fit into main
memory.  Most in-memory database applications (such as real-time data
collection) use hashing.  To understand hashing, assume f(key) = table
address.  Some function or algorithmic procedure can be applied to the key
that will yield the address of the record in the table.  You directly
calculate the address in the table instead of searching by comparing the
key.  A very simple example is to assume that the key is a unique number in
the range of 1 -> N.  The size of the table is (N x record size).  The
records are stored in the hash table at ((key * record size) - record size).
Given a key the record's location in the table can be easily calculated.
Real world applications are a bit more complicated.  The simple example only
works as an example. Real world keys are not always a simple number and if
the key is numeric its range most often exceeds the available storage.  The
solution is to use a hashing function that will create an address in the
range of the available addresses in the table and also provides a means to
handle collisions on insertions. Collisions are cases where an existing
record already occupies the calculated address.  In practice the hashing
function calculates the target address in the table.  A re-hash function if
needed recalculates addresses to handle collisions and for searching beyond
the first hashed address if that address does not hold the target of the
search.

If you are interested in the messy details I suggest you refer to one or all
of the following:

"The Art of Computer Progamming: Volume 3 Sorting and Searching", by Donald
E. Knuth
"Making Hash With Tables" by Terry Dollhoff, Programming Techniques: Program
Design
"Full Table Quadratic Searching for Scatter Storage" by Colin A. Day
"Weighted Increment Linear Search for Scatter Tables" by F. Luccio
"Scatter Storage Techniques" by Robert Morris, Communication of the ACM
January 1968
"Hashing Methods for Direct Access Files" by Terry Dollhoff, Auerbach
Computer Programming Management, Folio 15-02-02

Hashing is an old technique, thus most current database management systems
use more modern methods.  Before relational database theory was adapted,
network model databases often used hashing for indexing.  Hashing is still
used in those special cases where it is the best solution.

See "7.6.11.3 Adaptive hash indexes" in the MySQL Manual.

I don't think that you can directly take advantage of hashing in MySQL. With
Innodb tables MySQL uses hashing where it is useful.

I hope this is helpful to you.

Regards,

Norman L. Smith



---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

RE: Hash Tables / Indexes

Reply via email to