I have used the technique of adding byte values for a hash count in two 
shops over two decades to cut the length of a threaded list.  Both times I 
used mod256 to good effect.

The first time the string was an eight character variable name, the second 
time a set of strings over 64 bytes long.  Both could contain spaces.

In the first case the set was in the low thousands and short/long ratio 
was less than 3/1.

In the second case, worst case set was above 1 million and the short long 
ratio was less than 1.5/1.  In this case, the performance hit is low 
enough (and probably the systems fast enough) that it does not pay to go 
mod512.  A million pages is still a technology limit for the kind of 
reports being indexed.

One thing that helped in both cases was allocating control blocks pointed 
to by the hash tables in sets of pages to improve locality of reference.

IBM Mainframe Discussion List <IBM-MAIN@bama.ua.edu> wrote on 07/27/2010 
03:18:49 PM:

> From: Binyamin Dissen <bdis...@dissensoftware.com>
> Subject: Hashing algorythm for text strings without embedded blanks
> 
> Is there a preferred hashing algorithm for such strings? The strings 
will be
> fixed length with trailing blanks.
> 
> I was thinking of simply doing a MOD64 (or higher) of the sum of 
thenon-blank
> part of the string by words, but as this is text with a limited 
character set,
> would this lead to excessive alias chains? Has research been done in 
this
> area? There will mostly be search activity but also add activity.


-----------------------------------------
The information contained in this communication (including any
attachments hereto) is confidential and is intended solely for the
personal and confidential use of the individual or entity to whom
it is addressed. If the reader of this message is not the intended
recipient or an agent responsible for delivering it to the intended
recipient, you are hereby notified that you have received this
communication in error and that any review, dissemination, copying,
or unauthorized use of this information, or the taking of any
action in reliance on the contents of this information is strictly
prohibited. If you have received this communication in error,
please notify us immediately by e-mail, and delete the original
message. Thank you 

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html

Reply via email to