As I wrote, there is the issue of dynamically balancing the tree as random
insertions come in.

On Tue, 27 Jul 2010 15:43:08 -0500 Hal Merritt <hmerr...@jackhenry.com> wrote:

:>Am I missing something? Why not use a binary chop search? That is, each 
compare cuts the universe in half.  For a million entries, you would need 20 or 
less compares. 
:>
:>http://en.wikipedia.org/wiki/Binary_search_algorithm
:>
:>
:>-----Original Message-----
:>From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu] On Behalf 
Of Binyamin Dissen
:>Sent: Tuesday, July 27, 2010 3:14 PM
:>To: IBM-MAIN@bama.ua.edu
:>Subject: Re: Hashing algorythm for text strings without embedded blanks
:>
:>On Tue, 27 Jul 2010 15:43:47 -0400 zMan <zedgarhoo...@gmail.com> wrote:
:>
:>:>OK, I have a splitting headache and may be dense anyway: how is it easier to
:>:>look it up once it's hashed? What am I missing?
:>
:>Not easier, faster.
:>
:>Assume 100000 strings. A serial search will take on average 50000 compares. A
:>mod64 hash should reduce it on average to under 800 compares, hash to slot and
:>run the alias chain.
:>
:>A b-tree would be better but then there is the issue of balancing it while
:>others are busy transversing it.
:>
:>:>On Tue, Jul 27, 2010 at 3:36 PM, Binyamin Dissen <bdis...@dissensoftware.com
:>:>> wrote:
:>
:>:>> On Tue, 27 Jul 2010 15:32:24 -0400 zMan <zedgarhoo...@gmail.com> wrote:
:>
:>:>> :>What's the goal of the hashing -- obfuscation? Shortening the data?
:>
:>:>> Lookup.
:>
:>:>> Does the entry exist, and if so what are its attributes. If does not 
exist,
:>:>> add.
:>
:>:>> :>On Tue, Jul 27, 2010 at 3:18 PM, Binyamin Dissen <
:>:>> bdis...@dissensoftware.com
:>:>> :>> wrote:
:>
:>:>> :>> Is there a preferred hashing algorithm for such strings? The strings
:>:>> will
:>:>> :>> be
:>:>> :>> fixed length with trailing blanks.
:>
:>:>> :>> I was thinking of simply doing a MOD64 (or higher) of the sum of the
:>:>> :>> non-blank
:>:>> :>> part of the string by words, but as this is text with a limited
:>:>> character
:>:>> :>> set,
:>:>> :>> would this lead to excessive alias chains? Has research been done in
:>:>> this
:>:>> :>> area? There will mostly be search activity but also add activity.

--
Binyamin Dissen <bdis...@dissensoftware.com>
http://www.dissensoftware.com

Director, Dissen Software, Bar & Grill - Israel


Should you use the mailblocks package and expect a response from me,
you should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems,
especially those from irresponsible companies.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html

Reply via email to