We recently had to solve that problem -- again. Our table was 18 million entries.
In the past I used simple hashing: adding the bytes of the "key" together and using the rightmost byte as index to a table of 256 string heads. the most recent occasion had worst case scenario data -- only 10 hashes were formed. The quick fix was to add the 1st, 3rd, 5th, ... bytes to the accumulator, but add the 2nd, 4th, 6th, ... bytes shifted left by 4 bits. Got the worst case up to 100+ hashes which seems to get us out of the woods. If there is any chance the table might get paged (we did when I first worked on this here), allocate new entries a page at a time. Make the free element table also have 256 entries. This gives storage isolation for another performance boost. We are still looking at the most recent worst case (it is part of a 100# in a 5# bag problem). When I have time I am going to increase the table to 1024 enties (one page) to see if there is any benefit. If it seems there is I will use a three step hash w/ 3 bit offset. Note our emergency solution was to partition the whole problem into 9 segments and then get each segment into 6 parallel processing runs. We are trying to get a batch year-end process to finish before the end of the 1st quarter. IBM Mainframe Discussion List <[email protected]> wrote on 01/10/2008 09:38:42 PM: > Has anyone every seen any doc on using radix partition trees? I'm > thinking it may have been one of the "rainbow" books. > I vaguely remember data tree structures and I've got a table search > problem that might be the perfect application for a tree-structured data > repository. The table might have up to 1,000,000 entries, all in > storage, and a balanced n-ary tree has GOT to be faster than using a > binary search. The nature of the data is such that a plain old-fashioned > list, in sorted order, isn't real amenable to a binary search, either. ----------------------------------------- The information contained in this communication (including any attachments hereto) is confidential and is intended solely for the personal and confidential use of the individual or entity to whom it is addressed. The information may also constitute a legally privileged confidential communication. If the reader of this message is not the intended recipient or an agent responsible for delivering it to the intended recipient, you are hereby notified that you have received this communication in error and that any review, dissemination, copying, or unauthorized use of this information, or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message. Thank you ---------------------------------------------------------------------- For IBM-MAIN subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO Search the archives at http://bama.ua.edu/archives/ibm-main.html

