Long lists accessed via a binary search at one time might cause paging problems. At one time IBM recommended using a linear search over a binary search to overcome paging difficulties. Today's experience is that paging seldom occurs.
Today the problem is taking advantage of caching as a table increases in size. The cost of fetching from memory as opposed from cache can wipe out any gain from the binary search. With a cache line of 32 bytes, the cost of doing a linear search of a table with entrylength=4 and entrycount=8 is very low -- almost the same as doing the first compare in a binary search. By dividing the list into parts and indexing into the parts by either a fast computed hash as in my prior message or directly using part of the "key", the original speed can be restored. This is an n-dimensional analogue to a binary tree. While I have used the technique for list entries much larger than length=4, the principles are the same. IBM Mainframe Assembler List <[email protected]> wrote on 08/23/2010 12:24:11 PM: > From: Patrick Roehl <[email protected]> > To: [email protected] > Date: 08/23/2010 01:14 PM > Subject: Re: Efficient Memory List > Sent by: IBM Mainframe Assembler List <[email protected]> > > The entries are short (4 bytes) and the key is the entire entry. My > understanding is that in this situation a hash table has no benefit. Is > this correct? > > I should emphasize that the entries coming in are random in nature and all > that is needed to be determined is if a new 4-byte entry coming in is > already present or is unique to the list. This is because processing is > performed on new entries (the entries are the key for that processing), but > that processing should not be duplicated. Once the input stream has been > exhausted the list of entries can be discarded. > > In most situations the list size would be less than a few hundred entries. > However, there is no specified upper limit (other than running out of > memory) and the size could potentially be a million entire or more. > > -----Original Message----- > From: IBM Mainframe Assembler List [ mailto:[email protected]] > On Behalf Of Paul Gilmartin > Sent: Monday, August 23, 2010 11:56 AM > To: [email protected] > Subject: Re: Efficient Memory List > > On Aug 23, 2010, at 08:45, Patrick Roehl wrote: > > > > > > 3) The binary search from step 1 indicates where the new entry should be > > inserted. To add the entry to the list, individual entries are moved one > > at a time (to avoid overlapping moves) to open a spot in the list for the > > new entry. > > > At least, keep the active entries at the top of the storage > area and MVCL downward to avoid overlapping moves while > avoiding the overhead of one-at-a-time. > > Would name/token services work for you? What's its > storage overhead? At some large number of entries it > will outperform moving the list for each new entry. > At some larger number it becomes an abuse of system > resources. > > Hash sounds good. > > -- gil ----------------------------------------- The information contained in this communication (including any attachments hereto) is confidential and is intended solely for the personal and confidential use of the individual or entity to whom it is addressed. If the reader of this message is not the intended recipient or an agent responsible for delivering it to the intended recipient, you are hereby notified that you have received this communication in error and that any review, dissemination, copying, or unauthorized use of this information, or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message. Thank you
