I've dealt with similar problems, though usually with threaded lists.

The solution was to make a table of addresses of sub-tables.  The 
particular sub-table was chosen by adding the bytes of the table entry -- 
not optimal with respect to the number of entries but very fast.  E.g. the 
entry C'DOG ' has byte values x'C4D6C740' which add to X'2A1'.  With 256 
addresses in address table, use the x'A1'th table entry at offset X'280'.

If you can guarantee a good distribution of values of any of the bytes in 
your table, use that byte w/o calculation.

That worked for me for a hyphenation dictionary.  With about 50k word 
fragments in the table, using the first three letters of the word fragment 
broke the table reasonable well into three level tables  E.g., the main 
table pointed to the A table which pointed to the AB table which pointed 
to the ABB list

IBM Mainframe Assembler List <[email protected]> wrote on 
08/23/2010 10:45:31 AM:

> From: Patrick Roehl <[email protected]>
> To: [email protected]
> Date: 08/23/2010 10:51 AM
> Subject: Efficient Memory List
> Sent by: IBM Mainframe Assembler List <[email protected]>
> 
> Iâ??m looking for advice on how to handle a potentially large list of 
data.
> The list is comprised of 4-byte entries and the application needs to 
know
> if an incoming item is already present or is new to the list.  This is 
the
> approach that is currently in use and that Iâ??d like to improve upon:
> 
> 1) Perform a binary search and process no further if the item is already
> present
> 
> 2) If there is not enough room to add a new entry, allocate a new 
storage
> area 1.5 times the size of the old area, MVCL the existing data to the 
new
> area, and free the old area.
> 
> 3) The binary search from step 1 indicates where the new entry should be
> inserted.  To add the entry to the list, individual entries are moved 
one
> at a time (to avoid overlapping moves) to open a spot in the list for 
the
> new entry.
> 
> This old process has worked well for fairly small lists but Iâ??d like
> opinions on how to improve this process for large lists (say, a million 
or
> more).
> 
> Using SORT is not an option because of the multi-threaded online
> environment (itâ??s running in CICS).
> 
> The list is only used by a single process that handles data as it
> arrives.  To process correctly, it must be able to determine immediately
> if the data being presented has already been processed.  When all of the
> incoming data for that process has been handled the list is discarded.
> 
> Speed and efficiency are important.  All suggestions regarding logic and
> coding techniques are appreciated!


-----------------------------------------
The information contained in this communication (including any
attachments hereto) is confidential and is intended solely for the
personal and confidential use of the individual or entity to whom
it is addressed. If the reader of this message is not the intended
recipient or an agent responsible for delivering it to the intended
recipient, you are hereby notified that you have received this
communication in error and that any review, dissemination, copying,
or unauthorized use of this information, or the taking of any
action in reliance on the contents of this information is strictly
prohibited. If you have received this communication in error,
please notify us immediately by e-mail, and delete the original
message. Thank you

Reply via email to