That's why high-performance general solutions for this type of search problem 
are hard to come by.

It sounds like the original proposal was to match an input record containing a 
small number of fixed-length fields against a table with the same layout and an 
unknown number of entries.  The request was for the minimum number of 
instructions, but we assume that means the minimum number executed on average 
rather than coded, and really not the minimum number of instructions at all, 
but the minimum execution time.

We do not know how many input records will be processed before the table has to 
be refreshed.  If only one or a few, a sequential search is probably best, 
because the cost of the sort would overwhelm the benefit of the binary search.  
If many (particularly more records than rows in the table), the sort is worth 
it.  And if the table itself has only a few entries, fewer than, say, 20, a 
sequential search is probably best regardless.
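To make that trade-off concrete, here is a minimal C sketch of the two 
approaches.  The eight-byte key, the Row layout, and the sample data are 
assumptions for illustration only, not the original poster's record format:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define KEY_LEN 8          /* assumed fixed field length */

    typedef struct {
        char key[KEY_LEN];     /* search field */
        int  payload;          /* stand-in for the rest of the row */
    } Row;

    /* Sequential search: no setup cost, O(n) probes per lookup. */
    static const Row *seq_find(const Row *tab, size_t n, const char *key)
    {
        for (size_t i = 0; i < n; i++)
            if (memcmp(tab[i].key, key, KEY_LEN) == 0)
                return &tab[i];
        return NULL;
    }

    static int row_cmp(const void *a, const void *b)
    {
        return memcmp(((const Row *)a)->key,
                      ((const Row *)b)->key, KEY_LEN);
    }

    /* Binary search: pay one O(n log n) sort, then O(log n) per lookup. */
    static const Row *bin_find(const Row *tab, size_t n, const char *key)
    {
        Row probe;
        memset(&probe, 0, sizeof probe);
        memcpy(probe.key, key, KEY_LEN);
        return bsearch(&probe, tab, n, sizeof(Row), row_cmp);
    }

    int main(void)
    {
        Row tab[] = { {"BRAVO", 2}, {"ALPHA", 1} };   /* keys zero-padded */
        size_t n = sizeof tab / sizeof tab[0];
        char want[KEY_LEN] = "ALPHA";                 /* zero-padded key  */

        printf("seq found: %d\n", seq_find(tab, n, want)->payload);

        qsort(tab, n, sizeof(Row), row_cmp);          /* one-time cost */
        printf("bin found: %d\n", bin_find(tab, n, want)->payload);
        return 0;
    }

Ignoring constants, k lookups cost roughly k*n/2 compares sequentially, versus 
n*log n for the sort plus k*log n for the searches, so the sort starts to pay 
for itself after a surprisingly small number of lookups on a large table; 
"more records than rows" is a safely conservative threshold.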

But since the "*"s that indicate a wildcard are at fixed locations in the 
table, a CLI should give better performance than TR or TRT.
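
A CLI compares one immediate byte at a known offset, so with the wildcard 
positions fixed the match can be coded as a short chain of CLI/BNE pairs that 
never examines the wildcard bytes at all.  A rough C equivalent of the idea 
(the record length and the "*" convention are assumed for illustration):

    #include <stdio.h>

    #define REC_LEN 8          /* assumed record length */

    /* Compare a record against a table entry byte by byte, treating
     * '*' in the entry as "match any byte".  Because the wildcard
     * offsets are fixed, an HLASM version would simply omit the CLI
     * for those offsets rather than test for '*' at run time.      */
    static int wild_match(const char *rec, const char *entry)
    {
        for (int i = 0; i < REC_LEN; i++) {
            if (entry[i] == '*')
                continue;      /* fixed wildcard position: skip  */
            if (rec[i] != entry[i])
                return 0;      /* mismatch: one CLI plus a BNE   */
        }
        return 1;
    }

    int main(void)
    {
        /* wildcards in bytes 4-5 of the entry */
        printf("%d\n", wild_match("AB12XY99", "AB12**99"));   /* 1 */
        printf("%d\n", wild_match("AB12XY00", "AB12**99"));   /* 0 */
        return 0;
    }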

Gary Weinhold
Senior Application Architect

DATAKINETICS | Data Performance & Optimization

Phone:  +1.613.523.5500 x216
Email:  [email protected]


Visit us online at www.DKL.com




__________
On 2017-06-19 06:09, Martin Ward wrote:
On 19/06/17 00:32, Richard Kuebbing wrote:
Is a binary search ever appropriate on machines like the Z where the
penalty for not being in the cache is so high?

Searching a 1GB table (say, 1 million 1kB entries) via sequential
search would require loading 500MB of data into the cache on average.

Using binary search, only 20 blocks of memory need to be loaded
into the cache.
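
That figure is easy to verify: a worst-case binary search over a million 
entries halves the range twenty times, touching one block per probe.  A quick 
C check (nothing z-specific):

    #include <stdio.h>

    int main(void)
    {
        /* Count worst-case probes of a binary search over n entries;
         * each probe fetches one (likely uncached) block of the table. */
        unsigned long n = 1000000UL, lo = 0, hi = n;
        int probes = 0;
        while (lo < hi) {
            unsigned long mid = lo + (hi - lo) / 2;
            probes++;          /* one probe = one block loaded    */
            hi = mid;          /* worst case: always descend left */
        }
        printf("%d probes for n = %lu\n", probes, n);   /* prints 20 */
        return 0;
    }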

As processors get faster and memory gets larger, our programs can handle
much larger data sets, so the choice of the right algorithm becomes
*more* important, not less! The ratio between O(n) and O(log n),
or between O(n^2) and O(n log n) increases as n gets larger.
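
For a sense of scale, both of those ratios simplify to n/log n; a few lines of 
C (base-2 logs, constants ignored) show how quickly that grows:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        /* O(n)/O(log n) and O(n^2)/O(n log n) both reduce to n/log2(n). */
        for (double n = 1e3; n <= 1e9; n *= 1000.0)
            printf("n = %.0e  ->  n/log2(n) ~ %.0f\n", n, n / log2(n));
        return 0;
    }

At n = 10^6 the ratio is already about 50,000; at n = 10^9 it passes 33 
million.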

For O(n) versus O(log n) (e.g. sequential search vs binary search)
non-cached memory speed has not increased in line with memory size:
so a fast modern mainframe takes *longer* to sequentially search
its entire memory space compared to an old, slow mainframe.

For O(n^2) versus O(n log n) (e.g. insertion sort vs quicksort)
the differences become larger even if memory speed keeps pace
with increasing memory size.
