That's why high-performance, general solutions to this type of search problem are hard to come by.
It sounds like the original proposal was to match an input record, with a small number of fixed-length fields, against a table with the same layout and an unknown number of entries. The request was for the minimum number of instructions, but we assume that means the minimum number executed on average rather than the minimum number coded, and really not the minimum number of instructions at all, but the minimum execution time.

We do not know how many input records will be processed before the table has to be refreshed. If only one or a few, then a sequential search is probably best, because the cost of the sort would overwhelm the benefit of the binary search. If many (particularly more records than rows in the table), then the sort is worth it. If there are only a few entries in the table (fewer than, say, 20), a sequential search is probably best. But since the "*"s indicating a wildcard are at fixed locations in the table, a CLI should perform better than TR or TRT.

Gary Weinhold
Senior Application Architect
DATAKINETICS | Data Performance & Optimization

On 2017-06-19 06:09, Martin Ward wrote:

On 19/06/17 00:32, Richard Kuebbing wrote:
Is a binary search ever appropriate on machines like the Z where the penalty for not being in the cache is so high?
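As a rough illustration of the lookup being discussed, here is a minimal Python sketch, assuming fixed-length keys where a "*" in a table entry acts as a single-character wildcard at that position; the field layout and function names are hypothetical, not from the original thread:

```python
# Hypothetical sketch of the proposed lookup: fixed-length keys, with "*"
# in a table entry matching any character at that position (analogous to
# a byte-at-a-time compare such as CLI at the known wildcard offsets).

def matches(entry, record):
    """True if record matches entry, treating '*' in entry as a wildcard."""
    return len(entry) == len(record) and all(
        e == "*" or e == r for e, r in zip(entry, record)
    )

def sequential_lookup(table, record):
    """Linear scan; fine when the table is small or searched only a few times."""
    for i, entry in enumerate(table):
        if matches(entry, record):
            return i
    return -1

table = ["AB*D", "ABCD", "XY*Z"]
print(sequential_lookup(table, "ABXD"))  # matches "AB*D" at index 0
```

A binary search over a sorted table would need the wildcard positions handled carefully (entries differing only at a "*" do not sort usefully), which is one reason the sequential scan stays attractive for small tables.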
Searching a 1 GB table (say, one million 1 kB entries) via sequential search would require loading 500 MB of data into the cache on average. Using binary search, only about 20 blocks of memory need to be loaded into the cache.

As processors get faster and memory gets larger, our programs can handle much larger data sets, so the choice of the right algorithm becomes *more* important, not less! The ratio between O(n) and O(log n), or between O(n^2) and O(n log n), increases as n gets larger.

For O(n) versus O(log n) (e.g. sequential search vs binary search), non-cached memory speed has not increased in line with memory size: a fast modern mainframe takes *longer* to sequentially search its entire memory space than an old, slow mainframe did. For O(n^2) versus O(n log n) (e.g. insertion sort vs quicksort), the difference grows larger even if memory speed keeps pace with increasing memory size.
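The arithmetic behind the 500 MB vs 20-block comparison can be checked with a short sketch; the helper names are mine, but the numbers follow the 1 GB / 1 kB example above:

```python
# Count how many table entries each strategy must examine, and hence how
# much data it drags into the cache, for the 1 GB table in the example.
from math import ceil, log2

N = 1_000_000        # one million entries
ENTRY_BYTES = 1_000  # ~1 kB each

def sequential_touches(n):
    """Average entries examined by a linear scan (hit at a random position)."""
    return n // 2

def binary_touches(n):
    """Worst-case entries examined by binary search of a sorted table."""
    return ceil(log2(n))

seq = sequential_touches(N)   # 500,000 entries -> ~500 MB of cache traffic
bin_ = binary_touches(N)      # 20 entries      -> ~20 kB of cache traffic
print(seq, bin_)              # 500000 20
```

The 25,000-to-1 ratio in entries touched is exactly the O(n) vs O(log n) gap, and it widens as n grows, which is Martin's point about larger memories making algorithm choice more important, not less.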
