[email protected] wrote:
So my questions are:
In light of the extreme effect caching can have on performance, does it
make any sense to have any table, except a quite small one, be searched
using a binary search?
IMO, sequential search of tables is only appropriate for very small
tables of, say, 20 to 30 entries. For tables with more entries, other
search techniques should be considered: binary search, trees, or hashing.
I prefer self-balancing trees like AVL trees because, when loading the
table, the entries need not arrive in key sequence. Apart from that,
AVL trees have the same access time as binary search (search time
proportional to log2 n for a table of n entries, because log2 n is the
path length of the AVL tree), differing only by a constant factor.
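To make the "entries need not arrive in key sequence" point concrete, here is a minimal AVL sketch in C (my own illustration, not from the original thread; all names are made up). Keys are inserted in the worst possible order for a plain binary search tree, in ascending sequence, yet the rotations keep the search path at O(log2 n):

```c
#include <stdlib.h>

/* Minimal AVL tree sketch: insert in any order, search stays O(log n). */
typedef struct Node {
    int key;
    int height;                      /* height of subtree rooted here */
    struct Node *left, *right;
} Node;

static int height(const Node *n) { return n ? n->height : 0; }
static int maxi(int a, int b)    { return a > b ? a : b; }

static Node *new_node(int key)
{
    Node *n = malloc(sizeof *n);
    n->key = key; n->height = 1; n->left = n->right = NULL;
    return n;
}

static Node *rotate_right(Node *y)
{
    Node *x = y->left;
    y->left = x->right;
    x->right = y;
    y->height = 1 + maxi(height(y->left), height(y->right));
    x->height = 1 + maxi(height(x->left), height(x->right));
    return x;
}

static Node *rotate_left(Node *x)
{
    Node *y = x->right;
    x->right = y->left;
    y->left = x;
    x->height = 1 + maxi(height(x->left), height(x->right));
    y->height = 1 + maxi(height(y->left), height(y->right));
    return y;
}

static int balance(const Node *n)
{
    return n ? height(n->left) - height(n->right) : 0;
}

static Node *avl_insert(Node *n, int key)
{
    if (!n) return new_node(key);
    if (key < n->key)      n->left  = avl_insert(n->left, key);
    else if (key > n->key) n->right = avl_insert(n->right, key);
    else return n;                           /* duplicate key: ignore */

    n->height = 1 + maxi(height(n->left), height(n->right));
    int b = balance(n);
    if (b > 1 && key < n->left->key)         /* left-left case */
        return rotate_right(n);
    if (b > 1 && key > n->left->key) {       /* left-right case */
        n->left = rotate_left(n->left);
        return rotate_right(n);
    }
    if (b < -1 && key > n->right->key)       /* right-right case */
        return rotate_left(n);
    if (b < -1 && key < n->right->key) {     /* right-left case */
        n->right = rotate_right(n->right);
        return rotate_left(n);
    }
    return n;
}

static int avl_contains(const Node *n, int key)
{
    while (n) {
        if (key == n->key) return 1;
        n = key < n->key ? n->left : n->right;
    }
    return 0;
}
```

A plain (unbalanced) BST fed the same ascending keys would degenerate into a linked list; the AVL version keeps its height within roughly 1.44 * log2 n, which is the constant factor mentioned above.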
A sequential search of a table with a million entries needs 500,000
compares on average, but a binary or tree search needs at most 20.
Even caching makes no big difference there; you should see far more
CPU load on the sequential search.
Is there a rule of thumb on when tables should be split into "search
arguments" and "data" to speed up the search?
That only makes sense to me if the data is VERY large compared to
the keys.
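For the case where the data parts really are much larger than the keys, the split can look like this in C (a sketch with invented names; the layout, not the code, is the point). The search arguments live in their own dense, sorted array, so each cache line holds many keys, while the bulky records sit in a parallel array that is only touched after the hit:

```c
#include <stddef.h>

/* Hypothetical record with a large data part (e.g. 4 KiB). */
typedef struct { char payload[4096]; } Record;

/* Binary search over the dense key array only; the returned index
 * is valid for both keys[] and the parallel records[] array. */
static long find_key(const int *keys, long n, int key)
{
    long lo = 0, hi = n - 1;
    while (lo <= hi) {
        long mid = lo + (hi - lo) / 2;
        if (keys[mid] == key) return mid;
        if (keys[mid] < key) lo = mid + 1;
        else                 hi = mid - 1;
    }
    return -1;
}
```

Usage would be `long i = find_key(keys, n, k); if (i >= 0) use(&records[i]);`, with `keys[]` and `records[]` kept in step when the table is loaded. If the keys were embedded in the 4 KiB records instead, every probe of the search would drag a whole record through the cache.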
Are there any other experience-based rules for tables on modern
mainframes?
See above; when using AVL trees etc., storage management is important.
Kind regards
Bernd