In general, the main problem with large, compound keys is that they do not hash well; by "hash well" I mean that they do not hash to proximate groups the way sequential numeric keys do.
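[Editor's note: a tiny illustration of the locality point, in plain Python rather than anything U2-specific. The modulo, the key formats, and the stand-in hash below are all invented for the example and are not the actual U2 hashing algorithm; the point is only that consecutive numeric keys can land in consecutive groups, while long compound keys land wherever the hash throws them.]

    import hashlib

    MODULO = 101  # hypothetical file modulo (number of groups); not from the original post

    def group_for(key: str) -> int:
        """Map a key to a group number 0..MODULO-1 with a general-purpose hash.
        This is a stand-in, not the real U2 hashing algorithm."""
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % MODULO

    # Sequential numeric keys: under a simple modulo-style hash, ten records
    # created one after another land in ten adjacent groups.
    sequential = [str(n) for n in range(1000, 1010)]
    print([int(k) % MODULO for k in sequential])   # [91, 92, 93, ..., 100]

    # Long compound keys for the same ten records: a general-purpose hash
    # scatters them, so "yesterday's records" end up all over the file.
    compound = [f"10017*2024-01-15*{n}" for n in range(1000, 1010)]
    print([group_for(k) for k in compound])        # ten effectively random groups

Group numbers here stand in for physical position in the file: adjacent groups are what let the drive/controller/OS read-ahead described next actually pay off.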
There is read-ahead logic and RAM in your disk drive(s), read-ahead logic and RAM in your disk controller(s), and read-ahead logic in the O/S. None of this works very well when records are scattered randomly throughout the file. If I used sequential numeric keys and wanted all the records created yesterday, they would likely all be near each other on the physical disk; when I accessed the first one, the disk/controller/OS would have pre-fetched many of the day's other records as well. That makes for speedy access.

This is part of the reason I think long, compound keys are a PITA and are to be avoided. Simple numeric keys process faster because they hash better, and they are easier to type too. This is often the problem with "intelligent" keys: by embedding data in the key, you almost always make the key longer and the file hash poorly. IMO it makes far more sense to use simple numeric keys and create real attributes for the data you are tempted to build the key out of. I say this with 20/20 hindsight, having designed many systems with large files that use compound keys, every one of which I have come to regret.

Roy, you could prove this by writing a program that reads every record in your original file and writes it out to a new file (with the same modulo & sep as the original) using a simple incrementing counter as the key (a rough sketch of such a copy appears at the end of this message). I will bet that the new file performs better than your original one does, even though it will have more attributes (necessary to accommodate the data values that were embedded in the keys of the original file).

My 0.02,

/Scott Ballinger
Pareto Corporation
Edmonds, WA USA
206 713 6006

-------
u2-users mailing list
[email protected]
To unsubscribe please visit http://listserver.u2ug.org/
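[Editor's note: a rough sketch of the suggested test, in plain Python; the real thing would be a program run against a U2 file created with the same modulo and sep, and the dictionaries, key format, and sample values below are invented for illustration. Each record is rewritten under an incrementing counter, and the old compound key is kept as an ordinary attribute so the data embedded in it is not lost.]

    def copy_with_sequential_keys(original: dict[str, list[str]]) -> dict[int, list[str]]:
        """Copy every record to a new 'file', keyed by a simple incrementing counter.
        The old compound key is appended as a real attribute so its data survives."""
        new_file: dict[int, list[str]] = {}
        counter = 0
        for old_key, attributes in original.items():
            counter += 1
            new_file[counter] = attributes + [old_key]
        return new_file

    # Invented sample data: three records keyed CUSTOMER*DATE*SEQ.
    original = {
        "10017*2024-01-15*1": ["widgets", "12"],
        "10017*2024-01-15*2": ["gadgets", "3"],
        "20441*2024-01-16*1": ["widgets", "7"],
    }
    print(copy_with_sequential_keys(original))
    # {1: ['widgets', '12', '10017*2024-01-15*1'],
    #  2: ['gadgets', '3', '10017*2024-01-15*2'],
    #  3: ['widgets', '7', '20441*2024-01-16*1']}

Timing the same selections against both versions of the file is then a direct test of the claim that the sequentially keyed copy performs better.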
