This is a pretty ugly file! Here's what I see: 1) Modulo is way too big! 3 million groups for .9 million records; 1.6 GB physical space for 150 MB of data. Note the large number of empty groups in the 25% column at the bottom of the FILE.STAT report. Probably the modulo was pushed to TRY to make up for the really lousy hashing! More about this below in 2).
2) Lousy hashing distribution. Note 2.8 million empty and sparse groups in the 25% column; but at the same time 25,000 groups 200% + full. This isn't due to record size as the largest record is 2644 bytes. Note that the largest group has 7417 records - if all these were "average size" (Murphy says they aren't, though) that group would have 1.25 MB of data. Murphy also says that the most popular records live at the end of the largest group so there is your performance problem, quite likely -- tons of I/O required to get to the end of the large groups. What to do? Step 1 - See if another type will do a better job. Forget about HASH.HELP and forget about the key patterns documented for the various types -- yes, I know that type 18 "should" work best, but life isn't that simple. [AD] If you have FAST, use it. [/AD] If not, use HASH.AID to simulate the various types. In using HASH.AID I'd suggest picking a reasonable modulo, say around 200,001 or so. ** BIG NOTE ** This modulo choice is based on a separation of 4 which I'd recommend for a 2K data buffer -- if you want to stay with separation 1 use a modulo of 800,001 or so ** END BIG NOTE ** Before running HASH.AID clear the HASH.AID.FILE (CLEAR.FILE HASH.AID.FILE). Then use HASH.AID with your modulo and separation of choice and interate through all the available types -- syntax is HASH.AID SALES-HIST-BR1 and let it prompt you for the Type, Modulo and Separation; for Type enter "2,18,1" which is like FOR 1 TO 18, STEP 1. Don't bother reading the output, just enter "N" and let it scroll by. When it's all done use LIST HASH.AID.FILE to examine the results. Look for the type that yields the smallest "Largest Group" the fewest "Oversize Groups" and the closest together "Smallest Group" and "Largest Group". If one of the types does a lot better than type 18 give it a try and see if it does better. Note that one flaw with HASH.AID is that it doesn't report empty groups (alas!). If you find a better type it may solve or help your problem. If not, Step 2 - Read the very helpful post by Scott Ballinger in which he notes that large, complex record keys sometimes don't hash well and could cause the sort of problem you are seeing. If none of the other file types do better than type 18 I'm afraid this is what you are facing. Were the file isolated the fix would be to move any important information carried by the record key into one or more fields and replace the compound record keys with sequential numeric, which as Scott notes, often hash more reliably. However, if the file is heavily embedded in the application software this might not be a trivial change to make! Hope this helps! Let us know how it turns out or if other questions arise... Jeff Fitzgerald Fitzgerald & Long, Inc. www.fitzlong.com -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of roy Sent: Tuesday, October 16, 2007 2:14 PM To: [email protected] Subject: RE: [U2] Size of Key Question File name = SALES-HIST-BR1 File type = 18 Number of groups in file (modulo) = 3000017 Separation = 1 Number of records = 883026 Number of physical bytes = 1667799040 Number of data bytes = 150663032 Average number of records per group = 0.2943 Average number of bytes per group = 50.2207 Minimum number of records in a group = 0 Maximum number of records in a group = 7417 Average number of bytes per record = 170.6213 Minimum number of bytes in a record = 64 Maximum number of bytes in a record = 2644 Average number of fields per record = 25.6579 Minimum number of fields per record = 11 Maximum number of fields per record = 41 Groups 25% 50% 75% 100% 125% 150% 175% 200% full 2855826 50132 31541 14753 12862 5383 4611 24909 Press any key to continue... ------- u2-users mailing list [email protected] To unsubscribe please visit http://listserver.u2ug.org/
