Sorry, I don't have time to really dive into this. A lot of the documentation covers things that work most of the time, but sizing dynamic files is more art than science sometimes.
Just looking at your stats, though, KEYDATA might be an option as well. You have a pretty high ratio of bytes per record to a 12-character key, and will you ever get enough keys into a group to cause a split? Maybe. But KEYDATA uses both the data and the keys to figure out when to split.

The only problem I have had using KEYDATA is with meaningful keys that have separators like '#' in them. The hashing gets goofed up and I can't get good distribution into groups. I get splitting, but I end up with groups that never get anything in them. So I will have one group with 11 records and then two with zero bytes:

   4       0    0
   5     795    1 >
   6       0    0
   7    4888    9 >>>>>>>>>
   8       0    0
   9     669    1 >
  10       0    0
  11    5987   11 >>>>>>>>>>>
  12       0    0
  13       0    0
  14       0    0
  15    5795   11 >>>>>>>>>>>
  16       0    0
  17    1113    2 >>
  18       0    0
  19    9842   16 >>>>>>>>>>>>>>>>

I have one file that has nice numeric keys, and KEYDATA works great:

-rwxrwxr-x 1 root    mcc 2000000000 Jun 17 17:40 dat001
-rwxrwxr-x 1 root    mcc 2000000000 Jun 17 17:40 dat002
-rwxrwxr-x 1 root    mcc 2000000000 Jun 17 17:40 dat003
-rwxrwxr-x 1 root    mcc 2000000000 Jun 17 17:10 dat004
-rwxrwxr-x 1 root    mcc 2000000000 Jun 17 17:40 dat005
-rwxrwxr-x 1 5126    mcc 2000000000 Jun 28 09:02 dat006
-rwxrwxr-x 1 30012   mcc 2000000000 Jun 17 17:46 dat007
-rwxrwxr-x 1 9421    mcc 2000000000 Jun 17 17:46 dat008
-rwxrwxr-x 1 30334   mcc 2000000000 Jun 28 09:02 dat009
-rwxrwxr-x 1 30334   mcc 2000000000 Jun 17 16:16 dat010
-rw-rw-r-- 1 9319    mcc 2000000000 Jun 17 17:40 dat011
-rwxrwxr-x 1 30334   mcc 2000000000 Jun 17 17:40 dat012
-rwxrwxr-x 1 30334   mcc 2000000000 Jun 17 17:40 dat013
-rwxrwxr-x 1 30334   mcc 2000000000 Jun 17 17:40 dat014
-rwxrwxr-x 1 30334   mcc  941259776 Jun 28 09:02 dat015
-rwxrwxr-x 1 root    mcc 1999994880 Jun 25 12:57 idx001
-rwxrwxr-x 1 root    mcc 1999994880 Jun 17 17:10 idx002
-rwxrwxr-x 1 root    mcc 1999994880 Jun 17 17:40 idx003
-rwxrwxr-x 1 root    mcc 1999994880 Jun 25 12:57 idx004
-rwxrwxr-x 1 root    mcc 1999994880 Jun 17 17:30 idx005
-rwxrwxr-x 1 root    mcc 1999994880 Jun 17 17:24 idx006
-rwxrwxr-x 1 c00655  mcc 1999994880 Jun 17 17:40 idx007
-rwxrwxr-x 1 8575    mcc 1999994880 Jun 17 16:16 idx008
-rw-rw-rw- 1 udtcron mcc 1999994880 Jun 17 17:46 idx009
-rw-rw-rw- 1 30334   mcc 1999994880 Jun 25 12:57 idx010
-rw-rw-rw- 1 30334   mcc 1999994880 Jun 25 12:57 idx011
-rw-rw-rw- 1 udtcron mcc 1765761024 Jun 26 15:17 idx012
-rwxrwxr-x 1 root    mcc  954723328 Jun 28 09:02 over001

I haven't had to do anything to this file for a couple of years.
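For what it's worth, here is a rough back-of-envelope on your numbers, taken from the FILE.STAT quoted below. UniData's actual split/merge arithmetic isn't published, so this just assumes the load is roughly the counted bytes divided by (modulo * block size); the script and the names in it are mine, purely for illustration:

# Rough load estimates for the OH file, from the FILE.STAT numbers below.
# Assumption: load ~= counted bytes / (modulo * block size), which is an
# approximation, not UniData's real formula.

MODULO      = 235_889         # Number of groups in file (modulo)
BLOCK_SIZE  = 4_096           # Block size
RECORDS     = 1_387_389       # Number of records
TOTAL_BYTES = 2_132_217_978   # Total number of bytes (keys + data)
AVG_ID      = 12.4            # Average number of bytes in record ID

capacity = MODULO * BLOCK_SIZE   # ~966 MB of primary group space

# KEYONLY: only key bytes count toward the load, so big records are invisible.
keyonly_pct = 100.0 * (RECORDS * AVG_ID) / capacity

# KEYDATA: key and data bytes both count, so record size drives splitting.
keydata_pct = 100.0 * TOTAL_BYTES / capacity

print(f"KEYONLY load : {keyonly_pct:5.1f}%")   # ~1.8%, never reaches split load 10
print(f"KEYDATA load : {keydata_pct:5.1f}%")   # ~220.7%, would have split long ago

# Modulo needed to hold the data at roughly one 4K block per group:
print(f"Modulo at ~100% load: {TOTAL_BYTES // BLOCK_SIZE:,}")   # ~520,561

With KEYONLY, only the roughly 17 MB of key bytes count against roughly 966 MB of primary group space, so the computed load sits around 1 to 2 percent and never reaches your split load of 10. That would explain both the "SPLIT LOAD of 1" you computed and the 234167 of 235889 groups sitting in level-one overflow. Count the data as well (KEYDATA) and the same file is over 200% loaded, which would have forced splits long ago.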
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Dave Laansma
Sent: Thursday, June 28, 2012 9:29 AM
To: [email protected]
Subject: [U2] Really trying to understand dynamic file sizing

I've only got a handful of dynamic files, but of course they're huge and have a big impact on our daily and monthly processing. I'd REALLY like to understand the tuning mechanisms for these files, specifically SPLIT/MERGE. The formulas that I got in previous responses just don't seem to make sense for one particular file.

So here's a FILE.STAT and ANALYZE.FILE of a file that I believe is in need of resizing and/or reconfiguring. I believe that if I can get some input on this file, I'll be able to apply that knowledge to my other files.

First, I understand quite clearly that the modulo of 235889 is about half of what it should be, at least for a block size of 4096.

Second, unless I'm doing something wrong, I computed my SPLIT LOAD to be 1, which just doesn't seem right.

I'd like to resize this file this weekend, and I know that if I do one thing incorrectly it could make my performance even worse. Any input would be greatly appreciated.

File name (Dynamic File)             = OH
Number of groups in file (modulo)    = 235889
Dynamic hashing, hash type           = 0
Split/Merge type                     = KEYONLY
Block size                           = 4096
File has 234167 groups in level one overflow.
Number of records                    = 1387389
Total number of bytes                = 2132217978
Average number of records per group  = 5.9
Standard deviation from average      = 1.6
Average number of bytes per group    = 9039.1
Standard deviation from average      = 9949.2
Average number of bytes in a record  = 1536.9
Average number of bytes in record ID = 12.4
Standard deviation from average      = 4009.7
Minimum number of bytes in a record  = 659
Maximum number of bytes in a record  = 2205579
Minimum number of fields in a record = 237
Maximum number of fields in a record = 414
Average number of fields per record  = 328.3
Standard deviation from average      = 32.0

Dynamic File name                    = OH
Number of groups in file (modulo)    = 235889
Minimum groups of file               = 235889
Hash type = 0, blocksize = 4096
Split load = 10, Merge load = 5
Split/Merge type = KEYONLY

Sincerely,

David Laansma
IT Manager
Hubbard Supply Co.
Direct: 810-342-7143
Office: 810-234-8681
Fax: 810-234-6142
www.hubbardsupply.com
"Delivering Products, Services and Innovative Solutions"

_______________________________________________
U2-Users mailing list
[email protected]
http://listserver.u2ug.org/mailman/listinfo/u2-users
