RE: [U2] Size of key question
I think that the maximum number of records in a group of 18032 is really hurting the file performance. I would find that group, and see what keys are being hashed to it. How many other groups approach this number of records? How bad are the other partfiles in this regard? Something about your keys is causing a lot of them to hash to one group or a selected subset of groups. Also, how many groups out there have absolutely no records in them? You've got 3,000,000 groups but only 2,000,000 records - so you should have something in the neighborhood of 1,000,000 empty groups. If you have 2,500,000 empties - then your key values are not hashing well under type 18. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of roy Sent: Wednesday, October 17, 2007 9:52 AM To: u2-users@listserver.u2ug.org Subject: [U2] Size of key question _ From: Roy Beard [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 16, 2007 10:33 PM To: 'u2-users@listserver.u2ug.org' Subject: File key question Wow! I got so many ideas from this group that I thought I would give some more Information. This file is a distributed file in 2 parts The 3rd field (separated by *) determines which partfile with 1 going to the file I sent before and the rest going to this file There are no triggers on the file. The system was 'fast' until recently when tens of thousands of new records were added. I agree the file is poorly sized, it was an attempt to solve one problem but created another. 
--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
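The group-by-group analysis suggested above (find the overflowing group, count the empties) can be rehearsed outside the database once the keys are dumped to a flat file. A minimal sketch in Python, using a generic polynomial string hash as a stand-in for UniVerse's actual type-18 algorithm (the real group assignments would differ, so this only illustrates the bookkeeping):

```python
from collections import Counter

def stable_hash(key):
    """Deterministic stand-in hash; NOT UniVerse's type-18 algorithm."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

def group_histogram(keys, modulo):
    """Tally records per group and report the empties and the worst group."""
    counts = Counter(stable_hash(k) % modulo for k in keys)
    empty = modulo - len(counts)            # groups with no records at all
    grp, n = counts.most_common(1)[0]       # the fullest group
    return empty, grp, n

# A few sample keys from the thread; a real run would read the whole key dump.
keys = [
    "407*BAR*1*498*GLU*2274***SUMMARY",
    "491*BAR*1*498*GLU*2274***SUMMARY",
    "SUMMARY*BAR*1*498*GLU*2274***SUMMARY",
]
empty, grp, n = group_histogram(keys, 11)
print(empty, grp, n)
```

Run against the full 2-million-key dump with the file's real modulo, this kind of tally answers both questions at once: how many groups are empty, and which keys pile into the hot group.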
RE: [U2] Size of key question
Has this file always had 2 parts? Just wondering, as I saw that if the other half of the file is around the same size, then the full file is well under 1 GB. Reducing the mod and bringing it into 1 part would eliminate the overhead of distributing data into 2 files (a call to your distribution routine on every read/write operation).

Ross Ferris
Stamina Software
Visage
Better by Design!

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:owner-u2-[EMAIL PROTECTED] On Behalf Of roy
Sent: Wednesday, 17 October 2007 11:52 PM
To: u2-users@listserver.u2ug.org
Subject: [U2] Size of key question
--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
Re: [U2] Size of Key Question
The previous responders certainly know a lot more about this stuff than I do, and they mentioned all of the things to try. However, if I had to do just ONE thing, I would certainly pick expanding the separation to 4. The way I read it, the average group contains about a third of a logical record, so expanding the separation should reduce the reads even if they are cached. You'd also, of course, have to choose a modulus that would allow enough overall space.

On 10/17/07, Ross Ferris [EMAIL PROTECTED] wrote:

Sounds like the file may be V E R Y poorly sized, as Jeff F suggested. Was this process fast previously? What has happened on the system around the time it started to get slow? Are there any triggers on the file?

Ross Ferris
Stamina Software
Visage
Better by Design!

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:owner-u2-[EMAIL PROTECTED] On Behalf Of roy
Sent: Wednesday, 17 October 2007 3:45 AM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question
-- john

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
[U2] Size of key question
_____
From: Roy Beard [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 10:33 PM
To: 'u2-users@listserver.u2ug.org'
Subject: File key question

Wow! I got so many ideas from this group that I thought I would give some more information.

This file is a distributed file in 2 parts. The 3rd field (separated by *) determines the partfile, with 1 going to the file I sent before and the rest going to this file. There are no triggers on the file. The system was 'fast' until recently, when tens of thousands of new records were added. I agree the file is poorly sized; it was an attempt to solve one problem but it created another.

The keys, I believe, are an issue. Here is a sample:

SUMMARY*PRO*1*SUMMARY*CDJ*9876***WS
407*BAR*1*498*GLU*2274***SUMMARY
491*BAR*1*498*GLU*2274***SUMMARY
SUMMARY*BAR*1*498*GLU*2274***SUMMARY
1896*BAR*1*498*GLU*2274***SUMMARY
460*BAR*1*498*GLU*2274***SUMMARY
1199*SUMMARY*1*465*432****
1185*SUMMARY*1*412*ABE*SUMMARY***BLIN
3281*SUMMARY*1*412*ABE*SUMMARY***BLIN
SUMMARY*SUMMARY*1*412*ABE*SUMMARY***BLIN
SUMMARY*SUMMARY*1*450*HIR*SUMMARY***AMEL
3558*SUMMARY*1*450*HIR*SUMMARY***AMEL
3811*SUMMARY*1*450*HIR*SUMMARY***AMER
SUMMARY*SUMMARY*1*450*HIR*SUMMARY***AMER
3558*SUMMARY*1*450*HIR*SUMMARY***AMER
SUMMARY*CAG*1*429*LOR*1810***BURN
252*PON*1*640*SHE*FRT***
2177*CS1*1*491*ABE*SUMMARY***
590*PRO*1*491*JOR*SUMMARY***WW8RHH
3715*PRO*1*491*JOR*SUMMARY***WW8RHH
Here is the file stat from the other half of the file:

File name                           = SALES-HIST-BRS
File type                           = 18
Number of groups in file (modulo)   = 3000017
Separation                          = 1
Number of records                   = 2071678
Number of physical bytes            = 1894588928
Number of data bytes                = 356514112
Average number of records per group = 0.6906
Average number of bytes per group   = 118.8374
Minimum number of records in a group = 0
Maximum number of records in a group = 18032
Average number of bytes per record  = 172.0895
Minimum number of bytes in a record = 64
Maximum number of bytes in a record = 2704
Average number of fields per record = 25.7575
Minimum number of fields per record = 11
Maximum number of fields per record = 41

Groups      25%     50%     75%    100%    125%    150%    175%    200% full
        2789365   44352   31348   23325   33111    9221    9893   59402

I'll be working on this for a while. Any resize, copy etc. has such an impact on the system that my access is limited. Thanks for the input so far.

Roy

Roy C. Beard
Distributor Solutions Inc
P.O. Box 110520
Palm Bay, FL 32911-0520
321-956-6500
501-642-8698 Fax

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
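Roy's description of the distribution logic (the 3rd *-separated field routes the record: 1 to the first partfile, everything else to this one) is simple enough to sketch. The function name and the partfile labels below are assumptions taken from the FILE.STAT reports elsewhere in the thread:

```python
def partfile_for(key):
    """Route a record key to its partfile by the 3rd '*'-separated field."""
    fields = key.split("*")
    branch = fields[2]          # the 3rd field decides the part
    return "SALES-HIST-BR1" if branch == "1" else "SALES-HIST-BRS"

# First key is a sample from the thread (3rd field is '1');
# the second is a hypothetical key with a non-1 branch for contrast.
print(partfile_for("407*BAR*1*498*GLU*2274***SUMMARY"))  # SALES-HIST-BR1
print(partfile_for("252*PON*2*640*SHE*FRT***"))          # SALES-HIST-BRS
```

Every distributed read or write pays for this dispatch, which is the overhead Ross's suggestion of collapsing back to one part would remove.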
Re: [U2] Size of key question
My thought on seeing the keys is that the key is being used for things that should be in the data, not the key... My feeble 2 bits.

Karl

On Wed, 2007-10-17 at 09:51 -0400, roy wrote:
_____
From: Roy Beard [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 10:33 PM
To: 'u2-users@listserver.u2ug.org'
Subject: File key question
---
Karl L. Pearson
[EMAIL PROTECTED]
http://consulting.ourldsfamily.com
---
My Thoughts on Terrorism In America right after 9/11/2001: http://www.ourldsfamily.com/wtc.shtml
---
The world is a dangerous place to live... not because of the people who are evil, but because of the people who don't do anything about it. - Albert Einstein
---
To mess up your Linux PC, you have to really work at it; to mess up a microsoft PC you just have to work on it.
---
Now for a random _short_ fortune: While having never invented a sin, I'm trying to perfect several.
---

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
[U2] Size of Key Question
_____
From: Roy Beard [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 12:17 PM
To: 'u2-users@listserver.u2ug.org'
Subject: Size of Key Question

Can someone comment on what effect, if any, the length of the key has on the speed of disk access? The software I am working with has one file with a complex key of 64 alphanumeric characters, and that file seems to be very slow no matter the modulo and sep, or even file type, I choose. This is in UV 10.2, Pick flavor, on AIX 5.3. Any insight would be appreciated.

Thanks,

Roy C. Beard
Distributor Solutions Inc
P.O. Box 110520
Palm Bay, FL 32911-0520
321-956-6500
501-642-8698 Fax

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
RE: [U2] Size of Key Question
I wouldn't expect much difference in file access speed with long record keys versus short keys. What are you doing with the file that seems slow: random reads of individual records, updates, sequential selects and processing, etc.?

If the slowness is seen in an application program, are there other possibilities? Does the file have alternate keys or associated files that might be causing the slowness? Could locking be a bottleneck? Just for grins, it would be interesting to see a FILE.STAT on the file.

Jeff Fitzgerald
Fitzgerald & Long, Inc.
www.fitzlong.com

-----Original Message-----
From: Roy Beard [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 12:17 PM
To: 'u2-users@listserver.u2ug.org'
Subject: Size of Key Question

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
RE: [U2] Size of Key Question
Random reads and updates on a file with ~2 million records. I separated the reads and writes into a separate program that only does this processing, to no avail. Topas shows 100% disk usage during this process, and all other users are affected.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeff Fitzgerald
Sent: Tuesday, October 16, 2007 1:14 PM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
Re: [U2] Size of Key Question
and the FILE.STAT?

On 10/16/07, roy [EMAIL PROTECTED] wrote:
Random reads and updates on a file with ~2 million records. I separated the reads and writes into a separate program that only does this processing, to no avail. Topas shows 100% disk usage during this process, and all other users are affected.

-- john

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
RE: [U2] Size of Key Question
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of john reid
Sent: Tuesday, October 16, 2007 2:17 PM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] Size of Key Question

and the FILE.STAT?

File name                           = SALES-HIST-BR1
File type                           = 18
Number of groups in file (modulo)   = 3000017
Separation                          = 1
Number of records                   = 883026
Number of physical bytes            = 1667799040
Number of data bytes                = 150663032
Average number of records per group = 0.2943
Average number of bytes per group   = 50.2207
Minimum number of records in a group = 0
Maximum number of records in a group = 7417
Average number of bytes per record  = 170.6213
Minimum number of bytes in a record = 64
Maximum number of bytes in a record = 2644
Average number of fields per record = 25.6579
Minimum number of fields per record = 11
Maximum number of fields per record = 41

Groups      25%     50%     75%    100%    125%    150%    175%    200% full
        2855826   50132   31541   14753   12862    5383    4611   24909

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
Re: [U2] Size of Key Question
In general, the main problem with large, compound keys is that they do not hash well; and by "hash well" I mean that they do not hash to proximate groups, as, for example, sequential numeric keys would. There is read-ahead logic and RAM in your disk drive(s). There is read-ahead logic and RAM in your disk controller(s). There is read-ahead logic in the O/S. None of this works very well when records are randomly scattered throughout the file.

If I used sequential, numeric keys and I wanted all the records created yesterday, they would likely all be near each other on the physical disk. When I accessed the first one, the disk/controller/OS would have pre-fetched many of the day's other records as well. That makes for speedy access. This is part of the reason why I think long, compound keys are a PITA and are to be avoided. Simple numeric keys process quicker because they hash better, and they are easier to type too.

This is often the problem with "intelligent" keys: by embedding data in the key, you almost always make the key longer and make the file hash poorly. IMO it makes way more sense to use simple numeric keys and create real attributes for the data you are tempted to build the key out of. I say this with 20/20 hindsight, as I have designed many systems with large files that use compound keys, every one of which I have come to regret.

Roy, you could prove this by writing a program that reads every record in your original file and writes it out to a new file (with the same modulo and sep as the original file) using a simple incrementing counter as the key. I will bet that the new file performs better than your original one does, even though it should have more attributes (necessary to accommodate the data values that were embedded in the key of the original file).

My 0.02,

/Scott Ballinger
Pareto Corporation
Edmonds, WA USA
206 713 6006

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
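Scott's experiment (rewrite the file keyed by an incrementing counter and compare) can be rehearsed in miniature before touching the real file. A sketch of the measurement harness, again with a generic polynomial hash standing in for UV's real type-18 algorithm, so the numbers only illustrate the method, not the actual file's behavior:

```python
from collections import Counter

def toy_hash(key):
    # Polynomial string hash; a generic stand-in, NOT UV's type-18 algorithm.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

def distribution(keys, modulo):
    """Return (largest group size, count of empty groups) for a key set."""
    counts = Counter(toy_hash(k) % modulo for k in keys)
    return max(counts.values()), modulo - len(counts)

MODULO = 101
sequential = [str(i) for i in range(1, 2001)]   # Scott's counter-style keys
compound = [f"{i}*BAR*1*498*GLU*2274***SUMMARY" for i in range(1, 2001)]

seq_big, seq_empty = distribution(sequential, MODULO)
cmp_big, cmp_empty = distribution(compound, MODULO)
print("sequential: largest", seq_big, "empty", seq_empty)
print("compound:   largest", cmp_big, "empty", cmp_empty)
```

The real test, of course, is the COPY program Scott describes run against the live file; this harness just shows what to measure (largest group and empty-group count) when comparing the two keying schemes.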
RE: [U2] Size of Key Question
This is a pretty ugly file! Here's what I see:

1) Modulo is way too big! 3 million groups for .9 million records; 1.6 GB of physical space for 150 MB of data. Note the large number of empty groups in the 25% column at the bottom of the FILE.STAT report. Probably the modulo was pushed to TRY to make up for the really lousy hashing! More about this below in 2).

2) Lousy hashing distribution. Note the 2.8 million empty and sparse groups in the 25% column, but at the same time the 25,000 groups 200%+ full. This isn't due to record size, as the largest record is 2644 bytes. Note that the largest group has 7417 records; if all of these were average size (Murphy says they aren't, though) that group would have 1.25 MB of data. Murphy also says that the most popular records live at the end of the largest group, so there, quite likely, is your performance problem: tons of I/O required to get to the end of the large groups.

What to do?

Step 1 - See if another type will do a better job. Forget about HASH.HELP and forget about the key patterns documented for the various types; yes, I know that type 18 should work best, but life isn't that simple. [AD] If you have FAST, use it. [/AD] If not, use HASH.AID to simulate the various types.

In using HASH.AID I'd suggest picking a reasonable modulo, say around 200,001 or so. ** BIG NOTE ** This modulo choice is based on a separation of 4, which I'd recommend for a 2K data buffer. If you want to stay with separation 1, use a modulo of 800,001 or so. ** END BIG NOTE **

Before running HASH.AID, clear the HASH.AID.FILE (CLEAR.FILE HASH.AID.FILE). Then use HASH.AID with your modulo and separation of choice and iterate through all the available types. The syntax is HASH.AID SALES-HIST-BR1; let it prompt you for the Type, Modulo and Separation, and for Type enter 2,18,1 which is like FOR 2 TO 18 STEP 1. Don't bother reading the output, just enter N and let it scroll by. When it's all done, use LIST HASH.AID.FILE to examine the results.
Look for the type that yields the smallest "Largest Group", the fewest "Oversize Groups", and the closest together "Smallest Group" and "Largest Group". If one of the types does a lot better than type 18, give it a try and see if it does better. Note that one flaw with HASH.AID is that it doesn't report empty groups (alas!). If you find a better type it may solve or help your problem. If not,

Step 2 - Read the very helpful post by Scott Ballinger in which he notes that large, complex record keys sometimes don't hash well and could cause the sort of problem you are seeing. If none of the other file types do better than type 18, I'm afraid this is what you are facing. Were the file isolated, the fix would be to move any important information carried by the record key into one or more fields and replace the compound record keys with sequential numeric ones, which, as Scott notes, often hash more reliably. However, if the file is heavily embedded in the application software this might not be a trivial change to make!

Hope this helps! Let us know how it turns out or if other questions arise...

Jeff Fitzgerald
Fitzgerald & Long, Inc.
www.fitzlong.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of roy
Sent: Tuesday, October 16, 2007 2:14 PM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question

--- u2-users mailing list u2-users@listserver.u2ug.org To unsubscribe please visit http://listserver.u2ug.org/
RE: [U2] Size of Key Question
Yes (agreeing with Jeff on file sizing isn't a very reckless thing to do), except I'd stress this: don't use a separation of 1. Go to 4, at least. If it turns out that a high percentage of the records are over 2K, then try a sep of 8. In certain cases you may want to go to 16, but this isn't one of them. Never go above 16.

Here are links to Mark Baldridge's series on the subject of file sizing:

http://www.ibm.com/developerworks/edu/dm-dw-dm-0512baldridge-i.html
http://www.ibm.com/developerworks/edu/dm-dw-dm-0603baldridge-i.html
http://www.ibm.com/developerworks/edu/dm-dw-dm-0606baldridge-i.html
http://www.ibm.com/developerworks/edu/dm-dw-dm-0611baldridge-i.html

Registration is required, but free.

Subject: RE: [U2] Size of Key Question
Date: Tue, 16 Oct 2007 19:07:18 -0400
From: [EMAIL PROTECTED]
To: u2-users@listserver.u2ug.org
[AD] If you have FAST, use it. [/AD] If not, use HASH.AID to simulate the various types. In using HASH.AID I'd suggest picking a reasonable modulo, say around 200,001 or so.

** BIG NOTE ** That modulo choice is based on a separation of 4, which I'd recommend for a 2K data buffer -- if you want to stay with separation 1, use a modulo of 800,001 or so. ** END BIG NOTE **

Before running HASH.AID, clear the HASH.AID.FILE (CLEAR.FILE HASH.AID.FILE). Then use HASH.AID with your modulo and separation of choice and iterate through all the available types -- the syntax is HASH.AID SALES-HIST-BR1, and let it prompt you for the Type, Modulo and Separation; for Type enter 2,18,1 which works like FOR 2 TO 18 STEP 1. Don't bother reading the output; just enter N and let it scroll by. When it's all done, use LIST HASH.AID.FILE to examine the results. Look for the type that yields the smallest Largest Group, the fewest Oversized Groups, and the closest together Smallest Group and Largest Group. If one of the types does a lot better than type 18, give it a try and see if it helps. Note that one flaw with HASH.AID is that it doesn't report empty groups (alas!).

If you find a better type it may solve or ease your problem. If not:

Step 2 - Read the very helpful post by Scott Ballinger in which he notes that large, complex record keys sometimes don't hash well and can cause the sort of problem you are seeing. If none of the other file types do better than type 18, I'm afraid this is what you are facing. Were the file isolated, the fix would be to move any important information carried by the record key into one or more fields and replace the compound record keys with sequential numeric ones, which, as Scott notes, often hash more reliably. However, if the file is heavily embedded in the application software this might not be a trivial change to make!

Hope this helps! Let us know how it turns out or if other questions arise...

Jeff Fitzgerald
Fitzgerald Long, Inc.
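For readers without HASH.AID handy, the comparison it automates can be sketched in ordinary code. The two hash functions below are hypothetical stand-ins (UniVerse's internal type 2-18 algorithms are not reproduced here), and the keys are synthetic compound keys shaped like the samples posted in this thread; the point is only to illustrate why the same key set can land in one group under one algorithm and spread cleanly under another -- here, because one hash effectively weights only the tail of the key, and the tails repeat:

```python
from collections import Counter

def hash_whole_key(key, modulo):
    # Toy polynomial hash over every byte of the key.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % modulo
    return h

def hash_tail_only(key, modulo):
    # Toy hash that only weights the last 4 bytes -- degenerate when
    # many keys share the same suffix (e.g. "...***SUMMARY").
    h = 0
    for ch in key[-4:]:
        h = (h * 31 + ord(ch)) % modulo
    return h

def report(name, fn, keys, modulo):
    groups = Counter(fn(k, modulo) for k in keys)
    sizes = sorted(groups.values())
    print(f"{name}: groups used={len(groups)}, largest group={sizes[-1]}")
    return sizes[-1]

# Synthetic keys: varying numeric prefix, constant compound suffix,
# modeled on the sample keys in this thread.
keys = [f"{n}*BAR*1*498*GLU*2274***SUMMARY" for n in range(20000)]
modulo = 2003

worst_whole = report("whole-key hash", hash_whole_key, keys, modulo)
worst_tail = report("tail-only hash", hash_tail_only, keys, modulo)
```

The tail-only hash drops all 20,000 records into a single group, while the whole-key hash spreads them near-uniformly -- exactly the "smallest Largest Group" comparison HASH.AID reports across the real file types.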
www.fitzlong.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of roy
Sent: Tuesday, October 16, 2007 2:14 PM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question

File name                            = SALES-HIST-BR1
File type                            = 18
Number of groups in file (modulo)    = 3000017
Separation                           = 1
Number of records                    = 883026
Number of physical bytes             = 1667799040
Number of data bytes                 = 150663032
Average number of records per group  = 0.2943
Average number of bytes per group    = 50.2207
Minimum number of records in a group = 0
Maximum number of records in a group = 7417
Average number of bytes per record   = 170.6213
Minimum number of bytes in a record  = 64
Maximum number of bytes in a record  = 2644
Average number of fields per record  = 25.6579
Minimum number of fields per record  = 11
Maximum number of fields per record  = 41

Groups      25%     50%     75%    100%    125%    150%    175%    200% full
        2855826   50132   31541   14753   12862    5383    4611   24909

Press any key to continue...

---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/
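The histogram row at the bottom of a FILE.STAT report doubles as a consistency check: every group falls into exactly one of the eight "full" buckets, so the counts must sum to the modulo, and records divided by that sum must reproduce the reported average. A quick check against the figures quoted in this thread:

```python
# The eight "Groups ... full" bucket counts from the FILE.STAT report.
full_counts = [2855826, 50132, 31541, 14753, 12862, 5383, 4611, 24909]
records = 883_026

# Each group lands in exactly one bucket, so the sum is the modulo.
modulo = sum(full_counts)

print(f"groups (modulo)       : {modulo:,}")
print(f"avg records per group : {records / modulo:.4f}")
```

This recovers the 3,000,017-group modulo and the 0.2943 average-records-per-group figure from the report.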
RE: [U2] Size of Key Question
Jeff F. will certainly have a better critique, but it appears that the key structure and the hash algorithm aren't very well suited to each other. You have 883,026 records in 3,000,017 groups, and one of the groups has 7,417 records in it, so you have at least 2,124,407 empty groups.

I believe every disk sold in the last 5 or more years reads at least 4 frames at a time, so a separation of 4 (or 8, etc.) will likely improve speed as well.

The fact that you have almost 1% of 883,026 records hashing to a single group looks like the primary problem. The usual hash algorithms tend to give the best spread of records when the last several bytes of the key have the widest range of values. How are the 64-byte keys composed?

Kind Regards,
Richard Lewis

--- On Tue 10/16, roy [EMAIL PROTECTED] wrote:

File name                            = SALES-HIST-BR1
File type                            = 18
Number of groups in file (modulo)    = 3000017
Separation                           = 1
Number of records                    = 883026
Maximum number of records in a group = 7417
Average number of bytes per record   = 170.6213
Minimum number of bytes in a record  = 64
Maximum number of bytes in a record  = 2644
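The "at least 2,124,407 empty groups" figure follows from simple counting, sketched here: a nonempty group holds at least one record, and one group alone soaks up 7,417 of them, which caps the number of groups that can be occupied at all.

```python
modulo = 3_000_017    # groups in SALES-HIST-BR1
records = 883_026
biggest_group = 7_417  # max records in one group, per FILE.STAT

# One group holds 7,417 records; every other occupied group holds at
# least one, so at most this many groups can be nonempty:
max_nonempty = (records - biggest_group) + 1

# Everything else must be empty.
min_empty = modulo - max_nonempty

print(f"at most {max_nonempty:,} nonempty groups")
print(f"hence at least {min_empty:,} empty groups")
```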
RE: [U2] Size of Key Question
Sounds like the file may be V E R Y poorly sized, as Jeff F suggested. Was this process fast previously? What has happened on the system around the time it started to get slow? Are there any triggers on the file?

Ross Ferris
Stamina Software
Visage - Better by Design!

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:owner-u2-[EMAIL PROTECTED] On Behalf Of roy
Sent: Wednesday, 17 October 2007 3:45 AM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question

Random reads and updates on a file with ~2 million records. I separated the reads and writes into a separate program that only does this processing, to no avail. Topas shows 100% disk usage during this process, and all other users are affected.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeff Fitzgerald
Sent: Tuesday, October 16, 2007 1:14 PM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question

I wouldn't expect much difference in file access speed with long record keys versus short keys. What are you doing with the file that seems slow -- i.e., random reads of individual records, updates, sequential selects and processing, etc.? If the slowness is seen in an application program, are there other possibilities? Does the file have alternate keys or associated files that might be causing the slowness? Could locking be a bottleneck? Just for grins, it would be interesting to see a FILE.STAT on the file.

Jeff Fitzgerald
Fitzgerald Long, Inc.
www.fitzlong.com

-----Original Message-----
From: Roy Beard [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 12:17 PM
To: 'u2-users@listserver.u2ug.org'
Subject: Size of Key Question

Can someone comment on what effect, if any, the length of the key has on the speed of disk access? The software I am working with has one file with a complex key of 64 alphanumeric characters, and that file seems to be very slow no matter the modulo and sep, or even file type, I choose.
This is in UV 10.2, Pick flavor, on AIX 5.3. Any insight would be appreciated.

Thanks,
Roy C. Beard
Distributor Solutions Inc
P.O. Box 110520
Palm Bay, FL 32911-0520
321-956-6500
501-642-8698 Fax

---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/