RE: [U2] Size of key question

2007-10-19 Thread Dave Davis
I think that the maximum of 18,032 records in a single group is
really hurting the file's performance.  I would find that group and see
which keys are hashing to it.  How many other groups approach this
record count?  How bad are the other partfiles in this regard?

Something about your keys is causing a lot of them to hash to one group
or a selected subset of groups.

Also, how many groups out there have absolutely no records in them?
You've got 3,000,000 groups but only 2,000,000 records - so you should
have something in the neighborhood of 1,000,000 empty groups.  If you
have 2,500,000 empties - then your key values are not hashing well under
type 18.
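Dave's estimate can be sanity-checked. Under an ideal uniform hash, group occupancy is roughly Poisson-distributed, so the expected number of empty groups is a bit higher than the naive 1,000,000 figure. A minimal sketch, using Dave's round figures (this is only the ideal-case model, not UniVerse's actual type-18 hash):

```python
import math

groups = 3_000_000   # Dave's round figure for the modulo
records = 2_000_000  # and for the record count

# Under a uniform hash, occupancy per group is roughly Poisson(records/groups),
# so the expected number of empty groups is groups * e^(-records/groups).
expected_empty = groups * math.exp(-records / groups)
print(f"{expected_empty:,.0f}")  # about 1.5 million
```

So a count of 2,500,000 empties would sit well above even the ideal-case figure, which would indeed point to badly skewed hashing.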

---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


RE: [U2] Size of key question

2007-10-18 Thread Ross Ferris
Has this file always had 2 parts? Just wondering, as I saw that if the
other half of the file was around the same size, then the full file is
well under 1 GB. So reducing the mod and bringing it into 1 part would
eliminate the overhead of distributing data into 2 files and the call to
your routine on every read/write operation.

Ross Ferris
Stamina Software
Visage - Better by Design!




Re: [U2] Size of Key Question

2007-10-17 Thread john reid
The previous responders certainly know a lot more about this stuff than
I do, and they mentioned all of the things to do.  However, if I had to
do just ONE thing, I would certainly pick expanding the separation to
4. The way I read it, the average group contains 1/3 of a logical
record, so expanding the separation should reduce the reads even if
they are cached.  You'd also, of course, have to choose a modulus that
would allow enough overall space.
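A rough sketch of the arithmetic behind that suggestion, assuming the conventional 512-byte separation unit and the average record size from the FILE.STAT quoted in this thread:

```python
avg_record_bytes = 172.0895  # from the FILE.STAT quoted in this thread

# UniVerse separation is expressed in 512-byte units (an assumption here),
# so the group buffer grows with the separation and one disk read covers
# more of a group:
for sep in (1, 4, 8):
    buffer_bytes = 512 * sep
    print(sep, buffer_bytes, round(buffer_bytes / avg_record_bytes, 1))
```

At separation 4, a single buffer read covers roughly a dozen average-sized records instead of about three.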

On 10/17/07, Ross Ferris [EMAIL PROTECTED] wrote:
 Sounds like the file may be V E R Y poorly sized, as Jeff F suggested

 Was this process fast previously? What has happened on the system
 around the time it started to get slow? Are there any triggers on the
 file?

 Ross Ferris
 Stamina Software
 Visage - Better by Design!





-- 
john


[U2] Size of key question

2007-10-17 Thread roy

From: Roy Beard [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 16, 2007 10:33 PM
To: 'u2-users@listserver.u2ug.org'
Subject: File key question

 

Wow!

 

I got so many ideas from this group that I thought I would give some more
information.

 

This file is a distributed file in 2 parts. The 3rd field (separated by *)
determines the partfile, with 1 going to the file I sent before and the
rest going to this file.
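As described, the routing rule amounts to something like the following sketch (the real subroutine isn't shown, so the function name and part-file labels are invented):

```python
def part_file(key: str) -> str:
    # The 3rd '*'-separated field picks the part: '1' goes to the first
    # part file, everything else to the second (per the description above).
    fields = key.split("*")
    return "part1" if fields[2] == "1" else "part2"

print(part_file("407*BAR*1*498*GLU*2274***SUMMARY"))  # part1
```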

 

There are no triggers on the file.

 

The system was 'fast' until recently, when tens of thousands of new records
were added.  I agree the file is poorly sized; it was an attempt to solve
one problem but created another.

 

The keys, I believe, are an issue.

 

Here is a sample

SUMMARY*PRO*1*SUMMARY*CDJ*9876***WS

407*BAR*1*498*GLU*2274***SUMMARY

491*BAR*1*498*GLU*2274***SUMMARY

SUMMARY*BAR*1*498*GLU*2274***SUMMARY

1896*BAR*1*498*GLU*2274***SUMMARY

460*BAR*1*498*GLU*2274***SUMMARY

1199*SUMMARY*1*465*432****

1185*SUMMARY*1*412*ABE*SUMMARY***BLIN

3281*SUMMARY*1*412*ABE*SUMMARY***BLIN

SUMMARY*SUMMARY*1*412*ABE*SUMMARY***BLIN

SUMMARY*SUMMARY*1*450*HIR*SUMMARY***AMEL

3558*SUMMARY*1*450*HIR*SUMMARY***AMEL

3811*SUMMARY*1*450*HIR*SUMMARY***AMER

SUMMARY*SUMMARY*1*450*HIR*SUMMARY***AMER

3558*SUMMARY*1*450*HIR*SUMMARY***AMER

SUMMARY*CAG*1*429*LOR*1810***BURN

252*PON*1*640*SHE*FRT***

2177*CS1*1*491*ABE*SUMMARY***

590*PRO*1*491*JOR*SUMMARY***WW8RHH

3715*PRO*1*491*JOR*SUMMARY***WW8RHH

Press any key to continue...

 

Here is the FILE.STAT from the other half of the file:

File name   = SALES-HIST-BRS

File type   = 18

Number of groups in file (modulo)   = 317

Separation  = 1

Number of records   = 2071678

Number of physical bytes= 1894588928

Number of data bytes= 356514112

 

Average number of records per group = 0.6906

Average number of bytes per group   = 118.8374

Minimum number of records in a group= 0

Maximum number of records in a group= 18032

 

Average number of bytes per record  = 172.0895

Minimum number of bytes in a record = 64

Maximum number of bytes in a record = 2704

 

Average number of fields per record = 25.7575

Minimum number of fields per record = 11

Maximum number of fields per record = 41

 

Groups   25% 50% 75%100%125%150%175%200%  full

 2789365   44352   31348   23325   3311192219893   59402

Press any key to continue...

 

 

I'll be working on this for a while.  Any resize, copy, etc. has such an
impact on the system that my access is limited.

 

Thanks for the input so far.

 

Roy

 

 

 

 

 

 

Roy C. Beard

Distributor Solutions Inc

P.O. Box 110520

Palm Bay, FL 32911-0520

 

321-956-6500

501-642-8698   Fax


Re: [U2] Size of key question

2007-10-17 Thread Karl Pearson
My thought on seeing the keys is that the key is being used for things
that should be in the data, not the key... My feeble 2-bits.

Karl
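A hypothetical sketch of what Karl is suggesting: carry the old compound key's segments as ordinary data in the record body and key the record on a plain sequential id instead. The field name is invented, since we don't know what the segments actually mean:

```python
# Hypothetical restructuring sketch; "LEGACY.KEY.PARTS" is an invented name.
def restructure(old_key, next_id):
    record = {"LEGACY.KEY.PARTS": old_key.split("*")}  # segments become data
    return str(next_id), record                        # simple sequential key

new_key, rec = restructure("3558*SUMMARY*1*450*HIR*SUMMARY***AMER", 1000)
print(new_key, rec["LEGACY.KEY.PARTS"][:2])
```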


---
Karl L. Pearson
[EMAIL PROTECTED]
http://consulting.ourldsfamily.com
---


[U2] Size of Key Question

2007-10-16 Thread roy

From: Roy Beard [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 16, 2007 12:17 PM
To: 'u2-users@listserver.u2ug.org'
Subject: Size of Key Question

 

Can someone comment on what effect, if any, the length of the key has on the
speed of disk access?  The software I am working with has one file with a
complex key of 64 alphanumeric characters, and that file seems to be very
slow no matter the modulo and sep, or even file type, I choose.  This is in
UV 10.2, Pick flavor, on AIX 5.3.

 

Any insight would be appreciated.

 

Thanks,

 

 

Roy C. Beard

Distributor Solutions Inc

P.O. Box 110520

Palm Bay, FL 32911-0520

 

321-956-6500

501-642-8698   Fax


RE: [U2] Size of Key Question

2007-10-16 Thread Jeff Fitzgerald
I wouldn't expect much difference in file access speed with long record
keys versus short keys.  What are you doing with the file that seems
slow? -- i.e. random reads of individual records, updates, sequential
selects and processing, etc.  If the slowness is seen in an application
program, are there other possibilities?  Does the file have alternate
keys or associated files that might be causing the slowness?  Could
locking be a bottleneck?  Just for grins it would be interesting to see
a FILE.STAT on the file. 

Jeff Fitzgerald
Fitzgerald & Long, Inc.
www.fitzlong.com



RE: [U2] Size of Key Question

2007-10-16 Thread roy
Random reads and updates on a file with ~2 million records.  I separated the
reads and writes into a separate program that only does this processing, to
no avail.

Topas shows 100% disk usage during this process and all other users are
affected.



Re: [U2] Size of Key Question

2007-10-16 Thread john reid
and the FILE.STAT?




-- 
john


RE: [U2] Size of Key Question

2007-10-16 Thread roy
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of john reid
Sent: Tuesday, October 16, 2007 2:17 PM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] Size of Key Question

and the FILE.STAT?
File name   = SALES-HIST-BR1
File type   = 18
Number of groups in file (modulo)   = 317
Separation  = 1
Number of records   = 883026
Number of physical bytes= 1667799040
Number of data bytes= 150663032

Average number of records per group = 0.2943
Average number of bytes per group   = 50.2207
Minimum number of records in a group= 0
Maximum number of records in a group= 7417

Average number of bytes per record  = 170.6213
Minimum number of bytes in a record = 64
Maximum number of bytes in a record = 2644

Average number of fields per record = 25.6579
Minimum number of fields per record = 11
Maximum number of fields per record = 41

Groups   25% 50% 75%100%125%150%175%200%  full
 2855826   50132   31541   14753   1286253834611   24909
Press any key to continue...




Re: [U2] Size of Key Question

2007-10-16 Thread Scott Ballinger
In general, the main problem with large, compound keys is that said keys do
not hash well; and by hash well I mean that they do not hash to proximate
groups as, for example, sequential numeric keys would.

There is read-ahead logic and RAM in your disk drive(s). There is read-ahead
logic and RAM in your disk controller(s). There is read-ahead logic in the
O/S. None of this works very well when records are randomly scattered
throughout the file.

If I used sequential, numeric keys, and I wanted all the records created
yesterday, they would likely all be near each other on the physical disk.
When I accessed the first one, the disk/controller/os will have pre-fetched
many of the day's other records as well. That makes for speedy access.

This is part of the reason why I think long, compound keys are a PITA and
are to be avoided. Simple numeric keys will process quicker because they
hash better, and are easier to type too.  This is often the problem with
intelligent keys; by embedding data in the key, you almost always make the
key longer and the file hash poorly. IMO it makes way more sense to use
simple numeric keys and create real attributes for the data you are tempted
to build the key out of.  I say this with 20/20 hindsight, as I have
designed many systems with large files that use compound keys, every one of
which I have come to regret.
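Scott's locality point can be illustrated with a toy model (this is not UniVerse's type-18 hash; crc32 simply stands in for a generic string hash, and the modulo is a stand-in):

```python
from zlib import crc32

m = 317  # stand-in modulo for illustration only

# Sequential numeric keys land in consecutive groups, so records created
# together stay physically close and read-ahead can help:
seq_groups = [k % m for k in range(1000, 1006)]
print(seq_groups)  # consecutive group numbers

# A generic string hash scatters compound keys with no such adjacency:
compound = ["407*BAR*1*498*GLU*2274***SUMMARY",
            "491*BAR*1*498*GLU*2274***SUMMARY",
            "1896*BAR*1*498*GLU*2274***SUMMARY"]
print([crc32(k.encode()) % m for k in compound])
```

Consecutive numeric keys fall into consecutive groups, so a range of recently created records sits together on disk; the string hash gives no such locality.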

Roy, you could prove this by writing a program that reads every record in
your original file and writes it out to a new file (with the same modulo &
sep as the original file) using a simple incrementing counter as the key. I
will bet that the new file performs better than your original one does, even
though it should have more attributes (necessary to accommodate the data
values that were embedded in the key of the original file).

My 0.02,
/Scott Ballinger
Pareto Corporation
Edmonds, WA USA
206 713 6006
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


RE: [U2] Size of Key Question

2007-10-16 Thread Jeff Fitzgerald
This is a pretty ugly file!  Here's what I see:

1)  Modulo is way too big!  3 million groups for .9 million records;
1.6 GB physical space for 150 MB of data.  Note the large number of
empty groups in the 25% column at the bottom of the FILE.STAT report.
Probably the modulo was pushed to TRY to make up for the really lousy
hashing!  More about this below in 2).

2)  Lousy hashing distribution.  Note 2.8 million empty and sparse
groups in the 25% column; but at the same time 25,000 groups 200% +
full.  This isn't due to record size as the largest record is 2644
bytes.  Note that the largest group has 7417 records - if all these were
average size (Murphy says they aren't, though) that group would have
1.25 MB of data.  Murphy also says that the most popular records live at
the end of the largest group so there is your performance problem, quite
likely -- tons of I/O required to get to the end of the large groups.
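The numbers Jeff is working from can be recomputed directly from the FILE.STAT posted earlier in this thread:

```python
physical_bytes = 1_667_799_040  # from the FILE.STAT for SALES-HIST-BR1
data_bytes = 150_663_032
max_group_records = 7_417
avg_record_bytes = 170.6213

utilization = data_bytes / physical_bytes                   # share of space holding data
largest_group_bytes = max_group_records * avg_record_bytes  # worst-group size estimate

print(f"{utilization:.1%}", f"{largest_group_bytes / 1e6:.2f} MB")
```

About 9% of the allocated space holds data, and at the average record size the worst group holds roughly 1.27 MB, close to Jeff's 1.25 MB estimate.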

What to do?

Step 1 - See if another type will do a better job.  Forget about
HASH.HELP and forget about the key patterns documented for the various
types -- yes, I know that type 18 should work best, but life isn't
that simple.  [AD] If you have FAST, use it. [/AD]  If not, use HASH.AID
to simulate the various types.  In using HASH.AID I'd suggest picking a
reasonable modulo, say around 200,001 or so.  ** BIG NOTE ** This modulo
choice is based on a separation of 4 which I'd recommend for a 2K data
buffer -- if you want to stay with separation 1 use a modulo of 800,001
or so ** END BIG NOTE **  Before running HASH.AID, clear the
HASH.AID.FILE (CLEAR.FILE HASH.AID.FILE).  Then use HASH.AID with your
modulo and separation of choice and iterate through all the available
types -- the syntax is HASH.AID SALES-HIST-BR1; let it prompt you for
the Type, Modulo and Separation, and for Type enter 2,18,1 which is like
FOR 2 TO 18 STEP 1.  Don't bother reading the output, just enter N
and let it scroll by.  When it's all done, use LIST HASH.AID.FILE to
examine the results.  Look for the type that yields the smallest
Largest Group, the fewest Oversize Groups, and the closest together
Smallest Group and Largest Group.  If one of the types does a lot
better than type 18, give it a try and see if it does better.  Note that
one flaw with HASH.AID is that it doesn't report empty groups (alas!).

If you find a better type it may solve or help your problem.  If not,

Step 2 - Read the very helpful post by Scott Ballinger in which he notes
that large, complex record keys sometimes don't hash well and could
cause the sort of problem you are seeing.  If none of the other file
types do better than type 18 I'm afraid this is what you are facing.
Were the file isolated, the fix would be to move any important
information carried by the record key into one or more fields and
replace the compound record keys with sequential numeric ones, which,
as Scott notes, often hash more reliably.  However, if the file is
heavily embedded in the application software this might not be a
trivial change to make!

Hope this helps!  Let us know how it turns out or if other questions
arise...

Jeff Fitzgerald
Fitzgerald & Long, Inc.
www.fitzlong.com





RE: [U2] Size of Key Question

2007-10-16 Thread Dan Fitzgerald
Yes (agreeing with Jeff on file sizing isn't a very reckless thing to do),
except I'd stress this: don't use a separation of 1. Go to 4, at least. If it
turns out that a high percentage of the records are over 2K, then try a sep of
8. In certain cases, you may want to go to 16, but this isn't one of them.
Never go above 16.

Here are links to Mark Baldridge's series on the subject of file sizing.


http://www.ibm.com/developerworks/edu/dm-dw-dm-0512baldridge-i.html
http://www.ibm.com/developerworks/edu/dm-dw-dm-0603baldridge-i.html
http://www.ibm.com/developerworks/edu/dm-dw-dm-0606baldridge-i.html
http://www.ibm.com/developerworks/edu/dm-dw-dm-0611baldridge-i.html

Registration is required, but free.
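
The arithmetic behind a starting modulo is simple enough to sketch.  The
512-byte frame size and 80% loading target below are rule-of-thumb
assumptions, not UV requirements, and a real choice should round up to a
prime and leave headroom for growth -- presumably why Jeff's suggested
200,001 is well above this raw estimate:

```python
# Back-of-envelope modulo estimate from the FILE.STAT figures above.
DATA_BYTES = 150663032   # "Number of data bytes" reported by FILE.STAT
FRAME_BYTES = 512        # assumed UV frame size
SEPARATION = 4           # group buffer = 4 * 512 = 2 KB, as suggested
TARGET_LOAD = 0.80       # rule-of-thumb target fullness per group

group_bytes = SEPARATION * FRAME_BYTES
modulo_estimate = int(DATA_BYTES / (group_bytes * TARGET_LOAD))
print("raw modulo estimate: %d (round up to a prime and "
      "add growth headroom)" % modulo_estimate)
```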


RE: [U2] Size of Key Question

2007-10-16 Thread rbl000
Jeff F. will certainly have a better critique, but it appears that the key
structure and hash algorithm aren't very well suited to each other.

You have 883,026 records in 3,000,017 groups, and one of the groups has 7,417 
records in it, so you have at least 2,124,407 empty groups.

I believe every disk sold in the last 5 or more years reads at least 4 frames 
at a time, so a separation of 4 (or 8, etc.) will likely improve speed as well.

The fact that you have over 8% of 883,026 records hashing to the same group 
looks like the primary problem.  The usual hash algorithms tend to give the 
best spread of records when the last several bytes of the key have the widest 
range of values.  How are the 64 byte keys composed?

Kind Regards,

Richard Lewis

 --- On Tue 10/16, roy  [EMAIL PROTECTED]  wrote:
File name = SALES-HIST-BR1
File type = 18
Number of groups in file (modulo) = 3000017
Separation = 1
Number of records = 883026

Maximum number of records in a group = 7417

Average number of bytes per record = 170.6213
Minimum number of bytes in a record = 64
Maximum number of bytes in a record = 2644


---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


RE: [U2] Size of Key Question

2007-10-16 Thread Ross Ferris
Sounds like the file may be V E R Y poorly sized, as Jeff F suggested.

Was this process fast previously? What has happened on the system
around the time it started to get slow? Are there any triggers on the
file?

Ross Ferris
Stamina Software
Visage  Better by Design!


-Original Message-
From: [EMAIL PROTECTED] [mailto:owner-u2-
[EMAIL PROTECTED] On Behalf Of roy
Sent: Wednesday, 17 October 2007 3:45 AM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question

Random reads and updates on a file with ~2 million records.  I separated
the reads and writes to a separate program that only does this
processing, to no avail.

Topas shows 100% disk usage during this process and all other users are
affected.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeff
Fitzgerald
Sent: Tuesday, October 16, 2007 1:14 PM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] Size of Key Question

I wouldn't expect much difference in file access speed with long record
keys versus short keys.  What are you doing with the file that seems
slow -- i.e., random reads of individual records, updates, sequential
selects and processing, etc.?  If the slowness is seen in an application
program, are there other possibilities?  Does the file have alternate
keys or associated files that might be causing the slowness?  Could
locking be a bottleneck?  Just for grins it would be interesting to see
a FILE.STAT on the file.

Jeff Fitzgerald
Fitzgerald & Long, Inc.
www.fitzlong.com

-Original Message-
From: Roy Beard [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 12:17 PM
To: 'u2-users@listserver.u2ug.org'
Subject: Size of Key Question



Can someone comment on what effect, if any, the length of the key has on
the speed of disk access?  The software I am working with has one file
with a complex key of 64 alpha-numeric characters, and that file seems
to be very slow no matter the modulo and sep, or even the file type, I
choose.  This is in UV 10.2, Pick flavor, on AIX 5.3.



Any insight would be appreciated.



Thanks,





Roy C. Beard

Distributor Solutions Inc

P.O. Box 110520

Palm Bay, FL 32911-0520



321-956-6500

501-642-8698   Fax
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/