Re: [U2] UniVerse LIST statement question [not-secure]

2012-07-05 Thread Hennessey, Mark F.
Thanks for all of the responses. 

-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Hennessey, Mark F.
Sent: Monday, July 02, 2012 9:53 AM
To: 'U2 Users List'
Subject: [U2] UniVerse LIST statement question [not-secure]

I need to do a UniVerse LIST statement that would only populate a column if the 
contents met certain criteria.

For example, suppose we have a file with details of telephone usage and that 3 
associated multivalued fields contain the date the call was made, its duration, 
and whether the call was a toll call. Is it possible to limit the output of the 
date call made and associated columns to a date range without that being a 
selection criterion? If I were to do something like:

LIST CALLS EMP.NAME EMP.LOCATION WITH DATE.CALL GE 2012-06-01 AND WITH 
DATE.CALL LE 2012-06-30  DURATION TOLL WITH @ID EQ '123456'

I would get zero records if employee 123456 did not make any calls in June. What 
I would like to see is the employee name and location returned with the date, 
duration and toll columns empty. I'm trying to do this in a LIST statement as 
it will be run by U2 Web Services (and for the time being a subroutine is off 
the table...)

Any advice, or an authoritative "NO, it cannot be done", would be greatly 
appreciated.
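
To make the desired result concrete, here is a minimal Python sketch (not RetrieVe syntax, and not a claim about what LIST can do). The record is always returned and only the associated multivalues are filtered by date. The field names mirror the LIST statement above; the record layout, the helper name filter_calls and the sample data are made up for illustration.

```python
from datetime import date

def filter_calls(record, start, end):
    """Return the record with its associated call values limited to [start, end].

    The record itself is never dropped; if no call falls in the range,
    the three associated lists simply come back empty.
    """
    kept = [
        (d, dur, toll)
        for d, dur, toll in zip(record["DATE.CALL"], record["DURATION"], record["TOLL"])
        if start <= d <= end
    ]
    result = dict(record)
    result["DATE.CALL"] = [d for d, _, _ in kept]
    result["DURATION"] = [dur for _, dur, _ in kept]
    result["TOLL"] = [toll for _, _, toll in kept]
    return result

# Hypothetical employee record with no calls in June 2012.
employee = {
    "@ID": "123456",
    "EMP.NAME": "J. Smith",
    "EMP.LOCATION": "Hartford",
    "DATE.CALL": [date(2012, 5, 30), date(2012, 7, 2)],
    "DURATION": [120, 45],
    "TOLL": ["Y", "N"],
}

june = filter_calls(employee, date(2012, 6, 1), date(2012, 6, 30))
print(june["EMP.NAME"], june["EMP.LOCATION"], june["DATE.CALL"])
```

The name and location survive while the three call columns come back empty, which is the output shape being asked for.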

Mark Hennessey
State of Connecticut
Department of Social Services
Information Technology Services
Child Support Systems
Voice: 860-424-5261
Fax: 860-424-4813



___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users





Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Chris Austin

Disk space is not a factor, as we are a smaller shop and disk space comes 
cheap. However, one thing I did notice is that when I increased the modulus to a 
very large number, which increased my disk space to about 3-4x the size of my 
record data, my SELECT queries were slower. 

Are the 2 factors to consider, when looking at HOW the file is used, whether 
you're doing:

1) a lot of SELECTs (then looping through the records) 
2) grabbing individual records (not using a SELECT)

With this file we really do a lot of SELECTs (option 1), then loop through the 
records. With that being said, and based on the reading I've done here, it would 
appear it's better to have a little overflow and not use up so much disk space 
for modulus (groups) for this application, since we do use a lot of SELECT 
queries. Is this correct?

Most of my records are ~250 bytes; there's a handful that are up to 512 bytes. 

It would seem to me that I would want to REDUCE my split load to ~70% to reduce 
overflow, and maybe increase my MINIMUM.MODULUS to a number a little bit bigger 
than my current modulus (~10% bigger), since this will be a growing file and 
will never merge. In my case using the formula might not make sense since this 
file will never merge. Does this make sense?


File name ..................   GENACCTRN_POSTED
Pathname ...................   GENACCTRN_POSTED
File type ..................   DYNAMIC
File style and revision ....   32BIT Revision 12
Hashing Algorithm ..........   GENERAL
No. of groups (modulus) ....   92903 current ( minimum 31, 87 empty,
                               28248 overflowed, 2510 badly )
Number of records ..........   1292377
Large record size ..........   3267 bytes
Number of large records ....   180
Group size .................   4096 bytes
Load factors ...............   80% (split), 50% (merge) and 80% (actual)
Total size .................   501219328 bytes
Total size of record data ..   287426366 bytes
Total size of record IDs ...   21539682 bytes
Unused space ...............   192245088 bytes
Total space for records ....   501211136 bytes
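
As a quick sanity check on the listing above, the figures can be turned into percentages. This is only a sketch: the variable names are mine, and it treats the "overflowed" and "badly" counts as simple shares of the modulus, which is an assumption about how the report counts them.

```python
# Figures copied from the ANALYZE.FILE output above.
modulus = 92903
overflowed = 28248
badly_overflowed = 2510
record_data = 287426366
record_ids = 21539682
unused = 192245088
total_space = 501211136

# Share of groups reported as overflowed / badly overflowed.
overflow_pct = 100.0 * overflowed / modulus
badly_pct = 100.0 * badly_overflowed / modulus

# The three byte counts should account for all of the record space.
accounted = record_data + record_ids + unused

print(f"overflowed groups: {overflow_pct:.1f}%")
print(f"badly overflowed:  {badly_pct:.1f}%")
print(f"bytes accounted for: {accounted} of {total_space}")
```

The overflow share works out to roughly 30%, which is the starting figure quoted later in the thread.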


With all that being said, if I change the following:

1) SPLIT.LOAD to 70%
2) MINIMUM.MODULUS to 130,000

that's all I should really need to do to 'tweak' the performance of this file. 
If this doesn't sound right I would be interested to hear how it should be 
tweaked instead. Thanks for all the help so far, I think this is all starting 
to make sense.

Chris


 From: ro...@stamina.com.au
 To: u2-users@listserver.u2ug.org
 Date: Wed, 4 Jul 2012 01:36:26 +
 Subject: Re: [U2] RESIZE - dynamic files
 
 I would suggest that the actual goal is to achieve maximum performance for 
 your system, so knowing HOW the file is used on a daily basis can also 
 influence decisions. Disk is a cheap commodity, so having some wastage in 
 file utilization shouldn't factor. 
 
 
 Ross Ferris
 Stamina Software
 Visage  Better by Design!

  


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Chris Austin

I was able to drop from 30% overflow to 12% by making 2 changes:

1) changed the split load from 80% to 70% (that alone reduced overflow by 10%)
2) changed the MINIMUM.MODULUS to 118,681 (calculated this way - [ (record 
data + id) * 1.1 * 1.42857 (70% split load)] / 4096 )

My disk size only went up 8%..
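
For anyone who wants to repeat the arithmetic, here is a small Python sketch of the formula in item 2. The function and parameter names are mine. Note that 1.42857 is just 1/0.70, so the sketch divides by the split load instead. Plugging in the byte counts from the two ANALYZE.FILE listings in this thread gives values near, but not exactly, 118,681; the exact inputs used for that figure are not stated here.

```python
import math

def minimum_modulus(record_bytes, id_bytes, split_load=0.70,
                    growth=1.1, group_size=4096):
    """Groups needed to hold data plus 10% headroom at the given split load."""
    padded = (record_bytes + id_bytes) * growth
    return math.ceil(padded / split_load / group_size)

# Byte counts from the two ANALYZE.FILE listings in this thread.
before = minimum_modulus(287426366, 21539682)
after = minimum_modulus(287789178, 21539538)
print(before, after)
```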

My file looks like this now:

File name ..................   GENACCTRN_POSTED
Pathname ...................   GENACCTRN_POSTED
File type ..................   DYNAMIC
File style and revision ....   32BIT Revision 12
Hashing Algorithm ..........   GENERAL
No. of groups (modulus) ....   118681 current ( minimum 118681, 140 empty,
                               14431 overflowed, 778 badly )
Number of records ..........   1292377
Large record size ..........   3267 bytes
Number of large records ....   180
Group size .................   4096 bytes
Load factors ...............   70% (split), 50% (merge) and 63% (actual)
Total size .................   546869248 bytes
Total size of record data ..   287789178 bytes
Total size of record IDs ...   21539538 bytes
Unused space ...............   237532340 bytes
Total space for records ....   546861056 bytes

Chris



 From: keith.john...@datacom.co.nz
 To: u2-users@listserver.u2ug.org
 Date: Wed, 4 Jul 2012 14:05:02 +1200
 Subject: Re: [U2] RESIZE - dynamic files
 
 Doug may have had a key bounce in his input
 
  Let's do the math:
 
  258687736 (Record Size)
  192283300 (Key Size)
  
 
 The key size is actually 19283300 in Chris' figures
 
 Regarding 68,063 being less than the current modulus of 82,850.  I think the 
 answer may lie in the splitting process.
 
 As I understand it, the first time a split occurs group 1 is split and its 
 contents are split between new group 1 and new group 2. All the other groups 
 effectively get 1 added to their number. The next split is group 3 (which was 
 2) into 3 and 4 and so forth. A pointer is kept to say where the next split 
 will take place and also to help sort out how to adjust the algorithm to 
 identify which group matches a given key.
 
 Based on this, if you started with 1000 groups, by the time you have split 
 the 500th time you will have 1500 groups.  The first 1000 will be relatively 
 empty, the last 500 will probably be overflowed, but not terribly badly.  By 
 the time you get to the 1000th split, you will have 2000 groups and they 
 will, one hopes, be quite reasonably spread with very little overflow.
 
 So I expect the average access times would drift up and down in a cycle.  The 
 cycle time would get longer as the file gets bigger but the worst time would 
 be roughly the same each cycle.
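
 The splitting cycle can be tried out with a small Python sketch of textbook linear hashing. This is the scheme as usually described in the literature, where group p splits into p and p + base rather than renumbering the other groups as described above; whether UniVerse does exactly this internally is not confirmed here. All names (group_for, split_once, base, split_ptr) are illustrative.

```python
def group_for(hash_value, base, split_ptr):
    """Textbook linear-hashing address; groups are numbered from 0.

    base      -- number of groups at the start of the current doubling round
    split_ptr -- next group due to be split (0 <= split_ptr < base)
    """
    g = hash_value % base
    if g < split_ptr:
        # This group has already been split in the current round, so use
        # the finer modulus to choose between the old and the new group.
        g = hash_value % (2 * base)
    return g

def split_once(groups, base, split_ptr):
    """Split the group at split_ptr; return the new (base, split_ptr)."""
    old = groups[split_ptr]
    groups.append([])                      # new group gets index base + split_ptr
    groups[split_ptr] = [h for h in old if h % (2 * base) == split_ptr]
    groups[-1] = [h for h in old if h % (2 * base) != split_ptr]
    split_ptr += 1
    if split_ptr == base:                  # round complete: modulus has doubled
        base, split_ptr = 2 * base, 0
    return base, split_ptr

# Start with 4 groups holding hash values 0..39, then split twice.
base, split_ptr = 4, 0
groups = [[h for h in range(40) if h % base == g] for g in range(base)]
for _ in range(2):
    base, split_ptr = split_once(groups, base, split_ptr)

print(len(groups), [len(g) for g in groups])
```

 Part-way through a round the groups that have not yet been split hold about twice as much as the ones that have, which is the uneven load behind the cycle in access times.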
 
 Given the power of two introduced into the algorithm by the before/after the 
 split thing, I wonder if there is such a need to start off with a prime?
 
 Regards, Keith
 
 PS I'm getting a bit Tony^H^H^H^Hverbose nowadays.
 


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Martin Phillips
Hi,

The various suggestions about setting the minimum modulus to reduce overflow 
are all very well but effectively you are turning a dynamic file into a static 
one, complete with all the continual maintenance work needed to keep the 
parameters in step with the data.

In most cases, the only parameter that is worth tuning is the group size to try 
to pack things nicely. Even this is often fine left alone though getting it to 
match the underlying o/s page size is helpful.

I missed the start of this thread but, unless you have a performance problem or 
are seriously short of space, my recommendation would be to leave the dynamic 
files to look after themselves.

A file without overflow is not necessarily the best solution. Winding the split 
load down to 70% means that at least 30% of the file is dead space. The 
implication of this is that the file is larger and will take more disk reads to 
process sequentially from one end to the other.
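
A rough Python sketch of that trade-off, using the record data and ID byte counts reported earlier in this thread. It counts primary groups only and ignores overflow buffers and large records, so treat it as an approximation; the function name is illustrative.

```python
import math

def groups_to_scan(data_bytes, load, group_size=4096):
    """Approximate number of primary groups a full sequential pass must read."""
    return math.ceil(data_bytes / (load * group_size))

data_bytes = 287789178 + 21539538   # record data + IDs from the listing in this thread
at_80 = groups_to_scan(data_bytes, 0.80)
at_70 = groups_to_scan(data_bytes, 0.70)
extra = 100 * (at_70 / at_80 - 1)
print(at_80, at_70, f"{extra:.0f}% more groups at 70%")
```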


Martin Phillips
Ladybridge Systems Ltd
17b Coldstream Lane, Hardingstone, Northampton NN4 6DB, England
+44 (0)1604-709200





Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Rick Nuckolls
Chris,

I still am wondering what is prompting you to continue using the larger group 
size.

I think that Martin and the UV documentation are correct in this case; you 
would be as well or better off with the defaults.

-Rick


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Chris Austin

Rick,

You are correct, I should be using the smaller size (I just haven't changed it 
yet). Based on the reading I have done, you should only use the larger group 
size when the average record size is greater than 1000 bytes. 

As far as being better off with the defaults, that's basically what I'm trying 
to test (as well as learn how linear hashing works). I was able to reduce my 
overflow by 18% and I only increased my empty groups by a very small amount, as 
well as only increasing my file size by 8%. This in theory should be better for 
reads/writes than what I had before. 

To test the performance I need to write a ton of records, then capture the 
output and compare it using timestamps. 

Chris



Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Rick Nuckolls
Chris,

For the type of use that you described earlier (BASIC selects and reads), 
reducing overflow will have negligible performance benefit, especially compared 
to changing the GROUP.SIZE back to 1 (2048 bytes).  If you purge the file in 
relatively small percentages, then it will never merge anyway (because you will 
need to delete 20-30% of the file for that to happen with the merge load at 50%), 
so your optimum minimum modulus will probably be however large the file 
grows.  The overhead for a group split is not as bad as it sounds unless your 
updates/sec count is extremely high, such as during a copy.
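
The 20-30% figure can be checked with a little arithmetic. This is a sketch assuming the modulus stays fixed while records are deleted and that merging only starts once the actual load falls to the merge load; the function name is made up.

```python
def fraction_to_delete(actual_load, merge_load=0.50):
    """Share of the data that must go before the load drops to the merge threshold.

    Assumes the modulus stays fixed while records are deleted.
    """
    if actual_load <= merge_load:
        return 0.0
    return 1.0 - merge_load / actual_load

for load in (0.63, 0.70, 0.80):
    print(f"actual load {load:.0%}: delete about {fraction_to_delete(load):.0%}")
```

With the actual load somewhere between 63% and 70%, the result falls inside the 20-30% range.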

If you do regular SELECT and SCANS of the entire file, then your goal should be 
to reduce the total disk size of the file, and not worry much about common 
overflow. The important thing is that the file is dynamic, so you will never 
encounter the issues that undersized statically hashed files develop.

We have thousands of dynamically hashed files on our (Solaris) systems, with an 
extremely low problem rate.

Rick


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Wjhonson

The hardware look-ahead of the disk drive reader will grab consecutive 
frames into memory, since it assumes you'll want the next frame next.
So the less overflow you have, the faster a full file scan will become.
At least that's my theory ;)





Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Charles Stevenson

Chris,

I can appreciate what you are doing as an academic exercise.

You seem happy with how it looks at this moment, where, because you set  
MINIMUM.MODULUS  118681, you ended up with a current load of 63%.
But think about it:  as you add records, the load will reach 70%, per 
SPLIT.LOAD 70,  then splits will keep occurring and the current modulus will 
grow past 118681.  MINIMUM.MODULUS will never matter again.  (This was 
described as an ever-growing file.)


If the current config is what you want, why not just set SPLIT.LOAD 63  
MINIMUM.MODULUS 1.   That way the ratio that you like today will stay 
like this forever.


MINIMUM.MODULUS will not matter unless data is deleted.  It says to not 
shrink the file structure below that minimally allocated disk space, 
even if there is no data to occupy it.  That's really all 
MINIMUM.MODULUS is for.
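A rough way to see why MINIMUM.MODULUS stops mattering in a growing file is the sketch below. It uses a deliberately simplified model, assumed here for illustration only: load is data bytes divided by (modulus * group size), and the file always has just enough groups to stay at or below the split load. Real dynamic files split one group at a time and count overheads that this ignores; the function names are hypothetical.

```python
def natural_modulus(data_bytes, split_load_pct, group_bytes=4096):
    """Smallest modulus that keeps the load at or below the split load."""
    return -(-data_bytes * 100 // (split_load_pct * group_bytes))  # ceiling division


def effective_modulus(data_bytes, split_load_pct, minimum_modulus, group_bytes=4096):
    """Modulus in the simplified model: splits grow it, the minimum only floors it."""
    return max(natural_modulus(data_bytes, split_load_pct, group_bytes), minimum_modulus)


if __name__ == "__main__":
    group = 4096
    # Data grows from 63% to 84% of what 100 groups can hold.
    for filled_groups in (63, 70, 77, 84):
        data = filled_groups * group
        print(filled_groups, effective_modulus(data, 70, minimum_modulus=100))
```

Once the data outgrows what the minimum can hold at the split load, the result depends only on SPLIT.LOAD, which is the point made above.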


Play with it all you want, because it puts you in a good place when some 
crisis happens.  At the end of the day, with this file, you'll find your 
tuning won't matter much.  Not a lot of help, but not much harm if you 
tweak it wrong, either.



On 7/5/2012 1:20 PM, Chris Austin wrote:

Rick,

You are correct, I should be using the smaller size (I just haven't changed it 
yet). Based on the reading I have done you should
only use the larger group size when the average record size is greater than 
1000 bytes.

As far as being better off with the defaults that's basically what I'm trying 
to test (as well as learn how linear hashing works). I was able
to reduce my overflow by 18% and I only increased my empty groups by a very 
small amount as well as only increased my file size
by 8%. This in theory should be better for reads/writes than what I had before.

To test the performance I need to write a ton of records and then capture the 
output and compare the output using timestamps.
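A timing comparison of the kind described here could be structured like the Python sketch below. It is only an outline: `time_calls` and the dictionary write are hypothetical stand-ins, and the actual UniVerse reads and writes being measured are not shown in this thread.

```python
import time


def time_calls(operation, repetitions):
    """Run operation(i) for i in range(repetitions) and return elapsed seconds."""
    start = time.perf_counter()
    for i in range(repetitions):
        operation(i)
    return time.perf_counter() - start


if __name__ == "__main__":
    store = {}

    def write_record(i):
        # Placeholder for a real record write against the file under test.
        store[i] = "x" * 250

    elapsed = time_calls(write_record, 100_000)
    print(f"{elapsed:.3f} seconds for {len(store)} writes")
```

Running the same harness before and after a RESIZE, on an otherwise idle system, gives numbers that can be compared directly.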

Chris


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Wols Lists
On 05/07/12 16:12, Martin Phillips wrote:
 A file without overflow is not necessarily the best solution. Winding the 
 split load down to 70% means that at least 30% of the file
 is dead space. The implication of this is that the file is larger and will 
 take more disk reads to process sequentially from one end
 to the other.

Whoops Martin, I think you've made the classic percentages mistake here ...

The file is 30/70, or 42% dead space at least. A file with the default
80% split is at least 25% dead space.
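The two percentages in this exchange can both be derived from the same split load; they differ only in what the unused space is divided by. A small Python sketch, with illustrative function names and assuming the file sits exactly at its split load:

```python
from fractions import Fraction


def unused_share_of_file(split_load_pct):
    """Unused space as a share of the whole file (used + unused)."""
    return 1 - Fraction(split_load_pct, 100)


def unused_relative_to_data(split_load_pct):
    """Unused space as a share of the space actually holding data."""
    load = Fraction(split_load_pct, 100)
    return (1 - load) / load


if __name__ == "__main__":
    for pct in (70, 80):
        whole = float(unused_share_of_file(pct))
        versus_data = float(unused_relative_to_data(pct))
        print(f"split {pct}%: {whole:.1%} of the file, {versus_data:.1%} relative to data")
```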

Cheers,
Wol
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


[U2] The words to the Pick systems rap

2012-07-05 Thread Wjhonson

http://books.google.com/books?id=ShGYef744mgC&pg=PA41


To add a little humour


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Wols Lists
On 05/07/12 14:49, Chris Austin wrote:
 
 Disk space is not a factor, as we are a smaller shop and disk space comes 
 cheap. However, one thing I did notice is when I increased the modulus to a 
 very large
 number which then increased my disk space to about 3-4x of my record data, my 
 SELECT queries were slower. 
 
 Are the 2 factors when choosing HOW the file is used based on whether you're 
 using?
 
 1) a lot of SELECTS (then looping through the records) 

Is that a BASIC select, or a RETRIEVE select?

 2) grabbing individual records (not using a SELECT)
 
 With this file we really do a lot of SELECTS (option 1), then loop through 
 the records. With that being said and based on the reading I've done here it 
 would appear it's better to have a little overflow
 and not use up so much disk space for modulus (groups) for this application 
 since we do use a lot of SELECT queries. Is this correct?

If your selects are BASIC selects, then you won't notice too much
difference. If they are RETRIEVE selects, then reducing SPLIT will
increase the cost of the SELECT.

In both cases, if the RETRIEVE select is not BY, then the cost of
processing the list should not be seriously impacted.

(On a SELECT WITH index, however, reducing overflow will speed things up
a bit, probably not an awful lot.)
 
 Most of my records are ~ 250 bytes, there's a handful that are 'up to 512 
 bytes'. 
 
 It would seem to me that I would want to REDUCE my split to ~70% to reduce 
 overflow, and maybe increase my MINIMUM.MODULUS to a # a little bit bigger 
 than my current modulus (~10% bigger) since this
 will be a growing file and will never merge. In my case using the formula 
 might not make sense since this file will never merge. Does this make sense?
 
If the file will only ever grow, then MINIMUM.MODULUS is probably a
waste of time. You are best using that in one of two circumstances,
either (a) you are populating a file with a large number of initial
records and you are forcing the modulus to what it's likely to end up
anyway, or (b) your file grows and shrinks violently in size, and you
are forcing it to its typical state.

The first scenario simply avoids a bunch of inevitable splits, the
second avoids a yoyo split/merge/split scenario.

I'd just leave the settings at 80/20, and only use MINIMUM.MODULUS if I
was creating a copy of the file (setting the new minimum at the current
modulus of the existing file).

Cheers,
Wol
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Rick Nuckolls
Most disks and disk systems cache huge amounts of information these days, and, 
depending on 20 factors or so, one solution will be better than another for a 
given file.

For the wholesale SELECT F WITH, the fewest disk records will almost 
always win. For files that have ~10 records/group and have ~10% of the groups 
overflowed, perhaps 1% of record reads will do a second read for the 
overflow buffer because the target key was not in the primary group.  Writing a 
new record would possibly hit the 10% mark for reading overflow buffers. But 
lowering the split.load will increase the number of splits slightly, and 
increase the total number of groups considerably.  What you have shown is that 
you need to increase the modulus (and select time) of a large file by more 
than 10% in order to decrease the read and update times for your records 0.5% 
of the time (assuming that you have only reduced the number of overflow groups 
by ~50%).
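The 1% and 0.5% figures above follow from a short piece of arithmetic, sketched here in Python. The sketch assumes that each overflowed group has spilled roughly one of its ~10 records into the overflow buffer and that reads are spread evenly over records; that assumption and the function name are an illustrative reading of the estimate, not something stated in the thread.

```python
from fractions import Fraction


def extra_read_share(overflowed_group_share, records_per_group, spilled_per_group=1):
    """Share of keyed reads that need a second disk read for the overflow buffer."""
    return overflowed_group_share * Fraction(spilled_per_group, records_per_group)


if __name__ == "__main__":
    before = extra_read_share(Fraction(10, 100), 10)  # ~10% of groups overflowed
    after = extra_read_share(Fraction(5, 100), 10)    # overflowed groups halved
    print(f"before: {float(before):.1%}, after: {float(after):.1%}, "
          f"saved: {float(before - after):.1%}")
```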

As Charles suggests, this is an interesting exercise, but your actual results 
will rapidly change if you actually add /remove records from your file, change 
the load or number of files on your system, put in a new drive, cpu, memory 
board, or install a new release of Universe, move to raid, etc.

-Rick

-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wjhonson
Sent: Thursday, July 05, 2012 2:38 PM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] RESIZE - dynamic files


The hardware look-ahead of the disk drive reader will grab consecutive 
frames into memory, since it assumes you'll want the next frame next.
So the less overflow you have, the faster a full file scan will become.
At least that's my theory ;)




-Original Message-
From: Rick Nuckolls r...@lynden.com
To: 'U2 Users List' u2-users@listserver.u2ug.org
Sent: Thu, Jul 5, 2012 2:29 pm
Subject: Re: [U2] RESIZE - dynamic files


Chris,

For the type of use that you described earlier; BASIC selects and reads, 
reducing overflow will have negligible performance benefit, especially compared 
to changing the GROUP.SIZE back to 1 (2048 bytes).  If you purge the file in 
relatively small percentages, then it will never merge anyway (because you will 
need to delete 20-30% of the file for that to happen with the mergeload at 50%), 
so your optimum minimum modulus solution will probably be however large it 
grows.  The overhead for a group split is not as bad as it sounds unless your 
updates/sec count is extremely high, such as during a copy.

If you do regular SELECT and SCANS of the entire file, then your goal should be 
to reduce the total disk size of the file, and not worry much about common 
overflow. The important thing is that the file is dynamic, so you will never 
encounter the issues that undersized statically hashed files develop.

We have thousands of dynamically hashed files on our (Solaris) systems, with an 
extremely low problem rate.

Rick
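The "20-30% of the file" estimate for reaching a 50% merge load can be checked with a little arithmetic. The Python sketch below assumes a simplified model in which the modulus stays fixed while records are deleted, so load falls in proportion to the data removed; the function name is hypothetical.

```python
from fractions import Fraction


def share_to_delete_before_merge(current_load_pct, merge_load_pct=50):
    """Share of the data that must go before load drops to the merge load."""
    if current_load_pct <= merge_load_pct:
        return Fraction(0)
    return 1 - Fraction(merge_load_pct, current_load_pct)


if __name__ == "__main__":
    for load in (63, 70, 80):
        share = float(share_to_delete_before_merge(load))
        print(f"load {load}%: delete about {share:.1%} before merging starts")
```

Under that model a file sitting between 63% and 70% load needs roughly a fifth to a little over a quarter of its data removed, which is in line with the range quoted.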

 From: r...@lynden.com
 To: u2-users@listserver.u2ug.org
 Date: Thu, 5 Jul 2012 09:22:02 -0700
 Subject: Re: [U2] RESIZE - dynamic files
 
 Chris,
 
 I still am wondering what is prompting you to continue using the larger group 
 size.
 
 I think that Martin, and the UV documentation, are correct in this case; you 
 would be as well or better off with the defaults.
 
 -Rick
 

Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Wjhonson

A BASIC SELECT cannot use criteria at all.
It is going to walk through every record in the file, in order.
And that's the sticky wicket. That whole in order business.
The disk drive controller has no clue on linked frames, but it *will* do 
optimistic look aheads for you.
So you are much better off, for BASIC SELECTs having nothing in overflow, at 
all. :)
That way, when you go to ask for the *next* frame, it will always be 
contiguous, and already sitting in memory.









Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Rick Nuckolls
This will be mostly true if the full extent of the file was allocated at one 
time as a contiguous block, which could be a big plus.
As a file grows, sectors will be allocated piecemeal and when the hardware 
reads ahead, it will not necessarily be reading sectors in the same file.
Curiously, an old Pr1me CAM file had a trick around this, though it was late 
coming onto the scene.  Unix also has a few tricks, but they are only partial 
solutions to file fragmentation.  And Windows

Rick


Re: [U2] RESIZE - dynamic files

2012-07-05 Thread Chris Austin

That's what we use, 'BASIC SELECT' statements for this table, looping through 
records to build reports. It's an accounting table that has about 200-300 
record WRITES a day, with an average of about 250 bytes per record. We 
obviously have more READ operations since we are always building up these 
reports, so I was hoping my #'s looked right. 

1) I reduced overflow by 18%.
2) I only increased file size ~8%.

So we do a combination of BASIC SELECTS and WRITES. Everything is done in the 
latest version of Rocket's Universe, PICK using BASIC for our programs that 
contain the SELECTS.

Chris

