Re: [U2] RESIZE - dynamic files

Susan Lynch Fri, 06 Jul 2012 11:08:43 -0700

Chris,

10 years ago, when I was administering a UniVerse system, the answer would
have been "minimize both to the best of your ability".  But I don't know
how UniVerse has changed in the interim, during which time I have been
working on UniData systems, which are enormously different in their
handling of records in groups from any other Pick-type system I have ever
worked on (all of which were much more similar to UniVerse at that time).
And when last I administered a UniVerse system, there were no dynamic
files.


With that caveat, here are the factors:

1) a record in a UniVerse file that is stored in overflow takes 2 or more
disk reads to retrieve if you are retrieving it by id.  A Basic SELECT
(structured as in Will's example, with no quotes and no "WITH" criteria)
instead walks through the file group by group and reads every record.  For
an overflowed group it still takes 2 reads (or more, depending on how
deeply that group is in overflow) to get the data, but it would have done
the first read anyway to read the records in the primary buffer.  So for
the Basic SELECT, you probably want to minimize the number of groups read,
to the extent that you can do so without putting many of the groups into
overflow.

2) to add records to the file, you have to access the file by the record
id, which means hashing the id to the group, then walking through the
group to see if the id is already in use, and if not, adding the record to
the end of the data area in use.  So for that, you absolutely want to
minimize the amount of overflow, because overflow slows you down on the
'adds'.

3) any sort/select or query read of the database will be slowed down
significantly by overflow, but you said you don't do much of that anyway.

Susan M. Lynch
F. W. Davison & Company, Inc.
-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Chris Austin
Sent: 07/06/2012 12:56 PM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] RESIZE - dynamic files


So is there a performance increase in BASIC SELECTS by reducing overflow?
Some people are saying to reduce disk space to speed up the BASIC SELECT
while others say to reduce overflow.. I'm a bit confused. All of our
programs that read that table use a BASIC SELECT WITH..

for a BASIC select do you gain anything by reducing overflow?

Chris


> To: u2-users@listserver.u2ug.org
> From: wjhon...@aol.com
> Date: Thu, 5 Jul 2012 20:12:21 -0400
> Subject: Re: [U2] RESIZE - dynamic files
>
>
> A BASIC SELECT cannot use criteria at all.
> It is going to walk through every record in the file, in order.
> And that's the sticky wicket. That whole "in order" business.
> The disk drive controller has no clue on linked frames, but it *will* do
> optimistic look aheads for you.
> So you are much better off, for BASIC SELECTs having nothing in
> overflow, at all. :)
> That way, when you go to ask for the *next* frame, it will always be
> contiguous, and already sitting in memory.
>
> -----Original Message-----
> From: Rick Nuckolls <r...@lynden.com>
> To: 'U2 Users List' <u2-users@listserver.u2ug.org>
> Sent: Thu, Jul 5, 2012 4:43 pm
> Subject: Re: [U2] RESIZE - dynamic files
>
>
> Most disks and disk systems cache huge amounts of information these days,
> and, depending on 20 factors or so, one solution will be better than another
> for a given file.
> For the wholesale, SELECT F WITH...., the fewest disk records will almost
> always win. For files that have ~10 records/group and have ~10% of the groups
> overflowed, then perhaps 1% of record reads will do a second read for the
> overflow buffer because the target key was not in the primary group. Writing
> a new record would possibly hit the 10% mark for reading overflow buffers.
> But lowering the split.load will increase the number of splits slightly, and
> increase the total number of groups considerably.  What you have shown is
> that you need to increase the modulus (and select time) of a large file more
> than [?]0% in order to decrease the read and update times for your records
> 0.5% of the time (assuming that you have only reduced the number of overflow
> groups by 50%.)
> As Charles suggests, this is an interesting exercise, but your actual results
> will rapidly change if you actually add/remove records from your file, change
> the load or number of files on your system, put in a new drive, cpu, memory
> board, or install a new release of Universe, move to raid, etc.
> -Rick
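
Rick's rough percentages can be reproduced with a little arithmetic. The sketch below is only an illustration: it assumes keys are read uniformly, and that in an overflowed group about one record in ten actually sits in the overflow buffer (that share is an assumption, not a figure from the thread). The function names are made up for the example.

```python
def keyed_read_extra_fraction(overflowed_groups, spilled_share):
    """Fraction of reads by id that need a second disk read.

    overflowed_groups: fraction of groups that have an overflow buffer
    spilled_share:     ASSUMED fraction of an overflowed group's records
                       that live in the overflow buffer
    """
    return overflowed_groups * spilled_share


def new_record_extra_fraction(overflowed_groups):
    """Fraction of new-record writes that must read an overflow buffer.

    Adding a record means scanning the whole group to confirm the id is
    unused, so every write that hashes to an overflowed group pays for it.
    """
    return overflowed_groups


reads = keyed_read_extra_fraction(0.10, 0.10)   # ~10% overflowed, ~1 in 10 spilled
writes = new_record_extra_fraction(0.10)

print(f"keyed reads needing a second read:    {reads:.1%}")
print(f"new-record writes touching overflow:  {writes:.1%}")
```

Under those assumptions the read penalty is about 1% and the write penalty about 10%, which is the shape of the trade-off Rick describes.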
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org
> [mailto:u2-users-boun...@listserver.u2ug.org]
> On Behalf Of Wjhonson
> Sent: Thursday, July 05, 2012 2:38 PM
> To: u2-users@listserver.u2ug.org
> Subject: Re: [U2] RESIZE - dynamic files
>
> The hardware "look ahead" of the disk drive reader will grab consecutive
> "frames" into memory, since it assumes you'll want the "next" frame next.
> So the less overflow you have, the faster a full file scan will become.
> At least that's my theory ;)
>
>
> -----Original Message-----
> From: Rick Nuckolls <r...@lynden.com>
> To: 'U2 Users List' <u2-users@listserver.u2ug.org>
> Sent: Thu, Jul 5, 2012 2:29 pm
> Subject: Re: [U2] RESIZE - dynamic files
>
> Chris,
> For the type of use that you described earlier, BASIC selects and reads,
> reducing overflow will have negligible performance benefit, especially
> compared to changing the GROUP.SIZE back to 1 (2048 bytes).  If you purge the
> file in relatively small percentages, then it will never merge anyway
> (because you will need to delete 20-30% of the file for that to happen with
> the mergeload at 50%), so your optimum minimum modulus solution will probably
> be "how ever large it grows".  The overhead for a group split is not as bad
> as it sounds unless your updates/sec count is extremely high, such as during
> a copy.
> If you do regular SELECT and SCANS of the entire file, then your goal should
> be to reduce the total disk size of the file, and not worry much about common
> overflow. The important thing is that the file is dynamic, so you will never
> encounter the issues that undersized statically hashed files develop.
> We have thousands of dynamically hashed files on our (Solaris) systems, with
> an extremely low problem rate.
> Rick
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org
> [mailto:u2-users-boun...@listserver.u2ug.org]
> On Behalf Of Chris Austin
> Sent: Thursday, July 05, 2012 11:21 AM
> To: u2-users@listserver.u2ug.org
> Subject: Re: [U2] RESIZE - dynamic files
>
> Rick,
> You are correct, I should be using the smaller size (I just haven't changed
> it yet). Based on the reading I have done you should only use the larger
> group size when the average record size is greater than 1000 bytes.
> As far as being better off with the defaults, that's basically what I'm
> trying to test (as well as learn how linear hashing works). I was able to
> reduce my overflow by 18% and I only increased my empty groups by a very
> small amount as well as only increased my file size by 8%. This in theory
> should be better for reads/writes than what I had before.
> To test the performance I need to write a ton of records and then capture
> the output and compare the output using timestamps.
> Chris
> From: r...@lynden.com
> To: u2-users@listserver.u2ug.org
> Date: Thu, 5 Jul 2012 09:22:02 -0700
> Subject: Re: [U2] RESIZE - dynamic files
>
> Chris,
>
> I still am wondering what is prompting you to continue using the larger
> group size.
>
> I think that Martin and the UV documentation are correct in this case; you
> would be as well or better off with the defaults.
>
> -Rick
>
> On Jul 5, 2012, at 9:13 AM, "Martin Phillips"
> <martinphill...@ladybridge.com> wrote:
>
> > Hi,
> >
> > The various suggestions about setting the minimum modulus to reduce overflow
> > are all very well but effectively you are turning a dynamic file into a
> > static one, complete with all the continual maintenance work needed to keep
> > the parameters in step with the data.
> >
> > In most cases, the only parameter that is worth tuning is the group size, to
> > try to pack things nicely. Even this is often fine left alone, though getting
> > it to match the underlying o/s page size is helpful.
> >
> > I missed the start of this thread but, unless you have a performance problem
> > or are seriously short of space, my recommendation would be to leave the
> > dynamic files to look after themselves.
> >
> > A file without overflow is not necessarily the best solution. Winding the
> > split load down to 70% means that at least 30% of the file is dead space.
> > The implication of this is that the file is larger and will take more disk
> > reads to process sequentially from one end to the other.
> >
> >
> > Martin Phillips
> > Ladybridge Systems Ltd
> > 17b Coldstream Lane, Hardingstone, Northampton NN4 6DB, England
> > +44 (0)1604-709200
> >
> >
> >
> > -----Original Message-----
> > From: u2-users-boun...@listserver.u2ug.org
> > [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Chris Austin
> > Sent: 05 July 2012 15:19
> > To: u2-users@listserver.u2ug.org
> > Subject: Re: [U2] RESIZE - dynamic files
> >
> >
> > I was able to drop from 30% overflow to 12% by making 2 changes:
> >
> > 1) changed the split from 80% to 70% (that alone reduced overflow by 10%)
> > 2) changed the MINIMUM.MODULUS to 118,681 (calculated this way ->
> >    [ (record data + id) * 1.1 * 1.42857 (70% split load) ] / 4096 )
> >
> > My disk size only went up 8%..
> >
> > My file looks like this now:
> >
> > File name ..................   GENACCTRN_POSTED
> > Pathname ...................   GENACCTRN_POSTED
> > File type ..................   DYNAMIC
> > File style and revision ....   32BIT Revision 12
> > Hashing Algorithm ..........   GENERAL
> > No. of groups (modulus) ....   118681 current ( minimum 118681, 140 empty,
> >                                            14431 overflowed, 778 badly )
> > Number of records ..........   1292377
> > Large record size ..........   3267 bytes
> > Number of large records ....   180
> > Group size .................   4096 bytes
> > Load factors ...............   70% (split), 50% (merge) and 63% (actual)
> > Total size .................   546869248 bytes
> > Total size of record data ..   287789178 bytes
> > Total size of record IDs ...   21539538 bytes
> > Unused space ...............   237532340 bytes
> > Total space for records ....   546861056 bytes
> >
> > Chris
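
For anyone wanting to redo Chris's MINIMUM.MODULUS arithmetic, here is a small sketch using the byte counts from the listing above. The 1.1 overhead factor and the division by the split load come from his formula; the function names and the "round up to the next prime" step are assumptions for illustration, since the thread does not say how he got from the formula to 118,681.

```python
import math

# Figures copied from the file listing above
GROUP_BYTES = 4096           # "Group size"
RECORD_DATA = 287_789_178    # "Total size of record data"
RECORD_IDS = 21_539_538      # "Total size of record IDs"
SPLIT_LOAD = 0.70            # 1 / 0.70 is the 1.42857 in Chris's formula


def estimated_modulus(data_bytes, id_bytes, split_load, group_bytes, overhead=1.1):
    """Groups needed to hold the data at the given split load."""
    needed_bytes = (data_bytes + id_bytes) * overhead / split_load
    return math.ceil(needed_bytes / group_bytes)


def next_prime(n):
    """Smallest prime >= n (plain trial division; fine at this size)."""
    def is_prime(k):
        return k >= 2 and all(k % d for d in range(2, math.isqrt(k) + 1))
    while not is_prime(n):
        n += 1
    return n


modulus = estimated_modulus(RECORD_DATA, RECORD_IDS, SPLIT_LOAD, GROUP_BYTES)
overflow_pct = 100 * 14431 / 118681   # overflowed groups / modulus, from the listing

print("formula result:", modulus)
print("next prime    :", next_prime(modulus))   # ASSUMED rounding step
print(f"overflowed    : {overflow_pct:.1f}% of groups")
```

With these inputs the formula lands a few groups short of 118,681, and rounding up to the next prime gives exactly that figure. Whether that is what Chris actually did is a guess.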
> >
> >
> >
> >> From: keith.john...@datacom.co.nz
> >> To: u2-users@listserver.u2ug.org
> >> Date: Wed, 4 Jul 2012 14:05:02 +1200
> >> Subject: Re: [U2] RESIZE - dynamic files
> >>
> >> Doug may have had a key bounce in his input
> >>
> >>> Let's do the math:
> >>>
> >>> 258687736 (Record Size)
> >>> 192283300 (Key Size)
> >>> ========
> >>
> >> The key size is actually 19283300 in Chris' figures
> >>
> >> Regarding 68,063 being less than the current modulus of 82,850.  I think
> >> the answer may lie in the splitting process.
> >>
> >> As I understand it, the first time a split occurs group 1 is split and its
> >> contents are split between new group 1 and new group 2.  All the other
> >> groups effectively get 1 added to their number. The next split is group 3
> >> (which was 2) into 3 and 4 and so forth. A pointer is kept to say where the
> >> next split will take place and also to help sort out how to adjust the
> >> algorithm to identify which group matches a given key.
> >>
> >> Based on this, if you started with 1000 groups, by the time you have split
> >> the 500th time you will have 1500 groups.  The first 1000 will be
> >> relatively empty, the last 500 will probably be overflowed, but not
> >> terribly badly.  By the time you get to the 1000th split, you will have
> >> 2000 groups and they will, one hopes, be quite reasonably spread with very
> >> little overflow.
> >>
> >> So I expect the average access times would drift up and down in a cycle.
> >> The cycle time would get longer as the file gets bigger but the worst time
> >> would be roughly the same each cycle.
> >>
> >> Given the power of two introduced into the algorithm by the before/after
> >> the split thing, I wonder if there is such a need to start off with a
> >> prime?
> >>
> >> Regards, Keith
> >>
> >> PS I'm getting a bit Tony^H^H^H^Hverbose nowadays.
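
Keith's 1000 -> 1500 -> 2000 example can be traced with a few lines of code. This is a sketch of textbook linear hashing, not UniVerse's actual implementation: groups are numbered from 0, and splitting group n moves some of its records to group n + base rather than renumbering the other groups as Keith describes. The names `group_for`, `split_once`, `base` and `split_ptr` are invented for the example.

```python
def group_for(hash_value, base, split_ptr):
    """Map a hashed key to a group number under textbook linear hashing."""
    g = hash_value % base
    if g < split_ptr:
        # This group has already split in the current round, so the key
        # lives either in group g or in its new partner, group g + base.
        g = hash_value % (2 * base)
    return g


def split_once(base, split_ptr):
    """Split the group at the pointer; return the new (base, split_ptr)."""
    split_ptr += 1
    if split_ptr == base:
        # Every group from the start of the round has split: the file has
        # doubled, and the pointer goes back to the first group.
        return 2 * base, 0
    return base, split_ptr


def current_modulus(base, split_ptr):
    return base + split_ptr


base, ptr = 1000, 0
for _ in range(500):
    base, ptr = split_once(base, ptr)
after_500 = current_modulus(base, ptr)

for _ in range(500):
    base, ptr = split_once(base, ptr)
after_1000 = current_modulus(base, ptr)

print("groups after  500 splits:", after_500)
print("groups after 1000 splits:", after_1000)
```

Halfway through a round, the groups that have already split each hold roughly half as much as the ones still waiting, which is the uneven load Keith expects to see come and go in a cycle.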
> >>
> >> _______________________________________________
> >> U2-Users mailing list
> >> U2-Users@listserver.u2ug.org
> >> http://listserver.u2ug.org/mailman/listinfo/u2-users
