So is there a performance increase in BASIC SELECTs from reducing overflow? Some people are saying to reduce disk space to speed up the BASIC SELECT, while others say to reduce overflow. I'm a bit confused. All of our programs that read that table use a BASIC SELECT WITH..

For a BASIC SELECT, do you gain anything by reducing overflow?

Chris

> To: u2-users@listserver.u2ug.org
> From: wjhon...@aol.com
> Date: Thu, 5 Jul 2012 20:12:21 -0400
> Subject: Re: [U2] RESIZE - dynamic files
>
> A BASIC SELECT cannot use criteria at all.
> It is going to walk through every record in the file, in order.
> And that's the sticky wicket. That whole "in order" business.
> The disk drive controller has no clue on linked frames, but it *will* do
> optimistic look aheads for you.
> So you are much better off, for BASIC SELECTs, having nothing in overflow
> at all. :)
> That way, when you go to ask for the *next* frame, it will always be
> contiguous, and already sitting in memory.
>
> -----Original Message-----
> From: Rick Nuckolls <r...@lynden.com>
> To: 'U2 Users List' <u2-users@listserver.u2ug.org>
> Sent: Thu, Jul 5, 2012 4:43 pm
> Subject: Re: [U2] RESIZE - dynamic files
>
> Most disks and disk systems cache huge amounts of information these days,
> and, depending on 20 factors or so, one solution will be better than another
> for a given file.
>
> For the wholesale, SELECT F WITH...., the fewest disk records will almost
> always win. For files that have ~10 records/group and have ~10% of the groups
> overflowed, then perhaps 1% of record reads will do a second read for the
> overflow buffer because the target key was not in the primary group. Writing
> a new record would possibly hit the 10% mark for reading overflow buffers.
> But lowering the split.load will increase the number of splits slightly, and
> increase the total number of groups considerably. What you have shown is that
> you need to increase the modulus (and select time) of a large file more than
> 0% in order to decrease the read and update times for your records 0.5% of
> the time (assuming that you have only reduced the number of overflow groups
> by 50%).
>
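Rick's "perhaps 1%" figure can be reproduced with a rough back-of-the-envelope model. The sketch below is only illustrative: the function name is hypothetical, and the assumption that each overflowed group spills exactly one record into its overflow buffer is mine, not something stated in the thread.

```python
def second_read_fraction(records_per_group, overflowed_fraction, spilled_per_group):
    """Estimate the share of keyed reads that need a second (overflow) read.

    Assumes reads are spread evenly over all records, and that every
    overflowed group holds `spilled_per_group` records in its overflow buffer
    (an assumption made for illustration only).
    """
    # Average records per group held in groups that have not overflowed.
    normal = (1 - overflowed_fraction) * records_per_group
    # Average records per group held in overflowed groups (primary + spilled).
    overflowed = overflowed_fraction * (records_per_group + spilled_per_group)
    # Average records per group that sit in an overflow buffer.
    spilled = overflowed_fraction * spilled_per_group
    return spilled / (normal + overflowed)


# Figures from the message: ~10 records/group, ~10% of groups overflowed.
fraction = second_read_fraction(10, 0.10, 1)
print(f"{fraction:.2%}")  # prints 0.99%
```

With more records spilled per overflowed group the fraction rises roughly in proportion, so the result is only as good as that assumption.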
> As Charles suggests, this is an interesting exercise, but your actual results
> will rapidly change if you actually add/remove records from your file, change
> the load or number of files on your system, put in a new drive, CPU, memory
> board, or install a new release of UniVerse, move to RAID, etc.
>
> -Rick
>
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org
> [mailto:u2-users-boun...@listserver.u2ug.org]
> On Behalf Of Wjhonson
> Sent: Thursday, July 05, 2012 2:38 PM
> To: u2-users@listserver.u2ug.org
> Subject: Re: [U2] RESIZE - dynamic files
>
> The hardware "look ahead" of the disk drive reader will grab consecutive
> "frames" into memory, since it assumes you'll want the "next" frame next.
> So the less overflow you have, the faster a full file scan will become.
> At least that's my theory ;)
>
> -----Original Message-----
> From: Rick Nuckolls <r...@lynden.com>
> To: 'U2 Users List' <u2-users@listserver.u2ug.org>
> Sent: Thu, Jul 5, 2012 2:29 pm
> Subject: Re: [U2] RESIZE - dynamic files
>
> Chris,
>
> For the type of use that you described earlier, BASIC selects and reads,
> reducing overflow will have negligible performance benefit, especially
> compared to changing the GROUP.SIZE back to 1 (2048 bytes). If you purge the
> file in relatively small percentages, then it will never merge anyway
> (because you will need to delete 20-30% of the file for that to happen with
> the merge load at 50%), so your optimum minimum modulus solution will
> probably be "however large it grows". The overhead for a group split is not
> as bad as it sounds unless your updates/sec count is extremely high, such as
> during a copy.
>
> If you do regular SELECTs and scans of the entire file, then your goal should
> be to reduce the total disk size of the file, and not worry much about common
> overflow. The important thing is that the file is dynamic, so you will never
> encounter the issues that undersized statically hashed files develop.
>
> We have thousands of dynamically hashed files on our (Solaris) systems, with
> an extremely low problem rate.
>
> Rick
>
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org
> [mailto:u2-users-boun...@listserver.u2ug.org]
> On Behalf Of Chris Austin
> Sent: Thursday, July 05, 2012 11:21 AM
> To: u2-users@listserver.u2ug.org
> Subject: Re: [U2] RESIZE - dynamic files
>
> Rick,
>
> You are correct, I should be using the smaller size (I just haven't changed
> it yet). Based on the reading I have done, you should only use the larger
> group size when the average record size is greater than 1000 bytes.
>
> As far as being better off with the defaults, that's basically what I'm
> trying to test (as well as learn how linear hashing works). I was able to
> reduce my overflow by 18%, and I only increased my empty groups by a very
> small amount and my file size by 8%. This in theory should be better for
> reads/writes than what I had before.
>
> To test the performance I need to write a ton of records and then capture the
> output and compare the output using timestamps.
>
> Chris
>
> From: r...@lynden.com
> To: u2-users@listserver.u2ug.org
> Date: Thu, 5 Jul 2012 09:22:02 -0700
> Subject: Re: [U2] RESIZE - dynamic files
>
> Chris,
>
> I still am wondering what is prompting you to continue using the larger group
> size.
>
> I think that Martin and the UV documentation are correct in this case; you
> would be as well or better off with the defaults.
>
> -Rick
>
> On Jul 5, 2012, at 9:13 AM, "Martin Phillips" <martinphill...@ladybridge.com>
> wrote:
>
> > Hi,
> >
> > The various suggestions about setting the minimum modulus to reduce
> > overflow are all very well but effectively you are turning a dynamic file
> > into a static one, complete with all the continual maintenance work needed
> > to keep the parameters in step with the data.
> >
> > In most cases, the only parameter that is worth tuning is the group size to
> > try to pack things nicely.
> > Even this is often fine left alone, though getting it to match the
> > underlying o/s page size is helpful.
> >
> > I missed the start of this thread but, unless you have a performance
> > problem or are seriously short of space, my recommendation would be to
> > leave the dynamic files to look after themselves.
> >
> > A file without overflow is not necessarily the best solution. Winding the
> > split load down to 70% means that at least 30% of the file is dead space.
> > The implication of this is that the file is larger and will take more disk
> > reads to process sequentially from one end to the other.
> >
> > Martin Phillips
> > Ladybridge Systems Ltd
> > 17b Coldstream Lane, Hardingstone, Northampton NN4 6DB, England
> > +44 (0)1604-709200
> >
> > -----Original Message-----
> > From: u2-users-boun...@listserver.u2ug.org
> > [mailto:u2-users-boun...@listserver.u2ug.org]
> > On Behalf Of Chris Austin
> > Sent: 05 July 2012 15:19
> > To: u2-users@listserver.u2ug.org
> > Subject: Re: [U2] RESIZE - dynamic files
> >
> > I was able to drop from 30% overflow to 12% by making 2 changes:
> >
> > 1) changed the split from 80% to 70% (that alone reduced overflow by 10%)
> > 2) changed the MINIMUM.MODULUS to 118,681 (calculated this way ->
> >    [ (record data + id) * 1.1 * 1.42857 (70% split load) ] / 4096 )
> >
> > My disk size only went up 8%.
> >
> > My file looks like this now:
> >
> > File name .................. GENACCTRN_POSTED
> > Pathname ................... GENACCTRN_POSTED
> > File type .................. DYNAMIC
> > File style and revision .... 32BIT Revision 12
> > Hashing Algorithm .......... GENERAL
> > No. of groups (modulus) .... 118681 current ( minimum 118681, 140 empty,
> >                              14431 overflowed, 778 badly )
> > Number of records .......... 1292377
> > Large record size .......... 3267 bytes
> > Number of large records .... 180
> > Group size ................. 4096 bytes
> > Load factors ............... 70% (split), 50% (merge) and 63% (actual)
> > Total size ................. 546869248 bytes
> > Total size of record data .. 287789178 bytes
> > Total size of record IDs ... 21539538 bytes
> > Unused space ............... 237532340 bytes
> > Total space for records .... 546861056 bytes
> >
> > Chris
> >
> >> From: keith.john...@datacom.co.nz
> >> To: u2-users@listserver.u2ug.org
> >> Date: Wed, 4 Jul 2012 14:05:02 +1200
> >> Subject: Re: [U2] RESIZE - dynamic files
> >>
> >> Doug may have had a key bounce in his input
> >>
> >>> Let's do the math:
> >>>
> >>> 258687736 (Record Size)
> >>> 192283300 (Key Size)
> >>> ========
> >>
> >> The key size is actually 19283300 in Chris' figures.
> >>
> >> Regarding 68,063 being less than the current modulus of 82,850: I think
> >> the answer may lie in the splitting process.
> >>
> >> As I understand it, the first time a split occurs group 1 is split and its
> >> contents are split between new group 1 and new group 2. All the other
> >> groups effectively get 1 added to their number. The next split is group 3
> >> (which was 2) into 3 and 4, and so forth. A pointer is kept to say where
> >> the next split will take place and also to help sort out how to adjust the
> >> algorithm to identify which group matches a given key.
> >>
> >> Based on this, if you started with 1000 groups, by the time you have split
> >> the 500th time you will have 1500 groups. The first 1000 will be
> >> relatively empty, the last 500 will probably be overflowed, but not
> >> terribly badly. By the time you get to the 1000th split, you will have
> >> 2000 groups and they will, one hopes, be quite reasonably spread with very
> >> little overflow.
> >>
> >> So I expect the average access times would drift up and down in a cycle.
> >> The cycle time would get longer as the file gets bigger but the worst time
> >> would be roughly the same each cycle.
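The split cycle Keith describes above can be illustrated with a small simulation. This is a minimal sketch of textbook linear hashing, not UniVerse's actual implementation: groups are numbered from 0, each new group is appended at the end rather than renumbering the existing ones as in Keith's description, and the class and method names are hypothetical.

```python
class LinearHashFile:
    """Toy model of a linearly hashed file (textbook scheme, for illustration)."""

    def __init__(self, base_modulus):
        self.base = base_modulus      # group count at the start of this doubling round
        self.next_split = 0           # pointer to the next group that will split
        self.groups = [[] for _ in range(base_modulus)]

    def group_for(self, key_hash):
        """Map a hash value to a group, allowing for groups already split."""
        g = key_hash % self.base
        if g < self.next_split:
            # This group has already split in the current round, so the
            # doubled modulus decides between the old and the new group.
            g = key_hash % (2 * self.base)
        return g

    def insert(self, key_hash):
        self.groups[self.group_for(key_hash)].append(key_hash)

    def split(self):
        """Split the group under the pointer and advance the pointer."""
        old = self.next_split
        records = self.groups[old]
        self.groups[old] = []
        self.groups.append([])        # the new group is number base + old
        self.next_split += 1
        if self.next_split == self.base:
            # Every group of this round has split: the modulus has doubled.
            self.base *= 2
            self.next_split = 0
        for h in records:             # redistribute between old and new group
            self.groups[self.group_for(h)].append(h)


f = LinearHashFile(1000)
for h in range(10_000):
    f.insert(h)

for _ in range(500):
    f.split()
groups_after_500 = len(f.groups)

for _ in range(500):
    f.split()
groups_after_1000 = len(f.groups)

print(groups_after_500, groups_after_1000)  # prints 1500 2000
```

Starting from 1000 groups, 500 splits give 1500 groups and 1000 splits give 2000, matching the counts in Keith's message. Halfway through a round only some groups have been relieved of half their records, which is consistent with his expectation that access times drift up and down in a cycle.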
> >>
> >> Given the power of two introduced into the algorithm by the before/after
> >> the split thing, I wonder if there is such a need to start off with a
> >> prime?
> >>
> >> Regards, Keith
> >>
> >> PS I'm getting a bit Tony^H^H^H^Hverbose nowadays.
> >>
> >> _______________________________________________
> >> U2-Users mailing list
> >> U2-Users@listserver.u2ug.org
> >> http://listserver.u2ug.org/mailman/listinfo/u2-users
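For reference, the MINIMUM.MODULUS arithmetic quoted in the thread can be checked against the file statistics Chris posted. This is a hedged sketch: the function and variable names are hypothetical, the purpose of the 1.1 multiplier is not stated in the thread, and the "actual load" formula (record data plus IDs divided by modulus times group size) is my assumption about how that figure is derived.

```python
import math

# Figures copied from the file statistics quoted in the thread.
record_data_bytes = 287_789_178      # "Total size of record data"
record_id_bytes = 21_539_538         # "Total size of record IDs"
space_for_records = 546_861_056      # "Total space for records"
reported_unused = 237_532_340        # "Unused space"
group_size_bytes = 4096              # "Group size"
current_modulus = 118_681            # "No. of groups (modulus)"
split_load = 0.70                    # 70% split load
headroom_factor = 1.1                # the 1.1 multiplier in the quoted formula


def estimate_minimum_modulus(data, ids, group_size, split, headroom):
    """Apply the quoted formula: (data + id) * 1.1 / split load / group size."""
    return math.ceil((data + ids) * headroom / split / group_size)


estimate = estimate_minimum_modulus(
    record_data_bytes, record_id_bytes, group_size_bytes,
    split_load, headroom_factor,
)

# Assumed definition of the actual load: bytes stored over primary group space.
actual_load = (record_data_bytes + record_id_bytes) / (
    current_modulus * group_size_bytes
)

# Unused space as the remainder of the space reserved for records.
unused = space_for_records - record_data_bytes - record_id_bytes

print(estimate)               # close to, but not exactly, the 118,681 Chris used
print(f"{actual_load:.1%}")   # in line with the reported 63% actual load
print(unused == reported_unused)
```

The estimate lands within a few groups of the modulus Chris configured, and the unused-space figure in the listing is exactly the space for records minus data and IDs.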