I have had some good experiences with KEYDATA. But I think KEYDATA will use slightly more disk space.
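If you want a quick sense of where that disk space is going, the UNIX listing of the dynamic file's directory tells the story, since a dynamic file is just a directory of datxxx and overxxx pieces. Something along these lines works; the path here is only an example for wherever your account lives:

    cd /path/to/account/H08.CR.FF.TEXT      # the dynamic file is a directory
    ls -l dat* over*                        # individual datxxx / overxxx sizes
    du -k dat*  | awk '{t+=$1} END {print "dat  KB:", t}'
    du -k over* | awk '{t+=$1} END {print "over KB:", t}'

The more of the total that sits in the overxxx files, the less the hashing is doing for you.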
The only time I had problems with KEYDATA was when the keys to the file did not hash well (meaningful keys with a '#' separator followed by a 2-digit number). If you have sequential keys, this should not be a problem. I think KEYDATA works better for dynamic files that have a small number of characters for a key and a large record. I don't see your key size info here. I don't think your current configuration will result in splitting; you probably never get enough keys into a group to cause a split.

I like to look at the UNIX listing of a dynamic file to see the relation of the datxxx to overxxx files. I have one KEYDATA dynamic file (our largest) that has 22 datxxx files and one over001 file. The size of the over001 file is 746 meg, so 99% of the data is in the dat portion of the file where the hashing should be able to get to it quickly. When I first converted this file to KEYDATA, the result of the rebuild had 19 datxxx files and one over001, so splitting has added 3 more dats: dat020, dat021 and dat022. The file pre-KEYDATA had 8 dats and 8 overs. Performance was not good prior to the KEYDATA conversion; it is very good now that I have it as KEYDATA, as almost none of the file is in overflow.

That said, KEYDATA can have its pitfalls. The bad file I had would split and split continually, and it was sucking up major disk space. When I looked at the GROUP.STAT, I would see chunks of groups with zero records in them; the meaningful keys would never hash to the new groups that we added to the file. I tried hash type 1, but that was even worse, so I went back to KEYONLY for those files.

One other thing I have found with dynamic files (I guess this would apply to large static files as well): indexes that grow a lot need to be rebuilt from scratch periodically. The file from above with 22 dats was just rebuilt. It started out with 7 idx segments; after the rebuild of the indexes, there were only 5 idx files. So the indexes compressed down and saved me close to 4 gig in space.

I would experiment with KEYDATA on the file you sent info on. I think it looks like a good candidate, especially if the file has a lot of data in the overxxx part of the file vs. the datxxx. The trade-off is performance vs. more disk space used.

This probably has gone on long enough. I could branch into a discussion about why I don't use memresize on dynamic files, but I think I covered that a while ago.

-Rod

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Jeffrey Butera
Sent: Monday, October 23, 2006 8:10 AM
To: u2-users@listserver.u2ug.org
Subject: [U2] Unidata dynamic file tuning

I'm seeking advice on tuning a dynamic file in Unidata. In short, the file has a numeric (sequential) key and contains only about 10 small fields: date, time, user, etc., and a single 'large' field: a block of text. The text can be anywhere from one sentence to pages in length, hence I know this is going to be a lumpy file, as the records will vary wildly in size. Records will never be deleted, and we add about 4000 new records/month. I converted this from static to dynamic knowing I'd hit the 2 gig limit in a couple of years.

The file is behaving fine; I'm just trying to see if I can find some better parameters for it, and dynamic files are my weakest area in Unidata.
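(In case anyone wants to pull the same numbers on their own files, everything below should be reproducible with the standard ECL reports, roughly:

    FILE.STAT H08.CR.FF.TEXT
    ANALYZE.FILE H08.CR.FF.TEXT

and GROUP.STAT H08.CR.FF.TEXT if you want the per-group detail, though that gets long on 15511 groups.)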
Here are the current parameters:

    File name (Dynamic File)               = H08.CR.FF.TEXT
    Number of groups in file (modulo)      = 15511
    Dynamic hashing, hash type             = 0
    Split/Merge type                       = KEYONLY
    Block size                             = 4096
    File has 15294 groups in level one overflow.
    Number of records                      = 133578
    Total number of bytes                  = 180532137
    Average number of records per group    = 8.6
    Standard deviation from average        = 1.1
    Average number of bytes per group      = 11639.0
    Standard deviation from average        = 6392.6
    Average number of bytes in a record    = 1351.5
    Average number of bytes in record ID   = 6.4
    Standard deviation from average        = 2080.9
    Minimum number of bytes in a record    = 51
    Maximum number of bytes in a record    = 105864
    Minimum number of fields in a record   = 14
    Maximum number of fields in a record   = 14
    Average number of fields per record    = 14.0

In particular, I'm curious about better choices for split/merge loads:

    Dynamic File name                      = H08.CR.FF.TEXT
    Number of groups in file (modulo)      = 15511
    Minimum groups of file                 = 15511
    Hash type = 0,  blocksize = 4096
    Split load = 20,  Merge load = 10
    Split/Merge type = KEYONLY

Any insight or comments appreciated.

--
Jeff Butera, Ph.D.
Administrative Systems
Hampshire College
[EMAIL PROTECTED]
413-559-5556

"...our behavior matters more than the beliefs that we profess."
   Elizabeth Deutsch Earle

-------
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/