Hi All

I am trying to resize a large dynamic UniData file for a customer but am
struggling to determine the best modulo/separation figures to use. I
was under the impression that I should try to minimise the number of
overflow files, but my latest attempt seems to have made the file
worse. The records are historical, going back a number of years, and
vary widely in size.
Because we have had issues in the past when resizing large files in
place, our standard practice now is to create a new file (including
indexes) with the new modulo/separation and copy all the records into it.
The original file had 26,521,431 records and was 62 GB in size, with a
file structure of 25/2/33 (dat/idx/overflow part files). The new file
was created with a modulo/separation of 1300021/4, and having copied
roughly half the data I now have a file with 15,442,816 records, 28 GB,
and 5/1/22 (dat/idx/overflow).
The formula I used to get the new figures was:

    Records per block = (file block size - pointer array size)
                        / (average record length
                           + standard deviation from average
                           + average key length + 9)

    Modulo = total number of records / records per block
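For reference, the formula can be worked through as a quick sketch with the figures from the GUIDE.STATS.LIS below and the original file's record count. The 32-byte pointer array size is an assumed placeholder, not a figure from the post; substitute the actual block-header overhead for your UniData release.

```python
# Sketch of the records-per-block / modulo formula from the post,
# using the GUIDE.STATS.LIS figures. Pointer array size is ASSUMED.
import math

block_size = 4096           # File block size (from GUIDE.STATS.LIS)
pointer_array = 32          # ASSUMED pointer-array/header overhead per block
avg_record_len = 55.46      # Average record length
std_dev_record = 1511.47    # Standard deviation from average record length
avg_key_len = 13.88         # Average key length
per_record_overhead = 9     # Fixed per-record overhead from the formula

total_records = 26_521_431  # Record count of the original file

denom = avg_record_len + std_dev_record + avg_key_len + per_record_overhead
records_per_block = (block_size - pointer_array) / denom
modulo = math.ceil(total_records / records_per_block)

print(f"records per block ~ {records_per_block:.2f}")
print(f"suggested modulo  ~ {modulo:,}")
```

Note that with a record-length standard deviation (1511.47) so much larger than the average (55.46), the deviation term dominates the denominator, so under these assumptions the formula suggests a modulo far above the 1300021 actually used.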
I have paused the copy for now because I would like to know whether I
should start again or just continue. Any thoughts/assistance would be
appreciated. The GUIDE.STATS.LIS output from the new file is below.
Regards
Andrew
Basic statistics:
File type............................... Dynamic Hashing
File size
[dat001].............................. 1073737728
...
[dat005].............................. 1029955584
[over001]............................. 1073737728
...
[over022]............................. 861589504
File modulo............................. 1300021
File minimum modulo..................... 1300021
File split factor....................... 60
File merge factor....................... 40
File hash type.......................... 0
File block size......................... 4096
Free blocks in overflow file(s)......... 13
Group count:
Number of level 1 overflow groups....... 5715317
Primary groups in level 1 overflow...... 1298667
Record count:
Total number of records................. 15442816
Average number of records per group..... 11.88
Standard deviation from average......... 3.55
Record length:
Average record length................... 55.46
Standard deviation from average......... 1511.47
Key length:
Average key length...................... 13.88
Standard deviation from average......... 0.76
Data size:
Average data size....................... 79.34
Standard deviation from average......... 1533.38
Total data size......................... 1225203127
-------
u2-users mailing list
[email protected]
To unsubscribe please visit http://listserver.u2ug.org/