On Fri, 16 Mar 2012, Giovanni Di Milia wrote:
> I also checked the database where we are uploading all 10M records and
> right now, after almost 6M records uploaded, the same files are 1.4G
> for the MYD and 1.2G for the MYI.

Thanks for your tests. The sizes seem perfectly reasonable given your
instance size and your available RAM.

For MARC field values that vary considerably from the beginning, such as
titles, or for values that are short, such as authors, it does not matter
much how far we increase the limit globally. For big tables it may give
some increase; e.g. for the abstract field (bib52x) on the CDS instance
of Invenio, indexing 85 instead of 35 leading characters raised the index
size from 47M to 84M, but this is totally acceptable.

The total size of all bib[0-9][0-9]x.MYI indexes for the CDS instance is
427M; for the INSPIRE instance it is 597M. If we increase the indexing to
say 100 leading characters, the indexes may go up to 800M or thereabouts,
I would guess, which is totally acceptable.

So I'd say let's indeed increase the limit for all bibxxx tables
globally, from 35 to say 100, which should give us better-prepared
Invenio defaults for instances in `unknown' situations, where any kind of
value list may go into any MARC tag.

BTW, you seem to have a limit of 200 for bib99x now. I think your URLs
may all fit well into 100, won't they? Can you please do one more test:

    CREATE TABLE test_bib99x LIKE bib99x;
    ALTER TABLE test_bib99x DROP INDEX kv, ADD INDEX kv (value(100));
    INSERT INTO test_bib99x SELECT * FROM bib99x;
    OPTIMIZE TABLE bib99x;
    OPTIMIZE TABLE test_bib99x;

and check the sizes of the `bib99x.MYI' and `test_bib99x.MYI' files?
(And perhaps also check the querying/insertion speed, if time permits.)

(E.g. on the INSPIRE TEST instance, which has 2M rows in bib99x, going
from 35 to 100 for references increased bib99x.MYI from 50M to 53M only;
but we don't store big similar URLs there like you do.)

Summa summarum, I'll commit a global change to have the new kv index
default value of 100 everywhere.

Best regards
-- 
Tibor Simko
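
For reference, the size comparison and the "do the URLs fit into 100" question could also be checked from within MySQL rather than by looking at the files on disk. The following is only a rough sketch: it assumes MyISAM tables, that the test_bib99x copy from the test above exists, and that the indexed column is named `value` as in the ALTER TABLE statement. The column aliases are made up for illustration, and the last two queries scan the whole table, so they may be slow on a large instance.

```sql
-- Sketch only: assumes MySQL/MyISAM and the test_bib99x copy created above.

-- 1. Data and index size per table in MB, as reported by the server.
--    INDEX_LENGTH covers all indexes of the table, i.e. the whole .MYI file.
SELECT TABLE_NAME,
       ROUND(DATA_LENGTH  / 1024 / 1024, 1) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 1) AS index_mb
  FROM information_schema.TABLES
 WHERE TABLE_SCHEMA = DATABASE()
   AND TABLE_NAME IN ('bib99x', 'test_bib99x');

-- 2. Length of the longest stored value, and how many values
--    are longer than the proposed 100-character prefix.
SELECT MAX(CHAR_LENGTH(value))       AS longest_value,
       SUM(CHAR_LENGTH(value) > 100) AS values_over_100
  FROM bib99x;

-- 3. How well each prefix length tells the values apart,
--    compared with the number of distinct full values.
SELECT COUNT(DISTINCT LEFT(value, 35))  AS distinct_at_35,
       COUNT(DISTINCT LEFT(value, 100)) AS distinct_at_100,
       COUNT(DISTINCT value)            AS distinct_full
  FROM bib99x;
```

If distinct_at_100 comes out close to distinct_full while distinct_at_35 is much lower, that would suggest the longer prefix really helps for this table; if all three are close, the shorter prefix was already sufficient.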

