Hi Tibor,
> If bibxxx values are bad, while collectionname values are good, then I'm
> afraid the only reasonable solution is to re-populate bibxxx tables via
> `bibupload -r foo.xml'.  A 60k record job should be OK for a night.  

This solution worked, thanks for the support! 
The script I used to generate the XML files: 
https://documents.epfl.ch/groups/i/in/infoscience-dev/www/scripts/clean_indexes.py
And the script to truncate all the indexes: 
https://documents.epfl.ch/groups/i/in/infoscience-dev/www/scripts/clean_indexes.py

I then followed the bibindex documentation to regenerate all the indexes after 
the bibupload process (2 hours).
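
For the record, truncating one of the word indexes by hand boils down to
statements of this kind (just a sketch, assuming the standard idxWORDxxF/R
table naming; the script linked above does this for all of the indexes):

  TRUNCATE TABLE idxWORD01F;
  TRUNCATE TABLE idxWORD01R;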


> If you have similar URL values, then you may want to enlarge indexes of
> some bibxxx tables for much faster bibuploading, see 35 vs 100 key
> limit:
> 
> | CREATE TABLE IF NOT EXISTS bib84x (
> |   id mediumint(8) unsigned NOT NULL auto_increment,
> |   tag varchar(6) NOT NULL default '',
> |   value text NOT NULL,
> |   PRIMARY KEY  (id),
> |   KEY kt (tag),
> |   KEY kv (value(35))
> | ) TYPE=MyISAM;
> | 
> | CREATE TABLE IF NOT EXISTS bib85x (
> |   id mediumint(8) unsigned NOT NULL auto_increment,
> |   tag varchar(6) NOT NULL default '',
> |   value text NOT NULL,
> |   PRIMARY KEY  (id),
> |   KEY kt (tag),
> |   KEY kv (value(100)) -- URLs usually need a larger index for speedy lookups
> | ) TYPE=MyISAM;
> 

Good idea, thanks for the suggestion!
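
For an existing installation I suppose the same effect can be obtained with an
ALTER TABLE along these lines (untested sketch; key length to taste):

  ALTER TABLE bib85x DROP KEY kv, ADD KEY kv (value(100));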

Best regards,
Greg

> Best regards
> -- 
> Tibor Simko ** CERN Document Server ** <http://cds.cern.ch/>

____________________________________________________________________

Gregory Favre
Coordinateur Infoscience
École Polytechnique Fédérale de Lausanne
KIS - DIT
Case Postale 121
CH-1015 Lausanne
+41 21 693 22 88
+41 79 599 09 06
[email protected]
http://plan.epfl.ch/?sciper=128933
____________________________________________________________________
