#671: BibUpload: optional use of bibxxx tables
--------------------------+----------------------
  Reporter:  simko        |      Owner:  bthiell
      Type:  enhancement  |     Status:  in_merge
  Priority:  major        |  Milestone:
 Component:  BibUpload    |    Version:
Resolution:               |   Keywords:
--------------------------+----------------------
Changes (by bthiell):

 * status:  assigned => in_merge


Comment:

 A first version of this is available on my GitHub:
 https://github.com/badzil/Invenio/tree/light_bibupload

 The configuration variable CFG_BIBUPLOAD_BIBXXX_TAGS is a comma-separated
 list of the tags that are stored in the bibxxx tables at upload time. It is
 recommended to keep storing 035, 037, 970 and 980 so that bibupload and
 webcoll run correctly. Depending on the collections' dbqueries, other tags
 might be necessary. If this variable is left empty, Invenio runs normally,
 i.e. it stores all tags. This should remain the default behavior.
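
 As a rough illustration of the intended semantics (not the code in the
 branch), the variable could be interpreted as below. The helper names
 parse_bibxxx_tags and split_fields, and the (tag, value) tuple layout for
 fields, are made up for this sketch:

```python
def parse_bibxxx_tags(cfg_value):
    """Return the set of tags to store at upload time, or None for 'all'.

    cfg_value mimics CFG_BIBUPLOAD_BIBXXX_TAGS: a comma-separated string.
    An empty value means the default behavior (store every tag).
    """
    tags = set(t.strip() for t in cfg_value.split(",") if t.strip())
    return tags or None


def split_fields(fields, cfg_value):
    """Split (tag, value) pairs into (store_now, deferred) lists.

    Hypothetical helper: a field is stored now if the first three
    characters of its tag appear in the configured list.
    """
    keep = parse_bibxxx_tags(cfg_value)
    if keep is None:
        # Empty configuration: regular Invenio behavior, nothing deferred.
        return list(fields), []
    store_now = [f for f in fields if f[0][:3] in keep]
    deferred = [f for f in fields if f[0][:3] not in keep]
    return store_now, deferred
```

 With the recommended value "035,037,970,980", a 245 field would end up in
 the deferred list, while 035 and 980 fields would be stored right away.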

 At index time, bibindex first populates the bibxxx tables and then
 continues with its regular business. The new code lives in
 bibindex.bibindex_bibxxx_manager; bibindex.bibindex_engine only gains a
 call to it.
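
 The ordering described here can be pictured with a small sketch. The
 function run_indexing and its two callback parameters are hypothetical
 stand-ins, not the actual interface of bibindex_bibxxx_manager or
 bibindex_engine:

```python
def run_indexing(record_ids, populate_bibxxx, index_record):
    """Fill in the bibxxx tables first, then index as usual.

    populate_bibxxx and index_record are placeholder callables that take
    a record ID; they stand in for the real Invenio routines.
    """
    record_ids = list(record_ids)
    # Step 1: populate the bibxxx tables for all records.
    for recid in record_ids:
        populate_bibxxx(recid)
    # Step 2: regular indexing, which can now rely on the bibxxx tables.
    for recid in record_ids:
        index_record(recid)
```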

 I tested the regular Invenio and the fast upload and the results are
 consistent:
  * regular upload took 16 minutes.
  * fast upload took 4 minutes and populating the bibxxx tables took 12
    minutes.
 The total time is the same (16 minutes), but the shorter upload time lets
 us move quicker with the initial upload while indexing our metadata in
 Solr.

 Comments are welcome.

-- 
Ticket URL: <https://invenio-software.org/ticket/671#comment:3>
Invenio <http://invenio-software.org>
