#972: bibreformat to recreate cache for modified formats
------------------------+------------------------
Reporter:  skaplun      |       Owner:
    Type:  enhancement  |      Status:  new
Priority:  minor        |   Component:  BibFormat
 Version:               |  Resolution:
Keywords:               |
------------------------+------------------------

Comment (by jcaffaro):

 Here are some of the comments discussed IRL about this task:

  * one would have to keep track of the modification time of the templates
 and output formats: the formats are currently stored as files on disk, and
 relying solely on the modification timestamp might trigger false positives
 (in case files are just "touched", via a "make install", etc.).  An
 implementation would, for example, have to maintain a hash of each template
 and output format to determine whether it was modified. (Do not forget
 that formats (output formats and templates) can be modified both from the
 admin interface and via the CLI.)

  * such an implementation would usually reformat more records than needed,
 due to the lack of knowledge of what has changed in the format (an update
 might concern only a fraction of the records using the modified format: an
 admin would know which records, if any, do not need to be reformatted,
 while an automated procedure would do some brute-force reformatting).
 Modifications can sometimes even result in no change in the output (when
 doing some code cleaning, etc.).

  * such an implementation would have to consider the possibility of
 delaying the reformatting to more appropriate hours of the day: currently,
 long reformatting tasks are scheduled by the admin to run at times when the
 load of the system is lower.

  * it might be tricky (and computationally heavy) to automatically select
 the records that need to be updated: due to the rule-based system behind
 output formats, it is easy to retrieve the template to apply to a given
 record, but less straightforward to retrieve the records that are
 concerned by a given template. (This task is similar to the assignment of
 records to collections by WebColl, except that a record should only
 belong to the first collection that matches it. One could also imagine
 keeping track, for each record, of the template that was used to format it,
 but this would miss any format template re-assignment after a modification
 of an output format.)

  * because of the above, it should be carefully considered whether such a
 behaviour should be the default of bibreformat, or whether there should be
 some option to be used with bibreformat to enable that behaviour. For
 example:
 {{{
  $ bibreformat --update-modified-formats --runtime="2002-10-27 23:00:00"
 }}}
    which would be run by the admin after a template or output format is
 modified. It should also be considered whether such an option would be
 useful, or would anyway be bypassed by the admin just running a standard
 bibreformat for the specifically affected records. Alternatively, as
 proposed IRL, it might be enough to remind the admin about the need to run
 bibreformat after each modification of a template through the web
 interface.

  * such an implementation would have to be considered in view of a possible
 future replacement of bibreformat by a lazy reformatting/caching of the
 records.
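
 To make the first and fourth points more concrete, here is a rough Python
 sketch, not an actual BibFormat implementation. It assumes that templates
 and output formats live in a single directory as ".bft" and ".bfo" files,
 that the hashes are kept in a JSON state file, and that output format rules
 can be reduced to ordered (field, value, template) tuples. The function
 names find_modified_formats and records_using_template, the state file and
 the rule representation are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_modified_formats(format_dir, state_file, suffixes=(".bft", ".bfo")):
    """Return names of format files whose content changed since the last call.

    Hypothetical helper: it compares content hashes rather than timestamps,
    so files that are only "touched" (or reinstalled unchanged) are ignored.
    On the very first call every file is reported, since no hash is known.
    """
    state_path = Path(state_file)
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}
    current = {
        p.name: file_digest(p)
        for p in Path(format_dir).iterdir()
        if p.is_file() and p.suffix in suffixes
    }
    modified = sorted(
        name for name, digest in current.items() if previous.get(name) != digest
    )
    # Remember the new hashes for the next comparison.
    state_path.write_text(json.dumps(current, indent=2))
    return modified


def records_using_template(records, rules, template):
    """Return ids of records whose first matching rule selects `template`.

    Assumed, simplified model: `records` maps a record id to a dict of field
    values, and `rules` is an ordered list of (field, value, template)
    tuples, where field=None stands for the default rule. Every record has
    to be evaluated, which is the brute-force cost mentioned above.
    """
    selected = []
    for recid, fields in records.items():
        for field, value, rule_template in rules:
            if field is None or fields.get(field) == value:
                if rule_template == template:
                    selected.append(recid)
                break  # only the first matching rule counts
    return selected
```

 Calling find_modified_formats twice in a row on unchanged files returns an
 empty list the second time, even if the files were touched in between. The
 reverse lookup still has to walk through all the records, so it does not
 remove the cost discussed in the fourth point; it only illustrates the
 "first match wins" behaviour.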

-- 
Ticket URL: <http://invenio-software.org/ticket/972#comment:1>
Invenio <http://invenio-software.org>
