#972: bibreformat to recreate cache for modified formats
------------------------+------------------------
Reporter: skaplun | Owner:
Type: enhancement | Status: new
Priority: minor | Component: BibFormat
Version: | Resolution:
Keywords: |
------------------------+------------------------
Comment (by jcaffaro):
Here are some of the comments discussed IRL about this task:
* one would have to keep track of the modification time of the templates
and output formats: the formats are currently stored as files on disk, and
relying solely on the modification timestamp might trigger false positives
(in case files are just "touched", via a "make install", etc.). An
implementation would, e.g., have to maintain a hash of each template
and output format to detect whether it was modified. (Keep in mind
that formats (output formats and templates) can be modified
both from the admin interface and via the CLI.)
* such an implementation would usually reformat more records than needed,
due to the lack of knowledge of what has changed in the format (some
updates might concern only a fraction of the records using the modified
format: an admin would know which records would not need to be reformatted,
if any, while an automated procedure would do some brute-force
reformatting). Modifications can sometimes even result in no change in the
output (when doing some code cleaning, etc.).
* such an implementation would have to consider the possibility of delaying
the reformatting to more appropriate hours of the day: currently long
reformatting tasks are scheduled by the admin to run at times when the load
on the system is lower.
* it might be tricky (and computationally heavy) to automatically select
the records that need to be updated: due to the rule-based system behind
output formats, it is easy to retrieve the template to apply to a given
record, but less straightforward to retrieve the records that are
concerned by a given template. (This task is similar to the assignment of
records to collections by WebColl, except that a record should only
belong to the first collection that matches it. One could also imagine
keeping track, for each record, of the template that was used to format it,
but this would miss any format template reassignment after a modification
of an output format.)
* because of the above, it should be carefully considered whether such a
behaviour should be the default of bibreformat, or whether there should be
some other option to be used with bibreformat to enable that behaviour. For
example, with:
{{{
$ bibreformat --update-modified-formats --runtime="2002-10-27 23:00:00"
}}}
which would be run by the admin after a template or output format is
modified. It should also be considered whether such an option would be
useful, or whether the admin would bypass it anyway by simply running a
standard bibreformat for the specifically affected records. Alternatively, as
proposed IRL, it might be enough to remind the admin about the need to run
bibreformat after each modification of a template through the web
interface.
* such an implementation would have to be considered in view of a possible
future replacement of bibreformat by a lazy reformatting/caching of the
records.
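
To illustrate the hash idea from the first point, here is a minimal sketch
(not existing BibFormat code). It assumes that templates and output formats
live as "*.bft" and "*.bfo" files in a single directory, and that the digests
can be kept in a JSON manifest next to them; the function names and the
manifest file are hypothetical. Because only file content is hashed, a
"touch" or a "make install" that rewrites identical files is not reported.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_modified_formats(format_dir, manifest_path, patterns=("*.bft", "*.bfo")):
    """Compare current content digests with those stored in the manifest.

    Returns (modified, current), where `modified` is the sorted list of
    file names that are new or whose content changed, and `current` maps
    each file name to its fresh digest.
    """
    manifest_path = Path(manifest_path)
    previous = {}
    if manifest_path.exists():
        previous = json.loads(manifest_path.read_text())

    current = {}
    for pattern in patterns:  # assumed extensions, adjust as needed
        for path in sorted(Path(format_dir).glob(pattern)):
            current[path.name] = file_digest(path)

    modified = sorted(
        name for name, digest in current.items() if previous.get(name) != digest
    )
    return modified, current


def save_manifest(manifest_path, digests):
    """Persist the digests once the reformatting has been scheduled."""
    Path(manifest_path).write_text(json.dumps(digests, indent=2, sort_keys=True))
```

An option such as the proposed --update-modified-formats could call
find_modified_formats() first and exit early when the list is empty. The
sketch does not report deleted files, and it does not address the harder
problem of selecting which records are affected by a modified format.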
--
Ticket URL: <http://invenio-software.org/ticket/972#comment:1>
Invenio <http://invenio-software.org>