#816: BibUpload: smart uploading of record changes via 005 revision diffing
-------------------------+----------------------
Reporter: simko | Owner: vvenkatr
Type: enhancement | Status: new
Priority: major | Milestone:
Component: BibUpload | Version:
Keywords: |
-------------------------+----------------------
Once automatic generation of MARC tag 005, which carries the record
revision identifier, is in place (see ticket:815), we shall be able to
take advantage of the revision numbers at upload time.
Consider a situation where a cataloguer C1 works on the long author
list of some record R in its revision R1, using an external editor.
After two days of cleaning the cataloguer submits his/her changes back
to the system. However, in the meantime another cataloguer C2 worked
on the keywords of the document, and yet another cataloguer C3 amended
the subject categories, so that at the moment the cataloguer C1
submits the author changes back, the record is already in its revision
R3. Currently, one has to use a careful combination of the
append/correct/replace modes of the various jobs so that they do not
overwrite one another's changes.
Consider another situation, using the usual record editor. Currently
BibEdit always uses "replace" mode, which necessitates record locking.
This may result in queue-blocking situations and an interesting
interplay between live editing of records and batch uploading of
various jobs touching the same record, even though blocking the queue
should not always be necessary.
The goal of this ticket is to take advantage of the revision
information stored in MARC tag 005 of the incoming MARCXML, once
ticket:815 is implemented. In our example, the cataloguer C1 submits
a file featuring revision R1 in its MARC tag 005. BibUpload can
compare the revision R1 already present in the system against the
changes wanted by C1, take a diff, and apply it on top of the latest
revision R3, provided that there are no conflicts. (That is, provided
that none of the fields touched by the job the cataloguer C1 submitted
against R1 were touched in any later revision Rn.) In other words,
taking advantage of 005, bibupload would work internally like a git
differ/patcher between various versions and various wanted changes,
so to speak, the assumption being that it would usually be able to
resolve diffs/patches automatically without human intervention.
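
The diff-and-patch idea described above could be sketched roughly as
follows. This is only an illustration under simplifying assumptions,
not BibUpload code: a record is modelled as a plain dict mapping a
MARC tag to its list of field values, tag 005 itself is left out of
the dicts, conflicts are detected per whole tag, and the names
changed_tags and merge_revisions are hypothetical.

```python
def changed_tags(old, new):
    """Return the tags whose field values differ between two revisions."""
    return {tag for tag in old.keys() | new.keys()
            if old.get(tag) != new.get(tag)}


def merge_revisions(base, submitted, latest):
    """Apply the base->submitted changes on top of latest, tag by tag.

    base      -- the revision the cataloguer started from (R1)
    submitted -- the cataloguer's edited copy of base
    latest    -- the revision currently stored in the system (R3)
    """
    wanted = changed_tags(base, submitted)      # what the upload changes
    touched_since = changed_tags(base, latest)  # what changed meanwhile
    conflicts = wanted & touched_since
    if conflicts:
        # Assumption: any overlap is a conflict needing human attention.
        raise ValueError("conflicting tags: %s" % ", ".join(sorted(conflicts)))
    merged = dict(latest)
    for tag in wanted:
        if tag in submitted:
            merged[tag] = submitted[tag]   # tag added or modified
        else:
            merged.pop(tag, None)          # tag deleted by the upload
    return merged


# C1 cleans the authors (700) against R1, while C2 and C3 changed the
# keywords (653) and subject categories (650), leading to R3.
r1 = {"700": ["Smith, A", "Jones B"], "653": ["physics"], "650": ["hep-th"]}
c1_upload = {**r1, "700": ["Smith, A.", "Jones, B."]}
r3 = {**r1, "653": ["physics", "QCD"], "650": ["hep-ph"]}
merged = merge_revisions(r1, c1_upload, r3)
```

Here merged keeps the keyword and subject changes from R3 and takes the
author list from C1's upload. Had R3 also modified tag 700, the sketch
would raise an error instead of silently overwriting either change. A
real implementation would presumably compare at a finer granularity
(field instances, indicators, subfields) than whole tags.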
This has the advantage that one no longer has to be so careful about
using the "replace" mode in BibEdit or external BatchUpload jobs,
since it would be the core of the uploading system that would figure
out the changes wanted by various cataloguers to various revisions of
the record and that would apply them automatically whenever possible.
This will also limit the number of queue blockage situations, since it
will not always be necessary to block the editing queue.
(Note that this will have an effect on the "monotask" concept of
bibsched, as well as on the "quarantined records" branch. The changes
on bibsched will be part of another ticket.)
--
Ticket URL: <http://invenio-software.org/ticket/816>
Invenio <http://invenio-software.org>