Ciao Sam,

May I try to shift your interesting use case from the [*user > submission*] scenario to the [*library-collaborative-network > import*] scenario? It would become: “*Imagine an* [import or harvesting task] *that might allow a* [library of the collaborative network] *to modify records that might not have been created by the same* [task] *(say they are of other types, or they have been imported on the fly from another* [library of the collaborative network]*)*”

Please don't consider this scenario too strange. For example, it is the case of “*the incoming modification B*” (proposed by one library) that touches only the descriptive section of “*the original record A*” (created by another library, and already present in Invenio); the administrative part of the record (say: holdings, locations, …) has to be preserved by merging A and B.

Now, staying with your example of “*handles the author field as composed by a name and affiliation*”, and dealing with the case “*for every field in A that has a subfield $8 (i.e. origin), which happens to be mentioned in a corresponding field in B,* [..action..]”, my question is: by “*happens to be mentioned*”, do you mean that you also compare the *content* of the *field/subfield*, in addition to the tag/character? And, if so: will the comparison (between the same field/subfield of records A and B) also take into account all the occurrences of a repeatable field/subfield on both sides?

To explain why I ask: in a library collaborative network, this kind of merging (and comparison) is done on fields that also carry subfields with coded data or even IDs (such as the authority-list ID of the name, or the code of the library).

Thanks for your attention!

Cheers,
Cristian

On Tue, May 31, 2011 at 9:58 AM, Samuele Kaplun <[email protected]> wrote:

> Hi,
>
> this is an RFC for adding a new mode to bibupload that we might call
> --merge/-m, and that is very much oriented to the WebSubmit use case.
>
> Imagine a submission that might allow a user to modify records that
> might not have been created by the same submission (say they are of
> other types, or they have been imported on the fly from another
> repository).
>
> It is an unrealistic dream that the submission will handle any possible
> metadata that might already exist in the record. (E.g. say it
> nicely handles the author field as composed by a name and affiliation.
> What if the incoming record being modified has a dedicated subfield to
> store the email?)
>
> I would like to propose the addition of a new mode that should handle
> these cases (i.e. existing subfields not foreseen by the submission) in
> the smartest way, with the following algorithm. Let's call the original
> record A, and the incoming modification B:
>
>   * every field in A that does not exist in B is left
>     untouched (as in bibupload --correct).
>   * for every field in A that has a subfield $8 (i.e. origin), which
>     happens to be mentioned in a corresponding field in B, the field
>     is removed from A and the fields from B are taken (this is because
>     if they have the same origin, they are most probably managed in
>     a known way).
>   * all the other fields are taken in order of appearance
>     (as submission interfaces are usually gentle enough to keep the
>     order of fields stable):
>       * for every subfield in the field from A also existing in
>         the corresponding field from B, the value from the
>         subfield in B is taken and the value from A is
>         discarded.
>       * any other subfield in A that is not considered in B is
>         kept (this is so as not to lose additional information).
>       * if there are more fields in B than in A for a given tag,
>         these are also added to the record.
>       * if there are more fields in A than in B, these are not
>         copied over (this is the only destructive action).
>
> What do you think?
>
> Cheers,
> Sam
>
> --
> Samuele Kaplun
> Invenio Developer <http://invenio-software.org/>
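
To make the discussion concrete, the algorithm quoted above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not bibupload code: the record layout (a dict mapping a tag to a list of fields, each field a list of `(code, value)` pairs) and the names `merge_records` and `merge_fields` are hypothetical. In particular, "happens to be mentioned" is read here as "the same $8 value appears in some field of B with the same tag", which is exactly the point the question above asks to have clarified, and subfields are matched by code only, not by content.

```python
def merge_fields(a_field, b_field):
    """Merge one field pair: B wins for every subfield code it mentions,
    subfields of A whose code is absent from B are kept.
    Placing A's leftovers after B's subfields is an arbitrary choice."""
    b_codes = {code for code, _ in b_field}
    kept_from_a = [(code, value) for code, value in a_field if code not in b_codes]
    return list(b_field) + kept_from_a


def merge_records(a, b):
    """Hypothetical sketch of the proposed --merge mode.

    a, b: dict mapping tag -> list of fields; field = list of (code, value).
    Returns a new record; neither input is modified.
    """
    merged = {}
    for tag in sorted(set(a) | set(b)):
        a_fields = a.get(tag, [])
        b_fields = b.get(tag, [])

        # Fields of A that B does not mention are left untouched.
        if not b_fields:
            merged[tag] = [list(field) for field in a_fields]
            continue

        # Assumption: "same origin" means an identical $8 value on both sides.
        origins_a = {v for field in a_fields for c, v in field if c == "8"}
        origins_b = {v for field in b_fields for c, v in field if c == "8"}
        shared = origins_a & origins_b

        def has_shared_origin(field):
            return any(c == "8" and v in shared for c, v in field)

        # Same origin: A's fields are dropped, B's fields are taken as they are.
        result = [list(field) for field in b_fields if has_shared_origin(field)]

        # Everything else is paired by order of appearance.
        a_rest = [field for field in a_fields if not has_shared_origin(field)]
        b_rest = [field for field in b_fields if not has_shared_origin(field)]
        for index, b_field in enumerate(b_rest):
            if index < len(a_rest):
                result.append(merge_fields(a_rest[index], b_field))
            else:
                result.append(list(b_field))  # more fields in B than in A
        # Surplus fields of A are not copied (the only destructive step).

        merged[tag] = result
    return merged


if __name__ == "__main__":
    record_a = {
        "100": [[("a", "Doe, J."), ("u", "CERN"), ("m", "mail-of-doe")]],
        "700": [[("a", "Roe, R.")], [("a", "Poe, P.")]],
        "980": [[("a", "ARTICLE")]],
    }
    record_b = {
        "100": [[("a", "Doe, John"), ("u", "CERN")]],
        "700": [[("a", "Roe, Richard")]],
    }
    for tag, fields in merge_records(record_a, record_b).items():
        print(tag, fields)
```

In this toy example the extra `m` subfield of field 100 survives the merge, tag 980 is untouched, and the second 700 field of A is lost because B carries only one. A content-aware variant for the library-network case would replace the positional pairing with a match on an identifier subfield (for instance an authority ID), which this sketch deliberately does not attempt.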
