Ivan Masár created DS-1523:
------------------------------

             Summary: detection of duplicate items during import and submission
                 Key: DS-1523
                 URL: https://jira.duraspace.org/browse/DS-1523
             Project: DSpace
          Issue Type: New Feature
            Reporter: Ivan Masár


Users expressed the need for DSpace to detect whether an item they're about to 
import/submit already exists in the repository. This issue is trying to capture 
the requirements for this feature.

The major point here is the definition of a duplicate. Some users already have a 
strict definition, e.g. an equal value of a single metadata field 
(dc.identifier.uuid). Others may depend on the similarity of multiple metadata 
fields (e.g. dc.title, dc.issn), which may be expressed by Levenshtein distance, 
while the remaining fields may differ (e.g. different values in 
dc.contributor.author).
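
To make the two kinds of definition concrete, here is a minimal sketch in plain 
Java. It is not DSpace code: the class name DuplicateCriteria, the method names 
and the use of a Map as a stand-in for an item's metadata are all assumptions 
made for illustration, and the edit-distance threshold is arbitrary.

```java
import java.util.Map;

public class DuplicateCriteria {
    /** Classic dynamic-programming Levenshtein (edit) distance between two strings. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Strict definition: both items carry the same non-null value in one field. */
    static boolean sameField(Map<String, String> x, Map<String, String> y, String field) {
        String v = x.get(field);
        return v != null && v.equals(y.get(field));
    }

    /** Fuzzy definition: the field values are at most maxDistance edits apart. */
    static boolean similarField(Map<String, String> x, Map<String, String> y,
                                String field, int maxDistance) {
        String a = x.get(field);
        String b = y.get(field);
        return a != null && b != null && levenshtein(a, b) <= maxDistance;
    }
}
```

A site could combine these, e.g. require similarField on dc.title and dc.issn 
while ignoring dc.contributor.author entirely.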

This leads me to the conclusion that we need to provide a way for users to 
define their own method of comparison by means of a plugin. The disadvantage of 
this approach is that checking each imported item against all existing items 
using a user-defined (and possibly slow) method may slow down import, so the 
feature needs to be opt-in. Of course, we should provide implementations for 
some commonly used cases, like those mentioned above. The input to the 
comparison method should be the item DSO (so that its metadata and bitstreams 
can be read) with the parent object filled in, so that the search can be 
restricted to a community/collection to reduce its scope.
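
A rough sketch of what such a plugin contract might look like follows. It is 
self-contained Java and deliberately does not use the real DSpace API: the Item 
class, the DuplicateDetector interface and the findDuplicates method are 
hypothetical names, and a plain collection string stands in for the parent 
object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DuplicateCheckSketch {
    /** Stand-in for an item DSO: its metadata plus the collection that owns it. */
    static final class Item {
        final String collection;
        final Map<String, String> metadata;

        Item(String collection, Map<String, String> metadata) {
            this.collection = collection;
            this.metadata = metadata;
        }
    }

    /** Hypothetical plugin contract: sites supply their own comparison method. */
    interface DuplicateDetector {
        boolean isDuplicate(Item candidate, Item existing);
    }

    /**
     * Compares the candidate against existing items in the same collection only.
     * Passing a null detector models the opt-in default: no check is performed.
     */
    static List<Item> findDuplicates(Item candidate, List<Item> repository,
                                     DuplicateDetector detector) {
        List<Item> hits = new ArrayList<>();
        if (detector == null) {
            return hits; // feature not enabled
        }
        for (Item existing : repository) {
            if (!existing.collection.equals(candidate.collection)) {
                continue; // outside the restricted search scope
            }
            if (detector.isDuplicate(candidate, existing)) {
                hits.add(existing);
            }
        }
        return hits;
    }
}
```

A strict detector could then be a one-line lambda comparing dc.identifier.uuid. 
The linear scan is only for illustration; a real implementation would probably 
want an indexed search to keep import fast.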

Here are some recent discussions on this topic:
* http://dspace.2283337.n4.nabble.com/KE1019161-Import-Editing-Items-in-DSpace-and-evaluating-the-existence-with-an-other-value-then-the-iD-td4662400.html
* DS-1515
* http://dspace.2283337.n4.nabble.com/how-to-filter-the-repeat-items-with-import-tools-td4662729.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
