[ https://issues.apache.org/jira/browse/SOLR-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646417#action_12646417 ]
Hoss Man commented on SOLR-799:
-------------------------------

bq. It seems like uniqueField should normally enforce uniqueness, regardless of what this component does.

Agreed. While I can imagine use cases for adding a signature field that is independent of the uniqueKey field (i.e. query-time duplicate pruning/collapsing), I'm having a really hard time thinking of any use case where someone would need special deletion logic on a (non-uniqueKey) signature field. If you want docs with identical signatures deleted, why wouldn't you make that the uniqueKey field? ... If you have both, you could really confuse the hell out of someone who doesn't understand why adding one doc deleted a different doc with a completely different uniqueKey.

> Add support for hash based exact/near duplicate document handling
> -----------------------------------------------------------------
>
>                 Key: SOLR-799
>                 URL: https://issues.apache.org/jira/browse/SOLR-799
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: SOLR-799.patch, SOLR-799.patch
>
>
> Hash based duplicate document detection is efficient and allows for blocking
> as well as field collapsing. Let's put it into Solr.
> http://wiki.apache.org/solr/Deduplication
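
To make the concern in the comment above concrete, here is a minimal sketch of the surprising behaviour. It is a toy in-memory model, not Solr's API or the code in the attached patch: `ToyIndex`, `dedupe_on_signature`, and the `id` / `signature` field names are all hypothetical, chosen only for illustration. It assumes an index that overwrites on uniqueKey and, optionally, also deletes any existing doc that shares the incoming doc's signature.

```python
# Toy in-memory model, NOT Solr's API: every name here is hypothetical.
class ToyIndex:
    def __init__(self, dedupe_on_signature=False):
        self.docs = {}  # maps uniqueKey ("id") -> document dict
        self.dedupe_on_signature = dedupe_on_signature

    def add(self, doc):
        """Add a doc and return the uniqueKeys of the docs it removed."""
        removed = []

        # Usual uniqueKey behaviour: a doc with the same id is replaced.
        if doc["id"] in self.docs:
            del self.docs[doc["id"]]
            removed.append(doc["id"])

        # Extra deletion logic on a separate (non-uniqueKey) signature field.
        if self.dedupe_on_signature:
            for key, existing in list(self.docs.items()):
                if existing["signature"] == doc["signature"]:
                    del self.docs[key]
                    removed.append(key)

        self.docs[doc["id"]] = doc
        return removed


index = ToyIndex(dedupe_on_signature=True)
index.add({"id": "A", "signature": "abc"})
removed = index.add({"id": "B", "signature": "abc"})
print(removed)             # uniqueKeys deleted as a side effect of adding "B"
print(sorted(index.docs))  # uniqueKeys still in the index
```

In this sketch, adding doc "B" removes doc "A" even though their uniqueKeys differ, which is the confusing case described in the comment. Using the signature itself as the uniqueKey would reduce this to ordinary overwrite-by-key behaviour.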