Deduplication uses Lucene's IndexWriter.updateDocument with the signature term. I don't think it's possible, as a default feature, to choose which document gets indexed: the "original" will always be the last one indexed. From the IndexWriter.updateDocument Javadoc:

"Updates a document by first deleting the document(s) containing term and then adding the new document. The delete and then add are atomic as seen by a reader on the same index (flush may happen only after the add)."
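To make the "last one wins" behaviour concrete, here is a rough sketch in plain Java. It is only a toy model of the delete-then-add semantics quoted above, not Lucene's actual implementation: the DedupModel class, the Doc type, and the id/signature fields are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: mimics delete-by-term-then-add. Not Lucene code. */
final class DedupModel {

    /** Hypothetical document: just an id and a dedup signature. */
    static final class Doc {
        final String id;
        final String signature;

        Doc(String id, String signature) {
            this.id = id;
            this.signature = signature;
        }
    }

    private final List<Doc> docs = new ArrayList<>();

    /** Plain add: every document is kept, duplicates included. */
    void addDocument(Doc doc) {
        docs.add(doc);
    }

    /** Models updateDocument(term, doc): delete all docs with this signature, then add. */
    void updateDocument(String signature, Doc doc) {
        docs.removeIf(d -> d.signature.equals(signature));
        docs.add(doc);
    }

    /** Ids of the documents currently held, in insertion order. */
    List<String> ids() {
        List<String> result = new ArrayList<>();
        for (Doc d : docs) {
            result.add(d.id);
        }
        return result;
    }

    public static void main(String[] args) {
        DedupModel index = new DedupModel();
        index.updateDocument("sigA", new Doc("doc1", "sigA"));
        index.updateDocument("sigA", new Doc("doc2", "sigA")); // replaces doc1
        System.out.println(index.ids()); // prints [doc2]
    }
}
```

In this model there is no hook for keeping the first document: whichever duplicate arrives last replaces the earlier ones, which matches the behaviour described above. Using addDocument instead keeps every copy.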
With grouping, all of your documents stay in the index, which gives you more flexibility.