Skipping duplicates in DataImportHandler based on uniqueKey

Andrew Clegg Sun, 02 May 2010 08:48:19 -0700

Hi,

Is there a way to get the DataImportHandler to skip already-seen records
rather than reindexing them?


The UpdateHandler has an <add overwrite="false" ... > capability which (as I
understand it) means that a document whose uniqueKey matches one already in
the index will be skipped instead of overwritten.
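
For reference, the kind of update message I mean looks roughly like what the sketch below builds. This is only an illustration: the helper name build_add_message and the field names "id" and "title" are made up, and it only constructs the XML, it does not send anything to Solr. Whether Solr really skips the document in that case, rather than adding a second copy, is my understanding and not something this snippet verifies.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, overwrite=False):
    """Build a Solr XML <add> message as a string.

    docs is a list of dicts mapping field name -> value.
    The field names used below are hypothetical examples.
    """
    # The overwrite attribute is the part the question is about.
    add = ET.Element("add", {"overwrite": "true" if overwrite else "false"})
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(add, encoding="unicode")


message = build_add_message([{"id": "doc-1", "title": "Example"}])
print(message)
```

The resulting string would be POSTed to the core's update handler; what I'm after is the equivalent switch for documents that come in through the DIH.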

Can the DIH be made to behave this way?

If not, would it be an easy patch? I'm using the XPathEntityProcessor, by
the way.

Thanks,

Andrew.
--
:: http://biotext.org.uk/ ::
