Hi,

Is there a way to get the DataImportHandler to skip already-seen records
rather than reindexing them?

The UpdateHandler has an <add overwrite="false" ... > capability which (as I
understand it) means that a document whose uniqueKey matches one already in
the index will be skipped instead of overwritten.
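
To illustrate what I mean, here is a rough sketch of such an update message.
The field names (id, title) and values are made up for the example, and the
comment reflects my understanding of the behaviour rather than something I
have confirmed:

```xml
<!-- Hypothetical update message posted to the UpdateHandler. -->
<!-- Assumption: "id" is the uniqueKey field declared in schema.xml. -->
<add overwrite="false">
  <doc>
    <field name="id">doc-001</field>
    <field name="title">Example record (placeholder value)</field>
  </doc>
</add>
```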

Can the DIH be made to behave this way?

If not, would it be an easy patch? I'm using the XPathEntityProcessor, by
the way.

Thanks,

Andrew.
--
:: http://biotext.org.uk/ ::
