Hi folks,

I recently ran into a data merge use case where I want to backfill a ton of
documents from storage into Solr, but only if they don't already exist in
Solr.  (If they exist, the copies in Solr are newer.)

I couldn't find an efficient way to do this in bulk; if any document in my
batch ran into a conflict, the whole batch would fail.  And
single-doc-per-request is super slow.

So I changed DistributedUpdateProcessor to look for a request parameter.
When it is present, any documents that hit a conflict are silently dropped,
and the rest of the request goes through.
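
To make the intended semantics concrete, here is a toy model in plain
Python. It is not the actual patch and does not touch Solr: the index is
just a dict keyed by id, skip_conflicts is a made-up stand-in for the
request parameter, and I'm assuming the "must not already exist" check
works roughly like Solr's optimistic concurrency with a negative
_version_. The all-or-nothing failure in the default case is also a
simplification.

```python
# Toy model only: an in-memory "index" keyed by id, not real Solr code.
class VersionConflict(Exception):
    """Raised when a doc that must not exist is already in the index."""


def add_batch(index, docs, skip_conflicts=False):
    """Add docs that must not already exist in the index.

    skip_conflicts is a hypothetical stand-in for the request parameter.
    Returns the ids that were dropped because they already existed.
    """
    staged = {}
    dropped = []
    for doc in docs:
        doc_id = doc["id"]
        if doc_id in index or doc_id in staged:
            if not skip_conflicts:
                # Default behavior: one conflict fails the whole batch.
                raise VersionConflict(doc_id)
            # With the flag set: drop the conflicting doc and keep going.
            dropped.append(doc_id)
            continue
        staged[doc_id] = doc
    index.update(staged)
    return dropped
```

With the flag set, a backfill batch containing one existing id and one new
id leaves the existing (newer) document alone, adds the new one, and
returns the dropped id. Without the flag, the same batch raises and adds
nothing.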

Any interest in upstreaming this?

Scott
