Hi people,

I urgently need your help!

I have Solr 3.3 configured and running. I do incremental indexing 4 times a
day using bulk updates. Some incoming documents duplicate ones that are
already indexed, and I want to skip them rather than index them again.
The problem is that I could not find a way to tell Solr to ignore new
duplicate docs and keep the old indexed docs. I don't care that a doc is
newer. Solr should just determine by ID that the document is already in
the index and leave it alone.

I use SolrJ for indexing. I have tried setting overwrite=false and the
dedupe approach, but neither helped. Either the newer doc overwrites the
old one, or I get duplicates.

I think it's a very simple and basic feature, and it must exist. What did I
do wrong, or what did I miss?

I tried Google but couldn't find a solution there, although many people
have encountered this problem.

I am starting to think that I must query the index to check whether each
doc to be added is already there, and leave it out of the array if so. But
I have so many docs that I am afraid this is not a good solution.
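
For reference, the check-before-add idea can be sketched roughly as
follows. This is only an illustration, not SolrJ code: ExistingIdLookup and
filterNew are hypothetical names, each doc is modelled as a plain Map with
an "id" key, and the lookup stands in for whatever query returns the IDs
already in the index. Asking about a whole batch of IDs per call, instead
of one query per doc, is meant to keep the number of round trips small.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SkipExistingDocs {

    /**
     * Hypothetical hook (not part of SolrJ): returns the subset of the
     * given IDs that are already present in the index.
     */
    public interface ExistingIdLookup {
        Set<String> findExisting(Collection<String> ids);
    }

    /**
     * Returns the docs whose "id" is not yet indexed, in input order.
     * If the same ID occurs more than once in the input, only the first
     * occurrence is kept.
     */
    public static List<Map<String, Object>> filterNew(
            List<Map<String, Object>> docs,
            ExistingIdLookup lookup,
            int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }

        // Collapse duplicates inside the incoming array, keeping the first doc per ID.
        Map<String, Map<String, Object>> byId = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            byId.putIfAbsent(String.valueOf(doc.get("id")), doc);
        }

        List<String> ids = new ArrayList<>(byId.keySet());
        List<Map<String, Object>> fresh = new ArrayList<>();

        // One lookup per batch of IDs rather than one per document.
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            List<String> batch = ids.subList(start, end);
            Set<String> existing = new HashSet<>(lookup.findExisting(batch));
            for (String id : batch) {
                if (!existing.contains(id)) {
                    fresh.add(byId.get(id));
                }
            }
        }
        return fresh;
    }
}
```

Only the list returned by filterNew would then be sent to Solr. The batch
size is an assumption to tune; it would have to stay below whatever limit
applies to the number of IDs in a single query.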

Best Regards
Alexander Aristov
