Re: No duplicates in search response?

Per Steffensen Mon, 21 Oct 2013 04:04:41 -0700

After debugging a little I can confirm that the dedup is happening in QueryComponent.mergeIds.

Distributed search has always done a quick-n-dirty dedup (i.e. it's
considered an error condition to have the same ID in different shards
anyway).

Actually it is in the same shard that we have two documents with the same ID. They are routed to the same shard because they have the same ID. Remember I tweak my request-params (basically setting overwrite=false) so that I end up with indexWriter.addDocument (for both documents) in DirectUpdateHandler2 instead of indexWriter.updateDocument.
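To make the add-versus-update difference concrete, here is a minimal sketch in plain Java. TinyIndex and its methods are hypothetical stand-ins, not Lucene's IndexWriter or Solr's DirectUpdateHandler2. It only assumes that an update removes any existing document with the same ID before adding, while an add appends unconditionally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for an index; NOT Lucene's IndexWriter.
class TinyIndex {
    // Each entry is {id, body}.
    private final List<String[]> docs = new ArrayList<>();

    // Add semantics: append without checking for an existing ID.
    void addDocument(String id, String body) {
        docs.add(new String[] {id, body});
    }

    // Update semantics: drop any document with the same ID, then append.
    void updateDocument(String id, String body) {
        docs.removeIf(d -> d[0].equals(id));
        addDocument(id, body);
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        TinyIndex overwriting = new TinyIndex();
        overwriting.updateDocument("doc1", "same content");
        overwriting.updateDocument("doc1", "same content");
        System.out.println("update path: " + overwriting.size() + " document(s)");

        TinyIndex appending = new TinyIndex();
        appending.addDocument("doc1", "same content");
        appending.addDocument("doc1", "same content");
        System.out.println("add path: " + appending.size() + " document(s)");
    }
}
```

In this sketch the add path leaves two documents with the same ID side by side, which is the situation described above.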

There is a little inconsistency. The dedup is not reflected in the total numFound unless you actually happen to get the document(s) back in your query.

Simple example: I have only two documents in my entire collection (consisting of several shards). They both live in the same shard and have the same ID (actually they are complete duplicates). I get this funny behavior when searching:
* Searching with rows=0 or rows=1, I get numFound=2 back - and in the case of rows=1 I get the document (or one of them)
* Searching with rows>=2, I get numFound=1 back - and the document (or one of them)
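The rows-dependent numFound can be reproduced with a small simulation. This is only a sketch of the behavior described here, not Solr's actual QueryComponent.mergeIds code. It assumes the merge step sums the per-shard hit counts and subtracts one for each duplicate ID it sees among the documents actually returned. MergeIdsSketch, ShardResponse, queryShard and mergedNumFound are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation of a dedup-while-merging step; NOT Solr code.
class MergeIdsSketch {
    // What one shard reports: its total hit count and the IDs of its top `rows` hits.
    static final class ShardResponse {
        final long numFound;
        final List<String> ids;

        ShardResponse(long numFound, List<String> ids) {
            this.numFound = numFound;
            this.ids = ids;
        }
    }

    // A shard counts every matching document but returns at most `rows` IDs.
    static ShardResponse queryShard(List<String> matchingIds, int rows) {
        int n = Math.min(rows, matchingIds.size());
        return new ShardResponse(matchingIds.size(),
                new ArrayList<>(matchingIds.subList(0, n)));
    }

    // Sum the shard counts, then subtract one per duplicate ID that is
    // visible in the returned documents. Duplicates that were never
    // returned cannot be detected, so they stay in the count.
    static long mergedNumFound(List<ShardResponse> responses) {
        long numFound = 0;
        Set<String> seen = new HashSet<>();
        for (ShardResponse r : responses) {
            numFound += r.numFound;
            for (String id : r.ids) {
                if (!seen.add(id)) {
                    numFound--;
                }
            }
        }
        return numFound;
    }

    public static void main(String[] args) {
        // One shard holding two documents with the same ID.
        List<String> shard = Arrays.asList("A", "A");
        for (int rows = 0; rows <= 2; rows++) {
            long n = mergedNumFound(Arrays.asList(queryShard(shard, rows)));
            System.out.println("rows=" + rows + " numFound=" + n);
        }
    }
}
```

Under that assumption the count only drops once both copies are part of the returned rows, which matches the numFound=2 versus numFound=1 observation.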

It should be in QueryComponent.mergeIds

-Yonik



