On Fri, Oct 18, 2013 at 4:35 PM, Per Steffensen <[email protected]> wrote:
> It is a distributed search! So where in the code is the de-dup happening for
> distributed searches? And is it correct that this is new in 4.4.0 (vs
> 4.0.0)? Or did I just accidentally change my config to turn it on?

Distributed search has always done a quick-n-dirty dedup (having the same ID
in different shards is considered an error condition anyway). It should be in
QueryComponent.mergeIds.

-Yonik

> Regards, Per Steffensen
>
> On 10/18/13 10:09 PM, Yonik Seeley wrote:
>>
>> AFAIK, the only dedup that is done on purpose is during distributed
>> search.
>> So either a distributed search is happening, or there has been some
>> other change that accidentally started de-duping (such as some sort of
>> map from ID to Doc for other reasons).
>>
>> -Yonik
>>
>> On Fri, Oct 18, 2013 at 4:03 PM, Per Steffensen <[email protected]>
>> wrote:
>>>
>>> Hi
>>>
>>> I send update/add-requests to Solr in a way so that
>>> indexWriter.addDocument is used in DirectUpdateHandler2 instead of
>>> indexWriter.updateDocument. In two separate requests I send two
>>> identical documents into Solr. In Solr 4.0.0 I get both documents back
>>> when I search. In Solr 4.4.0 I only get one document back. I have
>>> investigated a little into what happens in Solr 4.4.0, and I believe I
>>> see that both documents are actually in the Lucene indices (in
>>> QueryComponent.process the searcher.search line returns two docs for
>>> one of my shards). So it must be somewhere in the search-flow that it
>>> is decided to send only one of them back to the client. In Solr 4.0.0
>>> I get both back to the client.
>>>
>>> Is this known/intended behavior? Can someone point me to the code where
>>> "duplicates" are filtered, and/or to the JIRA issue where this feature
>>> was introduced? Not that I necessarily want to do it, but can this
>>> search-dedup be turned off?
>>>
>>> Regards, Per Steffensen
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
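
For readers following the thread, here is a rough sketch of what an ID-based dedup during a distributed merge can look like. It is not the Solr source and does not claim to match what QueryComponent.mergeIds actually does line by line; the class ShardMergeSketch, the ShardDoc type, and the descending-score ordering are all assumptions made up for illustration. The only point it demonstrates is that once hits are keyed by ID while merging shard responses, a second hit with the same ID is dropped, which would be consistent with only one of two identical documents reaching the client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified illustration only; this is NOT the Solr source code.
class ShardMergeSketch {

    // Hypothetical stand-in for one hit in a shard response.
    static final class ShardDoc {
        final String id;     // value of the unique key field
        final String shard;  // name of the shard that returned the hit
        final float score;

        ShardDoc(String id, String shard, float score) {
            this.id = id;
            this.shard = shard;
            this.score = score;
        }
    }

    // Merge per-shard hit lists, keeping only the first hit seen for each ID.
    static List<ShardDoc> mergeIds(List<List<ShardDoc>> shardResponses) {
        Set<String> seenIds = new HashSet<>();
        List<ShardDoc> merged = new ArrayList<>();
        for (List<ShardDoc> response : shardResponses) {
            for (ShardDoc doc : response) {
                // Set.add returns false when the ID is already present,
                // so a repeated ID is silently dropped.
                if (seenIds.add(doc.id)) {
                    merged.add(doc);
                }
            }
        }
        // Assumed ordering for the sketch: highest score first.
        merged.sort(Comparator.comparingDouble((ShardDoc d) -> d.score).reversed());
        return merged;
    }

    public static void main(String[] args) {
        // Two identical documents in one shard, one distinct document in another.
        List<ShardDoc> shard1 = Arrays.asList(
                new ShardDoc("doc1", "shard1", 1.0f),
                new ShardDoc("doc1", "shard1", 1.0f));
        List<ShardDoc> shard2 = Arrays.asList(
                new ShardDoc("doc2", "shard2", 0.5f));

        for (ShardDoc doc : mergeIds(Arrays.asList(shard1, shard2))) {
            System.out.println(doc.id + " from " + doc.shard);
        }
    }
}
```

In this sketch the duplicate is dropped whether it comes from the same shard or from a different one, because the set of seen IDs is shared across all shard responses. A non-distributed search would never go through such a merge step, which fits Yonik's remark that the dedup is specific to distributed search.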
