Are you able to see any evidence that some of the 500k docs are being added 
twice? Check maxDoc on the Solr admin page. I vaguely recall there being 
some issue with docs in SolrCloud being added multiple times (which under the 
covers is really add, delete, add). I think that could cause the docCount to be 
different across "identical" indexes. That would also explain why a forceMerge 
fixes it, as the deleted documents are then fully removed.
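
To illustrate the suspected effect, here is a minimal sketch. It assumes the 
classic Lucene TF-IDF formula idf = 1 + ln(docCount / (docFreq + 1)) and that 
deleted-but-unmerged documents still count toward both statistics. The numbers 
are hypothetical and not taken from the cluster in question.

```python
import math


def classic_idf(doc_freq: int, doc_count: int) -> float:
    """Assumed classic TF-IDF idf: 1 + ln(doc_count / (doc_freq + 1))."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


# Hypothetical statistics for one term on two replicas of the same shard.
# Both hold the same 500k live docs, but the replica still carries 20k
# deleted copies that have not been merged away; 300 of them contain the term.
leader_idf = classic_idf(doc_freq=1_000, doc_count=500_000)
replica_idf = classic_idf(doc_freq=1_300, doc_count=520_000)

# After a forceMerge the deleted copies are gone, so the replica's statistics
# (and therefore its idf) match the leader's again.
merged_replica_idf = classic_idf(doc_freq=1_000, doc_count=500_000)

print(leader_idf, replica_idf, merged_replica_idf)
```

Under these assumptions the same query term gets a different weight on each 
replica until the deletes are merged out, which would be enough to reorder 
closely scored documents.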

-Michael

-----Original Message-----
From: Markus Jelsma [mailto:markus.jel...@openindex.io] 
Sent: Wednesday, January 23, 2013 5:38 PM
To: solr-user@lucene.apache.org
Subject: RE: Issues with docFreq/docCount on SolrCloud

Hi again,

I've tried various settings for TieredMergePolicy to make sure the docFreq, 
maxDoc and docCount don't deviate too much. We also ran tests after 
increasing reclaimDeletesWeight from 2.0 to 8.0 and configuring slightly more 
frequent merging. In these tests we reindexed the same 500k docs each time 
into different cores with various settings at the same time.

We still see documents in distributed queries being scored slightly 
differently, leading to documents jumping positions in the result set, which 
is obviously unacceptable.

To clarify, these documents don't jump positions because they have the same 
score and are sorted by Lucene docID; the actual scores differ. Also, the 
index doesn't change while we fire queries, and it's not a problem of lacking 
distributed IDF. It is, of course, acceptable for documents to jump positions 
on a frequently changing index; that's the way it works. But it is not 
acceptable across multiple replicas of a static index.

Is there anyone around here with suggestions, hints or anything?

The next thing we might try is to route the same user to the same replica of a 
shard by overriding the HTTP shard handler, but I'm not sure this is a proper 
solution. This, at least, might prevent users from seeing documents jumping 
positions in the same result set.
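
A minimal sketch of that sticky-routing idea follows. It is not Solr's shard 
handler API; the function name, the user ID and the replica URLs are all 
hypothetical, and it only shows how a stable hash could map one user to one 
replica.

```python
import hashlib
from typing import Sequence


def pick_replica(user_id: str, replicas: Sequence[str]) -> str:
    """Map a user to one replica deterministically via a stable hash."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


# Hypothetical replica URLs for a single shard.
shard1_replicas = [
    "http://host-a:8983/solr/core_shard1_replica1",
    "http://host-b:8983/solr/core_shard1_replica2",
]

# The same user always lands on the same replica while the list is unchanged.
print(pick_replica("user-42", shard1_replicas))
```

The mapping only stays stable while the replica list is unchanged, and it hides 
the score differences from a single user rather than removing them.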

Thanks,
Markus
 
-----Original message-----
> From:Markus Jelsma <markus.jel...@openindex.io>
> Sent: Mon 21-Jan-2013 20:31
> To: solr-user@lucene.apache.org
> Subject: Issues with docFreq/docCount on SolrCloud
> 
> Hi,
> 
> We have a few trunk clusters running with two replicas for each shard. We 
> sometimes see results jumping positions for identical queries. We've tracked 
> it down to differences in docFreq and docCount between the leader and 
> replicas. The only way to force all cores in the shard to be consistent is 
> to optimize or forceMerge the segments.
> 
> Is there anyone here who can give advice on this issue? For obvious reasons 
> we don't want to optimize 50GB of data on a regular basis, but we do 
> want to make sure the variations in docFreq/docCount do not lead to results 
> jumping positions in the result set for identical queries.
> 
> Like most of you, we already have small issues due to the lack of 
> distributed IDF; having this problem as well makes SolrCloud less 
> predictable and harder to debug.
> 
> Thanks,
> Markus
> 
