[ 
https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235725#comment-14235725
 ] 

Erick Erickson commented on SOLR-5821:
--------------------------------------

Well, "it depends". SolrCloud is designed for eventual consistency.

There are two ways to interpret "different results":
1> the docs returned by different replicas for the same shard
     aren't identically ordered
2> the number of docs found varies depending on which replica
     is queried.

For <2>, in the absence of ongoing indexing and after the last
soft commit (or hard commit with openSearcher = true) has fired
on all replicas, that would be a bug.

Consider the autocommit intervals on your replicas. They won't
all fire at the same instant. Say you index a doc and
replica A's autocommit trigger happens immediately _after_ that
doc hits it, but replica B's autocommit happens to trip
just _before_ the doc arrived there. The doc will not be visible
from replica B until after the next autocommit happens there.
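
A rough sketch of that timing window, using a deliberately simplified
model in which each replica commits on its own fixed schedule (real
autocommit timers are driven by when updates arrive, so treat this as an
illustration only). The function name, the 15 second interval, and the
timestamps are all made up for the example.

```python
import math

def next_commit_after(arrival, first_commit, interval):
    """Time of the first commit at or after `arrival`.

    Assumption for illustration: commits fire at first_commit,
    first_commit + interval, first_commit + 2 * interval, ...
    """
    if arrival <= first_commit:
        return first_commit
    periods = math.ceil((arrival - first_commit) / interval)
    return first_commit + periods * interval

interval = 15.0   # hypothetical autocommit interval, in seconds
arrival = 100.0   # the doc reaches both replicas at t = 100 s

# Replica A's schedule has a commit just after the doc arrives,
# replica B's schedule had one just before it arrived.
visible_a = next_commit_after(arrival, first_commit=10.5, interval=interval)
visible_b = next_commit_after(arrival, first_commit=9.5, interval=interval)

# Between visible_a and visible_b the two replicas can report
# different numFound for the same query.
window = visible_b - visible_a
```

With these numbers the doc is searchable on replica A well before
replica B, even though nothing is wrong with either replica.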

So if you stop indexing and wait until all the replicas' autocommits
have surely fired and _still_ have different numFounds from different
replicas, we need to know about that as it's unexpected.

BTW, you can ensure you get data from only one node by appending
&distrib=false to a query.
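
One way to run that check is to build the same query against each
replica's core with distrib=false and compare the numFound values. The
sketch below only builds the URLs and compares counts you have already
collected; the host names, port, and core name are hypothetical, and the
helper names are not part of any Solr client library.

```python
from urllib.parse import urlencode

def replica_query_url(core_url, q):
    """Build a select URL that is answered by this core only."""
    params = {"q": q, "rows": 0, "wt": "json", "distrib": "false"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)

def counts_differ(num_found_by_replica):
    """True if the replicas reported more than one distinct numFound."""
    return len(set(num_found_by_replica.values())) > 1

# Hypothetical replica core URLs; substitute your own.
replicas = [
    "http://host-a:8983/solr/collection1",
    "http://host-b:8983/solr/collection1",
]
urls = [replica_query_url(core, "*:*") for core in replicas]

# After fetching each URL and reading response.numFound from the JSON,
# pass the values in, for example:
mismatch = counts_differ({"host-a": 1000, "host-b": 998})
```

Run the comparison only after indexing has stopped and the commits have
had time to fire on every replica; a mismatch before then is expected.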

For <1> it's a little different. Depending on the vagaries of merging,
the internal Lucene doc ID for the same two docs may be different
on different machines. If the sort criterion (score by default) happens
to be identical for two docs, the tiebreaker is the internal Lucene
doc ID. So depending on the replica, doc1 might sort before or after
doc2, and the results of the same query on a corpus with no indexing
going on may appear in different orders.

You can cure this by specifying a secondary sort that you control.
This case, however, should _NOT_ return different numFound.
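
A small simulation of the tiebreak, assuming three docs with made-up
scores and internal doc IDs, and assuming the collection has a unique
"id" field to use as the secondary sort. On a real request the
equivalent would be a sort parameter along the lines of
sort=score desc,id asc.

```python
# The same three docs on two replicas. Only the internal Lucene doc IDs
# differ (all values here are invented for illustration).
replica_a = [
    {"id": "doc1", "score": 1.0, "lucene_docid": 0},
    {"id": "doc2", "score": 1.0, "lucene_docid": 1},
    {"id": "doc3", "score": 0.5, "lucene_docid": 2},
]
replica_b = [
    {"id": "doc1", "score": 1.0, "lucene_docid": 1},
    {"id": "doc2", "score": 1.0, "lucene_docid": 0},
    {"id": "doc3", "score": 0.5, "lucene_docid": 2},
]

def default_order(docs):
    """Score descending, ties broken by the internal doc ID."""
    ranked = sorted(docs, key=lambda d: (-d["score"], d["lucene_docid"]))
    return [d["id"] for d in ranked]

def stable_order(docs):
    """Score descending, ties broken by a field you control (id)."""
    ranked = sorted(docs, key=lambda d: (-d["score"], d["id"]))
    return [d["id"] for d in ranked]

order_a = default_order(replica_a)
order_b = default_order(replica_b)
stable_a = stable_order(replica_a)
stable_b = stable_order(replica_b)
```

Here order_a and order_b disagree on the two tied docs, while stable_a
and stable_b are identical. Both replicas still hold the same three
docs, so the count does not change.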


> Search inconsistency on SolrCloud replicas
> ------------------------------------------
>
>                 Key: SOLR-5821
>                 URL: https://issues.apache.org/jira/browse/SOLR-5821
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.6.1, 4.7.1
>         Environment: SolrCloud:
> 1 shard, 2 replicas
> Both instances/replicas have identical hardware/software:
> CPU(s): 4
> RAM: 8Gb
> HDD: 100Gb
> OS: CentOS 6.5
> ZooKeeper 3.4.5
> Tomcat 8.0.3
> Solr 4.6.1
> Servers are utilized to run Solr only.
>            Reporter: Maxim Novikov
>            Priority: Critical
>              Labels: cloud, inconsistency, replica, search
>         Attachments: Screen Shot 2014-04-05 at 2.26.26 AM.png, Screen Shot 
> 2014-04-05 at 2.26.41 AM.png
>
>
> We use the following infrastructure:
> SolrCloud with 1 shard and 2 replicas. The index is built using 
> DataImportHandler (importing data from the database). The number of items in 
> the index can vary from 100 to 100,000,000.
> After indexing part of the data (not necessarily all the data, it is enough 
> to have a small number of items in the search index), we can observe that 
> Solr instances (replicas) return different results for the same search 
> queries. I believe it happens because some of the results have the same 
> scores, and Solr instances return those in a random order.
> PS This is a critical issue for us as we use a load balancer to scale Solr 
> through replicas, and as a result of this issue, we retrieve various results 
> for the same queries all the time. They are not necessarily completely 
> different, but even a couple of items that differ is a deal breaker.
> The expected behaviour would be to always get identical results for the same 
> search queries from all replicas. Otherwise, this "cloud" thing works just 
> unreliably.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)