[ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823195#comment-13823195
 ] 

Jessica Cheng commented on SOLR-4260:
-------------------------------------

We're seeing the same thing, running v4.5.0.

Mark, if you don't mind clarifying...

{quote}
The safest thing to do is to stop the cluster and start it again - that 
triggers a process that tries to pick the most up to date replica and trades up 
to 100 updates or so among each other if some are on some replicas and not 
others.
{quote}
What if the difference is greater than 100? Is there any other way to figure 
out who is the "truth" and force that state onto the other replicas by doing a 
full sync?
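(For anyone else hitting this: one way to see which replicas disagree is to query each core directly with distrib=false and compare numFound per shard. A minimal sketch below — the host and core names are hypothetical placeholders, and the comparison helper just groups counts by shard:)

```python
# Sketch: compare per-replica doc counts by hitting each core directly
# with distrib=false (so the query is NOT fanned out across the cloud),
# then flag shards whose replicas disagree. Host/core names are made up.
import json
from urllib.request import urlopen

def num_docs(base_url, core):
    """Return numFound for a catch-all query against a single core."""
    url = (f"{base_url}/solr/{core}/select"
           "?q=*:*&rows=0&distrib=false&wt=json")
    with urlopen(url) as resp:
        body = json.load(resp)
    return body["response"]["numFound"]

def diverging_shards(counts):
    """Given {shard: {replica_core: numDocs}}, return only the shards
    whose replicas report different document counts."""
    return {shard: replicas
            for shard, replicas in counts.items()
            if len(set(replicas.values())) > 1}

# counts = {"shard1": {"core_node1": num_docs("http://host1:8983", "col_shard1_replica1"),
#                      "core_node2": num_docs("http://host2:8983", "col_shard1_replica2")}}
# print(diverging_shards(counts))
```

Once a lagging replica is identified, my understanding is that the CoreAdmin REQUESTRECOVERY action can be used to force that core to recover from its leader, rather than bouncing the whole node.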

{quote}
If you are sure the leader is simply behind, you can just bounce it and let a 
replica take over as leader.
{quote}
Newbie question: Why would the leader be behind? Aren't all updates sent to the 
leader first, which then distributes them to the replicas? Also, I was under 
the impression that this update call is synchronous, so once an update request 
returns successfully to the client, why would any replica be behind?



> Inconsistent numDocs between leader and replica
> -----------------------------------------------
>
>                 Key: SOLR-4260
>                 URL: https://issues.apache.org/jira/browse/SOLR-4260
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.0
>         Environment: 5.0.0.2013.01.04.15.31.51
>            Reporter: Markus Jelsma
>            Priority: Critical
>             Fix For: 5.0
>
>         Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards 
> have a small deviation in the number of documents. The leader and replica 
> deviate by roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention: there were small IDF differences for exactly the same record, 
> causing it to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch-all queries also return different 
> values of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two, and we frequently reindex using a fresh build from trunk. I hadn't 
> seen this issue for quite some time until a few days ago.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
