[jira] [Commented] (SOLR-8173) CLONE - Leader recovery process can select the wrong leader if all replicas for a shard are down and trying to recover as well as lose updates that should have been recovered.

Frank Kelly (JIRA) Fri, 03 Mar 2017 10:35:08 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-8173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894821#comment-15894821
 ]


Frank Kelly commented on SOLR-8173:
-----------------------------------

I agree. 
This is a critical problem: when ZooKeeper and Solr disagree as to who the 
leader is, there needs to be a winner rather than staying in some unrecoverable 
state. Even if it just randomly picked one shard, a fully operational but 
slightly "off" search index is better than no index at all.



> CLONE - Leader recovery process can select the wrong leader if all replicas 
> for a shard are down and trying to recover as well as lose updates that 
> should have been recovered.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8173
>                 URL: https://issues.apache.org/jira/browse/SOLR-8173
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Matteo Grolla
>            Assignee: Mark Miller
>            Priority: Critical
>              Labels: leader, recovery
>         Attachments: solr_8983.log, solr_8984.log
>
>
> I'm running the following test.
> Collection "test" is replicated on two Solr nodes running on ports 8983 and 8984,
> using an external ZooKeeper.
> Initially both nodes are empty.
> 1) Turn on Solr 8983.
> 2) Add and commit a doc x on Solr 8983.
> 3) Turn off Solr 8983.
> 4) Turn on Solr 8984.
> 5) Shortly after (leader still not elected), turn on Solr 8983.
> 6) 8984 is elected as leader.
> 7) Doc x is present on 8983 but not on 8984 (checked by issuing a query).
> The solr.log files of both instances are attached.
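
To make the reported sequence easier to reason about, here is a toy Python model of the seven steps. It is only a sketch and does not use or reproduce Solr's actual election or recovery code. The names Replica, elect_first_registered and elect_most_up_to_date are hypothetical, and the idea that startup order decided the election is an assumption made for illustration; the report only states that 8984 ended up as leader.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy stand-in for one replica of the shard (not a Solr class)."""
    port: int
    docs: set = field(default_factory=set)


def elect_first_registered(startup_order):
    """Assumed naive rule: the replica that came up first becomes leader."""
    return startup_order[0]


def elect_most_up_to_date(startup_order):
    """Hypothetical alternative: prefer the replica holding the most documents."""
    return max(startup_order, key=lambda r: len(r.docs))


# Initially both nodes are empty.
r8983 = Replica(8983)
r8984 = Replica(8984)

# Steps 1-3: 8983 is started, doc x is added and committed, 8983 is stopped.
r8983.docs.add("x")

# Steps 4-5: 8984 is started first, then 8983 shortly afterwards.
startup_order = [r8984, r8983]

# Step 6: under the naive rule, 8984 wins even though it has never seen doc x.
naive = elect_first_registered(startup_order)

# Under the alternative rule, the replica that holds doc x wins instead.
better = elect_most_up_to_date(startup_order)

print("naive leader:", naive.port, "has doc x:", "x" in naive.docs)
print("alternative leader:", better.port, "has doc x:", "x" in better.docs)
```

In this model the naive rule reproduces step 7 of the report (the leader lacks doc x), while a rule that compares replica state before choosing does not. Real replicas would compare update-log versions rather than document counts; the count is used here only to keep the sketch short.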



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
