[jira] [Commented] (SOLR-7109) Indexing threads stuck during network partition can put leader into down state

Mark Miller (JIRA) Fri, 13 Mar 2015 19:25:22 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361527#comment-14361527
 ]


Mark Miller commented on SOLR-7109:
-----------------------------------

I think we need to open an issue to start using annotations to mark which Java 
APIs you can count on. We can start by labeling most of them internal and open 
them up based on demand, maturity, and sensibility, but a plugin writer should 
have an idea of which APIs they can count on and still get support for things 
like rolling upgrades. Perhaps that's just most of the basic SolrCore methods 
and ZkStateReader methods, but there should be something, and it can grow over 
time. Eventually it would be nice if basic plugins could survive rolling 
upgrades as long as they use some common, simple APIs.
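
As a rough illustration of the annotation idea, here is a minimal sketch. The 
annotation names (PublicApi, InternalApi), the ExampleCore stand-in class, and 
the isPublicApi helper are all hypothetical and are not existing Solr code; 
they only show how a supported surface could be marked and checked.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ApiStabilityDemo {

    /** Hypothetical marker: plugin writers may rely on this across rolling upgrades. */
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    public @interface PublicApi {}

    /** Hypothetical marker: may change or disappear in any release. */
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    public @interface InternalApi {}

    /** Stand-in for a class such as SolrCore; this is not the real class. */
    public static class ExampleCore {
        @PublicApi
        public String getName() {
            return "collection1";
        }

        @InternalApi
        public void updateInternalState() {
            // Internal bookkeeping that plugins should not call.
        }
    }

    /** Returns true if the named public no-arg method is marked @PublicApi. */
    public static boolean isPublicApi(Class<?> type, String methodName) {
        try {
            return type.getMethod(methodName).isAnnotationPresent(PublicApi.class);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPublicApi(ExampleCore.class, "getName"));
        System.out.println(isPublicApi(ExampleCore.class, "updateInternalState"));
    }
}
```

With runtime retention, a build-time or test-time check could flag plugins that 
call anything not marked as public; that tooling is likewise an assumption here.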

Given where things are currently though, these particular types of internal 
methods, especially those on ZkController, are still in considerable flux.

> Indexing threads stuck during network partition can put leader into down state
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-7109
>                 URL: https://issues.apache.org/jira/browse/SOLR-7109
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.10.3, 5.0
>            Reporter: Shalin Shekhar Mangar
>             Fix For: Trunk, 5.1
>
>         Attachments: SOLR-7109.patch, SOLR-7109.patch
>
>
> I found this recently while running some Jepsen tests. Some threads get 
> stuck on ZooKeeper operations for a long time in the 
> ZkController.updateLeaderInitiatedRecoveryState method, and when they wake up 
> they go ahead and set the LIR (leader-initiated recovery) state to down. In 
> the meantime, however, a new leader has been elected, and sometimes you end 
> up in a state where the leader itself is put into recovery, causing the shard 
> to reject all writes.
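
The race described above can be sketched in isolation. This is only an 
illustration of one general guard (re-checking a leader epoch before applying a 
delayed write), not the change in the attached patches; ShardState, the epoch 
counter, and markDownIfStillLeader are hypothetical names and not Solr or 
ZooKeeper APIs.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

public class StaleLeaderGuardDemo {

    /** Hypothetical in-memory stand-in for shard state kept in ZooKeeper. */
    public static class ShardState {
        private final AtomicLong leaderEpoch = new AtomicLong(1);
        private final AtomicReference<String> replicaState =
                new AtomicReference<>("active");

        public long currentEpoch() {
            return leaderEpoch.get();
        }

        /** Simulates a new leader election by advancing the epoch. */
        public synchronized void electNewLeader() {
            leaderEpoch.incrementAndGet();
        }

        public String replicaState() {
            return replicaState.get();
        }

        /** Applies the "down" write only if the caller's epoch is still current. */
        public synchronized boolean markDownIfStillLeader(long epochSeenByCaller) {
            if (epochSeenByCaller != leaderEpoch.get()) {
                return false; // caller led an older epoch, so reject the write
            }
            replicaState.set("down");
            return true;
        }
    }

    public static void main(String[] args) {
        ShardState shard = new ShardState();

        // The old leader records its epoch, then its thread stalls.
        long epochBeforeStall = shard.currentEpoch();

        // While that thread is stuck, a new leader is elected.
        shard.electNewLeader();

        // The stalled thread wakes up and tries to apply its delayed write.
        boolean applied = shard.markDownIfStillLeader(epochBeforeStall);
        System.out.println(applied + " " + shard.replicaState());
    }
}
```

The point of the sketch is that the check and the write happen together, so a 
thread that slept through an election cannot mark the new leader down.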



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

