[ 
https://issues.apache.org/jira/browse/SOLR-5495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13991498#comment-13991498
 ] 

Timothy Potter commented on SOLR-5495:
--------------------------------------

Thanks for the review, Mark! I think there are still some weird interactions 
between this code and the waitForLeaderToSeeDownState stuff, as I'm seeing 
exceptions like the following in a good-sized cluster when I knock over 
replicas during heavy indexing. The leader doesn't see the down state; it sees 
the "recovering" state instead.

2014-05-07 02:34:03,112 [Thread-3531] ERROR solr.cloud.ZkController  - There was a problem making a request to the leader:org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://host:8985/solr
        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:562)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
        at org.apache.solr.cloud.ZkController.waitForLeaderToSeeDownState(ZkController.java:1528)
        at org.apache.solr.cloud.ZkController.registerAllCoresAsDown(ZkController.java:372)
        at org.apache.solr.cloud.ZkController.access$000(ZkController.java:87)
        at org.apache.solr.cloud.ZkController$1.command(ZkController.java:229)
        at org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166)

In short, there are still a few little issues that didn't show up in unit 
testing, so I'm going to flog this area of the code a bit more tomorrow morning!

> Recovery strategy for leader partitioned from replica case.
> -----------------------------------------------------------
>
>                 Key: SOLR-5495
>                 URL: https://issues.apache.org/jira/browse/SOLR-5495
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>            Assignee: Timothy Potter
>         Attachments: SOLR-5495.patch, SOLR-5495.patch, SOLR-5495.patch
>
>
> We need to work out a strategy for the case of:
> Leader and replicas can still talk to ZooKeeper, Leader cannot talk to 
> replica.
> We punted on this in initial design, but I'd like to get something in.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

