[jira] [Commented] (SOLR-6402) OverseerCollectionProcessor should not exit for ZK ConnectionLoss

Mark Miller (JIRA) Thu, 21 Aug 2014 19:19:26 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-6402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106365#comment-14106365
 ]


Mark Miller commented on SOLR-6402:
-----------------------------------

bq.  there are lots of other operations past the if check that don't. E.g. all 
those workqueue manipulations.

That should not be the case. All ZK manipulation should go through 
SolrZkClient, which should use ZkCmdExecutor to retry on connection loss past 
expiration unless explicitly asked not to.
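
As a rough illustration of the retry behavior described above (this is not 
the actual ZkCmdExecutor code), a retry-on-connection-loss wrapper could look 
like the sketch below. The Code enum, ZkException, ZkOperation, and 
retryOperation are hypothetical stand-ins so the example compiles without 
ZooKeeper or Solr on the classpath; the exception is unchecked only to keep 
the sketch short, and the retry count and delay are arbitrary.

```java
public class RetryOnConnectionLossSketch {

    /** Hypothetical stand-in for the relevant KeeperException.Code values. */
    public enum Code { CONNECTIONLOSS, SESSIONEXPIRED }

    /** Hypothetical stand-in for KeeperException (unchecked here for brevity). */
    public static class ZkException extends RuntimeException {
        public final Code code;
        public ZkException(Code code) {
            super(code.name());
            this.code = code;
        }
    }

    /** Hypothetical callback wrapping a single ZK operation. */
    public interface ZkOperation<T> {
        T execute();
    }

    /**
     * Runs op, retrying only when it fails with CONNECTIONLOSS.
     * Any other failure (e.g. SESSIONEXPIRED) is rethrown immediately.
     */
    public static <T> T retryOperation(ZkOperation<T> op, int maxRetries, long retryDelayMs) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        ZkException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.execute();
            } catch (ZkException e) {
                if (e.code != Code.CONNECTIONLOSS) {
                    throw e; // not retryable
                }
                last = e; // remember the failure and try again
            }
            if (attempt < maxRetries) {
                try {
                    // Back off a little longer after each failed attempt.
                    Thread.sleep(retryDelayMs * (attempt + 1));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying if the thread is interrupted
                }
            }
        }
        throw last; // retries exhausted (or interrupted)
    }
}
```

With a wrapper of this kind around every ZK call, a transient connection 
loss is absorbed inside the call, and the caller only sees an exception once 
the retries run out or the session has really expired.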

> OverseerCollectionProcessor should not exit for ZK ConnectionLoss
> -----------------------------------------------------------------
>
>                 Key: SOLR-6402
>                 URL: https://issues.apache.org/jira/browse/SOLR-6402
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.8, 5.0
>            Reporter: Jessica Cheng Mallet
>            Assignee: Mark Miller
>             Fix For: 5.0, 4.10
>
>
> We saw an occurrence where, after a ZK connection blip, the 
> OverseerCollectionProcessor thread stopped, while the ClusterStateUpdater 
> logged an error but kept running, and the node didn't lose its leadership. 
> This caused our collection work queue to back up.
> Right now, on trunk, OverseerCollectionProcessor's run method has:
> {quote}
> if (e.code() == KeeperException.Code.SESSIONEXPIRED
>     || e.code() == KeeperException.Code.CONNECTIONLOSS) {
>   log.warn("Overseer cannot talk to ZK");
>   return;
> }
> {quote}
> I think this if statement should only be for SESSIONEXPIRED. If it just 
> experiences a connection loss but then reconnects before the session expires, 
> it'll keep being the leader.
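
A minimal sketch of the check proposed in the description, assuming the 
decision is pulled out into a helper. OverseerExitSketch, its Code enum, and 
shouldExit are hypothetical names used only for illustration; the enum stands 
in for KeeperException.Code so that the example runs without ZooKeeper on the 
classpath.

```java
public class OverseerExitSketch {

    /** Hypothetical stand-in for a few KeeperException.Code values. */
    public enum Code { CONNECTIONLOSS, SESSIONEXPIRED, NONODE }

    /**
     * Returns true only when the processing loop should give up.
     * An expired session means leadership is gone; a connection loss may
     * still recover before the session expires, so the loop keeps running.
     */
    public static boolean shouldExit(Code code) {
        return code == Code.SESSIONEXPIRED;
    }

    public static void main(String[] args) {
        for (Code code : Code.values()) {
            System.out.println(code + " -> exit? " + shouldExit(code));
        }
    }
}
```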



