[jira] [Updated] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable

Vinod Kumar Vavilapalli (JIRA) Wed, 17 Jun 2015 15:02:18 -0700

     [ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Vinod Kumar Vavilapalli updated HADOOP-10584:
---------------------------------------------
    Target Version/s: 2.7.2  (was: 2.7.1)

Moving this to 2.7.2 after ack from Karthik.

bq. It does help. But in the patch, reJoinElection(0) is called, which will 
further call joinElectionInternal ..
bq. Since the ZK quorum is unavailable, we still have the same issue. The 
difference is that with the patch we will retry for 45s more (using the default 
configuration).
Thanks for pointing this out, [~xgong]. That still sounds bad though. IIRC, when 
we were originally discussing the design of state-stores for RM, we were 
assuming that RM should give up on ZooKeeper only after trying for maybe ~an 
hour. /cc [~jianhe]. Looking at the recent flow of tickets, that doesn't seem 
to be the case?


> ActiveStandbyElector goes down if ZK quorum become unavailable
> --------------------------------------------------------------
>
>                 Key: HADOOP-10584
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10584
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: hadoop-10584-prelim.patch, rm.log
>
>
> ActiveStandbyElector retries operations a few times. If the ZK quorum 
> itself is down, it goes down and the daemons will have to be brought up 
> again. 
> Instead, it should log the fact that it is unable to talk to ZK, call 
> becomeStandby on its client, and continue to attempt connecting to ZK.
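
The behaviour proposed in the description (log the failure, call becomeStandby 
on the client, keep trying to reach ZK) can be sketched roughly as below. This 
is only an illustration and not the attached patch: the class 
ElectorRetrySketch, the Callback interface, and connectWithRetry are 
hypothetical names, the ZK connection is stood in for by a BooleanSupplier, and 
the loop is bounded by maxAttempts only so that the example terminates.

```java
import java.util.function.BooleanSupplier;

public class ElectorRetrySketch {

    /** Hypothetical stand-in for the elector's client callback. */
    public interface Callback {
        void becomeStandby();
    }

    /**
     * Keeps calling connect until it succeeds or maxAttempts is used up.
     * On the first failure it logs the problem and demotes the client to
     * standby exactly once, instead of giving up.
     *
     * @return the attempt number that succeeded, or -1 if none did
     */
    public static int connectWithRetry(BooleanSupplier connect, Callback client,
                                       int maxAttempts, long sleepMs) {
        boolean demoted = false;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (connect.getAsBoolean()) {
                return attempt;
            }
            if (!demoted) {
                // Assumption: a log line plus becomeStandby() is all the
                // client needs while the quorum is unreachable.
                System.err.println("Unable to talk to ZK; becoming standby and retrying");
                client.becomeStandby();
                demoted = true;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop retrying.
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }
}
```

A daemon following the description would presumably retry without an upper 
bound (or with a very long one) rather than exit; how long that bound should be 
is the open question in the comment above.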



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
