[jira] [Commented] (HDFS-6706) ZKFailoverController failed to recognize the quorum is not met

Yongjun Zhang (JIRA) Mon, 21 Jul 2014 13:31:10 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069226#comment-14069226
 ]


Yongjun Zhang commented on HDFS-6706:
-------------------------------------

HI [~brandonli], 

Thanks for reporting and addressing the issue.  I have some questions here.  
The original report seems to indicate that the reported error message doesn't 
indicate the real reason of failure. My questions are,
1. In the case reported initially, the real problem was said to be "The zkfc 
cannot be startup due to ha.zookeeper.quorum is not met". With your last 
update, can we say the real problem is a misconfiguration?
2. What kind of misconfiguration caused the symptom?
3. When misconfigured, user will still see the reported error message. Should 
we have the error message to tell that the symptom is caused by the possible 
misconfiguration?

Thanks.
.


> ZKFailoverController failed to recognize the quorum is not met
> --------------------------------------------------------------
>
>                 Key: HDFS-6706
>                 URL: https://issues.apache.org/jira/browse/HDFS-6706
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>
> Thanks Kenny Zhang for finding this problem.
> The zkfc cannot be startup due to ha.zookeeper.quorum is not met. "zkfc 
> -format" doesn't log the real problem. And then user will see the error 
> message instead of the real issue when starting zkfc:
> 2014-07-01 17:08:17,528 FATAL ha.ZKFailoverController 
> (ZKFailoverController.java:doRun(213)) - Unable to start failover controller. 
> Parent znode does not exist.
> Run with -formatZK flag to initialize ZooKeeper.
> 2014-07-01 16:00:48,678 FATAL ha.ZKFailoverController 
> (ZKFailoverController.java:fatalError(365)) - Fatal error occurred:Received 
> create error from Zookeeper. code:NONODE for path 
> /hadoop-ha/prodcluster/ActiveStandbyElectorLock
> 2014-07-01 17:24:44,202 - INFO ProcessThread(sid:2 
> cport:-1)::PrepRequestProcessor@627 - Got user-level KeeperException when 
> processing sessionid:0x346f36191250005 type:create cxid:0x2 zxid:0xf00000033 
> txntype:-1 reqpath:n/a Error 
> Path:/hadoop-ha/prodcluster/ActiveStandbyElectorLock Error:KeeperErrorCode = 
> NodeExists for /hadoop-ha/prodcluster/ActiveStandbyElectorLock
> To reproduce the problem:
> 1. use HDFS cluster with automatic HA enable and set the ha.zookeeper.quorum 
> to 3.
> 2. start two zookeeper servers.
> 3. do "hdfs zkfc -format", and then "hdfs zkfc"



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (HDFS-6706) ZKFailoverController failed to recognize the quorum is not met

Reply via email to