[ https://issues.apache.org/jira/browse/HBASE-21147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603337#comment-16603337 ]
Andrew Purtell commented on HBASE-21147:
----------------------------------------
Thanks for asking [~elserj]. For future reference, once I've put up an RC, there
is a tag set pointing to the RC, so commits to branch-1.4 are fine. In fact,
commits to branch-1.4 are always fine, per our compatibility guidelines. If you
happen to stomp on me during the RC, I'll see it and respin. If only JIRA state
management of fix versions and issue state were so easy.
> (1.4) Add ability for HBase Canary to ignore a configurable number of
> ZooKeeper down nodes
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-21147
> URL: https://issues.apache.org/jira/browse/HBASE-21147
> Project: HBase
> Issue Type: Improvement
> Components: canary, Zookeeper
> Affects Versions: 1.0.0, 3.0.0, 2.0.0
> Reporter: David Manning
> Assignee: Andrew Purtell
> Priority: Minor
> Fix For: 1.4.8
>
> Attachments: HBASE-21126.branch-1.001.patch,
> HBASE-21126.master.001.patch, HBASE-21126.master.002.patch,
> HBASE-21126.master.003.patch, zookeeperCanaryLocalTestValidation.txt
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> When running org.apache.hadoop.hbase.tool.Canary with args -zookeeper
> -treatFailureAsError, the Canary will try to get a znode from each ZooKeeper
> server in the ensemble. If any server is unavailable or unresponsive, the
> canary will exit with a failure code.
> If we use the Canary to gauge server health, and alert accordingly, this can
> be too strict. For example, in a 5-node ZooKeeper cluster, having one node
> down is safe and expected in rolling upgrades/patches.
> This is a request to allow the Canary to take another parameter
> {code:java}
> -permittedZookeeperFailures <N>{code}
> If N=1, in the 5-node ZooKeeper ensemble example, then the Canary will still
> pass if 4 ZooKeeper nodes are reachable, but fail if 3 or fewer are reachable.
> (This is my first Jira posting... sorry if I messed anything up.)
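
The pass/fail rule described in the request can be sketched roughly as follows. This is only an illustration of the threshold arithmetic, not the code from the attached patches: the class name ZookeeperCanaryThreshold and the method passes are hypothetical, and the real Canary may structure this check differently.

```java
// Hypothetical sketch of the threshold rule from the issue description.
// None of these names come from the HBase code base.
final class ZookeeperCanaryThreshold {
    private ZookeeperCanaryThreshold() {}

    /**
     * Returns true when the number of unreachable ZooKeeper servers does not
     * exceed the permitted number of failures.
     *
     * @param ensembleSize      total servers in the ZooKeeper ensemble
     * @param reachableServers  servers that answered the canary's znode read
     * @param permittedFailures value of N from -permittedZookeeperFailures
     */
    static boolean passes(int ensembleSize, int reachableServers, int permittedFailures) {
        if (ensembleSize < 0 || permittedFailures < 0
                || reachableServers < 0 || reachableServers > ensembleSize) {
            throw new IllegalArgumentException("counts must be non-negative and consistent");
        }
        int unreachable = ensembleSize - reachableServers;
        return unreachable <= permittedFailures;
    }

    public static void main(String[] args) {
        // The 5-node example from the description, with N=1.
        System.out.println(passes(5, 4, 1)); // true: one node down is tolerated
        System.out.println(passes(5, 3, 1)); // false: two nodes down exceeds N
    }
}
```

With N=0 the sketch reduces to the strict behavior described above, where any unreachable server fails the check.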