[ https://issues.apache.org/jira/browse/HBASE-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204058#comment-13204058 ]
Todd Lipcon commented on HBASE-5353:
------------------------------------
bq. The cluster knows about it, so you can have a link on the webui to the
master or any of the region servers
And each of the potential masters publishes metrics to ganglia, so if you want
to find the master metrics, you have to hunt around in the ganglia graphs for
which master was active at that time?
And any cron jobs or nagios alerts you write need to first call some HBase
utility to find the active master's IP via ZK in order to get to it?
bq. True, but if those masters fail over, then your cluster management needs to
be aware enough of that to provision more, on different servers
If you have two masters on separate racks, and you have any reasonable
monitoring, then your ops team will restart or provision a new one when they
fail. I've never ever heard of this kind of scenario being a major cause of
downtime.
The whole thing seems like a bad idea to me. I won't -1, but consider me -0.5.
> HA/Distributed HMaster via RegionServers
> ----------------------------------------
>
> Key: HBASE-5353
> URL: https://issues.apache.org/jira/browse/HBASE-5353
> Project: HBase
> Issue Type: Improvement
> Components: master, regionserver
> Affects Versions: 0.94.0
> Reporter: Jesse Yates
> Priority: Minor
>
> Currently, the HMaster node must be considered a 'special' node (single point
> of failure), meaning that the node must be protected more than the other
> commodity machines. It should be possible to instead have the HMaster be much
> more available, either in a distributed sense (meaning a big rewrite) or with
> multiple instances and automatic failover.