[jira] [Commented] (HBASE-5353) HA/Distributed HMaster via RegionServers

Todd Lipcon (Commented) (JIRA) Wed, 08 Feb 2012 14:23:23 -0800

    [ https://issues.apache.org/jira/browse/HBASE-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204058#comment-13204058 ]


Todd Lipcon commented on HBASE-5353:
------------------------------------

bq. The cluster knows about it, so you can have a link on the webui to the 
master or any of the region servers

And each of the potential masters publishes metrics to Ganglia, so if you want 
to find the master metrics, you have to hunt around in the Ganglia graphs for 
which master was active at that time?
And any cron jobs or Nagios alerts you write need to first call some HBase 
utility to find the active master's IP via ZK in order to get to it?

bq. True, but if those masters fail over, then your cluster management needs to 
be aware enough of that to provision more, on different servers

If you have two masters on separate racks, and you have any reasonable 
monitoring, then your ops team will restart or provision a new one when they 
fail. I've never ever heard of this kind of scenario being a major cause of 
downtime.


The whole thing seems like a bad idea to me. I won't -1 but consider me -0.5
                
> HA/Distributed HMaster via RegionServers
> ----------------------------------------
>
>                 Key: HBASE-5353
>                 URL: https://issues.apache.org/jira/browse/HBASE-5353
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.94.0
>            Reporter: Jesse Yates
>            Priority: Minor
>
> Currently, the HMaster node must be considered a 'special' node (single point 
> of failure), meaning that the node must be protected more than the other 
> commodity machines. It should be possible to instead have the HMaster be much 
> more available, either in a distributed sense (meaning a bit rewrite) or with 
> multiple instances and automatic failover. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
