You can improve on the current situation with a little development effort, as we (privately) have at my employer. Comments inline below.
> From: Cosmin Lehene
> My answers inline...
>
> On 1/7/09 12:05 PM, "Genady" wrote:
> > Could somebody explain the expected behavior of HBase 0.18.1
> > nodes in the following Hadoop cluster failure cases:
> >
> > - HBase master region server fails
>
> You need to manually set a different machine as master and
> redistribute the configuration files on all other region
> servers and restart the cluster.
>
> Maybe someone in the development team could explain if this
> will change with Zookeeper integration.

First, my understanding is that the ZK integration will handle master role reassignment without requiring a cluster restart. J-D could say more (or deny).

What we (my employer) currently do is run the Hadoop and HBase daemons as child processes of custom monitoring daemons, which write heartbeats into a private DHT that supports TTLs on cells. This same mechanism also supports service discovery. In particular, all HBase processes can be automatically restarted should the location of the master shift. (The location of the master may shift if a node has a hard failure.) We write Hadoop and HBase configuration files on the fly as necessary. This all took me only a few days to implement and a few more to debug.

Relocation of the Hadoop name node is trickier. I believe it is possible to have it write the fs image out to an NFS share such that a service relocation to another host with the same NFS mount will pick up the latest edits seamlessly. However, I do not trust NFS under a number of failure conditions, so I will not try this myself. There may be other, better strategies for replicating the fs image.

> > - HBase slave region server fails
>
> This is handled transparently. Regions served by the failed
> region server are reassigned to the rest of the region
> servers.
>
> > - Hadoop master server fails
>
> I suppose you mean the HDFS namenode. Currently, the
> namenode is a single point of failure in HDFS and needs
> manual intervention to configure a new namenode. A
> secondary namenode can be configured, however this one only
> keeps a metadata replica and does not act as a failover node.
> http://wiki.apache.org/hadoop/NameNode

See my comments above.

> > - Hadoop slave server fails
>
> If an HDFS datanode fails, its files are already replicated
> on 2 other datanodes. Eventually the replication will be
> fixed by the namenode, creating a third replica on one of
> the remaining datanodes.

You want to make sure your client is requesting the default replication. The stock Hadoop config does allow DFS clients to specify a replication factor of only 1. However, the HBase DFS client always requests the default, so this is not an issue for HBase.

> > - Hadoop master and HBase master both fail (in case they're
> > installed on the same computer and, for instance, the disk fails)
>
> These servers run independently so you can see above what
> happens.

Don't run them on the same node regardless. The Hadoop name node can become very busy given a lot of DFS file system activity. Let it have its own dedicated node to avoid problems, e.g. replication stalls.

> > - HBase slave region server fails but the HBase data can be
> > recovered and copied to another node, and the new node is
> > added in place of the failed one
>
> HBase region servers don't actually hold the data. Data
> is stored in HDFS. Region servers just serve regions, and
> when a region server fails the regions are reassigned
> (see above).
>
> Cosmin

- Andy
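
As a rough illustration of the supervision pattern Andy describes (heartbeats with TTLs plus service discovery, with the HBase daemon restarted when the master moves), here is a minimal Java sketch. The DhtClient interface is hypothetical and stands in for the private DHT he mentions; the key names, TTL, poll interval, and restart command are illustrative assumptions, not the actual implementation.

// Minimal sketch of the monitoring-daemon idea described above.
// The DhtClient interface is hypothetical; it stands in for the
// private DHT (with TTLs on cells) that the post refers to. Key
// names, TTL, poll interval, and restart command are illustrative.

import java.util.concurrent.TimeUnit;

public class HBaseSupervisor {

    /** Hypothetical client for a DHT that supports per-cell TTLs. */
    interface DhtClient {
        void put(String key, String value, long ttlMillis); // cell expires after ttlMillis
        String get(String key);                              // null if absent or expired
    }

    private final DhtClient dht;
    private final String hostname;
    private String lastKnownMaster;
    private Process child; // the supervised HBase daemon

    HBaseSupervisor(DhtClient dht, String hostname) {
        this.dht = dht;
        this.hostname = hostname;
    }

    void run() throws Exception {
        lastKnownMaster = dht.get("hbase/master");
        startChild();
        while (true) {
            // 1. Heartbeat: the cell expires unless we keep refreshing it,
            //    so a hard node failure makes this host vanish from the DHT.
            dht.put("heartbeat/" + hostname,
                    String.valueOf(System.currentTimeMillis()),
                    TimeUnit.SECONDS.toMillis(30));

            // 2. Service discovery: look up where the master currently is.
            //    (How the new master gets elected and published when the old
            //    one's heartbeat expires is outside this sketch.)
            String master = dht.get("hbase/master");

            // 3. If the master has moved, rewrite the config and restart.
            if (master != null && !master.equals(lastKnownMaster)) {
                lastKnownMaster = master;
                rewriteConfig(master);
                restartChild();
            }
            TimeUnit.SECONDS.sleep(10);
        }
    }

    private void rewriteConfig(String master) {
        // Regenerate hbase-site.xml on the fly to point at the new
        // master location; omitted here for brevity.
    }

    private void startChild() throws Exception {
        // Assumes hbase-daemon.sh is on the PATH of this supervisor.
        child = new ProcessBuilder("hbase-daemon.sh", "start", "regionserver")
                .inheritIO().start();
    }

    private void restartChild() throws Exception {
        if (child != null) {
            child.destroy();
            child.waitFor();
        }
        startChild();
    }
}

In the setup described above, the same supervision would apply to the other Hadoop and HBase daemons as well; this sketch only covers the restart-on-master-move case for a single region server.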
