RE: Question on Availability

Yair Even-Zohar Fri, 07 Nov 2008 09:38:30 -0800

How can we shut down the region servers cleanly if the master is down?

Thanks
-Yair


-----Original Message-----
From: Jim Kellerman (POWERSET) [mailto:[EMAIL PROTECTED] 
Sent: Friday, November 07, 2008 8:54 AM
To: [email protected]
Subject: RE: Question on Availability

Jean-Daniel is correct. Although HDFS has a "warm backup" for
the namenode, it requires physical intervention to start it
up if the primary namenode fails.

With respect to HBase, moving the master is not a big deal.
Shutting down the region servers (cleanly: no kill -9),
pushing a new config which indicates where the new master
will live and restarting the cluster should help get over
a failed master. Eventually we will get rid of the master
as a single point of failure when we integrate Zookeeper,
which will allow us to run a multi-master configuration.
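
For illustration only, the "push a new config" step could be sketched
roughly as below. This is a minimal sketch, not a tested procedure: it
assumes the master address is held in an hbase.master property in
hbase-site.xml (Hadoop-style <configuration>/<property> layout), and the
host names, port, and the set_property helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def set_property(site_xml: str, name: str, value: str) -> str:
    """Return site_xml with property `name` set to `value`, adding it if absent."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            # Property already present: overwrite its value in place.
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not found: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical config pointing at the failed master (assumed property name).
old = """<configuration>
  <property>
    <name>hbase.master</name>
    <value>old-master.example.com:60000</value>
  </property>
</configuration>"""

# Point the cluster at the machine that will host the new master.
new = set_property(old, "hbase.master", "new-master.example.com:60000")
print(new)
```

The rewritten file would then have to be copied to every node before the
cluster is restarted; how the region servers are stopped while the master
is unreachable is not covered by this sketch.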

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)


> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jean-Daniel Cryans
> Sent: Friday, November 07, 2008 6:53 AM
> To: [email protected]
> Subject: Re: Question on Availability
>
> Michael,
>
> The Namenode is also a SPOF.
>
> On keeping a separate cluster for failover, until we add higher
> availability, it depends on how much uptime you need to provide, I
> guess. I personally never saw a machine hosting a Master fail, so I'm
> not sure how clean it can be with regard to the META, but I think that
> just closing your region servers, changing the config for the master,
> then restarting the cluster with a new master would be nearly enough,
> provided that the Namenode was hosted on another machine. Maybe Jim or
> Stack can confirm?
>
> J-D
>
> On Fri, Nov 7, 2008 at 5:33 AM, Michael Dagaev
> <[EMAIL PROTECTED]> wrote:
>
> > Hi, all
> >
> >    I guess that the HBase master server is a single point of failure.
> > Is that correct? Does HBase (I mean the whole stack -- HDFS + HBase)
> > have any other single point of failure?
> >
> >    If HBase has a single point of failure, we should arrange a backup
> > HBase cluster to switch to in case of failure. Does that make sense?
> >
> > Thank you for your cooperation,
> > M.
> >
