kill -0 region-server-pid should do the trick.

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)

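A minimal sketch of stopping a process by PID without kill -9, assuming the region server runs its normal shutdown handling when it receives SIGTERM (an assumption about the deployment, not something confirmed in this thread). On POSIX systems signal 0 delivers nothing and only tests whether the PID can be signalled, so the sketch uses kill -0 as a liveness probe and a plain kill (SIGTERM) as the actual stop request. The helper name stop_cleanly and its timeout argument are hypothetical.

```shell
#!/bin/sh
# stop_cleanly PID [TIMEOUT_SECONDS]
# Hypothetical helper: asks a process to exit via SIGTERM and waits for it.
# Never escalates to kill -9; it reports a failure instead.
stop_cleanly() {
    pid=$1
    timeout=${2:-30}

    # Signal 0 sends nothing; it only reports whether the PID exists
    # and can be signalled by the current user.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is not running"
        return 0
    fi

    # A plain kill sends SIGTERM, which the target can catch in order to
    # run its shutdown handlers (assumed here for the region server).
    kill "$pid"

    # Poll with signal 0 until the process is gone or the timeout expires.
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "process $pid still running after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "process $pid stopped"
}
```

It could be called as stop_cleanly "$REGION_SERVER_PID" 60, where REGION_SERVER_PID is a placeholder for however the PID is looked up on a given node. A non-zero return means the process ignored or is still handling SIGTERM, which is worth investigating before anything more forceful is tried.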
> -----Original Message-----
> From: Yair Even-Zohar [mailto:[EMAIL PROTECTED]
> Sent: Friday, November 07, 2008 9:38 AM
> To: [email protected]
> Subject: RE: Question on Availability
>
> How can we shut down the regionservers cleanly if the master is down?
>
> Thanks
> -Yair
>
> -----Original Message-----
> From: Jim Kellerman (POWERSET) [mailto:[EMAIL PROTECTED]
> Sent: Friday, November 07, 2008 8:54 AM
> To: [email protected]
> Subject: RE: Question on Availability
>
> Jean-Daniel is correct. Although HDFS has a "warm backup" for
> the namenode, it requires physical intervention to start it
> up if the primary namenode fails.
>
> With respect to HBase, moving the master is not a big deal.
> Shutting down the region servers (cleanly: no kill -9),
> pushing a new config which indicates where the new master
> will live and restarting the cluster should help get over
> a failed master. Eventually we will get rid of the master
> as a single point of failure when we integrate Zookeeper,
> which will allow us to run a multi-master configuration.
>
> ---
> Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
>
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jean-Daniel Cryans
> > Sent: Friday, November 07, 2008 6:53 AM
> > To: [email protected]
> > Subject: Re: Question on Availability
> >
> > Michael,
> >
> > The Namenode is also a SPOF.
> >
> > On keeping a separate cluster for failover, until we add higher
> > availability, it depends on how much uptime you need to provide I guess. I
> > personally never saw a machine hosting a Master failing, so I'm not sure on
> > how clean it can be regards the META, but I think that just closing your
> > region servers, changing the config for the master then restart the cluster
> > with a new master would be nearly enough provided that the Namenode was
> > hosted on another machine. Maybe Jim or Stack can confirm?
> >
> > J-D
> >
> > On Fri, Nov 7, 2008 at 5:33 AM, Michael Dagaev <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, all
> > >
> > > I guess that Hbase master server is a single point of failure. Is
> > > it correct ? Does Hbase (I mean the whole stack -- HDFS + Hbase) have
> > > any other single point of failure ?
> > >
> > > If Hbase has a single point of failure we should arrange a backup
> > > Hbase cluster to switch to it in case of failure. Does it make sense ?
> > >
> > > Thank you for your cooperation,
> > > M.
