Re: HBase and failure notification

David Van Couvering Thu, 26 Feb 2009 12:01:38 -0800

Cool stuff, thanks!

David


On Thu, Feb 26, 2009 at 11:13 AM, Jim Kellerman (POWERSET) <[email protected]> wrote:

> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]
> > On Behalf Of David Van Couvering
> > Sent: Thursday, February 26, 2009 10:18 AM
> > To: [email protected]
> > Subject: HBase and failure notification
> >
> > Hey, all.  I'm doing a bit of a survey of distributed key/value stores
> > out there.  HBase looks pretty interesting, nice to see an open source
> > version of BigTable out there.
> >
> > HBase is obviously clustered, but what I can't figure out is how it does
> > cluster management.  It looks like you have to configure it to tell it
> > all the machines that have region servers, and that implies to me that
> > *you* have to start and manage the region servers - HBase doesn't do any
> > of that for you.
>
> There are start and stop scripts that will start up the master and region
> servers.
>
> > So I think that means that it doesn't have any node monitoring
> > support - you have to have your own monitoring system that detects
> > failed nodes and notifies you and/or restarts them for you.
>
> HBase has a web UI that you can use to monitor the state of the cluster.
> The master does detect when a region server becomes unreachable.
>
> If you mean machine failure, HBase does not have built-in monitoring,
> but you can use Ganglia to monitor hardware status. HBase can also
> feed metrics to Ganglia.
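
To make "the master does detect when a region server becomes unreachable" concrete, here is a minimal sketch of a generic timeout-based failure detector. It is not HBase's actual code: the HeartbeatMonitor class, its method names, and the timeout value are all hypothetical, and it only assumes that each server periodically reports in to a central process.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per server and reports servers gone silent.

    Hypothetical illustration only; not taken from the HBase code base.
    """

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # server name -> time of last heartbeat

    def heartbeat(self, server):
        """Record that `server` reported in just now."""
        self.last_seen[server] = self.clock()

    def unreachable(self):
        """Return the sorted names of servers silent for longer than the timeout."""
        now = self.clock()
        return sorted(
            server
            for server, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

A detector like this only tells you that a server process stopped reporting. It does not say why, which is where an external tool such as Ganglia comes in for hardware status.
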
>
> >
> > Also, the architecture document says "if [the master server] detects a
> > HRegionServer is no longer reachable, it will split the HRegionServer's
> > write-ahead log so that there is now one write-ahead log for each region
> > that the HRegionServer was serving. After it has accomplished this, it
> > will reassign the regions that were being served by the unreachable
> > HRegionServer"
> >
> > This seems to imply that even though the HRegionServer is unreachable,
> > somehow its write-ahead log and the regions it was serving are.
> > Perhaps I don't fully understand HDFS, but is this a guarantee when the
> > node hosting the HRegionServer is down?  What happens if you can't get
> > to the write-ahead log and/or some of the regions the region server was
> > serving?
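
The log split described in the architecture quote can be pictured with a small sketch. This is an illustration of the idea only, assuming a log is a sequence of (region, row, value) entries; the function name and the entry layout are made up here and are not HBase's on-disk format or API.

```python
from collections import defaultdict


def split_log_by_region(log_entries):
    """Split one shared write-ahead log into one log per region.

    `log_entries` is an iterable of (region, row, value) tuples in the order
    they were written. The relative order of edits within each region is kept.
    """
    per_region = defaultdict(list)
    for region, row, value in log_entries:
        per_region[region].append((row, value))
    return dict(per_region)
```

Once each region has its own log, whichever server picks up that region only needs to replay that region's edits.
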
>
> HDFS replicates data to multiple machines (3 by default), so unless you
> have a catastrophic outage, it is very unlikely that the data will be
> completely unreachable.
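
A rough back-of-the-envelope sketch of why replication makes complete loss unlikely. It assumes, unrealistically, that machines fail independently with the same probability, and the function name and the example numbers are hypothetical. Correlated failures (a rack or power outage, the "catastrophic outage" case) are exactly what this model ignores.

```python
def all_replicas_down_probability(node_failure_probability, replication=3):
    """Probability that every replica of a block is on a failed node.

    Assumes independent node failures, which is a simplification.
    """
    if not 0.0 <= node_failure_probability <= 1.0:
        raise ValueError("node_failure_probability must be between 0 and 1")
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return node_failure_probability ** replication
```

For example, if each node were down 1% of the time, three independent replicas would all be down together about one time in a million under this model.
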
>
> > Thanks,
> >
> > David
> >
> > --
> > David W. Van Couvering
> >
> > I am looking for a senior position working on server-side Java systems.
> >  Feel free to contact me if you know of any opportunities.
> >
> > http://www.linkedin.com/in/davidvc
> > http://davidvancouvering.blogspot.com
> > http://twitter.com/dcouvering
>



