Re: HBase and failure notification

David Van Couvering Thu, 26 Feb 2009 12:00:59 -0800

Thanks for the answers, St. Ack.  That name is very very familiar, and I am
married to a woman named Linda.  Look for me on Facebook :)


I'll look into HDFS to understand the failure semantics for things like
network partitions, etc.

David

On Thu, Feb 26, 2009 at 10:54 AM, stack <[email protected]> wrote:

> On Thu, Feb 26, 2009 at 10:17 AM, David Van Couvering <
> [email protected]> wrote:
>
> >
> > HBase is obviously clustered, but what I can't figure out is how it does
> > cluster management.  It looks like you have to configure it to tell it all
> > the machines that have region servers, and that implies to me that *you*
> > have to start and manage the region servers - HBase doesn't do any of that
> > for you.  So I think that means that it doesn't have any node monitoring
> > support - you have to have your own monitoring system that detects failed
> > nodes and notifies you and/or restarts them for you.
> >
>
>
> It'll start them all for you.  If one dies, it deals with reallocating the
> downed server's regions.  It doesn't call the data center to schedule the
> disk replacement for you (smile).
>
>
>
> > Also, the architecture document says "if [the master server] detects a
> > HRegionServer is no longer reachable, it will split the HRegionServer's
> > write-ahead log so that there is now one write-ahead log for each region
> > that the HRegionServer was serving. After it has accomplished this, it will
> > reassign the regions that were being served by the unreachable
> > HRegionServer"
> >
> > This seems to imply that even though the HRegionServer is unreachable,
> > somehow its write-ahead log and the regions it was serving are still
> > reachable.  Perhaps I don't fully understand HDFS, but is this a guarantee
> > when the node hosting the HRegionServer is down?  What happens if you can't
> > get to the write-ahead log and/or some of the regions the region server was
> > serving?
>
>
> Its log is written into the HDFS, a distributed file system that by default
> replicates all that is written to it.  A member of the HDFS cluster might go
> down and take some data with it, but because the data is replicated, when the
> commit log is replayed, it'll be using one of the still online replicas.
>
> (Do you know a woman named Linda?)
>
> St.Ack
>
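
For anyone reading this thread later, the log-splitting step quoted from the
architecture document can be pictured roughly as follows. This is only a
toy sketch in Python, not HBase code: the LogEntry type, its fields, and the
split_log function are hypothetical names made up for illustration, and the
real master reads the log from HDFS rather than from an in-memory list.

```python
from collections import defaultdict
from typing import NamedTuple


class LogEntry(NamedTuple):
    """Hypothetical stand-in for one edit in a region server's shared log."""
    region: str
    sequence_id: int
    edit: str


def split_log(entries):
    """Group a shared write-ahead log into one ordered log per region."""
    per_region = defaultdict(list)
    for entry in entries:
        per_region[entry.region].append(entry)
    # Replay order matters, so keep each region's edits sorted by sequence id.
    for region_entries in per_region.values():
        region_entries.sort(key=lambda e: e.sequence_id)
    return dict(per_region)


# One log shared by all regions the failed server was hosting.
shared_log = [
    LogEntry("region-a", 1, "put row1"),
    LogEntry("region-b", 2, "put row9"),
    LogEntry("region-a", 3, "delete row1"),
]
split = split_log(shared_log)
```

Once each region has its own log, whichever server picks up that region can
replay just those edits before it starts serving again.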



-- 
David W. Van Couvering

I am looking for a senior position working on server-side Java systems.
 Feel free to contact me if you know of any opportunities.

http://www.linkedin.com/in/davidvc
http://davidvancouvering.blogspot.com
http://twitter.com/dcouvering
