Thanks for the answers, St. Ack. That name is very, very familiar, and I am married to a woman named Linda. Look for me on Facebook :)
I'll look into HDFS to understand the failure semantics for things like network partitions, etc.

David

On Thu, Feb 26, 2009 at 10:54 AM, stack <[email protected]> wrote:

> On Thu, Feb 26, 2009 at 10:17 AM, David Van Couvering <[email protected]> wrote:
>
> > HBase is obviously clustered, but what I can't figure out is how it does
> > cluster management. It looks like you have to configure it to tell it all
> > the machines that have region servers, and that implies to me that *you*
> > have to start and manage the region servers - HBase doesn't do any of that
> > for you. So I think that means that it doesn't have any node monitoring
> > support - you have to have your own monitoring system that detects failed
> > nodes and notifies you and/or restarts them for you.
>
> It'll start them all for you. If one dies, it deals reallocating the downed
> servers regions. It doesn't call the data center to schedule the disk
> replacement for you (smile).
>
> > Also, the architecture document says "if [the master server] detects a
> > HRegionServer is no longer reachable, it will split the HRegionServer's
> > write-ahead log so that there is now one write-ahead log for each region
> > that the HRegionServer was serving. After it has accomplished this, it will
> > reassign the regions that were being served by the unreachable
> > HRegionServer"
> >
> > This seems to imply that even though the HRegionServer is unreachable,
> > somehow it's write-ahead log and the regions it was serving are. Perhaps I
> > don't fully understand HFS, but is this a guarantee when the node hosting
> > the HRegionServer is down? What happens if you can't get to the
> > write-ahead log and/or some of the regions the region server was serving?
>
> Its log is written into the HDFS, a distributed file system that by default
> replicates all that is written to it. A member of the HDFS cluster might go
> down and take some data with it but because the data is replicated, when the
> commit log is replayed, it'll be using one of the still online replicas.
>
> (Do you know a woman named Linda?)
>
> St.Ack

--
David W. Van Couvering

I am looking for a senior position working on server-side Java systems.
Feel free to contact me if you know of any opportunities.

http://www.linkedin.com/in/davidvc
http://davidvancouvering.blogspot.com
http://twitter.com/dcouvering
