Re: When does a row become highly available?

Seth Ladd Fri, 11 Dec 2009 10:35:50 -0800

> You are talking about durability, not HA.

Good point, thanks.  I meant HA for the data, but data durability
makes more sense.


> To have a better understanding I recommend reading our architecture
> page http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture and the
> Bigtable paper.

Thanks, I've been studying that today.

> In short, when you write a row it goes into the write-ahead-log and
> then right after that in MemStore. Once the MemStore is full (64MB) or
> for some other reasons, it is flushed to disk where the file is
> replicated (transparently).

Each RegionServer has its own WAL, yes?  From the Architecture page:

When a write request is received, it is first written to a write-ahead
log called a HLog. All write requests for every region the region
server is serving are written to the same log. Once the request has
been written to the HLog, it is stored in an in-memory cache called
the Memcache. There is one Memcache for each HStore.
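
To check my reading of that excerpt, here is a rough toy model in Python of
the write path it describes. This is only a sketch, not HBase code: the
names ToyRegionServer, put, flush, crash and replay are hypothetical, the
"log" is a plain list rather than a file on HDFS, and the flush threshold
counts rows instead of bytes.

```python
class ToyRegionServer:
    """Toy model: one shared log per server, one in-memory cache per store."""

    def __init__(self, flush_threshold=3):
        self.hlog = []        # shared write-ahead log for every region served
        self.memcaches = {}   # (region, store) -> {row: value}; lost on a crash
        self.flushed = {}     # (region, store) -> {row: value}; stands in for disk
        self.flush_threshold = flush_threshold

    def put(self, region, store, row, value):
        # 1. Append the edit to the shared log first.
        self.hlog.append((region, store, row, value))
        # 2. Only then update the in-memory cache for that store.
        cache = self.memcaches.setdefault((region, store), {})
        cache[row] = value
        # 3. Flush once the cache is "full" (row count here, not 64MB).
        if len(cache) >= self.flush_threshold:
            self.flush(region, store)

    def flush(self, region, store):
        # Move the cached rows for one store to the stand-in for disk.
        cache = self.memcaches.pop((region, store), {})
        self.flushed.setdefault((region, store), {}).update(cache)

    def crash(self):
        """Simulate a failure: in-memory state is gone, the log survives."""
        self.memcaches = {}
        return list(self.hlog)


def replay(hlog):
    """Rebuild per-store contents by re-applying every logged edit in order.

    The toy log is never truncated, so this also re-applies edits that
    were already flushed; re-applying them is harmless here.
    """
    recovered = {}
    for region, store, row, value in hlog:
        recovered.setdefault((region, store), {})[row] = value
    return recovered


if __name__ == "__main__":
    server = ToyRegionServer()
    server.put("region-1", "family-a", "row1", "v1")
    server.put("region-2", "family-a", "row9", "v9")
    surviving_log = server.crash()   # unflushed rows are gone from memory
    print(replay(surviving_log))     # but can be rebuilt from the log
```

In this model the unflushed rows can only be recovered if the log itself
outlives the failed server, which is what my question below comes down to.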

This is what confuses me: if the write goes straight to a RegionServer,
and the RegionServer then fails before the MemStore is flushed, did I
just lose data?

> If the node fails, the Master will process the WAL so that you don't

So do all writes go through the Master?  Clearly I'm a bit confused here :)

> lose rows in the MemStore. Prior to Hadoop 0.21 (unreleased), the

Moral of the story is to upgrade to 0.21 ASAP. :)

Thanks!

Seth
