I want to talk about sync() in HDFS for a bit... I had a cluster crash, OOMEs out the butt, 17/19 machines were dead when I got to the scene.
What I found was that in .META. there were 2-3x as many regions as were actually on disk, plus tons of older entries from parent splits. Looks like a bunch of updates and deletes weren't persisted. And by a bunch, I mean a SHIT TON. It was insane. I had to write HbaseFsck.java as an experiment to recover without rm -rf /hbase.

So, what will be in hadoop-0.20 to minimize this kind of horrible data loss? Is this the 'sync()' call that is on-again-off-again reliable? What about append? Do we really need append? Syncing an open file to persist data is good enough, no? (A rough sketch of what I mean by that is at the bottom of this mail.)

-ryan

On Thu, Apr 2, 2009 at 5:34 PM, Jim Kellerman (POWERSET) <[email protected]> wrote:
> > -----Original Message-----
> > From: Erik Holstad [mailto:[email protected]]
> > Sent: Thursday, April 02, 2009 5:09 PM
> > To: [email protected]
> > Subject: Re: thinking about hbase 0.20
> >
> > So the way I see it, from our point of view, we can probably get 0.20 out
> > the door a week after that meeting, so maybe a week and a half after Stack
> > gets back.
>
> We still have to wait for hadoop-0.20, which has no release candidate yet.
> However, pushing tasks out is still a good idea so that we can spend the
> time between the hadoop-0.20 release candidate and hbase-0.20 fixing issues,
> which I'm certain we will find. All in all this should result in a more
> timely and stable release for hbase-0.20.
>
> -Jim
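
P.S. Here is the rough sketch I mentioned above, just to pin down what I mean by "syncing an open file". It is written against the FSDataOutputStream.sync() call in the hadoop-0.20 client API; the class name and path are made up for illustration, and whether sync() actually gets the bytes safely onto the datanodes before close() is exactly the question I'm asking:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class SyncSketch {                       // made-up class name
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);

      // Open a write-ahead-log-style file and keep it open, the way a
      // region server keeps its HLog open.
      Path log = new Path("/tmp/sync-sketch-log"); // made-up path
      FSDataOutputStream out = fs.create(log);

      // Write an edit...
      out.writeBytes("put row1 ...\n");

      // ...and ask HDFS to persist it without closing the file. This is
      // the sync() whose reliability is in question: do these bytes
      // survive losing most of the cluster before close()?
      out.sync();

      // Deliberately not calling out.close(): the data-loss scenario is
      // exactly edits written to a file that was still open at crash time.
    }
  }

If sync() on an open file gives that guarantee, then append (reopening a closed file and continuing to write to it) looks like a separate and, for us, lesser need.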
