On Mon, Jul 4, 2011 at 1:28 PM, Stack <st...@duboce.net> wrote:

> On Sun, Jul 3, 2011 at 12:39 AM, Andrew Purtell <apurt...@apache.org>
> wrote:
> > I've done exercises in the past like delete META on disk and recreate it
> with the earlier set of utilities (add_table.rb). This always "worked for
> me" when I've tried it.
> >
>
> We need to update add_table.rb at least.  The onlining of regions was
> done by the metascan.  It no longer exists in 0.90.  Maybe a
> disable/enable after an add_table.rb would do but probably better to
> revamp and merge it with hbck?
>
>
> > Results from torture tests that HBase was subjected to in the timeframe
> leading up to 0.90 also resulted in better handling of .META. table related
> errors. They are fortunately demonstrably now rare.
> >
>
> Agreed.
>
>
> >My concern here is getting repeatable results demonstrating HBCK
> weaknesses will be challenging.
> >
>
> Yes.  This is the tough one.  I was hoping Wayne had a snapshot of
> .META. to help at least characterize the problem.
>
> (This does sound like something our Dan Harvey ran into recently on an
> hbase 0.20.x hbase.  Let me go back to him.  He might have some input
> that will help here.)
>
> St.Ack
>

If the root of this issue is the master's disk filling up, it is not
totally an hbase issue. If you search the hadoop mailing list you will find
people whose NameNode disk filled up and who had quite a catastrophic,
hard-to-recover-from failure.

Monitor the s#it out of your SPOFs. To throw something very anecdotal in
here, I find that not many data stores recover well from full-disk errors.
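
For what it's worth, even a crude free-space check on the master and
NameNode boxes beats finding out after the fact. Here is a minimal sketch
using only the Python standard library. The function names and the 90%
threshold are my own placeholders, not anything that ships with hbase or
hadoop, and in practice you would wire this into whatever monitoring you
already run.

```python
import shutil


def disk_used_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; treat them as empty.
        return 0.0
    # Derive usage from `free` so the result reflects space actually available.
    return 1.0 - (usage.free / usage.total)


def is_disk_nearly_full(path, threshold=0.90):
    """True if the filesystem holding `path` is at or above `threshold` used.

    The 0.90 default is an arbitrary illustration; pick what suits your cluster.
    """
    return disk_used_fraction(path) >= threshold


# Hypothetical usage from cron on a SPOF host (the path is an assumption):
#   if is_disk_nearly_full("/data/namenode"):
#       alert however you normally alert
```

Run it against every volume that holds NameNode or master state, and alert
well before 100% so there is still time to react.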
