OK, so if not all of the data came back, it could be a bug, although
it may already have been fixed since we iterate very fast on the
0.89 releases (which are developer preview releases, not meant for
production).

When a region server crashes, the master splits all the write-ahead
logs and the regions are then distributed to the remaining region
servers. It's all automatic. Even if it happened during a major
compaction, the original store files aren't deleted until the new
store file is created.
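
To make the log splitting step a bit more concrete, here is a toy
model in Python. It is only a sketch of the idea, not HBase code: the
(region, sequence id, edit) tuple layout and the split_log name are
made up for illustration, and the real log format is more involved.

```python
from collections import defaultdict


def split_log(entries):
    """Group write-ahead-log entries by region, ordered by sequence id.

    `entries` is an iterable of (region_name, sequence_id, edit) tuples.
    This tuple layout is a hypothetical stand-in for the real log format.
    """
    per_region = defaultdict(list)
    for region, seq_id, edit in entries:
        per_region[region].append((seq_id, edit))
    # Each region replays its own edits in sequence-id order.
    for edits in per_region.values():
        edits.sort(key=lambda item: item[0])
    return dict(per_region)


if __name__ == "__main__":
    # One crashed server's log, with edits for two regions interleaved.
    log = [("r1", 2, "put b"), ("r2", 1, "put x"), ("r1", 1, "put a")]
    for region, edits in sorted(split_log(log).items()):
        print(region, edits)
```

The point of the split is that each region ends up with its own list
of edits, so whichever server picks the region up can replay them
without reading the other regions' edits.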

Did the master encounter any fatal exceptions while splitting the
logs? Did you take a look at the master's log file? Can you figure
out which rows in .META. are missing (there would be holes in the
key ranges)?
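
As a rough sketch of what looking for holes means: assuming you have
copied the start and end keys of one table's regions out of a scan of
.META. into a list of pairs, something like the Python below would
report the gaps. The find_holes name and the input layout are my own
assumptions, and it compares keys as plain strings, which only matches
HBase's byte ordering for simple ASCII keys.

```python
def find_holes(regions):
    """Return (gap_start, gap_end) pairs not covered by any region.

    `regions` is a list of (start_key, end_key) string pairs for ONE table.
    An empty start key means "start of table"; an empty end key means
    "end of table". This input layout is an assumption for illustration.
    """
    if not regions:
        return [("", "")]  # nothing listed: the whole key space is missing

    ordered = sorted(regions, key=lambda r: r[0])
    holes = []

    # The first region should start at the beginning of the table.
    if ordered[0][0] != "":
        holes.append(("", ordered[0][0]))

    # Each region's end key should equal the next region's start key.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if prev_end != "" and prev_end < next_start:
            holes.append((prev_end, next_start))

    # The last region should run to the end of the table.
    if ordered[-1][1] != "":
        holes.append((ordered[-1][1], ""))

    return holes


if __name__ == "__main__":
    # The region covering keys d..f is missing in this made-up example.
    print(find_holes([("", "b"), ("b", "d"), ("f", "")]))
```

It only reports gaps; overlapping regions are ignored here and would
need a separate check.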

J-D

On Fri, Aug 13, 2010 at 3:18 PM, Jeremy Carroll
<[email protected]> wrote:
> We are using CDH3 Beta 2.
> ________________________________________
> From: [email protected] [[email protected]] On Behalf Of Jean-Daniel Cryans 
> [[email protected]]
> Sent: Friday, August 13, 2010 4:50 PM
> To: [email protected]
> Subject: Re: HBase recovery
>
> Which version? Prior to HBase 0.89 + Hadoop 0.20-append (or cdh3),
> HBase cannot guarantee durability of the latest inserts (this includes
> edits to .META.)
>
> J-D
>
> On Fri, Aug 13, 2010 at 2:45 PM, Jeremy Carroll
> <[email protected]> wrote:
>> During some testing of a small development cluster, one of the RegionServers 
>> that we employ has an issue with a bad RAM stick. So when it gets into heavy 
>> RAM operation it likes to crash. Here is my question. We had an issue where 
>> the RegionServer holding .META. crashed. The entire cluster was unusable as 
>> it did not reassign .META. to a different region. Also when the server goes 
>> down, what happens to all the regions that it held? Does it reassign them to 
>> other region servers? Also, what is the correct action for recovery? It
>> crashed during a major_compaction, so how do I verify that I am not missing
>> data? I see that I had 166 regions online on this server before the crash,
>> and now after the crash it has 158. What are the correct steps to recover
>> HBase after a major crash?
>
