Try to run "habase hbck -fix" It should do the job. Thank you!
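In case it helps, here is roughly what I would run, starting with a check-only pass before letting it repair anything (a minimal sketch; exact flags depend on your HBase version):

    hbase hbck            # report inconsistencies only, changes nothing
    hbase hbck -details   # more verbose report, should list the affected regions
    hbase hbck -fix       # attempt to repair the reported inconsistencies

If you do end up pulling the data back out of the orphaned HFiles by hand, the HFile tool can dump their contents (the path below is just a placeholder for one of your orphaned files):

    hbase org.apache.hadoop.hbase.io.hfile.HFile -p -f <hdfs-path-to-hfile>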
Sincerely,
Leonid Fedotov

On Apr 12, 2013, at 9:56 AM, Brennon Church wrote:

> hbck does show the hdfs files there without associated regions. I probably
> could have recovered had I noticed just after this happened, but given that
> we've been running like this for over a week, and that there is the potential
> for collisions between the missing and new data, I'm probably just going to
> manually reinsert it all using the hdfs files.
>
> Hadoop version is 1.0.1, btw.
>
> Thanks.
>
> --Brennon
>
> On 4/11/13 11:05 PM, Ted Yu wrote:
>> Brennon:
>> Have you run hbck to diagnose the problem?
>>
>> Since the issue might have involved hdfs, browsing DataNode log(s) may
>> provide some clue as well.
>>
>> What hadoop version are you using?
>>
>> Cheers
>>
>> On Thu, Apr 11, 2013 at 10:58 PM, ramkrishna vasudevan <
>> [email protected]> wrote:
>>
>>> When you say that the parent regions got reopened, does that mean that you
>>> did not lose any data (no data became unreadable)? The reason I am asking is
>>> that if, after the parent got split into daughters, the data was written to
>>> the daughters and the daughters' files could not be opened, you could have
>>> ended up unable to read the data.
>>>
>>> Some logs could tell us what made the parent get reopened rather than the
>>> daughters. Another thing I would like to ask: was the cluster brought down
>>> abruptly by killing the RS?
>>>
>>> Which version of HBase?
>>>
>>> Regards
>>> Ram
>>>
>>> On Fri, Apr 12, 2013 at 11:20 AM, Brennon Church <[email protected]>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I had an interesting problem come up recently. We have a few thousand
>>>> regions across 8 datanode/regionservers. I made a change, increasing the
>>>> heap size for hadoop from 128M to 2048M, which ended up bringing the
>>>> cluster to a complete halt after about 1 hour. I reverted back to 128M and
>>>> turned things back on again, but didn't realize at the time that I came up
>>>> with 9 fewer regions than I started with. Upon further investigation, I
>>>> found that all 9 missing regions were from splits that occurred while the
>>>> cluster was running, after making the heap change and before it came to a
>>>> halt. There was a 10th region (5 splits involved in total) that managed to
>>>> get recovered. The really odd thing is that in the case of the other 9
>>>> regions, the original parent regions, which as far as I can tell from the
>>>> logs were deleted, were re-opened upon restarting things once again. The
>>>> daughter regions were gone. Interestingly, I found the orphaned datablocks
>>>> still intact, and in at least some cases have been able to extract the data
>>>> from them and will hopefully re-add it to the tables.
>>>>
>>>> My question is this. Does anyone know, based on the rather muddled
>>>> description I've given above, what could have possibly happened here? My
>>>> best guess is that the bad state that hdfs was in caused some critical
>>>> component of the split process to be missed, which resulted in a reference
>>>> to the parent regions sticking around and losing the references to the
>>>> daughter regions.
>>>>
>>>> Thanks for any insight you can provide.
>>>>
>>>> --Brennon
