What version of hbase are you on?

My guess is it's 0.19.x.

How many nodes?  Are you using hdfs or local fs?

The log below doesn't show issues.

As to what happened, I speculate that you loaded up your table and then
some issue caused the hang (did you up your file descriptors, xceivers,
etc.?).  At that point the uploads, in particular the edits that recorded
the creation of your table and the addition of its regions, had not been
persisted.  The hung hbase plus your kill -9 meant the catalog table edits
were lost, so it appears your table is lost.  HDFS does not have a working
flush/sync/append in hadoop 0.19.x, so hbase can lose data.  There is
nothing else you can do when it won't respond, though you could try
./bin/hbase-daemon.sh stop regionserver on each of your regionservers to
try and bring them down nicely.
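
On the file descriptor question, here is a rough sketch of a check you could run. The helper name check_fd_limit and the 32768 target are assumptions made for illustration, not values taken from this thread, so adjust them to whatever your cluster needs.

```shell
# Hypothetical helper: compare an open-file limit against a minimum.
# Prints "ok" when the limit is high enough (or unlimited), otherwise "low".
check_fd_limit() {
  current="$1"
  minimum="$2"
  if [ "$current" = "unlimited" ] || [ "$current" -ge "$minimum" ]; then
    echo "ok"
  else
    echo "low"
  fi
}

# ulimit -n reports the open-file limit for the current shell.
# 32768 is an assumed target, not a number from this thread.
check_fd_limit "$(ulimit -n)" 32768
```

Run it as the user that starts hbase and the datanodes, since the limit is per user and per shell.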

At the head of the 0.19 branch we've narrowed the window in which edits can
be lost (.META. now flushes every few k or so).  I need to put up a 0.19.4
release candidate; I'm held up tracing a new issue here on our home
cluster.

St.Ack

On Thu, Jun 18, 2009 at 9:10 AM, mike anderson <[email protected]> wrote:

> I had about 30,000 rows in my table 'cached_parsedtext'.  This morning when
> I checked, Hbase appeared to be down (master server web UI was not
> responding and the Shell crashed when I tried to count rows). I tried doing
> a nice shutdown via bin/stop-hbase, this hung for about 20 minutes though
> so I gave up and did a kill -9 on the hbase processes (what else was I
> supposed to do!?). Upon restarting I discovered that all of the rows were
> gone. I browsed the filesystem and saw that some of the metadata still
> existed in hadoop dfs. Is there a way to rebuild the table? (After the
> force kill I also did a nice restart of hbase and hadoop -- same results)
>
> A few of the relevant-looking log lines are included below for those that
> speak the language. However, these don't really mean much to me.
>
> logs/hbase-pubget-master-carr.domain.com.log:2009-06-18 11:12:42,038 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: cached_parsedtext,,1244838542607: safeMode=false from 10.0.16.91:60020
> logs/hbase-pubget-master-carr.domain.com.log:2009-06-18 11:12:42,038 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: cached_parsedtext,,1244838542607 open on 10.0.16.91:60020
> logs/hbase-pubget-master-carr.domain.com.log:2009-06-18 11:12:42,039 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row cached_parsedtext,,1244838542607 in region .META.,,1 with startcode 1245337882941 and server 10.0.16.91:60020
> logs/hbase-pubget-master-carr.domain.com.log:2009-06-18 11:31:31,595 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region cached_parsedtext,,1244838542607 to the only server 10.0.16.91:60020
> logs/hbase-pubget-master-carr.domain.com.log:2009-06-18 11:31:34,823 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: cached_parsedtext,,1244838542607: safeMode=false from 10.0.16.91:60020
>
> Ideally I'd love to get my table back, but if not, learning how to avoid
> this in the future would be great.
>
>
> Thanks in advance,
> Mike
>
