Kevin,

Not at all preachy. Thank you very much for all the good, useful info. I think your information will benefit others in the community if they run into the same problem I had. I wish I had seen your post a day earlier.

The cluster where I hit that weird problem is a dev cluster, and the issue was blocking someone's work, so I took the brute-force route to restore it as quickly as possible. But yes, you are right: in production, deleting things without knowing the root cause is dangerous, so your approach is very much the desired one in that case.

Cheers,
Shumin

On Wed, Oct 3, 2012 at 6:13 AM, Kevin O'dell <[email protected]> wrote:

> Shumin,
>
> In the future for these kinds of issues please be more methodical about
> it. It is important to collect information along the way, unless this
> hbase instance is something you don't care about losing. I would
> recommend:
>
> hadoop fs -lsr /hbase > lsr_before.out
>
> hbase hbck -details 2>&1 | tee details_before.out
>
> Look at the details and try to understand what is happening, if you don't
> please submit your questions and concerns here while providing us data
>
> hadoop fs -cp /hbase/.META. /tmp/.META.
>
> hbase hbck -fixMeta -fixAssignments 2>&1 | tee fix.out
>
> Try to drop your table. If that doesn't work
>
> hadoop fs -rmr /hbase/phantom_table
>
> hbase hbck -fixMeta -fixAssignments 2>&1 | tee fix1.out
>
> This should correct the issue as it will rebuild META like the table was
> never there. You may end up with some bad data cached for META so restart
> your master if you are having issues. This way we can see what was done,
> follow along with data and be able to give complete diagnosis as well as
> file JIRAs to strengthen the product.
>
> Sorry for being preachy :)
>
> On Tue, Oct 2, 2012 at 7:26 PM, Shumin Wu <[email protected]> wrote:
>
> > Thanks for reply!
> >
> > I did a
> > hbase hbck -fixAssignments
> >
> > results showed
> > 0 inconsistencies detected.
> > Status: OK
> >
> > But the problem is still there -- hbase says table existing when I tried to
> > create and table not existed when i tried to delete.
> >
> > Then I brutally removed the phantom table from hdfs, rebooted the server,
> > and hoped to see the problem gone.
> >
> > But then met the NotAllMetaRegionsOnlineException error. So I did another
> > hbck.
> >
> > Then suddenly I was able to recreate the phantom table again. Problem
> > solved.
> >
> > Shumin
> >
> > On Tue, Oct 2, 2012 at 4:07 PM, <[email protected]> wrote:
> >
> > > Can you try using hbck ?
> > >
> > > In the future, don't remove anything before using hbck.
> > >
> > > Thanks
> > >
> > > On Oct 2, 2012, at 3:55 PM, Shumin Wu <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am using HBase 0.92 and got stuck with deletion/recreation of a phantom
> > > > table. The table became "phantom" because hbase server went offline during
> > > > the first time when it got deleted. Since then I cannot recreate the table
> > > > because of the inconsistency catalog information.
> > > >
> > > > Below is what I got from the hbase shell (I replaced the table name with a
> > > > fake name).
> > > >
> > > > =============================================
> > > >
> > > > hbase(main):005:0> create 'phantom_table','cf'
> > > >
> > > > ERROR: Table already exists: phantom_table!
> > > >
> > > > Here is some help for this command:
> > > > Create table; pass table name, a dictionary of specifications per
> > > > column family, and optionally a dictionary of table configuration.
> > > > Dictionaries are described below in the GENERAL NOTES section.
> > > > Examples:
> > > >
> > > > hbase> create 't1', {NAME => 'f1', VERSIONS => 5}
> > > > hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
> > > > hbase> # The above in shorthand would be the following:
> > > > hbase> create 't1', 'f1', 'f2', 'f3'
> > > > hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
> > > > hbase> create 't1', 'f1', {SPLITS => ['10', '20', '30', '40']}
> > > > hbase> create 't1', 'f1', {SPLITS_FILE => 'splits.txt'}
> > > >
> > > > hbase(main):006:0> disable 'phantom_table'
> > > >
> > > > ERROR: Table phantom_table does not exist.'
> > > >
> > > > Here is some help for this command:
> > > > Start disable of named table: e.g. "hbase> disable 't1'"
> > > >
> > > > =============================================
> > > >
> > > > Here are the attempts I already did for investigation.
> > > >
> > > > - I did a fsck on /hbase. It reported healthy.
> > > >
> > > > - I did a scan on .META.. The phantom table is not listed there.
> > > >
> > > > - I checked zookeeper's hbase directory and found the phantom table. I
> > > > deleted the entry but the problem reported above still persistent.
> > > >
> > > > Any comments or suggestions are highly appreciated!
> > > >
> > > > Thanks,
> > > >
> > > > Shumin
> > > >
> > >
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
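
For anyone landing on this thread later, the sequence Kevin recommends above could be wrapped in a small script along these lines. This is only a sketch: the hadoop and hbase commands are copied from his message, while the `run` helper, the stage names (`plan`, `collect`, `fix`, `purge`), and the `DRY_RUN`, `TABLE`, and `OUT_DIR` variables are made up for illustration. By default it only prints the commands, so nothing is touched until `DRY_RUN=0` is set explicitly.

```shell
#!/usr/bin/env bash
# Sketch of the collect-then-repair sequence from the quoted message.
# DRY_RUN, TABLE, OUT_DIR, and the stage names are hypothetical knobs
# added for this example; only the hadoop/hbase commands come from the thread.
set -u

DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands; 0 = execute them
TABLE="${TABLE:-phantom_table}"  # table directory name under /hbase
OUT_DIR="${OUT_DIR:-.}"          # where the evidence files are written

# Print a command line, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    bash -c "$*"
  fi
}

# Record the HDFS layout and the hbck report before changing anything.
collect_evidence() {
  run "hadoop fs -lsr /hbase > $OUT_DIR/lsr_before.out"
  run "hbase hbck -details 2>&1 | tee $OUT_DIR/details_before.out"
}

# Keep a copy of the catalog table directory.
backup_meta() {
  run "hadoop fs -cp /hbase/.META. /tmp/.META."
}

# Run the hbck repair and keep its output in the file named by $1.
repair() {
  run "hbase hbck -fixMeta -fixAssignments 2>&1 | tee $OUT_DIR/$1"
}

# Last resort: remove the table directory from HDFS.
remove_table_dir() {
  run "hadoop fs -rmr /hbase/$TABLE"
}

case "${1:-plan}" in
  collect) collect_evidence; backup_meta ;;
  fix)     repair fix.out ;;
  purge)   remove_table_dir; repair fix1.out ;;  # only if dropping the table still fails
  plan)    DRY_RUN=1
           collect_evidence; backup_meta; repair fix.out
           remove_table_dir; repair fix1.out ;;
  *)       echo "usage: $0 [plan|collect|fix|purge]" >&2 ;;
esac
```

Saved under a name such as `hbase_phantom_fix.sh` (hypothetical), running it with no arguments prints the full plan. `DRY_RUN=0 ./hbase_phantom_fix.sh collect` would gather the evidence files, which leaves time to read the hbck details and try dropping the table from the shell before deciding whether `fix` or `purge` is needed.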
