Thanks for your reply, J-D.
My comments are inline.
> > on first start, HBase master could not start, we had to rmr /hbase from
> > zookeeper. Is it safe ? By the way, why do we have to do that ?
>
> You should not have to do that, what was the reason the master didn't
> start? It's hard to answer "By the way, why do we have to do that ?"
> if you don't give us anything to ponder about.
>
>
Well, I cannot reproduce it, but let me take the opportunity to digress a bit.
Assuming this is a fresh start and everything has been stopped
gracefully: is there any risk in removing the ZooKeeper content?
What valuable data is stored there?
> > Lastly, we faced a weird "nil class" under hbase shell when we tried to
> > 'list'. It seems that the root cause was oldlogfiles.log like
> > /hbase/<table_name>/227772165/oldlogfile.log. There was nothing but this
> > file under /hbase/<table_name>/227772165. Is it safe to remove them ?
>
> Woah hold on there, there's a lot of info missing.
>
> You say "we faced a weird 'nil class' ", what was it like?
>
> Then "It seems that the root cause was oldlogfiles.log", how so?
>
> And about "Is it safe to remove them ?", it depends... is 227772165 an
> active region? Is the file old? If yes to both, then it's safe to
> delete as it would just be an old leftover which was fixed in the
> meantime.
>
>
Yes, right. Here are the details.
hbase shell output:
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.92.0, r1231986, Mon Jan 16 13:16:35 UTC 2012
hbase(main):001:0> list
TABLE
ERROR: undefined method `map' for nil:NilClass
Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
This gives the corresponding error in the master logs:
2012-03-01 08:56:33,338 WARN org.apache.hadoop.hbase.master.HMaster: Failed getting all descriptors
org.apache.hadoop.hbase.TableExistsException: No descriptor for gre014749
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
    at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
Walking through HDFS:
hadoop_storage fs -lsr /hbase/gre014749
drwxr-xr-x - hadoop supergroup 0 2011-07-22 07:24 /hbase/gre014749/196843901
-rw-r--r-- 3 hadoop supergroup 188392 2011-07-22 07:24 /hbase/gre014749/196843901/oldlogfile.log
I could not find any reference to this region in the output of
status 'detailed'. I also guess that if the region were in use, there would
be additional files under 196843901/ ?