On Fri, Sep 24, 2010 at 9:06 PM, Ted Yu <[email protected]> wrote:
> I see this log following the previous snippet:
>
> 2010-09-24 11:21:43,799 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block null bad datanode[0] nodes == null
> 2010-09-24 11:21:43,799 WARN org.apache.hadoop.hdfs.DFSClient: Could not get
> block locations. Source file
> "/hbase/.logs/sjc9-flash-grid02.carrieriq.com,60020,1285347585107/hlog.dat.1285351187512"
> - Aborting...
> 2010-09-24 11:21:45,417 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to close log in
> abort
So we were aborting, and the one thing we'll try to do on our way out when aborting is close the WAL log. Seems like that is what failed in the above. (This stuff is odd -- 'Recovery for block null bad datanode[0] nodes == null'... Is there anything in your datanode logs to explain this? And if you grep for the WAL log name in the namenode log, do you see anything interesting?)

> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException:
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> /hbase/.logs/sjc9-flash-grid02.carrieriq.com,60020,1285347585107/hlog.dat.1285351187512
> File does not exist. Holder DFSClient_302121899 does not have any open

Hmm... it says the file does not exist. So, yeah, for sure, check out the namenode logs.

Hey Ted, are you fellas still running 0.20.x? If so, what would it take to get you fellas up on 0.89, say the RC J-D put up today?

> Would failure from hlog.close() lead to data loss ?
>

Are you not on 0.20 hbase still? If so, yes. If on 0.89 with a Hadoop 0.20 that has append support (the Apache -append branch or CDH3b2), then some small amount may have been lost.

St.Ack
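
[For reference, the namenode-log grep suggested above would look roughly like the following; the log path is only illustrative and depends on where your Hadoop logs actually live:

    grep 'hlog.dat.1285351187512' /path/to/hadoop-*-namenode-*.log

Lines mentioning lease recovery or deletion of that WAL file around 11:21 would be the interesting part.]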
