Yesterday we ran into an interesting issue. We were shutting down our HBase
cluster (0.90.1, CDH3u0), and in the process one of the nodes hit a Java heap
space exception. The bummer is that hadoop fsck then listed that node's log
file as corrupt, and the file could not be read when we restarted the
database. In our case we were able to recover by removing the corrupt log,
and we do not appear to have lost any data.

Has anyone else seen this issue? If I'm reading the situation right, it
looks like a Java heap space error during the WAL checksum write can leave
the WAL corrupt, which doesn't seem like desired behavior.

I'll be looking into it further, but any thoughts would be appreciated.
2011-07-14 14:54:53,741 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
java.io.IOException: Reflection
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:147)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:987)
    at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:964)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor1336.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:145)
    ... 2 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2375)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3271)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3354)
    at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
    at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
    ... 6 more
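
To make the suspected failure mode concrete, here is a minimal, self-contained
sketch. It is not HBase or HDFS code and does not use their on-disk format: the
class name TornWalDemo and the record layout (length, payload, CRC32) are
assumptions made up for illustration. It simulates a writer that dies partway
through appending a record, then shows a reader that keeps the intact prefix of
the log and stops at the torn tail instead of rejecting the whole file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical toy log, NOT the HBase WAL format.
final class TornWalDemo {

    // Assumed record layout: [int length][payload bytes][long CRC32 of payload].
    static void append(DataOutputStream out, String entry) throws IOException {
        byte[] payload = entry.getBytes(StandardCharsets.UTF_8);
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        out.writeInt(payload.length);
        out.write(payload);
        out.writeLong(crc.getValue());
    }

    // Returns every record up to the first truncated or checksum-mismatched one.
    static List<String> readIntactPrefix(byte[] log) throws IOException {
        List<String> entries = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        while (in.available() > 0) {
            try {
                int length = in.readInt();
                if (length < 0 || length > in.available()) {
                    break; // payload was cut short: torn tail record
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                long stored = in.readLong();
                CRC32 crc = new CRC32();
                crc.update(payload, 0, payload.length);
                if (crc.getValue() != stored) {
                    break; // checksum mismatch: stop at the damaged record
                }
                entries.add(new String(payload, StandardCharsets.UTF_8));
            } catch (EOFException e) {
                break; // length field or checksum itself was truncated
            }
        }
        return entries;
    }

    // Writes three records, then keeps only the first 5 bytes of the third
    // to simulate the writer failing in the middle of an append.
    static List<String> demo() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            append(out, "put row1");
            append(out, "put row2");
            int intactLength = bytes.size();
            append(out, "put row3");
            byte[] torn = Arrays.copyOf(bytes.toByteArray(), intactLength + 5);
            return readIntactPrefix(torn);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [put row1, put row2]
    }
}
```

Whether the real reader can skip a damaged tail like this, or whether the
corruption reported by fsck is at the HDFS block level instead, is exactly the
open question; the sketch only shows why a half-written final record need not
make the earlier records unreadable.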