[ https://issues.apache.org/jira/browse/HBASE-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091896#comment-13091896 ]
Dave Latham commented on HBASE-4107:
------------------------------------

It looks like HLog main supports a --split option. If I invoke that on the log, will it split the log and put the data into the right place?

We had a handful of regionservers go OOM yesterday while a MR job was doing heavy writes to a column family that doesn't usually get them. In this case, the first OOM occurred here, while writing the checksum.

> OOME while writing WAL checksum causes corrupt WAL
> --------------------------------------------------
>
>                 Key: HBASE-4107
>                 URL: https://issues.apache.org/jira/browse/HBASE-4107
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>    Affects Versions: 0.90.1
>         Environment: CentOS 5.5x64
>            Reporter: Andy Sautins
>         Attachments: master.splitting.log, regionserver.oom.log
>
>
> An issue was observed where, upon shutdown of a regionserver, the regionserver
> log was corrupt. It appears from the following stacktrace that a Java heap
> memory exception occurred while writing the checksum to the WAL. Corrupting
> the WAL can potentially cause data loss.
>
> 2011-07-14 14:54:53,741 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
> java.io.IOException: Reflection
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:147)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:987)
>     at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:964)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.GeneratedMethodAccessor1336.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:145)
>     ... 2 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2375)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3271)
>     at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
>     at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3354)
>     at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>     at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
>     ... 6 more
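
For anyone triaging similar regionserver logs, the reading of the trace (the innermost "Caused by" is the OutOfMemoryError, raised under the checksum write) can be checked mechanically. This is only a minimal Python sketch, assuming an abbreviated copy of the trace quoted in this issue. The names `root_cause`, `CAUSE`, and `FRAME` are hypothetical helpers for illustration and are not part of HBase or Hadoop.

```python
import re

# Abbreviated copy of the trace quoted in this issue (some frames trimmed).
TRACE = """\
java.io.IOException: Reflection
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:147)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor1336.invoke(Unknown Source)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$Packet.<init>(DFSClient.java:2375)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:3271)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
"""

# An exception header line, optionally prefixed with "Caused by: ".
CAUSE = re.compile(r"^(?:Caused by: )?([\w.$]+(?:Exception|Error))(?:: (.*))?$")
# A stack frame line: "at <class>.<method>(<file>:<line>)".
FRAME = re.compile(r"^\s*at ([\w.$<>]+)\(")


def root_cause(trace):
    """Return the last exception in the cause chain, with its frames, or None."""
    chain = []
    for line in trace.splitlines():
        line = line.strip()
        cause = CAUSE.match(line)
        frame = FRAME.match(line)
        if cause:
            chain.append({"type": cause.group(1),
                          "message": cause.group(2),
                          "frames": []})
        elif frame and chain:
            # Attach the frame to the most recently seen exception.
            chain[-1]["frames"].append(frame.group(1))
    return chain[-1] if chain else None


cause = root_cause(TRACE)
print(cause["type"], "-", cause["message"])
print("thrown in:", cause["frames"][0])
print("checksum write on stack:",
      any("writeChecksumChunk" in f for f in cause["frames"]))
```

The sketch only classifies the failure. It says nothing about whether the WAL on disk is recoverable, which is the open question in the comment above.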