[ https://issues.apache.org/jira/browse/HBASE-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723324#comment-16723324 ]
Sergey Shelukhin commented on HBASE-21601:
------------------------------------------
Another one: {noformat}
java.lang.ArrayIndexOutOfBoundsException: -8965
at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1365)
at org.apache.hadoop.hbase.KeyValue.getFamilyLength(KeyValue.java:1358)
at org.apache.hadoop.hbase.PrivateCellUtil.matchingFamily(PrivateCellUtil.java:735)
at org.apache.hadoop.hbase.CellUtil.matchingFamily(CellUtil.java:786)
at org.apache.hadoop.hbase.CellUtil.matchingFamily(CellUtil.java:777)
at org.apache.hadoop.hbase.wal.WALEdit.isMetaEditFamily(WALEdit.java:140)
at org.apache.hadoop.hbase.wal.WALEdit.isMetaEdit(WALEdit.java:145)
at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:298)
at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:196)
at org.apache.hadoop.hbase.regionserver.SplitLogWorker.splitLog(SplitLogWorker.java:178)
at org.apache.hadoop.hbase.regionserver.SplitLogWorker.lambda$new$0(SplitLogWorker.java:90)
at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:70)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}
Looks like that code may just need a systematic review to catch exceptions and
handle corrupt records.
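As a rough illustration of the kind of guard such a review could add (a sketch only, not a patch; entry, logPath and markCorrupt below are placeholders rather than existing WALSplitter members):
{code}
// Hypothetical sketch: treat runtime exceptions thrown while inspecting a decoded
// WAL entry the same way a truncated record is treated, instead of letting them
// escape and fail the whole split task.
try {
  if (entry.getEdit().isMetaEdit()) {   // can blow up on a cell with a garbage family length
    // ... existing meta-edit handling ...
  }
} catch (ArrayIndexOutOfBoundsException | NegativeArraySizeException e) {
  // Hand the entry to the existing corrupt-record path (e.g. honor
  // hbase.hlog.split.skip.errors) rather than propagating a RuntimeException.
  markCorrupt(logPath, e);   // placeholder for whatever corrupt-WAL handling is chosen
}
{code}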
> corrupted WAL is not handled in all places (NegativeArraySizeException)
> -----------------------------------------------------------------------
>
> Key: HBASE-21601
> URL: https://issues.apache.org/jira/browse/HBASE-21601
> Project: HBase
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Major
>
> {noformat}
> 2018-12-13 17:01:12,208 ERROR [RS_LOG_REPLAY_OPS-regionserver/...] executor.EventHandler: Caught throwable while processing event RS_LOG_REPLAY
> java.lang.RuntimeException: java.lang.NegativeArraySizeException
> at org.apache.hadoop.hbase.wal.WALSplitter$PipelineController.checkForErrors(WALSplitter.java:846)
> at org.apache.hadoop.hbase.wal.WALSplitter$OutputSink.finishWriting(WALSplitter.java:1203)
> at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(WALSplitter.java:1267)
> at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:349)
> at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:196)
> at org.apache.hadoop.hbase.regionserver.SplitLogWorker.splitLog(SplitLogWorker.java:178)
> at org.apache.hadoop.hbase.regionserver.SplitLogWorker.lambda$new$0(SplitLogWorker.java:90)
> at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:70)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NegativeArraySizeException
> at org.apache.hadoop.hbase.CellUtil.cloneFamily(CellUtil.java:113)
> at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.filterCellByStore(WALSplitter.java:1542)
> at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.appendBuffer(WALSplitter.java:1586)
> at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1560)
> at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1085)
> at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1077)
> at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1047)
> {noformat}
> Unfortunately I cannot share the file.
> The issue appears to be straightforward: for whatever reason the family
> length is negative. I'm not sure how such a cell got created; I suspect the
> file was corrupted.
> {code}
> byte[] output = new byte[cell.getFamilyLength()];
> {code}
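A minimal defensive variant of that allocation (hypothetical, not the actual CellUtil code) would at least fail with a descriptive error instead of a bare NegativeArraySizeException:
{code}
byte flen = cell.getFamilyLength();
if (flen < 0) {
  // Corrupt cell: report it explicitly rather than letting the array allocation throw
  throw new IllegalStateException("Corrupt cell, negative family length: " + flen);
}
byte[] output = new byte[flen];
System.arraycopy(cell.getFamilyArray(), cell.getFamilyOffset(), output, 0, flen);
{code}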
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)