[ https://issues.apache.org/jira/browse/HBASE-21601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737564#comment-16737564 ]

Sergey Shelukhin commented on HBASE-21601:
------------------------------------------


Looks like we might need to look closer at the file. I cannot tell from the
KeyValueUtil/CellUtil/KeyValue/etc. code where exactly the cell is created, but
it seems that the requisite number of bytes should always be read for the
record (assuming we don't get an IOException or EOF of some sort), or else the
lower-level byte-reading logic would throw an error.
So, we may be reading the record fully from the file but getting some garbage
bytes; or there's a bug somewhere that allows a partial read to happen, so that
the offset calculations in KeyValue/CellUtil return bogus offsets.
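
To make the two hypotheses concrete, here is a minimal sketch of the kind of check that would tell them apart from a plain crash. It is not HBase code: RecordLengthCheck, readFamily, and the simplified layout (a one-byte family length followed by the family bytes) are assumptions made for illustration only. The idea is to validate a length read from the file against the bytes actually available before allocating, so that garbage bytes or a short read would surface as an IOException rather than a NegativeArraySizeException.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RecordLengthCheck {
    // Hypothetical layout (an assumption, not the real WAL format):
    // [1-byte family length][family bytes].
    static byte[] readFamily(ByteBuffer buf) throws IOException {
        if (!buf.hasRemaining()) {
            throw new IOException("Truncated record: no family length byte");
        }
        // Deliberately read as a signed byte, so a garbage value can go negative.
        int familyLength = buf.get();
        if (familyLength < 0) {
            throw new IOException("Corrupt record: negative family length " + familyLength);
        }
        if (familyLength > buf.remaining()) {
            throw new IOException("Truncated record: family length " + familyLength
                + " exceeds remaining " + buf.remaining() + " bytes");
        }
        byte[] family = new byte[familyLength];
        buf.get(family);
        return family;
    }

    public static void main(String[] args) throws IOException {
        byte[] good = {2, 'c', 'f'};
        System.out.println(new String(readFamily(ByteBuffer.wrap(good)), StandardCharsets.US_ASCII));

        byte[] garbage = {(byte) 0x80, 'c', 'f'}; // length byte reads as -128
        try {
            readFamily(ByteBuffer.wrap(garbage));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, a garbage length and a partial read would produce different messages, which might help narrow down which of the two cases applies to the file.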

> corrupted WAL is not handled in all places (NegativeArraySizeException)
> -----------------------------------------------------------------------
>
>                 Key: HBASE-21601
>                 URL: https://issues.apache.org/jira/browse/HBASE-21601
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Major
>
> {noformat}
> 2018-12-13 17:01:12,208 ERROR [RS_LOG_REPLAY_OPS-regionserver/...] executor.EventHandler: Caught throwable while processing event RS_LOG_REPLAY
> java.lang.RuntimeException: java.lang.NegativeArraySizeException
>       at org.apache.hadoop.hbase.wal.WALSplitter$PipelineController.checkForErrors(WALSplitter.java:846)
>       at org.apache.hadoop.hbase.wal.WALSplitter$OutputSink.finishWriting(WALSplitter.java:1203)
>       at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(WALSplitter.java:1267)
>       at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:349)
>       at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:196)
>       at org.apache.hadoop.hbase.regionserver.SplitLogWorker.splitLog(SplitLogWorker.java:178)
>       at org.apache.hadoop.hbase.regionserver.SplitLogWorker.lambda$new$0(SplitLogWorker.java:90)
>       at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:70)
>       at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NegativeArraySizeException
>       at org.apache.hadoop.hbase.CellUtil.cloneFamily(CellUtil.java:113)
>       at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.filterCellByStore(WALSplitter.java:1542)
>       at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.appendBuffer(WALSplitter.java:1586)
>       at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1560)
>       at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1085)
>       at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1077)
>       at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1047)
> {noformat}
> Unfortunately, I cannot share the file.
> The issue appears to be straightforward: for whatever reason, the family
> length is negative. I am not sure how such a cell got created; I suspect the
> file was corrupted.
> {code}
> byte[] output = new byte[cell.getFamilyLength()];
> {code}
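
The failure mode itself is easy to reproduce outside HBase. The sketch below assumes, for illustration only, that the length is carried in a single signed byte, so that any corrupted value of 0x80 or above reads as negative; NegativeLengthDemo and cloneWithLength are hypothetical names, not HBase APIs.

```java
final class NegativeLengthDemo {
    // Same allocation pattern as the snippet quoted above, driven by a raw length byte.
    static byte[] cloneWithLength(byte length) {
        // The JVM throws NegativeArraySizeException here when length < 0.
        return new byte[length];
    }

    public static void main(String[] args) {
        byte corrupted = (byte) 0xF0; // -16 when interpreted as a signed byte
        try {
            cloneWithLength(corrupted);
        } catch (NegativeArraySizeException e) {
            System.out.println("Allocation failed for length " + corrupted);
        }
    }
}
```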



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
