[
https://issues.apache.org/jira/browse/HBASE-18072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017755#comment-16017755
]
Andrew Purtell commented on HBASE-18072:
----------------------------------------
bq. At bare minimum, I think we need to do a sanity check on all the lengths
for Cells read off the CellScanner for incoming requests.
Agree
> Malformed Cell from client causes Regionserver abort on flush
> -------------------------------------------------------------
>
> Key: HBASE-18072
> URL: https://issues.apache.org/jira/browse/HBASE-18072
> Project: HBase
> Issue Type: Bug
> Components: regionserver, rpc
> Affects Versions: 1.3.0
> Reporter: Gary Helmling
> Assignee: Gary Helmling
> Priority: Critical
>
> When a client writes a mutation with a Cell with a corrupted value length
> field, it is possible for the corrupt cell to trigger an exception on
> memstore flush, which will trigger regionserver aborts until the region is
> manually recovered.
> This boils down to a lack of validation on the client submitted byte[]
> backing the cell.
> Consider the following sequence:
> 1. Client creates a new Put with a cell with value of byte[16]
> 2. When the backing KeyValue for the Put is created, we serialize 16 for the
> value length field in the backing array
> 3. Client calls Table.put()
> 4. RpcClientImpl calls KeyValueEncoder.encode() to serialize the Cell to the
> OutputStream
> 5. Memory corruption in the backing array changes the serialized contents of
> the value length field from 16 to 48
> 6. Regionserver handling the put uses KeyValueDecoder.decode() to create a
> KeyValue with the byte[] read directly off the InputStream. The overall
> length of the array is correct, but the integer value serialized at the value
> length offset has been corrupted from the original value of 16 to 48.
> 7. The corrupt KeyValue is appended to the WAL and added to the memstore
> 8. After some time, the memstore flushes. As HFileWriter is writing out the
> corrupted cell, it reads the serialized int from the value length position in
> the cell's byte[] to determine the number of bytes to write for the value.
> Because value offset + 48 is greater than the length of the cell's byte[], we
> hit an IndexOutOfBoundsException:
> {noformat}
> java.lang.IndexOutOfBoundsException
> at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:151)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at
> org.apache.hadoop.hbase.io.hfile.NoOpDataBlockEncoder.encode(NoOpDataBlockEncoder.java:56)
> at
> org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.write(HFileBlock.java:954)
> at
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:284)
> at
> org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
> at
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:1041)
> at
> org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:138)
> at
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
> at
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:937)
> at
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2413)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2456)
> {noformat}
> 9. Regionserver aborts due to the failed flush
> 10. The regionserver WAL is split into recovered.edits files, one of these
> containing the same corrupted cell
> 11. A new regionserver is assigned the region with the corrupted write
> 12. The new regionserver replays the recovered.edits entries into memstore
> and then tries to flush the memstore to an HFile
> 13. The flush triggers the same IndexOutOfBoundsException, causing us to go
> back to step #8 and loop on repeat until manual intervention is taken
> The corrupted cell basically becomes a poison pill that aborts regionservers
> one at a time as the region with the problem edit is passed around. This
> also means that a malicious client could easily construct requests allowing a
> denial of service attack against regionservers hosting any tables that the
> client has write access to.
> At bare minimum, I think we need to do a sanity check on all the lengths for
> Cells read off the CellScanner for incoming requests. This would allow us to
> reject corrupt cells before we append them to the WAL and succeed the
> request, putting us in a position where we cannot recover. This would only
> detect the corruption of length fields which puts us in a bad state.
> Whether or not Cells should carry some checksum generated at the time the
> Cell is created, which could then validated on the server-side, is a separate
> question. This would allow detection of other parts of the backing cell
> byte[], such as within the key fields or the value field. But the computer
> overhead of this may be too heavyweight to be practical.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)