[
https://issues.apache.org/jira/browse/HBASE-16624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15486459#comment-15486459
]
deepankar edited comment on HBASE-16624 at 9/13/16 6:46 AM:
------------------------------------------------------------
Function is nested deep inside the HFileScannerImpl and it was hard to come up
with a UT. But if you can pass the sequenceId above the Integer.MAX_VALUE you
would immediately see the issue, to confirm the issue I have changed the
function as below and tested it with values passed below
{code}
/**
* Actually do the mvcc read. Does no checks.
* @param offsetFromPos
*/
private long _readMvccVersion(ByteBuff blockBuffer) {
long currMemstoreTS = 0;
int offsetFromPos = 0;
// This is Bytes#bytesToVint inlined so can save a few instructions in
this hot method; i.e.
// previous if one-byte vint, we'd redo the vint call to find int size.
// Also the method is kept small so can be inlined.
byte firstByte = blockBuffer.getByteAfterPosition(offsetFromPos);
int len = WritableUtils.decodeVIntSize(firstByte);
if (len == 1) {
currMemstoreTS = firstByte;
} else {
int remaining = len -1;
long i = 0;
offsetFromPos++;
if (remaining >= Bytes.SIZEOF_INT) {
// The int read has to be converted to unsigned long so the & op
i = (blockBuffer.getIntAfterPosition(offsetFromPos) &
0x00000000ffffffffL);
remaining -= Bytes.SIZEOF_INT;
offsetFromPos += Bytes.SIZEOF_INT;
}
if (remaining >= Bytes.SIZEOF_SHORT) {
short s = blockBuffer.getShortAfterPosition(offsetFromPos);
i = i << 16;
i = i | (s & 0xFFFF);
remaining -= Bytes.SIZEOF_SHORT;
offsetFromPos += Bytes.SIZEOF_SHORT;
}
for (int idx = 0; idx < remaining; idx++) {
byte b = blockBuffer.getByteAfterPosition(offsetFromPos + idx);
i = i << 8;
i = i | (b & 0xFF);
}
currMemstoreTS = (WritableUtils.isNegativeVInt(firstByte) ? ~i : i);
}
return currMemstoreTS;
}
{code}
And I passed the byteBuff I got from
{code}
ByteArrayDataOutput out = ByteStreams.newDataOutput();
WritableUtils.writeVLong(out, 3085788160L);
new SingleByteBuff(ByteBuffer.wrap(out.toByteArray()))
{code}
was (Author: dvdreddy):
Function is nested deep inside the HFileScannerImpl and it was hard to come up
with a UT. But if you can cross the sequenceId above the Integer.MAX_VALUE you
would immediately see the issue, to confirm the issue I have changed the
function
{code}
/**
* Actually do the mvcc read. Does no checks.
* @param offsetFromPos
*/
private long _readMvccVersion(ByteBuff blockBuffer) {
long currMemstoreTS = 0;
int offsetFromPos = 0;
// This is Bytes#bytesToVint inlined so can save a few instructions in
this hot method; i.e.
// previous if one-byte vint, we'd redo the vint call to find int size.
// Also the method is kept small so can be inlined.
byte firstByte = blockBuffer.getByteAfterPosition(offsetFromPos);
int len = WritableUtils.decodeVIntSize(firstByte);
if (len == 1) {
currMemstoreTS = firstByte;
} else {
int remaining = len -1;
long i = 0;
offsetFromPos++;
if (remaining >= Bytes.SIZEOF_INT) {
// The int read has to be converted to unsigned long so the & op
i = (blockBuffer.getIntAfterPosition(offsetFromPos) &
0x00000000ffffffffL);
remaining -= Bytes.SIZEOF_INT;
offsetFromPos += Bytes.SIZEOF_INT;
}
if (remaining >= Bytes.SIZEOF_SHORT) {
short s = blockBuffer.getShortAfterPosition(offsetFromPos);
i = i << 16;
i = i | (s & 0xFFFF);
remaining -= Bytes.SIZEOF_SHORT;
offsetFromPos += Bytes.SIZEOF_SHORT;
}
for (int idx = 0; idx < remaining; idx++) {
byte b = blockBuffer.getByteAfterPosition(offsetFromPos + idx);
i = i << 8;
i = i | (b & 0xFF);
}
currMemstoreTS = (WritableUtils.isNegativeVInt(firstByte) ? ~i : i);
}
return currMemstoreTS;
}
{code}
And I passed the byteBuff I got from
{code}
ByteArrayDataOutput out = ByteStreams.newDataOutput();
WritableUtils.writeVLong(out, 3085788160L);
new SingleByteBuff(ByteBuffer.wrap(out.toByteArray()))
{code}
> MVCC DeSerialization bug in the HFileScannerImpl
> ------------------------------------------------
>
> Key: HBASE-16624
> URL: https://issues.apache.org/jira/browse/HBASE-16624
> Project: HBase
> Issue Type: Bug
> Components: HFile
> Affects Versions: 2.0.0
> Reporter: deepankar
> Assignee: deepankar
> Priority: Blocker
> Attachments: HBASE-16624.patch
>
>
> My colleague [~naggarwal] found a bug in the deserialization of mvcc from
> HFile, As a part of the optimization of deserialization of VLong, we read a
> int at once but we forgot to convert it to unsigned one.
> This would cause issues because once we cross the integer threshold in
> sequenceId and a compaction happens we would write MAX_MEMSTORE_TS in the
> trailer as 0 (because we will be reading negative values from the file that
> got flushed with sequenceId > Integer.MAX_VALUE). And once we have
> MAX_MEMSTORE_TS as 0, and there are sequenceId values present alongside with
> KeyValues the regionserver will now start failing to read the compacted file
> and thus corruption.
> Interestingly this would happen only on the tables that don't have
> DataBlockEncoding enabled and unfortunately in our case that turned out to be
> META and a another small table.
> Fix is small (~20 chars) and attached
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)