Dennis Huo created HDFS-13191:
---------------------------------

             Summary: Internal buffer-sizing details are inadvertently baked into FileChecksum and BlockGroupChecksum
                 Key: HDFS-13191
                 URL: https://issues.apache.org/jira/browse/HDFS-13191
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs, hdfs-client
            Reporter: Dennis Huo


org.apache.hadoop.io.DataOutputBuffer is used as an "optimization" in many 
places to provide a reusable form of ByteArrayOutputStream, but it requires 
callers to be careful to use getLength() rather than getData().length to 
determine how many bytes are actually valid, since getData() returns the 
entire backing array.
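
In sketch form, the mismatch looks like this (a minimal, self-contained 
example; the exact grown capacity depends on the default initial buffer size):

{code:java}
import java.io.IOException;
import org.apache.hadoop.io.DataOutputBuffer;

public class GetDataPitfall {
  public static void main(String[] args) throws IOException {
    DataOutputBuffer buf = new DataOutputBuffer();
    buf.write(new byte[48]);  // e.g. three 16-byte block MD5s

    // The backing array grows by doubling, so its length can exceed the
    // number of valid bytes:
    System.out.println(buf.getData().length);  // 64 here -- NOT the valid length
    System.out.println(buf.getLength());       // 48 -- the number of valid bytes
  }
}
{code}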

At least three places in the path of constructing FileChecksums use 
DataOutputBuffer incorrectly:

[FileChecksumHelper digesting block MD5s|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/FileChecksumHelper.java#L239]

[BlockChecksumHelper digesting striped block MD5s to construct the block-group checksum|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockChecksumHelper.java#L412]

[MD5MD5CRC32FileChecksum.getBytes()|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java#L76]

The net effect is that FileChecksum digests exactly the block MD5s when there 
are 1 or 2 blocks (16 and 32 bytes respectively, both matching the buffer 
capacity), but at 3 blocks the backing array rounds up to 64 bytes, 
effectively returning the same FileChecksum as if there were 4 blocks and the 
4th block happened to have an MD5 of exactly 0x00...00. Similarly, 
BlockGroupChecksum behaves as if the BlockGroup contributed a power-of-two 
number of BlockChecksum bytes.
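
The rounding effect can be checked directly with MessageDigest over stand-in 
bytes (a self-contained sketch, not the actual checksum code):

{code:java}
import java.security.MessageDigest;
import java.util.Arrays;

public class PhantomBlockDemo {
  public static void main(String[] args) throws Exception {
    byte[] threeMd5s = new byte[48];      // three 16-byte block MD5s
    Arrays.fill(threeMd5s, (byte) 0x5a);  // arbitrary stand-in content

    // Correct: digest exactly the 48 valid bytes.
    byte[] exact = MessageDigest.getInstance("MD5").digest(threeMd5s);

    // Buggy path: digest the 64-byte backing array, i.e. the three MD5s
    // plus 16 trailing zero bytes -- indistinguishable from a 4-block file
    // whose 4th block MD5 is 0x00...00.
    byte[] padded = MessageDigest.getInstance("MD5")
        .digest(Arrays.copyOf(threeMd5s, 64));

    System.out.println(Arrays.equals(exact, padded));  // false
  }
}
{code}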

This appears to have been a latent bug for at least 9 years for FileChecksum 
(and since inception for the striped-file implementation), and it is harmless 
as long as every HDFS implementation sticks to exactly the same internal 
buffering semantics.

However, it also makes the implementation extremely brittle unless carefully 
documented. For example, if the code is ever refactored to pass around a 
MessageDigest that consumes block MD5s as they arrive, rather than writing 
them into a DataOutputBuffer and digesting the entire buffer, the resulting 
checksum calculations will change unexpectedly.
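
Concretely, such a refactor might look like the following (hypothetical 
method and names, sketching the failure mode rather than proposing code):

{code:java}
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

public class StreamingDigestSketch {
  // Hypothetical refactor: digest each block MD5 as it arrives instead of
  // first accumulating them all in a DataOutputBuffer.
  static byte[] fileDigest(List<byte[]> blockMd5s)
      throws NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    for (byte[] blockMd5 : blockMd5s) {
      md5.update(blockMd5);  // exactly 16 bytes per block
    }
    // For 3 blocks this digests 48 bytes, while the current code digests
    // the full 64-byte backing array, so existing FileChecksums would change.
    return md5.digest();
  }
}
{code}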

At the same time, "fixing" the bug would be backwards-incompatible, so the bug 
may need to stick around; at least for the FileChecksum-level calculation, it 
has been latent for a very long time. Since striped files are fairly new, 
BlockChecksumHelper could probably be fixed sooner rather than later to avoid 
perpetuating the bug there. The getBytes() method on MD5MD5CRC32FileChecksum 
is more innocuous, so it could likely be fixed or left as-is without much 
impact either way.

The bug can be exposed by changing the internal buffer-growth semantics of 
DataOutputBuffer, or simply by returning a randomly-sized array from getData() 
while guaranteeing only that the first getLength() bytes are populated, for 
example:

{code:java}
diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
index 4c2fa67f8f2..f2df94e898f 100644
--- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
+++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
@@ -103,7 +103,17 @@ private DataOutputBuffer(Buffer buffer) {
   /** Returns the current contents of the buffer.
    *  Data is only valid to {@link #getLength()}.
    */
-  public byte[] getData() { return buffer.getData(); }
+  public byte[] getData() {
+    // Return an array of random length in [length, 2*length) while only
+    // populating the first getLength() bytes, so any caller that relies
+    // on getData().length instead of getLength() visibly misbehaves.
+    java.util.Random rand = new java.util.Random();
+    byte[] bufferData = buffer.getData();
+    byte[] ret = new byte[rand.nextInt(bufferData.length) + bufferData.length];
+    System.arraycopy(bufferData, 0, ret, 0, getLength());
+    return ret;
+  }
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
