[ https://issues.apache.org/jira/browse/HDFS-13191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16386395#comment-16386395 ]

Ajay Kumar edited comment on HDFS-13191 at 3/5/18 10:11 PM:
------------------------------------------------------------

Yeah, it seems there is no good solution to this.
{quote} If getData() always allocates a new array, it likely eliminates any 
(perceived) advantage of using DataOutputBuffer in the first place{quote}
Modifying getData() to return only a trimmed copy of the array adds the overhead of a new byte-array allocation, although we can minimize this by guarding it with an if condition (create the new copy only when count < length), as sketched below.
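Roughly, a sketch of that guarded variant (assuming DataOutputBuffer's existing buffer.getData()/getLength() internals) could look like:
{code:java}
// Hypothetical trimmed getData(): copy only when the valid region is shorter
// than the backing array, so the exact-size case stays allocation-free.
public byte[] getData() {
  byte[] raw = buffer.getData();
  if (getLength() < raw.length) {
    return java.util.Arrays.copyOf(raw, getLength());
  }
  return raw;
}
{code}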
{quote}Even though its interface doesn't promise anything about the size of the 
returned byte[], the fact that the checksums are sensitive to it means at least 
some places are apparently (unintentionally) relying on the exact buffer-sizing 
behavior already, so changing it inside DataOutputBuffer directly has a higher 
risk of breaking unexpected system components.{quote} Yeah, but leaving it as it is also carries the risk of breaking some future functionality.
I don't feel strongly about any specific approach, but leaving the bug as it is may result in subtle yet critical bugs somewhere else. At a minimum we should add documentation to getData() warning users about this behavior, along the lines of the sketch below.
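For example, a warning along these lines on getData() (wording is just a sketch) would at least make the contract explicit:
{code:java}
/**
 * Returns the current contents of the buffer.
 * Data is only valid to {@link #getLength()}.
 *
 * NOTE: the returned array is the internal backing array and is usually
 * longer than {@link #getLength()}; hashing or serializing all of
 * getData().length bytes makes the result depend on internal
 * buffer-growing behavior (see HDFS-13191).
 */
public byte[] getData() { return buffer.getData(); }
{code}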



> Internal buffer-sizing details are inadvertently baked into FileChecksum and 
> BlockGroupChecksum
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13191
>                 URL: https://issues.apache.org/jira/browse/HDFS-13191
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, hdfs-client
>            Reporter: Dennis Huo
>            Priority: Minor
>         Attachments: HDFS-13191.001.patch
>
>
> The org.apache.hadoop.io.DataOutputBuffer is 
> used as an "optimization" in many places to allow a reusable form of 
> ByteArrayOutputStream, but requires the caller to be careful to use 
> getLength() instead of getData().length to determine the number of actually 
> valid bytes to consume.
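> For illustration, with a java.security.MessageDigest as an example consumer (names here are illustrative), only the getLength()-bounded call is safe:
> {code:java}
> // md5 is a MessageDigest, buf a DataOutputBuffer (illustrative names).
> // WRONG: digests the whole backing array, so the result depends on the
> // buffer's current capacity and whatever trailing bytes it contains.
> md5.update(buf.getData());
> // RIGHT: digest only the valid prefix actually written into the buffer.
> md5.update(buf.getData(), 0, buf.getLength());
> {code}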
> At least three places in the path of constructing FileChecksums have 
> incorrect usage of DataOutputBuffer:
> [FileChecksumHelper digesting block 
> MD5s|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/FileChecksumHelper.java#L239]
> [BlockChecksumHelper digesting striped block MD5s to construct block-group 
> checksum|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockChecksumHelper.java#L412]
> [MD5MD5CRC32FileChecksum.getBytes()|https://github.com/apache/hadoop/blob/329a4fdd07ab007615f34c8e0e651360f988064d/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.java#L76]
> The net effect is that FileChecksum consumes exact BlockChecksums if there 
> are 1 or 2 blocks (at 16 and 32 bytes respectively), but at 3 blocks will 
> round up to 64 bytes, effectively returning the same FileChecksum as if there 
> were 4 blocks and the 4th block happened to have an MD5 exactly equal to 
> 0x00...00. Similarly, BlockGroupChecksum will behave as if there is a 
> power-of-2 number of bytes from BlockChecksums in the BlockGroup.
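> A rough sketch of the 3-block case (the capacities are the power-of-2 sizes described above; IOExceptions elided):
> {code:java}
> // Illustrative only: three 16-byte block MD5s written into a DataOutputBuffer.
> DataOutputBuffer md5out = new DataOutputBuffer();
> md5out.write(new byte[16]);  // block 1 MD5
> md5out.write(new byte[16]);  // block 2 MD5
> md5out.write(new byte[16]);  // block 3 MD5
> // md5out.getLength() == 48, but md5out.getData().length has rounded up to 64,
> // so digesting getData() yields the same FileChecksum as 4 blocks whose 4th
> // block MD5 happens to be all zeros.
> MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> {code}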
> This appears to have been a latent bug for at least 9 years for FileChecksum 
> (and since inception for the implementation of striped files), and works fine 
> as long as HDFS implementations strictly stick to the same internal buffering 
> semantics.
> However, this also makes the implementation extremely brittle unless 
> carefully documented. For example, if code is ever refactored to pass around 
> a MessageDigest that consumes block MD5s as they come rather than writing 
> into a DataOutputBuffer before digesting the entire buffer, then the 
> resulting checksum calculations will change unexpectedly.
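> A sketch of such a (hypothetical) refactoring, where blockMd5s stands in for the per-block MD5 bytes:
> {code:java}
> // Feeding each block MD5 into the digest as it arrives, instead of buffering
> // everything in a DataOutputBuffer first, digests exactly 16 bytes per block...
> MessageDigest digester = MD5Hash.getDigester();
> for (byte[] blockMd5 : blockMd5s) {
>   digester.update(blockMd5);
> }
> // ...so no trailing padding is included and the resulting FileChecksum would
> // silently differ from the current calculation for e.g. 3 blocks.
> byte[] fileMd5 = digester.digest();
> {code}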
> At the same time, "fixing" the bug would also be backwards-incompatible, so 
> the bug might need to stick around. At least for the FileChecksum-level 
> calculation, it seems the bug has been latent for a very long time. Since 
> striped files are fairly new, the BlockChecksumHelper could probably be fixed 
> sooner rather than later to avoid perpetuating a bug. The getBytes() method 
> for FileChecksum is more innocuous, so could likely be fixed or left as-is 
> without too much impact either way.
> The bug can be highlighted by changing the internal buffer-growing semantics 
> of the DataOutputBuffer, or simply returning a randomly-sized byte buffer in 
> getData() while only ensuring the first getLength() bytes are actually 
> present, for example:
>  
> {code:java}
> diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
> index 4c2fa67f8f2..f2df94e898f 100644
> --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
> +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/DataOutputBuffer.java
> @@ -103,7 +103,17 @@ private DataOutputBuffer(Buffer buffer) {
> /** Returns the current contents of the buffer.
> * Data is only valid to {@link #getLength()}.
> */
> -  public byte[] getData() { return buffer.getData(); }
> +  public byte[] getData() {
> +    java.util.Random rand = new java.util.Random();
> +    byte[] bufferData = buffer.getData();
> +    byte[] ret = new byte[rand.nextInt(bufferData.length) + bufferData.length];
> +    System.arraycopy(bufferData, 0, ret, 0, getLength());
> +    return ret;
> +  }
> {code}


