[ https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196553#comment-13196553 ]
Ravi Prakash commented on HDFS-2848:
------------------------------------
The problem seems to be in BlockSender.java:258-285.
Imagine our file was originally 100 bytes and got corrupted to 110 bytes.
{noformat}
// end is either last byte on disk or the length for which we have a
// checksum
long end = chunkChecksum != null ? chunkChecksum.getDataLength()
    : replica.getBytesOnDisk();
if (startOffset < 0 || startOffset > end
    || (length + startOffset) > end) {
  String msg = " Offset " + startOffset + " and length " + length
      + " don't match block " + block + " ( blockLen " + end + " )";
  LOG.warn(datanode.getDNRegistrationForBP(block.getBlockPoolId()) +
      ":sendBlock() : " + msg);
  throw new IOException(msg);
}
// Ensure read offset is position at the beginning of chunk
offset = startOffset - (startOffset % chunkSize);
if (length >= 0) {
  // Ensure endOffset points to end of chunk.
  long tmpLen = startOffset + length;
  if (tmpLen % chunkSize != 0) {
    tmpLen += (chunkSize - tmpLen % chunkSize);
  }
  if (tmpLen < end) {
    // will use on-disk checksum here since the end is a stable chunk
    end = tmpLen;
  } else if (chunkChecksum != null) {
    // last chunk is changing. flag that we need to use in-memory checksum
    this.lastChunkChecksum = chunkChecksum;
  }
}
endOffset = end;
{noformat}
Then "end" here will be 110, because it comes from replica.getBytesOnDisk().
The calculation of endOffset seems to be missing its mark.
Either that, or BlockSender:sendPacket() should be verifying the checksum
all the way up to endOffset, which it is not doing.
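
To make the arithmetic concrete, here is a minimal standalone sketch of the endOffset calculation quoted above. It is not the real BlockSender code: the class and method names (EndOffsetSketch, computeEndOffset) are made up for illustration, it assumes chunkChecksum is null so that "end" comes from the on-disk length, it assumes a chunk size of 512 bytes, and it throws IllegalArgumentException where the original throws IOException.

```java
final class EndOffsetSketch {

    // Hypothetical helper mirroring the quoted snippet for the case
    // chunkChecksum == null, i.e. end = replica.getBytesOnDisk().
    static long computeEndOffset(long startOffset, long length,
                                 long bytesOnDisk, int chunkSize) {
        long end = bytesOnDisk;
        if (startOffset < 0 || startOffset > end
                || (length + startOffset) > end) {
            // The original code throws IOException here.
            throw new IllegalArgumentException("Offset " + startOffset
                + " and length " + length + " don't match blockLen " + end);
        }
        if (length >= 0) {
            // Round the requested end up to the next chunk boundary.
            long tmpLen = startOffset + length;
            if (tmpLen % chunkSize != 0) {
                tmpLen += (chunkSize - tmpLen % chunkSize);
            }
            if (tmpLen < end) {
                // Requested range ends before the on-disk end.
                end = tmpLen;
            }
            // Otherwise end stays at the on-disk length.
        }
        return end;
    }

    public static void main(String[] args) {
        int chunkSize = 512; // assumed bytes per checksum chunk
        // Intact replica: 100 bytes on disk, client asks for 100 bytes.
        System.out.println(computeEndOffset(0, 100, 100, chunkSize)); // 100
        // Corrupted replica: 10 extra bytes appended on disk.
        System.out.println(computeEndOffset(0, 100, 110, chunkSize)); // 110
    }
}
```

Under these assumptions, a 100-byte read against the 110-byte replica still yields an end offset of 110: the requested end is rounded up to 512, which is not less than the on-disk length, so "end" is left at whatever getBytesOnDisk() reported.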
> hdfs corruption appended to blocks is not detected by fs commands or fsck
> -------------------------------------------------------------------------
>
> Key: HDFS-2848
> URL: https://issues.apache.org/jira/browse/HDFS-2848
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.23.0
> Reporter: Ravi Prakash
> Assignee: Ravi Prakash
>
> Courtesy Pat White [~patwhitey2007]
> {quote}
> Appears that there is a regression in corrupt block detection by both fsck
> and fs cmds like 'cat'. Testcases for pre-block and block-overwrite
> corruption of all replicas are correctly reporting errors; however,
> post-block corruption is not: fsck on the filesystem reports it's Healthy
> and 'cat' returns without error. Looking at the DN blocks themselves,
> they clearly contain the injected corruption pattern.
> {quote}