[ https://issues.apache.org/jira/browse/HDFS-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191254#comment-17191254 ]
Masatake Iwasaki commented on HDFS-15422:
-----------------------------------------
I could not write a test case that fails without the fix. The reported issue
could presumably be reproduced if BlockManager#checkReplicaCorrupt returned a
BlockToMarkCorrupt from the conditional below, but I could not reach that
condition just by tweaking the timing of edit log replay on the standby and of
block reports.
{code:java}
    switch(reportedState) {
    case FINALIZED:
      switch(ucState) {
      case COMPLETE:
      case COMMITTED:
        if (storedBlock.getGenerationStamp() != reported.getGenerationStamp()) {
          final long reportedGS = reported.getGenerationStamp();
          return new BlockToMarkCorrupt(storedBlock, reportedGS,
              "block is " + ucState + " and reported genstamp " + reportedGS
              + " does not match genstamp in block map "
              + storedBlock.getGenerationStamp(), Reason.GENSTAMP_MISMATCH);
        } else if (storedBlock.getNumBytes() != reported.getNumBytes()) {
          return new BlockToMarkCorrupt(storedBlock,
              "block is " + ucState + " and reported length " +
              reported.getNumBytes() + " does not match " +
              "length in block map " + storedBlock.getNumBytes(),
              Reason.SIZE_MISMATCH);
{code}
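To make the condition concrete, here is a self-contained toy model of how a
queued report could trip the SIZE_MISMATCH branch once the stored block has
been updated by edit log replay. This is a sketch, not Hadoop code: the names
QueuedIbrSketch, Blk, check and replay, and all genstamp and length values,
are hypothetical. That the substituted field is the block length is an
assumption based on the SIZE_MISMATCH reason in the issue description.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model only; none of these types exist in Hadoop.
class QueuedIbrSketch {
  enum Reason { NONE, GENSTAMP_MISMATCH, SIZE_MISMATCH }

  // Hypothetical stand-in for a block record: generation stamp and length.
  static final class Blk {
    final long genStamp;
    final long numBytes;

    Blk(long genStamp, long numBytes) {
      this.genStamp = genStamp;
      this.numBytes = numBytes;
    }
  }

  // Mirrors only the two comparisons in the excerpt above.
  static Reason check(Blk stored, Blk reported) {
    if (stored.genStamp != reported.genStamp) {
      return Reason.GENSTAMP_MISMATCH;
    } else if (stored.numBytes != reported.numBytes) {
      return Reason.SIZE_MISMATCH;
    }
    return Reason.NONE;
  }

  // Queues one report on a "standby" and replays it after the stored block
  // has caught up with the append. If queueWithStoredLength is true, the
  // queued entry carries the stored (pre-append) length instead of the
  // reported one, which is the assumed form of the bug.
  static Reason replay(boolean queueWithStoredLength) {
    Blk storedBeforeAppend = new Blk(1001L, 100L); // standby lags behind
    Blk reported = new Blk(1002L, 150L);           // datanode, after append

    Queue<Blk> pending = new ArrayDeque<>();
    long queuedLength = queueWithStoredLength
        ? storedBeforeAppend.numBytes
        : reported.numBytes;
    pending.add(new Blk(reported.genStamp, queuedLength));

    // Edit log replay brings the stored block up to date.
    Blk storedAfterReplay = new Blk(1002L, 150L);

    // Processing the queued report compares it with the updated block.
    return check(storedAfterReplay, pending.remove());
  }

  public static void main(String[] args) {
    System.out.println("altered queue entry:  " + replay(true));
    System.out.println("faithful queue entry: " + replay(false));
  }
}
```

In this model replay(true) yields SIZE_MISMATCH and replay(false) yields NONE,
even though the datanode's replica is correct in both cases. It says nothing
about how to force this ordering in a real MiniDFSCluster test.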
> Reported IBR is partially replaced with stored info when queuing.
> -----------------------------------------------------------------
>
> Key: HDFS-15422
> URL: https://issues.apache.org/jira/browse/HDFS-15422
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Kihwal Lee
> Priority: Critical
> Attachments: HDFS-15422-branch-2.10.001.patch
>
>
> When queueing an IBR (incremental block report) on a standby namenode, some
> of the reported information is being replaced with the existing stored
> information. This can lead to false block corruption.
> We had a namenode that, after transitioning to active, started reporting
> missing blocks with "SIZE_MISMATCH" as the corrupt reason. These were blocks
> that had been appended, and their sizes were actually correct on the
> datanodes. Upon further investigation, it was determined that the namenode
> was queueing IBRs with altered information.
> Although it sounds bad, I am not making it blocker