[
https://issues.apache.org/jira/browse/HDFS-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052397#comment-15052397
]
Mingliang Liu commented on HDFS-9535:
-------------------------------------
Thanks for the insightful discussion.
When the client closes the file, the last block may be committed before any
live replica has been reported. In this case {{hasMinStorage()}} returns false,
so the last block is not added to pending replicas. When an IBR (incremental
block report) is later received, the last block is completed (via
{{addStoredBlock()}}). The next time the client retries completing the file,
{{commitOrCompleteLastBlock()}} simply returns false (see the beginning of the
code snippet) instead of completing the block again. Because the code
introduced by [HDFS-1172] is never actually reached, it fails to stop the
replication work from being scheduled, and the unit test fails in this case.
{code:title=code snippet of BlockManager#commitOrCompleteLastBlock()}
if(lastBlock.isComplete())
  return false; // already completed (e.g. by syncBlock)

final boolean b = commitBlock(lastBlock, commitBlock);
if (hasMinStorage(lastBlock)) {
  if (b && !bc.isStriped()) {
    addExpectedReplicasToPending(lastBlock);
  }
  completeBlock(lastBlock, false);
}
{code}
I think we should correct the unit test before changing any logic in
{{commitOrCompleteLastBlock()}}.
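To make the sequence above easier to follow, here is a minimal, self-contained sketch that replays it on a toy model. It is not the real HDFS code: {{ToyBlock}}, {{ToyBlockManager}}, {{replay()}}, the replica counters, and the expected replication factor of 3 are all hypothetical names and values chosen for illustration, and only the control flow of the quoted snippet is mirrored.

```java
/** Toy model (NOT the real HDFS classes) of the commit/complete sequence described above. */
class LateBlockReportSketch {

  enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

  /** Hypothetical stand-in for the last block of a file. */
  static final class ToyBlock {
    State state = State.UNDER_CONSTRUCTION;
    int liveReplicas = 0;
  }

  /** Hypothetical stand-in for the relevant parts of BlockManager. */
  static final class ToyBlockManager {
    static final int MIN_REPLICATION = 1; // assumed minimum for this sketch
    int pendingReplicas = 0;              // replicas recorded as expected but not yet reported

    boolean hasMinStorage(ToyBlock b) {
      return b.liveReplicas >= MIN_REPLICATION;
    }

    /** Mirrors the control flow of the quoted snippet. */
    boolean commitOrCompleteLastBlock(ToyBlock b, int expectedReplicas) {
      if (b.state == State.COMPLETE) {
        return false; // early return: pending replicas are never recorded
      }
      boolean committedNow = (b.state == State.UNDER_CONSTRUCTION);
      b.state = State.COMMITTED;
      if (hasMinStorage(b)) {
        if (committedNow) {
          // simplified analogue of addExpectedReplicasToPending()
          pendingReplicas += expectedReplicas - b.liveReplicas;
        }
        b.state = State.COMPLETE;
      }
      return committedNow;
    }

    /** Models a late incremental block report that completes a committed block. */
    void addStoredBlock(ToyBlock b) {
      b.liveReplicas++;
      if (b.state == State.COMMITTED && hasMinStorage(b)) {
        b.state = State.COMPLETE;
      }
    }
  }

  /** Replays the sequence and returns the pending-replica count at the end. */
  static int replay() {
    ToyBlockManager bm = new ToyBlockManager();
    ToyBlock last = new ToyBlock();
    bm.commitOrCompleteLastBlock(last, 3); // 1. close(): committed, no replica reported yet
    bm.addStoredBlock(last);               // 2. first IBR arrives: block becomes complete
    bm.commitOrCompleteLastBlock(last, 3); // 3. client retry: hits the early return
    return bm.pendingReplicas;
  }

  public static void main(String[] args) {
    System.out.println("pending replicas after retry: " + replay());
  }
}
```

In this toy model the pending count stays at 0 even though only one of the three expected replicas has been reported, which is the state in which replication work would presumably get scheduled. If a replica is reported before the first commit, the same method records the two outstanding replicas as pending.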
> Fix TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate
> -----------------------------------------------------------------
>
> Key: HDFS-9535
> URL: https://issues.apache.org/jira/browse/HDFS-9535
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.8.0
> Reporter: Jing Zhao
> Assignee: Mingliang Liu
> Priority: Minor
>
> TestReplication#testNoExtraReplicationWhenBlockReceivedIsLate failed in
> several Jenkins runs (e.g.,
> https://builds.apache.org/job/PreCommit-HDFS-Build/13818/testReport/). The
> failure is on the last {{assertNoReplicationWasPerformed}} check.