ZanderXu commented on code in PR #4407:
URL: https://github.com/apache/hadoop/pull/4407#discussion_r891298950
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/IncrementalBlockReportManager.java:
##########
@@ -251,12 +251,20 @@ synchronized void addRDBI(ReceivedDeletedBlockInfo rdbi,
DatanodeStorage storage) {
// Make sure another entry for the same block is first removed.
// There may only be one such entry.
+ ReceivedDeletedBlockInfo removedInfo = null;
for (PerStorageIBR perStorage : pendingIBRs.values()) {
- if (perStorage.remove(rdbi.getBlock()) != null) {
+ removedInfo = perStorage.remove(rdbi.getBlock());
+ if (removedInfo != null) {
break;
}
}
- getPerStorageIBR(storage).put(rdbi);
+ if (removedInfo != null &&
Review Comment:
We encountered a case of concurrent CloseRecovery operations. The CloseRecovery
with the smaller GS processed the block on storage first but was added to
pendingIBRs later, while the CloseRecovery with the larger GS processed the
block on storage later but was added to pendingIBRs first. As a result, the
replica with the larger GS is stored on disk, but the one with the smaller GS
is reported to the Namenode. Unfortunately, this was the block's only valid
replica, so the block ended up missing.
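
To make the ordering problem concrete, here is a minimal sketch of the idea, not
the actual change in this PR. `PendingReportSketch`, `Report`, `add` and
`pendingGenStamp` are hypothetical names, and a single map stands in for the
per-storage pendingIBRs. It assumes the intended rule is "an entry with an older
GS must never replace a queued entry with a newer GS for the same block".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the pending IBR bookkeeping. */
public class PendingReportSketch {

  /** Minimal stand-in for ReceivedDeletedBlockInfo: block ID plus GS. */
  static final class Report {
    final long blockId;
    final long genStamp;

    Report(long blockId, long genStamp) {
      this.blockId = blockId;
      this.genStamp = genStamp;
    }
  }

  // One pending entry per block, keyed by block ID.
  private final Map<Long, Report> pending = new HashMap<>();

  /** Queue a report, keeping whichever entry has the larger GS. */
  synchronized void add(Report incoming) {
    Report removed = pending.remove(incoming.blockId);
    if (removed != null && removed.genStamp > incoming.genStamp) {
      // The queued entry is newer than the incoming one: put it back.
      pending.put(removed.blockId, removed);
    } else {
      pending.put(incoming.blockId, incoming);
    }
  }

  /** GS that would currently be reported for the given block. */
  synchronized long pendingGenStamp(long blockId) {
    return pending.get(blockId).genStamp;
  }

  public static void main(String[] args) {
    PendingReportSketch ibr = new PendingReportSketch();
    ibr.add(new Report(1L, 1002L)); // larger GS reaches the queue first
    ibr.add(new Report(1L, 1001L)); // smaller GS arrives afterwards
    System.out.println(ibr.pendingGenStamp(1L)); // prints 1002
  }
}
```

Without the GS comparison, the second `add` would overwrite the first and the
stale GS 1001 would be what gets reported.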
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]