ZanderXu commented on code in PR #6635:
URL: https://github.com/apache/hadoop/pull/6635#discussion_r1528310021
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java:
##########
@@ -628,7 +628,7 @@ public void run() {
new RecoveryTaskContiguous(b).recover();
}
} catch (IOException e) {
- LOG.warn("recover Block: {} FAILED: {}", b, e);
+ LOG.warn("recover Block: {} FAILED: ", b, e);
Review Comment:
This modification seems unrelated to this issue, right?
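For context on why the diff above matters at all: SLF4J only prints a stack trace for a trailing `Throwable` when it is *not* consumed by a `{}` anchor, so dropping the second `{}` changes the log from `e.toString()` to a full stack trace. The sketch below is a minimal stand-in for that rule (it is not SLF4J's real `MessageFormatter` and ignores escaped braces); `PlaceholderCheck` and its methods are hypothetical names.

```java
// Minimal sketch (not the real SLF4J code) of how SLF4J decides whether a
// trailing Throwable argument gets its stack trace logged: the Throwable is
// special-cased only when it is left over after all "{}" anchors are filled.
public class PlaceholderCheck {

    // Count "{}" anchors in a log format string (escape handling omitted).
    public static int countAnchors(String fmt) {
        int count = 0;
        for (int i = 0; fmt != null && i + 1 < fmt.length(); i++) {
            if (fmt.charAt(i) == '{' && fmt.charAt(i + 1) == '}') {
                count++;
                i++; // skip the '}'
            }
        }
        return count;
    }

    // True when a trailing Throwable would be printed with its stack trace:
    // it must NOT have been consumed by an anchor.
    public static boolean throwableGetsStackTrace(String fmt, int argCount) {
        return argCount > countAnchors(fmt);
    }

    public static void main(String[] args) {
        // Old form: both args consumed by anchors -> exception logged via toString() only.
        System.out.println(throwableGetsStackTrace("recover Block: {} FAILED: {}", 2));
        // New form: the Throwable is left over -> full stack trace is logged.
        System.out.println(throwableGetsStackTrace("recover Block: {} FAILED: ", 2));
    }
}
```

So the change is a real logging improvement, just arguably out of scope for this PR, which is the reviewer's point.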
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##########
@@ -1755,12 +1755,19 @@ private BlockRecoveryCommand
getBlockRecoveryCommand(String blockPoolId,
LOG.info("Skipped stale nodes for recovery : "
+ (storages.length - recoveryLocations.size()));
}
- recoveryInfos = DatanodeStorageInfo.toDatanodeInfos(recoveryLocations);
} else {
- // If too many replicas are stale, then choose all replicas to
+ // If too many replicas are stale, then choose live replicas to
// participate in block recovery.
- recoveryInfos = DatanodeStorageInfo.toDatanodeInfos(storages);
+ recoveryLocations.clear();
+ storageIdx.clear();
+ for (int i = 0; i < storages.length; ++i) {
+ if (storages[i].getDatanodeDescriptor().isAlive()) {
+ recoveryLocations.add(storages[i]);
+ storageIdx.add(i);
+ }
Review Comment:
Please add some logs for this case.
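A toy sketch of what the requested logging could look like around the PR's live-replica loop. The `Storage` type and `selectLive` method are hypothetical stand-ins (the real code uses `DatanodeStorageInfo` and `DatanodeDescriptor.isAlive()`, and would log via `LOG.info` rather than `System.out`):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the PR's fallback path: when too many replicas are
// stale, keep only storages whose datanode is alive, and (per the review
// request) log how many dead nodes were skipped.
public class LiveReplicaFilter {

    public static final class Storage {
        final String id;
        final boolean alive;
        public Storage(String id, boolean alive) { this.id = id; this.alive = alive; }
    }

    public static List<Storage> selectLive(Storage[] storages) {
        List<Storage> live = new ArrayList<>();
        for (Storage s : storages) {
            if (s.alive) {
                live.add(s);
            }
        }
        if (live.size() < storages.length) {
            // In DatanodeManager this would be a LOG.info(...) call.
            System.out.println("Skipped " + (storages.length - live.size())
                + " dead nodes for recovery, " + live.size() + " live nodes remain");
        }
        return live;
    }

    public static void main(String[] args) {
        Storage[] storages = {
            new Storage("s1", true), new Storage("s2", false), new Storage("s3", true)
        };
        System.out.println(selectLive(storages).size());
    }
}
```

Logging the skip count mirrors the existing "Skipped stale nodes for recovery" message a few lines above in the same method, which keeps the two fallback paths diagnosable in the same way.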
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java:
##########
@@ -1755,12 +1755,19 @@ private BlockRecoveryCommand
getBlockRecoveryCommand(String blockPoolId,
LOG.info("Skipped stale nodes for recovery : "
+ (storages.length - recoveryLocations.size()));
}
- recoveryInfos = DatanodeStorageInfo.toDatanodeInfos(recoveryLocations);
} else {
- // If too many replicas are stale, then choose all replicas to
+ // If too many replicas are stale, then choose live replicas to
// participate in block recovery.
- recoveryInfos = DatanodeStorageInfo.toDatanodeInfos(storages);
+ recoveryLocations.clear();
+ storageIdx.clear();
+ for (int i = 0; i < storages.length; ++i) {
+ if (storages[i].getDatanodeDescriptor().isAlive()) {
+ recoveryLocations.add(storages[i]);
+ storageIdx.add(i);
+ }
Review Comment:
Please handle the case where all replicas are dead; the filtered list would then be empty.
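To make the reviewer's concern concrete: if every replica's datanode is dead, the live-only loop in the diff clears `recoveryLocations` and adds nothing back, so the recovery command would be built with zero targets. A tiny hypothetical helper for that check (`DeadReplicaCheck` and its boolean-array model are illustrative, not Hadoop code):

```java
// Hypothetical sketch: detect the degenerate case where the live-only filter
// in the PR would leave no recovery targets at all.
public class DeadReplicaCheck {

    // Each flag models DatanodeDescriptor.isAlive() for one storage.
    public static boolean allReplicasDead(boolean[] datanodeAlive) {
        for (boolean alive : datanodeAlive) {
            if (alive) {
                return false; // at least one live target exists
            }
        }
        return true; // filtered recovery-target list would be empty
    }
}
```

When this check fires, the caller could either skip issuing the recovery command with a warning, or fall back to the full storage list, which is effectively what the pre-patch code did by using all of `storages`.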
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]