[
https://issues.apache.org/jira/browse/HDFS-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17735584#comment-17735584
]
ASF GitHub Bot commented on HDFS-17052:
---------------------------------------
zhtttylz commented on PR #5759:
URL: https://github.com/apache/hadoop/pull/5759#issuecomment-1600330026
```java
// BlockPlacementPolicyRackFaultTolerant#chooseEvenlyFromRemainingRacks
int bestEffortMaxNodesPerRack = maxNodesPerRack;
while (results.size() != totalReplicaExpected &&
    numResultsOflastChoose != results.size()) {
  // Exclude the chosen nodes
  final Set<Node> newExcludeNodes = new HashSet<>();
  for (DatanodeStorageInfo resultStorage : results) {
    addToExcludedNodes(resultStorage.getDatanodeDescriptor(),
        newExcludeNodes);
  }

  LOG.trace("Chosen nodes: {}", results);
  LOG.trace("Excluded nodes: {}", excludedNodes);
  LOG.trace("New Excluded nodes: {}", newExcludeNodes);
  final int numOfReplicas = totalReplicaExpected - results.size();
  numResultsOflastChoose = results.size();
  try {
    chooseOnce(numOfReplicas, writer, newExcludeNodes, blocksize,
        ++bestEffortMaxNodesPerRack, results, avoidStaleNodes,
        storageTypes);
  } catch (NotEnoughReplicasException nere) {
    lastException = nere;
  } finally {
    excludedNodes.addAll(newExcludeNodes);
  }
}
```
Since **totalNumOfReplicas < numOfRacks**, **maxNodesPerRack** is initialized
to **1**. If, after incrementing **bestEffortMaxNodesPerRack**,
**BlockPlacementPolicyRackFaultTolerant#chooseOnce** still cannot select a new
node to add to the results, **results.size()** is unchanged and the 'while'
loop exits because **numResultsOflastChoose != results.size()** is **false**.
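
To make the exit condition concrete, here is a minimal standalone sketch of the same loop shape. It is not the Hadoop code: the class `BestEffortLoopSketch`, the methods `place` and `chooseOnce`, and the rack model (a map from rack name to the number of nodes with a matching storage type) are all hypothetical simplifications for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simplification: a rack is just a name plus the number of
// nodes on it that match the requested storage type.
class BestEffortLoopSketch {

  // One placement pass: add nodes rack by rack, never exceeding
  // maxNodesPerRack on a rack or the rack's count of eligible nodes.
  static void chooseOnce(Map<String, Integer> eligiblePerRack, int expected,
      int maxNodesPerRack, List<String> results) {
    for (Map.Entry<String, Integer> rack : eligiblePerRack.entrySet()) {
      int onRack = Collections.frequency(results, rack.getKey());
      int limit = Math.min(maxNodesPerRack, rack.getValue());
      while (onRack < limit && results.size() < expected) {
        results.add(rack.getKey());
        onRack++;
      }
    }
  }

  // Same loop shape as the quoted snippet: relax the per-rack cap by one
  // on each pass, and stop as soon as a pass adds nothing.
  static int place(Map<String, Integer> eligiblePerRack, int expected,
      int maxNodesPerRack, List<String> results) {
    int sizeAfterLastPass = -1; // assumption: forces at least one pass
    int bestEffortMaxNodesPerRack = maxNodesPerRack;
    while (results.size() != expected
        && sizeAfterLastPass != results.size()) {
      sizeAfterLastPass = results.size();
      chooseOnce(eligiblePerRack, expected,
          ++bestEffortMaxNodesPerRack, results);
    }
    return results.size();
  }

  public static void main(String[] args) {
    // Two racks with one eligible node each, both already chosen:
    // raising the cap cannot help, so the loop exits after one pass.
    Map<String, Integer> scarce = new LinkedHashMap<>();
    scarce.put("r1", 1);
    scarce.put("r2", 1);
    System.out.println(
        place(scarce, 3, 1, new ArrayList<>(List.of("r1", "r2")))); // 2

    // r1 has spare eligible nodes, so the relaxed cap finds a third one.
    Map<String, Integer> roomy = new LinkedHashMap<>();
    roomy.put("r1", 3);
    roomy.put("r2", 1);
    System.out.println(
        place(roomy, 3, 1, new ArrayList<>(List.of("r1", "r2")))); // 3
  }
}
```

In the first case the pass with the relaxed cap adds nothing, so the size recorded before the pass equals the size after it and the loop ends with fewer results than expected. In the second case the relaxed cap lets one rack take a second node and the loop ends because the target is reached.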
> Erasure coding reconstruction failed when num of storageType rack NOT enough
> ----------------------------------------------------------------------------
>
> Key: HDFS-17052
> URL: https://issues.apache.org/jira/browse/HDFS-17052
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ec
> Affects Versions: 3.3.4
> Reporter: Hualong Zhang
> Assignee: Hualong Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: failed reconstruction ec in same rack.png, write ec in
> same rack.png
>
>
> When writing EC data, if the number of racks matching the storageType is
> insufficient, more than one block is allowed to be written to the same rack.
> !write ec in same rack.png|width=962,height=604!
>
>
>
> However, during EC block recovery, a block cannot be recovered onto the same
> rack, which is inconsistent with the write behavior described above.
> !failed reconstruction ec in same rack.png|width=946,height=413!