zhtttylz commented on PR #5759:
URL: https://github.com/apache/hadoop/pull/5759#issuecomment-1607490022
> Great catch here! It makes sense to me. I have some thoughts to discuss
> with you. The solution here may involve multiple calls to `chooseOnce`, some of
> which may be unnecessary and waste time. The root cause of this problem is
> that `BlockPlacementPolicyDefault#getMaxNodesPerRack` cannot return an accurate
> value for `maxNodesPerRack`. How about computing the right value before the loop in
> `chooseEvenlyFromRemainingRacks`? Would this be more efficient?
>
> ```
>   private void chooseEvenlyFromRemainingRacks(Node writer,
>       Set<Node> excludedNodes, long blocksize, int maxNodesPerRack,
>       List<DatanodeStorageInfo> results, boolean avoidStaleNodes,
>       EnumMap<StorageType, Integer> storageTypes, int totalReplicaExpected,
>       NotEnoughReplicasException e) throws NotEnoughReplicasException {
>     int numResultsOflastChoose = 0;
>     NotEnoughReplicasException lastException = e;
>     int bestEffortMaxNodesPerRack = maxNodesPerRack;
>     Map<String, Integer> nodesPerRack = new HashMap<>();
>     for (DatanodeStorageInfo dsInfo : results) {
>       String rackName = dsInfo.getDatanodeDescriptor().getNetworkLocation();
>       nodesPerRack.merge(rackName, 1, Integer::sum);
>     }
>     for (int numNodes : nodesPerRack.values()) {
>       if (numNodes > bestEffortMaxNodesPerRack) {
>         bestEffortMaxNodesPerRack = numNodes;
>       }
>     }
>     while (results.size() != totalReplicaExpected &&
>         numResultsOflastChoose != results.size()) {
> ```
Thank you for your review; we appreciate the suggestion. However, when the
number of available racks is insufficient to meet the requirements of the
Erasure Coding storage type, every write operation triggers
`BlockPlacementPolicyRackFaultTolerant#chooseEvenlyFromRemainingRacks`.
Note that each invocation of this method would then involve the calculation
in `chooseEvenlyFromRemainingRacks`. We are uncertain about the potential
efficiency implications of adopting this approach.
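
To make the cost in question concrete, here is a minimal sketch of the per-rack
counting step from the quoted suggestion. It is not the Hadoop code: rack names
are plain strings standing in for
`dsInfo.getDatanodeDescriptor().getNetworkLocation()`, and the names
`RackCountSketch` and `bestEffortMaxNodesPerRack` are hypothetical. Under those
assumptions, the step is one pass over the already chosen `results` plus one
pass over the distinct racks.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the counting step in the quoted suggestion.
final class RackCountSketch {

  // chosenRacks: rack name of each already chosen replica (assumed input,
  // replacing the DatanodeStorageInfo list used in the real method).
  // maxNodesPerRack: the limit passed in by the caller.
  static int bestEffortMaxNodesPerRack(List<String> chosenRacks,
      int maxNodesPerRack) {
    // Count how many chosen replicas sit on each rack.
    Map<String, Integer> nodesPerRack = new HashMap<>();
    for (String rackName : chosenRacks) {
      nodesPerRack.merge(rackName, 1, Integer::sum);
    }
    // Raise the limit to the largest per-rack count already present.
    int best = maxNodesPerRack;
    for (int numNodes : nodesPerRack.values()) {
      if (numNodes > best) {
        best = numNodes;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    List<String> chosen = Arrays.asList("/r1", "/r1", "/r1", "/r2");
    // /r1 already holds 3 replicas, so the limit of 2 is raised to 3.
    System.out.println(bestEffortMaxNodesPerRack(chosen, 2));
  }
}
```

Whether this extra pass matters in practice depends on how often the method is
invoked, which is the concern raised above; the sketch only shows the shape of
the work, not a measurement.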
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]