qqqttt123 commented on code in PR #1652:
URL:
https://github.com/apache/incubator-uniffle/pull/1652#discussion_r1577728109
##########
client-spark/spark3/src/main/java/org/apache/spark/shuffle/writer/RssShuffleWriter.java:
##########
@@ -523,20 +532,31 @@ private void resendFailedBlocks(Set<TrackingBlockStatus>
failedBlockStatusSet) {
for (Map.Entry<ShuffleServerInfo, List<TrackingBlockStatus>> entry :
faultyServerToPartitions.entrySet()) {
- Set<Integer> partitionIds =
- entry.getValue().stream()
- .map(x -> x.getShuffleBlockInfo().getPartitionId())
- .collect(Collectors.toSet());
- ShuffleServerInfo replacement =
replacementShuffleServers.get(entry.getKey().getId());
- if (replacement == null) {
- // todo: merge multiple requests into one.
- replacement = reassignFaultyShuffleServer(partitionIds,
entry.getKey().getId());
- replacementShuffleServers.put(entry.getKey().getId(), replacement);
+ ShuffleServerInfo faultyServer = entry.getKey();
+ List<TrackingBlockStatus> blocks = entry.getValue();
+
+ if (!taskAttemptAssignment.isReassigned(faultyServer.getId())) {
Review Comment:
I feel that faulty-server reassignment and rebalance reassignment shouldn't
be coupled.
Should we really reuse the huge-partition code here? Maybe we should add new code instead.
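
To illustrate what I mean by decoupling, here is a rough sketch, not a concrete proposal. It assumes the point is to track the two kinds of reassignment separately, so that a server reassigned for rebalancing is not treated as already handled for a fault. `ReassignmentTracker` and `Reason` are hypothetical names and do not exist in the code base; only the `isReassigned` check mirrors the call in the diff above.

```java
import java.util.EnumMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: keeps reassignment state separate per reason. */
class ReassignmentTracker {
  /** Hypothetical reasons; not taken from the Uniffle code base. */
  enum Reason { FAULTY_SERVER, REBALANCE }

  // One independent set of reassigned server ids per reason.
  private final Map<Reason, Set<String>> reassigned = new EnumMap<>(Reason.class);

  ReassignmentTracker() {
    for (Reason r : Reason.values()) {
      reassigned.put(r, new HashSet<>());
    }
  }

  /** Records a reassignment; returns true if it was not yet recorded for this reason. */
  boolean markReassigned(Reason reason, String serverId) {
    return reassigned.get(reason).add(serverId);
  }

  /** True only if this server was already reassigned for the given reason. */
  boolean isReassigned(Reason reason, String serverId) {
    return reassigned.get(reason).contains(serverId);
  }
}
```

With something like this, the check in `resendFailedBlocks` could ask only about `Reason.FAULTY_SERVER`, and the rebalance path would keep its own state.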
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]