RexXiong commented on code in PR #3079:
URL: https://github.com/apache/celeborn/pull/3079#discussion_r1924788949
##########
client/src/main/java/org/apache/celeborn/client/read/CelebornInputStream.java:
##########
@@ -322,14 +324,10 @@ private boolean isExcluded(PartitionLocation location) {
private PartitionReader createReaderWithRetry(
PartitionLocation location, PbStreamHandler pbStreamHandler) throws
IOException {
- // For the first time, the location will be selected according to
attemptNumber
- if (fetchChunkRetryCnt == 0 && attemptNumber % 2 == 1 &&
location.hasPeer()) {
- location = location.getPeer();
- logger.debug("Read peer {} for attempt {}.", location, attemptNumber);
- }
Exception lastException = null;
Review Comment:
> If there is a problem with the primary location, then it is likely that it
has already been changed to the peer in the last task attempt but still failed.
In this case, it is not so relevant which one to use to start fetching in the
new task attempt. If there is no problem with the primary location and the task
is retried due to other problems, this situation is even less relevant. So I
think we could always fetch chunk by starting from the primary location. WDYT?
Sound reasonable. And if we change location to peer here would cause
pbStreamHandler and location inconsistent when createReader,there may be issues
in some shuffle scenarios.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]