afterincomparableyum opened a new pull request, #3605:
URL: https://github.com/apache/celeborn/pull/3605
Implement chunk-fetch retry logic in CelebornInputStream::getNextChunk(),
matching the Java CelebornInputStream behavior. When a chunk fetch fails, the
retry loop excludes the failed worker, switches to the peer replica (if
available), and sleeps between retry rounds before creating a new reader.
- Added getLocation() to PartitionReader interface and WorkerPartitionReader.
- Replaced the stub getNextChunk() with full retry logic: excluded worker
  checks, peer switching, configurable retry count, sleep between retries.
- Updated moveToNextChunk() and moveToNextReader() to handle nullable returns
  from getNextChunk().
- Added unit test for WorkerPartitionReader::getLocation().
CI unit tests pass and the C++ code compiles.