leixm commented on PR #3490: URL: https://github.com/apache/celeborn/pull/3490#issuecomment-3393213968
This is a bit difficult to test. I made attempt 0 of every shuffle read task fail, and added PrimaryFetchBytes and ReplicateFetchBytes metrics on the worker. While the job was running, only ReplicateFetchBytes had a nonzero value on all workers, while PrimaryFetchBytes stayed at 0. This means that every attempt 1 task read the replica data first, which is as expected.

<img width="3432" height="564" alt="image" src="https://github.com/user-attachments/assets/b855940d-860d-4e57-9494-461aa2abbdb5" />
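For readers who want to see the reasoning behind this check, here is a minimal, self-contained simulation. It is not Celeborn code: `WorkerMetrics`, `fetch_order`, `read_partition`, and `run_task` are hypothetical names, and the sketch assumes the injected failure hits attempt 0 before it fetches any bytes. It only illustrates why "replica counter grows, primary counter stays at 0" implies that the retried attempts read the replica first.

```python
from dataclasses import dataclass


@dataclass
class WorkerMetrics:
    # Hypothetical stand-ins for the PrimaryFetchBytes / ReplicateFetchBytes counters.
    primary_fetch_bytes: int = 0
    replica_fetch_bytes: int = 0


def fetch_order(attempt_number):
    """Assumed policy: attempt 0 tries the primary first, retries try the replica first."""
    if attempt_number == 0:
        return ["primary", "replica"]
    return ["replica", "primary"]


def read_partition(attempt_number, partition_bytes, metrics, fail_attempt0=True):
    """Simulate one shuffle read attempt and record which copy served the bytes."""
    if fail_attempt0 and attempt_number == 0:
        # Assumption: the injected failure happens before any data is fetched.
        raise RuntimeError("injected failure for attempt 0")
    first = fetch_order(attempt_number)[0]
    if first == "primary":
        metrics.primary_fetch_bytes += partition_bytes
    else:
        metrics.replica_fetch_bytes += partition_bytes
    return first


def run_task(partition_bytes, metrics, max_attempts=2):
    """Retry the read until one attempt succeeds, as a scheduler would."""
    for attempt in range(max_attempts):
        try:
            return read_partition(attempt, partition_bytes, metrics)
        except RuntimeError:
            continue
    raise RuntimeError("all attempts failed")


if __name__ == "__main__":
    metrics = WorkerMetrics()
    for size in (100, 250):
        run_task(size, metrics)
    print(metrics)
```

If the retry policy did not prefer the replica, the successful attempt 1 reads would show up in the primary counter instead, so the two counters together distinguish the two behaviors.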
