This is an automated email from the ASF dual-hosted git repository.
nicholasjiang pushed a commit to branch branch-0.5
in repository https://gitbox.apache.org/repos/asf/celeborn.git
The following commit(s) were added to refs/heads/branch-0.5 by this push:
new ff0ed5e31 [CELEBORN-1662] Handle PUSH_DATA_FAIL_PARTITION_NOT_FOUND in getPushDataFailCause
ff0ed5e31 is described below
commit ff0ed5e31191644b192ee12d881c23779c285794
Author: jiang13021 <[email protected]>
AuthorDate: Mon Oct 21 21:06:06 2024 +0800
[CELEBORN-1662] Handle PUSH_DATA_FAIL_PARTITION_NOT_FOUND in getPushDataFailCause
### What changes were proposed in this pull request?
Add a branch to the failure-cause checks in getPushDataFailCause, ahead of the
connection-failure check, that matches PUSH_DATA_FAIL_PARTITION_NOT_FOUND.
### Why are the changes needed?
Currently, getPushDataFailCause does not recognize the
PUSH_DATA_FAIL_PARTITION_NOT_FOUND error type. The other failure causes are
checked explicitly, but this one is missed.
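The prefix matching involved can be sketched roughly as follows. This is a minimal standalone illustration, not the code in ShuffleClientImpl: the class `PushFailCauseSketch`, the method `causeFromMessage`, the trimmed-down `StatusCode` enum, and the `Optional.empty()` fallback are all assumptions made for the example.

```java
import java.util.Optional;

// Hypothetical standalone sketch; the real logic lives in ShuffleClientImpl.
public class PushFailCauseSketch {

  // Trimmed-down stand-in for Celeborn's StatusCode, limited to the
  // three values visible in this diff.
  public enum StatusCode {
    PUSH_DATA_PRIMARY_WORKER_EXCLUDED,
    PUSH_DATA_REPLICA_WORKER_EXCLUDED,
    PUSH_DATA_FAIL_PARTITION_NOT_FOUND
  }

  // Maps an error message to a cause by checking which status name the
  // message starts with. Returns empty when no known prefix matches
  // (an assumption; the real method has further branches).
  public static Optional<StatusCode> causeFromMessage(String message) {
    if (message == null) {
      return Optional.empty();
    }
    if (message.startsWith(StatusCode.PUSH_DATA_PRIMARY_WORKER_EXCLUDED.name())) {
      return Optional.of(StatusCode.PUSH_DATA_PRIMARY_WORKER_EXCLUDED);
    } else if (message.startsWith(StatusCode.PUSH_DATA_REPLICA_WORKER_EXCLUDED.name())) {
      return Optional.of(StatusCode.PUSH_DATA_REPLICA_WORKER_EXCLUDED);
    } else if (message.startsWith(StatusCode.PUSH_DATA_FAIL_PARTITION_NOT_FOUND.name())) {
      // The branch this change adds: without it, the message falls through.
      return Optional.of(StatusCode.PUSH_DATA_FAIL_PARTITION_NOT_FOUND);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    System.out.println(causeFromMessage("PUSH_DATA_FAIL_PARTITION_NOT_FOUND: partition 3"));
    System.out.println(causeFromMessage("some other failure"));
  }
}
```

Without the third branch, a message beginning with PUSH_DATA_FAIL_PARTITION_NOT_FOUND would not be mapped to its own status code in this sketch.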
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manual test
Closes #2833 from jiang13021/celeborn-1662.
Authored-by: jiang13021 <[email protected]>
Signed-off-by: SteNicholas <[email protected]>
(cherry picked from commit dc5f3fb96b608eb6eff4aec30f69f2c16ae5b1b2)
Signed-off-by: SteNicholas <[email protected]>
---
client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java | 2 ++
1 file changed, 2 insertions(+)
diff --git a/client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java b/client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java
index 1c86182d7..415ed51d1 100644
--- a/client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java
+++ b/client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java
@@ -1859,6 +1859,8 @@ public class ShuffleClientImpl extends ShuffleClient {
cause = StatusCode.PUSH_DATA_PRIMARY_WORKER_EXCLUDED;
} else if (message.startsWith(StatusCode.PUSH_DATA_REPLICA_WORKER_EXCLUDED.name())) {
cause = StatusCode.PUSH_DATA_REPLICA_WORKER_EXCLUDED;
+ } else if (message.startsWith(StatusCode.PUSH_DATA_FAIL_PARTITION_NOT_FOUND.name())) {
+ cause = StatusCode.PUSH_DATA_FAIL_PARTITION_NOT_FOUND;
} else if (ExceptionUtils.connectFail(message)) {
// Throw when push to primary worker connection causeException.
cause = StatusCode.PUSH_DATA_CONNECTION_EXCEPTION_PRIMARY;