This is an automated email from the ASF dual-hosted git repository.
chengpan pushed a commit to branch branch-0.4
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git
The following commit(s) were added to refs/heads/branch-0.4 by this push:
new a69846e4d [CELEBORN-1192][BUG] Celeborn wait task timeout error
message should show correct corresponding batch and target host and port
a69846e4d is described below
commit a69846e4d53a23889d49955423effeddcd67868d
Author: Angerszhuuuu <[email protected]>
AuthorDate: Fri Dec 22 16:53:30 2023 +0800
[CELEBORN-1192][BUG] Celeborn wait task timeout error message should show
correct corresponding batch and target host and port
### What changes were proposed in this pull request?
Celeborn wait task timeout error message should show correct corresponding
batch and target host and port
### Why are the changes needed?
Current error log here is confused, can't found out the target
hostAndPushPort that have problem.
### Does this PR introduce _any_ user-facing change?
Refactor log help debug
### How was this patch tested?
Closes #2183 from AngersZhuuuu/CELEBORN-1192.
Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit f751df50ba7699cd10bd1277e5f132f49b923131)
Signed-off-by: Cheng Pan <[email protected]>
---
.../org/apache/celeborn/common/write/InFlightRequestTracker.java | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git
a/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
b/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
index e5514922c..fbfe124a5 100644
---
a/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
+++
b/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
@@ -164,12 +164,13 @@ public class InFlightRequestTracker {
if (times <= 0) {
logger.error(
"After waiting for {} ms, "
- + "there are still {} batches in flight "
- + "for hostAndPushPort {}, "
+ + "there are still {} in flight, "
+ "which exceeds the current limit 0.",
waitInflightTimeoutMs,
- totalInflightReqs.sum(),
-
inflightBatchesPerAddress.keySet().stream().collect(Collectors.joining(", ",
"[", "]")));
+ inflightBatchesPerAddress.entrySet().stream()
+ .filter(c -> !c.getValue().isEmpty())
+ .map(c -> c.getValue().size() + " batches for hostAndPushPort "
+ c.getKey())
+ .collect(Collectors.joining(", ", "[", "]")));
}
if (pushState.exception.get() != null) {