This is an automated email from the ASF dual-hosted git repository.

chengpan pushed a commit to branch branch-0.4
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git


The following commit(s) were added to refs/heads/branch-0.4 by this push:
     new a69846e4d [CELEBORN-1192][BUG] Celeborn wait task timeout error 
message should show correct corresponding batch and target host and port
a69846e4d is described below

commit a69846e4d53a23889d49955423effeddcd67868d
Author: Angerszhuuuu <[email protected]>
AuthorDate: Fri Dec 22 16:53:30 2023 +0800

    [CELEBORN-1192][BUG] Celeborn wait task timeout error message should show 
correct corresponding batch and target host and port
    
    ### What changes were proposed in this pull request?
    Celeborn wait task timeout error message should show correct corresponding 
batch and target host and port
    
    ### Why are the changes needed?
    Current error log here is confused, can't found out the target 
hostAndPushPort that have problem.
    
    ### Does this PR introduce _any_ user-facing change?
    Refactor log help debug
    
    ### How was this patch tested?
    
    Closes #2183 from AngersZhuuuu/CELEBORN-1192.
    
    Authored-by: Angerszhuuuu <[email protected]>
    Signed-off-by: Cheng Pan <[email protected]>
    (cherry picked from commit f751df50ba7699cd10bd1277e5f132f49b923131)
    Signed-off-by: Cheng Pan <[email protected]>
---
 .../org/apache/celeborn/common/write/InFlightRequestTracker.java | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git 
a/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
 
b/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
index e5514922c..fbfe124a5 100644
--- 
a/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
+++ 
b/common/src/main/java/org/apache/celeborn/common/write/InFlightRequestTracker.java
@@ -164,12 +164,13 @@ public class InFlightRequestTracker {
     if (times <= 0) {
       logger.error(
           "After waiting for {} ms, "
-              + "there are still {} batches in flight "
-              + "for hostAndPushPort {}, "
+              + "there are still {} in flight, "
               + "which exceeds the current limit 0.",
           waitInflightTimeoutMs,
-          totalInflightReqs.sum(),
-          
inflightBatchesPerAddress.keySet().stream().collect(Collectors.joining(", ", 
"[", "]")));
+          inflightBatchesPerAddress.entrySet().stream()
+              .filter(c -> !c.getValue().isEmpty())
+              .map(c -> c.getValue().size() + " batches for hostAndPushPort " 
+ c.getKey())
+              .collect(Collectors.joining(", ", "[", "]")));
     }
 
     if (pushState.exception.get() != null) {

Reply via email to