cuibo01 commented on a change in pull request #16315:
URL: https://github.com/apache/flink/pull/16315#discussion_r663780810



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/DefaultExecutionGraph.java
##########
@@ -1038,7 +1039,11 @@ private boolean transitionState(JobStatus current, JobStatus newState, Throwable
                     current,
                     newState,
                     error);
-
+            if (error != null) {
+                LOG.warn(
+                        "Print ExecutionGraph {}",
+                        LogStackUtils.getCallStack(Thread.currentThread().getStackTrace()));
+            }

Review comment:
   From the log, we can see the context in which an exception occurred.
   I think this PR helps with analyzing exceptions, which is especially important for newcomers.
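
   For readers who have not seen the helper: below is a minimal sketch of what a utility like `LogStackUtils.getCallStack` might do, assuming it simply joins the stack frames into one printable string. The class name `CallStackSketch`, the method body, and the output format are assumptions for illustration only, not the PR's actual implementation.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class CallStackSketch {

    // Hypothetical stand-in for LogStackUtils.getCallStack; the real helper in the PR may differ.
    // Renders each frame on its own line, prefixed the way a printed stack trace is.
    static String getCallStack(StackTraceElement[] elements) {
        if (elements == null || elements.length == 0) {
            return "<empty call stack>";
        }
        return Arrays.stream(elements)
                .map(e -> "\tat " + e)
                .collect(Collectors.joining(System.lineSeparator()));
    }

    public static void main(String[] args) {
        // Prints the frames that led to this call, i.e. the "scene" of the caller.
        System.out.println(getCallStack(Thread.currentThread().getStackTrace()));
    }
}
```

   With a helper of this shape, the warning emitted in `transitionState` would show which code path triggered the failing state transition, in addition to the error itself.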

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/CreditBasedPartitionRequestClientHandler.java
##########
@@ -243,8 +245,32 @@ public void channelWritabilityChanged(ChannelHandlerContext ctx) throws Exceptio
         writeAndFlushNextMessageIfPossible(ctx.channel());
     }
 
+    private void printLogError(Throwable cause, StackTraceElement[] stackTraceElements) {
+        SocketAddress remoteAddr = null;
+        SocketAddress localAddr = null;
+        if (ctx != null) {
+            remoteAddr = ctx.channel().remoteAddress();
+            localAddr = ctx.channel().localAddress();
+        }
+        LOG.error(
+                "A remote channel of {} to {} throws a exception. and all {} will be notified and"
+                        + " set error.",
+                localAddr == null ? "localAddr" : localAddr,
+                remoteAddr == null ? "remoteAddr" : remoteAddr,
+                String.join(
+                        ",",
+                        inputChannels.values().stream()
+                                .map(RemoteInputChannel::toString)
+                                .collect(Collectors.toList())),
+                cause);
+        LOG.error(
+                "A remote channel throws exception.",
+                LogStackUtils.getCallStack(stackTraceElements));
+    }

Review comment:
   If the TaskExecutor dies, the log is not important; in that scenario it has little effect.
   But if the Task fails due to network jitter, I think the log is important: based on it, we can tell which remote channel is abnormal and which inputChannels failed.




