[jira] [Commented] (FLINK-29940) ExecutionGraph logs job state change at ERROR level when job fails

Yun Gao (Jira) Thu, 17 Nov 2022 01:42:03 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-29940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635245#comment-17635245
 ]


Yun Gao commented on FLINK-29940:
---------------------------------

Hi [~liuml07], from the JobManager's perspective this might not be an error, 
since the JM is able to recover from this state. Could you also confirm why we 
want to log it at ERROR level? If it is for monitoring purposes, log output 
might not be very stable, and it might be better to rely on metrics such as 
numberOfRestarts. 
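
To make the trade-off concrete, here is a minimal sketch of choosing the log 
level from the target state. It is not Flink's actual ExecutionGraph code: the 
State enum, levelFor and logTransition are hypothetical names, and 
java.util.logging stands in for the SLF4J logger that Flink uses. It assumes 
only the terminal FAILED state is escalated, while recoverable transitions such 
as RESTARTING stay at INFO.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class JobStateLogging {
    // Hypothetical subset of job states; not Flink's actual JobStatus enum.
    enum State { RUNNING, RESTARTING, FAILING, FAILED, FINISHED }

    private static final Logger LOG =
            Logger.getLogger(JobStateLogging.class.getName());

    // Assumption: only the terminal FAILED state is escalated (SEVERE is the
    // java.util.logging counterpart of ERROR); everything else stays at INFO.
    static Level levelFor(State newState) {
        return newState == State.FAILED ? Level.SEVERE : Level.INFO;
    }

    // Logs one state transition, attaching the failure cause if there is one.
    static void logTransition(String jobName, State from, State to, Throwable cause) {
        String message =
                "Job " + jobName + " switched from state " + from + " to " + to + ".";
        LOG.log(levelFor(to), message, cause);
    }

    public static void main(String[] args) {
        // A recoverable transition: logged at INFO.
        logTransition("example-job", State.RUNNING, State.RESTARTING, null);
        // The terminal failure: logged at SEVERE with the root cause attached.
        logTransition("example-job", State.FAILING, State.FAILED,
                new RuntimeException("root cause"));
    }
}
```

Even with such a change, alerting would probably still be more robust when 
based on a metric than on matching log lines.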

> ExecutionGraph logs job state change at ERROR level when job fails
> ------------------------------------------------------------------
>
>                 Key: FLINK-29940
>                 URL: https://issues.apache.org/jira/browse/FLINK-29940
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.16.0
>            Reporter: Mingliang Liu
>            Priority: Minor
>              Labels: pull-request-available
>
> When a job switches to the FAILED state, the log is very useful for 
> understanding why it failed, along with the root cause exception stack. 
> However, the current log level is INFO, which makes it inconvenient for users 
> to find among so many surrounding log lines. We can log at ERROR level when 
> the job switches to the FAILED state.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
