[jira] [Commented] (FLINK-14035) Introduce/Change some log for snapshot to better analysis checkpoint problem

vinoyang (Jira) Tue, 10 Sep 2019 17:09:49 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927134#comment-16927134
 ]


vinoyang commented on FLINK-14035:
----------------------------------

[~klion26] Sounds reasonable. +1

> Introduce/Change some log for snapshot to better analysis checkpoint problem
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-14035
>                 URL: https://issues.apache.org/jira/browse/FLINK-14035
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.10.0
>            Reporter: Congxian Qiu(klion26)
>            Priority: Major
>
> Currently, most checkpoint information is logged at debug level (especially 
> on the TM side). To track which step a checkpoint has reached and how much 
> time each step takes when a checkpoint fails or runs too long, we have to 
> restart the job with debug logging enabled. This issue proposes changing 
> some existing log statements from debug to info and adding some more debug 
> logs. We have already changed these log levels in production at Alibaba, 
> and have seen no problems so far.
>  
> Details
> Change the following logs from debug level to info:
>  * the {{Starting checkpoint xxx}} log on the TM side
>  * the sync-complete log on the TM side
>  * the async-complete log on the TM side
> Add debug logs:
>  * a log for receiving the barrier in exactly-once mode, to align with 
> at-least-once mode
>  
> If this issue is valid, then I'm happy to contribute it.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
