Congxian Qiu(klion26) created FLINK-14035:
---------------------------------------------

             Summary: Introduce/Change some log for snapshot to better analysis 
checkpoint problem
                 Key: FLINK-14035
                 URL: https://issues.apache.org/jira/browse/FLINK-14035
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Checkpointing
    Affects Versions: 1.10.0
            Reporter: Congxian Qiu(klion26)


Currently, most checkpoint information is logged at debug level (especially on 
the TM side). If we want to track which step a checkpoint is in and how much 
time each step consumes when a checkpoint fails or takes too long, we need to 
restart the job with debug logging enabled. This issue aims to improve the 
situation by changing some existing log statements from debug to info and 
adding some more debug logs. We have already changed these log levels in 
production at Alibaba, and it has caused no problems so far.

 

Details
{{Change the logs below from debug level to info}} 
 * log about {{Starting checkpoint xxx}} on the TM side
 * log about sync complete on the TM side
 * log about async complete on the TM side

Add debug logs 
 * log about receiving the barrier in exactly-once mode, to align with 
at-least-once mode
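
As a rough illustration of the proposal above, here is a minimal sketch of the 
intended log levels. It is not Flink code: the class, method names, and message 
wording are hypothetical, and java.util.logging (where FINE plays the role of 
debug) stands in for the SLF4J logger that Flink uses, only so the example is 
self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch; none of these names are Flink classes or methods.
public class SnapshotLoggingSketch {

    private static final Logger LOG = Logger.getLogger("SnapshotLoggingSketch");

    // Emits one debug-level (FINE) barrier message and three info-level messages,
    // timing the sync and async parts separately.
    static void snapshot(long checkpointId, Runnable syncPart, Runnable asyncPart) {
        LOG.fine("Received barrier for checkpoint " + checkpointId);
        LOG.info("Starting checkpoint " + checkpointId);

        long start = System.nanoTime();
        syncPart.run();
        long syncMs = (System.nanoTime() - start) / 1_000_000;
        LOG.info("Checkpoint " + checkpointId + " sync part completed in " + syncMs + " ms");

        start = System.nanoTime();
        asyncPart.run();
        long asyncMs = (System.nanoTime() - start) / 1_000_000;
        LOG.info("Checkpoint " + checkpointId + " async part completed in " + asyncMs + " ms");
    }

    // Runs one snapshot with the logger set to the given level and returns
    // the messages that were actually emitted.
    static List<String> capture(Level level, long checkpointId) {
        List<String> messages = new ArrayList<>();
        Handler handler = new Handler() {
            @Override public void publish(LogRecord record) { messages.add(record.getMessage()); }
            @Override public void flush() {}
            @Override public void close() {}
        };
        LOG.setUseParentHandlers(false); // keep the console quiet for the demo
        LOG.setLevel(level);
        LOG.addHandler(handler);
        try {
            snapshot(checkpointId, () -> {}, () -> {});
        } finally {
            LOG.removeHandler(handler);
        }
        return messages;
    }

    public static void main(String[] args) {
        capture(Level.INFO, 42L).forEach(System.out::println);
    }
}
```

With the logger at INFO, only the start, sync, and async messages appear, so 
the per-step timings are available without restarting the job; the barrier 
message shows up only once the level is lowered to FINE.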

 

If this issue is valid, then I'm happy to contribute it.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
