[jira] [Commented] (FLINK-4410) Split checkpoint times into synchronous and asynchronous part
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15522934#comment-15522934 ]

Stephan Ewen commented on FLINK-4410:
-------------------------------------

I will take up this issue. I have a pretty good plan for how to do this and will do some overdue cleanup in the process.

> Split checkpoint times into synchronous and asynchronous part
> -------------------------------------------------------------
>
>                 Key: FLINK-4410
>                 URL: https://issues.apache.org/jira/browse/FLINK-4410
>             Project: Flink
>          Issue Type: Improvement
>          Components: Webfrontend
>            Reporter: Ufuk Celebi
>            Assignee: Stephan Ewen
>            Priority: Minor
>
> Checkpoint statistics contain the duration of a checkpoint. We should split
> this time into the synchronous and asynchronous part. This will give more
> insight into the inner workings of the checkpointing mechanism and help users
> better understand what's going on.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-4410) Split checkpoint times into synchronous and asynchronous part
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489788#comment-15489788 ]

Aljoscha Krettek commented on FLINK-4410:
-----------------------------------------

Hi,
there are actually three different durations that could be reported:
 - time from the checkpoint coordinator initiating a checkpoint to an operator acknowledging that checkpoint
 - time that an operator spends in the synchronous part of the checkpoint
 - time that an operator spends in the asynchronous part of the checkpoint

About synchronous/asynchronous: for this you can look at {{StreamTask.performCheckpoint()}}. At the end of the method, a Thread is started that does the asynchronous work of the checkpoint, and the method returns. Thus, the time until then would be the synchronous part, and the time spent in that thread would be the asynchronous part.
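
To make the synchronous/asynchronous split concrete, here is a minimal sketch of how the two durations could be measured. It is not Flink's actual {{StreamTask.performCheckpoint()}} code: the class {{CheckpointTimingSketch}}, the {{Timings}} holder, and the two {{Runnable}} parameters are hypothetical names used only for illustration. The assumption is simply that the synchronous part runs on the calling thread and the asynchronous part runs in a thread started just before the method returns.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of timing the two checkpoint phases; not Flink code. */
final class CheckpointTimingSketch {

    /** Holds the measured duration of each phase in nanoseconds. */
    static final class Timings {
        final long syncNanos;
        final long asyncNanos;

        Timings(long syncNanos, long asyncNanos) {
            this.syncNanos = syncNanos;
            this.asyncNanos = asyncNanos;
        }
    }

    /**
     * Runs syncPart on the calling thread and measures it, then starts a
     * thread that runs asyncPart and measures the time spent in that thread.
     * The method returns as soon as the thread is started; the returned
     * future completes once the asynchronous part has finished.
     */
    static CompletableFuture<Timings> performCheckpoint(Runnable syncPart, Runnable asyncPart) {
        long syncStart = System.nanoTime();
        syncPart.run();
        final long syncNanos = System.nanoTime() - syncStart;

        CompletableFuture<Timings> result = new CompletableFuture<>();
        Thread asyncThread = new Thread(() -> {
            long asyncStart = System.nanoTime();
            try {
                asyncPart.run();
                result.complete(new Timings(syncNanos, System.nanoTime() - asyncStart));
            } catch (Throwable t) {
                // Report a failure of the asynchronous part through the future.
                result.completeExceptionally(t);
            }
        }, "async-checkpoint-sketch");
        asyncThread.start();
        return result;
    }
}
```

In this sketch the asynchronous duration is counted from the moment the thread begins running, so any delay between starting the thread and its first instruction is not attributed to either phase. Whether that gap should be reported separately is a design choice the issue does not settle.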
[jira] [Commented] (FLINK-4410) Split checkpoint times into synchronous and asynchronous part
[ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468721#comment-15468721 ]

Ivan Mushketyk commented on FLINK-4410:
---------------------------------------

Hi [~uce].

I just want to make sure that I understand correctly what you mean by the synchronous and asynchronous parts. Do I understand correctly that they are the following?
* synchronous - time span between the moment the checkpoint is initiated and the moment the TriggerCheckpoint messages are sent
* asynchronous - time between the moment all TriggerCheckpoint messages are sent and the moment all replies are received