If a TaskManager (TM) goes down, any data generated after the last successful checkpoint cannot be guaranteed to be consistent across the cluster. Hence, this data is discarded and the job goes back to the last known consistent state, i.e. the last checkpoint that was completed successfully.
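
To make the rollback concrete, here is a minimal toy sketch in Python. It is not Flink's API: ToyJob, run, checkpoint_every and fail_at are hypothetical names chosen for illustration, and the sketch assumes the input can be replayed from the checkpointed offset.

```python
class ToyJob:
    """Toy stand-in for a stateful job (hypothetical, not Flink's API)."""

    def __init__(self):
        self.state = 0             # running sum, the job's only state
        self.offset = 0            # position in the (replayable) input
        self._checkpoint = (0, 0)  # last completed checkpoint: (state, offset)

    def process(self, value):
        self.state += value
        self.offset += 1

    def checkpoint(self):
        # Record a consistent snapshot of state and input position.
        self._checkpoint = (self.state, self.offset)

    def recover(self):
        # Discard everything done since the last completed checkpoint.
        self.state, self.offset = self._checkpoint


def run(records, checkpoint_every, fail_at):
    """Process records, simulating one failure just before offset fail_at."""
    job = ToyJob()
    failed = False
    while job.offset < len(records):
        if not failed and job.offset == fail_at:
            failed = True
            job.recover()  # roll back, then replay from the checkpointed offset
            continue
        job.process(records[job.offset])
        if job.offset % checkpoint_every == 0:
            job.checkpoint()
    return job.state
```

In this sketch the work done between the last checkpoint and the failure is thrown away and redone from the checkpointed offset, so the final result is the same as in a run without a failure.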

On 05.06.2018 13:06, Garvit Sharma wrote:
But the job should be terminated gracefully. Why is this behavior not there?

On Tue, Jun 5, 2018 at 4:19 PM, Chesnay Schepler <ches...@apache.org> wrote:

    No checkpoint will be triggered when the cluster is shut down. For
    this case you will have to manually trigger a savepoint.

    If a TM goes down it does not create a checkpoint. In these cases
    the job will be restarted from the last successful checkpoint.


    On 05.06.2018 12:01, Data Engineer wrote:

        Hi,

        Suppose I have a working Flink cluster with 1 taskmanager and
        1 jobmanager and I have enabled checkpointing with say an
        interval of 1 minute.
        Now if I shut down the Flink cluster in between checkpoints
        (say for some upgrade), will the JobManager automatically
        trigger a checkpoint before going down?

        Or is it mandatory to manually trigger savepoints in these cases?
        Also am I correct in my understanding that if a taskmanager
        goes down first, there is no way the TaskManager can trigger
        the checkpoint on its own?






--

Garvit Sharma
github.com/garvitlnmiit/

No Body is a Scholar by birth, its only hard work and strong determination that makes him master.

