[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-03 Thread Matthias (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278088#comment-17278088 ] Matthias commented on FLINK-21030: {quote}1. Taking a savepoint 2. Stopping the job gracefully

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-03 Thread Till Rohrmann (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277880#comment-17277880 ] Till Rohrmann commented on FLINK-21030: Yes, fully agreed. Restarting the

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-02 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277769#comment-17277769 ] Zhu Zhu commented on FLINK-21030: I guess I had misunderstood the protocol of

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-02 Thread Matthias (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277736#comment-17277736 ] Matthias commented on FLINK-21030: Thanks for the analysis, that makes sense. I went ahead with this

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-02 Thread Till Rohrmann (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277248#comment-17277248 ] Till Rohrmann commented on FLINK-21030: I think the problem can and should be solved in the scope

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-02 Thread Till Rohrmann (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277054#comment-17277054 ] Till Rohrmann commented on FLINK-21030: I am not entirely sure whether adjusting the checkpoint

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-02 Thread Matthias (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277024#comment-17277024 ] Matthias commented on FLINK-21030: You're right: My proposal would affect the checkpoint creation as

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-02 Thread Yun Tang (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276965#comment-17276965 ] Yun Tang commented on FLINK-21030: I think returning the complete future to checkpoint coordinator to

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-01 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276920#comment-17276920 ] Zhu Zhu commented on FLINK-21030: Sounds good to me. [~yunta] would you also take a look at [~mapohl]'s

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-01 Thread Matthias (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276909#comment-17276909 ] Matthias commented on FLINK-21030: Yes, the notification of tasks happens right at the end of

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-01 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276846#comment-17276846 ] Zhu Zhu commented on FLINK-21030: Thanks for the investigation! [~mapohl] Does this mean that the

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-02-01 Thread Matthias (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276539#comment-17276539 ] Matthias commented on FLINK-21030: I paired with [~chesnay] to look into the issue. We came up with a

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-01-26 Thread Till Rohrmann (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272186#comment-17272186 ] Till Rohrmann commented on FLINK-21030: No objections from my side.

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-01-26 Thread Matthias (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272108#comment-17272108 ] Matthias commented on FLINK-21030: Just to clarify: The expected behavior is then that the command

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-01-19 Thread Zhu Zhu (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17268409#comment-17268409 ] Zhu Zhu commented on FLINK-21030: Agreed to trigger a global failover to bring FINISHED tasks back to

[jira] [Commented] (FLINK-21030) Broken job restart for job with disjoint graph

2021-01-19 Thread Till Rohrmann (Jira)
[ https://issues.apache.org/jira/browse/FLINK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17267923#comment-17267923 ] Till Rohrmann commented on FLINK-21030: Thanks for reporting this issue [~TheoD]. I agree that