Re: [DISCUSS] Retrieve savepoint location after suspension of jobclusters

2020-08-12 Thread Till Rohrmann
Thanks for the logs, Fabian. It is indeed a problem we introduced recently. I've created a JIRA issue to fix the problem [1]. This fix will also be included in the Flink 1.10.2 release.
[1] https://issues.apache.org/jira/browse/FLINK-18902
Cheers,
Till
On Wed, Aug 12, 2020 at 2:30 PM Fabian Paul

Re: [DISCUSS] Retrieve savepoint location after suspension of jobclusters

2020-08-12 Thread Fabian Paul
I attached the last log lines [1] of the jobmanager after triggering the savepoint. I just saw that the release of 1.10.2 has started, so it would be great if we could determine whether this is a bug and postpone the release if necessary. What do you think?
Best,
Fabian
[1]

Re: [DISCUSS] Retrieve savepoint location after suspension of jobclusters

2020-08-11 Thread Till Rohrmann
This sounds like a bug in Flink. Could you share the logs of the cluster (ideally with TRACE log level) with us?
Cheers,
Till
On Tue, Aug 11, 2020 at 9:49 AM Fabian Paul wrote:
> Hi Till,
>
> The problem is reproducible with a basic shell script doing the following
> operations.
>
> 1. Post

Re: [DISCUSS] Retrieve savepoint location after suspension of jobclusters

2020-08-11 Thread Fabian Paul
Hi Till,
The problem is reproducible with a basic shell script doing the following operations.
1. Post request to /jobs/${JOB_ID}/savepoints with the payload {"cancel-job": true,"target-directory": $(LOCATION)} and store the trigger ID
2. Sleep 10 seconds
3. Get
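For readers who want to replay the steps above, here is a minimal Python sketch of the same sequence. It is not the script from the thread. The third step is cut off in the archive, so the status poll against /jobs/<job id>/savepoints/<trigger id> is an assumption, as are the "request-id", "status" and "operation" response field names, the helper names, and the example base URL.

```python
import json
import time
import urllib.request


def build_trigger_request(base_url, job_id, target_directory):
    """Step 1: build the POST that cancels the job with a savepoint."""
    payload = {"cancel-job": True, "target-directory": target_directory}
    return urllib.request.Request(
        url=f"{base_url}/jobs/{job_id}/savepoints",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def status_url(base_url, job_id, trigger_id):
    """Step 3 (assumed): URL used to query the savepoint operation."""
    return f"{base_url}/jobs/{job_id}/savepoints/{trigger_id}"


def extract_location(status_body):
    """Return the savepoint location, or None if the operation is not done.

    The field names used here are assumptions about the response shape.
    """
    if status_body.get("status", {}).get("id") != "COMPLETED":
        return None
    return status_body.get("operation", {}).get("location")


def cancel_with_savepoint(base_url, job_id, target_directory, delay_seconds=10):
    """Run the three steps against a live cluster (hypothetical helper)."""
    request = build_trigger_request(base_url, job_id, target_directory)
    with urllib.request.urlopen(request) as response:
        trigger_id = json.load(response)["request-id"]  # assumed field name

    time.sleep(delay_seconds)  # Step 2: the fixed wait from the description

    with urllib.request.urlopen(status_url(base_url, job_id, trigger_id)) as response:
        return extract_location(json.load(response))
```

Calling cancel_with_savepoint("http://localhost:8081", job_id, target_dir) against a per-job cluster would exercise the reported scenario: if the cluster has already shut down by the time of the second request, the call fails with a connection error instead of returning a location.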

Re: [DISCUSS] Retrieve savepoint location after suspension of jobclusters

2020-08-08 Thread Till Rohrmann
Hi Fabian, could you explain a bit how you are cancelling a job with a savepoint and then trying to retrieve the savepoint path? When running Flink in per-job mode, the system should not shut down if you have an asynchronous operation running whose result you have not yet queried. I believe that this

Re: [DISCUSS] Retrieve savepoint location after suspension of jobclusters

2020-08-07 Thread Eleanore Jin
+1 Thank you Fabian!
On Fri, Aug 7, 2020 at 6:58 AM Fabian Paul wrote:
> Hi all,
>
> Due to recent changes in the shutdown mechanism of Flink [1] it is not
> conveniently possible anymore to suspend a job running on a jobcluster
> with a savepoint and retrieve the savepoint location via the

[DISCUSS] Retrieve savepoint location after suspension of jobclusters

2020-08-07 Thread Fabian Paul
Hi all,
Due to recent changes in the shutdown mechanism of Flink [1] it is not conveniently possible anymore to suspend a job running on a jobcluster with a savepoint and retrieve the savepoint location via the Flink API programmatically. With the introduced changes the REST endpoint