Re: [DISCUSS] Retrieve savepoint location after suspension of jobclusters

Till Rohrmann Tue, 11 Aug 2020 04:55:21 -0700

This sounds like a bug in Flink. Could you share the logs of the cluster
(ideally with TRACE log level) with us?


Cheers,
Till

On Tue, Aug 11, 2020 at 9:49 AM Fabian Paul <fabianp...@data-artisans.com>
wrote:

> Hi Till,
>
> The problem is reproducible with a basic shell script doing the following
> operations.
>
> 1. POST request to /jobs/${JOB_ID}/savepoints with the payload
>          {"cancel-job": true, "target-directory": "${LOCATION}"}
>         and store the trigger ID
>
> 2. Sleep 10 seconds
>
> 3. GET /jobs/${JOB_ID}/savepoints/${TRIGGER_ID}
>         results in a connect exception because the REST endpoint is shut down.
>
> Sorry if I misunderstood your previous answer, but I would expect that
> stopping the job
> with a savepoint is an asynchronous operation and should block the
> shutdown until
> the result is served.
> I can also confirm that the cluster is not shut down but the REST endpoint
> is, which makes
> it impossible to serve the asynchronous result.
>
> Best,
> Fabian
>
>
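For anyone who wants to try the three steps quoted above without a shell,
here is a minimal Python sketch of the same sequence. It is not Fabian's
script. The base URL, the function names and the injectable `request`
parameter are assumptions made for illustration, and the "request-id"
response field is taken from my reading of the Flink REST API rather than
from this thread.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8081"  # assumption: default REST address


def trigger_payload(target_directory):
    """Payload from step 1: take a savepoint and cancel the job."""
    return {"cancel-job": True, "target-directory": target_directory}


def status_path(job_id, trigger_id):
    """Path polled in step 3 for the asynchronous savepoint result."""
    return f"/jobs/{job_id}/savepoints/{trigger_id}"


def http_json(method, path, body=None, base_url=BASE_URL):
    """Send one JSON request and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def reproduce(job_id, target_directory, wait_seconds=10, request=http_json):
    """Run steps 1 to 3 and report whether the result could be fetched."""
    # Step 1: trigger the savepoint and keep the trigger ID.
    trigger = request(
        "POST", f"/jobs/{job_id}/savepoints", trigger_payload(target_directory)
    )
    trigger_id = trigger["request-id"]  # assumed response field name

    # Step 2: fixed wait, as in the original report.
    time.sleep(wait_seconds)

    # Step 3: fetch the result; a connection failure here is the reported symptom.
    try:
        return request("GET", status_path(job_id, trigger_id))
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        return {"error": f"REST endpoint unreachable: {exc}"}
```

Calling reproduce("<job id>", "<savepoint directory>") against a job cluster
should return either the savepoint status or an "error" entry when the REST
endpoint has already gone away.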
