[
https://issues.apache.org/jira/browse/FLINK-21029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267929#comment-17267929
]
Till Rohrmann commented on FLINK-21029:
---------------------------------------
Thanks for reporting this issue [~TheoD]. What would be the desired behaviour?
Would you expect Flink to retry the stop-with-savepoint if the operation
failed? How often should it be allowed to retry in case there is a
systematic problem (e.g. a wrong savepoint path or a full directory)?
Would you expect Flink to then go into a FAILED state if the
stop-with-savepoint operation cannot be completed?
At the moment, the contract is that Flink tries to perform stop-with-savepoint
and, if it fails, reports the failure back to the user and tries to recover
from it.
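Given that contract, a bounded retry today has to live on the client side. The sketch below wraps an arbitrary command in a retry loop; the `flink stop -p <savepointDir> <jobId>` invocation shown in the usage comment is the standard CLI for stop-with-savepoint, while the job id and savepoint directory are placeholders, and the wrapper itself is only an illustration of the retry policy being discussed, not anything Flink provides.

```shell
#!/usr/bin/env sh
# Sketch: bounded client-side retries around stop-with-savepoint.
# The retry wrapper is generic POSIX shell; the flink invocation in the
# usage comment below uses placeholder values (job id, savepoint dir).

# retry MAX CMD...  -- run CMD, retrying up to MAX times until it succeeds.
retry() {
  max="$1"; shift
  n=0
  until "$@"; do
    n=$((n + 1))
    if [ "$n" -ge "$max" ]; then
      echo "command failed after $max attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying..." >&2
  done
}

# Usage (placeholder job id and savepoint target):
# retry 3 flink stop -p s3://bucket/savepoints <jobId>
```

Note that retrying only makes sense for transient failures; a systematic problem such as a wrong savepoint path will fail on every attempt, which is exactly the trade-off raised above.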
> Failure of shutdown leads to restart of (connected) pipeline
> ------------------------------------------------------------
>
> Key: FLINK-21029
> URL: https://issues.apache.org/jira/browse/FLINK-21029
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.11.2
> Reporter: Theo Diefenthal
> Priority: Major
> Fix For: 1.13.0, 1.11.4, 1.12.2
>
>
> This bug happened in combination with
> https://issues.apache.org/jira/browse/FLINK-21028 .
> When I wanted to stop a job via the CLI ("flink stop ...") with a disjoint job
> graph (independent pipelines in the graph), one task wasn't able to stop
> properly (reported in the mentioned bug). This led to the job being restarted.
> I think this is wrong behaviour in general and a separate bug:
> If any crash occurs while trying to stop a job, Flink shouldn't try to restart
> it but should continue stopping the job.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)