[
https://issues.apache.org/jira/browse/FLINK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094028#comment-15094028
]
ASF GitHub Bot commented on FLINK-2111:
---------------------------------------
Github user tillrohrmann commented on the pull request:
https://github.com/apache/flink/pull/750#issuecomment-170942707
Good work @mjsax. I think the testing of the REST interface is sufficient,
even without a dedicated YARN test case.
I had some more comments concerning the failure case handling of stop
calls. The first problem is still the handling of exceptions when calling
`stop` on the `Invokable` in `Task.stopExecution`. The exception will only be
logged but no further action is taken. This can lead to a situation where we
have a corrupted state. I think we should fail the task in such a situation.
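A minimal sketch of that suggestion follows. The names `StoppableSketch`, `TaskStateSketch` and `TaskSketch` are hypothetical stand-ins, not Flink's actual classes; the only point is that an exception thrown by `stop` moves the task to a failed state instead of being logged and dropped.

```java
// Hypothetical stand-ins for illustration; these are not Flink's actual classes.
interface StoppableSketch {
    void stop() throws Exception;
}

enum TaskStateSketch { RUNNING, FAILED }

class TaskSketch {
    private final StoppableSketch invokable;
    private TaskStateSketch state = TaskStateSketch.RUNNING;
    private Throwable failureCause;

    TaskSketch(StoppableSketch invokable) {
        this.invokable = invokable;
    }

    // Forwards the stop signal; a failing stop call fails the task.
    void stopExecution() {
        try {
            invokable.stop();
        } catch (Throwable t) {
            // Assumption: failing the task is preferable to only logging,
            // because the invokable may be left in a corrupted state.
            fail(t);
        }
    }

    private void fail(Throwable cause) {
        state = TaskStateSketch.FAILED;
        failureCause = cause;
    }

    TaskStateSketch getState() {
        return state;
    }

    Throwable getFailureCause() {
        return failureCause;
    }
}
```

In this sketch a successful `stop` leaves the task in `RUNNING`, on the assumption that the source then finishes on its own.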
Additionally, the case that a task cannot be found on the `TaskManager` and
that an exception occurs in `Task.stopExecution` are treated identically by
sending a `TaskOperationResult` with `success == false` to the `JobManager`. On
the `JobManager` side this will only be logged. I think the exception case
should be handled differently, for example by failing the execution.
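One possible way to keep the two cases apart is sketched below. It assumes a three-valued outcome in place of the single `success` flag; `StopOutcomeSketch`, `StopResultSketch`, `StopReactionSketch` and `StopResultHandlerSketch` are hypothetical names and do not exist in Flink.

```java
// Hypothetical types for illustration; none of these exist in Flink.
enum StopOutcomeSketch { STOPPED, TASK_NOT_FOUND, STOP_FAILED }

enum StopReactionSketch { NONE, LOG_ONLY, FAIL_EXECUTION }

final class StopResultSketch {
    final StopOutcomeSketch outcome;
    final Throwable cause; // only set for STOP_FAILED

    private StopResultSketch(StopOutcomeSketch outcome, Throwable cause) {
        this.outcome = outcome;
        this.cause = cause;
    }

    static StopResultSketch stopped() {
        return new StopResultSketch(StopOutcomeSketch.STOPPED, null);
    }

    static StopResultSketch taskNotFound() {
        return new StopResultSketch(StopOutcomeSketch.TASK_NOT_FOUND, null);
    }

    static StopResultSketch stopFailed(Throwable cause) {
        return new StopResultSketch(StopOutcomeSketch.STOP_FAILED, cause);
    }
}

final class StopResultHandlerSketch {
    // Decides how the receiving side reacts to a stop result.
    static StopReactionSketch react(StopResultSketch result) {
        switch (result.outcome) {
            case STOPPED:
                return StopReactionSketch.NONE;
            case TASK_NOT_FOUND:
                // Assumption: a missing task is harmless and only worth a log entry.
                return StopReactionSketch.LOG_ONLY;
            case STOP_FAILED:
                // An exception during stop may mean corrupted state: fail the execution.
                return StopReactionSketch.FAIL_EXECUTION;
            default:
                throw new IllegalStateException("unknown outcome: " + result.outcome);
        }
    }
}
```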
And it is still possible that you send a `StopJob` message to the
`JobManager`, see that the job is in state `RUNNING`, then the `ExecutionGraph`
switches to `RESTARTING`, and then the stop call is executed on the
`ExecutionGraph` which won't have an effect. As a user you will receive a
`StoppingSuccess` message but the job will simply be restarted. I think we
should also allow stopping jobs when they are in the state `RESTARTING`.
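The race could be closed roughly as in the following sketch, which checks the state and records the stop request in one synchronized step. `JobStatusSketch` and `ExecutionGraphSketch` are hypothetical stand-ins, and ending in `FINISHED` when a stop arrives during `RESTARTING` is an assumption of this sketch, not something the pull request defines.

```java
// Hypothetical stand-ins for illustration; these are not Flink's actual classes.
enum JobStatusSketch { RUNNING, RESTARTING, FINISHED }

class ExecutionGraphSketch {
    private JobStatusSketch status = JobStatusSketch.RUNNING;
    private boolean stopRequested;

    // Returns true only if the stop request actually took effect.
    synchronized boolean stop() {
        switch (status) {
            case RUNNING:
                // The stop signal would be forwarded to the source tasks here.
                stopRequested = true;
                return true;
            case RESTARTING:
                // Remember the request so the pending restart is suppressed.
                stopRequested = true;
                return true;
            default:
                return false;
        }
    }

    // Simulates a failure that triggers a restart attempt.
    synchronized void onFailure() {
        if (status == JobStatusSketch.RUNNING) {
            status = JobStatusSketch.RESTARTING;
        }
    }

    // Called by the restart logic; skipped if a stop was requested meanwhile.
    synchronized void restart() {
        if (status != JobStatusSketch.RESTARTING) {
            return;
        }
        // Assumption: a stopped job ends in FINISHED rather than being restarted.
        status = stopRequested ? JobStatusSketch.FINISHED : JobStatusSketch.RUNNING;
    }

    synchronized JobStatusSketch getStatus() {
        return status;
    }
}
```

Because `stop` reports whether it had an effect, a success message would only be sent when the job will not come back.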
What do you think?
> Add "stop" signal to cleanly shutdown streaming jobs
> ----------------------------------------------------
>
> Key: FLINK-2111
> URL: https://issues.apache.org/jira/browse/FLINK-2111
> Project: Flink
> Issue Type: Improvement
> Components: Distributed Runtime, JobManager, Local Runtime,
> Streaming, TaskManager, Webfrontend
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Minor
>
> Currently, streaming jobs can only be stopped using the "cancel" command, which is
> a "hard" stop with no clean shutdown.
> The newly introduced "stop" signal will only affect streaming source tasks
> such that the sources can stop emitting data and shut down cleanly, resulting
> in a clean shutdown of the whole streaming job.
> This feature is a prerequisite for
> https://issues.apache.org/jira/browse/FLINK-1929