[ 
https://issues.apache.org/jira/browse/FLINK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041478#comment-15041478
 ] 

ASF GitHub Bot commented on FLINK-2111:
---------------------------------------

Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/750#issuecomment-161949525
  
    While fixing the `JobManagerTest` I noticed the following: if a job is 
stopped while it is still in state `SCHEDULED` or `DEPLOYING`, the caller 
receives a `StoppingSuccess`. However, the stop is never executed, and the job 
later switches to `RUNNING`.
    
    The same can be observed if the job is in state `RESTARTING`. Stopping a 
restarting job does nothing even though you receive a `StoppingSuccess` 
message. The job will later be redeployed. 
    
    As a user, I would expect the job to be stopped immediately, or at least at 
the next possible moment (e.g. once it is deployed). Alternatively, I would 
expect the system to tell me that stopping is not possible at the moment.
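
    To make the two expectations concrete, here is a minimal, self-contained 
sketch. It is not code from the pull request: `StopRequestSketch`, `StopReply`, 
`STOP_DEFERRED` and `STOPPING_FAILURE` are hypothetical names, and the 
`JobState` enum only mirrors the states mentioned in this comment. A stop that 
arrives before the job is `RUNNING` is remembered and applied on the 
transition to `RUNNING`, and the caller is told that it was deferred.

```java
import java.util.EnumSet;
import java.util.Set;

/** Illustrative only: a tiny state holder, not Flink's ExecutionGraph. */
final class StopRequestSketch {

    /** Subset of job states taken from the discussion above. */
    enum JobState { SCHEDULED, DEPLOYING, RUNNING, RESTARTING, FINISHED }

    /** Hypothetical replies; only "StoppingSuccess" is named in the discussion. */
    enum StopReply { STOPPING_SUCCESS, STOP_DEFERRED, STOPPING_FAILURE }

    private static final Set<JobState> NOT_YET_RUNNING =
            EnumSet.of(JobState.SCHEDULED, JobState.DEPLOYING, JobState.RESTARTING);

    private JobState state;
    private boolean stopPending;
    private boolean stopSignalSent;

    StopRequestSketch(JobState initialState) {
        this.state = initialState;
    }

    /** Handles a stop request according to the current job state. */
    StopReply stop() {
        if (state == JobState.RUNNING) {
            stopSignalSent = true; // sources can be signalled right away
            return StopReply.STOPPING_SUCCESS;
        }
        if (NOT_YET_RUNNING.contains(state)) {
            stopPending = true; // remember the request instead of dropping it
            return StopReply.STOP_DEFERRED;
        }
        return StopReply.STOPPING_FAILURE; // e.g. the job has already finished
    }

    /** Applies a state transition and executes a pending stop once RUNNING. */
    void transitionTo(JobState next) {
        state = next;
        if (next == JobState.RUNNING && stopPending) {
            stopPending = false;
            stopSignalSent = true;
        }
    }

    boolean stopSignalSent() {
        return stopSignalSent;
    }
}
```

    The other option from above, rejecting the request outright, would simply 
return `STOPPING_FAILURE` for the not-yet-running states instead of deferring.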
    
    A similar question is what happens if only a subset of the sources is 
deployed and in state `RUNNING`. The sources that are not yet deployed would 
not be notified of the stop signal and would therefore be deployed as usual.
    
    Furthermore, what happens if the `stop` method of the `SourceFunction` 
throws an unchecked exception? If I'm not mistaken, then this will only get 
logged. But shouldn't the task be cancelled in such a situation because the 
state cannot be guaranteed to be consistent anymore?
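
    A rough sketch of the behaviour I would expect is below. The interfaces 
`StoppableSource` and `CancellableTask` are hypothetical stand-ins defined 
only for this example; they are not the actual `SourceFunction` or `Task` 
types. An unchecked exception from `stop()` is escalated to a cancel instead 
of being logged and ignored.

```java
/** Illustrative only: escalate a failing stop() call to a task cancellation. */
final class StopFailureSketch {

    /** Hypothetical stand-in for a source that supports stop(). */
    interface StoppableSource {
        void stop();
    }

    /** Hypothetical stand-in for the task that runs the source. */
    interface CancellableTask {
        void cancel(Throwable cause);
    }

    /**
     * Calls stop() on the source. If stop() throws an unchecked exception,
     * the task is cancelled with that exception as the cause.
     *
     * @return true if stop() completed normally, false if the task was cancelled
     */
    static boolean stopOrCancel(StoppableSource source, CancellableTask task) {
        try {
            source.stop();
            return true;
        } catch (RuntimeException e) {
            // The source state can no longer be trusted, so do not just log.
            task.cancel(e);
            return false;
        }
    }
}
```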
    
    The `Execution` treats two cases identically: a `Task` that is not 
`Stoppable` and a `Task` that cannot be found on the `TaskManager`. Both cases 
cause a `TaskOperationResult(executionID, false, message)` to be sent back to 
the `Execution`, which then logs that the stopping call "did not find the 
task". I think it would be good to differentiate the two cases.
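
    One possible way to differentiate them, sketched under assumptions: the 
`StopOutcome` enum and the simplified `Task`/`Stoppable` interfaces below are 
hypothetical and defined only for this example; the real 
`TaskOperationResult` carries a boolean and a message. The point is only that 
the lookup failure and the "not stoppable" case produce distinct results that 
the `Execution` could log separately.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: report "not found" and "not stoppable" separately. */
final class StopResultSketch {

    /** Hypothetical result type with one value per distinguishable case. */
    enum StopOutcome { STOPPED, TASK_NOT_FOUND, TASK_NOT_STOPPABLE }

    /** Simplified stand-ins for the task types, defined for this example. */
    interface Task { }

    interface Stoppable {
        void stop();
    }

    interface StoppableTask extends Task, Stoppable { }

    private final Map<String, Task> runningTasks = new HashMap<>();

    void register(String executionId, Task task) {
        runningTasks.put(executionId, task);
    }

    /** Tries to stop the task registered under the given execution ID. */
    StopOutcome stopTask(String executionId) {
        Task task = runningTasks.get(executionId);
        if (task == null) {
            return StopOutcome.TASK_NOT_FOUND;
        }
        if (!(task instanceof Stoppable)) {
            return StopOutcome.TASK_NOT_STOPPABLE;
        }
        ((Stoppable) task).stop();
        return StopOutcome.STOPPED;
    }
}
```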


> Add "stop" signal to cleanly shutdown streaming jobs
> ----------------------------------------------------
>
>                 Key: FLINK-2111
>                 URL: https://issues.apache.org/jira/browse/FLINK-2111
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Runtime, JobManager, Local Runtime, 
> Streaming, TaskManager, Webfrontend
>            Reporter: Matthias J. Sax
>            Assignee: Matthias J. Sax
>            Priority: Minor
>
> Currently, streaming jobs can only be stopped using the "cancel" command, 
> which is a "hard" stop with no clean shutdown.
> The newly introduced "stop" signal will only affect streaming source tasks, 
> such that the sources can stop emitting data and shut down cleanly, resulting 
> in a clean shutdown of the whole streaming job.
> This feature is a prerequisite for 
> https://issues.apache.org/jira/browse/FLINK-1929



