[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...

mjsax Thu, 24 Sep 2015 10:41:05 -0700

Github user mjsax commented on the pull request:

    https://github.com/apache/flink/pull/750#issuecomment-142999802
  
    Thanks for the review! Just to clarify: I did not apply any changes "just 
for fun". But I agree that this PR contains more changes as are actually 
related to this PR. For example, I wrote some tests according to the given 
"test pattern" of a package and was told that this test pattern should be 
changed. This also covers catching exceptions... So if this changes are "wrong" 
I was instructed not correctly. In order to "clean up" I changed the test 
pattern not just for my own new tests, but for the whole package (this might 
have been over eagerly...) Anyway, I agree, that this PR should be split and 
the changes to the tests should go into an own PR -- we can discuss what 
changes for the tests makes sense or not there.
    
    How STOP works:
     - `stop` signal is only sent to streaming source tasks
     - thus, batch jobs and streaming jobs are distinguished now (for batch 
jobs, no stop signal can be triggered at all)
     - only streaming sources implement `Stoppable` interface and receive stop 
signal
     - there is no explicit state change necessary -- after streaming tasks 
return from "run", the sources automatically go to "finishing" and close there 
output channels which propagates through the whole ExecutionGraph and all tasks 
finish automatically. Thus the overall job is marked as SUCCESS and not FAILED
     - for the JobManager UI, sources set a flag that they received the stop 
signal (which is displayed with a checkmark). This might be helpful, if it take 
some time before a source actually returns from `run()` and switches from 
RUNNING into FINISHING (otherwise, the user has no feedback if STOP was issued 
or not)
    
    About `stop()` vs `cancel()`: we need a dedicated stop signal because the 
stop signal only goes to streaming sources (and no other tasks) and does no 
trigger an explicit state change. For the user function, I used `cancel()` in 
the beginning, and was told the you agreed an the introduction of `stop()` 
method. So I just introduced it at the very last. I think there are some 
differences (see the JavaDoc I added for stop and cancel) so `UDF.stop()` might 
be useful (I think on exactly-once guartentee). If a job is canceled, it is a 
hard stop and exactly-once must not be preserved (tuples emitted by source are 
not processed as all tasks of the job get canceled at the same time). Not so 
for stop which gives exactly-once; the job is shut-down cleanly from the 
sources to the sinks and all data emitted by sources gets processed.
    
    About whitespace formatting: there might be a few, but a lot of changes 
look like WS formatting but are not. This happens, if a whole block is removed 
(I did remove some try-catch-blocks) and if there are empty line (with to WS 
indention) within the block. The diff does not display the changes nicely and 
consecutive line look like WS formatting even if there are not (there was a tap 
remove due to the removed block). I enable auto-formatting in eclipse only for 
lines I changed...
    
    I will split and update this PR and we can than take if from there. Makes 
sense?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...

Reply via email to