GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/3773
[FLINK-5867] [FLINK-5866] [flip-1] Implement FailoverStrategy for pipelined regions

This is based on #3772; the relevant commits are the last four.

The majority of the work has been done by @tiemsn, with some rebasing and additions from me.

# Pipelined Region Failover

As described in [FLIP-1](https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures), this pull request implements the detection of pipelined regions in the `ExecutionGraph` and failover within these pipelined regions.

![st0-nzqia5abpwrgaogpllw](https://cloud.githubusercontent.com/assets/1727146/25399938/54fda5a4-29f1-11e7-9efe-5d845644089f.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink flip-1-pipelined-regions

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3773.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3773

----
commit ef7fd9964c1c74feb4641e57a138c54558b2449c
Author: Stephan Ewen <se...@apache.org>
Date:   2017-03-21T18:13:34Z

    [FLINK-5869] [flip-1] Add basic abstraction for Failover Strategies to ExecutionGraph

      - Rename 'ExecutionGraph.fail()' to 'ExecutionGraph.failGlobally()' to differentiate from fine-grained failures/recovery
      - Add base class for FailoverStrategy
      - Add default implementation (restart all tasks)
      - Add logic to load the failover strategy from the configuration

commit c04a8a312098fddce14e392b8d9dbf396b1df3f3
Author: Stephan Ewen <se...@apache.org>
Date:   2017-03-29T20:49:54Z

    [FLINK-6340] [flip-1] Add a termination future to the Execution

commit 92d3f7e1025dc3c3499730bda8e8a9acfd3b5c13
Author: shuai.xus <shuai....@alibaba-inc.com>
Date:   2017-04-18T06:15:29Z

    [FLINK-5867] [flip-1] Support restarting only pipelined sub-regions of the ExecutionGraph on task failure

commit 456600d5e37724bbcc7d570f6828e3fef6298483
Author: shuai.xus <shuai....@alibaba-inc.com>
Date:   2017-04-20T21:56:53Z

    [FLINK-5867] [flip-1] Add tests for pipelined failover region construction

commit 622f07e0efc82bf13f12ae1960a35ecc48c865c1
Author: Stephan Ewen <se...@apache.org>
Date:   2017-04-20T22:02:19Z

    [FLINK-5867] [flip-1] Improve performance of Pipelined Failover Region construction

    This method exploits the fact that vertices are already in topological order.

commit 39402583df8b4c51016c72f968772cbbdd6c92e3
Author: shuai.xus <shuai....@alibaba-inc.com>
Date:   2017-04-25T07:42:48Z

    [FLINK-5867] [flip-1] Correct some JavaDocs for RestartIndividualStrategy

----
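For readers unfamiliar with the idea of a pipelined region, the following is a minimal, illustrative sketch of how vertices could be grouped into regions. It is not the code in this pull request. The class `PipelinedRegionSketch`, its `Edge` type, and the method `computeRegions` are hypothetical names introduced only for this example, and the grouping uses a plain union-find rather than the topological-order approach mentioned in the commit messages. The assumption is simply that two vertices connected by a pipelined edge must fail over together, while a blocking edge separates regions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, not part of Flink's API. */
public class PipelinedRegionSketch {

    /** Hypothetical edge between two vertex indices, either pipelined or blocking. */
    public static final class Edge {
        final int source;
        final int target;
        final boolean pipelined;

        public Edge(int source, int target, boolean pipelined) {
            this.source = source;
            this.target = target;
            this.pipelined = pipelined;
        }
    }

    /**
     * Groups vertices 0..numVertices-1 into regions. Vertices joined by a
     * pipelined edge land in the same region; blocking edges are ignored,
     * so they act as region boundaries.
     */
    public static List<List<Integer>> computeRegions(int numVertices, List<Edge> edges) {
        int[] parent = new int[numVertices];
        for (int i = 0; i < numVertices; i++) {
            parent[i] = i; // every vertex starts as its own region
        }

        for (Edge e : edges) {
            if (e.pipelined) {
                union(parent, e.source, e.target);
            }
        }

        // Collect vertices by their region representative, in vertex order.
        Map<Integer, List<Integer>> byRoot = new LinkedHashMap<>();
        for (int v = 0; v < numVertices; v++) {
            byRoot.computeIfAbsent(find(parent, v), k -> new ArrayList<>()).add(v);
        }
        return new ArrayList<>(byRoot.values());
    }

    private static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]]; // path halving
            v = parent[v];
        }
        return v;
    }

    private static void union(int[] parent, int a, int b) {
        int rootA = find(parent, a);
        int rootB = find(parent, b);
        if (rootA != rootB) {
            parent[rootB] = rootA;
        }
    }

    public static void main(String[] args) {
        // 0 -> 1 pipelined, 1 -> 2 blocking, 2 -> 3 pipelined
        List<Edge> edges = Arrays.asList(
                new Edge(0, 1, true),
                new Edge(1, 2, false),
                new Edge(2, 3, true));
        System.out.println(computeRegions(4, edges));
    }
}
```

Under this sketch, a failure of vertex 1 would restart only the region containing vertices 0 and 1, while the region behind the blocking edge is left untouched, which is the behavior FLIP-1 describes for fine-grained recovery.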