I see the problem with very large jobs. Maybe we could solve it a bit 
differently, by deploying the tasks in topological order when using `EAGER` 
scheduling.
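
To make the idea concrete, here is a minimal sketch (not Flink code; all names 
are made up for illustration) of deploying the tasks of a job graph in 
topological order, so that every producer is deployed before its consumers:

```java
import java.util.*;

// Hypothetical sketch: deploy the tasks of a DAG in topological order
// (Kahn's algorithm), so producers are always deployed before their
// consumers under EAGER scheduling. Nothing here is an actual Flink class.
public class TopologicalDeployment {

    public static List<String> topologicalOrder(Map<String, List<String>> downstream) {
        // Count incoming edges (i.e. producers) for every task.
        Map<String, Integer> inDegree = new HashMap<>();
        downstream.keySet().forEach(t -> inDegree.putIfAbsent(t, 0));
        downstream.values().forEach(consumers ->
                consumers.forEach(c -> inDegree.merge(c, 1, Integer::sum)));

        // Start with tasks that have no producers (the sources).
        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((task, deg) -> { if (deg == 0) ready.add(task); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String task = ready.poll();
            order.add(task);
            // A consumer becomes ready once all of its producers are deployed.
            for (String consumer : downstream.getOrDefault(task, List.of())) {
                if (inDegree.merge(consumer, -1, Integer::sum) == 0) {
                    ready.add(consumer);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // source -> map -> sink
        Map<String, List<String>> dag = Map.of(
                "source", List.of("map"),
                "map", List.of("sink"),
                "sink", List.of());

        for (String task : topologicalOrder(dag)) {
            // In a real scheduler this would trigger the actual deployment call.
            System.out.println("deploying " + task);
        }
    }
}
```

With an order like this, a consumer would never be deployed before the 
partitions it reads from have been registered by their producers.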

Concerning your answer to my second question: what if the producer partition 
gets disposed (e.g. due to a failover which does not necessarily restart the 
downstream operators)? At the moment an upstream task failure will always 
fail the downstream consumers. However, this can change in the future, and the 
more assumptions (e.g. that downstream operators will be failed if upstream 
operators fail) we bake in, the harder it gets to change this behaviour. 
Moreover, I think it is always a good idea to make the components as 
self-contained as possible. This also entails that the failover behaviour 
should ideally not depend on other things happening. Therefore, I'm a bit 
hesitant to change the existing behaviour.
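
To illustrate the self-containment point, here is a minimal, purely 
hypothetical sketch (none of these types exist in Flink) of a consumer that 
detects a disposed producer partition on its own, instead of relying on the 
scheduler to always cancel it together with the failed producer:

```java
// Sketch of the "self-contained" idea: the consumer reacts to a disposed
// producer partition itself rather than assuming it will always be failed
// whenever its producer fails. All types below are made up for illustration.
public class SelfContainedConsumer {

    /** Illustrative stand-in for a handle to a remote result partition. */
    interface PartitionHandle {
        boolean isAvailable();
        byte[] nextBuffer();
    }

    /** Raised by the consumer itself, triggering its own failover path. */
    static class ProducerDisposedException extends RuntimeException {
        ProducerDisposedException(String msg) { super(msg); }
    }

    static byte[] readNext(PartitionHandle producer) {
        // The consumer does not depend on being restarted together with the
        // producer; it detects the disposed partition and fails on its own.
        if (!producer.isAvailable()) {
            throw new ProducerDisposedException("producer partition was disposed");
        }
        return producer.nextBuffer();
    }

    public static void main(String[] args) {
        PartitionHandle disposed = new PartitionHandle() {
            public boolean isAvailable() { return false; }
            public byte[] nextBuffer() { return new byte[0]; }
        };
        try {
            readNext(disposed);
        } catch (ProducerDisposedException e) {
            // The consumer triggers its own failover instead of waiting to be cancelled.
            System.out.println("consumer fails itself: " + e.getMessage());
        }
    }
}
```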

[ Full content available at: https://github.com/apache/flink/pull/6680 ]