TisonKun commented on issue #6680: [FLINK-10319] [runtime] Too many requestPartitionState would crash JM
URL: https://github.com/apache/flink/pull/6680#issuecomment-430886993

As for "deploying tasks in topological order", I agree that it could help. It is an orthogonal improvement, though. Regarding your hesitancy, I'd like to learn in which situations a downstream operator would not be failed by an upstream failure. To keep the state clean, either the upstream failure fails the downstream and both restore from the latest checkpoint, or we need to implement a failover strategy that takes responsibility for reconciling the state. The latter sounds quite costly.
