zhijiangW commented on issue #7186: [FLINK-10941] Keep slots which contain unconsumed result partitions
URL: https://github.com/apache/flink/pull/7186#issuecomment-468624034

@azagrebin, `sendFailIntermediateResultPartitionsRpcCall` is mainly used for the scenario where the producer task is no longer in the `TaskManager` (for example, it is already FINISHED) but its `ResultPartition` might still exist in the `TaskManager`. In that case we can cancel its `ResultPartition` instead of cancelling the `Task`.

I think I understand your confusion about how the producer releases its partition if it does not receive `ClosePartition` and the consumer exits successfully without failover. The key point is that the producer becomes aware of the inactive channel on the consumer side. Once the consumer (TCP client) exits and closes the TCP channel on its side, the producer (TCP server) notices the inactive channel within a short time (based on the TCP mechanism and Netty), then releases all the partitions and finally closes the TCP channel on the server side.

I once encountered a scenario where, after the TCP client was closed, it took about two hours for the TCP server to notice and close the channel, because of a hardware issue and settings. We then added an idle handler in Netty to detect the closed client in real time. In the common case, this awareness is nearly real time.
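To make the release-on-inactive-channel idea concrete, here is a minimal sketch using plain `java.net` sockets. It is not Flink or Netty code: the class `InactiveChannelSketch`, the `partitions` set, and `serveOneConsumer` are hypothetical names for illustration, and `setSoTimeout` is only a crude stand-in for an idle handler. It assumes a single consumer connection and treats any period without incoming bytes as idle.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a producer that serves partitions over one TCP connection. */
public class InactiveChannelSketch {

    /** Partitions still held for the connected consumer (names are illustrative). */
    static final Set<String> partitions = ConcurrentHashMap.newKeySet();

    /**
     * Serves one consumer connection and releases all partitions once the
     * channel is seen as inactive. Returns the reason for the release.
     */
    static String serveOneConsumer(ServerSocket server, int idleTimeoutMs) throws IOException {
        try (Socket channel = server.accept()) {
            // Assumption: a read timeout approximates what an idle handler would detect.
            channel.setSoTimeout(idleTimeoutMs);
            InputStream in = channel.getInputStream();
            String reason;
            try {
                while (in.read() != -1) {
                    // Ignore incoming bytes; only the end of the stream matters here.
                }
                reason = "closed-by-consumer"; // read() returned -1: the peer closed its side
            } catch (SocketTimeoutException e) {
                reason = "idle-timeout"; // no bytes arrived within the timeout
            }
            partitions.clear(); // release every partition tied to this channel
            return reason;
        } // try-with-resources closes the server side of the channel
    }

    public static void main(String[] args) throws Exception {
        partitions.add("partition-0");
        partitions.add("partition-1");
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // The consumer connects and exits normally without sending any close message.
            Socket consumer = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
            consumer.close();
            String reason = serveOneConsumer(server, 5000);
            System.out.println(reason + ", partitions left: " + partitions.size());
        }
    }
}
```

In this sketch the producer side never receives an explicit close message. It learns that the consumer is gone either from the end of the stream or, if the close is never observed, from the read timeout, which corresponds to the two-hour case described above.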
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
