zhijiangW commented on issue #7186: [FLINK-10941] Keep slots which contain 
unconsumed result partitions
URL: https://github.com/apache/flink/pull/7186#issuecomment-468624034
 
 
   @azagrebin, `sendFailIntermediateResultPartitionsRpcCall` is mainly used 
for the scenario where the producer task is no longer in the `TaskManager` (for 
example, it is already FINISHED) but its `ResultPartition` might still exist in 
the `TaskManager`. In that case we can cancel its `ResultPartition` instead of 
cancelling the `Task`.
   
   I think I understand your confusion about how the producer releases its 
partition if it does not receive `ClosePartition` and the consumer exits 
successfully without failover. The key point is that the producer becomes aware 
of the channel going inactive on the consumer side. Once the consumer (TCP 
client) exits and closes the TCP channel on its side, the producer (TCP server) 
becomes aware of the inactive channel within a short time (based on the TCP 
mechanism and netty), then releases all the partitions, and finally closes the 
TCP channel on the server side.
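
   The release-on-inactive flow described above can be sketched in plain Java. 
This is only a simplified model, not Flink's or netty's actual code: the class 
`PartitionReleaseSketch`, the `Producer` type and all of its methods are 
hypothetical names, and channels and partitions are reduced to string ids. In 
netty, the inactive notification would typically arrive through a handler's 
`channelInactive` callback.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model (no netty, no Flink classes); every name here is hypothetical. */
final class PartitionReleaseSketch {

    /** Producer side (TCP server): remembers which partitions each consumer channel reads. */
    static final class Producer {
        private final Map<String, List<String>> partitionsByChannel = new HashMap<>();
        private final List<String> released = new ArrayList<>();

        /** A consumer requested a partition over the given channel. */
        void onPartitionRequest(String channelId, String partitionId) {
            partitionsByChannel.computeIfAbsent(channelId, k -> new ArrayList<>()).add(partitionId);
        }

        /** Called when the transport reports that the consumer's channel became inactive. */
        void onChannelInactive(String channelId) {
            List<String> partitions = partitionsByChannel.remove(channelId);
            if (partitions != null) {
                // Stand-in for releasing the buffers held by each partition.
                released.addAll(partitions);
            }
            // A real server would close its own side of the connection here.
        }

        boolean isReleased(String partitionId) {
            return released.contains(partitionId);
        }

        int releasedCount() {
            return released.size();
        }

        int openChannelCount() {
            return partitionsByChannel.size();
        }
    }

    public static void main(String[] args) {
        Producer producer = new Producer();
        producer.onPartitionRequest("channel-1", "partition-a");
        producer.onPartitionRequest("channel-1", "partition-b");
        producer.onPartitionRequest("channel-2", "partition-c");

        // The consumer behind channel-1 exits; no ClosePartition message was sent.
        producer.onChannelInactive("channel-1");

        System.out.println("released=" + producer.releasedCount()
                + ", open channels=" + producer.openChannelCount());
    }
}
```

   Only the partitions tied to the inactive channel are released; partitions 
read over other channels stay untouched.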
   
   I once encountered a scenario where, after the TCP client was closed, it 
took about two hours for the TCP server to notice and close its side, because 
of a hardware issue and settings. We then added an idle handler in netty to 
detect the closed client in near real time. In the common case, this detection 
is nearly real time. 
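
   The idea behind such an idle check can be sketched in plain Java. This is a 
simplified model under stated assumptions, not the handler that was actually 
added: `IdleChannelSketch`, `IdleDetector` and their methods are hypothetical 
names, the 60-second timeout is arbitrary, and time is passed in explicitly so 
the example stays deterministic. In netty this role is commonly played by 
`IdleStateHandler`, though the comment does not say which handler was used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Plain-Java model of an idle check; every name here is hypothetical. */
final class IdleChannelSketch {

    /** Tracks the last time data was read on each channel and flags silent ones. */
    static final class IdleDetector {
        private final long idleTimeoutMillis;
        private final Map<String, Long> lastReadMillis = new HashMap<>();

        IdleDetector(long idleTimeoutMillis) {
            this.idleTimeoutMillis = idleTimeoutMillis;
        }

        /** Record that data arrived on the channel at the given time. */
        void onRead(String channelId, long nowMillis) {
            lastReadMillis.put(channelId, nowMillis);
        }

        /** Stop tracking and return every channel that has been silent for the full timeout. */
        List<String> closeIdleChannels(long nowMillis) {
            List<String> closed = new ArrayList<>();
            Iterator<Map.Entry<String, Long>> it = lastReadMillis.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Long> entry = it.next();
                if (nowMillis - entry.getValue() >= idleTimeoutMillis) {
                    closed.add(entry.getKey());
                    it.remove();
                }
            }
            return closed;
        }

        boolean isTracked(String channelId) {
            return lastReadMillis.containsKey(channelId);
        }
    }

    public static void main(String[] args) {
        IdleDetector detector = new IdleDetector(60_000L); // assumed timeout
        detector.onRead("channel-1", 0L);
        detector.onRead("channel-2", 0L);
        detector.onRead("channel-2", 50_000L); // channel-2 is still sending data

        // A periodic check at t = 60s treats channel-1 as closed by its client.
        List<String> closed = detector.closeIdleChannels(60_000L);
        System.out.println("closed=" + closed);
    }
}
```

   With a check like this, the server no longer depends only on the TCP stack 
to report a vanished client: a channel that stays silent past the timeout is 
treated as closed, and its partitions can be released.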

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
