zhuzhurk commented on issue #9058: [FLINK-13166] Add support for batch slot 
requests to SlotPoolImpl
URL: https://github.com/apache/flink/pull/9058#issuecomment-510007140
 
 
   Hi Xingtong, I think you are right that this improvement cannot handle the 
case you describe. However, fine-grained recovery can work as a fallback. 
It uses re-scheduling as a retry for resources. In this way, B will eventually 
be assigned the resources that are released from A and returned to the RM.
   As Stephan mentioned, relying on failover can be annoying to Flink users. 
Till's PR aims to improve this by reducing the task failovers caused by slot 
allocation timeouts. It works for most cases, although not all (such as the 
one you mentioned).
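   As a rough illustration of the fallback described above, here is a toy 
Python model, not Flink's actual scheduler or SlotPool code. It assumes a 
resource manager that only counts free slots, a task A that holds slots until 
it finishes, and a task B whose timed-out request is simply retried on the 
next re-scheduling attempt. The names `ToyResourceManager` and 
`run_with_rescheduling`, and all parameters, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for the RM: it only tracks a count of free slots."""
    free_slots: int

    def try_allocate(self, needed: int) -> bool:
        # Grant the request only if enough slots are free right now.
        if needed <= self.free_slots:
            self.free_slots -= needed
            return True
        return False

    def release(self, count: int) -> None:
        # Slots returned by a finished task become available again.
        self.free_slots += count


def run_with_rescheduling(rm, a_slots, b_slots, a_finishes_at_attempt, max_attempts):
    """Return the attempt number on which B gets its slots, or None if it never does."""
    if not rm.try_allocate(a_slots):
        raise ValueError("A must fit into the free slots for this scenario")
    for attempt in range(1, max_attempts + 1):
        if attempt == a_finishes_at_attempt:
            # A finishes; its slots are released and returned to the RM.
            rm.release(a_slots)
        if rm.try_allocate(b_slots):
            return attempt
        # Otherwise B's request "times out": B fails over and is re-scheduled,
        # which acts as a retry on the next loop iteration.
    return None
```

   For example, with two slots in total, `run_with_rescheduling(ToyResourceManager(2), 2, 2, 3, 5)` 
lets B fail twice and then succeed on the third attempt, once A's slots are 
back. Each failed attempt in this model corresponds to a user-visible 
failover, which is the cost the PR tries to reduce.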
   
