metaswirl commented on a change in pull request #18169:
URL: https://github.com/apache/flink/pull/18169#discussion_r783922862



##########
File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManagerDriver.java
##########
@@ -684,6 +694,7 @@ public void onGetContainerStatusError(ContainerId containerId, Throwable throwab
 
         @Override
         public void onStopContainerError(ContainerId containerId, Throwable throwable) {
+            trackerOfReleasedResources.arriveAndDeregister();

Review comment:
       Not sure that I follow. When we start the container release process, we 
count up (register). When the release completes successfully, we count down 
(deregister). The same happens when the release fails 
(onStopContainerError). If we start a new release process, we still count up 
at the start and count down at the end.
   
   Are you saying that scenarios exist where we count up but never count down, 
or vice versa? That can only happen if the callback is never called for some 
reason, possibly due to a failure in the YARN clients. Even in this (unlikely?) 
case, the containers will be killed shortly afterwards anyway (that is, if the 
shutdown procedure is initiated via YARN's kill application command).
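
   To make the counting argument concrete, here is a minimal sketch. It assumes 
that trackerOfReleasedResources is a java.util.concurrent.Phaser, which the 
arriveAndDeregister() call suggests but the hunk above does not show. The class 
and method names are hypothetical and are not taken from the PR.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the tracking logic discussed above; not the PR's code.
final class ReleaseTrackerSketch {
    // One party is registered up front for the owner, so the phaser does not
    // terminate when the number of in-flight releases drops back to zero.
    private final Phaser tracker = new Phaser(1);

    // Count up: called when a container release is started.
    void onReleaseStarted() {
        tracker.register();
    }

    // Count down: called from the success callback and from the error callback.
    void onReleaseFinished() {
        tracker.arriveAndDeregister();
    }

    // Releases that were started but have not reported back yet.
    int pendingReleases() {
        return tracker.getRegisteredParties() - 1;
    }

    // Waits until every started release has reported back. Returns false on
    // timeout or interruption, e.g. if a callback is never invoked.
    // Intended to be called once, during shutdown.
    boolean awaitAllReleased(long timeout, TimeUnit unit) {
        int phase = tracker.arrive(); // the owner's own arrival
        try {
            tracker.awaitAdvanceInterruptibly(phase, timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ReleaseTrackerSketch tracker = new ReleaseTrackerSketch();
        tracker.onReleaseStarted();   // release requested
        tracker.onReleaseFinished();  // success or error callback fired
        System.out.println("pending: " + tracker.pendingReleases());
        System.out.println("all released: " + tracker.awaitAllReleased(1, TimeUnit.SECONDS));
    }
}
```

   In this sketch the count stays balanced as long as every started release 
ends in exactly one callback. The only way awaitAllReleased can time out is the 
lost-callback case described above.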




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

