Blazer-007 commented on code in PR #4092:
URL: https://github.com/apache/gobblin/pull/4092#discussion_r1927140164


##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/yarn/DynamicScalingYarnService.java:
##########
@@ -62,6 +79,71 @@ protected synchronized void requestInitialContainers() {
     requestNewContainersForStaffingDeltas(deltas);
   }
 
+  /**
+   * Handle the completion of a container. A new container will be requested 
to replace the one
+   * that just exited depending on the exit status.
+   * <p>
+   * A container completes in either of the following conditions:
+   * <ol>
+   *   <li> The container gets stopped by the ApplicationMaster. </li>
+   *   <li> Some error happens in the container and caused the container to 
exit </li>
+   *   <li> The container gets preempted by the ResourceManager </li>
+   *   <li> The container gets killed due to some reason, for example, if it 
runs over the allowed amount of virtual or physical memory </li>
+   * </ol>
+   * A replacement container is needed in all except the first case.
+   * </p>
+   */
+  @Override
+  protected void handleContainerCompletion(ContainerStatus containerStatus) {
+    ContainerId completedContainerId = containerStatus.getContainerId();
+    ContainerInfo completedContainerInfo = 
this.containerMap.remove(completedContainerId);
+
+    if (completedContainerInfo == null) {
+      log.warn("Container {} not found in containerMap. This container 
onContainersCompleted() likely called before onContainersAllocated()",

Review Comment:
   **I forgot to add the original comment here, will add in next revision**
   
   But simple reason for this is that callbacks are asynchronously called -- 
https://github.com/apache/gobblin/blob/e584e5b49d24aaed398d082cf1e07bac8c13f453/gobblin-temporal/src/main/java/org/apache/gobblin/temporal/yarn/YarnService.java#L650
 
   
   ```
   //Get the Helix instance name for the completed container. Because callbacks 
are processed asynchronously, we might
       //encounter situations where handleContainerCompletion() is called 
before onContainersAllocated(), resulting in the
       //containerId missing from the containersMap.
       // We use removedContainerID to remember these containers and remove 
them from containerMap later when we call requestTargetNumberOfContainers method
       if (completedContainerInfo == null) {
         removedContainerID.putIfAbsent(containerStatus.getContainerId(), "");
       }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to