abhishekmjain commented on code in PR #4092:
URL: https://github.com/apache/gobblin/pull/4092#discussion_r1940501751


##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/yarn/DynamicScalingYarnService.java:
##########
@@ -232,7 +237,12 @@ private synchronized void 
handleContainerExitedWithOOM(ContainerId completedCont
 
     // Request a replacement container
     int currContainerMemoryMbs = 
workerProfile.getConfig().getInt(GobblinYarnConfigurationKeys.CONTAINER_MEMORY_MBS_KEY);
-    int newContainerMemoryMbs = currContainerMemoryMbs * 2; //TODO: make it 
configurable or auto-tunable
+    int newContainerMemoryMbs = currContainerMemoryMbs * 
DEFAULT_REPLACEMENT_CONTAINER_MEMORY_MULTIPLIER;
+    if (newContainerMemoryMbs > MAX_REPLACEMENT_CONTAINER_MEMORY_MBS) {
+      log.warn("Expected replacement container memory exceeds the maximum 
allowed memory {}. Not requesting a replacement container.",
+          MAX_REPLACEMENT_CONTAINER_MEMORY_MBS);
+      return;

Review Comment:
   We have two options here:
   1. to create a replacement container with same memory as the old container
   2. return without creating any container.
   
   I think 2 here would help us in failing the flow fast, whereas 1 would help 
resolve any intermittent OOM issues (which is unlikely). We have opted for 2 
here, but may need to update this in future based on the observations.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to