RexXiong commented on code in PR #3018:
URL: https://github.com/apache/celeborn/pull/3018#discussion_r1912705474


##########
common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala:
##########
@@ -3858,7 +3858,7 @@ object CelebornConf extends Logging {
       .doc("If direct memory usage is less than this limit, worker will 
resume.")
       .version("0.2.0")
       .doubleConf
-      .createWithDefault(0.7)
+      .createWithDefault(0.3)

Review Comment:
   Maybe we can add a new conf  for pinnedMemoryToResume  and keep exist conf 
for directMemoryRatioToResume



##########
worker/src/main/java/org/apache/celeborn/service/deploy/worker/memory/MemoryManager.java:
##########
@@ -293,6 +297,18 @@ public boolean shouldEvict(boolean 
aggressiveMemoryFileEvictEnabled, double evic
 
   public ServingState currentServingState() {
     long memoryUsage = getMemoryUsage();
+    long allocatedMemory;
+    if (networkMemoryAllocatorPooled) {
+      allocatedMemory = getAllocatedMemory();
+    } else {
+      allocatedMemory = memoryUsage;
+    }
+    // trigger resume
+    // CELEBORN-1792: resume should use pinnedDirectMemory instead of 
usedDirectMemory

Review Comment:
   Although we needn't  change to pause state, it would be better to call trim 
when netty direct memory used above 
pausePushDataThreshold/pauseReplicateThreshold, WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to