waitinfuture commented on code in PR #1800:
URL: 
https://github.com/apache/incubator-celeborn/pull/1800#discussion_r1306984985


##########
worker/src/main/java/org/apache/celeborn/service/deploy/worker/memory/MemoryManager.java:
##########
@@ -281,7 +285,10 @@ protected void switchServingState() {
         trimAllListeners();
         break;
       case NONE_PAUSED:
-        logger.info("Trigger action: RESUME PUSH and REPLICATE");
+        long pauseSpendMills = System.currentTimeMillis() - pauseStartTime;
+        logger.info(
+            "Trigger action: RESUME PUSH and REPLICATE, pause push spent: " + 
pauseSpendMills);
+        pausePushDataTime.add(pauseSpendMills);

Review Comment:
   Consider the corner case that a worker enters the pause state but never goes 
out, then  `pausePushDataTime` will never increase. Maybe we can increase 
`pausePushDataTime` every N successive pause states, as well as state changes 
from pause to non_pause.
   
   Also, I think we should consider worker under high load only when the pause 
time exceeds some threshold, instead of whenever it enters pause state, as 
https://github.com/apache/incubator-celeborn/pull/1840 does, cc @pan3793



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to