cfmcgrady commented on PR #2358:
URL: 
https://github.com/apache/incubator-celeborn/pull/2358#issuecomment-1997903706

   Hi, @CodingCat 
   
   Sorry for the late review.
   
   IMO, the current approaches to "adaptive memory management" do not 
effectively achieve the intended adaptability. Specifically, in cases where the 
shuffle data is of significant size, the maximum push threshold will continue 
to increase until it reaches `numPartitions * sendBufferSizeInBytes` 
(`executorMemory * 0.4` after 
[PR-2388](https://github.com/apache/incubator-celeborn/pull/2388)). This 
increases the risk of encountering OOM errors, as there is currently no 
mechanism in place to lower the maximum push threshold.
   
   I think we need to refactor this feature. cc @waitinfuture 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to