supermem613 commented on PR #7803:
URL: 
https://github.com/apache/incubator-gluten/pull/7803#issuecomment-2459934538

   Fundamentally, if the dynamic off-heap sizing feature is on, a user 
manually setting the off-heap size (which will be overwritten) can only be 
interpreted in two ways:
   
   (1) The user wants to manually control the overall off-heap memory usage 
(Gluten + any other uses).
   (2) The user wants to use off-heap memory outside of Gluten, say, for a 
custom JAR. 
   
   For #1, it directly conflicts with the feature. Either one manually sets the 
off-heap or one lets the feature do its thing.
   
   For #2, the current implementation of the feature is counter-productive, as 
it sizes the off-heap based solely on Gluten's calculation and uses it while 
ignoring any other off-heap usage by the user. This scenario is essentially not 
well addressed right now. I'd argue that we should really separate off-heap 
tracking for Gluten from Spark off-heap (maybe only when the dynamic sizing 
feature is on), so that customers can choose to set aside off-heap for their 
custom JARs, etc., and separately set the memory for Gluten OR use the dynamic 
sizing feature, without conflicting with the off-heap setting.
   
   Now, going back to your example of "onheap 10G offheap 5G". Fundamentally, 
the question is what the user is attempting to do. If the user wants 10GB 
for JVM and 5GB for Gluten + custom JARs, they would do so with the dynamic 
sizing feature off (scenario #1 above). If they want 10GB for JVM or Gluten and 
5GB for custom JARs, we hit scenario #2 above, so right now they would 
turn off the dynamic sizing feature.
   
   Overall, IMO, this is a strong argument for pursuing what I described above 
as the next step: separating off-heap usage for Gluten vs. everything else, 
perhaps only when the dynamic sizing feature is on. This would nicely cover the 
user scenario of setting aside memory for things other than Gluten.
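
   To make the distinction concrete, here is a rough Python sketch of the three 
cases discussed in this comment: feature off, feature on as it behaves today, 
and feature on with separate tracking. Everything in it is hypothetical: 
`OffHeapPlan`, `plan_off_heap`, and the `separate_tracking` flag are 
illustration-only names, and the dynamically computed Gluten size is simply 
passed in rather than derived the way Gluten actually derives it.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class OffHeapPlan:
    gluten_bytes: int  # off-heap pool that Gluten draws from
    other_bytes: int   # off-heap set aside exclusively for non-Gluten use
    shared: bool       # True when Gluten and other users share gluten_bytes


def plan_off_heap(user_off_heap: int, dynamic_gluten: int,
                  dynamic_sizing: bool,
                  separate_tracking: bool = False) -> OffHeapPlan:
    """Hypothetical model of the scenarios above, not Gluten's implementation.

    user_off_heap:  the off-heap size the user set manually
    dynamic_gluten: the size the dynamic sizing feature would pick (assumed input)
    """
    if not dynamic_sizing:
        # Feature off: the user's setting is a single pool that Gluten and
        # any custom JARs share (scenario #1).
        return OffHeapPlan(user_off_heap, 0, shared=True)
    if separate_tracking:
        # Proposed: Gluten is sized dynamically, while the user's setting is
        # kept as a separate reservation for everything else (scenario #2).
        return OffHeapPlan(dynamic_gluten, user_off_heap, shared=False)
    # Feature on, as described today: the user's setting is overwritten and
    # nothing is reserved for non-Gluten off-heap usage.
    return OffHeapPlan(dynamic_gluten, 0, shared=False)


if __name__ == "__main__":
    # "offheap 5G" from the example; 6 GiB is an arbitrary stand-in for
    # whatever the dynamic sizing calculation would produce.
    for dynamic, separate in [(False, False), (True, False), (True, True)]:
        print(dynamic, separate,
              plan_off_heap(5 * GIB, 6 * GIB, dynamic, separate))
```

   The only point of the sketch is that, with separate tracking, the user's 
off-heap setting stops being overwritten and becomes a reservation for 
non-Gluten usage.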


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

