supermem613 commented on PR #7803: URL: https://github.com/apache/incubator-gluten/pull/7803#issuecomment-2459934538
Fundamentally, if the dynamic off-heap sizing feature is on, a user manually setting the off-heap size (which will be overwritten) can only be interpreted in two ways:

1. The user wants to manually control the overall off-heap memory usage (Gluten + any other uses).
2. The user wants to use off-heap memory outside of Gluten, say, for a custom JAR.

For #1, it directly conflicts with the feature. Either one manually sets the off-heap size or one lets the feature do its thing.

For #2, the current implementation of the feature is counter-productive: it sizes the off-heap based solely on Gluten's calculation and uses it while ignoring any other off-heap usage by the user. This scenario is essentially not well addressed right now. I'd argue that we should really separate off-heap tracking in Gluten from Spark off-heap (maybe only when the dynamic sizing feature is on), so that customers can choose to set aside off-heap for their custom JARs, etc., and separately set the memory for Gluten OR use the dynamic sizing feature, without conflicting with the off-heap setting.

Now, going back to your example of "onheap 10G offheap 5G". Fundamentally, the question is what the user is attempting to do. If the user wants 10GB for the JVM and 5GB for Gluten + custom JARs, they would do so with the dynamic sizing feature off (scenario #1 above). If they want 10GB for JVM or Gluten and 5GB for custom JARs, we hit scenario #2 above, so right now they would have to turn off the dynamic sizing feature.

Overall, IMO, this is a strong argument for pursuing next what I described above: separating off-heap usage for Gluten vs. everything else, perhaps only when the dynamic sizing feature is on. This would nicely cover the user scenario of setting aside memory for things other than Gluten.
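To make the three cases concrete, here is a minimal sketch of how the "onheap 10G offheap 5G" example would be budgeted under each one. It is only an illustration of the reasoning in this comment, not Gluten's implementation: `plan_memory`, its parameters, and the pool names are all hypothetical, and the model assumes that with dynamic sizing on, the JVM and Gluten draw from one shared pool sized by the on-heap setting.

```python
def plan_memory(onheap_gb, offheap_gb, dynamic_sizing, separate_tracking=False):
    """Return a mapping of pool name -> size in GB for one executor.

    Hypothetical model only: the names and the shared-pool assumption are
    illustrative and do not mirror Gluten's actual accounting.
    """
    if not dynamic_sizing:
        # Feature off: the user-set off-heap size is honored and shared by
        # Gluten and any other off-heap consumer (scenario #1).
        return {"jvm": onheap_gb, "gluten_and_other_offheap": offheap_gb}

    if not separate_tracking:
        # Feature on, as described for the current behavior: the user-set
        # off-heap size is overwritten, so nothing is reserved for
        # non-Gluten off-heap use (the gap in scenario #2).
        return {"jvm_or_gluten": onheap_gb, "other_offheap": 0}

    # Feature on with the proposed separation: the user-set off-heap size is
    # tracked apart from Gluten and stays available for custom JARs.
    return {"jvm_or_gluten": onheap_gb, "other_offheap": offheap_gb}


if __name__ == "__main__":
    # The "onheap 10G offheap 5G" example under each of the three cases.
    for dynamic, separate in [(False, False), (True, False), (True, True)]:
        print(dynamic, separate, plan_memory(10, 5, dynamic, separate))
```

Only the last case lets the user keep the dynamic sizing feature on and still set aside 5GB for non-Gluten off-heap use, which is the point of the proposed separation.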
