FelixYBW commented on issue #8018: URL: https://github.com/apache/incubator-gluten/issues/8018#issuecomment-2495191927
@zjuwangg Thank you for your investigation! It's really something we'd like to do.

- We should also consider how this works together with RAS.
- We need to predefine the potential memory usage of some operators. For example, Scan or Project in Velox consumes little memory, but aggregate and join need much more. So for a scan + fallback aggregate, we can set a small off-heap + large on-heap. For an offloaded agg + fallback join, we would currently need to set a large off-heap + large on-heap; in that case we should instead fall back the agg, or even the whole stage, and then set a large on-heap memory.
- It's even better if we can specify a different fallback policy when a task is retried, which means some tasks may offload to Velox while others retry with fallback. In theory it's possible, but more complex.

@PHILO-HE did some investigation a while ago and noted that some code changes in vanilla Spark are necessary. Did you note that as well? If so, we may hack the code in Gluten first and then submit a PR to upstream Spark.
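
To make the sizing rule above concrete, here is a minimal sketch in Python. It is not Gluten code: the `Op` type, the `HEAVY_OPERATORS` set, the `plan_stage_memory` function, and the "small"/"large" labels are all hypothetical names chosen for illustration, and the heavy/light classification only encodes the examples given in the comment (scan and project are light, aggregate and join are heavy).

```python
from dataclasses import dataclass, replace

# Hypothetical classification taken from the comment's examples:
# aggregate and join are memory-heavy; everything else (scan, project) is light.
HEAVY_OPERATORS = {"aggregate", "join"}


@dataclass(frozen=True)
class Op:
    name: str
    offloaded: bool  # True: runs in Velox (off-heap); False: falls back to vanilla Spark (on-heap)


def is_heavy(op: Op) -> bool:
    return op.name in HEAVY_OPERATORS


def plan_stage_memory(ops):
    """Return (possibly adjusted ops, memory sizing) for one stage."""
    heavy_native = any(op.offloaded and is_heavy(op) for op in ops)
    heavy_jvm = any(not op.offloaded and is_heavy(op) for op in ops)

    if heavy_native and heavy_jvm:
        # Avoid needing large off-heap AND large on-heap at once:
        # fall back the heavy offloaded operators so that only on-heap must be large.
        ops = [replace(op, offloaded=False) if is_heavy(op) else op for op in ops]
        heavy_native = False

    sizing = {
        "offheap": "large" if heavy_native else "small",
        "onheap": "large" if heavy_jvm else "small",
    }
    return ops, sizing


# Scan offloaded, aggregate fell back: small off-heap, large on-heap.
print(plan_stage_memory([Op("scan", True), Op("aggregate", False)])[1])
# Offloaded agg + fallback join: the agg is fallen back as well.
print(plan_stage_memory([Op("aggregate", True), Op("join", False)]))
```

This sketch falls back only the heavy operators; falling back the whole stage, as the comment also suggests, would mean setting `offloaded=False` on every operator instead.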
