nathanb9 commented on issue #3873: URL: https://github.com/apache/datafusion-comet/issues/3873#issuecomment-4194877691
I drew what this could look like, starting from the point where Spark memory pressure occurs. That is, at step 1 a Spark operator tries to grow, is unable to allocate, and so a spill attempt begins. The spill is called on [every consumer](https://wforget.github.io/apache-spark-internals/memory/TaskMemoryManager/#source-java_1).

```
Spark TaskMemoryManager   CometTaskMemoryManager   "Execution Registry"   TrackConsumersPool     Spillable Native Op    Comet Native Pool
|                         |                        |                      |                      |                      |
|---- 1. spill(size) ---->|                        |                      |                      |                      |
|                         |-- 2. spillMemory ----->|                      |                      |                      |
|                         |  (execution_id,size)   |                      |                      |                      |
|                         |                        |-- 3. reclaim ------->|                      |                      |
|                         |                        |  (size,exclude_cur)  |                      |                      |
|                         |                        |                      |-- 4. reclaimer ----->|                      |
|                         |                        |                      |                      |                      |
|                         |                        |                      |<-- 5. free/shrink ---|                      |
|                         |                        |                      |-- 6. shrink ------------------------------->|
|                         |<-- 7. reclaimed bytes -|<-- 7. reclaimed bytes|                      |                      |
|<-- 8. reclaimed bytes --|                        |                      |                      |                      |
|<---------------------- 10. releaseExecutionMemory(...) -----------------|<-- 9. releaseMemory- |                      |
```

Step 2's `spillMemory` is the JNI interface. Step 4's `reclaimer` is what we would need in DataFusion for this approach.

I'll write a more detailed explanation in a PR, and for now I will cut a ticket in DataFusion.
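To make the call chain in steps 2 through 6 easier to follow, here is a rough toy model in Python. It is only a sketch of the control flow, not Comet or DataFusion code: the class and method names (`SpillableOp`, `ExecutionRegistry`, `spill_memory`, the `reclaim` callback) are hypothetical stand-ins for the boxes in the diagram, and steps 6, 9 and 10 are collapsed into a single counter.

```python
from dataclasses import dataclass, field


@dataclass
class SpillableOp:
    """Hypothetical stand-in for a spillable native operator."""
    name: str
    reserved: int  # bytes this operator currently holds

    def reclaim(self, size: int) -> int:
        # Steps 4-5: the proposed `reclaimer` hook; spill up to `size` bytes.
        freed = min(size, self.reserved)
        self.reserved -= freed
        return freed


@dataclass
class TrackConsumersPool:
    """Toy pool that knows its consumers and can ask them to give memory back."""
    consumers: list = field(default_factory=list)
    released_to_spark: int = 0

    def reclaim(self, size: int, exclude=None) -> int:
        # Step 3: walk the consumers, skipping the one that asked (exclude_cur).
        reclaimed = 0
        for op in self.consumers:
            if reclaimed >= size:
                break
            if op is exclude:
                continue
            freed = op.reclaim(size - reclaimed)
            # Steps 6, 9, 10 (collapsed): shrink and hand the bytes back to Spark.
            self.released_to_spark += freed
            reclaimed += freed
        return reclaimed


class ExecutionRegistry:
    """Maps execution_id to its pool so the JNI entry point can find the plan."""

    def __init__(self):
        self.pools = {}

    def spill_memory(self, execution_id: int, size: int) -> int:
        # Step 2: models the JNI call spillMemory(execution_id, size).
        pool = self.pools.get(execution_id)
        return pool.reclaim(size) if pool is not None else 0


# Usage: Spark asks execution 1 to give back 80 bytes.
sort = SpillableOp("sort", 64)
agg = SpillableOp("agg", 32)
pool = TrackConsumersPool([sort, agg])
registry = ExecutionRegistry()
registry.pools[1] = pool
reclaimed = registry.spill_memory(1, 80)  # 64 from sort, then 16 from agg
```

The value returned by `spill_memory` corresponds to the "reclaimed bytes" that flow back in steps 7 and 8. In the real design the operator would spill to disk before shrinking its reservation; here it only decrements a counter.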
