Re: [I] External sort failing with non-spillable operators as input (RepartitionExec) [datafusion]

via GitHub Mon, 24 Nov 2025 02:41:02 -0800


Kontinuation commented on issue #17334:
URL: https://github.com/apache/datafusion/issues/17334#issuecomment-3570067327


   I'm also looking forward to the cooperative spilling feature. This is 
important for projects such as 
[Comet](https://github.com/apache/datafusion-comet) to implement [Photon-like 
memory 
management](https://people.eecs.berkeley.edu/~matei/papers/2022/sigmod_photon.pdf)
 and reduce the possibility of allocation failure (see related issue 
https://github.com/apache/datafusion-comet/issues/949). A similar 
DataFusion-based Spark accelerator, [Apache 
Auron](https://github.com/apache/auron), re-implemented memory-intensive 
operators such as sort, aggregation, and join [in its own 
codebase](https://github.com/apache/auron/tree/11164c92c735dcf0204330a3e33621753005f3e9/native-engine/datafusion-ext-plans/src)
 to use [its own memory 
manager](https://github.com/apache/auron/blob/11164c92c735dcf0204330a3e33621753005f3e9/native-engine/auron-memmgr/src/lib.rs)
 and work around the limitations of DataFusion's memory management API.
   
   A bit of history: the initial memory management 
[proposal](https://docs.google.com/document/d/1BT5HH-2sKq-Jxo51PNE6l9NNd_F-FyyYcyC3SKTnkIA/edit)
 and [implementation](https://github.com/apache/datafusion/pull/1526) did 
support cooperative spilling. However, a later 
[simplification](https://github.com/apache/datafusion/pull/4522) removed that 
feature.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

