ariel-miculas commented on issue #22526:
URL: https://github.com/apache/datafusion/issues/22526#issuecomment-4621836931

   > I think this will be easier to do and would also be a more general 
solution -- any operator that emits record batches with shared underlying 
buffers would be handled. I can open a draft PR doing this in hash join - we 
can discuss more there.
   
   One problem with this approach is that the first sliced record batch would 
still report the memory of the entire original large record batch (the one 
being sliced).
   
   It could lead to the following situation (observed in practice):
   * the hash aggregation hits the OOM case, creates the large `RecordBatch` 
(which will be sliced), and releases its memory reservation
   * the downstream operator consumes the first batch, which is small because 
it is the result of a `slice`, but which reports the memory of all the 
underlying Arrow buffers, i.e. the large value returned by 
`RecordBatch::get_array_memory_size`
   * the reservation for this batch fails: even though the hash aggregation 
has, in theory, released its memory reservation, the new reservation for this 
`RecordBatch` could require a bit more memory than was released, so it fails
   
   Conclusion: the downstream operator cannot even reserve memory for a single 
batch, resulting in spilling for every batch.
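
To make the scenario concrete, here is a minimal sketch in plain Rust that models it with standard-library types only. It is not DataFusion or Arrow code: `SlicedBatch`, `Pool`, `slice_batches` and `first_batch_reservation` are hypothetical names invented for this illustration, and the byte counts are made up. `whole_buffer_size` stands in for accounting that counts the entire shared buffer, while `sliced_size` counts only the bytes visible through the slice.

```rust
use std::sync::Arc;

/// Toy stand-in for a sliced batch: a window over a shared buffer.
/// Hypothetical type for illustration; this is not Arrow's `RecordBatch`.
struct SlicedBatch {
    buffer: Arc<Vec<u8>>,
    offset: usize,
    len: usize,
}

impl SlicedBatch {
    /// The bytes this slice can actually see.
    fn bytes(&self) -> &[u8] {
        &self.buffer[self.offset..self.offset + self.len]
    }

    /// Accounting that counts the whole shared buffer,
    /// no matter how small the slice is.
    fn whole_buffer_size(&self) -> usize {
        self.buffer.len()
    }

    /// Accounting that counts only the sliced region.
    fn sliced_size(&self) -> usize {
        self.bytes().len()
    }
}

/// Toy memory pool with a fixed limit (hypothetical, not DataFusion's pool).
struct Pool {
    limit: usize,
    used: usize,
}

impl Pool {
    /// Returns true and records the bytes if they fit under the limit.
    fn try_grow(&mut self, bytes: usize) -> bool {
        if self.used + bytes > self.limit {
            return false;
        }
        self.used += bytes;
        true
    }
}

/// Split one large buffer into zero-copy slices of at most `batch_size` bytes.
/// Assumes `batch_size > 0`.
fn slice_batches(buffer: Arc<Vec<u8>>, batch_size: usize) -> Vec<SlicedBatch> {
    (0..buffer.len())
        .step_by(batch_size)
        .map(|offset| SlicedBatch {
            buffer: Arc::clone(&buffer),
            offset,
            len: batch_size.min(buffer.len() - offset),
        })
        .collect()
}

/// Try to reserve memory for the first slice under both accounting schemes.
/// Returns (ok with whole-buffer accounting, ok with sliced accounting).
/// Assumes `output_bytes > 0`.
fn first_batch_reservation(
    limit: usize,
    output_bytes: usize,
    batch_size: usize,
) -> (bool, bool) {
    let batches = slice_batches(Arc::new(vec![0u8; output_bytes]), batch_size);
    let first = &batches[0];

    // The upstream operator has already released its reservation,
    // so the pool is empty when the downstream operator asks.
    let mut pool = Pool { limit, used: 0 };
    let whole_ok = pool.try_grow(first.whole_buffer_size());

    let mut pool = Pool { limit, used: 0 };
    let sliced_ok = pool.try_grow(first.sliced_size());

    (whole_ok, sliced_ok)
}
```

For example, `first_batch_reservation(1024, 1100, 100)` models a 1024-byte pool and an 1100-byte output cut into 100-byte slices: the reservation is refused when the first slice is charged for the whole buffer, and accepted when it is charged only for its own 100 bytes.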
   
   
   It would be nice if we could use something like:
   
https://github.com/apache/datafusion/blob/e7f7fa9929fdcca01df8c90023f961cdcdb217de/datafusion/physical-plan/src/spill/spill_manager.rs#L214
   (but without breaking the functionality of the other operators)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

