JanKaul commented on PR #22043:
URL: https://github.com/apache/datafusion/pull/22043#issuecomment-4394157165

   If I use this branch to query a larger-than-memory dataset, I get:
   ```
   Error: Resources exhausted: Additional allocation failed for ExternalSorterMerge[3] with top memory consumers (across reservations) as:
     ExternalSorter[6]#13(can spill: true) consumed 597.0 MB, peak 597.0 MB,
     ExternalSorter[12]#138(can spill: true) consumed 597.0 MB, peak 597.0 MB,
     ExternalSorter[2]#11(can spill: true) consumed 596.6 MB, peak 596.6 MB.
   Error: Failed to allocate additional 192.0 KB for ExternalSorterMerge[3] with 277.7 MB already allocated for this reservation - 2.2 KB remain available for the total memory pool: greedy(used: 8.0 GB, pool_size: 8.0 GB)
   ```
   With vanilla DataFusion, the ExternalSorter itself fails to allocate memory, 
so this branch does appear to solve memory reclamation within a single 
operator. However, the next operator, ExternalSorterMerge, now fails instead, 
so this solution doesn't handle cross-operator reclamation.
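
To make the failure mode concrete, here is a minimal toy model of a first-come, first-served pool. It is only a sketch: `ToyGreedyPool` and `merge_after_sorters_fill_pool` are hypothetical names, the byte counts are made up, and none of this is DataFusion's actual pool code. It shows how a late consumer fails even though earlier consumers could have spilled.

```rust
/// Toy first-come, first-served pool. Hypothetical: this is not DataFusion's
/// pool implementation, only a model of the behavior seen in the log above.
struct ToyGreedyPool {
    pool_size: usize,
    used: usize,
}

impl ToyGreedyPool {
    fn new(pool_size: usize) -> Self {
        Self { pool_size, used: 0 }
    }

    /// Grants the request only if it fits in what is left. It never asks
    /// other consumers to give memory back.
    fn try_grow(&mut self, bytes: usize) -> Result<(), String> {
        let available = self.pool_size - self.used;
        if bytes > available {
            return Err(format!(
                "failed to allocate {} bytes, {} bytes remain available",
                bytes, available
            ));
        }
        self.used += bytes;
        Ok(())
    }
}

/// Two sorters fill the pool first, then the merge step asks for more.
fn merge_after_sorters_fill_pool() -> Result<(), String> {
    let mut pool = ToyGreedyPool::new(1_000);
    pool.try_grow(600)?; // first sorter
    pool.try_grow(398)?; // second sorter
    pool.try_grow(192) // merge step: only 2 bytes are left, so this fails
}
```

In this model the merge request fails purely because of ordering: the pool has no way to tell the spillable sorters to release memory on behalf of another operator.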
   
   I think we need a hierarchical design with a MemoryPool and Reclaimer tree 
so that we have full control over reclamation across operators. I think the 
Velox design would be a good model.
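
As a rough illustration of what such a tree could look like, here is a small sketch. All names (`PoolNode`, `PoolTree`, `example_tree`) and numbers are hypothetical, spilling is modeled as simply decrementing a counter, and this is neither Velox's nor DataFusion's API. The idea is only that a failed allocation walks the tree and asks other spillable consumers to free memory before giving up.

```rust
/// One node of a hypothetical pool tree. Inner nodes group operators,
/// leaves hold the bytes reserved by a single operator.
struct PoolNode {
    name: &'static str,
    used: usize,     // bytes reserved by this node itself (0 for groups)
    can_spill: bool, // whether reclaim may free this node's bytes
    children: Vec<PoolNode>,
}

impl PoolNode {
    fn leaf(name: &'static str, used: usize, can_spill: bool) -> Self {
        Self { name, used, can_spill, children: Vec::new() }
    }

    fn group(name: &'static str, children: Vec<PoolNode>) -> Self {
        Self { name, used: 0, can_spill: false, children }
    }

    /// Total bytes reserved in this subtree.
    fn total_used(&self) -> usize {
        self.used + self.children.iter().map(PoolNode::total_used).sum::<usize>()
    }

    /// Asks this subtree to free up to `target` bytes, skipping the node
    /// that made the request. Returns the number of bytes freed.
    fn reclaim(&mut self, target: usize, requester: &str) -> usize {
        let mut freed = 0;
        if self.can_spill && self.name != requester {
            // Stand-in for spilling to disk: just release the bytes.
            let take = self.used.min(target);
            self.used -= take;
            freed += take;
        }
        for child in &mut self.children {
            if freed >= target {
                break;
            }
            freed += child.reclaim(target - freed, requester);
        }
        freed
    }

    /// Adds `bytes` to the node called `name`; returns false if not found.
    fn add_to(&mut self, name: &str, bytes: usize) -> bool {
        if self.name == name {
            self.used += bytes;
            return true;
        }
        self.children.iter_mut().any(|c| c.add_to(name, bytes))
    }
}

/// Root of the tree plus the overall memory limit.
struct PoolTree {
    capacity: usize,
    root: PoolNode,
}

impl PoolTree {
    /// Tries to reserve `bytes` for `requester`, reclaiming from other
    /// spillable nodes first when the pool is short.
    fn try_grow(&mut self, requester: &str, bytes: usize) -> Result<(), String> {
        let available = self.capacity.saturating_sub(self.root.total_used());
        if bytes > available {
            let shortfall = bytes - available;
            let freed = self.root.reclaim(shortfall, requester);
            if freed < shortfall {
                return Err(format!("could only reclaim {} of {} bytes", freed, shortfall));
            }
        }
        if self.root.add_to(requester, bytes) {
            Ok(())
        } else {
            Err(format!("unknown consumer {}", requester))
        }
    }
}

/// Two full, spillable sorters and an empty merge step under one query.
fn example_tree() -> PoolTree {
    PoolTree {
        capacity: 1_000,
        root: PoolNode::group("query", vec![
            PoolNode::group("sort_stage", vec![
                PoolNode::leaf("sorter_a", 600, true),
                PoolNode::leaf("sorter_b", 398, true),
            ]),
            PoolNode::leaf("merge", 0, false),
        ]),
    }
}
```

With this tree, `example_tree().try_grow("merge", 192)` succeeds because the shortfall is reclaimed from `sorter_a`, whereas a flat first-come pool would reject the request. A real design would also need a reclaim policy (which consumer to pick, how much to free) and asynchronous spilling, which this sketch leaves out.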


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

