alamb opened a new issue #898:
URL: https://github.com/apache/arrow-datafusion/issues/898


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   When running DataFusion as part of a Rust program that has other substantial 
uses of memory (for example Buffers in IOx) we would like to know how much 
memory is allocated when running DataFusion plans so we can:
   1. Allocate sufficient Memory to DataFusion / limit other users of memory
   2. Start tuning / working to limit memory usage by DataFusion (e.g. #587 )
   
   **Describe the solution you'd like**
   A counter (perhaps `AtomicUsize` tied to the ExecutionContext somehow) that 
tracks, across all DataFusion plans running in that context, how much memory 
has been allocated. This counter's value should be available both during 
plan execution and after it has completed.
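
   A minimal sketch of what such a counter might look like, assuming a hypothetical `MemoryCounter` type shared through an `Arc` by every plan in a context. Neither the type nor its method names exist in DataFusion; they are only illustrative, and threads stand in for concurrently running plans:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical context-wide counter (an assumption for illustration,
/// not an existing DataFusion type).
#[derive(Debug, Default)]
pub struct MemoryCounter {
    allocated: AtomicUsize,
}

impl MemoryCounter {
    /// Record that `bytes` were allocated.
    pub fn grow(&self, bytes: usize) {
        self.allocated.fetch_add(bytes, Ordering::Relaxed);
    }

    /// Record that `bytes` were released. Callers must only release
    /// what they previously recorded, since `fetch_sub` wraps on underflow.
    pub fn shrink(&self, bytes: usize) {
        self.allocated.fetch_sub(bytes, Ordering::Relaxed);
    }

    /// Current total; readable at any time, during or after execution.
    pub fn allocated(&self) -> usize {
        self.allocated.load(Ordering::Relaxed)
    }
}

/// Simulates four "plans" sharing one counter. Returns the total while
/// all allocations are held and the total after they are released.
pub fn demo_shared_counter() -> (usize, usize) {
    let counter = Arc::new(MemoryCounter::default());

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || c.grow(1024))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let while_held = counter.allocated();

    for _ in 0..4 {
        counter.shrink(1024);
    }
    (while_held, counter.allocated())
}
```

   `Ordering::Relaxed` is used here on the assumption that the counter is purely informational and does not synchronize any other data.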
   
   The counter should include:
   1. Memory allocated in RecordBatches *created* by DataFusion operators
   2. Memory used in intermediate buffers (e.g. HashTables, Sort buffers, etc.); 
this should be measured as "capacity" rather than "size" to reflect the actual 
heap usage of the program
   3. Decremented when memory is deallocated
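
   The "capacity" and "decremented when deallocated" points above could be sketched roughly as follows. `TrackedBuffer` is a hypothetical wrapper invented for this example (it is not a DataFusion or Arrow type); it reports the `Vec` capacity in bytes to a shared `AtomicUsize` and subtracts it again in `Drop`:

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical buffer that reports its heap capacity to a shared counter.
pub struct TrackedBuffer<T> {
    data: Vec<T>,
    counter: Arc<AtomicUsize>,
    /// Bytes currently reported to the counter on behalf of this buffer.
    reported: usize,
}

impl<T> TrackedBuffer<T> {
    pub fn with_capacity(capacity: usize, counter: Arc<AtomicUsize>) -> Self {
        let data = Vec::with_capacity(capacity);
        // Report capacity, not length: capacity is what occupies the heap.
        let reported = data.capacity() * size_of::<T>();
        counter.fetch_add(reported, Ordering::Relaxed);
        Self { data, counter, reported }
    }

    pub fn push(&mut self, value: T) {
        self.data.push(value);
        // A push can only grow the capacity, so report any increase.
        let now = self.data.capacity() * size_of::<T>();
        if now > self.reported {
            self.counter.fetch_add(now - self.reported, Ordering::Relaxed);
            self.reported = now;
        }
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }
}

impl<T> Drop for TrackedBuffer<T> {
    fn drop(&mut self) {
        // Decrement by exactly what this buffer reported.
        self.counter.fetch_sub(self.reported, Ordering::Relaxed);
    }
}

/// Returns (element count, whether the counter covered the full capacity
/// while the buffer was alive, counter value after the buffer was dropped).
pub fn demo_tracked_buffer() -> (usize, bool, usize) {
    let counter = Arc::new(AtomicUsize::new(0));
    let (len, covered) = {
        let mut buf = TrackedBuffer::<u64>::with_capacity(16, Arc::clone(&counter));
        buf.push(1);
        let covered = counter.load(Ordering::Relaxed) >= 16 * size_of::<u64>();
        (buf.len(), covered)
    }; // `buf` is dropped here, which decrements the counter.
    (len, covered, counter.load(Ordering::Relaxed))
}
```

   In this sketch a buffer holding a single `u64` still accounts for all 16 reserved slots, which is the distinction between "capacity" and "size" drawn above.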
   
   Initially, a counter that gets the major allocations of memory would be 
ideal.
   
   **Describe alternatives you've considered**
   Implement a per-operator allocation tracking scheme (perhaps based on 
metrics, see #866 and https://github.com/apache/arrow-datafusion/issues/679). 
   
   I think per-operator tracking of memory is also valuable and will file a 
separate ticket for that capability.
   
   **Additional context**
   This is likely a pre-requisite for actually limiting memory usage for 
DataFusion plans as described in 
https://github.com/apache/arrow-datafusion/issues/587
   



