yjshen opened a new pull request #1691:
URL: https://github.com/apache/arrow-datafusion/pull/1691


   # Which issue does this PR close?
   
   Closes #1569.
   
   # Rationale for this change
   
   Only a few kinds of operators act as requesting memory consumers. As shown 
in #587, there are just 3 or 4 types of them (join / sort / agg / repartition). 
All other consumers that use a non-negligible amount of memory are considered 
tracking consumers.
   
   Tracking consumers have a relatively fixed pattern of memory usage: they 
claim some memory, use it during execution, and free it when finished. Cases 
where their memory usage grows or shrinks during execution are rare.
   
   Given the potentially large number of tracking consumers and how simple 
their use case is, a simple method for this kind of tracking is preferable; 
therefore, this PR proposes `MemTrackingMetrics`.
   
   # What changes are included in this PR?
   
   1. Introduce `MemTrackingMetrics`, which acts similarly to 
`BaselineMetrics`: it reports memory usage with `init_mem_used` and frees the 
memory when it is dropped.
   2. `MemoryManager` no longer stores weak references to any consumers. 
Registering memory consumers is simplified as well.
   3. Consumers push their memory usage to `MemoryManager`; the 
`MemoryManager` no longer pulls usage updates from consumers.
   4. Use `MemTrackingMetrics` in `SortPreservingMergeStream` and 
`SizedRecordBatchStream` to simplify the tracking logic.
   5. Rename `AggregatedMetricsSet` to `CompositeMetricsSet` and fix start/end 
time aggregation.
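
   To illustrate items 1 to 3, here is a minimal sketch of the push-and-drop 
pattern. It is not the code in this PR: `ToyMemoryManager` and `ToyMemTracking` 
are hypothetical stand-ins, and the `init_mem_used` signature shown here is an 
assumption. The manager only keeps a running total and holds no references to 
consumers; the tracking handle pushes its usage on `init_mem_used` and gives it 
back when dropped.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the memory manager. It only keeps a running
/// total that consumers push into; it stores no references to consumers.
#[derive(Debug, Default)]
pub struct ToyMemoryManager {
    tracked_bytes: AtomicUsize,
}

impl ToyMemoryManager {
    /// Called by a consumer to report newly claimed memory.
    pub fn grow(&self, bytes: usize) {
        self.tracked_bytes.fetch_add(bytes, Ordering::SeqCst);
    }

    /// Called by a consumer to report released memory.
    pub fn shrink(&self, bytes: usize) {
        self.tracked_bytes.fetch_sub(bytes, Ordering::SeqCst);
    }

    /// Total memory currently reported by all tracking handles.
    pub fn tracked_bytes(&self) -> usize {
        self.tracked_bytes.load(Ordering::SeqCst)
    }
}

/// Hypothetical tracking handle (the signature of `init_mem_used` is an
/// assumption made for this sketch).
pub struct ToyMemTracking {
    manager: Arc<ToyMemoryManager>,
    mem_used: usize,
}

impl ToyMemTracking {
    pub fn new(manager: Arc<ToyMemoryManager>) -> Self {
        Self { manager, mem_used: 0 }
    }

    /// Push the claimed memory to the manager and remember how much this
    /// handle is responsible for.
    pub fn init_mem_used(&mut self, bytes: usize) {
        self.manager.grow(bytes);
        self.mem_used += bytes;
    }
}

impl Drop for ToyMemTracking {
    /// Return everything this handle reported once it goes out of scope.
    fn drop(&mut self) {
        self.manager.shrink(self.mem_used);
    }
}
```

   In this sketch, a stream would own a `ToyMemTracking` value, call 
`init_mem_used` once after building its buffers, and rely on `Drop` to release 
the tracked memory, so no explicit cleanup call is needed at the end of 
execution.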
   
   # Are there any user-facing changes?
   No.
   



