yjshen opened a new pull request #1691:
URL: https://github.com/apache/arrow-datafusion/pull/1691
# Which issue does this PR close?
Closes #1569.
# Rationale for this change
The kinds of requesting memory consumers are quite limited. As shown in
#587, we only have 3 or 4 types of requesting memory consumers (join / sort /
agg / repartition). All other consumers that take non-negligible memory are
considered tracking consumers.
Tracking consumers usually follow a fixed pattern of memory usage: they
claim some memory, use it during execution, and free it when finished. Cases
where their usage grows or shrinks during execution are rare.
Given the potentially large number of tracking consumers and their simple
use case, a simple way to do this kind of tracking is preferable; this PR
therefore proposes `MemTrackingMetrics`.
# What changes are included in this PR?
1. Introduce `MemTrackingMetrics`, which acts similarly to `BaselineMetrics`:
it reports memory usage with `init_mem_used` and frees that memory when it is
dropped.
2. `MemoryManager` no longer stores weak references to any consumers.
Registering memory consumers is simplified as well.
3. Consumers push their memory usage to `MemoryManager`; `MemoryManager` no
longer pulls usage updates from consumers.
4. Use `MemTrackingMetrics` in `SortPreservingMergeStream` and
`SizedRecordBatchStream` to simplify the tracking logic.
5. Rename `AggregatedMetricsSet` to `CompositeMetricsSet` and fix start/end
time aggregation.
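
To illustrate the pattern behind items 1 and 3, here is a minimal, self-contained sketch of push-based tracking with release on drop. It is not the code in this PR: `TrackerTotal` and `TrackedMem` are hypothetical stand-ins for `MemoryManager` and `MemTrackingMetrics`, and the `init_mem_used` signature shown here is an assumption made for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a memory manager. It only keeps a running
/// total that consumers push into, and holds no references to consumers.
#[derive(Debug, Default)]
pub struct TrackerTotal {
    used: AtomicUsize,
}

impl TrackerTotal {
    /// Called by a consumer to report newly claimed bytes.
    pub fn grow(&self, bytes: usize) {
        self.used.fetch_add(bytes, Ordering::SeqCst);
    }

    /// Called by a consumer to report released bytes.
    pub fn shrink(&self, bytes: usize) {
        self.used.fetch_sub(bytes, Ordering::SeqCst);
    }

    /// Total bytes currently reported by all live consumers.
    pub fn used(&self) -> usize {
        self.used.load(Ordering::SeqCst)
    }
}

/// Hypothetical tracking handle: reports usage when told to, and gives the
/// same amount back automatically when it goes out of scope.
pub struct TrackedMem {
    total: Arc<TrackerTotal>,
    bytes: usize,
}

impl TrackedMem {
    pub fn new(total: Arc<TrackerTotal>) -> Self {
        Self { total, bytes: 0 }
    }

    /// Push the claimed size to the shared total (assumed signature).
    pub fn init_mem_used(&mut self, bytes: usize) {
        self.total.grow(bytes);
        self.bytes += bytes;
    }
}

impl Drop for TrackedMem {
    /// Release everything this handle reported, with no explicit call needed.
    fn drop(&mut self) {
        self.total.shrink(self.bytes);
    }
}
```

Because the handle pushes its own updates and cleans up in `Drop`, the shared total in this sketch never needs to hold a weak reference to a consumer or poll it for its current usage.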
# Are there any user-facing changes?
No.