kosiew opened a new pull request, #16926: URL: https://github.com/apache/datafusion/pull/16926
## Which issue does this PR close?

- Closes #16984

## Rationale for this change

This PR introduces a standardized way to inspect and debug memory usage across various components of the execution engine. The `ExplainMemory` trait provides a human-readable summary of memory usage, which is particularly helpful for debugging and monitoring purposes. This also lays the groundwork for improving visibility into memory consumption patterns in complex queries.

## What changes are included in this PR?

- Introduced a new module `memory_report` implementing the `ExplainMemory` trait.
- Blanket implementation of `ExplainMemory` for any `Accumulator` using its `size()` method.
- Specific implementation of `ExplainMemory` for `MemoryReservation`.
- A helper function `report_top_consumers` that attempts to downcast a memory pool and extract consumer memory usage stats.
- Unit tests verifying the behavior of the `ExplainMemory` trait.
- Implemented `ExplainMemory` for `GroupedHashAggregateStream` to provide detailed insight into memory usage of group aggregates.
- Enabled `lz4` and `zstd` features in `datafusion/physical-plan` to allow test coverage of compression-related functionality.

## Are these changes tested?

Yes:

- Added unit tests for both `MemoryReservation` and `Accumulator` implementations of `ExplainMemory`.
- Ensured the `GroupedHashAggregateStream`'s memory explanation logic compiles and integrates cleanly.
- Existing and new tests validate the memory reporting logic.

## Are there any user-facing changes?

Yes:

- Users can now call `.explain_memory()` on memory reservations and accumulators to obtain human-readable memory usage summaries.
- Enhanced memory reporting visibility for `GroupedHashAggregateStream`, aiding debugging and optimization efforts.
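For readers skimming this notification, the shape of the change described above can be sketched roughly as follows. This is a standalone illustration, not the code from the PR: the `Accumulator` trait and `MemoryReservation` struct here are simplified, hypothetical stand-ins for the DataFusion types, and the `String` return type and the output format of `explain_memory()` are assumptions.

```rust
/// Hypothetical stand-in for DataFusion's `Accumulator`; only `size()` is modelled.
trait Accumulator {
    /// Approximate number of bytes held by this accumulator.
    fn size(&self) -> usize;
}

/// Sketch of the trait described in the PR. The real signature may differ
/// (for example, it may return a `Result`).
trait ExplainMemory {
    fn explain_memory(&self) -> String;
}

/// Blanket implementation: anything that is an `Accumulator` gets a
/// summary derived from its `size()` method.
impl<T: Accumulator + ?Sized> ExplainMemory for T {
    fn explain_memory(&self) -> String {
        format!("Accumulator(size={} bytes)", self.size())
    }
}

/// Hypothetical, simplified stand-in for `MemoryReservation`.
struct MemoryReservation {
    consumer: String,
    size: usize,
}

/// Specific implementation for the reservation stand-in. It does not
/// implement `Accumulator`, so it does not overlap with the blanket impl.
impl ExplainMemory for MemoryReservation {
    fn explain_memory(&self) -> String {
        format!(
            "Reservation(consumer={}, size={} bytes)",
            self.consumer, self.size
        )
    }
}

/// Toy accumulator used only for this illustration.
struct SumAccumulator {
    buffered: Vec<i64>,
}

impl Accumulator for SumAccumulator {
    fn size(&self) -> usize {
        // Simplified: counts only the buffered elements, not struct overhead
        // or spare capacity.
        self.buffered.len() * std::mem::size_of::<i64>()
    }
}

fn main() {
    let acc = SumAccumulator { buffered: vec![1, 2, 3] };
    let reservation = MemoryReservation {
        consumer: "GroupedHashAggregateStream".to_string(),
        size: 4096,
    };
    println!("{}", acc.explain_memory());
    println!("{}", reservation.explain_memory());
}
```

The blanket impl means new accumulators pick up `explain_memory()` without extra code, while types that need richer output (such as a reservation or an aggregate stream) can provide their own implementation.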
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org