alamb commented on code in PR #17021:
URL: https://github.com/apache/datafusion/pull/17021#discussion_r2279414621


##########
datafusion-cli/README.md:
##########
@@ -30,3 +30,33 @@ DataFusion CLI (`datafusion-cli`) is a small command line utility that runs SQL
 ## Where can I find more information?

 See the [`datafusion-cli` documentation](https://datafusion.apache.org/user-guide/cli/index.html) for further information.
+
+## Memory Profiling
+
+> **Tip:** Memory profiling requires the tracked pool. Start the CLI with `--top-memory-consumers N` (N≥1), or profiling will report no metrics. By default, CLI starts with --top-memory-consumers 5.
+
+Enable memory tracking for the next query and display the report afterwards:
+
+```text
+> \memory_profiling enable
+Memory profiling enabled
+> SELECT v % 100 AS group_key, COUNT(*) AS cnt, SUM(v) AS sum_v FROM generate_series(1,100000) AS t(v) GROUP BY group_key ORDER BY group_key;
+
++-----------+------+----------+
+| group_key | cnt  | sum_v    |
++-----------+------+----------+
+| 0         | 1000 | 50050000 |
+| 1         | 1000 | 49951000 |
+| 2         | 1000 | 49952000 |
+...
+
+\memory_profiling show

Review Comment:
To do that, we might have to change the core of DataFusion (specifically, build the reporting into `EXPLAIN ANALYZE`).

I think it is worth keeping things modular: keep the reporting outside DataFusion's core at first, in `datafusion-cli`, and if it proves to be a popular feature we can then consider moving it into the core.
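
To illustrate the "reporting outside the core" idea, here is a rough, self-contained sketch of a CLI-side profile that ranks consumers by peak memory. Everything in it is hypothetical: `MemoryProfile`, `record`, `top`, and the consumer names are made up for illustration and are not DataFusion's or `datafusion-cli`'s actual API. It only shows that an enable/record/show cycle can live entirely in the front end.

```rust
use std::collections::HashMap;

/// Hypothetical CLI-side profile (not a DataFusion type): keeps the peak
/// number of bytes observed for each named memory consumer.
#[derive(Default)]
struct MemoryProfile {
    enabled: bool,
    peaks: HashMap<String, usize>,
}

impl MemoryProfile {
    /// Record an observed reservation size. Ignored while profiling is
    /// disabled; otherwise the per-consumer peak is kept.
    fn record(&mut self, consumer: &str, bytes: usize) {
        if !self.enabled {
            return;
        }
        let peak = self.peaks.entry(consumer.to_string()).or_insert(0);
        *peak = (*peak).max(bytes);
    }

    /// Return the `n` largest consumers, biggest first (ties broken by name).
    fn top(&self, n: usize) -> Vec<(String, usize)> {
        let mut rows: Vec<(String, usize)> = self
            .peaks
            .iter()
            .map(|(name, bytes)| (name.clone(), *bytes))
            .collect();
        rows.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        rows.truncate(n);
        rows
    }
}

fn main() {
    // Roughly what an "enable" command would toggle in this sketch.
    let mut profile = MemoryProfile { enabled: true, ..Default::default() };

    // Made-up consumer names and sizes, standing in for what a tracked
    // pool might report while a query runs.
    profile.record("aggregate", 20);
    profile.record("sort", 10);
    profile.record("sort", 30);
    profile.record("scan", 5);

    // Roughly what a "show" command would print in this sketch.
    for (name, bytes) in profile.top(2) {
        println!("{name}: {bytes} bytes");
    }
}
```

Because the profile only consumes (name, bytes) observations, a front end could own it without any change to query planning or `EXPLAIN ANALYZE`; moving it into the core later would be a separate decision.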