geoffreyclaude opened a new pull request, #23003: URL: https://github.com/apache/datafusion/pull/23003
## Which issue does this PR close?

This PR does not close an issue. It adds a first-class benchmark target for an existing `sort-tpch` mode.

## Rationale for this change

`dfbench sort-tpch` already supports TopK queries over input declared as sorted with `--sorted --limit`, but `bench.sh` only exposed the unsorted TopK wrapper as `topk_tpch`. That made the sorted TopK path harder to run from the benchmark bot and easier to miss during performance work.

Adding `topk_sorted_tpch` gives reviewers and contributors a named target for the sorted-input TopK case:

```bash
./benchmarks/bench.sh run topk_sorted_tpch
```

The new target uses `--limit 100` so it is the sorted counterpart to the existing `topk_tpch` benchmark.

## What changes are included in this PR?

- Adds `topk_sorted_tpch` to the benchmark script help text.
- Reuses the existing TPC-H SF1 parquet data setup.
- Adds a `run_topk_sorted_tpch` wrapper around `dfbench sort-tpch --sorted --limit 100`.
- Writes results to `run_topk_sorted_tpch.json`.
- Documents the new benchmark target in `benchmarks/README.md`.

## Are these changes tested?

Validated with:

```bash
bash -n benchmarks/bench.sh
CARGO_COMMAND=echo DATA_DIR=/tmp/df-topk-bench-data RESULTS_NAME=topk_sorted_tpch_smoke ./benchmarks/bench.sh run topk_sorted_tpch
git diff --check
cargo fmt --all
cargo clippy --all-targets --all-features -- -D warnings
```

The smoke run verified that the script dispatches to:

```bash
dfbench sort-tpch --iterations 5 --path ... --sorted --limit 100
```

## Are there any user-facing changes?

No engine or API behavior changes. This only adds a new opt-in benchmark target.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
