kosiew opened a new issue, #19713: URL: https://github.com/apache/datafusion/issues/19713
## Summary

Add benchmark variants that run string-aggregate workloads with TopK disabled (the representative case), in addition to the existing TopK-enabled worst-case benchmarks, so that we can directly compare performance and verify correctness across `Utf8` and `Utf8View` group keys.

## Background

- The current `datafusion/core/benches/topk_aggregate.rs` benchmarks exercise the TopK-enabled code path for string aggregates.
- @haohuaijin [suggested adding non-TopK benchmarks so we can compare TopK and non-TopK behavior and performance](https://github.com/apache/datafusion/pull/19285#discussion_r2668719929) and validate the TopK fix in PR #19285.

## Why this matters

- Allows straightforward measurement of TopK's benefit (or regression) relative to the fallback path.
- Helps validate correctness (e.g., `Utf8View` grouping) under both code paths.
- Provides repeatable benchmarks to include in PRs and discussions.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
