kosiew opened a new issue, #19713:
URL: https://github.com/apache/datafusion/issues/19713

   
   ## Summary
   Add benchmark variants that run string-aggregate workloads with TopK disabled (the representative case) alongside the existing TopK-enabled worst-case benchmarks, so we can compare performance directly and verify correctness across `Utf8` and `Utf8View` group keys.
   
   ## Background
   - The current `datafusion/core/benches/topk_aggregate.rs` benchmarks 
exercise the TopK-enabled code path for string aggregates.
   - @haohuaijin [suggested adding non-TopK benchmarks](https://github.com/apache/datafusion/pull/19285#discussion_r2668719929) so we can compare TopK and non-TopK behavior and performance, and validate the TopK fix in PR #19285.
   
   ## Why this matters
   - Allows straightforward measurement of TopK's benefit (or regression) 
relative to the fallback path.
   - Helps validate correctness (e.g., `Utf8View` grouping) under both code paths.
   - Provides repeatable benchmarks to include in PRs and discussions.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

