adriangb opened a new issue, #21165: URL: https://github.com/apache/datafusion/issues/21165
I'm opening this epic to track improvements and changes we want to make to our benchmarking setup. I'll start by collecting some relevant issues:

- https://github.com/apache/datafusion/issues/15511
- https://github.com/apache/datafusion/issues/5504
- https://github.com/apache/datafusion/issues/13446
- #21034

I think we should discuss in this issue what we want from our benchmarking setup and use that to guide how we improve it.

## Trackable over time

This is important for gating releases and catching regressions early. Currently we only really run benchmarks in a PR to compare against main, or when we go to update ClickBench. I think we should target Codspeed compatibility for this.

## Can run slow/complex SQL benchmarks

Think ClickBench. This probably rules out criterion, at least for this style of benchmark; see https://github.com/bheisler/criterion.rs/issues/320. We can still use criterion for smaller, faster benchmarks. We also need a harness that supports loading data and can give us both cold and hot numbers.

## Can do a "quick run"

We want to be able to run benchmarks just to verify the results are correct, or as a test during development.
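To make the cold/hot distinction concrete, here is a minimal sketch of what such a harness could measure, using only the standard library. The name `cold_hot` is hypothetical and the closure is a stand-in workload; a real harness would load data and execute SQL instead:

```rust
use std::time::{Duration, Instant};

/// Run `f` once for a "cold" timing (e.g. first query over freshly
/// loaded data), then `iters` more times and report the best "hot"
/// timing. Hypothetical sketch only; a real harness would also handle
/// data loading, warm-up policy, and result verification.
fn cold_hot<F: FnMut()>(mut f: F, iters: u32) -> (Duration, Duration) {
    let start = Instant::now();
    f();
    let cold = start.elapsed();

    let mut hot = Duration::MAX;
    for _ in 0..iters {
        let start = Instant::now();
        f();
        hot = hot.min(start.elapsed());
    }
    (cold, hot)
}

fn main() {
    // Stand-in for query execution.
    let (cold, hot) = cold_hot(
        || {
            let v: u64 = (0..1_000_000u64).sum();
            std::hint::black_box(v);
        },
        5,
    );
    println!("cold: {cold:?}, hot: {hot:?}");
}
```

A "quick run" mode would then simply set `iters` to 0 or 1 and check result correctness rather than timings.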
