Re: [I] [EPIC] Benchmark improvements [datafusion]

via GitHub Thu, 26 Mar 2026 12:15:26 -0700


wjones127 commented on issue #21165:
URL: https://github.com/apache/datafusion/issues/21165#issuecomment-4137574161


   > My proposal would be that (cost permitting) we run benchmarks on every 
merge to main and post a comment in the PR if they regressed (so 1 run per PR 
vs. every commit) or at the very least we run them on RC branches comparing to 
the previous release.
   
   One possible intermediate cadence is a nightly run of the benchmarks (once 
per 24-hour period). That would be less frequent than running on every merge to 
main, but would catch regressions much earlier than testing only on RC branches.
   
   > We don't really have _any_ benchmarks for memory use and there have been 
multiple memory use regressions.
   
   +1 on this. divan's allocation profiling support looks really promising. 
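   
   To make the idea of allocation tracking concrete, here is a minimal 
std-only sketch of what such a measurement could look like. It is not divan's 
API and not anything that exists in DataFusion: `TrackingAlloc`, `AllocStats`, 
and `measure` are hypothetical names used only for illustration, and the 
counters are process-wide, so allocations from other threads are included.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counters: bytes currently live, highest live value seen,
// and cumulative bytes ever allocated.
static LIVE: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);
static TOTAL: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper around the system allocator that records usage.
pub struct TrackingAlloc;

unsafe impl GlobalAlloc for TrackingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            TOTAL.fetch_add(layout.size(), Ordering::Relaxed);
            let live = LIVE.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
            PEAK.fetch_max(live, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: TrackingAlloc = TrackingAlloc;

/// Allocation statistics for one measured closure.
pub struct AllocStats {
    /// Cumulative bytes allocated while the closure ran.
    pub total_bytes: usize,
    /// Peak live bytes above the level observed when the closure started.
    pub peak_bytes: usize,
}

/// Runs `f` and reports how much it allocated (process-wide, approximate).
pub fn measure<T>(f: impl FnOnce() -> T) -> (T, AllocStats) {
    let start_live = LIVE.load(Ordering::Relaxed);
    let start_total = TOTAL.load(Ordering::Relaxed);
    // Reset the peak so it reflects only this measurement window.
    PEAK.store(start_live, Ordering::Relaxed);
    let out = f();
    let stats = AllocStats {
        total_bytes: TOTAL.load(Ordering::Relaxed) - start_total,
        peak_bytes: PEAK.load(Ordering::Relaxed).saturating_sub(start_live),
    };
    (out, stats)
}
```

   A benchmark harness could wrap a query in `measure` and record 
`peak_bytes` next to the runtime, so a memory regression shows up as a number 
that can be compared between runs.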
   
   In addition to allocation tracking, it might also be good to track the I/O 
performed by queries. That's not necessarily something you would care about as 
much as runtime or peak memory use, but it helps identify where a runtime 
regression came from.
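
   As a rough illustration of I/O tracking, one option is to wrap a reader and 
count the bytes and read calls that pass through it. This is only a sketch 
under assumptions: `CountingReader` is a hypothetical name, and real queries 
may read through object stores or async APIs that would need a different 
hook.

```rust
use std::io::{self, Read};

/// Hypothetical wrapper that counts bytes and successful `read` calls.
pub struct CountingReader<R> {
    inner: R,
    bytes_read: u64,
    read_calls: u64,
}

impl<R: Read> CountingReader<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, bytes_read: 0, read_calls: 0 }
    }

    /// Total bytes returned by the inner reader so far.
    pub fn bytes_read(&self) -> u64 {
        self.bytes_read
    }

    /// Number of `read` calls that completed without an error.
    pub fn read_calls(&self) -> u64 {
        self.read_calls
    }
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Errors return early and are not counted.
        let n = self.inner.read(buf)?;
        self.bytes_read += n as u64;
        self.read_calls += 1;
        Ok(n)
    }
}
```

   Reporting these counters alongside runtime would let a regression be 
attributed to extra I/O (more bytes or more calls) rather than to extra CPU 
work.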


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


