gruuya commented on issue #5504: URL: https://github.com/apache/arrow-datafusion/issues/5504#issuecomment-2021563143
Hey @korowa, thanks for your observations! Indeed, my own experience suggests a slight bias against the PR results for some reason: a couple of queries are typically reported with a 1.05-1.3x slowdown (even when nothing has changed), so I'd ignore anything in that range atm (not sure it's worth increasing the 5% threshold though). That said, I think the current setup is sufficient to catch larger regressions for now.

I also don't think increasing the number of iterations / sleeping between them would be good enough on its own, since we'd trade that against the increased longitudinal performance variance of the shared GitHub runners, but I guess it might be worth a try. The longer-term solution, and the next step that makes sense to me, would be to run these on a dedicated runner.

Following that, I see the list of improvements as:
- (optional) simplify CI benchmark development: https://github.com/apache/arrow-datafusion/issues/9638
- add more benchmarks (a selection of ClickBench + TPC-DS queries)
- track benchmarks over time, through a similar job triggered by merge commits to main (fwiw, I now prefer Bencher to conbench, as it seems simpler to set up/maintain)
- (optional) re-base the benchmarks on criterion.rs, which provides a neat, standardized way of running/analyzing/collecting these stats in an (as of recently) somewhat more user-friendly manner: https://www.tweag.io/blog/2022-03-03-criterion-rs/ (see the sketch after this list)
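
To make the criterion.rs idea a bit more concrete, here's a minimal sketch of what one such benchmark could look like. This is just an illustration, not a proposal for the actual harness: the table name, file path, and query are hypothetical placeholders, and it assumes `criterion`, `datafusion`, and `tokio` as dev-dependencies plus a `[[bench]]` target with `harness = false` in `Cargo.toml`.

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use datafusion::prelude::*;
use tokio::runtime::Runtime;

// Hypothetical example: benchmark one aggregation query end-to-end
// (planning + execution) over a local Parquet file.
fn bench_aggregate(c: &mut Criterion) {
    let rt = Runtime::new().unwrap();
    let ctx = SessionContext::new();

    // Register the table once, outside the measured loop, so setup
    // cost doesn't pollute the per-iteration timings.
    rt.block_on(ctx.register_parquet(
        "hits",                          // placeholder table name
        "benchmarks/data/hits.parquet",  // placeholder path
        ParquetReadOptions::default(),
    ))
    .unwrap();

    c.bench_function("clickbench_q0", |b| {
        b.iter(|| {
            // Each iteration plans and executes the query to completion.
            rt.block_on(async {
                ctx.sql("SELECT COUNT(*) FROM hits")
                    .await
                    .unwrap()
                    .collect()
                    .await
                    .unwrap()
            })
        })
    });
}

criterion_group!(benches, bench_aggregate);
criterion_main!(benches);
```

One nice property here: re-running `cargo bench` compares against the previously saved baseline and flags statistically significant changes, which maps pretty directly onto the regression-detection use case above (and could reduce the noise-induced false positives we currently see).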
