alamb commented on issue #5504: URL: https://github.com/apache/datafusion/issues/5504#issuecomment-2707185516
I also think it is important not to tie ourselves to any one particular CI framework / infrastructure. It should be possible to run the scripts to gather data in any environment.

My suggested implementation order:
1. Create a bash script that takes a git commit as input, runs the `clickbench` benchmark ([instructions here](https://github.com/apache/datafusion/tree/main/benchmarks#running-the-benchmarks)), and saves the resulting JSON file somewhere
2. Run that script with 3 git SHAs (perhaps versions `44.0.0`, `45.0.0`, `46.0.0`)
3. Then write a script that takes the results and creates charts from them

As @ozankabak says in Slack, there is plenty of flexibility about how this is done. We already use bash and Python extensively, so unless there is a good reason I think we should use one of those tools.

Also @logan-keede made a PR here:
- https://github.com/apache/datafusion/pull/14662

That converts the benchmark output into line protocol format, which can be fed into Grafana or InfluxDB 2.0 for viewing.
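
To make the later steps concrete, here is a rough Python sketch of how per-version results might be combined for charting and emitted as InfluxDB line protocol. It is only an illustration: the JSON layout (`queries` → `iterations` → `elapsed` in milliseconds), the measurement name `clickbench`, and all function names are assumptions for this example, not the actual output format of the benchmark runner or the approach taken in the PR.

```python
from statistics import mean


def mean_elapsed_ms(result: dict) -> dict:
    """Map each query id to its mean elapsed time across iterations.

    ASSUMED layout (hypothetical, adjust to the real benchmark output):
    {"queries": [{"query": "Q1", "iterations": [{"elapsed": 12.3}, ...]}, ...]}
    """
    return {
        str(q["query"]): mean(it["elapsed"] for it in q["iterations"])
        for q in result["queries"]
    }


def compare(results_by_version: dict) -> list:
    """Return rows of (query, {version: mean_ms}), one row per query.

    A query missing from a version gets None so gaps remain visible in a chart.
    """
    per_version = {v: mean_elapsed_ms(r) for v, r in results_by_version.items()}
    queries = sorted({q for times in per_version.values() for q in times})
    return [(q, {v: per_version[v].get(q) for v in per_version}) for q in queries]


def _escape_tag(value: str) -> str:
    """Escape commas, equals signs, and spaces, as line protocol requires in tags."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(version: str, query: str, elapsed_ms: float, ts_ns: int) -> str:
    """Format one point as: measurement,tag=value,... field=value timestamp."""
    tags = f"version={_escape_tag(version)},query={_escape_tag(query)}"
    # "clickbench" and "elapsed_ms" are illustrative names, not an agreed schema.
    return f"clickbench,{tags} elapsed_ms={float(elapsed_ms)} {ts_ns}"
```

Each result file would be loaded with `json.load` and passed to `compare` keyed by version (for example `44.0.0`, `45.0.0`, `46.0.0`); the returned rows can then go to whatever charting tool is chosen, or through `to_line_protocol` for Grafana / InfluxDB.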
