>When you say "output is parsed", how is that exactly? We don't have any
>scripts in the repository to do this yet (I have some comments on this below).
>We also have to collect machine information and insert that into the database.
>From my perspective we have quite a bit of engineering work on this topic
>("benchmark execution and data collection") to do.
Yes, I wrote one as a test; it can then POST the JSON structure to the needed
endpoint. Everything else will be done in the
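For reference, a minimal sketch of what such a collector could look like. It
assumes the benchmark binary is a gbenchmark executable run with
--benchmark_format=json; the endpoint URL, binary path, and payload shape are
placeholders, not the final design:

```python
# Sketch of a collector: run a gbenchmark binary with JSON output and
# POST the parsed results to a (placeholder) ingestion endpoint.
import json
import subprocess
import urllib.request

ENDPOINT = "http://localhost:5000/api/benchmarks"  # placeholder URL


def run_gbenchmark(binary_path):
    # gbenchmark natively supports --benchmark_format=json
    out = subprocess.run(
        [binary_path, "--benchmark_format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(out.stdout)


def post_results(results):
    payload = json.dumps(results).encode("utf-8")
    req = urllib.request.Request(
        ENDPOINT, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    results = run_gbenchmark("./arrow-compute-benchmark")  # placeholder binary
    post_results(results)
```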
>My team and I have some physical hardware (including an Aarch64 Jetson TX2
>machine, might be interesting to see what the ARM64 results look like) where
>we'd like to run benchmarks and upload the results also, so we need to write
>some documentation about how to add a new machine and set up a cron job of
>some kind.
If it can run Linux, then we can set it up.
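For the machine-information part mentioned above, a rough sketch using only
the Python standard library (the field names are placeholders, not a final
schema):

```python
# Rough sketch: gather basic machine information to attach to each
# benchmark submission. Field names are placeholders, not a final schema.
import os
import platform


def machine_info():
    return {
        "hostname": platform.node(),
        "architecture": platform.machine(),   # e.g. "x86_64" or "aarch64"
        "os": platform.system(),
        "os_release": platform.release(),
        "cpu_count": os.cpu_count(),
        "python_version": platform.python_version(),
    }


if __name__ == "__main__":
    print(machine_info())
```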
>I'd like to eventually have a bot that we can ask to run a benchmark
>comparison versus master. Reporting on all PRs automatically might be quite a
>bit of work (and load on the machines)
You should be able to choose the comparison between any two points: master
vs. a PR, the current master vs. yesterday's master, etc.
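To illustrate the idea, comparing any two runs boils down to joining on
benchmark name and computing a ratio. A toy sketch, assuming results keep the
gbenchmark JSON shape and using an arbitrary 5% regression threshold:

```python
# Toy sketch: compare two result sets keyed by benchmark name.
# The {"name": ..., "real_time": ...} shape mirrors gbenchmark JSON output;
# the regression threshold is an arbitrary example value.
def compare(baseline, contender, threshold=1.05):
    base = {b["name"]: b["real_time"] for b in baseline["benchmarks"]}
    cont = {b["name"]: b["real_time"] for b in contender["benchmarks"]}
    report = []
    for name in sorted(base.keys() & cont.keys()):
        ratio = cont[name] / base[name]
        report.append({
            "benchmark": name,
            "ratio": ratio,
            "regression": ratio > threshold,  # slower than baseline by >5%
        })
    return report
```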
>I thought the idea (based on our past e-mail discussions) was that we would
>implement benchmark collectors (as programs in the Arrow git repository) for
>each benchmarking framework, starting with gbenchmark and expanding to
>include ASV (for Python) and then others
I'll open a PR and am happy to put it into Arrow.
>It seems like writing the benchmark collector script that runs the benchmarks,
>collects machine information, and inserts data into an instance of the
>database is the next milestone. Until that's done it seems difficult to do
>much else
OK, I will update Jira 5070 and link 5071.
Thanks.