>When you say "output is parsed", how is that exactly? We don't have any
>scripts in the repository to do this yet (I have some comments on this below).
>We also have to collect machine information and insert that into the database.
>From my perspective we have quite a bit of engineering work on this topic
>("benchmark execution and data collection") to do.
Yes, I wrote one as a test; it can then POST the JSON structure to the needed
endpoint. Everything else will be done in the
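For reference, a minimal sketch of what such a collector could look like. It
assumes the benchmark binary is a gbenchmark executable run with
--benchmark_format=json; the endpoint URL, binary path, and payload shape are
placeholders, not the final design:

```python
# Sketch of a collector: run a gbenchmark binary with JSON output and
# POST the parsed results to a (placeholder) ingestion endpoint.
import json
import subprocess
import urllib.request

ENDPOINT = "http://localhost:5000/api/benchmarks"  # placeholder URL


def run_gbenchmark(binary_path):
    # gbenchmark natively supports --benchmark_format=json
    out = subprocess.run(
        [binary_path, "--benchmark_format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(out.stdout)


def post_results(results):
    payload = json.dumps(results).encode("utf-8")
    req = urllib.request.Request(
        ENDPOINT, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    results = run_gbenchmark("./arrow-compute-benchmark")  # placeholder binary
    post_results(results)
```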
>My team and I have some physical hardware (including an Aarch64 Jetson TX2
>machine, might be interesting to see what the ARM64 results look like) where
>we'd like to run benchmarks and upload the results also, so we need to write
>some documentation about how to add a new machine and set up a cron job of
>some kind.
If it can run Linux, then we can set it up.
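For the machine-information part mentioned above, a rough sketch using only
the Python standard library (the field names are placeholders, not a final
schema):

```python
# Rough sketch: gather basic machine information to attach to each
# benchmark submission. Field names are placeholders, not a final schema.
import os
import platform


def machine_info():
    return {
        "hostname": platform.node(),
        "architecture": platform.machine(),   # e.g. "x86_64" or "aarch64"
        "os": platform.system(),
        "os_release": platform.release(),
        "cpu_count": os.cpu_count(),
        "python_version": platform.python_version(),
    }


if __name__ == "__main__":
    print(machine_info())
```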
>I'd like to eventually have a bot that we can ask to run a benchmark
>comparison versus master. Reporting on all PRs automatically might be quite a
>bit of work (and load on the machines)
You should be able to choose the comparison between any two points: master
vs. a PR, the current master vs. yesterday's master, etc.
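To illustrate the idea, comparing any two runs boils down to joining on
benchmark name and computing a ratio. A toy sketch, assuming results keep the
gbenchmark JSON shape and using an arbitrary 5% regression threshold:

```python
# Toy sketch: compare two result sets keyed by benchmark name.
# The {"name": ..., "real_time": ...} shape mirrors gbenchmark JSON output;
# the regression threshold is an arbitrary example value.
def compare(baseline, contender, threshold=1.05):
    base = {b["name"]: b["real_time"] for b in baseline["benchmarks"]}
    cont = {b["name"]: b["real_time"] for b in contender["benchmarks"]}
    report = []
    for name in sorted(base.keys() & cont.keys()):
        ratio = cont[name] / base[name]
        report.append({
            "benchmark": name,
            "ratio": ratio,
            "regression": ratio > threshold,  # slower than baseline by >5%
        })
    return report
```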
>I thought the idea (based on our past e-mail discussions) was that we would
>implement benchmark collectors (as programs in the Arrow git repository) for
>each benchmarking framework, starting with gbenchmark and expanding to
>include ASV (for Python) and then others
I'll open a PR and am happy to put it into Arrow.
>It seems like writing the benchmark collector script that runs the benchmarks,
>collects machine information, and inserts data into an instance of the
>database is the next milestone. Until that's done it seems difficult to do
>much else
OK, I will update Jira 5070 and link 5071.
Thanks.