RE: Benchmarking mailing list thread [was Fwd: [Discuss] Benchmarking infrastructure]

Melik-Adamyan, Areg Wed, 24 Apr 2019 23:29:03 -0700

Hi,

We are actually talking about the same thing, but you do not want to use 3rd 
party tools. 
For 3 and 4: you run the first version and store its output in 1.out, then run 
the second version and store its output in 2.out, and then run the compare 
tool. Your tool does the two steps automatically, which is fine.
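
To make the two-file flow concrete, here is a minimal sketch. It assumes both 
runs were written in Google Benchmark's JSON format (--benchmark_format=json). 
The function names and the 5% threshold are my own illustrative choices; this 
is not how archery, benchstat or benchcmp are implemented.

```python
import json


def load_times(text):
    """Map benchmark name -> real_time from Google Benchmark JSON text."""
    report = json.loads(text)
    return {b["name"]: b["real_time"] for b in report["benchmarks"]}


def compare(baseline_text, contender_text, threshold=0.05):
    """Return (name, old, new, relative_change, regressed) per shared benchmark.

    `threshold` is a hypothetical cutoff: a benchmark counts as regressed
    when its time grew by more than this fraction of the baseline.
    """
    baseline = load_times(baseline_text)
    contender = load_times(contender_text)
    rows = []
    # Only benchmarks present in both runs can be compared.
    for name in sorted(baseline.keys() & contender.keys()):
        old, new = baseline[name], contender[name]
        if old == 0:
            continue  # avoid dividing by zero on a degenerate baseline
        change = (new - old) / old
        rows.append((name, old, new, change, change > threshold))
    return rows
```

Calling compare(open("1.out").read(), open("2.out").read()) would then give 
one row per benchmark found in both files.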


> Various reasons why I think the archery route is preferred over a mix of
> scattered scripts, CI pipeline steps and random go binaries.
> 
> 1. It is OS agnostic since it's written in python, and depends on cmake + git
>    installed in PATH.
[>] So are Google Benchmark, cmake and git, no?
> 
> 2. Self contained in arrow's repository, no need to manually install external
>    dependencies (go toolchain, then compile & install benchstat, benchcmp).
>    Assuming python3 and pip are provided, which we already need for pyarrow.
[>] Those operations are lighter than 'conda install', but ok, point taken.
> 
> 3. Written as a library where the command line is a frontend. This makes it
>    very easy to test and re-use. It also opens the door to clearing
>    technical debt we've accumulated in `dev/`. This is not relevant for the
>    benchmark sub-project, but still relevant for arrow developers in general.
[>] Agreed, but that is out of scope for the benchmarking.
> 
> 4. Benchmark framework agnostic. This does not depend on google's benchmark
>    and go benchmark output format. It does support it, but does not mandate it.
>    Will be key to support Python (ASV) and other languages.
[>] I do not understand what you mean by testing other languages: core 
performance will come from the core C++ libraries, and everything else will be 
wrappers around them. So if I understand correctly, by testing languages we are 
testing wrappers?
> 
> 5. Shell scripts tend to grow unmaintainable. I say this as someone who abuses
>    them. (The archery implementation is derived from a local bash script.)
[>] There is no shell script in the first approach, but I totally share your 
pain.
> 
> 6. It is not orchestrated by a complex CI pipeline (which effectively is a
>    non-portable, hardly reproducible script). It is self contained, can run
>    within a CI or on a local machine. This is very convenient for local
>    testing and debugging. I loathe waiting for the CI, especially when
>    iterating in development.
[>] What you are really saying is that Archery *is the CI* that you ship with 
the source code. It does all the same things. I am not against it, but it will 
create a maintenance burden, and in a couple of years you'll discover that it 
is outdated :)

> You can get a sneak peek of the automation working here:
> http://nashville.ursalabs.org:4100/#/builders/16/builds/129
> Note that this doesn't use dedicated hardware yet.
[>] Nice, so when can we start using it? I guess nobody will object that 
perf.zaiteki.tech is not competing with Archery. So how can I help you 
proceed faster? I can create and host the DB from 5071 in the cloud if you want.

-Areg.
