hi,

After doing a little research I took a closer look at the shell scripts in

https://github.com/apache/arrow/tree/master/dev/benchmarking

While these may work for importing the gbenchmark data, the general
approach seems inflexible to me. I would recommend rewriting them as
Python programs to enable better extensibility, finer-grained control
(e.g. to refine and manipulate the output to be "nicer"), and easier
support for importing output from other kinds of benchmark frameworks.
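
To make this concrete, here is roughly the shape I have in mind for
the gbenchmark side (a sketch only; the function and field names are
illustrative, not a proposal for the final design). It reads the JSON
that gbenchmark emits with --benchmark_format=json and normalizes it
into plain records that a schema-aware insertion step could consume:

    # Sketch of a gbenchmark "collector": parse the framework's JSON
    # output into plain dicts for a separate insertion layer.
    import json

    def collect_gbenchmark(path):
        with open(path) as f:
            data = json.load(f)
        context = data["context"]  # build/host info reported by gbenchmark
        results = []
        for bm in data["benchmarks"]:
            results.append({
                "name": bm["name"],
                "real_time": bm["real_time"],
                "cpu_time": bm["cpu_time"],
                "time_unit": bm.get("time_unit", "ns"),
                "iterations": bm["iterations"],
            })
        return context, results

The same pattern should extend naturally to ASV or any other framework
that can dump structured output.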

- Wes

On Fri, Mar 29, 2019 at 10:06 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi Areg,
>
> On Fri, Mar 29, 2019 at 1:25 AM Melik-Adamyan, Areg
> <areg.melik-adam...@intel.com> wrote:
> >
> > Back to the benchmarking per commit.
> >
> > So currently I have fired up a community TeamCity Edition here
> > http://arrow-publi-1wwtu5dnaytn9-2060566241.us-east-1.elb.amazonaws.com
> > with a dedicated pool of two Skylake bare-metal machines (Intel(R)
> > Core(TM) i7-6700 CPU @ 3.40GHz). This can scale up to 4 machines if needed.
> > Then the machines are prepared for benchmarking in the following way:
> > - In BIOS/Setup, power-saving features are disabled
> > - Machines are locked for access using pam_access
> > - Max frequency is set through cpupower and in /etc/sysconfig/cpupower
> > - All services that are not needed are switched off; uptime shows
> >   "23:15:17 up 26 days, 23:24, 1 user, load average: 0.00, 0.00, 0.00"
> > - Transparent huge pages are set to madvise:
> >   cat /sys/kernel/mm/transparent_hugepage/enabled -> always [madvise] never
> > - Audit control is switched off (auditctl -e 0)
> > - A memory clean (echo 3 > /proc/sys/vm/drop_caches) is added to the
> >   launch scripts (see the pre-flight sketch below)
> > - pstate=disable is added to the kernel config
> >
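> > Before each benchmark job the launch scripts can also re-check this
> > tuning. A minimal Python sketch of such a pre-flight check
> > (illustrative only; the exact governor names checked below are an
> > assumption, and the real scripts may do this differently):
> >
> >     # Pre-flight check: refuse to run benchmarks if the machine has
> >     # drifted out of the tuned configuration listed above.
> >     def check_machine_tuning():
> >         # Transparent huge pages should be on madvise
> >         with open("/sys/kernel/mm/transparent_hugepage/enabled") as f:
> >             assert "[madvise]" in f.read()
> >         # Frequency scaling should be pinned (set via cpupower); the
> >         # expected governor names here are an assumption
> >         with open("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor") as f:
> >             assert f.read().strip() in ("performance", "userspace")
> >
> >     def drop_caches():
> >         # Same as: echo 3 > /proc/sys/vm/drop_caches (requires root)
> >         with open("/proc/sys/vm/drop_caches", "w") as f:
> >             f.write("3\n")
> >
> >     if __name__ == "__main__":
> >         check_machine_tuning()
> >         drop_caches()
> >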
> > This config gives a relatively clean, low-noise machine.
> > Commits to master trigger a build and ctest -L benchmarks; the output is parsed.
>
> When you say "output is parsed", how exactly is that done? We don't
> have any scripts in the repository to do this yet (I have some
> comments on this below). We also have to collect machine information
> and insert it into the database. From my perspective we have quite a
> bit of engineering work to do on this topic ("benchmark execution and
> data collection").
>
> My team and I have some physical hardware (including an Aarch64
> Jetson TX2 machine; it might be interesting to see what the ARM64
> results look like) where we'd also like to run benchmarks and upload
> the results, so we need to write some documentation about how to add
> a new machine and set up a cron job of some kind.
>
> >
> > What is missing:
> > * Where should our Codespeed database reside? I can fire up a VM and
> > put it there, or, if you have other preferences, let's discuss.
>
> Since this isn't ASF-owned infrastructure, it can go anywhere. It
> would be nice to make backups publicly available.
>
> > * What address should it have?
>
> The address can be anything, really.
>
> > * How do we make it available to all developers? Do we want to
> > integrate it into CI or not?
>
> I'd eventually like to have a bot that we can ask to run a benchmark
> comparison versus master. Reporting on all PRs automatically might be
> quite a bit of work (and load on the machines).
>
> > * What is the standard benchmark output? I suppose Googlebench, but
> > let's state that.
>
> I thought the idea (based on our past e-mail discussions) was that we
> would implement benchmark collectors (as programs in the Arrow git
> repository) for each benchmarking framework, starting with gbenchmark
> and expanding to include ASV (for Python) and then others.
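>
> As a rough illustration of what I mean by a "collector" (the names
> here are made up, not a design), each framework would get a small
> adapter with the same shape:
>
>     # Illustrative shape of a per-framework collector; one subclass
>     # per framework (gbenchmark, ASV, ...).  Names are hypothetical.
>     from abc import ABC, abstractmethod
>
>     class BenchmarkCollector(ABC):
>         @abstractmethod
>         def run(self):
>             """Run the framework's benchmarks and return raw output."""
>
>         @abstractmethod
>         def parse(self, raw_output):
>             """Turn raw output into records matching the database schema."""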
>
> > * My interest is the C++ benchmarks only for now. Do we need to track all 
> > benchmarks?
>
> Yes, I think we want to be able to run the Python benchmarks too and
> insert that data. Other languages can implement a benchmark collector
> to arrange their benchmark data according to the database schema.
>
> > * What is the process of adding benchmarks?
>
> Normal pull requests (see all the C++ programs that end in
> "-benchmark.cc"). The benchmark collector / insertion scripts may need
> to recognize when a benchmark has been run for the first time (I
> haven't looked closely enough at the schema to see whether there are
> any primary keys associated with a particular benchmark name).
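>
> For example (purely hypothetical table and column names, assuming a
> Postgres-style cursor), the insertion script could do a get-or-create
> lookup the first time it sees a benchmark name:
>
>     # Hypothetical "register on first run" lookup; the table and
>     # column names are made up for illustration, not the real schema.
>     def get_or_create_benchmark_id(cursor, benchmark_name):
>         cursor.execute("SELECT id FROM benchmark WHERE name = %s",
>                        (benchmark_name,))
>         row = cursor.fetchone()
>         if row is not None:
>             return row[0]
>         cursor.execute(
>             "INSERT INTO benchmark (name) VALUES (%s) RETURNING id",
>             (benchmark_name,))
>         return cursor.fetchone()[0]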
>
> >
> > Anything else for short term?
>
> It seems like writing the benchmark collector script that runs the
> benchmarks, collects machine information, and inserts the data into
> an instance of the database is the next milestone. Until that's done,
> it seems difficult to do much else.
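>
> In other words, the first deliverable might be a single entry point
> along these lines (hypothetical names tying together the sketches
> above; insert_results stands in for the schema-aware insertion step):
>
>     # Hypothetical top-level flow for the collector script.
>     def collect_and_upload(db_conn, gbenchmark_json_path):
>         machine = collect_machine_info()
>         context, results = collect_gbenchmark(gbenchmark_json_path)
>         insert_results(db_conn, machine, context, results)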
>
> >
> > -Areg.