Late, but there's a PR now with first-draft DDL (https://github.com/apache/arrow/pull/3586). Happy to receive any feedback!

I tried to think about how people would submit benchmarks, and added a Postgraphile container for http-via-GraphQL. If others have strong opinions on the data modeling, please speak up, because I'm more a database user than a designer. I can also help with benchmarking work in R/Python given guidance/a roadmap/examples from someone else. (To make this concrete, two rough sketches follow the quoted thread at the bottom of this mail: one for a minimal C++ benchmark collector, and one for reading results back through the GraphQL endpoint.)

Best,
Tanya

On Mon, Feb 4, 2019 at 12:37 PM Tanya Schlusser <ta...@tickel.net> wrote:

> I hope to make a PR with the DDL by tomorrow or Wednesday night—DDL along with a README in a new directory `arrow/dev/benchmarking` unless directed otherwise.
>
> A "C++ Benchmark Collector" script would be super. I expect some back-and-forth on this to identify naïve assumptions in the data model.
>
> Attempting to submit actual benchmarks is how to get a handle on that. I recognize I'm blocking downstream work. Better to get an initial PR and some discussion going.
>
> Best,
> Tanya
>
> On Mon, Feb 4, 2019 at 10:10 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
>> hi folks,
>>
>> I'm curious where we currently stand on this project. I see the discussion in https://issues.apache.org/jira/browse/ARROW-4313 -- would the next step be to have a pull request with .sql files containing the DDL required to create the schema in PostgreSQL?
>>
>> I could volunteer to write the "C++ Benchmark Collector" script that will run all the benchmarks on Linux and collect their data to be inserted into the database.
>>
>> Thanks
>> Wes
>>
>> On Sun, Jan 27, 2019 at 12:20 AM Tanya Schlusser <ta...@tickel.net> wrote:
>>
>> > I don't want to be the bottleneck and have posted an initial draft data model in the JIRA issue https://issues.apache.org/jira/browse/ARROW-4313
>> >
>> > It should not be a problem to get content into a form that would be acceptable for either a static site like ASV (via CORS queries to a GraphQL/REST interface) or a codespeed-style site (via a separate schema organized for Django).
>> >
>> > I don't think I'm experienced enough to actually write any benchmarks though, so all I can contribute is backend work for this task.
>> >
>> > Best,
>> > Tanya
>> >
>> > On Sat, Jan 26, 2019 at 7:37 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> >
>> > > hi folks,
>> > >
>> > > I'd like to propose some kind of timeline for getting a first iteration of a benchmark database developed and live, with scripts to enable one or more initial agents to start adding new data on a daily / per-commit basis. I have at least 3 physical machines where I could immediately set up cron jobs to start adding new data, and I could attempt to backfill data as far back as possible.
>> > >
>> > > Personally, I would like to see this done by the end of February if not sooner -- if we don't have the volunteers to push the work to completion by then, please let me know as I will rearrange my priorities to make sure that it happens. Does that sound reasonable?
>> > >
>> > > Please let me know if this plan sounds reasonable:
>> > >
>> > > * Set up a hosted PostgreSQL instance, configure backups
>> > > * Propose and adopt a database schema for storing benchmark results
>> > > * For C++, write script (or Dockerfile) to execute all google-benchmarks, output results to JSON, then adapter script (Python) to ingest into database
>> > > * For Python, similar script that invokes ASV, then inserts ASV results into benchmark database
>> > >
>> > > This seems to be a pre-requisite for having a front-end to visualize the results, but the dashboard/front end can hopefully be implemented in such a way that the details of the benchmark database are not too tightly coupled.
>> > >
>> > > (Do we have any other benchmarks in the project that would need to be inserted initially?)
>> > >
>> > > Related work to trigger benchmarks on agents when new commits land in master can happen concurrently -- one task need not block the other.
>> > >
>> > > Thanks
>> > > Wes
>> > >
>> > > On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > >
>> > > > Sorry, copy-paste failure: https://issues.apache.org/jira/browse/ARROW-4313
>> > > >
>> > > > On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > > >
>> > > > > I don't think there is one but I just created https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
>> > > > >
>> > > > > On Mon, Jan 21, 2019 at 10:35 AM Tanya Schlusser <ta...@tickel.net> wrote:
>> > > > >
>> > > > > > Areg,
>> > > > > >
>> > > > > > If you'd like help, I volunteer! No experience benchmarking but tons of experience databasing. I can mock the backend (database + http) as a starting point for discussion if this is the way people want to go.
>> > > > > >
>> > > > > > Is there a Jira ticket for this that I can jump into?
>> > > > > >
>> > > > > > On Sun, Jan 20, 2019 at 3:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> > > > > >
>> > > > > > > hi Areg,
>> > > > > > >
>> > > > > > > This sounds great -- we've discussed building a more full-featured benchmark automation system in the past but nothing has been developed yet.
>> > > > > > >
>> > > > > > > Your proposal about the details sounds OK; the single most important thing to me is that we build and maintain a very general purpose database schema for building the historical benchmark database.
>> > > > > > >
>> > > > > > > The benchmark database should keep track of:
>> > > > > > >
>> > > > > > > * Timestamp of benchmark run
>> > > > > > > * Git commit hash of codebase
>> > > > > > > * Machine unique name (sort of the "user id")
>> > > > > > > * CPU identification for machine, and clock frequency (in case of overclocking)
>> > > > > > > * CPU cache sizes (L1/L2/L3)
>> > > > > > > * Whether or not CPU throttling is enabled (if it can be easily determined)
>> > > > > > > * RAM size
>> > > > > > > * GPU identification (if any)
>> > > > > > > * Benchmark unique name
>> > > > > > > * Programming language(s) associated with benchmark (e.g. a benchmark may involve both C++ and Python)
>> > > > > > > * Benchmark time, plus mean and standard deviation if available, else NULL
>> > > > > > >
>> > > > > > > (maybe some other things)
>> > > > > > >
>> > > > > > > I would rather not be locked into the internal database schema of a particular benchmarking tool, so people in the community can just run SQL queries against the database and use the data however they like. We'll just have to be careful that people don't DROP TABLE or DELETE (but we should have daily backups so we can recover from such cases).
>> > > > > > >
>> > > > > > > So while we may make use of TeamCity to schedule the runs on the cloud and physical hardware, we should also provide a path for other people in the community to add data to the benchmark database on their hardware on an ad hoc basis. For example, I have several machines in my home on all operating systems (Windows / macOS / Linux, and soon also ARM64) and I'd like to set up scheduled tasks / cron jobs to report in to the database at least on a daily basis.
>> > > > > > >
>> > > > > > > Ideally the benchmark database would just be a PostgreSQL server with a schema we write down and keep backed up etc. Hosted PostgreSQL is inexpensive ($200+ per year depending on size of instance; this probably doesn't need to be a crazy big machine).
>> > > > > > >
>> > > > > > > I suspect there will be a manageable amount of development involved to glue each of the benchmarking frameworks together with the benchmark database. This can also handle querying the operating system for the system information listed above.
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > > Wes
>> > > > > > >
>> > > > > > > On Fri, Jan 18, 2019 at 12:14 AM Melik-Adamyan, Areg <areg.melik-adam...@intel.com> wrote:
>> > > > > > >
>> > > > > > > > Hello,
>> > > > > > > >
>> > > > > > > > I want to restart/attach to the discussions about creating an Arrow benchmarking dashboard. I propose a per-commit performance benchmark run to track changes.
>> > > > > > > >
>> > > > > > > > The proposal includes building infrastructure for per-commit tracking comprising the following parts:
>> > > > > > > >
>> > > > > > > > - Hosted JetBrains TeamCity for OSS (https://teamcity.jetbrains.com/) as a build system
>> > > > > > > > - Agents running in the cloud, both VM/container (DigitalOcean, or others) and bare-metal (Packet.net/AWS), and on-premise (Nvidia boxes?)
>> > > > > > > > - JFrog Artifactory storage and management for OSS projects (https://jfrog.com/open-source/#artifactory2)
>> > > > > > > > - Codespeed as a frontend (https://github.com/tobami/codespeed)
>> > > > > > > >
>> > > > > > > > I am volunteering to build such a system (if needed, more Intel folks will be involved) so we can start tracking performance on various platforms and understand how changes affect it.
>> > > > > > > >
>> > > > > > > > Please let me know your thoughts!
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > > -Areg.
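As mentioned at the top, here are the two rough sketches.

First, the ingest path, putting a minimal subset of Wes's field list together with the "C++ Benchmark Collector" idea from the quoted thread. Everything database-side is a placeholder (the `benchmark_run` table, its columns, and the connection string are made up, not what the PR defines); the only interfaces actually assumed are google-benchmark's `--benchmark_format=json` output and psycopg2.

```python
#!/usr/bin/env python3
"""Sketch of a C++ benchmark collector: run a google-benchmark binary,
parse its JSON report, and insert one row per benchmark into PostgreSQL.
Table/column names are illustrative only, pending the DDL in the PR."""
import json
import platform
import subprocess
import sys

import psycopg2  # pip install psycopg2-binary


def run_benchmarks(binary):
    """Run a google-benchmark executable and return its parsed JSON report."""
    out = subprocess.run(
        [binary, "--benchmark_format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(out.stdout)


def ingest(report, git_hash, dsn="postgresql://localhost/benchmarks"):
    """Insert each benchmark result into a hypothetical benchmark_run table."""
    machine = platform.node()
    conn = psycopg2.connect(dsn)
    with conn, conn.cursor() as cur:
        for bench in report["benchmarks"]:
            cur.execute(
                """
                INSERT INTO benchmark_run
                    (run_timestamp, git_hash, machine_name, benchmark_name,
                     language, real_time, time_unit)
                VALUES (now(), %s, %s, %s, %s, %s, %s)
                """,
                (git_hash, machine, bench["name"], "C++",
                 bench["real_time"], bench["time_unit"]),
            )
    conn.close()


if __name__ == "__main__":
    # e.g. ./collect.py path/to/some-arrow-benchmark <commit-hash>
    ingest(run_benchmarks(sys.argv[1]), git_hash=sys.argv[2])
```

The ASV side could take the same shape, with the subprocess/JSON step swapped for whatever ASV emits.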
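Second, the http-via-GraphQL piece: reading results back out of the Postgraphile container could be as small as the sketch below. The port and every query/field name here are assumptions; Postgraphile generates the real GraphQL schema from whatever DDL gets merged, so treat this as the shape of the interaction rather than working names.

```python
"""Sketch of querying benchmark results through the Postgraphile endpoint.
The URL, query name, and field names are placeholders for illustration."""
import requests  # pip install requests

GRAPHQL_URL = "http://localhost:5000/graphql"  # wherever the container listens

QUERY = """
{
  allBenchmarkRuns(first: 10, orderBy: RUN_TIMESTAMP_DESC) {
    nodes {
      benchmarkName
      machineName
      gitHash
      realTime
      timeUnit
    }
  }
}
"""

resp = requests.post(GRAPHQL_URL, json={"query": QUERY}, timeout=30)
resp.raise_for_status()
for node in resp.json()["data"]["allBenchmarkRuns"]["nodes"]:
    print(node)
```

A static dashboard (the ASV/CORS idea in the quoted thread) would issue essentially the same POST from the browser.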