Late, but there's a PR now with first-draft DDL (https://github.com/apache/arrow/pull/3586). Happy to receive any feedback!

I tried to think about how people would submit benchmarks, and added a Postgraphile container for http-via-GraphQL. If others have strong opinions on the data modeling, please speak up, because I'm more a database user than a designer. I can also help with benchmarking work in R/Python given guidance/a roadmap/examples from someone else. (To make this concrete, two rough sketches follow the quoted thread at the bottom of this mail: one for a minimal C++ benchmark collector, and one for reading results back through the GraphQL endpoint.)

Best,
Tanya

On Mon, Feb 4, 2019 at 12:37 PM Tanya Schlusser <ta...@tickel.net> wrote:

> I hope to make a PR with the DDL by tomorrow or Wednesday night—DDL along with a README in a new directory `arrow/dev/benchmarking` unless directed otherwise.
>
> A "C++ Benchmark Collector" script would be super. I expect some back-and-forth on this to identify naïve assumptions in the data model.
>
> Attempting to submit actual benchmarks is how to get a handle on that. I recognize I'm blocking downstream work. Better to get an initial PR and some discussion going.
>
> Best,
> Tanya
>
> On Mon, Feb 4, 2019 at 10:10 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
>> hi folks,
>>
>> I'm curious where we currently stand on this project. I see the discussion in https://issues.apache.org/jira/browse/ARROW-4313 -- would the next step be to have a pull request with .sql files containing the DDL required to create the schema in PostgreSQL?
>>
>> I could volunteer to write the "C++ Benchmark Collector" script that will run all the benchmarks on Linux and collect their data to be inserted into the database.
>>
>> Thanks
>> Wes
>>
>> On Sun, Jan 27, 2019 at 12:20 AM Tanya Schlusser <ta...@tickel.net> wrote:
>>
>> > I don't want to be the bottleneck and have posted an initial draft data model in the JIRA issue https://issues.apache.org/jira/browse/ARROW-4313
>> >
>> > It should not be a problem to get content into a form that would be acceptable for either a static site like ASV (via CORS queries to a GraphQL/REST interface) or a codespeed-style site (via a separate schema organized for Django).
>> >
>> > I don't think I'm experienced enough to actually write any benchmarks though, so all I can contribute is backend work for this task.
>> >
>> > Best,
>> > Tanya
>> >
>> > On Sat, Jan 26, 2019 at 7:37 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> >
>> > > hi folks,
>> > >
>> > > I'd like to propose some kind of timeline for getting a first iteration of a benchmark database developed and live, with scripts to enable one or more initial agents to start adding new data on a daily / per-commit basis. I have at least 3 physical machines where I could immediately set up cron jobs to start adding new data, and I could attempt to backfill data as far back as possible.
>> > >
>> > > Personally, I would like to see this done by the end of February if not sooner -- if we don't have the volunteers to push the work to completion by then, please let me know as I will rearrange my priorities to make sure that it happens. Does that sound reasonable?
>> > >
>> > > Please let me know if this plan sounds reasonable:
>> > >
>> > > * Set up a hosted PostgreSQL instance, configure backups
>> > > * Propose and adopt a database schema for storing benchmark results
>> > > * For C++, write script (or Dockerfile) to execute all google-benchmarks, output results to JSON, then adapter script (Python) to ingest into database
>> > > * For Python, similar script that invokes ASV, then inserts ASV results into benchmark database
>> > >
>> > > This seems to be a pre-requisite for having a front-end to visualize the results, but the dashboard/front end can hopefully be implemented in such a way that the details of the benchmark database are not too tightly coupled.
>> > >
>> > > (Do we have any other benchmarks in the project that would need to be inserted initially?)
>> > >
>> > > Related work to trigger benchmarks on agents when new commits land in master can happen concurrently -- one task need not block the other.
>> > >
>> > > Thanks
>> > > Wes
>> > >
>> > > On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > >
>> > > > Sorry, copy-paste failure: https://issues.apache.org/jira/browse/ARROW-4313
>> > > >
>> > > > On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > > >
>> > > > > I don't think there is one but I just created https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
>> > > > >
>> > > > > On Mon, Jan 21, 2019 at 10:35 AM Tanya Schlusser <ta...@tickel.net> wrote:
>> > > > >
>> > > > > > Areg,
>> > > > > >
>> > > > > > If you'd like help, I volunteer! No experience benchmarking but tons of experience databasing. I can mock the backend (database + http) as a starting point for discussion if this is the way people want to go.
>> > > > > >
>> > > > > > Is there a Jira ticket for this that I can jump into?
>> > > > > >
>> > > > > > On Sun, Jan 20, 2019 at 3:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> > > > > >
>> > > > > > > hi Areg,
>> > > > > > >
>> > > > > > > This sounds great -- we've discussed building a more full-featured benchmark automation system in the past but nothing has been developed yet.
>> > > > > > >
>> > > > > > > Your proposal about the details sounds OK; the single most important thing to me is that we build and maintain a very general purpose database schema for building the historical benchmark database.
>> > > > > > >
>> > > > > > > The benchmark database should keep track of:
>> > > > > > >
>> > > > > > > * Timestamp of benchmark run
>> > > > > > > * Git commit hash of codebase
>> > > > > > > * Machine unique name (sort of the "user id")
>> > > > > > > * CPU identification for machine, and clock frequency (in case of overclocking)
>> > > > > > > * CPU cache sizes (L1/L2/L3)
>> > > > > > > * Whether or not CPU throttling is enabled (if it can be easily determined)
>> > > > > > > * RAM size
>> > > > > > > * GPU identification (if any)
>> > > > > > > * Benchmark unique name
>> > > > > > > * Programming language(s) associated with benchmark (e.g. a benchmark may involve both C++ and Python)
>> > > > > > > * Benchmark time, plus mean and standard deviation if available, else NULL
>> > > > > > >
>> > > > > > > (maybe some other things)
>> > > > > > >
>> > > > > > > I would rather not be locked into the internal database schema of a particular benchmarking tool, so people in the community can just run SQL queries against the database and use the data however they like. We'll just have to be careful that people don't DROP TABLE or DELETE (but we should have daily backups so we can recover from such cases).
>> > > > > > >
>> > > > > > > So while we may make use of TeamCity to schedule the runs on the cloud and physical hardware, we should also provide a path for other people in the community to add data to the benchmark database on their hardware on an ad hoc basis. For example, I have several machines in my home on all operating systems (Windows / macOS / Linux, and soon also ARM64) and I'd like to set up scheduled tasks / cron jobs to report in to the database at least on a daily basis.
>> > > > > > >
>> > > > > > > Ideally the benchmark database would just be a PostgreSQL server with a schema we write down and keep backed up etc. Hosted PostgreSQL is inexpensive ($200+ per year depending on size of instance; this probably doesn't need to be a crazy big machine).
>> > > > > > >
>> > > > > > > I suspect there will be a manageable amount of development involved to glue each of the benchmarking frameworks together with the benchmark database. This can also handle querying the operating system for the system information listed above.
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > > Wes
>> > > > > > >
>> > > > > > > On Fri, Jan 18, 2019 at 12:14 AM Melik-Adamyan, Areg <areg.melik-adam...@intel.com> wrote:
>> > > > > > >
>> > > > > > > > Hello,
>> > > > > > > >
>> > > > > > > > I want to restart/attach to the discussions about creating an Arrow benchmarking dashboard. I propose a per-commit performance benchmark run to track changes.
>> > > > > > > >
>> > > > > > > > The proposal includes building infrastructure for per-commit tracking comprising the following parts:
>> > > > > > > >
>> > > > > > > > - Hosted JetBrains TeamCity for OSS (https://teamcity.jetbrains.com/) as a build system
>> > > > > > > > - Agents running in the cloud, both VM/container (DigitalOcean, or others) and bare-metal (Packet.net/AWS), and on-premise (Nvidia boxes?)
>> > > > > > > > - JFrog Artifactory storage and management for OSS projects (https://jfrog.com/open-source/#artifactory2)
>> > > > > > > > - Codespeed as a frontend (https://github.com/tobami/codespeed)
>> > > > > > > >
>> > > > > > > > I am volunteering to build such a system (if needed, more Intel folks will be involved) so we can start tracking performance on various platforms and understand how changes affect it.
>> > > > > > > >
>> > > > > > > > Please let me know your thoughts!
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > > -Areg.
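As mentioned at the top, here are the two rough sketches.

First, the ingest path, putting a minimal subset of Wes's field list together with the "C++ Benchmark Collector" idea from the quoted thread. Everything database-side is a placeholder (the `benchmark_run` table, its columns, and the connection string are made up, not what the PR defines); the only interfaces actually assumed are google-benchmark's `--benchmark_format=json` output and psycopg2.

```python
#!/usr/bin/env python3
"""Sketch of a C++ benchmark collector: run a google-benchmark binary,
parse its JSON report, and insert one row per benchmark into PostgreSQL.
Table/column names are illustrative only, pending the DDL in the PR."""
import json
import platform
import subprocess
import sys

import psycopg2  # pip install psycopg2-binary


def run_benchmarks(binary):
    """Run a google-benchmark executable and return its parsed JSON report."""
    out = subprocess.run(
        [binary, "--benchmark_format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(out.stdout)


def ingest(report, git_hash, dsn="postgresql://localhost/benchmarks"):
    """Insert each benchmark result into a hypothetical benchmark_run table."""
    machine = platform.node()
    conn = psycopg2.connect(dsn)
    with conn, conn.cursor() as cur:
        for bench in report["benchmarks"]:
            cur.execute(
                """
                INSERT INTO benchmark_run
                    (run_timestamp, git_hash, machine_name, benchmark_name,
                     language, real_time, time_unit)
                VALUES (now(), %s, %s, %s, %s, %s, %s)
                """,
                (git_hash, machine, bench["name"], "C++",
                 bench["real_time"], bench["time_unit"]),
            )
    conn.close()


if __name__ == "__main__":
    # e.g. ./collect.py path/to/some-arrow-benchmark <commit-hash>
    ingest(run_benchmarks(sys.argv[1]), git_hash=sys.argv[2])
```

The ASV side could take the same shape, with the subprocess/JSON step swapped for whatever ASV emits.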
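Second, the http-via-GraphQL piece: reading results back out of the Postgraphile container could be as small as the sketch below. The port and every query/field name here are assumptions; Postgraphile generates the real GraphQL schema from whatever DDL gets merged, so treat this as the shape of the interaction rather than working names.

```python
"""Sketch of querying benchmark results through the Postgraphile endpoint.
The URL, query name, and field names are placeholders for illustration."""
import requests  # pip install requests

GRAPHQL_URL = "http://localhost:5000/graphql"  # wherever the container listens

QUERY = """
{
  allBenchmarkRuns(first: 10, orderBy: RUN_TIMESTAMP_DESC) {
    nodes {
      benchmarkName
      machineName
      gitHash
      realTime
      timeUnit
    }
  }
}
"""

resp = requests.post(GRAPHQL_URL, json={"query": QUERY}, timeout=30)
resp.raise_for_status()
for node in resp.json()["data"]["allBenchmarkRuns"]["nodes"]:
    print(node)
```

A static dashboard (the ASV/CORS idea in the quoted thread) would issue essentially the same POST from the browser.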