[jira] [Commented] (ARROW-4313) Define general benchmark database schema

Areg Melik-Adamyan (JIRA) Tue, 29 Jan 2019 16:44:12 -0800


    [ 
https://issues.apache.org/jira/browse/ARROW-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755513#comment-16755513
 ]


Areg Melik-Adamyan commented on ARROW-4313:
-------------------------------------------

Got it. I think those numbers are mostly never used, because you always run 
benchmarks at a fixed frequency to get results that stay consistent over time. 
So they can easily be determined from the model name or cpuid, just for 
informational purposes, but they will never be used in serial benchmarking. In 
serial benchmarking everything should be fixed, pinned down, and unchanged, 
except the variable you are measuring, which here is the Arrow code as measured 
through the benchmark code. 
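
As a rough illustration of deriving the frequency from the model name, the 
sketch below parses a nominal clock out of a CPU model string. It assumes the 
string carries an "@ 3.70GHz" style suffix, as many Intel model names do; 
strings without such a suffix yield None. The function name 
nominal_ghz_from_model_name is hypothetical and not part of Arrow or of any 
benchmark tooling.

```python
import re
from typing import Optional


def nominal_ghz_from_model_name(model_name: str) -> Optional[float]:
    """Return the nominal clock in GHz if the model string advertises one.

    Assumption: the frequency appears as '@ <number>GHz' or '@ <number>MHz'.
    Model strings without that suffix return None.
    """
    match = re.search(r"@\s*([0-9]+(?:\.[0-9]+)?)\s*([GM])Hz", model_name)
    if match is None:
        return None
    value = float(match.group(1))
    # Normalise MHz values to GHz so callers get a single unit.
    return value if match.group(2) == "G" else value / 1000.0
```

This only recovers the advertised nominal frequency for informational 
purposes; it says nothing about the frequency the machine actually ran at.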

> Define general benchmark database schema
> ----------------------------------------
>
>                 Key: ARROW-4313
>                 URL: https://issues.apache.org/jira/browse/ARROW-4313
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Benchmarking
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 0.13.0
>
>         Attachments: benchmark-data-model.erdplus, benchmark-data-model.png
>
>
> Some possible attributes that the benchmark database should track, to permit 
> heterogeneity of hardware and programming languages:
> * Timestamp of benchmark run
> * Git commit hash of codebase
> * Machine unique name (sort of the "user id")
> * CPU identification for machine, and clock frequency (in case of 
> overclocking)
> * CPU cache sizes (L1/L2/L3)
> * Whether or not CPU throttling is enabled (if it can be easily determined)
> * RAM size
> * GPU identification (if any)
> * Benchmark unique name
> * Programming language(s) associated with benchmark (e.g. a benchmark
> may involve both C++ and Python)
> * Benchmark time, plus mean and standard deviation if available, else NULL
>
> See discussion on mailing list:
> https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
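
The attribute list above could be sketched as a small relational schema. The 
following is a minimal illustration using Python's built-in sqlite3 module, 
not the data model attached to the issue: the table names, column names, 
units, and the three-table split (machine, benchmark, benchmark_run) are all 
assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: every table and column name below is an assumption
# chosen to mirror the attribute list, not the attached data model.
SCHEMA = """
CREATE TABLE machine (
    machine_id      INTEGER PRIMARY KEY,
    name            TEXT NOT NULL UNIQUE,  -- machine unique name
    cpu_model       TEXT NOT NULL,         -- CPU identification
    cpu_freq_hz     INTEGER,               -- clock frequency
    l1_cache_bytes  INTEGER,
    l2_cache_bytes  INTEGER,
    l3_cache_bytes  INTEGER,
    cpu_throttling  INTEGER,               -- 1 or 0, NULL if undetermined
    ram_bytes       INTEGER,
    gpu_model       TEXT                   -- NULL if there is no GPU
);
CREATE TABLE benchmark (
    benchmark_id    INTEGER PRIMARY KEY,
    name            TEXT NOT NULL UNIQUE,  -- benchmark unique name
    languages       TEXT NOT NULL          -- e.g. 'C++,Python'
);
CREATE TABLE benchmark_run (
    run_id          INTEGER PRIMARY KEY,
    machine_id      INTEGER NOT NULL REFERENCES machine(machine_id),
    benchmark_id    INTEGER NOT NULL REFERENCES benchmark(benchmark_id),
    run_timestamp   TEXT NOT NULL,         -- timestamp of benchmark run
    git_commit      TEXT NOT NULL,         -- git commit hash of codebase
    time_seconds    REAL NOT NULL,         -- benchmark time
    mean_seconds    REAL,                  -- NULL if not available
    stddev_seconds  REAL                   -- NULL if not available
);
"""


def create_benchmark_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the sketch schema in a SQLite database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping machine and benchmark attributes in their own tables means a run row 
only stores the values that change per run: the timestamp, the commit hash, 
and the measured times.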



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
