[ https://issues.apache.org/jira/browse/ARROW-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755808#comment-16755808 ]
Antoine Pitrou commented on ARROW-4313:
---------------------------------------
"Is it a mistake to make `cpu.cpu_model_name` unique? I mean, are the LX cache
levels, core counts, or any other attribute ever different for the same CPU
model string?"
The overclocked frequency (which we could also call the "actual frequency")
may vary; the rest should be the same.
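
To make the point above concrete, here is a minimal sketch of how a unique
`cpu.cpu_model_name` could coexist with a per-machine actual frequency. It
uses Python's built-in sqlite3 module purely for illustration. Apart from
`cpu.cpu_model_name`, every table, column, and function name (and the example
CPU and machine names) is a hypothetical assumption, not the schema proposed
in this issue.

```python
import sqlite3


def build_cpu_schema(conn):
    """Create hypothetical cpu and machine tables (illustrative names only)."""
    conn.executescript(
        """
        CREATE TABLE cpu (
            cpu_id         INTEGER PRIMARY KEY,
            cpu_model_name TEXT NOT NULL UNIQUE,  -- one row per model string
            cpu_core_count INTEGER,               -- assumed fixed per model
            l1d_cache_bytes INTEGER,
            l2_cache_bytes  INTEGER,
            l3_cache_bytes  INTEGER
        );
        CREATE TABLE machine (
            machine_id   INTEGER PRIMARY KEY,
            machine_name TEXT NOT NULL UNIQUE,
            cpu_id       INTEGER NOT NULL REFERENCES cpu(cpu_id),
            -- The only attribute expected to differ between machines
            -- that share a CPU model: the overclocked ("actual") frequency.
            cpu_actual_frequency_hz INTEGER
        );
        """
    )


def register_machine(conn, machine_name, cpu_model_name, actual_frequency_hz):
    """Insert a machine, reusing the cpu row if the model string already exists."""
    conn.execute(
        "INSERT OR IGNORE INTO cpu (cpu_model_name) VALUES (?)",
        (cpu_model_name,),
    )
    cpu_id = conn.execute(
        "SELECT cpu_id FROM cpu WHERE cpu_model_name = ?",
        (cpu_model_name,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO machine (machine_name, cpu_id, cpu_actual_frequency_hz) "
        "VALUES (?, ?, ?)",
        (machine_name, cpu_id, actual_frequency_hz),
    )
    return cpu_id


conn = sqlite3.connect(":memory:")
build_cpu_schema(conn)
# Two machines with the same (made-up) CPU model but different actual clocks.
cpu_a = register_machine(conn, "box-a", "Example CPU 1000", 3_600_000_000)
cpu_b = register_machine(conn, "box-b", "Example CPU 1000", 4_200_000_000)
```

Under these assumptions both machines point at the same `cpu` row, while the
frequency that may differ because of overclocking is stored per machine.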
> Define general benchmark database schema
> ----------------------------------------
>
> Key: ARROW-4313
> URL: https://issues.apache.org/jira/browse/ARROW-4313
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Benchmarking
> Reporter: Wes McKinney
> Priority: Major
> Fix For: 0.13.0
>
> Attachments: benchmark-data-model.erdplus, benchmark-data-model.png
>
>
> Some possible attributes that the benchmark database should track, to permit
> heterogeneity of hardware and programming languages:
> * Timestamp of benchmark run
> * Git commit hash of codebase
> * Machine unique name (sort of the "user id")
> * CPU identification for machine, and clock frequency (in case of
> overclocking)
> * CPU cache sizes (L1/L2/L3)
> * Whether or not CPU throttling is enabled (if it can be easily determined)
> * RAM size
> * GPU identification (if any)
> * Benchmark unique name
> * Programming language(s) associated with benchmark (e.g. a benchmark
> may involve both C++ and Python)
> * Benchmark time, plus mean and standard deviation if available, else NULL
>
> See the discussion on the mailing list:
> https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
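
As a rough illustration of how the attributes listed in the issue description
could map onto tables, here is a minimal sketch using Python's built-in
sqlite3 module. All table names, column names, and sample values are
hypothetical assumptions chosen for this example; they are not the data model
from the attached diagram.

```python
import sqlite3

# Hypothetical schema covering the attributes listed in the issue description.
SCHEMA = """
CREATE TABLE machine (
    machine_name           TEXT PRIMARY KEY,  -- machine unique name
    cpu_model_name         TEXT NOT NULL,     -- CPU identification
    cpu_frequency_hz       INTEGER,           -- actual clock, in case of overclocking
    l1_cache_bytes         INTEGER,
    l2_cache_bytes         INTEGER,
    l3_cache_bytes         INTEGER,
    cpu_throttling_enabled INTEGER,           -- NULL when it cannot be determined
    ram_bytes              INTEGER,
    gpu_model_name         TEXT               -- NULL when there is no GPU
);
CREATE TABLE benchmark (
    benchmark_name TEXT PRIMARY KEY           -- benchmark unique name
);
CREATE TABLE benchmark_language (
    -- A benchmark may involve several languages (e.g. both C++ and Python).
    benchmark_name TEXT NOT NULL REFERENCES benchmark(benchmark_name),
    language       TEXT NOT NULL,
    PRIMARY KEY (benchmark_name, language)
);
CREATE TABLE benchmark_run (
    run_id          INTEGER PRIMARY KEY,
    run_timestamp   TEXT NOT NULL,            -- ISO 8601 timestamp of the run
    git_commit_hash TEXT NOT NULL,
    machine_name    TEXT NOT NULL REFERENCES machine(machine_name),
    benchmark_name  TEXT NOT NULL REFERENCES benchmark(benchmark_name),
    time_seconds    REAL NOT NULL,
    mean_seconds    REAL,                     -- NULL if not available
    stddev_seconds  REAL                      -- NULL if not available
);
"""

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
db.executescript(SCHEMA)

# Made-up sample rows, inserted parents first so the foreign keys hold.
db.execute(
    "INSERT INTO machine (machine_name, cpu_model_name, ram_bytes) "
    "VALUES (?, ?, ?)",
    ("box-a", "Example CPU 1000", 16 * 1024**3),
)
db.execute("INSERT INTO benchmark (benchmark_name) VALUES (?)", ("example-bench",))
db.executemany(
    "INSERT INTO benchmark_language (benchmark_name, language) VALUES (?, ?)",
    [("example-bench", "C++"), ("example-bench", "Python")],
)
# Mean and standard deviation are omitted here, so they are stored as NULL.
db.execute(
    "INSERT INTO benchmark_run (run_timestamp, git_commit_hash, machine_name, "
    "benchmark_name, time_seconds) VALUES (?, ?, ?, ?, ?)",
    ("2019-01-30T12:00:00Z", "0000000", "box-a", "example-bench", 1.25),
)

run = db.execute(
    "SELECT time_seconds, mean_seconds, stddev_seconds FROM benchmark_run"
).fetchone()
languages = [
    row[0]
    for row in db.execute(
        "SELECT language FROM benchmark_language "
        "WHERE benchmark_name = ? ORDER BY language",
        ("example-bench",),
    )
]
```

Keeping languages in a separate association table is one way to let a single
benchmark reference more than one language; a real schema might well choose a
different layout.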