srowen commented on a change in pull request #23914: [SPARK-27009][TEST] Add
Standard Deviation to benchmark results
URL: https://github.com/apache/spark/pull/23914#discussion_r261232628
##########
File path: core/src/test/scala/org/apache/spark/benchmark/Benchmark.scala
##########
@@ -158,7 +159,8 @@ private[spark] class Benchmark(
// scalastyle:on
val best = runTimes.min
val avg = runTimes.sum / runTimes.size
- Result(avg / 1000000.0, num / (best / 1000.0), best / 1000000.0)
+ val stdev = math.sqrt(runTimes.map(time => math.pow(time - avg, 2)).sum /
runTimes.size)
Review comment:
Not that it really matters, but `(time - avg) * (time - avg)` is fine here
and faster than `math.pow`.
Super nit, but I'd suggest using the sample rather than the population
standard deviation: divide by `runTimes.size - 1`. I suppose this also means
checking that there are at least 2 runs.
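Roughly what I have in mind, as a minimal sketch rather than the exact change for this PR. The `SampleStdev` object and `sampleStdev` helper are hypothetical names used only for illustration, and the mean is computed as a `Double` here, which is an assumption that differs from the integer `avg` in the diff:

```scala
object SampleStdev {
  // Hypothetical helper, not part of Benchmark.scala.
  // Computes the sample standard deviation of run times (e.g. in nanoseconds),
  // dividing by n - 1 rather than n.
  def sampleStdev(runTimes: Seq[Long]): Double = {
    // The n - 1 divisor is undefined for fewer than 2 runs.
    require(runTimes.size >= 2, "need at least 2 runs to compute a sample stdev")
    // Assumption: use a Double mean so the deviations are not truncated.
    val avg = runTimes.sum.toDouble / runTimes.size
    val sumOfSquares = runTimes.map { time =>
      val diff = time - avg
      diff * diff // plain multiplication instead of math.pow(diff, 2)
    }.sum
    math.sqrt(sumOfSquares / (runTimes.size - 1))
  }
}
```

With a single run, `require` throws an `IllegalArgumentException`, so the caller would have to decide whether to skip the stdev or enforce a minimum number of iterations.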
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services