[Impala-ASF-CR] IMPALA-7426: Use Mann-Whitney U to compare benchmarks

Jim Apple (Code Review) Fri, 31 Aug 2018 19:16:53 -0700

Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11194 )


Change subject: IMPALA-7426: Use Mann-Whitney U to compare benchmarks
......................................................................


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11194/2/tests/benchmark/report_benchmark_results.py
File tests/benchmark/report_benchmark_results.py:

http://gerrit.cloudera.org:8080/#/c/11194/2/tests/benchmark/report_benchmark_results.py@a568
PS2, Line 568:
> is removing num_clients intentional?
Yes. This information was the same in each row and was not used in calculating 
statistical significance.


http://gerrit.cloudera.org:8080/#/c/11194/2/tests/benchmark/report_benchmark_results.py@129
PS2, Line 129:                  dest="max_percent_change_threshold", 
default=float("inf"),
> Is the default change intentional?
Yes. High variance tests can have large percent changes without being 
statistically significant.


http://gerrit.cloudera.org:8080/#/c/11194/2/tests/benchmark/report_benchmark_results.py@394
PS2, Line 394:       if not options.hive_results:
             :         try:
             :           save_runtime_diffs(results, ref_results, 
self.perf_change, self.zval, self.tval)
             :         except Exception as e:
             :           LOG.error('Could not generate an html diff: 
{0}'.format(e))
> This code used to never run because it was indented incorrectly? It looks o
Correct - it was dead code, it appears


http://gerrit.cloudera.org:8080/#/c/11194/2/tests/benchmark/report_benchmark_results.py@563
PS2, Line 563:                                      'Base StdDev(%)', 'Iters', 
'Median Diff(%)', 'Zval',
> Should this be "MW Z-Val" or something to indicate what kind of zvalue this
Done


http://gerrit.cloudera.org:8080/#/c/11194/2/tests/benchmark/report_benchmark_results.py@1037
PS2, Line 1037:     prefix = ('reg' if zval >= 0 and tval >=
> This is different than __check_perf_change_significance's implementation. D
Well, only kind of - if the change was significant, then abs(zval) > 
zval_thresh or abs(tval) > tval_thresh. Only then do we print reg or imp or ???.

Maybe I'm not understanding your question.


http://gerrit.cloudera.org:8080/#/c/11194/2/tests/benchmark/report_benchmark_results.py@1037
PS2, Line 1037:     prefix = ('reg' if zval >= 0 and tval >=
              :               0 else 'imp' if zval <= 0 and tval <= 0 else 
'???')
> I think this would be more legible as multiple lines. (I have a deep hatred
Done


http://gerrit.cloudera.org:8080/#/c/11194/2/tests/util/calculation_util.py
File tests/util/calculation_util.py:

http://gerrit.cloudera.org:8080/#/c/11194/2/tests/util/calculation_util.py@74
PS2, Line 74:   Calculates the Mann-Whitney U Test Z value for the given 
samples and reference.
> Have you compared this to https://docs.scipy.org/doc/scipy/reference/genera
The normal approximation will be less valid with smaller sample sizes, but I 
expect that to show up infrequently.

If you look at the code for scipy.stats.mannwhitneyu, you see it computes z and 
then calls distributions.norm.sf

https://github.com/scipy/scipy/blob/v1.1.0/scipy/stats/stats.py#L4836-L4925

which calls norm_cdf

https://github.com/scipy/scipy/blob/v1.1.0/scipy/stats/_continuous_distns.py#L101

which calls ndtr

https://github.com/scipy/scipy/blob/v1.1.0/scipy/stats/_continuous_distns.py#L90

which is just turning a z-value into a p-value:

https://github.com/scipy/scipy/blob/v1.1.0/scipy/special/cephes/ndtr.c

We could do the same here or with the t-value, but I think it's fine to leave 
them uninterpreted.



--
To view, visit http://gerrit.cloudera.org:8080/11194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d6631ebeba1422b832def5cd68537624f672fa0
Gerrit-Change-Number: 11194
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Comment-Date: Sat, 01 Sep 2018 02:16:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7426: Use Mann-Whitney U to compare benchmarks

Reply via email to