Joe McDonnell created IMPALA-13781:
--------------------------------------
Summary: report_benchmark_result.py uses the wrong calculation for
median diff %
Key: IMPALA-13781
URL: https://issues.apache.org/jira/browse/IMPALA-13781
Project: IMPALA
Issue Type: Bug
Components: Infrastructure
Affects Versions: Impala 4.6.0
Reporter: Joe McDonnell
The benchmark report includes a "Median Diff(%)" column, but the value is
calculated incorrectly. It can report reductions greater than 100% because
it divides by the median of the new results rather than the median of the
base results:
{noformat}
# median uses "results", but it should use "ref_results"
median = results[SORTED][int(len(results[SORTED]) / 2)]
all_diffs = [x - y for x in results[SORTED] for y in ref_results[SORTED]]
all_diffs.sort()
self.median_diff = all_diffs[int(len(all_diffs) / 2)] / median
{noformat}
In an A/B test, the median used as the divisor should come from the A run
(i.e. the base / reference results). Instead, the code uses the median of
the B run (the new results).
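To illustrate the effect, here is a minimal standalone sketch. It is not the actual code in report_benchmark_result.py: the function name median_diff_fraction, the divide_by_reference flag, and the sample timings are all hypothetical and exist only for this example. It mirrors the pairwise-difference logic quoted above and compares the two choices of divisor.

```python
def median_diff_fraction(new_times, ref_times, divide_by_reference=True):
    """Median of all pairwise (new - ref) differences, as a fraction of a median.

    Hypothetical helper for illustration only. With divide_by_reference=False
    it divides by the median of the new results, as the quoted code does.
    """
    new_sorted = sorted(new_times)
    ref_sorted = sorted(ref_times)
    # Every new-minus-reference difference, sorted so the middle one is the median.
    all_diffs = sorted(x - y for x in new_sorted for y in ref_sorted)
    median_diff = all_diffs[len(all_diffs) // 2]
    # The divisor is the median of whichever run is chosen as the baseline.
    baseline = ref_sorted if divide_by_reference else new_sorted
    return median_diff / baseline[len(baseline) // 2]


# Made-up timings: the reference run takes about 10s, the new run about 4s.
ref_times = [9.0, 10.0, 11.0]
new_times = [3.0, 4.0, 5.0]

buggy = median_diff_fraction(new_times, ref_times, divide_by_reference=False)
fixed = median_diff_fraction(new_times, ref_times, divide_by_reference=True)

print(f"divide by new median:       {buggy:.0%}")  # -150%
print(f"divide by reference median: {fixed:.0%}")  # -60%
```

With these sample numbers the median difference is -6s. Dividing by the new median (4s) reports a 150% reduction, which is impossible for a runtime, while dividing by the reference median (10s) reports the expected 60% reduction.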
--
This message was sent by Atlassian Jira
(v8.20.10#820010)