Gergely Fürnstáhl created IMPALA-11114:
------------------------------------------
Summary: calculate_tval fails with ZeroDivisionError if the
standard deviations are 0
Key: IMPALA-11114
URL: https://issues.apache.org/jira/browse/IMPALA-11114
Project: IMPALA
Issue Type: Bug
Reporter: Gergely Fürnstáhl
Possible cause:
_Rounding of the data or other forms of truncation could give zero standard
deviation when in fact you have some. And if the difference that you are trying
to measure is within your measurement error that is a problem not addressed by
the t-test._
[https://stats.stackexchange.com/questions/78570/t-test-with-sample-standard-deviation-of-zero-possible/275879]
Full log:
{code:java}
Traceback (most recent call last):
File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py",
line 1131, in <module>
report = Report(grouped, ref_grouped)
File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py",
line 494, in __init__
self.__analyze()
File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py",
line 514, in __analyze
query_comparison_row = Report.QueryComparisonRow(results, ref_results)
File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py",
line 370, in __init__
self.__check_perf_change_significance(results, ref_results))
File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py",
line 390, in __check_perf_change_significance
ref_stat[AVG], ref_stat[STDDEV], ref_stat[ITERATIONS])
File "/home/gfurnstahl/Impala/tests/util/calculation_util.py", line 65, in
calculate_tval
return (avg - ref_avg) / sem
ZeroDivisionError: float division by zero
Traceback (most recent call last):
File "bin/single_node_perf_run.py", line 359, in <module>
main()
File "bin/single_node_perf_run.py", line 349, in main
perf_ab_test(options, args)
File "bin/single_node_perf_run.py", line 267, in perf_ab_test
compare(temp_dir, hash_a, hash_b)
File "bin/single_node_perf_run.py", line 175, in compare
report_benchmark_results(file_a, file_b, description)
File "bin/single_node_perf_run.py", line 166, in report_benchmark_results
stdout=f)
File
"/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/subprocess.py",
line 190, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command
'['/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py',
'--reference_result_file=/home/gfurnstahl/Impala/perf_results/perf_run_0SdUw7/a87f8c5df9f6fbf8d468921642d7ec3d37c5f4de.json',
'--input_result_file=/home/gfurnstahl/Impala/perf_results/perf_run_0SdUw7/b4d04112559c3f04ebf42b36deb1cd537dea78c4.json',
'--report_description="a87f8c5df9f6fbf8d468921642d7ec3d37c5f4de vs
b4d04112559c3f04ebf42b36deb1cd537dea78c4"']' returned non-zero exit status
1{code}
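One possible guard for the failing division, as a minimal sketch rather than the actual patch. The traceback only shows the final line of calculate_tval, so the argument order and the standard error formula below (the usual unequal-variance form, sqrt(s1^2/n1 + s2^2/n2)) are assumptions. The name calculate_tval_guarded is hypothetical, and the values returned when the standard error is zero are an illustrative choice.

```python
import math


def calculate_tval_guarded(avg, stddev, iters, ref_avg, ref_stddev, ref_iters):
    """t value that tolerates a zero standard error (hypothetical helper)."""
    # Assumed standard error of the difference of means: sqrt(s1^2/n1 + s2^2/n2).
    sem = math.sqrt(stddev ** 2 / float(iters) + ref_stddev ** 2 / float(ref_iters))
    diff = avg - ref_avg
    if sem == 0.0:
        # Both samples report zero spread, so the t value is undefined.
        # Assumption: return 0.0 for identical means, signed infinity otherwise.
        if diff == 0.0:
            return 0.0
        return math.copysign(float("inf"), diff)
    return diff / sem
```

With this guard, calculate_tval_guarded(10.0, 0.0, 3, 10.0, 0.0, 3) returns 0.0 instead of raising. Whether a nonzero difference with zero spread should count as significant (infinity) or be ignored (0.0) is a judgment call, since zero deviation may only reflect rounding of the measurements, as the quoted answer points out.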