Gergely Fürnstáhl created IMPALA-11114:
------------------------------------------

             Summary: calculate_tval fails with ZeroDivisionError if the standard deviations are 0
                 Key: IMPALA-11114
                 URL: https://issues.apache.org/jira/browse/IMPALA-11114
             Project: IMPALA
          Issue Type: Bug
            Reporter: Gergely Fürnstáhl


Possible cause:

_Rounding of the data or other forms of truncation could give zero standard 
deviation when in fact you have some. And if the difference that you are trying 
to measure is within your measurement error that is a problem not addressed by 
the t-test._

[https://stats.stackexchange.com/questions/78570/t-test-with-sample-standard-deviation-of-zero-possible/275879]
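
A minimal illustration of the effect described in the quote. The timings are made up for the example and are not taken from the failing perf run, and the snippet uses the Python 3 statistics module rather than the Python 2.7 toolchain shown in the log below. Rounding to a coarse resolution collapses every sample to the same value, so the sample standard deviation becomes exactly zero even though the raw measurements vary:

```python
import statistics

# Hypothetical query runtimes in seconds (illustrative only).
raw_times = [1.0012, 1.0034, 0.9991, 1.0007]

# Rounding to two decimals makes every sample identical.
rounded_times = [round(t, 2) for t in raw_times]

raw_stddev = statistics.stdev(raw_times)          # small but non-zero
rounded_stddev = statistics.stdev(rounded_times)  # exactly 0.0

print(raw_stddev, rounded_stddev)
```

If both the new and the reference result end up with zero standard deviation this way, any standard error derived from them is zero as well.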

Full log:
{code:java}
Traceback (most recent call last):
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 1131, in <module>
    report = Report(grouped, ref_grouped)
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 494, in __init__
    self.__analyze()
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 514, in __analyze
    query_comparison_row = Report.QueryComparisonRow(results, ref_results)
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 370, in __init__
    self.__check_perf_change_significance(results, ref_results))
  File "/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py", line 390, in __check_perf_change_significance
    ref_stat[AVG], ref_stat[STDDEV], ref_stat[ITERATIONS])
  File "/home/gfurnstahl/Impala/tests/util/calculation_util.py", line 65, in calculate_tval
    return (avg - ref_avg) / sem
ZeroDivisionError: float division by zero
Traceback (most recent call last):
  File "bin/single_node_perf_run.py", line 359, in <module>
    main()
  File "bin/single_node_perf_run.py", line 349, in main
    perf_ab_test(options, args)
  File "bin/single_node_perf_run.py", line 267, in perf_ab_test
    compare(temp_dir, hash_a, hash_b)
  File "bin/single_node_perf_run.py", line 175, in compare
    report_benchmark_results(file_a, file_b, description)
  File "bin/single_node_perf_run.py", line 166, in report_benchmark_results
    stdout=f)
  File "/home/gfurnstahl/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/gfurnstahl/Impala/tests/benchmark/report_benchmark_results.py', '--reference_result_file=/home/gfurnstahl/Impala/perf_results/perf_run_0SdUw7/a87f8c5df9f6fbf8d468921642d7ec3d37c5f4de.json', '--input_result_file=/home/gfurnstahl/Impala/perf_results/perf_run_0SdUw7/b4d04112559c3f04ebf42b36deb1cd537dea78c4.json', '--report_description="a87f8c5df9f6fbf8d468921642d7ec3d37c5f4de vs b4d04112559c3f04ebf42b36deb1cd537dea78c4"']' returned non-zero exit status 1
{code}
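
One possible way to avoid the crash is to guard the division. The following is only a sketch, not the code in calculation_util.py: the name calculate_tval_guarded is hypothetical, the argument order is inferred from the call shown in the traceback, and the standard-error formula (unpooled, Welch-style) is an assumption about how sem is computed.

```python
import math


def calculate_tval_guarded(avg, stddev, iters, ref_avg, ref_stddev, ref_iters):
    """Hypothetical zero-safe variant of calculate_tval (not the Impala source)."""
    # Assumed standard error of the difference of the two means (unpooled).
    sem = math.sqrt(stddev ** 2 / float(iters) +
                    ref_stddev ** 2 / float(ref_iters))
    if sem == 0.0:
        # Both samples have zero variance, so the t-statistic is undefined.
        # Returning 0.0 treats the change as "not significant"; this is a
        # policy choice for the sketch, not a statistical result.
        return 0.0
    return (avg - ref_avg) / sem


# Zero standard deviation on both sides no longer raises ZeroDivisionError.
print(calculate_tval_guarded(1.25, 0.0, 5, 1.20, 0.0, 5))
```

Returning 0.0 keeps the report generation running but hides a difference between two zero-variance samples. An alternative would be to return a signed infinity when the averages differ, or to skip the significance check for that query; which behaviour fits the benchmark report is left open here.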



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
