Csaba Ringhofer created IMPALA-13216:
----------------------------------------

             Summary: Switch run_workload.py to use HS2 instead of Beeswax
                 Key: IMPALA-13216
                 URL: https://issues.apache.org/jira/browse/IMPALA-13216
             Project: IMPALA
          Issue Type: Improvement
          Components: Clients, Infrastructure
            Reporter: Csaba Ringhofer


Currently run-workload.py defaults to the beeswax client, so perf tests
also run over beeswax:
https://github.com/apache/impala/blob/c53987480726b114e0c3537c71297df2834a4962/bin/run-workload.py#L98

This could affect perf results/variance, because different clients use 
different sleep intervals when waiting for query status to become finished:
beeswax uses 50ms:
https://github.com/apache/impala/blob/c53987480726b114e0c3537c71297df2834a4962/tests/beeswax/impala_beeswax.py#L408
while hs2 would use a more complicated formula from Impyla, ranging from 10ms
to 1s:
https://github.com/apache/impala/blob/c53987480726b114e0c3537c71297df2834a4962/tests/performance/query_exec_functions.py#L122
https://github.com/cloudera/impyla/blob/acbd481dde28d85976dfc777f888b32ad6c8d721/impala/hiveserver2.py#L513

Making sleep times configurable in Impyla could help with this - perf tests
could then use smaller sleeps than real workloads would, which should reduce
variability.
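
To illustrate why the sleep interval matters for measured runtimes, here is a
minimal sketch using a simulated clock. It is not code from Impala or Impyla:
the function names are hypothetical, the growing schedule is an invented
stand-in (only its 10ms to 1s range comes from the description above, not the
actual Impyla formula), and RPC latency is ignored. The client checks the
status at t=0 and then sleeps between checks, so the runtime it observes is
rounded up to the next poll.

```python
def observed_runtime_ms(true_runtime_ms, next_sleep_ms):
    """Time (ms) at which a polling client first sees the query as finished.

    next_sleep_ms(elapsed_ms) returns how long to sleep before the next check.
    """
    elapsed = 0
    while elapsed < true_runtime_ms:
        elapsed += next_sleep_ms(elapsed)
    return elapsed


def fixed_50ms(elapsed_ms):
    # Fixed interval, like the 50ms sleep mentioned for beeswax.
    return 50


def growing_10ms_to_1s(elapsed_ms):
    # Hypothetical schedule, NOT Impyla's formula: sleep 10% of the elapsed
    # time, clamped to the 10ms..1s range mentioned above.
    return min(1000, max(10, elapsed_ms // 10))


def polling_overhead_ms(true_runtime_ms, next_sleep_ms):
    # Extra time added to the measurement purely by the polling granularity.
    return observed_runtime_ms(true_runtime_ms, next_sleep_ms) - true_runtime_ms


if __name__ == "__main__":
    for runtime in (120, 2500, 30500):
        print(runtime,
              polling_overhead_ms(runtime, fixed_50ms),
              polling_overhead_ms(runtime, growing_10ms_to_1s))
```

In this model the overhead is always smaller than the last sleep, so it stays
under 50ms with the fixed interval but can approach 1s for long queries with
the growing schedule. A configurable, smaller sleep would shrink that bound.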



--
This message was sent by Atlassian Jira
(v8.20.10#820010)