Joe McDonnell created IMPALA-13125:
--------------------------------------
Summary: Set of tests for exploration_strategy=exhaustive varies
between python 2 and 3
Key: IMPALA-13125
URL: https://issues.apache.org/jira/browse/IMPALA-13125
Project: IMPALA
Issue Type: Sub-task
Components: Infrastructure
Affects Versions: Impala 4.5.0
Reporter: Joe McDonnell
TLDR: Python 3 runs a different set of exhaustive tests than Python 2.
Longer version:
While looking into running the tests under Python 3, I noticed that the set of
tests that runs in exhaustive mode differs between Python 2 and Python 3. This
was surprising.
It turns out there is a distinction between run-tests.py's
--exploration_strategy=exhaustive vs the
--workload_exploration_strategy="functional-query:exhaustive" option. The
exhaustive job is actually doing the latter. This means that individual
functional-query workload classes see cls.exploration_strategy() == "exhaustive",
but the logic that generates the test vectors still sees
exploration_strategy=core, so it still uses pairwise generation. Code:
{noformat}
if exploration_strategy == 'exhaustive':
  return self.__generate_exhaustive_combinations()
elif exploration_strategy in ['core', 'pairwise']:
  return self.__generate_pairwise_combinations()
{noformat}
[https://github.com/apache/impala/blob/master/tests/common/test_vector.py#L165-L168]
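To make the practical effect of that dispatch concrete, here is a minimal standalone sketch. It is not Impala's code: the function names, the dimension values, and the naive greedy all-pairs selection are all assumptions for illustration (the real pairwise generator may use a different algorithm). It only shows that a generator that still receives 'core' yields fewer vectors than one that receives 'exhaustive', while still covering every pair of values.

```python
import itertools


def vector_pairs(vector):
    """Return every pair of (dimension, value) settings present in one vector."""
    return set(itertools.combinations(sorted(vector.items()), 2))


def exhaustive_combinations(dimensions):
    """Full cartesian product of all dimension values."""
    names = list(dimensions)
    value_lists = [dimensions[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def pairwise_combinations(dimensions):
    """Naive greedy all-pairs (hypothetical stand-in, not minimal):
    keep a vector only if it covers at least one pair not covered yet."""
    covered = set()
    selected = []
    for vector in exhaustive_combinations(dimensions):
        pairs = vector_pairs(vector)
        if not pairs <= covered:
            covered |= pairs
            selected.append(vector)
    return selected


def generate(dimensions, exploration_strategy):
    # Same shape as the dispatch quoted above; the helpers are illustrative only.
    if exploration_strategy == 'exhaustive':
        return exhaustive_combinations(dimensions)
    elif exploration_strategy in ['core', 'pairwise']:
        return pairwise_combinations(dimensions)
    raise ValueError('unknown exploration strategy: %s' % exploration_strategy)


# Hypothetical dimensions, chosen only for the example.
DIMENSIONS = {
    'file_format': ['text', 'parquet', 'json'],
    'compression': ['none', 'snappy', 'gzip'],
    'batch_size': [0, 1, 16],
}

if __name__ == '__main__':
    print(len(generate(DIMENSIONS, 'exhaustive')))
    print(len(generate(DIMENSIONS, 'core')))
```

In this sketch, a workload class can believe it is "exhaustive" and register more dimension values, yet the set of vectors actually run is still whatever the pairwise branch selects.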
Dictionary iteration order differs between Python 2 and Python 3 (Python 3.7+
preserves insertion order, while Python 2 makes no ordering guarantee). This
changes the order of the test dimensions and therefore which combinations the
pairwise generation picks. So, the Python 3 exhaustive tests are different. This
may expose latent bugs, because some combinations that meet the constraints are
never actually run (e.g. some json encodings don't have the decimal_tiny table).
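The ordering effect can be sketched in isolation. The example below is an assumption-laden illustration, not Impala's generator: the dimension names and values are made up, the two dicts simply stand in for the key order each interpreter might produce, and "take the first four vectors" is a deliberately simple order-sensitive selection standing in for pairwise generation.

```python
import itertools


def enumerate_vectors(dimensions):
    """Enumerate all combinations in the dict's own iteration order.

    Each vector is returned as a sorted tuple of (name, value) items so that
    vectors built from differently ordered dicts can be compared directly.
    """
    names = list(dimensions)
    value_lists = [dimensions[name] for name in names]
    return [tuple(sorted(zip(names, combo)))
            for combo in itertools.product(*value_lists)]


# Same hypothetical dimensions, two different key orders.
ORDER_A = {
    'file_format': ['text', 'json'],
    'table': ['alltypes', 'decimal_tiny'],
    'batch_size': [0, 1],
}
ORDER_B = {
    'batch_size': [0, 1],
    'table': ['alltypes', 'decimal_tiny'],
    'file_format': ['text', 'json'],
}

vectors_a = enumerate_vectors(ORDER_A)
vectors_b = enumerate_vectors(ORDER_B)

# The full set of combinations is identical; only the sequence differs.
same_full_set = set(vectors_a) == set(vectors_b)

# An order-sensitive selection (here: the first four) picks different vectors.
picked_a = set(vectors_a[:4])
picked_b = set(vectors_b[:4])

JSON_DECIMAL_TINY = (('file_format', 'json'), ('table', 'decimal_tiny'))


def picks_json_decimal_tiny(picked):
    """True if any picked vector combines json with decimal_tiny."""
    return any(set(JSON_DECIMAL_TINY) <= set(vector) for vector in picked)


if __name__ == '__main__':
    print(same_full_set)
    print(sorted(picked_a - picked_b))
    print(sorted(picked_b - picked_a))
```

In this sketch the first key order never selects a json vector at all, while the second selects json together with decimal_tiny, which is the kind of combination that could go untested under one interpreter and then fail under the other.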
We can work to make them behave similarly, using pytest's --collect-only option
to look at the differences (and compare them to actual existing runs).
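A rough sketch of that comparison, assuming the collected test ids from each interpreter have been saved as text with one pytest node id per line (as recent pytest versions print with --collect-only -q). The helper names and the sample ids are hypothetical, and how the output is captured through run-tests.py is left out.

```python
def parse_collected_ids(collect_output):
    """Extract pytest node ids from saved --collect-only -q output.

    Lines without '::' (blank lines, the trailing summary) are ignored.
    """
    ids = set()
    for line in collect_output.splitlines():
        line = line.strip()
        if '::' in line:
            ids.add(line)
    return ids


def diff_collections(py2_output, py3_output):
    """Return (only_in_py2, only_in_py3) as sorted lists of node ids."""
    py2_ids = parse_collected_ids(py2_output)
    py3_ids = parse_collected_ids(py3_output)
    return sorted(py2_ids - py3_ids), sorted(py3_ids - py2_ids)


# Hypothetical sample output, for illustration only.
PY2_SAMPLE = """\
tests/example/test_demo.py::TestDemo::test_scan[file_format: text]
tests/example/test_demo.py::TestDemo::test_scan[file_format: parquet]

2 tests collected in 0.10s
"""

PY3_SAMPLE = """\
tests/example/test_demo.py::TestDemo::test_scan[file_format: text]
tests/example/test_demo.py::TestDemo::test_scan[file_format: json]

2 tests collected in 0.12s
"""

if __name__ == '__main__':
    only_py2, only_py3 = diff_collections(PY2_SAMPLE, PY3_SAMPLE)
    for test_id in only_py2:
        print('only in Python 2: ' + test_id)
    for test_id in only_py3:
        print('only in Python 3: ' + test_id)
```

The same two sets could also be compared against the test ids recorded in an existing exhaustive run to see which interpreter's selection matches what runs today.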