[jira] [Created] (IMPALA-12898) Tidy up test matrix of test_scanner.py

Riza Suminto (Jira) Tue, 12 Mar 2024 16:52:03 -0700

Riza Suminto created IMPALA-12898:
-------------------------------------

             Summary: Tidy up test matrix of test_scanner.py
                 Key: IMPALA-12898
                 URL: https://issues.apache.org/jira/browse/IMPALA-12898
             Project: IMPALA
          Issue Type: Test
          Components: Infrastructure
            Reporter: Riza Suminto



Several test classes in test_scanner.py appear to declare test dimensions that
some of their tests ignore.

For example, TestScannersAllTableFormats
{code:python}
class TestScannersAllTableFormats(ImpalaTestSuite):
  BATCH_SIZES = [0, 1, 16]

  @classmethod
  def get_workload(cls):
    return 'functional-query'

  @classmethod
  def add_test_dimensions(cls):
    super(TestScannersAllTableFormats, cls).add_test_dimensions()
    if cls.exploration_strategy() == 'core':
      # The purpose of this test is to get some base coverage of all the file formats.
      # Even in 'core', we'll test each format by using the pairwise strategy.
      cls.ImpalaTestMatrix.add_dimension(cls.create_table_info_dimension('pairwise'))
    cls.ImpalaTestMatrix.add_dimension(
        ImpalaTestDimension('batch_size', *TestScannersAllTableFormats.BATCH_SIZES))
    cls.ImpalaTestMatrix.add_dimension(
        ImpalaTestDimension('debug_action', *DEBUG_ACTION_DIMS))
    cls.ImpalaTestMatrix.add_dimension(ImpalaTestDimension('mt_dop', *MT_DOP_VALUES))

  def test_scanners(self, vector):
    new_vector = deepcopy(vector)
    # Copy over test dimensions to the matching query options.
    new_vector.get_value('exec_option')['batch_size'] = vector.get_value('batch_size')
    new_vector.get_value('exec_option')['debug_action'] = vector.get_value('debug_action')
    new_vector.get_value('exec_option')['mt_dop'] = vector.get_value('mt_dop')
    self.run_test_case('QueryTest/scanners', new_vector)

  def test_many_nulls(self, vector):
    if vector.get_value('table_format').file_format == 'hbase':
      # manynulls table not loaded for HBase
      pytest.skip()
    # Copy over test dimensions to the matching query options.
    new_vector = deepcopy(vector)
    new_vector.get_value('exec_option')['batch_size'] = vector.get_value('batch_size')
    new_vector.get_value('exec_option')['debug_action'] = vector.get_value('debug_action')
    new_vector.get_value('exec_option')['mt_dop'] = vector.get_value('mt_dop')
    self.run_test_case('QueryTest/scanners-many-nulls', new_vector)

  def test_hdfs_scanner_profile(self, vector):
    if vector.get_value('table_format').file_format in ('kudu', 'hbase') or \
       vector.get_value('exec_option')['num_nodes'] != 0:
      pytest.skip()
    self.run_test_case('QueryTest/hdfs_scanner_profile', vector)

  def test_string_escaping(self, vector):
    """Test handling of string escape sequences."""
    if vector.get_value('table_format').file_format == 'rc':
      # IMPALA-7778: RCFile scanner incorrectly ignores escapes for now.
      self.run_test_case('QueryTest/string-escaping-rcfile-bug', vector)
    else:
      self.run_test_case('QueryTest/string-escaping', vector)
{code}
test_scanners and test_many_nulls correctly copy the batch_size, debug_action,
and mt_dop values from the test vector into exec_option. test_hdfs_scanner_profile
and test_string_escaping do not, so they run multiple times unnecessarily even
though those runs never vary their exec_option. This and other test classes in
test_scanner.py can benefit from refactoring and dimension reduction.
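
To make the redundancy concrete, here is a rough standalone sketch (it does not
use the Impala test framework) that counts how many vectors a test receives
versus how many distinct runs it actually performs. Only BATCH_SIZES is taken
from the snippet above; the DEBUG_ACTION_DIMS and MT_DOP_VALUES values, and the
helper names build_vectors and distinct_runs, are placeholders assumed for
illustration.

```python
from itertools import product

# BATCH_SIZES matches the class above. The other two lists are hypothetical
# placeholders; the real DEBUG_ACTION_DIMS and MT_DOP_VALUES may differ.
BATCH_SIZES = [0, 1, 16]
DEBUG_ACTION_DIMS = [None, 'PLACEHOLDER_ACTION']
MT_DOP_VALUES = [0, 4]

DIMENSIONS = {
    'batch_size': BATCH_SIZES,
    'debug_action': DEBUG_ACTION_DIMS,
    'mt_dop': MT_DOP_VALUES,
}


def build_vectors(dimensions):
    """Return the exhaustive cross product of all dimension values."""
    names = list(dimensions)
    return [dict(zip(names, values))
            for values in product(*(dimensions[n] for n in names))]


def distinct_runs(vectors, used_dimensions):
    """Count vectors that differ in at least one dimension the test reads."""
    return len({tuple(v[n] for n in used_dimensions) for v in vectors})


vectors = build_vectors(DIMENSIONS)
total_vectors = len(vectors)

# test_scanners copies all three dimensions into exec_option.
scanners_runs = distinct_runs(vectors, ['batch_size', 'debug_action', 'mt_dop'])

# test_string_escaping reads none of them, so every vector repeats the same run.
string_escaping_runs = distinct_runs(vectors, [])

print(total_vectors, scanners_runs, string_escaping_runs)
```

With these placeholder values, each test is invoked once per vector in the
cross product, but a test that reads none of the three dimensions only ever
performs one distinct run per table format. Moving such tests into a class
that does not add these dimensions would be one way to remove the repeats.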



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
