Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20181 )

Change subject: IMPALA-12275: Read files written with DeflateCodec
......................................................................


Patch Set 4:

(5 comments)

Thanks for the refactoring, just a couple comments around how super() is 
getting called here.

http://gerrit.cloudera.org:8080/#/c/20181/4/tests/query_test/test_hive_codec_interop.py
File tests/query_test/test_hive_codec_interop.py:

http://gerrit.cloudera.org:8080/#/c/20181/4/tests/query_test/test_hive_codec_interop.py@50
PS4, Line 50: ImpalaTestSuite
TestFileCodecInterop?

The original test seems to have been doing something subtle here (calling 
grandparent BaseTestSuite::add_test_dimensions() rather than parent 
ImpalaTestSuite::add_test_dimensions()), and it ends up without all the default 
dimensions. Another way to do that is to directly initialize ImpalaTestMatrix. 
Here is an example from query_test/test_decimal_casting.py:

  @classmethod
  def add_test_dimensions(cls):
    cls.ImpalaTestMatrix = ImpalaTestMatrix()
    cls.ImpalaTestMatrix.add_dimension(ImpalaTestDimension('decimal_type',
        *TestDecimalCasting.DECIMAL_TYPES_MAP[cls.exploration_strategy()]))
    cls.ImpalaTestMatrix.add_dimension(
        ImpalaTestDimension('cast_from', *TestDecimalCasting.CAST_FROM))
    cls.ImpalaTestMatrix.add_dimension(create_exec_option_dimension_from_dict(
        {'decimal_v2': ['false','true']}))

So, this function could just be:
cls.ImpalaTestMatrix = ImpalaTestMatrix()
Then the subclasses can add what dimensions they want (or no dimensions as the 
case may be).

If we do that, I think the add_dimension()/add_constraint() calls in this 
method are unnecessary.


http://gerrit.cloudera.org:8080/#/c/20181/4/tests/query_test/test_hive_codec_interop.py@71
PS4, Line 71:     for codec in codecs:
Nit: (Not required for approval, just a bonus that comes from this not being a 
custom cluster test): When this was a custom cluster test, it made sense to 
test all the codecs in a single test, because it avoids Impala cluster restarts.

Now that we aren't doing Impala restarts, a different option is to make the 
codec a test dimension. That would allow all the codecs to run in parallel.

An example is test_failpoints.py:
https://github.com/apache/impala/blob/master/tests/failure/test_failpoints.py#L78-L79
https://github.com/apache/impala/blob/master/tests/failure/test_failpoints.py#L38-L39


http://gerrit.cloudera.org:8080/#/c/20181/4/tests/query_test/test_hive_codec_interop.py@112
PS4, Line 112: TestFileCodecInterop
TestTextInterop

(We're currently calling on the grandparent, not the parent class)


http://gerrit.cloudera.org:8080/#/c/20181/4/tests/query_test/test_hive_codec_interop.py@126
PS4, Line 126: TestFileCodecInterop
TestSequenceInterop


http://gerrit.cloudera.org:8080/#/c/20181/4/tests/query_test/test_hive_codec_interop.py@129
PS4, Line 129:   @pytest.mark.execute_serially
I think we can remove this



--
To view, visit http://gerrit.cloudera.org:8080/20181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5ec1d0345ae35597f6aade9d8b9eef2257efeba
Gerrit-Change-Number: 20181
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 12 Jul 2023 15:38:55 +0000
Gerrit-HasComments: Yes

Reply via email to