Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19471 )

Change subject: IMPALA-11081: Fix incorrect results in partition key scan
......................................................................


Patch Set 10:

(3 comments)

The fix looks good to me. Just have some suggestions in the test.

http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py
File tests/query_test/test_queries.py:

http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py@347
PS10, Line 347:     db_suffix = file_format + '_' + compression_codec
              :     if file_format == 'parquet' and compression_codec == 'none':
              :         db_suffix = 'parquet'
              :     elif compression_codec == 'none':
              :         db_suffix = file_format + '_def'
I'm not clear why adding '_def' in the suffix. If you want the db_suffix of the 
file format, you can use:

  vector.get_value('table_format').db_suffix()

https://github.com/apache/impala/blob/32536ba25869ea621bdb82bb79a520a115e02cdd/tests/common/test_dimensions.py#L90-L98


http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py@353
PS10, Line 353:     STORED_AS_ARGS = {'text': 'textfile', 'parquet': 'parquet', 
'avro': 'avro',
              :     'seq': 'sequencefile', 'orc': 'orc', 'rc': 'rcfile'}
I also see this in tests/metadata/test_partition_metadata.py. I think it's 
useful for other tests. Let's move this to a common place so it can be shared. 
E.g. move it to tests/common/test_dimensions.py with more items from the 
complete map:

https://github.com/apache/impala/blob/1d05381b7b791bde5820572b3e7a4b2b5db1db73/testdata/bin/generate-schema-statements.py#L196-L207


http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py@358
PS10, Line 358:     source_file = 
get_fs_path("/test-warehouse/alltypes_%s/year=2010/month=12/*"
              :         % (db_suffix))
              :     if file_format == 'orc':
              :         # functional_orc_def.alltypes is a managed table.
              :         source_file = get_fs_path(
              :             
"/test-warehouse/managed/functional_%s.db/alltypes_%s/year=2010/month=12/*"
              :             % (db_suffix, db_suffix))
I think we can get the source table location by the _get_table_location() util 
method:

  src_tbl_name = 'functional' + db_suffix + '.alltypes'
  src_tbl_loc = self._get_table_location(src_tbl_name, vector)
  source_file = src_tbl_loc + '/year=2010/month=12'



--
To view, visit http://gerrit.cloudera.org:8080/19471
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17331ed6c26a747e0509dcbaf427cd52808943b1
Gerrit-Change-Number: 19471
Gerrit-PatchSet: 10
Gerrit-Owner: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Comment-Date: Wed, 08 Feb 2023 08:19:30 +0000
Gerrit-HasComments: Yes

Reply via email to