Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19471 )
Change subject: IMPALA-11081: Fix incorrect results in partition key scan ...................................................................... Patch Set 10: (3 comments) The fix looks good to me. Just have some suggestions in the test. http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py File tests/query_test/test_queries.py: http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py@347 PS10, Line 347: db_suffix = file_format + '_' + compression_codec : if file_format == 'parquet' and compression_codec == 'none': : db_suffix = 'parquet' : elif compression_codec == 'none': : db_suffix = file_format + '_def' I'm not clear why adding '_def' in the suffix. If you want the db_suffix of the file format, you can use: vector.get_value('table_format').db_suffix() https://github.com/apache/impala/blob/32536ba25869ea621bdb82bb79a520a115e02cdd/tests/common/test_dimensions.py#L90-L98 http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py@353 PS10, Line 353: STORED_AS_ARGS = {'text': 'textfile', 'parquet': 'parquet', 'avro': 'avro', : 'seq': 'sequencefile', 'orc': 'orc', 'rc': 'rcfile'} I also see this in tests/metadata/test_partition_metadata.py. I think it's useful for other tests. Let's move this to a common place so it can be shared. E.g. move it to tests/common/test_dimensions.py with more items from the complete map: https://github.com/apache/impala/blob/1d05381b7b791bde5820572b3e7a4b2b5db1db73/testdata/bin/generate-schema-statements.py#L196-L207 http://gerrit.cloudera.org:8080/#/c/19471/10/tests/query_test/test_queries.py@358 PS10, Line 358: source_file = get_fs_path("/test-warehouse/alltypes_%s/year=2010/month=12/*" : % (db_suffix)) : if file_format == 'orc': : # functional_orc_def.alltypes is a managed table. : source_file = get_fs_path( : "/test-warehouse/managed/functional_%s.db/alltypes_%s/year=2010/month=12/*" : % (db_suffix, db_suffix)) I think we can get the source table location by the _get_table_location() util method: src_tbl_name = 'functional' + db_suffix + '.alltypes' src_tbl_loc = self._get_table_location(src_tbl_name, vector) source_file = src_tbl_loc + '/year=2010/month=12' -- To view, visit http://gerrit.cloudera.org:8080/19471 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17331ed6c26a747e0509dcbaf427cd52808943b1 Gerrit-Change-Number: 19471 Gerrit-PatchSet: 10 Gerrit-Owner: Yifan Zhang <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Yifan Zhang <[email protected]> Gerrit-Comment-Date: Wed, 08 Feb 2023 08:19:30 +0000 Gerrit-HasComments: Yes
