Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21864 )

Change subject: IMPALA-13340: Fix missing partitions in COPY TESTCASE of 
LocalCatalog mode
......................................................................


Patch Set 2:

(3 comments)

> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10986/

The test failure is 
https://jenkins.impala.io/job/ubuntu-20.04-dockerised-tests/2317/

metadata/test_testcase_builder.py:59: in test_query_with_tbls
    {"PLANNER_TESTCASE_MODE": True})
common/impala_test_suite.py:891: in wrapper
    return function(*args, **kwargs)
common/impala_test_suite.py:933: in execute_query
    return self.__execute_query(self.client, query, query_options)
common/impala_test_suite.py:1045: in __execute_query
    return impalad_client.execute(query, user=user)
common/impala_connection.py:216: in execute
    fetch_profile_after_close=fetch_profile_after_close)
beeswax/impala_beeswax.py:190: in execute
    handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:381: in __execute_query
    handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:375: in execute_query_async
    handle = self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:553: in __do_rpc
    raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: Query 5c4e2eead8d2994d:134a083400000000 failed:
E   LocalCatalogException: Could not load partitions for table 
test_query_with_tbls_954fd7ae.alltypes
E   CAUSED BY: TException: Invalid response from catalogd for request 
TGetPartialCatalogObjectRequest(protocol_version:V2, 
object_desc:TCatalogObject(type:TABLE, catalog_version:7321, 
table:TTable(db_name:test_query_with_tbls_954fd7ae, tbl_name:alltypes)), 
table_info_selector:TTableInfoSelector(want_hms_table:false, partition_ids:[1, 
2], want_partition_names:false, want_partition_metadata:true, 
want_partition_files:true, want_partition_stats:true, 
want_table_constraints:false, want_hms_partition:false, 
want_iceberg_table:false)): Should not return a partition with missing 
partition meta unless the table is unpartitioned

Resolved in PS2. See comments for the causes.

http://gerrit.cloudera.org:8080/#/c/21864/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/21864/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1075
PS1, Line 1075:     if (id_ == CatalogObjectsConstants.PROTOTYPE_PARTITION_ID
'cachedMsPartitionDescriptor_' is always null if the table is imported from a 
testcase file. The reason is that this info is not cached on the coordinator 
side, so it is always missing when the coordinator creates the testcase file.

However, the partition metadata (i.e. parameters, write_id, 
hdfs_storage_descriptor, location) does exist, so we are able to publish it 
into 'tPart'.
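
To illustrate the idea, here is a minimal sketch and not the actual 
HdfsPartition code. The class PartitionMetaSketch, its fields, and ThriftPart 
are hypothetical stand-ins; only the names 'cachedMsPartitionDescriptor_' and 
'tPart' come from the patch. It assumes the fix amounts to filling the thrift 
struct from the partition's own fields instead of requiring the cached 
descriptor:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names other than tPart are illustrative only.
class PartitionMetaSketch {
  /** Stand-in for the thrift partition struct ('tPart' in the comment). */
  static class ThriftPart {
    String location;
    Map<String, String> parameters;

    boolean hasMeta() { return location != null && parameters != null; }
  }

  // Stand-in for 'cachedMsPartitionDescriptor_'. Assumed to stay null when the
  // table was imported from a testcase file.
  Object cachedDescriptor = null;
  // Metadata the partition still carries after an import.
  String location;
  Map<String, String> parameters = new HashMap<>();

  /** Publishes the metadata into tPart whether or not the descriptor is set. */
  ThriftPart toThrift() {
    ThriftPart tPart = new ThriftPart();
    tPart.location = location;
    tPart.parameters = new HashMap<>(parameters);
    return tPart;
  }

  public static void main(String[] args) {
    PartitionMetaSketch p = new PartitionMetaSketch();
    p.location = "/test-warehouse/alltypes/year=2009/month=1";
    p.parameters.put("numRows", "310");
    System.out.println(p.toThrift().hasMeta());
  }
}
```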


http://gerrit.cloudera.org:8080/#/c/21864/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/21864/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2101
PS1, Line 2101:
When importing a testcase file, we shouldn't reuse the partition ids, since 
they could have been generated by another catalogd instance and thus conflict 
with the existing partition ids.

When this method is used in catalogd (currently only when importing testcase 
files), we should regenerate the partition ids.
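
A rough sketch of the regeneration, under stated assumptions: the class 
PartitionIdRegenSketch, its methods, and the AtomicLong generator are 
hypothetical and only model a process-wide id generator. This is not the 
HdfsTable implementation. The ids carried by the imported file are dropped and 
every imported partition gets a fresh id:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the actual HdfsTable code.
class PartitionIdRegenSketch {
  // Stand-in for a process-wide partition id generator in catalogd (assumed).
  private final AtomicLong nextPartitionId = new AtomicLong(0);
  // Partition name keyed by the id assigned in this process.
  private final Map<Long, String> partitionsById = new HashMap<>();

  /** Registers a partition under a freshly generated id and returns that id. */
  public long addPartition(String name) {
    long id = nextPartitionId.getAndIncrement();
    partitionsById.put(id, name);
    return id;
  }

  /**
   * Imports partitions from a testcase file. The ids in the file are ignored,
   * because they may collide with ids that already exist in this process.
   */
  public List<Long> importPartitions(Map<Long, String> importedById) {
    List<Long> newIds = new ArrayList<>();
    for (String name : importedById.values()) {
      newIds.add(addPartition(name));
    }
    return newIds;
  }

  public int numPartitions() { return partitionsById.size(); }

  public static void main(String[] args) {
    PartitionIdRegenSketch catalog = new PartitionIdRegenSketch();
    catalog.addPartition("year=2009/month=1");
    // The imported file also uses id 0, which is already taken here.
    Map<Long, String> imported = new TreeMap<>();
    imported.put(0L, "year=2010/month=1");
    System.out.println(catalog.importPartitions(imported));
  }
}
```

Reusing the imported ids as map keys here would silently overwrite the 
existing entries, which is the kind of conflict the regeneration avoids.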


http://gerrit.cloudera.org:8080/#/c/21864/1/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java:

http://gerrit.cloudera.org:8080/#/c/21864/1/fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java@565
PS1, Line 565:       b.put(id, new LocalPartitionSpec(this, part, id));
Note that partition ids assigned in LocalFsTable always start at 0. The 
partition ids used in the local catalog cache (CatalogdMetaProvider) are not 
used.

So in the exported testcase files, partition ids of a table always start at 0.
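
A simplified model of that numbering, for illustration only: 
LocalIdAssignmentSketch and assignLocalIds are hypothetical names and do not 
reproduce the LocalFsTable code. It assumes ids are handed out by a per-table 
counter that starts at 0, so two different tables end up with the same id set:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of per-table id assignment starting at 0.
class LocalIdAssignmentSketch {
  /** Assigns ids 0, 1, 2, ... to the given partition names, in list order. */
  public static Map<Long, String> assignLocalIds(List<String> partitionNames) {
    Map<Long, String> byId = new LinkedHashMap<>();
    long id = 0;
    for (String name : partitionNames) {
      byId.put(id++, name);
    }
    return byId;
  }

  public static void main(String[] args) {
    Map<Long, String> t1 = assignLocalIds(Arrays.asList("month=1", "month=2"));
    Map<Long, String> t2 = assignLocalIds(Arrays.asList("day=1", "day=2"));
    // Both tables use the same ids even though the partitions differ.
    System.out.println(t1.keySet().equals(t2.keySet()));
  }
}
```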



--
To view, visit http://gerrit.cloudera.org:8080/21864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icc2e8b71564ad37973ddfca92801afea8e26ff73
Gerrit-Change-Number: 21864
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Mon, 14 Oct 2024 03:36:50 +0000
Gerrit-HasComments: Yes
