Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20490 )
Change subject: IMPALA-12448: Avoid getting stuck when refreshing a non-existent partition ...................................................................... Patch Set 23: (10 comments) Sorry to be late on this. It took me some time to revisit the details. http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3266 PS23, Line 3266: version nit: Let's make this clear to be "catalog version" http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3277 PS23, Line 3277: catalog topic version nit: Let's clear this to be "catalog version" http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3304 PS23, Line 3304: Topic version for {} not found yet. Last sent topic version: {}. nit: Let's clear this to be "Topic update for {} not found yet. Last sent catalog version: {}. " http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3324 PS23, Line 3324: topic version nit: Let's make this clear to be "topic update" http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3328 PS23, Line 3328: topic version nit: "topic update" http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3341 PS23, Line 3341: catalog topic version nit: "catalog version {} to be sent" http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3342 PS23, Line 3342: topic version nit: "topic update" http://gerrit.cloudera.org:8080/#/c/20490/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20490/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6705 PS22, Line 6705: > Catalog topic version is different than the catalog object version. It's th Please ignore this, I have a misunderstanding on it. The version returned by waitForSyncDdlVersion() is actually a catalog version. Here is what we add into the topicUpdateLog_: https://github.com/apache/impala/blob/5af8fef199b60fb7725971b419596a36e48b1eec/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L812C33-L813 topicUpdateLog_.add(key, new TopicUpdateLog.Entry(0, obj.getCatalog_version(), toVersion, 0)); 'toVersion' is a catalog version. Could you rename the variable 'topicVersion' to something like 'syncedCatalogVersion' to avoid confusion? http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6625 PS23, Line 6625: && preTblCatalogVersion < tbl.getCatalogVersion(); Can we use 'wasPartitionRefreshed' directly? waitForSyncDdl &= wasPartitionRefreshed.getRef(); EDIT: Don't need this if we modify waitForSyncDdlVersion(). See comments at L6705. http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6705 PS23, Line 6705: catalog_.INITIAL_CATALOG_VERSION Returning INITIAL_CATALOG_VERSION will let coordinator think its catalog is new enough. It's possible that coordinator still has a stale snapshot of the table that has the missing partition, but catalogd has a newer version of the table that already removed this partition. In such case, catalogd will return INITIAL_CATALOG_VERSION but the coordinator should wait until it also removes the stale partition. We can add a test for this: * start impala cluster with a long catalog topic update interval, e.g. --statestore_update_frequency_ms=10000 * create a table with two partitions. set sync_ddl=true to make sure all impalads see two partitions of the table. * drop one partition with sync_ddl=false so only one impalad has the fresh metadata. The other impalads still see two partitions. * refresh the dropped partition in another impalad with sync_ddl=true. It will finish immediately. * ShowPartitions on the table in the 3rd impalad. It still shows two partitions. test code: @CustomClusterTestSuite.with_args( statestored_args="--statestore_update_frequency_ms=5000") def test_refresh_missing_partition(self, unique_database): client1 = self.cluster.impalads[1].service.create_beeswax_client() client2 = self.cluster.impalads[2].service.create_beeswax_client() self.client.execute('create table {}.tbl (i int) partitioned by (p int)'.format(unique_database)) self.execute_query( 'insert into {}.tbl partition(p) values (0,0), (1,1)'.format(unique_database), query_options={"SYNC_DDL": "true"}) self.execute_query_expect_success( self.client, 'alter table {}.tbl drop partition(p=0)'.format(unique_database), {"SYNC_DDL": "false"}) self.execute_query_expect_success( client1, 'refresh {}.tbl partition(p=0)'.format(unique_database), {"SYNC_DDL": "true"}) show_parts_stmt = 'show partitions {}.tbl'.format(unique_database) res = self.execute_query_expect_success(client2, show_parts_stmt) # First line is the header. Only one partition should be shown so the result has two lines. assert len(res.data) == 2 res = self.execute_query_expect_success(client1, show_parts_stmt) assert len(res.data) == 2 res = self.execute_query_expect_success(self.client, show_parts_stmt) assert len(res.data) == 2 I think we should still wait for the catalog version of the table to be sent unless it has been GCed (compared it with oldestTopicUpdateToGc_ of TopicUpdateLog). If it has been GCed, we should send the current table version. I.e. modify waitForSyncDdlVersion() to add check for whether the version has been GCed. -- To view, visit http://gerrit.cloudera.org:8080/20490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iace7cdadda300b03896f92415822266354421887 Gerrit-Change-Number: 20490 Gerrit-PatchSet: 23 Gerrit-Owner: ttttttz <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: ttttttz <[email protected]> Gerrit-Comment-Date: Mon, 22 Jan 2024 04:40:08 +0000 Gerrit-HasComments: Yes
