Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20490 )

Change subject: IMPALA-12448: Avoid getting stuck when refreshing a 
non-existent partition
......................................................................


Patch Set 23:

(10 comments)

Sorry to be late on this. It took me some time to revisit the details.

http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3266
PS23, Line 3266: version
nit: Let's make this clear to be "catalog version"


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3277
PS23, Line 3277: catalog topic version
nit: Let's clear this to be "catalog version"


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3304
PS23, Line 3304: Topic version for {} not found yet. Last sent topic version: 
{}.
nit: Let's clear this to be "Topic update for {} not found yet. Last sent 
catalog version: {}. "


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3324
PS23, Line 3324: topic version
nit: Let's make this clear to be "topic update"


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3328
PS23, Line 3328: topic version
nit: "topic update"


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3341
PS23, Line 3341: catalog topic version
nit: "catalog version {} to be sent"


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3342
PS23, Line 3342: topic version
nit: "topic update"


http://gerrit.cloudera.org:8080/#/c/20490/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20490/22/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6705
PS22, Line 6705:
> Catalog topic version is different than the catalog object version. It's th
Please ignore this, I have a misunderstanding on it. The version returned by 
waitForSyncDdlVersion() is actually a catalog version.

Here is what we add into the topicUpdateLog_:
https://github.com/apache/impala/blob/5af8fef199b60fb7725971b419596a36e48b1eec/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L812C33-L813

        topicUpdateLog_.add(key,
            new TopicUpdateLog.Entry(0, obj.getCatalog_version(), toVersion, 
0));

'toVersion' is a catalog version.

Could you rename the variable 'topicVersion' to something like 
'syncedCatalogVersion' to avoid confusion?


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6625
PS23, Line 6625:                   && preTblCatalogVersion < 
tbl.getCatalogVersion();
Can we use 'wasPartitionRefreshed' directly?

 waitForSyncDdl &= wasPartitionRefreshed.getRef();

EDIT: Don't need this if we modify waitForSyncDdlVersion(). See comments at 
L6705.


http://gerrit.cloudera.org:8080/#/c/20490/23/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6705
PS23, Line 6705: catalog_.INITIAL_CATALOG_VERSION
Returning INITIAL_CATALOG_VERSION will let coordinator think its catalog is new 
enough. It's possible that coordinator still has a stale snapshot of the table 
that has the missing partition, but catalogd has a newer version of the table 
that already removed this partition. In such case, catalogd will return 
INITIAL_CATALOG_VERSION but the coordinator should wait until it also removes 
the stale partition.

We can add a test for this:

* start impala cluster with a long catalog topic update interval, e.g. 
--statestore_update_frequency_ms=10000
* create a table with two partitions. set sync_ddl=true to make sure all 
impalads see two partitions of the table.
* drop one partition with sync_ddl=false so only one impalad has the fresh 
metadata. The other impalads still see two partitions.
* refresh the dropped partition in another impalad with sync_ddl=true. It will 
finish immediately.
* ShowPartitions on the table in the 3rd impalad. It still shows two partitions.

test code:

  @CustomClusterTestSuite.with_args(
    statestored_args="--statestore_update_frequency_ms=5000")
  def test_refresh_missing_partition(self, unique_database):
    client1 = self.cluster.impalads[1].service.create_beeswax_client()
    client2 = self.cluster.impalads[2].service.create_beeswax_client()

    self.client.execute('create table {}.tbl (i int) partitioned by (p 
int)'.format(unique_database))
    self.execute_query(
        'insert into {}.tbl partition(p) values (0,0), 
(1,1)'.format(unique_database),
        query_options={"SYNC_DDL": "true"})
    self.execute_query_expect_success(
        self.client,
        'alter table {}.tbl drop partition(p=0)'.format(unique_database),
        {"SYNC_DDL": "false"})
    self.execute_query_expect_success(
        client1,
        'refresh {}.tbl partition(p=0)'.format(unique_database),
        {"SYNC_DDL": "true"})
    show_parts_stmt = 'show partitions {}.tbl'.format(unique_database)
    res = self.execute_query_expect_success(client2, show_parts_stmt)
    # First line is the header. Only one partition should be shown so the 
result has two lines.
    assert len(res.data) == 2
    res = self.execute_query_expect_success(client1, show_parts_stmt)
    assert len(res.data) == 2
    res = self.execute_query_expect_success(self.client, show_parts_stmt)
    assert len(res.data) == 2

I think we should still wait for the catalog version of the table to be sent 
unless it has been GCed (compared it with oldestTopicUpdateToGc_ of 
TopicUpdateLog). If it has been GCed, we should send the current table version. 
I.e. modify waitForSyncDdlVersion() to add check for whether the version has 
been GCed.



--
To view, visit http://gerrit.cloudera.org:8080/20490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iace7cdadda300b03896f92415822266354421887
Gerrit-Change-Number: 20490
Gerrit-PatchSet: 23
Gerrit-Owner: ttttttz <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: ttttttz <[email protected]>
Gerrit-Comment-Date: Mon, 22 Jan 2024 04:40:08 +0000
Gerrit-HasComments: Yes

Reply via email to