Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20192 )

Change subject: IMPALA-12267: DMLs/DDLs can hang as a result of catalogd restart
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc@382
PS1, Line 382: 5
> It sounds good.
I think the limit should be independent. It doesn't make sense to wait that 
many valid iterations. We can make it another startup flag if you think 
appropriate.

For the behavior change, it seems IMPALA-12267 is the only case that could 
reach this limit. For other situations, they should exit the loop normally. Or 
did I miss any cases?

One more point to decide is defining a valid iteration. There are several 
fields in catalog_update_info_ to track:

  struct CatalogUpdateVersionInfo {
    /// The last catalog version returned from UpdateCatalog()
    int64_t catalog_version;
    /// The CatalogService ID that this catalog version is from.
    TUniqueId catalog_service_id;
    /// The statestore catalog topic version this update was received in.
    int64_t catalog_topic_version;
    /// Lower bound of catalog object versions after a call to UpdateCatalog()
    int64_t catalog_object_version_lower_bound;
  };

I think we just need to track more on 'catalog_version' since it along with 
'catalog_service_id' define the catalog state in catalogd. Any of them change 
means a real catalog change in catalogd side. The other two fields can be 
ignored:
* If 'catalog_topic_version' changes, it should bring a new 'catalog_version' 
or 'catalog_service_id'.
* 'catalog_object_version_lower_bound' reflects the min catalog version in the 
local catalog. It could change even if we receive an empty statestore update if 
some other DDL results are applied before.


http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc@2269
PS1, Line 2269: pdate_info_.catalog_service_id;
              :       }
              :     }
> So do you think this comment is OK or should I change it?
Sorry that I misread "some but not all restarts" as "some of the catalog 
updates"..

The original comment is ok. Maybe we need to mention the iteration limit as 
well. Such comment helps a lot since this part is subtle.



--
To view, visit http://gerrit.cloudera.org:8080/20192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib71bec8f67f80b0bdfe0a6cc46a16ef624163d8b
Gerrit-Change-Number: 20192
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Thu, 20 Jul 2023 13:56:06 +0000
Gerrit-HasComments: Yes

Reply via email to