Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20192 )

Change subject: IMPALA-12267: DMLs/DDLs can hang as a result of catalogd restart
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc@382
PS1, Line 382: -1
> But as far as I know condition variables may receive spurious wakeups, so i
I see. To overcome spurious wakeups, what about tracking only the wakeups that really change catalog_update_info_?


http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc@2269
PS1, Line 2269: we only got the updates about some but not all restarts
:     // - the update about the catalogd that has 'catalog_service_id' has not
:     //   arrived yet
Is it possible that we get a partial update from the statestore? I think what actually happens is that the latest catalogd (the second restarted one) clears the update from the first restarted catalogd. That update had not been sent out yet when it was cleared, so it is never sent to the coordinators. As a result, the coordinators only get the ID of the latest catalogd and miss the first restarted one.

There are logs indicating this clear when catalogd starts:

  Received request for clearing the entries of topic: catalog-update from: catalog-server@hostname:26000


http://gerrit.cloudera.org:8080/#/c/20192/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/20192/1/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@a179
PS1, Line 179:
I found it helpful to show the old and new IDs when debugging. Can we add them to the message for better observability?

  throw new CatalogException("Detected catalog service ID changes from "
      + TUniqueIdUtil.PrintId(oldId) + " to "
      + TUniqueIdUtil.PrintId(catalog_service_id)
      + ". Aborting updateCatalog()");


--
To view, visit http://gerrit.cloudera.org:8080/20192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib71bec8f67f80b0bdfe0a6cc46a16ef624163d8b
Gerrit-Change-Number: 20192
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Wed, 19 Jul 2023 05:13:41 +0000
Gerrit-HasComments: Yes
