Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/20192 )
Change subject: IMPALA-12267: DMLs/DDLs can hang as a result of catalogd restart ...................................................................... Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc@382 PS1, Line 382: 5 > Ok, now I understand what you mean. Ah, you are right. We still need the timeout in case there are no further catalog changes. Note that statestore still sends empty updates periodically even if there are no catalog changes, which will wake up the waiting threads. Maybe we should avoid calling catalog_version_update_cv_.NotifyAll() for empty updates. I think we need both conditions and exit the loop if any of them matches. The limit on *valid* iterations help the loop to exit fast (on a working cluster), and the timeout guards on corner cases that the cluster have no more catalog updates arrive in time. So we can have a safer timeout like 30 mins. Note that we've seen valid(non-empty) catalog update arrivals longer than 30 mins but the clusters are already in bad state. And the limit on *valid* iterations can be smaller, e.g. 5. If coordinator receives consecutive non-empty catalog updates that have the same catalog service id, that id should be considered stable and we don't need to wait more. I might miss something. What do you think? http://gerrit.cloudera.org:8080/#/c/20192/3/tests/common/impala_test_suite.py File tests/common/impala_test_suite.py: http://gerrit.cloudera.org:8080/#/c/20192/3/tests/common/impala_test_suite.py@477 PS3, Line 477: def get_logs(self): Can we use assert_impalad_log_contains directly? -- To view, visit http://gerrit.cloudera.org:8080/20192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib71bec8f67f80b0bdfe0a6cc46a16ef624163d8b Gerrit-Change-Number: 20192 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Thu, 20 Jul 2023 08:30:49 +0000 Gerrit-HasComments: Yes
