Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20192 )

Change subject: IMPALA-12267: DMLs/DDLs can hang as a result of catalogd restart
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/20192/1/be/src/service/impala-server.cc@382
PS1, Line 382: 5
> Ok, now I understand what you mean.
Ah, you are right. We still need the timeout in case there are no further 
catalog changes. Note that statestore still sends empty updates periodically 
even if there are no catalog changes, which will wake up the waiting threads. 
Maybe we should avoid calling catalog_version_update_cv_.NotifyAll() for empty 
updates.

I think we need both conditions and exit the loop if any of them matches. The 
limit on *valid* iterations help the loop to exit fast (on a working cluster), 
and the timeout guards on corner cases that the cluster have no more catalog 
updates arrive in time. So we can have a safer timeout like 30 mins. Note that 
we've seen valid(non-empty) catalog update arrivals longer than 30 mins but the 
clusters are already in bad state.

And the limit on *valid* iterations can be smaller, e.g. 5. If coordinator 
receives consecutive non-empty catalog updates that have the same catalog 
service id, that id should be considered stable and we don't need to wait more.

I might miss something. What do you think?


http://gerrit.cloudera.org:8080/#/c/20192/3/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/20192/3/tests/common/impala_test_suite.py@477
PS3, Line 477:   def get_logs(self):
Can we use assert_impalad_log_contains directly?



--
To view, visit http://gerrit.cloudera.org:8080/20192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib71bec8f67f80b0bdfe0a6cc46a16ef624163d8b
Gerrit-Change-Number: 20192
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Thu, 20 Jul 2023 08:30:49 +0000
Gerrit-HasComments: Yes

Reply via email to