Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/13061 )
Change subject: IMPALA-7665: Fix unwarranted query cancellation on statestore restart ...................................................................... Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/13061/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/13061/1/be/src/service/impala-server.cc@235 PS1, Line 235: scheduled to run on running on http://gerrit.cloudera.org:8080/#/c/13061/1/be/src/service/impala-server.cc@236 PS1, Line 236: This assumes that it will This value should be large enough to give the statestore... http://gerrit.cloudera.org:8080/#/c/13061/1/be/src/statestore/statestore-subscriber.h File be/src/statestore/statestore-subscriber.h: http://gerrit.cloudera.org:8080/#/c/13061/1/be/src/statestore/statestore-subscriber.h@215 PS1, Line 215: AtomicInt64 last_registration_ts_{0}; I think last_registration_ms_ would also convey that it's a timestamp, but have the benefit of including the unit. http://gerrit.cloudera.org:8080/#/c/13061/1/tests/custom_cluster/test_restart_services.py File tests/custom_cluster/test_restart_services.py: http://gerrit.cloudera.org:8080/#/c/13061/1/tests/custom_cluster/test_restart_services.py@103 PS1, Line 103: impalaD nit: backend or impalad (lowercase d) http://gerrit.cloudera.org:8080/#/c/13061/1/tests/custom_cluster/test_restart_services.py@112 PS1, Line 112: wait_for_admission_control Does this make sure that the query is actually running, or is there a small race still? Should we use wait_for_state below instead? http://gerrit.cloudera.org:8080/#/c/13061/1/tests/custom_cluster/test_restart_services.py@122 PS1, Line 122: self.cluster.get_different_impalad(impalad).kill() : start_time = time() : self.cluster.statestored.start() I think it would be interesting to test the case where the statestore comes back up, but within the cancellation grace period an impalad dies. That could be further distinguished in killing it a) before or b) after it has re-registered itself with the statestored. Another thing would be to kill the statestore again after 20 seconds. Would this reset the grace period correctly? -- To view, visit http://gerrit.cloudera.org:8080/13061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30b68bd8bde4bf589d58d42d6f683afb166de959 Gerrit-Change-Number: 13061 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 24 Apr 2019 21:59:36 +0000 Gerrit-HasComments: Yes
