Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/17999 )
Change subject: IMPALA-11006: Impalad crashes during query cancel tests ...................................................................... Patch Set 1: (2 comments) Thanks a lot for the careful review! http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1070 PS1, Line 1070: else > What is query_status_ in case of cancellation? Shouldn't it be in an error CTAS is processed async with a difference before the RPC patch. Main steps involved: 1. COMPILE 2. CREATE TABLE 3. POPULATE TABLE Before: 1 and 2 in main thread, 3 in async_exec_thread After: 2 in main thread, 2 and 3 in async_exec_thread GetCoordinator() returns TRUE iff coord_exec_called_ is true which is set only inside ClientRequestState::FinishExecQueryOrDmlRequest(), which is called only if there is no interruption (like cancel)) and step 3 completes successfully. In the core dump case, cancel is received from the client and the above method is never called. So GetCoordinator() is FALSE. My limited understanding of the code is that the GetCoordinator() is available (and to get results from it) only if there is no interruption. We therefore have to check it being FALSE case here. http://gerrit.cloudera.org:8080/#/c/17999/1/be/src/service/client-request-state.cc@1076 PS1, Line 1076: MarkInactive > I think that MarkInactive() is not needed in case of cancellation, as we ha I instrumented the code to observe when MarkActive() and MarkInactive() for the relevant this object and found the following. MarkActive(Exec()): this=0xe62e800, updated ref_count_=1 MarkActive(Finalize()): this=0xe62e800, updated ref_count_=2 MarkInactive(WaitInternal() for ELSE branch): this=0xe62e800, updated ref_count_=1 So if MarkInactive() is not called here, the ref count will be 2. Per code calling convention, EXEC() and WAIT() (which calls WaitInternal() are paired so ideally it is a good idea to decrement ref_count here. But I am Okay with remove MarkInactive() as the reference count is not back to 0 anyway. -- To view, visit http://gerrit.cloudera.org:8080/17999 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia49411f8525734b8d463d9ffbfbce705b90a8d73 Gerrit-Change-Number: 17999 Gerrit-PatchSet: 1 Gerrit-Owner: Qifan Chen <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Comment-Date: Mon, 08 Nov 2021 15:23:18 +0000 Gerrit-HasComments: Yes
