[Impala-ASF-CR] IMPALA-4037,IMPALA-4038: fix locking during query cancellation
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4037,IMPALA-4038: fix locking during query cancellation
..

IMPALA-4037,IMPALA-4038: fix locking during query cancellation

* Refactor the child query handling out of QueryExecState and clarify
  locking rules.
* Avoid holding QueryExecState::lock_ while calling
  Coordinator::Cancel() or ChildQuery::Cancel(), which can both do RPCs
  or acquire ImpalaServer::query_exec_state_map_lock_.
* Fix a potential race between QueryExecState::Exec() and
  QueryExecState::Cancel() where the cancelling thread did an unlocked
  read of the 'coordinator_' field and may not have cancelled the
  coordinator.

Testing:
Ran exhaustive build, ran local stress test for a bit.

Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
Reviewed-on: http://gerrit.cloudera.org:8080/4163
Reviewed-by: Tim Armstrong
Tested-by: Internal Jenkins
---
M be/src/runtime/coordinator.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M tests/query_test/test_cancellation.py
7 files changed, 226 insertions(+), 103 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/4163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong
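
Note on the locking pattern: the second and third bullets of the commit message describe two related rules. Take the lock to read or publish 'coordinator_', and release it before calling a Cancel() method that may do RPCs or take other locks. The following is a minimal sketch of that pattern and is not the code from the patch. SketchCoordinator, SketchQueryState, is_cancelled_ and coordinator_cancelled() are hypothetical names chosen for illustration, and the use of std::shared_ptr and a cancellation flag is an assumption about one way to close the race.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a coordinator whose Cancel() may block
// (for example on RPCs), so callers should not hold locks around it.
class SketchCoordinator {
 public:
  void Cancel() { cancelled_.store(true); }
  bool cancelled() const { return cancelled_.load(); }

 private:
  std::atomic<bool> cancelled_{false};
};

// Hypothetical stand-in for the per-query state object.
class SketchQueryState {
 public:
  // Creates the coordinator and publishes it under lock_. If a
  // cancellation already happened, cancels the new coordinator itself.
  // Assumed to be called at most once.
  void Exec() {
    auto coord = std::make_shared<SketchCoordinator>();
    bool cancel_now = false;
    {
      std::lock_guard<std::mutex> l(lock_);
      assert(coordinator_ == nullptr);
      coordinator_ = coord;
      cancel_now = is_cancelled_;
    }
    // lock_ is released here, before the potentially blocking call.
    if (cancel_now) coord->Cancel();
  }

  // Records the cancellation and copies the coordinator pointer under
  // lock_, then calls Cancel() on the copy after releasing lock_.
  void Cancel() {
    std::shared_ptr<SketchCoordinator> coord;
    {
      std::lock_guard<std::mutex> l(lock_);
      is_cancelled_ = true;
      coord = coordinator_;
    }
    if (coord != nullptr) coord->Cancel();
  }

  // True if a coordinator exists and has been cancelled.
  bool coordinator_cancelled() {
    std::lock_guard<std::mutex> l(lock_);
    return coordinator_ != nullptr && coordinator_->cancelled();
  }

 private:
  std::mutex lock_;
  bool is_cancelled_ = false;                       // guarded by lock_
  std::shared_ptr<SketchCoordinator> coordinator_;  // guarded by lock_
};
```

In this sketch, whichever of Exec() and Cancel() takes lock_ second observes the other's write, so the coordinator is cancelled in either interleaving, and no lock is held while SketchCoordinator::Cancel() runs.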
[Impala-ASF-CR] IMPALA-4037,IMPALA-4038: fix locking during query cancellation
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4037,IMPALA-4038: fix locking during query cancellation
..

Patch Set 11: Verified+1

--
To view, visit http://gerrit.cloudera.org:8080/4163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4037,IMPALA-4038: fix locking during query cancellation
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4037,IMPALA-4038: fix locking during query cancellation
..

Patch Set 11: Code-Review+2

Rebase, carry +2

--
To view, visit http://gerrit.cloudera.org:8080/4163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4037,IMPALA-4038: fix locking during query cancellation
Hello Henry Robinson, Dan Hecht,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/4163

to look at the new patch set (#10).

Change subject: IMPALA-4037,IMPALA-4038: fix locking during query cancellation
..

IMPALA-4037,IMPALA-4038: fix locking during query cancellation

* Refactor the child query handling out of QueryExecState and clarify
  locking rules.
* Avoid holding QueryExecState::lock_ while calling
  Coordinator::Cancel() or ChildQuery::Cancel(), which can both do RPCs
  or acquire ImpalaServer::query_exec_state_map_lock_.
* Fix a potential race between QueryExecState::Exec() and
  QueryExecState::Cancel() where the cancelling thread did an unlocked
  read of the 'coordinator_' field and may not have cancelled the
  coordinator.

Testing:
Ran exhaustive build, ran local stress test for a bit.

Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
---
M be/src/runtime/coordinator.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M tests/query_test/test_cancellation.py
7 files changed, 226 insertions(+), 103 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/4163/10

--
To view, visit http://gerrit.cloudera.org:8080/4163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4037,IMPALA-4038: fix locking during query cancellation
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4037,IMPALA-4038: fix locking during query cancellation
..

Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4163/6/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS6, Line 246: during
> what does this mean? while making RPCs? while servicing RPCs? while making
- Done


http://gerrit.cloudera.org:8080/#/c/4163/6/be/src/service/child-query.cc
File be/src/service/child-query.cc:

Line 220:   return child_queries_thread_.get();
> what does holding the lock here prevent? (related to my question about allo
I was generally just trying to keep the locking rules simple and
conservative (i.e. grab the lock when touching the state). I removed the
locking and added justifications for why it's safe. I'm not concerned
about the correctness here, but it may get tricky if we add any
additional functionality.


http://gerrit.cloudera.org:8080/#/c/4163/6/be/src/service/child-query.h
File be/src/service/child-query.h:

PS6, Line 161: by joining
             :   /// 'child_queries_thread_'.
> you should explain this without talking about private members.
Just deleted that clause.


Line 167:   Status WaitForAll(std::vector* completed_queries);
> can this be called by concurrently with ExecAsync() or is that not allowed?
Nope, documented it.


PS6, Line 170: and the execution thread has exited
> what does this really mean to the caller? that it can destroy the object? s
I mentioned this in the class comment but will also mention it here and
in WaitForAll.


--
To view, visit http://gerrit.cloudera.org:8080/4163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes
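
Note on the interface contract: the comments on child-query.h ask what callers may rely on, namely whether WaitForAll() may run concurrently with ExecAsync() (it may not) and what "the execution thread has exited" means for destroying the object. The sketch below shows one way such a contract could be written down in terms of observable behaviour rather than private members. It is not the ChildQueryExecutor from the patch. SketchChildExecutor, its use of std::function tasks and its int return value are hypothetical and chosen only to keep the example self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical executor that runs tasks on one background thread.
class SketchChildExecutor {
 public:
  // Stops any remaining work and waits for the background thread, so
  // destroying the object is always safe in this sketch.
  ~SketchChildExecutor() {
    Cancel();
    WaitForAll();
  }

  // Starts running 'tasks' in order on a background thread.
  // Must be called at most once, and never concurrently with
  // WaitForAll().
  void ExecAsync(std::vector<std::function<void()>> tasks) {
    assert(!thread_.joinable());
    thread_ = std::thread([this, tasks = std::move(tasks)]() {
      for (const auto& task : tasks) {
        if (cancelled_.load()) break;  // skip tasks not yet started
        task();
        ++completed_;
      }
    });
  }

  // Blocks until all started work has finished and returns the number
  // of tasks that ran to completion. Once this returns, no background
  // work refers to this object any more.
  int WaitForAll() {
    if (thread_.joinable()) thread_.join();
    return completed_.load();
  }

  // Requests that tasks not yet started are skipped. Callable from any
  // thread; takes no locks and does not block.
  void Cancel() { cancelled_.store(true); }

 private:
  std::atomic<bool> cancelled_{false};
  std::atomic<int> completed_{0};
  std::thread thread_;  // touched only by ExecAsync() and WaitForAll()
};
```

Because the contract forbids calling ExecAsync() and WaitForAll() concurrently, thread_ needs no lock here; the atomics cover the state that Cancel() shares with the background thread.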
[Impala-ASF-CR] IMPALA-4037,IMPALA-4038: fix locking during query cancellation
Hello Henry Robinson,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/4163

to look at the new patch set (#7).

Change subject: IMPALA-4037,IMPALA-4038: fix locking during query cancellation
..

IMPALA-4037,IMPALA-4038: fix locking during query cancellation

* Refactor the child query handling out of QueryExecState and clarify
  locking rules.
* Avoid holding QueryExecState::lock_ while calling
  Coordinator::Cancel() or ChildQuery::Cancel(), which can both do RPCs
  or acquire ImpalaServer::query_exec_state_map_lock_.
* Fix a potential race between QueryExecState::Exec() and
  QueryExecState::Cancel() where the cancelling thread did an unlocked
  read of the 'coordinator_' field and may not have cancelled the
  coordinator.

Testing:
Ran exhaustive build, ran local stress test for a bit.

Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
---
M be/src/runtime/coordinator.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/impala-server.cc
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M tests/query_test/test_cancellation.py
7 files changed, 233 insertions(+), 103 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/4163/7

--
To view, visit http://gerrit.cloudera.org:8080/4163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe3024803e03595ee69c47759b58e8443d7bd167
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Tim Armstrong