Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/10440 )
Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state ...................................................................... Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h@351 PS1, Line 351: Cannot finalize execution until exec RPCs are no longer : /// being sent. > We added a DCHECK, but this comment suggests that this method will ensure t Yeah, I can see how the comment is confusing. Updated the comments. (Ultimately, in a future change I likely will move the wait() but want to hold off on that to "prove" that the wait is only needed for the UpdateBackendExecStatus() call path) http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@530 PS1, Line 530: // TODO: IMPALA-7011 is this needed? This will also happen on the "backend" side of : // cancel rpc. And in the case of EOS, sink already knows this. > I think we can update this comment, since we figured out that we do need it Done. (Though I haven't given up on this quite yet -- I think it's needed given the current plan-root-sink code, but it might be possible to fix that code to make it not needed. But the JIRA can track that work). http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@695 PS1, Line 695: exec_rpcs_complete_barrier_->Wait(); > should we be concerned about adding a possible long wait() call on an RPC c as mentioned in this thread, it will be no worse than before this change. In the old code, we hold the Coordinator::lock_ while sending the exec RPCs. Additionally, we take the Coordinator::lock_ inside UpdateStatus() and also at line 688 -- so this RPC handler will always block until the exec rpcs are done sending in the old code. In the new code, the blocking only happens when the query hits an error. I agree it's not ideal, but we're better off. And I plan to improve things with IMPALA-6788. -- To view, visit http://gerrit.cloudera.org:8080/10440 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c Gerrit-Change-Number: 10440 Gerrit-PatchSet: 1 Gerrit-Owner: Dan Hecht <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Thu, 17 May 2018 22:00:30 +0000 Gerrit-HasComments: Yes
