Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10158 )
Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify state ...................................................................... Patch Set 14: (33 comments) I think this makes sense. I really need to do another pass to verify my understanding. I have a lot of comments but they're mainly cleanup opportunities and about comments. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@23 PS14, Line 23: #include <boost/accumulators/accumulators.hpp> Not your change but these accumulators don't look like they're used. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@33 PS14, Line 33: #include <boost/unordered_set.hpp> Not used? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@37 PS14, Line 37: #include "common/hdfs.h" Move to .cc? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@42 PS14, Line 42: #include "runtime/query-state.h" I think you could move this to the .cc - QueryState is already forward-declared. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@43 PS14, Line 43: #include "scheduling/query-schedule.h" Could forward-declare QuerySchedule and move this to the .cc. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@87 PS14, Line 87: /// 1. When an error is encountered while still executing, backend cancellation is Can we define what "executing" means? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@99 PS14, Line 99: /// Should we document the lock order? Coordinator::lock_ is mentioned in the ImpalaServer header but there are multiple locks in here. Or is it sufficient to document that individual locks are terminal? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@114 PS14, Line 114: extra spaces in this comment http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@130 PS14, Line 130: /// Cancel execution of query and sets the overall query status to CANCELLED if the Long line http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@191 PS14, Line 191: accesses accessing. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@214 PS14, Line 214: boost::mutex wait_lock_; Switch to SpinLock? The inconsistency might be confusing otherwise. Probably doesn't matter that much... http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@218 PS14, Line 218: /// Keeps track of number of completed ranges and total scan ranges. Initialized by Exec(). Long line. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@219 PS14, Line 219: ProgressUpdater progress_; I don't feel that strongly but it seems like we could PIMPL this and avoid including the header above. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@291 PS14, Line 291: ExecState exec_state_ = ExecState::EXECUTING; It's a little confusing that this is EXECUTING before Exec() is called. There's a distinction made above about being in an "Executing" state, which is distinct from this being EXECUTING. Maybe there should be a different initial state to make that part of the state machine explicit? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@325 PS14, Line 325: /// If 'exec_state_' != EXECUTING, does nothing. Otherwise sets 'exec_state_' to Maybe a brief high-level description would be helpful. E.g. "Called when the query is done executing in a non-error state." http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@326 PS14, Line 326: /// 'state' (must be either CANCELLED or RETURNED_RESULTS), Formatting of this comment seems weird. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@333 PS14, Line 333: /// Transitions 'exec_state_' given an execution status: It's a little hard to decode why someone would call this with an OK status based on the comment - at least until you get all the ways to the last line of the comment. I'm not sure that I have a concrete suggestion since the comment is already quite long, but maybe you have an idea. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@351 PS14, Line 351: extra space http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@360 PS14, Line 360: void void argument list not needed in C++ - () is equivalent to (void) unlike C. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.h@406 PS14, Line 406: extra space http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@54 PS14, Line 54: using namespace strings; strings "using" may not be needed http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@61 PS14, Line 61: using std::unique_ptr; unique_ptr "using" may not be needed http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@63 PS14, Line 63: DECLARE_int32(be_port); be_port not used. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@79 PS14, Line 79: teriminal terminal http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@449 PS14, Line 449: if (exec_state_ != ExecState::EXECUTING) return exec_status_; Worth defining an "is_terminal_state()" helper? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@477 PS14, Line 477: DCHECK(exec_status_.ok()); Maybe log the status if this DCHECK fails? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@481 PS14, Line 481: DCHECK(exec_status_.IsCancelled()); Maybe log the status if this DCHECK fails? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@486 PS14, Line 486: DCHECK(!exec_status_.ok()); Maybe log the status if this DCHECK fails? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@517 PS14, Line 517: // Once we enter a terminal state, we stay there. That guarantees this code runs only once. Long line. http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@526 PS14, Line 526: temrinal terminal http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@540 PS14, Line 540: // TODO: should move this off of the query execution path? Is there a JIRA for this? http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@623 PS14, Line 623: TOOD "TODO". It's also unclear what this TODO depends on http://gerrit.cloudera.org:8080/#/c/10158/14/be/src/runtime/coordinator.cc@735 PS14, Line 735: for (BackendState* state: backend_states_) { nit: one line -- To view, visit http://gerrit.cloudera.org:8080/10158 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1abdfd02163f9356c59d470fe1c64ebe012a9e8e Gerrit-Change-Number: 10158 Gerrit-PatchSet: 14 Gerrit-Owner: Dan Hecht <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Tue, 08 May 2018 20:31:52 +0000 Gerrit-HasComments: Yes
