Marcel Kornacker has posted comments on this change. Change subject: IMPALA-2550: Switch to per-query exec rpc ......................................................................
Patch Set 10: (72 comments) http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: PS10, Line 327: // if we get an error while trying to get a connection to the backend, : // keep going : > what is this talking about? removed PS10, Line 352: keep on cancelling the other fragments > seems misplaced now Done PS10, Line 354: eturn true; > should this return false? we didn't actually succeed at sending the rpc, ri it doesn't really matter, the return value is only used for a log message. the return value is true because it attempted to cancel. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: PS10, Line 61: status > GetStatus()? Done http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS10, Line 178: close > i'm not sure what this is talking about. is it referring to CloseConsumer() removed comment, it was a leftover PS10, Line 179: it should : // cancel itself when it receives an error status after reporting its profile. > I also don't see what this is trying to explain. Done PS10, Line 183: Grab executor > what is this referring to? Done PS10, Line 184: order to get a reference to coord_instance_ : // so that coord_sink_ remains valid throughout query lifetime. > how does getting a reference to coord_instance_ affect the validity of coor Done PS10, Line 189: at this point, the query is done with the Prepare phase, > why do we know that? Oh, because GetFinstanceState() blocks until Prepare? added comment PS10, Line 204: GetPrepareStatus > this is a dangerous side-effect to have inside of a DCHECK -- it creates a the barrier isn't really present, because getfinstancestate already waits for all instances' prepare phase to finish, so it can't block here. the first sentence of this comment kind of explains that. Line 263: } > DCHECK(!has_coord_fragment || coord_backend_idx_ != -1) ? Done PS10, Line 413: backend_state->GetStatus(); > what is this trying to do? Oh, I guess get the status of BackendState::Exec good catch, this could be cancelled at this point. switching to per-backend status reporting will fix that. PS10, Line 424: DCHECK(query_status_.ok()); // nobody should have been able to cancel > why? because the query handle hasn't yet been returned? But what about via when does the query get displayed on the debug page? you still haven't returned from exec() at this point. PS10, Line 964: #if 0 : do we really want this? > what are you going to do with this? remove it unless someone says we want it. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: PS10, Line 210: query > query execution? (as opposed to the query being closed) Done PS10, Line 299: /// True if execution has completed, false otherwise. : bool execution_completed_ = false; > doesn't appear to be used Done http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/descriptors.cc File be/src/runtime/descriptors.cc: PS10, Line 547: ReleaseResources > why is this called ReleaseResources and not Close? What are we saying the d we haven't settled on anything yet. should we stipulate that close = release query result-related resources but keep control structures, and rename releaseresources to close everywhere? fine by me. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: PS10, Line 287: /// TODO: Move these into the new query-wide state, indexed by partition id. > Remove. Done http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: > Like QES, I think the lifecycle state machine could be more clearly defined Done PS10, Line 95: if (!status.ok() && !status.IsCancelled() && !status.IsMemLimitExceeded()) { : // Log error message in addition to returning in Status. Queries that do not fetch : // results (e.g. insert) may not receive the message directly and can only retrieve : // the log. : runtime_state_->LogError(status.msg()); : } > you should delete this when rebasing e0d1db5 Done PS10, Line 104: // TODO: deal with final report failure, otherwise the coordinator doesn't know : // we're done > is that a regression or something that didn't work before as well? i can probably remove this, if the final report fails, other communication may have failed as well PS10, Line 213: // TODO: why is this here? FragmentComplete() already releases the thread token : //ReleaseThreadToken(); > from the comment above, it sounds like it's meant to intentionally release don't know, i reinstated it. PS10, Line 315: // TODO: do not delete mem trackers in Close()/ReleaseResources(), they are part of : // the control structures we need to preserve until the underlying QueryState : // disappears. > is this todo talking about the code here or the code that deletes the mem t this is referring to runtimestate::releaseresources PS10, Line 320: ReleaseResources > not for this change, but again we seem to be mixing terminology close and r Done PS10, Line 331: DCHECK_LE(other_time, total_time); > IMPALA-4631 is going to hit again. what do you plan to do about that? i don Done http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: PS10, Line 52: /// Tear-down happens automatically at the end of Exec() and frees all memory allocated : /// for this fragment instance and closes all data streams. > Is this referring to Close()? So in FIS case, Close() releases resources? Done PS10, Line 77: The remaining non-getter public functions block until the Prepare phase finishes. > this comment is kinda buried. maybe better to note that at the class level, Done PS10, Line 84: GetPrepareStatus > like QES, the blocking side-effect isn't obvious from the function name. I do you mean client-request-state (formerly query-exec-state)? renamed. PS10, Line 91: GetOpenStatus > same Done Line 121: RuntimeState* runtime_state_; // lives in obj_pool() > it'd be nice to point out the lifetimes. i.e. valid after Prepare, if succe Done PS10, Line 149: > by Prepare() left comment PS10, Line 153: should live in obj_pool() > doesn't the second clause invalidate this. i.e. this shouldn't live in obj_ i'd love to get away from unique_/scoped_ptrs altogether. we should have a rowbatch::releaseresources that does what ~rowbatch currently does. the empty rowbatch can then sit around until the containing object pool gets destroyed PS10, Line 159: . > ... or to the Prepare status if Prepare() failed. Done PS10, Line 224: Blocks until Prepare() finishes. > what does that mean? if Prepare() is run synchronously w.r.t. the caller of not sure how that ended up there. removed. PS10, Line 227: Can handle : /// partially-finished Prepare(). Blocks until Prepare() finishes. > that seems to contradict. i.e. if it blocks, then Prepare() must be finishe Done http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: Line 111: //VLOG_QUERY << "AddChildTracker(): parent=" << this << " child=" << tracker << " " << tracker->label_ << "\n" << GetStackTrace(); > more of these to remove Done http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-exec-mgr.cc File be/src/runtime/query-exec-mgr.cc: PS10, Line 51: Prepare > why doesn't this happen on the other thread? won't codegen happen here even i think we should do some initial setup here so we can return an error status early. heavy lifting should happen in startfinstances. Line 57: // handler thread (= the caller of this function) > let's mention that ownership of the qs reference count is passed to the Sta Done PS10, Line 59: exec-fragment-instance-$0 > I don't understand the choice of name for this thread and the ones started i left everything as is, because i didn't want to break existing tests. happy to clean that up, though. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: Line 74: /// 'created' is set to true if it was created, false otherwise. > it'd be nice to document that this increments the refcount. Done http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: PS10, Line 195: // TODO: this might flood the log : LOG(WARNING) << "Couldn't get a client for " << query_ctx().coord_address : <<"\tReason: " << coord_status.GetDetail(); > how do we plan to address that? what used to happen in this case? address what, flooding the log? we basically used to drop it on the floor. PS10, Line 208: status.SetTStatus(¶ms); > why does 'status' get set into the params.status if it's specific to the fi clarified with comment. PS10, Line 255: // TODO: should we try to keep rpc_status for the final report? (but the final : // report, following this Cancel(), may not succeed anyway.) > what used to happen in this case? did we used to send the rpc_status back i if the report rpc fails we fail the query on this node and then do nothing/wait for the coordinator to find out about that some other way (e.g., a sending instance gets an error on transmitdata and aborts). this looks like a flaw in the protocol. i'll leave a todo. PS10, Line 306: exec-query-finstances-$0 > as mentioned in query-exec-mgr, seems this should be not plural: "exec-quer Done PS10, Line 311: don't return until every instance is prepared > why is that important? we're already running on an async thread, so why sho it's the declared behavior as pointed out in the function comment, and it's a superset of what qem needs (needs to know that at least one instance thread started so it can release the qs refcount. Line 319: prepare_status = instance_status; > It might simplify reasoning about lifetimes if we set some rules on when in fis.h points out that all public non-getter functions block until the prepare phase finishes, is that not enough? PS10, Line 323: returned status > what is this referring to? Done PS10, Line 342: CancelInternal(); > I guess this can happen before all instances have gotten through their Prep yes, there was a race. i restructured it. Line 343: ExecEnv::GetInstance()->query_exec_mgr()->ReleaseQueryState(this); > would be nice to comment that this reference was taken in StartFInstances() Done PS10, Line 353: if (!is_cancelled_.CompareAndSwap(0, 1)) return; > cancellation is a bit more difficult to reason about now that CancelInterna that would mean that every single instance will call cancel for all other instances, which can be really tedious if you need to debug something and have log output in that path. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-state.h File be/src/runtime/query-state.h: > I think defining a state machine for the lifecycle of this (and FIS) would Done PS10, Line 54: all query state (contained : /// either in this class or accessible through this class, such as the : /// FragmentInstanceStates) is guaranteed to be alive. > That's not the long term plan though, right? i.e. resources may be freed ev clarified to mean control structures. PS10, Line 59: cancelled > I think what it means to be "cancelled" is a bit vague, given the asynchron vague in what way? and lifetime guarantee for what? PS10, Line 106: Uses few cycles > where will we eventually put codegen? renamed. codegen will happen in startfinstances. i'll clarify in the comment. PS10, Line 112: Blocks until all fragment instances have finished : /// their Prepare phase. > why is that important given that this runs on an async thread itself? it's the externally observable behavior, and qem::startqueryhelper takes advantage of it (by releasing its reference after this call; if it weren't guaranteed that at least one instance had started up by that time, it would be a race). PS10, Line 119: GetPrepareStatus > This name is confusing given that it doesn't get the status for QES Prepare Done PS10, Line 132: query is complete > what does that mean? when all fragment instances are complete, or when all i agree the guarantees around resource release and lifetime need to be completely revised and clarified. i'd prefer not to add this to this change. in this case, i simply retained the existing code from fis and planfragmentexecutor. Line 146: /// including its profile. > Add: Not legal to call until after Prepare is complete. Done PS10, Line 159: const TQueryCtx query_ctx_; > why is this split out from rpc_params_? oh, I guess for the coordinator, h Done Line 174: DescriptorTbl* desc_tbl_ = nullptr; > For all these fields that are set outside the ctor, how does an outsider kn clarified in the public section Line 177: MemTracker* query_mem_tracker_; > would be nice to group the fields based on their lifetime, i.e. move this a Done PS10, Line 184: prepared_promise_ > when reading the code it's hard to remember that this is a barrier for all Done PS10, Line 194: ) > Created in Prepare(). Done Line 199: /// Temporary files for this query (owned by obj_pool_) > created in Prepare(). Done Line 203: > For all these fields that are created after the ctor, how does an outsider Done PS10, Line 222: latter > former? Done PS10, Line 228: Cancels > Initiates cancellation ... it's not, good catch. there's a deadlock in reportexecstatus when called from startfinstances. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/runtime-state.cc File be/src/runtime/runtime-state.cc: PS10, Line 335: DCHECK(query_state_ != nullptr); > why? (was this meant to go in query_mem_tracker()?) removed. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS10, Line 99: Return the query's ObjectPool > i guess that means only control structures, not resources, should be put in it has to be. the runtimestate lives as long as the query state (even currently). and, looking for calls to 'obj_pool()', i didn't see anything like 'delete state->obj_pool()' in the code. http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/service/impala-internal-service.cc File be/src/service/impala-internal-service.cc: PS10, Line 44: // > are you removing these or putting them back? Done http://gerrit.cloudera.org:8080/#/c/6535/10/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: PS10, Line 339: TExecQueryFragmentsParams > ? Done PS10, Line 398: s > delete 's' Done -- To view, visit http://gerrit.cloudera.org:8080/6535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
