Henry Robinson has posted comments on this change.

Change subject: PREVIEW: IMPALA-2550 Introduce query-wide execution context.
......................................................................


Patch Set 4:

(30 comments)

This is my first pass, and so doesn't capture all the style, comment etc 
issues. I'm first focusing on whether the ScopedQueryExecStateRef works as 
intended, and whether the lookup-by-fragment ID is required. 

I'm not yet convinced that this is coming out more cleanly than a shared_ptr 
with a custom deleter (which is effectively what we are doing here).

http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 418: // TODO: Should we create a QueryExecState for the coordinator 
fragment?
yes, otherwise you wind up with special-case logic to handle the paths where 
the QES is null. 

Why does this QES not go in the local QEM?


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/query-exec-state.cc
File be/src/runtime/query-exec-state.cc:

PS4, Line 71: QueryExecMgr::ScopedQueryExecStateRef s(query_exec_mgr, 
fragment_instance_id);
This is confusing: presumably this exists to ensure that *this has a lifetime 
at least as long as this method. But if there's a danger that *this will be 
destroyed during this method, that could happen before this line ever gets 
executed. So this won't have a reference to anything (and will actually crash).

Why not have a fragment exec state add a reference to the QES that is 
decremented when the FES completes? Then the reference is guaranteed to be > 0 
after line 49 above, and you don't need to take the reference here.


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 1: // Copyright 2016 Cloudera Inc.
Fix license headers here and elsewhere.


Line 36: /// The lifetime of this class is maintained by QueryExecMgr.
No need to talk about what owners of this class might do with it.


PS4, Line 41: NULL
nit: use nullptr now.


PS4, Line 84: std::shared_ptr
Why shared?


PS4, Line 101: boost
std


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 274:   QueryExecState* query_exec_state_; // not owned
comment what this is.


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/service/client-request-exec-state.cc
File be/src/service/client-request-exec-state.cc:

Line 14: #include "service/request-exec-state.h"
add new line before this one


Line 570:       "query-exec-state", "wait-thread", 
&ImpalaServer::ClientRequestExecState::Wait, this));
long line


Line 677: void 
ImpalaServer::ClientRequestExecState::UpdateNonErrorQueryState(QueryState::type 
query_state) {
long line


Line 829: Status ImpalaServer::ClientRequestExecState::GetRowValue(TupleRow* 
row, vector<void*>* result,
long line


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

PS4, Line 37: bind<void>(mem_fn(&QueryExecMgr::DestroyQueryExecState), this, 
_1, _2)
avoid bind, prefer lambdas


PS4, Line 48:  const TUniqueId& query_id = exec_params.query_ctx.query_id;
            :   // Preparing and opening the fragment creates a thread and 
consumes a non-trivial
            :   // amount of memory. If we are already starved for memory, 
cancel the fragment as
            :   // early as possible to avoid digging the hole deeper.
            :   MemTracker* process_mem_tracker = 
ExecEnv::GetInstance()->process_mem_tracker();
            :   if (process_mem_tracker->LimitExceeded()) {
            :     string msg = Substitute("Instance $0 of plan fragment $1 of 
query $2 could not "
            :         "start because the backend Impala daemon is over its 
memory limit",
            :         
PrintId(exec_params.fragment_instance_ctx.fragment_instance_id),
            :         exec_params.fragment_ctx.fragment.display_name,
            :         PrintId(query_id));
            :     return process_mem_tracker->MemLimitExceeded(NULL, msg, 0);
            :   }
this logic looks like it should live inside RegisterFragmentInstance


PS4, Line 65: // Create or retrieve the QueryExecState for this query id.
            :   QueryExecState* query_exec_state = 
FindOrInsertQueryExecState(exec_params.query_ctx);
            :   DCHECK(query_exec_state != NULL);
            : 
            :   // This is a no-op if another fragment of the same query 
already called Init().
            :   query_exec_state->Init();
Why not change the interface to the following:

  // Creates a QES if one doesn't exist.
  qes = QEM->GetQueryExecState(query_id);
  qes->RegisterFragment(fragment_id);

Then the Init() is only ever done in one place (GetQES()), and you directly 
interact with the QES itself.


PS4, Line 103: i->second->num_current_references_++;
++


PS4, Line 107: void QueryExecMgr::ReleaseQueryExecState(QueryExecState* 
query_exec_state) {
             :   lock_guard<SpinLock> l(query_exec_state_map_lock_);
             :   query_exec_state->num_current_references_--;
             :   // If this thread was the last to use it, schedule it for 
destruction.
             :   if (query_exec_state->num_current_references_ == 0) {
             :     destroy_thread_.Offer(query_exec_state);
             :   }
             : }
A QES will never be destroyed unless at least one reference is ever taken to 
it. That works in this patch, because a fragment will always take at least one 
reference, but it's a bit subtle. Can you think of a way to make the guarantee 
of at-least-one-ref more explicit?


PS4, Line 133: std
remove std::


PS4, Line 137: query_exec_state_map_[query_id].get()
nit: you're doing the lookup twice. insert returns a pair <iterator, bool> - 
just do:

  i = qesm_.insert(...).first;

on line 134 and on this line:

  return i->second.get();


PS4, Line 142: params.fragment_instance_id
The right thing to do is to augment the rpc params with the query ID, so you 
don't have to maintain the fragment->query map.


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 1: // Copyright 2014 Cloudera Inc.
Replace header with ASF one.


PS4, Line 29: fragments
fragment


PS4, Line 40: /// TODO: Remove Thrift args from methods where it would improve 
readability;
            : /// ImpalaInternalService can take care of translation to / from 
Thrift, as it already
            : /// does for RegisterFragmentInstanceWithQuery()'s return value.
Can you do that in this patch? (Already noticed one instance below). Getting 
the interface right is critical for this patch.


PS4, Line 61: void CancelPlanFragment(TCancelPlanFragmentResult& return_val,
            :       const TCancelPlanFragmentParams& params);
This shouldn't have RPC-style signature (since this isn't a proxy class): 
return a Status, have explicit parameters.


PS4, Line 80: GetQueryExecStateFromFragmentId
When do we know the fragment instance id, but not the query id?


Line 91:   class ScopedQueryExecStateRef {
Comment?


PS4, Line 94: const TUniqueId& fragment_id
This should take a query ID. You should try to get rid of the fragment->query 
ID map and only access the QEM->QES->FES hierarchy from the top down (if I have 
a process I can get a QEM, if I have a query ID and a QEM can get a QES, if I 
have a QES and a fragment ID I can get an FES).


PS4, Line 98: query_exec_state_ = 
query_exec_mgr_->GetQueryExecStateFromFragmentId(fragment_id);
What prevents this from returning NULL? You need some mechanism to ensure that 
the QES actually exists and has not been torn down.


PS4, Line 117: boost
std


PS4, Line 130: destroy_thread_
not just a single thread then.


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes

Reply via email to