Marcel Kornacker has posted comments on this change.

Change subject: PREVIEW IMPALA-2550: RPC batching
......................................................................


Patch Set 5:

(28 comments)

haven't looked at coordinator.{cc,h} yet

http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 99:   bool fragment_has_reserved_resource = 
fragment_instance_ctx.__isset.reserved_resource;
need to distinguish between fragments and their instances


Line 126:     if (is_first) {
could do this directly in the Fragment/QueryMgr


Line 207:   RETURN_IF_ERROR(DescriptorTbl::Create(obj_pool(), 
query_ctx.desc_tbl, &desc_tbl));
let's avoid this duplication across fragment instances


http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

Line 72:     query_ctx_(query_ctx),
let's not make copies of that stuff, in particular the thrift version of the 
descriptor table (which we don't use during execution anyway).

we need to create a way to manage query-duration state.


http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-exec-state.cc
File be/src/service/fragment-exec-state.cc:

Line 121:       if (rpc_status.IsTryAgain()) {
is that from a rebase?


http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-exec-state.h
File be/src/service/fragment-exec-state.h:

Line 74:   // TODO-MT: Make fragment_ctx_ a const shared_ptr to share it 
between instances.
remove, too speculative


http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-mgr.cc
File be/src/service/fragment-mgr.cc:

Line 51:   std::shared_ptr<const TQueryCtx> query_ctx =
we don't use std:: in .cc


Line 52:       std::make_shared<const TQueryCtx>(exec_params.query_ctx);
are there other pieces of query-wide state that need to stick around until the 
last fragment instance has finished?

we should wrap those into a QueryExecState (or something else; that name is 
already in use).


Line 68:       exec_params.fragment_instance_ctxs[fragment_idx];
formatting


Line 95:       lock_guard<SpinLock> l(fragment_exec_state_map_lock_);
do we still need a per-instance map? i don't think for cancellation


Line 104:     // TODO: manage threads via global thread pool?
no, let's not. please remove


Line 105:     const TUniqueId& fragment_id = exec_state->fragment_instance_id();
naming


Line 111:   return_val.all_instance_start_latency_ms =
never assign to thrift struct fields directly, use __set instead


Line 115:   if(status.IsMemLimitExceeded()) {
formatting


Line 124:       << " fragment instances started";
formatting


Line 131:   std::shared_ptr<FragmentExecState> exec_state =
remove (and elsewhere)


http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-mgr.h
File be/src/service/fragment-mgr.h:

Line 29: /// result of ExecPlanFragments() RPCs that arrive via the internal 
Impala interface.
point out relationship with PlanFragmentExecutor

also, since we now start execution of all fragment instances of a single query 
together (or at least this class sees all of those), this might better be 
rebranded QueryFragmentMgr.


Line 33: /// Fragments are Prepare()'d in that thread, and then Exec() is 
called. At any point a
you need to differentiate carefully now between fragments and their instances. 
in this case, you're talking about the instance.

please fix that everywhere it applies (= not just in this file).


Line 60:   class FragmentExecState;
is this per fragment or per instance?


Line 78:   FragmentExecStateMap;
indentation


http://gerrit.cloudera.org:8080/#/c/3390/5/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 366:   // TODO-MT: This is currently unused but will be needed for 
multi-threading.
why will it be needed?


Line 380:   // required in V1
superfluous comment (the protocol version will always be required, in each  
version of the protocol)


Line 386:   // List of contexts of this RPC's fragments.
not really necessary, simply paraphrases the declaration


Line 390:   // total number of fragment instances, the query context, etc.
also not necessary (we shouldn't repeat the content of TPlanFragmentInstanceCtx 
here anyway)


Line 399:   2: optional i64 all_instance_start_latency_ms
start_latency or startup_latency is fine (if it were a particular fragment we'd 
want to point that out).

why not make that a float? that way we don't have to change anything if the 
latency drops below 1ms (which may happen with plan caching)


Line 640:   TExecPlanFragmentsResult 
ExecPlanFragments(1:TExecPlanFragmentsParams params);
rename to ExecQueryFragments?


Line 649:   TCancelPlanFragmentResult 
CancelPlanFragment(1:TCancelPlanFragmentParams params);
does it still make sense to call this for each fragment instance?


http://gerrit.cloudera.org:8080/#/c/3390/5/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

Line 278:   ("TRY_AGAIN_LATER", 90, ""),
later is superfluous


-- 
To view, visit http://gerrit.cloudera.org:8080/3390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e3a9d4a78b59394aaf343df08f6bfda22c3148e
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to