Marcel Kornacker has posted comments on this change.
Change subject: PREVIEW IMPALA-2550: RPC batching
......................................................................
Patch Set 5:
(28 comments)
haven't looked at coordinator.{cc,h} yet
http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:
Line 99: bool fragment_has_reserved_resource =
fragment_instance_ctx.__isset.reserved_resource;
need to distinguish between fragments and their instances
Line 126: if (is_first) {
could do this directly in the Fragment/QueryMgr
Line 207: RETURN_IF_ERROR(DescriptorTbl::Create(obj_pool(),
query_ctx.desc_tbl, &desc_tbl));
let's avoid this duplication across fragment instances
http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:
Line 72: query_ctx_(query_ctx),
let's not make copies of that stuff, in particular the thrift version of the
descriptor table (which we don't use during execution anyway).
we need to create a way to manage query-duration state.
http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-exec-state.cc
File be/src/service/fragment-exec-state.cc:
Line 121: if (rpc_status.IsTryAgain()) {
is that from a rebase?
http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-exec-state.h
File be/src/service/fragment-exec-state.h:
Line 74: // TODO-MT: Make fragment_ctx_ a const shared_ptr to share it
between instances.
remove, too speculative
http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-mgr.cc
File be/src/service/fragment-mgr.cc:
Line 51: std::shared_ptr<const TQueryCtx> query_ctx =
we don't use std:: in .cc
Line 52: std::make_shared<const TQueryCtx>(exec_params.query_ctx);
are there other pieces of query-wide state that need to stick around until the
last fragment instance has finished?
we should wrap those into a QueryExecState (or something else; that name is
already in use).
Line 68: exec_params.fragment_instance_ctxs[fragment_idx];
formatting
Line 95: lock_guard<SpinLock> l(fragment_exec_state_map_lock_);
do we still need a per-instance map? i don't think for cancellation
Line 104: // TODO: manage threads via global thread pool?
no, let's not. please remove
Line 105: const TUniqueId& fragment_id = exec_state->fragment_instance_id();
naming
Line 111: return_val.all_instance_start_latency_ms =
never assign to thrift struct fields directly, use __set instead
Line 115: if(status.IsMemLimitExceeded()) {
formatting
Line 124: << " fragment instances started";
formatting
Line 131: std::shared_ptr<FragmentExecState> exec_state =
remove (and elsewhere)
http://gerrit.cloudera.org:8080/#/c/3390/5/be/src/service/fragment-mgr.h
File be/src/service/fragment-mgr.h:
Line 29: /// result of ExecPlanFragments() RPCs that arrive via the internal
Impala interface.
point out relationship with PlanFragmentExecutor
also, since we now start execution of all fragment instances of a single query
together (or at least this class sees all of those), this might better be
rebranded QueryFragmentMgr.
Line 33: /// Fragments are Prepare()'d in that thread, and then Exec() is
called. At any point a
you need to differentiate carefully now between fragments and their instances.
in this case, you're talking about the instance.
please fix that everywhere it applies (= not just in this file).
Line 60: class FragmentExecState;
is this per fragment or per instance?
Line 78: FragmentExecStateMap;
indentation
http://gerrit.cloudera.org:8080/#/c/3390/5/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:
Line 366: // TODO-MT: This is currently unused but will be needed for
multi-threading.
why will it be needed?
Line 380: // required in V1
superfluous comment (the protocol version will always be required, in each
version of the protocol)
Line 386: // List of contexts of this RPC's fragments.
not really necessary, simply paraphrases the declaration
Line 390: // total number of fragment instances, the query context, etc.
also not necessary (we shouldn't repeat the content of TPlanFragmentInstanceCtx
here anyway)
Line 399: 2: optional i64 all_instance_start_latency_ms
start_latency or startup_latency is fine (if it were a particular fragment we'd
want to point that out).
why not make that a float? that way we don't have to change anything if the
latency drops below 1ms (which may happen with plan caching)
Line 640: TExecPlanFragmentsResult
ExecPlanFragments(1:TExecPlanFragmentsParams params);
rename to ExecQueryFragments?
Line 649: TCancelPlanFragmentResult
CancelPlanFragment(1:TCancelPlanFragmentParams params);
does it still make sense to call this for each fragment instance?
http://gerrit.cloudera.org:8080/#/c/3390/5/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:
Line 278: ("TRY_AGAIN_LATER", 90, ""),
later is superfluous
--
To view, visit http://gerrit.cloudera.org:8080/3390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I8e3a9d4a78b59394aaf343df08f6bfda22c3148e
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes