Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
......................................................................


Patch Set 9:

(23 comments)

Did a review pass while I don't have access to my desktop.

http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS9, Line 293: UpdateExecStats
The similarity between UpdateExecStats() and UpdateExecStatus() is potentially 
confusing. Spell out Statistics() maybe?


Line 455:     // TODO: we don't track cpu time per node now. Do that.
More generally we should have an aggregate profile for each backend that 
includes things like this (query memory consumption, I/O, other aggregates). We 
don't need to do this now but it will be helpful


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

Line 132:     const FInstanceExecParams& exec_params_;
Might be helpful to document what owns the object being referenced.


Line 164:   std::vector<const FInstanceExecParams*> instance_params_list_;
Might be helpful to document what owns the objects being referenced.


Line 175:   /// Protects fields below. Can be held while doing an RPC, so 
SpinLock is a bad idea.
Our current SpinLock implementation should be fine (it blocks after spinning 
for a short period). Mutex should also be fine since this won't be heavily 
contented - I don't feel strongly.


Line 219:   /// TODO: including the median doesn't compile, looks like some 
includes are missing
This TODO has been in the code since 2012 - maybe just remove it?


Line 253:   /// Root profile for all fragment instances for this fragment
Stored in obj_pool?


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 364:           f->targets()->push_back(target);
Maybe:

  ->emplace_back(t_target, fragment_params.fragment.idx);


Line 402:   int num_instances = schedule_.GetNumFragmentInstances();
nit: intermediate variable seems unnecessary.


Line 410:   // TODO: rename this metric
There are a handful of minor TODOs like this. I assume there's some reason not 
to do them in this patch, but it's not clear when/if it makes sense to address 
them. Do we have a JIRA or anything to track them?


Line 938:   // Notify that we completed with an error.
This comment seems wrong - the cancellation path is used to clean up even if 
there isn't an error. Maybe just remove the comment?


Line 956:   // TODO: return here if returned_all_results_? 
I believe we need to do cancellation even if we returned all results. Having 
returned all results from the coordinator fragment doesn't imply that all 
fragments have returned all their results.

The cancellation path is convoluted so I may be misunderstanding something 
though.

Henry had a related patch in review: https://gerrit.cloudera.org/#/c/5987


Line 983:   // TODO: clarify control flow here, it's unclear we should even 
process this status
Henry had a patch that clarified this and improved the comment: 
https://gerrit.cloudera.org/#/c/5987/1/be/src/runtime/coordinator.cc


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 108: /// TODO: clean up locking behavior; in particular, clarify 
dependency on lock_
Does any of the locking information in the impala-server.h need to be updated?


Line 261:   boost::mutex wait_lock_;
A lot of these locks could safely be SpinLocks - unless they're used with a 
condition variable. Now that we have a blocking SpinLock it's better in most 
cases until you need a CV or strong fairness guarantees.


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 507: 
Extra line


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 162:   /// True if and only if Cancel() has been called.
I had a hard time reasoning about concurrent accesses to this until I realised 
that it didn't actually influence query execution. Document that it's only used 
for DCHECKs? Or just remove it?


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

Line 111:   //VLOG_QUERY << "AddChildTracker(): parent=" << this << " child=" 
<< tracker << " " << tracker->label_ << "\n" << GetStackTrace();
I assume this will be removed.


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

Line 77:   /// Execute instances and decrement refcount.
I find it clearer to understand as: "Acquires ownership of 'qs'".


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 43: #define RETRY_SLEEP_MS 100
Make a class-level constant?


Line 311:   // don't return until every instance is prepared and record the 
first non-OK status
Comment isn't really accurate - it doesn't record the first non-OK status 
because of how it handles CANCELLED.


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 257:   void set_is_cancelled(bool v) { is_cancelled_ = v; }
Not your change but related. This bool is accessed from multiple threads yet we 
don't do any synchronisation. Make it an Atomic32 so that we actually have some 
guarantees here?


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 96:   VLOG_QUERY << "FeSupport::NativeEvalExprs: " << texprs.size() << " 
exprs";
Could get very noisy depending on how many constant exprs there are in the 
query and how many partitions predicates on partition keys get evaluated 
against.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to