Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11615 )

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
......................................................................


Patch Set 1:

(11 comments)

Had a few comments. Overall direction of the patch makes sense to me

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/coordinator-backend-state.cc@277
PS1, Line 277:   lock_guard<SpinLock> l1(exec_summary->lock);
Isn't this just reverting the change that the previous patch made?


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h@93
PS1, Line 93:   /// Called periodically to get the current status of this 
fragment instance.
Since we're describing the usage pattern, might as well mention which thread is 
calling it.


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h@159
PS1, Line 159:   bool final_report_sent_ = false;
What's the thread safety story here?


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@97
PS1, Line 97: intance
instance


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@239
PS1, Line 239:
nit: unnecessary blank line


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@243
PS1, Line 243: void 
FragmentInstanceState::GetStatusReport(FragmentInstanceExecStatusPB* 
instance_status,
nit: could probably reduce vertical whitespace in this function.


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-exec-mgr.cc@128
PS1, Line 128: void QueryExecMgr::StartQueryHelper(QueryState* qs) {
I feel like we should rename this function since it now runs until the query 
finishes on this backend, e.g. ExecuteQueryHelper()


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@91
PS1, Line 91: /// the final report (it will not be overridden by the resulting 
cancellation).
Is there anything short we can say here to clarify the relationship with the 
overall state on the coordinator?


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@418
PS1, Line 418:   bool IsStatusSet() const {
Maybe something like "HasErrorStatus()". It's confusing that setting 
overall_status_ to cancelled counts as "not set".


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@419
PS1, Line 419:     return !overall_status_.ok() && 
!overall_status_.IsCancelled();
This is called without holding status_lock_, right? I don't think IsCancelled() 
is thread-safe? Since we could could overwrite msg_ with an error status from a 
different thread and free the previous msg_ object.

I don't really like this pattern of doing unlocked reads to Status objects 
given that the interface of status is not documented as thread-safe - I get 
that .ok() works since Status is implemented as a pointer and we're on x86 but 
I think we need to clarify the interfaces - feels like it wouldn't take much 
for someone to invalidate one of the assumptions made by accident.

I wonder if the optimisation of deferring acquisition of the lock really 
matters. If we think it's necessary, it seems like we should document which 
Status methods are threadsafe, and change the implementation to use atomics so 
that the memory ordering assumptions are explicit.


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc@236
PS1, Line 236:     if (overall_status_.IsCancelled()) {
Can we refine this with the internal/client-driven cancellation distinction 
that was added here https://gerrit.cloudera.org/#/c/11464/



--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Oct 2018 22:56:38 +0000
Gerrit-HasComments: Yes

Reply via email to