Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11615 )

Change subject: IMPALA-4063: Merge report of query fragment instances per 
executor
......................................................................


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h@93
PS1, Line 93:   /// Called periodically to get the current status of this 
fragment instance.
> Since we're describing the usage pattern, might as well mention which threa
Done


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.h@159
PS1, Line 159:   bool final_report_sent_ = false;
> What's the thread safety story here?
Added some comments. This is exclusively accessed by the query state thread 
only.


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@97
PS1, Line 97: intance
> instance
Done


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@239
PS1, Line 239:
> nit: unnecessary blank line
Done


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/fragment-instance-state.cc@243
PS1, Line 243: void 
FragmentInstanceState::GetStatusReport(FragmentInstanceExecStatusPB* 
instance_status,
> nit: could probably reduce vertical whitespace in this function.
Done


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-exec-mgr.cc@128
PS1, Line 128: void QueryExecMgr::StartQueryHelper(QueryState* qs) {
> I feel like we should rename this function since it now runs until the quer
Done


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@353
PS1, Line 353:   /// 'overall_status_' will be set once this is unblocked and 
so will 'failed_instance_id_'
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@418
PS1, Line 418:   bool IsStatusSet() const {
> Maybe something like "HasErrorStatus()". It's confusing that setting overal
Done


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.h@419
PS1, Line 419:     return !overall_status_.ok() && 
!overall_status_.IsCancelled();
> This is called without holding status_lock_, right? I don't think IsCancell
Good point. I think the optimization doesn't really matter that much given it's 
only exercised in the error path. It's okay to have a minor bit of lock 
contention if multiple fragment instances hit errors at the same time.


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc@a553
PS1, Line 553:
             :
The deletion of this line warrants some explanation:

In the old code, if there is any error returned from fis->Exec(), the final 
report for "fis" would have been sent already so the coordinator will already 
record the failure of "fis" before the cancellation is issued.

In the new code, the reporting is done periodically by the query state thread 
so the final report may or may not be sent at this point after "fis" hit an 
error above. Calling Cancel() here may actually cause the coordinator fragment 
(if there is one) to prematurely set the backend status at the coordinator, 
causing Coordinator::BackendState::ApplyExecStatusReport() to ignore subsequent 
updates from "fis".  As a result, the actual error status was masked.

The implicit contract in the code (even before this change) is that the final 
report of the first fragment instance to hit an (non-cancellation) error must 
have been sent before initiating the cancellation. Will add a comment here.


http://gerrit.cloudera.org:8080/#/c/11615/1/be/src/runtime/query-state.cc@236
PS1, Line 236:     if (overall_status_.IsCancelled()) {
> Can we refine this with the internal/client-driven cancellation distinction
Sorry, I don't quite get this comment. In either cases, won't the state still 
transit to BackendExecState::CANCELLED ?



--
To view, visit http://gerrit.cloudera.org:8080/11615
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Gerrit-Change-Number: 11615
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Wed, 10 Oct 2018 08:11:17 +0000
Gerrit-HasComments: Yes

Reply via email to