Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8758 )

Change subject: IMPALA-6190/6246: Add instances tab and event sequence
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/runtime/coordinator-backend-state.cc@512
PS2, Line 512:   // We send 'done' explicitly so we don't have to infer it by 
comparison with a string
             :   // constant in the debug page JS code.
             :   value->AddMember("done", done_, document->GetAllocator());
             :
             :   Value state_val(
             :       
FragmentInstanceState::GetFInstanceStateDescription(current_state_)
> is 'done' completely redundant with 'current_state', or can it not always b
There's a subtle difference: 'done' is set in the final report after the 
reporting thread has been stopped. Between the final state of 'current_state' 
and reporting 'done', we release the thread token and stop the reporting 
thread. With the state machine I added in the next PS the state should always 
be FINISHED when a FI ends in error though.

With abort_on_error=1, the coordinator will not accept status updates once the 
query has been cancelled. I couldn't come up with a query that runs for a while 
before aborting an instance and thus the resulting profiles I looked at all had 
empty instance sub-profiles.

Sending 'done' explicitly also makes handling in the debug page's javascript 
code easier since we don't have to perform a string comparison with a constant 
to determine whether the fragment is done.

I left a comment, but I'm happy to remove it if you feel strongly about it and 
instead replace it with a comparison in JS.


http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/runtime/fragment-instance-state.h@169
PS2, Line 169:   /// 'current_state_'. Events are issued throughout the 
execution by calling
> Should document why this is atomic -- which threads read and which write it
Done


http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/runtime/fragment-instance-state.cc@296
PS2, Line 296:   
SCOPED_THREAD_COUNTER_MEASUREMENT(runtime_state_->total_thread_statistics());
             :   bool exec_tree_complete = false;
             :   do {
             :     Status status;
             :     row_batch_->Reset();
             :     {
             :       UpdateState(StateEvent::WAITING_FOR_BATCH);
             :       SCOPED_TIMER(plan_exec_timer);
             :       RETURN_IF_ERROR(
             :           exec_tree_->GetNext(runtime_state_, row_batch_.get(), 
&exec_tree_complete));
             :       UpdateState(StateEvent::BATCH_RECEIVED);
             :     }
             :     if (VLOG_ROW_IS_ON) 
row_batch_->VLogRows("FragmentInstanceState::ExecInternal()");
             :     COUNTER_ADD(rows_produced_counter_, row_batch_->num_rows());
             :     RETURN_IF_ERROR(sink_->Send(runtime_state_, 
row_batch_.get()));
             :     UpdateState(StateEvent::BATCH_SENT);
             :   } while (!exec_tree_complete);
             :
             :   UpdateState(StateEvent::END_OF_STREAM);
             :   // Flush the sink *before* stopping the report thread. Flush 
may need to add some
             :   // important information to the last report that gets sent. 
(e.g. table sinks record the
             :   // files they have written to in this method)
             :   RETURN_IF_ERROR(sink_->FlushFinal(runtime_state()));
             :   return Status::OK();
             : }
             :
             : void FragmentInstanceState::Close() {
             :   DCHECK(!report_thread_active_);
             :   DCHECK(runtime_state_ != nullptr);
> To be clear, the goal is to remove all the if-stmts and state variables fro
Done


http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/runtime/fragment-instance-state.cc@443
PS2, Line 443:    next_state = TFInstanceState::RECEIVING_DATA;
             :     } else {
> this would be easier to understand if you explicitly list out all cases. th
Done


http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/8758/2/be/src/util/runtime-profile.cc@310
PS2, Line 310:       }
> what's the summary for why this change is needed?
There was a bug, I shouldn't have removed L296. In general we need to merge the 
event sequence to get the instance events into the query profile - something we 
just had not done so far.


http://gerrit.cloudera.org:8080/#/c/8758/2/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/8758/2/common/thrift/ImpalaInternalService.thrift@639
PS2, Line 639: }
> given this interacts with IMPALA-2990 work, we may want to split this chang
I had talked to Sailesh about this change before pushing it and we agreed to 
keep it in a single commit for now. I'm also happy to pull it apart if we think 
it'll make things a lot easier.



--
To view, visit http://gerrit.cloudera.org:8080/8758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I626456b6afa9101eeeeffd5cda10c4096d63d7f9
Gerrit-Change-Number: 8758
Gerrit-PatchSet: 3
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Comment-Date: Fri, 08 Dec 2017 00:39:42 +0000
Gerrit-HasComments: Yes

Reply via email to