Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8758 )

Change subject: IMPALA-6190/6246: Add instances tab and event sequence
......................................................................


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/coordinator.h@187
PS4, Line 187: instances
instances' stats


http://gerrit.cloudera.org:8080/#/c/8758/3/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/8758/3/be/src/runtime/fragment-instance-state.cc@241
PS3, Line 241: UpdateState(StateEvent::CODEGEN_END);
> Yes, these overlap. I thought it'd still be nice to show it in the event se
Please see comments in Open(). This state doesn't seem necessary anymore if you 
take the recommendation there.


http://gerrit.cloudera.org:8080/#/c/8758/3/be/src/runtime/fragment-instance-state.cc@302
PS3, Line 302:       UpdateState(StateEvent::WAITING_FOR_BATCH);
             :       SCOPED_TIMER(plan_exec_timer);
             :       RETURN_IF_ERROR(
             :           exec_tree_->GetNext(runtime_state_, row_batch_.get(), 
&exec_tree_complete));
             :       UpdateState(StateEvent::BATCH_RECEIVED);
             :     }
             :     if (VLOG_ROW_IS_ON) 
row_batch_->VLogRows("FragmentInstanceState::ExecInternal()");
             :     COUNTER_ADD(rows_produced_counter_, row_batch_->num_rows());
             :     RETURN_IF_ERROR(sink_->Send(runtime_state_, 
row_batch_.get()));
             :     UpdateState(StateEvent::BATCH_SENT);
> I think it's helpful to track "first batch received/sent" separately since
I think it's definitely worth doing a perf run to make sure things don't 
regress. I suppose it should be mostly okay if it's done once per row batch. 
That said, it's a bit unfortunate that we have to do call UpdateState() in the 
loop. I wonder if there is a chance that the compiler will be smart enough to 
constant fold the argument (e.g. StateEvent::WAITING_FOR_BATCH) into the 
function and avoid the switch at runtime.


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@63
PS4, Line 63: static const string INSTANCE_EVENT_TIMELINE_NAME =
            :     "Fragment Instance Lifecycle Event Timeline";
            : static const string CODEGEN_EVENT_NAME = "Codegen Finished";
            : static const string PREPARE_EVENT_NAME = "Prepare Finished";
            : static const string OPEN_EVENT_NAME = "Open Finished";
            : static const string FIRST_BATCH_RECEIVED_EVENT_NAME = "First 
Batch Received";
            : static const string FIRST_BATCH_SENT_EVENT_NAME = "First Batch 
Sent";
            : static const string EXEC_INTERNAL_EVENT_NAME = "ExecInternal 
Finished";
Please provide comments about these names and how they may correlate with the 
current state of the state machine.


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@73
PS4, Line 73: FINSTANCE_STATE_LABELS[]
Is there any static assert to ensure that number of entries in this array match 
the declaration in the thrift definition ?


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@79
PS4, Line 79: received
Will "produced" be more accurate as there is usually some extra processing on 
top of the raw data received from either data source or another fragment ?


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@81
PS4, Line 81: Receiving Data
"Producing row batches" seems more consistent with the rest of the state names.


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@283
PS4, Line 283: RETURN_IF_ERROR(codegen->FinalizeModule());
This is actually the majority of time codegen spent on. May want to extend the 
tracking the StateEvent::CODEGEN to this point and start StateEvent::OPEN_START 
below. In that way, you don't need StateEvent::CODEGEN_END either.


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@302
PS4, Line 302: UpdateState(StateEvent::WAITING_FOR_BATCH);
Can this be moved out of the loop to line 297 and simplify the state machine to 
jump directly to RECEIVING_DATA once the first batch has been sent ?


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@306
PS4, Line 306: UpdateState(StateEvent::BATCH_RECEIVED);
Should we not count the time to execute UpdateState() towards "plan_exec_timer" 
?


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@410
PS4, Line 410: void FragmentInstanceState::UpdateState(StateEvent event)
const StateEvent


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@418
PS4, Line 418: DCHECK(
DCHECK_EQ(). Same below.


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@483
PS4, Line 483: default:
nit: wrong indent


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@484
PS4, Line 484:     DCHECK(false) << "Unexpected Event";
Does it help to print "event" too ?


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/runtime/fragment-instance-state.cc@487
PS4, Line 487: current_state_.Store(next_state);
Can there be multiple threads updating the state here ? If so, does the above 
code need to be thread safe too ?

If it's single threaded only, do we still need AtomicEnum ?


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/util/runtime-profile-counters.h@305
PS4, Line 305:     boost::lock_guard<SpinLock> event_lock(lock_);
             :     Event event = make_pair(move(label), sw_.ElapsedTime());
Why the change in order ? Do you mean to count the spin lock wait time towards 
elapsed time too ?


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/util/runtime-profile-counters.h@333
PS4, Line 333: DCHECK(
DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/8758/4/be/src/util/runtime-profile-counters.h@337
PS4, Line 337: events_.back().second
Not sure if the compiler is smart enough to hoist this out of the loop as it 
may assume that events_ can be changed during the loop iteration.



--
To view, visit http://gerrit.cloudera.org:8080/8758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I626456b6afa9101eeeeffd5cda10c4096d63d7f9
Gerrit-Change-Number: 8758
Gerrit-PatchSet: 4
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Comment-Date: Tue, 12 Dec 2017 01:35:21 +0000
Gerrit-HasComments: Yes

Reply via email to