Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 10:

(52 comments)

http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/fragment-instance-exec-state.h
File be/src/runtime/fragment-instance-exec-state.h:

Line 32: /// Execution state of a single plan fragment.
> instance
Done


PS10, Line 33: FragmentInstanceExecState
> Let's consider FragmentInstanceState and QueryState (if they're not executi
Done


PS10, Line 33: FragmentInstanceExecState
> i think that's a good idea, long names aren't necessarily more readable or 
Done


PS10, Line 39: lambda(
             :           
boost::mem_fn(&FragmentInstanceExecState::ReportStatusCb),
             :               this, _1, _2, _3)
> Try and find a clearer way to write this. If you don't want to write the la
I'm still not very well versed on the C++ lambda  syntax.
My next step is to create a compilable version of this patch. I'll make the 
changes to this in the next patch set so that I get the syntax right and make 
sure that it compiles.
Also, I don't want to hold up reviewers while I figure this out.


PS10, Line 80: ImpalaBackendClientCache* client_cache_;
> Is this necessary given that ExecEnv can give us the same pointer?
No. I've changed it.


PS10, Line 77: QueryExecState* query_exec_state_;  // not owned
             :   TPlanFragmentInstanceCtx fragment_instance_ctx_;
             :   FragmentInstanceExecutor executor_;
             :   ImpalaBackendClientCache* client_cache_;
             :   TExecPlanFragmentParams exec_params_;
> Good opportunity to add some comments for these.
Done


PS10, Line 83: plan fragment
> fragment instance.
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/fragment-instance-executor.cc
File be/src/runtime/fragment-instance-executor.cc:

PS10, Line 80: Close
> Since we are aiming for deterministic, manual management of finst lifetimes
It already does get called explicitly and this call shouldn't be necessary. I 
changed it to a DCHECK.


Line 349:     // We also want to call sink_->Close() here rather than in 
FragmentInstanceExecutor::Close
> long line
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/fragment-instance-executor.h
File be/src/runtime/fragment-instance-executor.h:

Line 48: /// FragmentInstanceExecutor handles all aspects of the execution of a 
single plan fragment,
> long line
Done


PS10, Line 155: ExecEnv* exec_env_;  // not owned
> Can you remove this? ExecEnv::GetInstance() works fine.
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/query-exec-state.cc
File be/src/runtime/query-exec-state.cc:

PS10, Line 60:  
> remove space
Done


PS10, Line 73: 
DCHECK_EQ(fragment_inst_exec_state_map_.find(GetInstanceIdx(fragment_instance_id)),
             :       fragment_inst_exec_state_map_)
> these aren't the same type (does this compile?).
Oops, sorry. Meant to compare it with map_.end(). Changed it.

I haven't gotten this to compile yet, so there may be a few syntax errors that 
I've missed.


PS10, Line 76: FragmentInstanceExecState* fragment_inst_exec_state =
             :       new FragmentInstanceExecState(exec_params, this, 
ExecEnv::GetInstance());
> Presumably this should be managed by obj_pool. If not, the value type of th
Are there any arguments for or against a unique_ptr?

It seems to fit the use here perfectly:
- We would never want to transfer ownership of the FragmentInstanceState 
objects.
- We want its lifetime to be as long as the QueryState.

If there are no objections, I can change it to use a unique_ptr.


PS10, Line 80: his RPC returns
> This isn't an RPC any more. Update comment.
Done


PS10, Line 113: lexical_cast
> PrintId()
Done


PS10, Line 128: lexical_cast
> PrintId()
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 38: /// This class is responsible for creating, running and destroying 
FragmentInstanceExecStates.
> Talk about lifetime here: when is this created, when is it destroyed, what 
Done


PS10, Line 41: // Special constructor for the coordinator.
> TODO: remove.
Done


PS10, Line 59: CleanFragmentInstanceStates
> Don't talk about implementation details (cleaning the map). When would this
Done


Line 61:   /// Registers a new FragmentInstanceExecState and launches the 
thread that calls Prepare() and
> long line.
Done


PS10, Line 90: std::unique_ptr<ObjectPool*> obj_pool_
> Comment broadly what this is used to manage, since that affects lifecycles.
It doesn't manage anything yet in this patch because we decided to bring in the 
shared state across fragments (DescriptorTbl) in a future patch.

The reason is that the serialization/deserialization of the DescriptorTbl needs 
to be modified quite a bit to become a general DescTbl across all fragments.

So should I remove this since it won't be used for now?

Or should I have the FragmentExecStates be managed by the ObjectPool? If so, 
what's the advantage of that?


PS10, Line 90: *
> do you mean unique_ptr<ObjectPool> ?
Oops, yes. Changed it.


PS10, Line 90: Owned
> redundant wrt unique_ptr
Done


PS10, Line 94: active
> expand. What does it mean for a fragment to be active?
Done


PS10, Line 105: shared_ptr
> Update comment.
Done


PS10, Line 109: FragmentInstanceExecState
> Are the exec states managed by the obj pool? If so, say so otherwise the li
No they currently aren't because the lifetime of the FragmentInstanceState 
seems very clear, i.e. it's tied to the lifetime of the QueryState.

Is there any advantage of adding it to the object pool?


PS10, Line 113: shared pointer 
> update comment
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/service/client-request-exec-state.cc
File be/src/service/client-request-exec-state.cc:

Line 17: #include "service/client-request-exec-state.h"
> add a blank line before this one
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/service/client-request-exec-state.h
File be/src/service/client-request-exec-state.h:

PS10, Line 48: This is responsible for creating the
             : /// coordinator object
> Seems unnecessary.
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/service/impala-internal-service.h
File be/src/service/impala-internal-service.h:

PS10, Line 48: fragment_mgr_->CancelPlanFragment(params.query_id,
             :         params.fragment_instance_id);
> Copy the one-line idiom from other RPCs, and set the TStatus directly witho
Done


PS10, Line 74: Ignoring returned status here as failing to publish a filter is 
not fatal.
> It would be better to add a TStatus to the return struct, and have the call
Done


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

PS10, Line 40: bind<void>(mem_fn(
> please avoid in favour of lambda expressions, to reduce dependency on boost
Likewise, I'll make this change in the next patchset which is compilable.


PS10, Line 74: fragment_inst_exec_state->set_exec_thread(new 
Thread("query-exec-state",
             :       Substitute("exec-plan-fragment-$0", PrintId(fragment_id)),
             :           &QueryExecMgr::ExecInstance, this, 
fragment_inst_exec_state));
> Why is the QEM running the thread, and not the QES? If it's to ensure a ref
We had a discussion about this yesterday. I've changed it like this for the 
following reason:
Previously, the ExecInstance() thread was run by the QES. So, the QEM used to 
reserve refcounts (increase) and the QES used to decrement the refcounts in 
ExecInstance() by calling ReleaseQES(*this).

This means that the refcount is increased in one level(QEM) and decreased in 
another level(QES).

Keeping it this way means that the QEM does both the increase/decrease of the 
refcounts.

This will be shortlived as there are plans to include a QueryExecutor class 
(similar to PlanFragmentExecutor, but for a query), that will take care of 
query execution and this logic will be moved there instead.


Line 123:   DCHECK(i->second->num_current_references_ != 0);
> DCHECK_GT
Done


Line 142:       query_exec_state_map_.find(query_id);
> one line?
Done


PS10, Line 146: std
> remove std::
This is not a unique_ptr anymore. Removed it.


PS10, Line 148: ++(i->second->num_current_references_);
> Who returns this reference? The caller of FindOrCreate...() can't tell if i
The caller doesn't need to know that. Why would it?


PS10, Line 159: lexical_cast
> PrintId
Done


PS10, Line 167: this
> QESGuard() takes a QEM*. Did this compile?
No, this was a mistake. I've removed this argument anyway.


http://gerrit.cloudera.org:8080/#/c/3817/10/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

PS10, Line 31: shared
> 'Shared' suggests ownership. Better is to describe that the QES manages, or
Done


Line 45: /// does for RegisterFragmentInstanceWithQuery()'s return value.
> I think after the batching change the QEM shouldn't ever know about fragmen
Done


PS10, Line 46: QueryExecMgr
> please instrument this class with more metrics as needed (e.g. number of ru
Done


PS10, Line 76: Status CancelPlanFragment(const TUniqueId& query_id,
             :       const TUniqueId& fragment_instance_id);
> Remind me why we have this shorthand for:
I'm sorry I don't follow the question. Could you clarify?


PS10, Line 84: c'tor
> this comment is describing a class, not a c'tor. Not sure what you're tryin
I meant class object. Changed it.


PS10, Line 87: nothing
> imprecise - talk in terms of invariants on methods, e.g. query_exec_state()
Done


PS10, Line 90: /// This class makes sure that the caller cannot directly access 
the QueryExecState
             :   /// and allowing access to the QES only through the APIs 
provided from this class.
             :   /// This is so that the caller doesn't accidentaly try and 
transfer ownership of the
             :   /// QES.
             :   /// TODO: Is this really necessary? The downside is that this 
class needs to duplicate
             :   ///       function signatures for every function the QES wants 
to expose.
> Isn't this comment no longer true given QESGuard::query_exec_state() ?
Yes, I forgot to update the comment. Removed it.


PS10, Line 99: QueryExecMgr
> why is this an argument? Surely QEM is a process-wide singleton, and we can
I've made the ExecEnv hold a pointer to the QEM and removed this argument.


Line 112:     QueryExecState* query_exec_state_;
> add DISALLOW_COPY_AND_ASSIGN, otherwise you have the same leakage problems 
Done


PS10, Line 124: QueryExecStateMap
> can't get on one line?
Done


PS10, Line 127: /// The thread pool that is in charge of destroying a 
QueryExecState when it is no
              :   /// longer in use.
              :   ThreadPool<QueryExecState*> destroy_thread_;
> Do we have evidence that this is necessary, or should we remove this mechan
Currently, destruction of the QES is not a very heavyweight operation. It will 
be in the near future though (after IMPALA-2550).

I could leave it as it is, or it could be added in a future patch. I'm okay 
with either.


Line 137:   void DestroyQueryExecState(uint32_t thread_id, QueryExecState* 
query_exec_state);
> mention that it's invoked by the thread pool.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes

Reply via email to