Sailesh Mukil has posted comments on this change. Change subject: IMPALA-2550 Introduce query-wide execution context. ......................................................................
Patch Set 8: (9 comments) http://gerrit.cloudera.org:8080/#/c/3817/7//COMMIT_MSG Commit Message: PS7, Line 17: passes this received fragment along to the Query > which fragment? the received fragment. I've clarified the comment. PS7, Line 26: Once scheduled for destruction, a thread in the QueryExecMgr will : destroy the QueryExecState. > Why do we need a special thread for this? I've explained this in the CR comment in the query-exec-mgr.h file. PS7, Line 28: Every user of the QueryExecState must access it within the scope of : ScopedQueryExecStateRef which guarantees ref count incrementing and : decrementing at entry/exit of the scope. > It looks like we're implementing something that looks like a shared_ptr. I How I see it, reference counting is necessary because the lifetime of the QES is defined by how many fragments are around to use it. If needed, I can quickly go over the overall design with you in person tomorrow and also talk about the shared_ptr vs refcount decision, so that it'll be a lot easier to review this patch. http://gerrit.cloudera.org:8080/#/c/3817/7/be/src/service/query-exec-mgr.h File be/src/service/query-exec-mgr.h: PS7, Line 54: and that the fragment instance must either run to completion : /// (with or withou > or it can encounter an error? Yes the fragment is executed on another thread. I've clarified that the fragment instance can encounter an error too. PS7, Line 60: RegisterFragmentInstanceWithQuery > How about just 'RegisterFragment'? This is a fragment instance, so we would have to change it to RegisterFragmentInstance(). But that name is already used in QueryExecState::RegisterFragmentInstance(). So I added a 'WithQuery' here to curb confusion. If you still think it's better to change the name, I'll go ahead and do that. PS7, Line 68: QueryExecState* GetQueryExecState(const TUniqueId& query_id); : : /// Decrements query_exec_state->num_current_references and schedules it for destruction : /// if the reference count hits zero. : void ReleaseQueryExecState(QueryExecState* query_exec_state); > Who are the consumers of these APIs? I'm wondering if ref counting is reall The consumers of these 2 functions are: - The fragment itself in QueryExecState::ExecInstance(). - CancelPlanFragment() when a Cancel RPC is received. - PublishFilter() when a publish filter RPC is received. PS7, Line 74: e que > query's Done PS7, Line 74: > a Done PS7, Line 127: /// The thread pool that is in charge of destroying a QueryExecState when it is no : /// longer in use. : ThreadPool<QueryExecState*> destroy_thread_; > why does the QES destruction need to happen on a separate thread? Currently I've made it that way because if a fragment is the last user of the QES, that fragment will decrease the refcount to 0, which means it should be deleted. This happens in QueryExecState::ExecInstance(). If we delete it in the context of the same thread, it will be as if the QueryExecState is deleting itself which could cause memory corruption, etc. Which is why the destruction of the QES is handed off to the QueryExecMgr on a separate thread. -- To view, visit http://gerrit.cloudera.org:8080/3817 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: David Knupp <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-HasComments: Yes
