Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14824 )
Change subject: IMPALA-9199: Add support for single query retries on cluster membership changes ...................................................................... Patch Set 7: (1 comment) > Instead of creating a separate doc, think it might be easier to > just post comments here. > > I've done some re-factoring locally and this is what I have so far: > > * Created a "QueryDriver", which (as suggested by Thomas) is a > layer of abstraction of on top of ClientRequestState > * "QueryDriver" owns the ClientRequestState; all access to the > ClientRequestState goes through the QueryDriver > * Since the first patch will only support a single retry, the > QueryDriver has a regular client_request_state_ and a > retried_client_request_state_ (one for the original query and the > other for the retried query) > * QueryDriver owns the TExecRequest that is used by both the > original and retried queries > * The "transparent" part of this feature is now handled by the > QueryDriver (since the QueryDriver owns all access to the > ClientRequestState) > * The QueryDriver has two methods: GetActiveClientRequestState() > and GetClientRequestState(query_id) > * GetActiveClientRequestState() returns the CRS for the "active" > query - e.g. if the query has not been retried yet, it returns the > original query; if the query has been retried, it returns the > retried query > * GetClientRequestState(query_id) returns the CRS for the query > that matches the given 'query_id' - this allows clients to get the > CRS of exactly the query_id they are looking for > * ImpalaServer creates a QueryDriver for each query; the > ImpalaServer::client_request_state_map_ has been replaced with the > ImpalaServer::query_driver_map_ which maps query ids to > QueryDrivers > * The query_driver_map_ can have key, values pairs with the same > value (e.g. the same QueryDriver) > * Moved all of the retry code (along with the retry threadpool) > into QueryDriver (specifically QueryDriver::RetryQueryFromThreadPool) > > Example: > > * Client submits a query for execution > * Impala creates a QueryDriver for the query > * QueryDriver::CreateClientRequestState creates a ClientRequestState > for the query (this CRS is owned by the QueryDriver) > * ImpalaServer::RegisterQuery "registers" the query and adds an > entry to ImpalaServer::query_driver_map_ that maps the query id to > the QueryDriver > * QueryDriver::RunFrontendPlanner creates the TExecRequest (owned > by the QueryDriver) > * Eventually, the query fails and is retried > * QueryDriver::RetryQueryFromThreadPool contains all the retry > logic (moved from ImpalaServer) > * It cancels the original query, creates the new one and runs it > * It calls ImpalaServer::RegisterQuery again, but with the retried > query id > * It manipulates some internal pointers so that retried_client_request_state_ > now points to the CRS of the retried query That all sounds good to me http://gerrit.cloudera.org:8080/#/c/14824/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/14824/5/be/src/service/impala-server.cc@1025 PS5, Line 1025: query_ctx->query_id = UuidToQueryId(random_generator()()); > I'll try thinking through this some more. One challenge with creating an in Sure, I think this can probably be deferred for now and we can go with your original approach. -- To view, visit http://gerrit.cloudera.org:8080/14824 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e4a0e72a9bf8ec10b91639aefd81bef17886ddd Gerrit-Change-Number: 14824 Gerrit-PatchSet: 7 Gerrit-Owner: Sahil Takiar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Comment-Date: Mon, 24 Feb 2020 21:25:44 +0000 Gerrit-HasComments: Yes
