Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14824 )

Change subject: IMPALA-9199: Add support for single query retries on cluster 
membership changes
......................................................................


Patch Set 7:

Instead of creating a separate doc, think it might be easier to just post 
comments here.

I've done some re-factoring locally and this is what I have so far:

* Created a "QueryDriver", which (as suggested by Thomas) is a layer of 
abstraction of on top of ClientRequestState
* "QueryDriver" owns the ClientRequestState; all access to the 
ClientRequestState goes through the QueryDriver
* Since the first patch will only support a single retry, the QueryDriver has a 
regular client_request_state_ and a retried_client_request_state_ (one for the 
original query and the other for the retried query)
* QueryDriver owns the TExecRequest that is used by both the original and 
retried queries
* The "transparent" part of this feature is now handled by the QueryDriver 
(since the QueryDriver owns all access to the ClientRequestState)
* The QueryDriver has two methods: GetActiveClientRequestState() and 
GetClientRequestState(query_id)
* GetActiveClientRequestState() returns the CRS for the "active" query - e.g. 
if the query has not been retried yet, it returns the original query; if the 
query has been retried, it returns the retried query
* GetClientRequestState(query_id) returns the CRS for the query that matches 
the given 'query_id' - this allows clients to get the CRS of exactly the 
query_id they are looking for
* ImpalaServer creates a QueryDriver for each query; the 
ImpalaServer::client_request_state_map_ has been replaced with the 
ImpalaServer::query_driver_map_ which maps query ids to QueryDrivers
* The query_driver_map_ can have key, values pairs with the same value (e.g. 
the same QueryDriver)
* Moved all of the retry code (along with the retry threadpool) into 
QueryDriver (specifically QueryDriver::RetryQueryFromThreadPool)

Example:

* Client submits a query for execution
* Impala creates a QueryDriver for the query
* QueryDriver::CreateClientRequestState creates a ClientRequestState for the 
query (this CRS is owned by the QueryDriver)
* ImpalaServer::RegisterQuery "registers" the query and adds an entry to 
ImpalaServer::query_driver_map_ that maps the query id to the QueryDriver
* QueryDriver::RunFrontendPlanner creates the TExecRequest (owned by the 
QueryDriver)
* Eventually, the query fails and is retried
* QueryDriver::RetryQueryFromThreadPool contains all the retry logic (moved 
from ImpalaServer)
* It cancels the original query, creates the new one and runs it
* It calls ImpalaServer::RegisterQuery again, but with the retried query id
* It manipulates some internal pointers so that retried_client_request_state_ 
now points to the CRS of the retried query


--
To view, visit http://gerrit.cloudera.org:8080/14824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e4a0e72a9bf8ec10b91639aefd81bef17886ddd
Gerrit-Change-Number: 14824
Gerrit-PatchSet: 7
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Thu, 13 Feb 2020 19:05:45 +0000
Gerrit-HasComments: No

Reply via email to