Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14824 )

Change subject: IMPALA-9199: Add support for single query retries on cluster 
membership changes
......................................................................


Patch Set 13:

(5 comments)

addressed comments, fixed a possible race condition when a query gets cancelled 
and unregistered while a retry is being setup.

http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.h
File be/src/runtime/query-driver.h:

http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.h@169
PS13, Line 169:
> ClientRequestState has a pointer back to the QueryDriver (as does the Coord
Fixed. The Coordinator and ClientRequestState now take a shared_ptr to the 
QueryDriver


http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc
File be/src/runtime/query-driver.cc:

http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@161
PS13, Line 161: void QueryDriver::RetryQueryFromThread(const Status& error) {
> Is there anything that prevents the QueryDriver from being deleted while th
hmm yeah that's true. I changed it so that RetryQueryFromThread takes in a 
shared_ptr to the current QueryDriver. its a bit odd that the method has to 
take in a reference to itself, but I couldn't see a better way.


http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@178
PS13, Line 178:   // Cancel the query.
> Is this necessary? At least in the case of Coordinator calling TryQueryRetr
I don't think 'ImpalaServer::CancelFromThreadPool' (which calls TryQueryRetry() 
will call Cancel. Its probably possible to remove this, but I thought it was a 
nice property that you want to ensure the query is cancelled before trying to 
retry it.


http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@179
PS13, Line 179: true, nullptr
> Both of these parameters seem wrong - you note above that a query could get
I changed the 'check_inflight' parameter to false.
ClientRequestState::MarkAsRetrying explicitly sets the query_status_ in the 
ClientRequestState. The issue with calling Cancel with a cause is that it will 
call UpdateQueryStatus, which sets the ExecState to ERROR, which you don't want 
because the ERROR state will get exposed to the client, which you don't want to 
happen since the query is being retried.


http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@281
PS13, Line 281:   // TODO: IMPALA-9502: Avoid copying TExecRequest when 
retrying queries
> Out of curiosity, what are the issues here?
hmm can't remember, I replaced this with 'move(exec_request_)' and all the 
tests and a run of the stress test passes. I added this a long time ago on a 
much older version of this feature, so it is possible it got fixed as some 
point inadvertently.



--
To view, visit http://gerrit.cloudera.org:8080/14824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e4a0e72a9bf8ec10b91639aefd81bef17886ddd
Gerrit-Change-Number: 14824
Gerrit-PatchSet: 13
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Fri, 01 May 2020 21:33:54 +0000
Gerrit-HasComments: Yes

Reply via email to