Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14824 )
Change subject: IMPALA-9199: Add support for single query retries on cluster membership changes ...................................................................... Patch Set 13: (5 comments) addressed comments, fixed a possible race condition when a query gets cancelled and unregistered while a retry is being setup. http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.h File be/src/runtime/query-driver.h: http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.h@169 PS13, Line 169: > ClientRequestState has a pointer back to the QueryDriver (as does the Coord Fixed. The Coordinator and ClientRequestState now take a shared_ptr to the QueryDriver http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc File be/src/runtime/query-driver.cc: http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@161 PS13, Line 161: void QueryDriver::RetryQueryFromThread(const Status& error) { > Is there anything that prevents the QueryDriver from being deleted while th hmm yeah that's true. I changed it so that RetryQueryFromThread takes in a shared_ptr to the current QueryDriver. its a bit odd that the method has to take in a reference to itself, but I couldn't see a better way. http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@178 PS13, Line 178: // Cancel the query. > Is this necessary? At least in the case of Coordinator calling TryQueryRetr I don't think 'ImpalaServer::CancelFromThreadPool' (which calls TryQueryRetry() will call Cancel. Its probably possible to remove this, but I thought it was a nice property that you want to ensure the query is cancelled before trying to retry it. http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@179 PS13, Line 179: true, nullptr > Both of these parameters seem wrong - you note above that a query could get I changed the 'check_inflight' parameter to false. ClientRequestState::MarkAsRetrying explicitly sets the query_status_ in the ClientRequestState. The issue with calling Cancel with a cause is that it will call UpdateQueryStatus, which sets the ExecState to ERROR, which you don't want because the ERROR state will get exposed to the client, which you don't want to happen since the query is being retried. http://gerrit.cloudera.org:8080/#/c/14824/13/be/src/runtime/query-driver.cc@281 PS13, Line 281: // TODO: IMPALA-9502: Avoid copying TExecRequest when retrying queries > Out of curiosity, what are the issues here? hmm can't remember, I replaced this with 'move(exec_request_)' and all the tests and a run of the stress test passes. I added this a long time ago on a much older version of this feature, so it is possible it got fixed as some point inadvertently. -- To view, visit http://gerrit.cloudera.org:8080/14824 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2e4a0e72a9bf8ec10b91639aefd81bef17886ddd Gerrit-Change-Number: 14824 Gerrit-PatchSet: 13 Gerrit-Owner: Sahil Takiar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Comment-Date: Fri, 01 May 2020 21:33:54 +0000 Gerrit-HasComments: Yes
