Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12299 )

Change subject: IMPALA-2990: timeout unresponsive queries in coordinator
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12299/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12299/4/be/src/service/impala-server.cc@215
PS4, Line 215: DEFINE_int32(status_report_interval_ms, 5000,
> I think you're saying to set this to 1.1 * GetMaxReportRetryMs()?
right, I think I meant 0.1 * GetMaxReportRetryMs(). ie we need to check much 
more frequently than the timeout, or else we could overshoot it by a factor of 2


http://gerrit.cloudera.org:8080/#/c/12299/4/be/src/service/impala-server.cc@398
PS4, Line 398:
> Of course, even in this configuration, backends will report their final sta
k. We could deprecate it in a separate patch I suppose. Fine with this as is.


http://gerrit.cloudera.org:8080/#/c/12299/4/be/src/service/impala-server.cc@2288
PS4, Line 2288:   return Status::OK();
              : }
              :
              : void ImpalaServer::ExpireQuery(ClientRequestState* crs, const 
Status& status) {
              :   DCHECK(!status.ok());
              :   cancellation_thread_pool_->Offer(
> Yes, all of these flags are documented and settable.
k, would be interested what Michael thinks about that idea


http://gerrit.cloudera.org:8080/#/c/12299/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12299/7/be/src/service/impala-server.cc@218
PS7, Line 218: DEFINE_int32(status_report_max_retries, 3,
Is our timeout too aggressive here? If one of the backends goes into some kind 
of 15s process-wide pause (eg as we've seen in the past with tcmalloc releasing 
memory in https://issues.apache.org/jira/browse/IMPALA-2800) we'll end up 
cancelling. Am afraid that being too aggressive will end up cancelling queries 
incorrectly. Curious for Michael's thoughts


http://gerrit.cloudera.org:8080/#/c/12299/7/be/src/service/impala-server.cc@2258
PS7, Line 2258: max_lag_ms * 1.1
per comment elsewhere: don't we need to wake up more often than the timeout? 
otherwise we can overshoot by a factor of 2 here.



--
To view, visit http://gerrit.cloudera.org:8080/12299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I196c8c6a5633b1960e2c3a3884777be9b3824987
Gerrit-Change-Number: 12299
Gerrit-PatchSet: 7
Gerrit-Owner: Thomas Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Tue, 26 Mar 2019 23:16:03 +0000
Gerrit-HasComments: Yes

Reply via email to