Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
......................................................................

Patch Set 2:

That seems pretty rare but possible. In the common case where the limit is at the top level of the query, I think this would be fine, because the query would hit the limit on the coordinator, return successfully, and ignore any future errors.

Those timeout errors would have to arrive after STREAM_EXPIRATION_TIME_MS (since if the closed stream was in the cache, we wouldn't report an error before that time). So I think in general the timeline would need to be:

* Limit is hit at t1
* The eos message is received at t2 >= t1 + STREAM_EXPIRATION_TIME_MS and an error is sent to the coordinator
* The query fails, but it would have succeeded at t3 > t2 if the error hadn't been received

If the limit is not at the top level, it seems possible but unlikely that the error would cancel a query that was on the path to succeeding. I'm not sure if it changes the odds of queries succeeding enough to worry about, given that the cluster is already unhealthy if these timeouts are occurring.

--
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: No
