Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5199: prevent hang on empty row batch exchange
......................................................................


Patch Set 2:

That seems pretty rare but possible.

In the common case where the limit is at the top level of the query, I think
this would be fine, because the query would hit the limit on the coordinator,
return successfully, and ignore any later errors. Those timeout errors could
only arrive after STREAM_EXPIRATION_TIME_MS, since we don't report an error
while the closed stream is still in the cache.

So I think in general the timeline would need to be:
* Limit is hit at t1
* The eos message is received at t2 >= t1 + STREAM_EXPIRATION_TIME_MS and an 
error is sent to the coordinator
* The query fails, but it would have succeeded at t3 > t2 if the error hadn't 
been received
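
To make the timeline concrete, here is a minimal sketch of the condition as a
predicate. It is only an illustration of the reasoning above, not Impala code:
the function names are hypothetical, the constant's value is a placeholder, and
it assumes the stream is closed (and enters the closed-stream cache) at the
moment the limit is hit.

```python
# Placeholder value for illustration only; the real constant is defined in
# Impala's backend and may differ.
STREAM_EXPIRATION_TIME_MS = 300_000


def eos_reports_error(t1_ms, t2_ms, expiration_ms=STREAM_EXPIRATION_TIME_MS):
    """Return True if an eos received at t2 would be reported as an error.

    Assumes the stream was closed at t1 (when the limit was hit) and stays in
    the closed-stream cache for expiration_ms, during which no error is sent.
    """
    return t2_ms >= t1_ms + expiration_ms


def query_fails_spuriously(t1_ms, t2_ms, t3_ms,
                           expiration_ms=STREAM_EXPIRATION_TIME_MS):
    """Return True if the error at t2 kills a query that would succeed at t3.

    t1: limit is hit; t2: eos message is received; t3: time at which the query
    would have finished successfully had no error arrived.
    """
    return eos_reports_error(t1_ms, t2_ms, expiration_ms) and t3_ms > t2_ms
```

For example, with the placeholder value, query_fails_spuriously(0, 300_000,
300_001) is True, while any t2 earlier than t1 + STREAM_EXPIRATION_TIME_MS
gives False because the closed stream is still cached.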

If the limit is not at the top level, it seems possible but unlikely that the 
error would cancel a query that was on the path to succeeding. I'm not sure if 
it changes the odds of queries succeeding enough to worry about, given that the 
cluster is already unhealthy if these timeouts are occurring.

-- 
To view, visit http://gerrit.cloudera.org:8080/8005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib370ebe44e3bb34d3f0fb9f05aa6386eb91c8645
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: No
