Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3633: cancel fragment if coordinator is gone
......................................................................

IMPALA-3633: cancel fragment if coordinator is gone

The bug is that return_val.status is an optional field, so setting the
status without also setting __isset is equivalent to returning
Status::OK(). This meant that the fragment did not get notified when
reporting status if the coordinator had gone away. As a result, if a
cancel RPC was lost, we could be left with zombie fragments with no
coordinator that kept on running until completion.

Testing:
I couldn't see a way to replicate this reliably with our existing test
setup, since it requires some RPCs to be dropped to get into this state.

I manually tested by commenting out CancelRemoteFragments(), starting a
long-running query, then cancelling it. Before the patch, perf top showed
that the fragments continued to execute the query. After the patch, the
fragments stopped executing quickly.

Change-Id: I62ab6f4df7c0ee60c6aa6291513f9f0cbfac3fe7
Reviewed-on: http://gerrit.cloudera.org:8080/3238
Reviewed-by: Tim Armstrong <[email protected]>
Tested-by: Internal Jenkins
---
M be/src/service/impala-server.cc
1 file changed, 4 insertions(+), 7 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/3238
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I62ab6f4df7c0ee60c6aa6291513f9f0cbfac3fe7
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
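
For readers unfamiliar with the optional-field pitfall described in the commit message, here is a minimal, self-contained sketch of how it can arise. This is not the actual impala-server.cc change. MockStatus, MockResult, Serialize, ReceivedCode, kCancelled, BuggyReply and FixedReply are hypothetical names invented for illustration. The mock uses "isset" and "set_status" where Thrift-generated C++ uses "__isset" and "__set_<field>", and it assumes the receiver treats an absent status as OK.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a status struct; code 0 means OK.
struct MockStatus {
  int code = 0;
  std::string msg;
};

// Hypothetical error code used only in this sketch.
constexpr int kCancelled = 1;

// Hypothetical stand-in for a generated result struct with an optional field.
// Real Thrift-generated C++ names these members __isset and __set_status.
struct MockResult {
  struct IsSet {
    bool status = false;
  };
  MockStatus status;
  IsSet isset;

  // Setter that assigns the value AND marks the optional field as present.
  void set_status(const MockStatus& s) {
    status = s;
    isset.status = true;
  }
};

// Mimics serialization: an optional field is written only if its flag is set.
std::optional<MockStatus> Serialize(const MockResult& r) {
  if (r.isset.status) return r.status;
  return std::nullopt;
}

// Mimics the receiver: a status missing from the wire is read as OK (0).
int ReceivedCode(const MockResult& r) {
  std::optional<MockStatus> wire = Serialize(r);
  return wire.has_value() ? wire->code : 0;
}

// Buggy pattern: the field is assigned directly, so the presence flag stays
// false and the error never reaches the receiver.
MockResult BuggyReply() {
  MockResult r;
  r.status = MockStatus{kCancelled, "coordinator is gone"};
  return r;
}

// Fixed pattern: the setter marks the field as present, so the error is sent.
MockResult FixedReply() {
  MockResult r;
  r.set_status(MockStatus{kCancelled, "coordinator is gone"});
  return r;
}
```

In this sketch, ReceivedCode(BuggyReply()) yields 0 (the error is silently dropped and looks like OK), while ReceivedCode(FixedReply()) yields kCancelled, which is the signal a fragment would need in order to stop running.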
