Wenzhe Zhou has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16900 )
Change subject: IMPALA-10259: Fixed DCHECK error for backend in terminal state
......................................................................

IMPALA-10259: Fixed DCHECK error for backend in terminal state

This issue occurred in the cdpd-master core ASAN build. According to the
log messages, one backend sent a status report with instance_exec_status
set to done for all assigned instances, then sent a final status report
with an error. The coordinator marked the backend state as done after
processing the report with instance_exec_status set to done, but did not
apply the final status report with the error to the backend state. As a
result, the backend received a response with status OK for its final
status report and hit the DCHECK error.
To fix the bug, the coordinator needs to check the overall status of the
exec status report even if num_remaining_instances_ of the BackendState
has reached 0.

Testing:
 - Manual tests
   I could only reproduce the situation by adding artificial delays at
   the beginning of QueryState::ErrorDuringExecute() while repeatedly
   running the test case
   test_spilling.py::TestSpillingDebugActionDimensions::test_spilling_naaj.
   Verified that the issue did not happen after applying this patch.
 - Passed exhaustive tests.

Change-Id: Ic12a80e20ddc11e32349edfec2bd16338c24b841
---
M be/src/runtime/coordinator-backend-state.cc
1 file changed, 17 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/16900/2
--
To view, visit http://gerrit.cloudera.org:8080/16900
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic12a80e20ddc11e32349edfec2bd16338c24b841
Gerrit-Change-Number: 16900
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
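
For readers who want the idea of the fix in code form, here is a minimal sketch of the behaviour the commit message describes: a report's overall status is applied even after the remaining-instance count has reached 0. This is not the actual change to coordinator-backend-state.cc. The types Status and StatusReport, the class BackendStateSketch, and the method ApplyReport are hypothetical stand-ins invented for illustration; only the member name num_remaining_instances_ comes from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins; these are not Impala's real types.
struct Status {
  bool ok = true;
  std::string msg;
};

struct StatusReport {
  int num_instances_done = 0;  // instances newly reported as done
  Status overall_status;       // overall status carried by this report
};

class BackendStateSketch {
 public:
  explicit BackendStateSketch(int num_instances)
      : num_remaining_instances_(num_instances) {
    assert(num_instances >= 0);
  }

  // Applies one report. Returns true if the recorded state changed.
  bool ApplyReport(const StatusReport& report) {
    bool changed = false;
    // Per-instance bookkeeping only matters while instances remain.
    if (num_remaining_instances_ > 0 && report.num_instances_done > 0) {
      num_remaining_instances_ -=
          std::min(num_remaining_instances_, report.num_instances_done);
      changed = true;
    }
    // The overall status is checked unconditionally, so an error that
    // arrives after all instances are done is still recorded.
    if (!report.overall_status.ok && status_.ok) {
      status_ = report.overall_status;
      changed = true;
    }
    return changed;
  }

  bool IsDone() const {
    return num_remaining_instances_ == 0 || !status_.ok;
  }
  const Status& status() const { return status_; }

 private:
  int num_remaining_instances_;
  Status status_;  // first error seen, or OK
};
```

In this sketch, a backend that first reports all instances done and then reports an error ends up with the error recorded, rather than having the late report ignored because the count is already 0.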
