Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16369 )
Change subject: IMPALA-9636: Don't run retried query on the blacklisted nodes ...................................................................... Patch Set 11: (3 comments) http://gerrit.cloudera.org:8080/#/c/16369/8/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16369/8/be/src/scheduling/admission-controller.cc@1120 PS8, Line 1120: if (!request.blacklisted_executor_addresses.empty()) { : // This debug_action is to allow delay to be injected for the admission of a retried : // query, which is triggered by blacklisted node, before getting membership snapshot. : DebugActionNoFail(FLAGS_debug_actions, "ADMISSION_DELAY_FOR_RETRIED_QUERY"); : } > No, it does not work for adding new cases which need to trigger blacklist t why would the original attempt fail due to a timeout instead of an RPC error? http://gerrit.cloudera.org:8080/#/c/16369/11/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16369/11/be/src/scheduling/admission-controller.cc@1579 PS11, Line 1579: queue_node->not_admitted_reason = REASON_NO_EXECUTOR_GROUPS; might want to add "LOG(WARNING) << queue_node->not_admitted_reason;" as well to be consistent with the other usage of REASON_NO_EXECUTOR_GROUPS. http://gerrit.cloudera.org:8080/#/c/16369/11/be/src/scheduling/admission-controller.cc@1575 PS11, Line 1575: if (ret.code() == TErrorCode::NO_REGISTERED_BACKENDS && num_executors == 0) { : // Retried query was failed since all executors were blacklisted. To keep : // consistent with the other blacklisting logic, we should set : // not_admitted_reason as REASON_NO_EXECUTOR_GROUPS. : queue_node->not_admitted_reason = REASON_NO_EXECUTOR_GROUPS; : return Status::OK(); : } couldn't you do this before calling scheduler()->Schedule(...)? if num_executors == 0 then that should be sufficient to return REASON_NO_EXECUTOR_GROUPS right? that would remove the the need to eliminate the DCHECK in Scheduler::ComputeScanRangeAssignment as well, right? -- To view, visit http://gerrit.cloudera.org:8080/16369 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00bc1b5026efbd0670ffbe57bcebc457d34cb105 Gerrit-Change-Number: 16369 Gerrit-PatchSet: 11 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Thu, 10 Sep 2020 23:42:19 +0000 Gerrit-HasComments: Yes
