Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/16369 )
Change subject: IMPALA-9636: Don't run retried query on the blacklisted nodes ...................................................................... Patch Set 13: (4 comments) http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1555 PS13, Line 1555: const ExecutorGroup* filtered_executor_group = > its probably cleaner to make this a unique_ptr so that you don't have to ma Since the "executor_group" is a constant object and maybe used by other queries, we cannot modify it. In case we have to filter out some executor with blacklisted_executor_addresses list, we have to create a temporary executor group. But in most cases especially the first attempt, we don't need to create a temporary executor group. If no new executor group object is created, "filtered_executor_group" set as "executor_group". Not sure if we can use smart pointer to cover these cases. http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1562 PS13, Line 1562: if (created && num_executors == 0) { : // All executors are blacklisted for retried query. : delete filtered_executor_group; : continue; : } > I'm not sure if this is correct; before this patch, 'executor_group' could If it's GetEmptyExecutorGroup(), ExecutorGroup::GetFilteredExecutorGroup will not create new executor group object so that "created" is false. "created" eqaul true only if at least one executor is filtered from executor group. That's way we check both "created" and num_executors. I can make code more easy to understand by comparing with GetEmptyExecutorGroup(), like if (num_executors == 0 && filtered_executor_group != cluster_membership_mgr_->GetEmptyExecutorGroup()) http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1568 PS13, Line 1568: cluster_membership_mgr_->GetEmptyExecutorGroup() == filtered_executor_group > I don't think this is a valid comparison using '=='. IIUC, the '==' is not This is to compare two pointers, not objects pointed by pointers. We have similar DCHECK in line 1546. http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1579 PS13, Line 1579: if (output_schedules->empty()) { : // Retried query could not be scheduled since all executors are blacklisted. : // To keep consistent with the other blacklisting logic, set not_admitted_reason as : // REASON_NO_EXECUTOR_GROUPS. : queue_node->not_admitted_reason = REASON_NO_EXECUTOR_GROUPS; : LOG(WARNING) << queue_node->not_admitted_reason; : } > can this be moved before the call to 'Scheduler()'. it should be possible, In the future, we may support multiple executor groups. If we move the code into the loop before calling Scheduler:schdule(), it will skip other executor groups. We reach here only all Scheduler:schdule() calling are skipped. -- To view, visit http://gerrit.cloudera.org:8080/16369 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00bc1b5026efbd0670ffbe57bcebc457d34cb105 Gerrit-Change-Number: 16369 Gerrit-PatchSet: 13 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Fri, 11 Sep 2020 19:53:53 +0000 Gerrit-HasComments: Yes
