Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16369 )

Change subject: IMPALA-9636: Don't run retried query on the blacklisted nodes
......................................................................


Patch Set 13:

(4 comments)

looking better, still some concerns about the logic in admission-controller.cc

http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1555
PS13, Line 1555:     const ExecutorGroup* filtered_executor_group =
its probably cleaner to make this a unique_ptr so that you don't have to 
manually call 'delete' whenever the method returns. that actually probably 
removes the need for the 'created' variable as well. in general, we try to 
prefer smart pointers over calling 'delete'.


http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1562
PS13, Line 1562:     if (created && num_executors == 0) {
               :       // All executors are blacklisted for retried query.
               :       delete filtered_executor_group;
               :       continue;
               :     }
I'm not sure if this is correct; before this patch, 'executor_group' could by 
the empty executor group returned by 'GetEmptyExecutorGroup' which is used for 
scheduling coordinator-only queries. I would expect 
'GetEmptyExecutorGroup()->NumExecutors()' would return 0.

so it seems like adding this logic would break coordinator-only queries.

I think you probably want to skip this check if the 'executor_group' == 
GetEmptyExecutorGroup()'.


http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1568
PS13, Line 1568: cluster_membership_mgr_->GetEmptyExecutorGroup() == 
filtered_executor_group
I don't think this is a valid comparison using '=='. IIUC, the '==' is not 
overloaded for the ExecutorGroup object, so this comparison is comparing if the 
objects themselves are the same, which doesn't seem possible given that 
GetFilteredExecutorGroup creates a new ExecutorGroup.


http://gerrit.cloudera.org:8080/#/c/16369/13/be/src/scheduling/admission-controller.cc@1579
PS13, Line 1579:   if (output_schedules->empty()) {
               :     // Retried query could not be scheduled since all 
executors are blacklisted.
               :     // To keep consistent with the other blacklisting logic, 
set not_admitted_reason as
               :     // REASON_NO_EXECUTOR_GROUPS.
               :     queue_node->not_admitted_reason = 
REASON_NO_EXECUTOR_GROUPS;
               :     LOG(WARNING) << queue_node->not_admitted_reason;
               :   }
can this be moved before the call to 'Scheduler()'. it should be possible, 
although you might have to re-factor the for loop a bit.



--
To view, visit http://gerrit.cloudera.org:8080/16369
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I00bc1b5026efbd0670ffbe57bcebc457d34cb105
Gerrit-Change-Number: 16369
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Fri, 11 Sep 2020 18:16:34 +0000
Gerrit-HasComments: Yes

Reply via email to