Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/18093 )
Change subject: IMPALA-11033: Add support for specifying multiple executor group sets ...................................................................... Patch Set 3: (14 comments) http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/runtime/exec-env.cc@178 PS2, Line 178: const ClusterMembershipMgr::ExpectedExecGroupSets& expected_exec_group_sets, > typedef for ExecGroupSets Done http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr-test.cc File be/src/scheduling/cluster-membership-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr-test.cc@556 PS2, Line 556: ClusterMembershipMgr::ExpectedExecGroupSets expected_exec_group_sets; > use ExecGroupSets typedef here. Done http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr-test.cc@753 PS2, Line 753: EXPECT_EQ(update_req.exec_group_sets[0].curr_num_executors, 1); > negative test for duplicate names or invalid parameters? The inputs for PopulateExecutorMembershipRequest are from the cluster snapshot and the ExpectedExecGroupSets which are already parsed and error checked, so none of those checks added for it. However adding a test for duplicate check for TestPopulateExpectedExecGroupSets http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.h File be/src/scheduling/cluster-membership-mgr.h: http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.h@173 PS2, Line 173: /// Returns a pointer to the static empty group reserved for scheduling coord only > use ExecGroupSets typedef Done http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.h@218 PS2, Line 218: > It will be nice if the current #executors is stored in this data structure. That details is extracted from the cluster snapshot and populated in PopulateExecutorMembershipRequest http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.h@221 PS2, Line 221: /// Parses the --expected_executor_group_sets startup flag and populates > use ExecGroupSets typedef Done http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.h@229 PS2, Line 229: > use ExecGroupSets typedef Done http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.h@229 PS2, Line 229: > I wonder if we need to use 'expected' in the name. To me, the concept of a Done http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.h@285 PS2, Line 285: /// The frontend uses cluster membership information to determine whether it expects the > use ExecGroupSets typedef Done http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.cc@685 PS2, Line 685: parsed_group_prefixes.insert(group_prefix); > How do we deal with a common situation where a small number of exec groups For the first part of your comment: FE is sent the max size of an executor group among all groups in the set, so it would plan based off of that and does not need to consider individual groups in the set separately. (See PopulateExecutorMembershipRequest for how the update to FE is staged) For the second part: This method is only used to parse the startup flag and does not have cluster's current state. The reason I chose to sort here by expected size is because in general multiple exec group sets would have a non-trivial difference in their sizes and a min executor count for a group which usually is set to 90% to be considered healthy will prevent situations where exec group sets switch places in this order when an update is sent to FE. For, eg set1 -> expected size 10 -> min executors to consider healthy: 9 set2 -> 20 -> min executors to consider healthy: 18 http://gerrit.cloudera.org:8080/#/c/18093/2/be/src/scheduling/cluster-membership-mgr.cc@688 PS2, Line 688: Substitute("Invalid executor group set format: $0", group.ToString())); > could either DCHECK first.second!=second.second or make strong order to han added check for group prefix duplicates. However, allowing duplicates on the expected number as there can be a use case (though unlikely, for eg, 2 group sets with same group size but different defaults for resource configs like mt_dop). Let me know if you think we should be more stringent on these allowances. http://gerrit.cloudera.org:8080/#/c/18093/2/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/18093/2/common/thrift/Frontend.thrift@716 PS2, Line 716: cur > max or current? updated the desc to be more explicit http://gerrit.cloudera.org:8080/#/c/18093/2/common/thrift/Frontend.thrift@720 PS2, Line 720: no healthy executor groups are present > nit. "curr_num_executors is 0" is better and clear. Done http://gerrit.cloudera.org:8080/#/c/18093/2/fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java File fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java: http://gerrit.cloudera.org:8080/#/c/18093/2/fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java@71 PS2, Line 71: exec_group_sets > Need a call like Since this functionality is not used directly in this patch I deferred it to IMPALA-10992 where the implementation is free to choose its API on how to expose/use this. Hence the TODO was added. -- To view, visit http://gerrit.cloudera.org:8080/18093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e0a3a5fe2b1f0b7507b7c096b7a3c373bc2e684 Gerrit-Change-Number: 18093 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Wed, 15 Dec 2021 21:38:33 +0000 Gerrit-HasComments: Yes
