Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21653 )
Change subject: IMPALA-12737: Limit workload management initialization to one coordinator. ...................................................................... Patch Set 4: (8 comments) http://gerrit.cloudera.org:8080/#/c/21653/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21653/4//COMMIT_MSG@13 PS4, Line 13: The current design for this init process is to create both tables on : each coordinator at startup by running create table if not exists : DDLs I still think the current approach that use CatalogD/HMS as for synchronization point is better. If the concern is WM blocking impalad startup, maybe we should decouple workload management init from impalad init and keep incoming queries in admission control queue until WM is ready. Abort if WM still not ready until X seconds. Reading response from DESCRIBE EXTENDED, CREATE TABLE IF NOT EXIST, ALTER TABLE ADD COLUMN IF NOT EXISTS and act upon it seems more robust. http://gerrit.cloudera.org:8080/#/c/21653/4//COMMIT_MSG@24 PS4, Line 24: The first coordinator that puts a message on the new : impala-workload-management statestore topic will be the coordinator : that runs all the DDLs to create the workload management tables. This is assuming that all coordinators have the same version? What will happen in zero downtime upgrade scenario where both new and old coordinator can coexist? To me, it feels like CatalogD is a better daemon to initialize impala_query_log table. Have you look into that? (See CatalogOpExecutor::createTable()). I understand current approach still require SQL parsing by Frontend Planner at Coordinator. http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/impala-server.cc@566 PS4, Line 566: if (FLAGS_cluster_id.empty()) { : wm_topic_name_ = Statestore::IMPALA_WORKLOAD_MANAGEMENT_TOPIC; : } else { : wm_topic_name_ = : FLAGS_cluster_id + '-' + Statestore::IMPALA_WORKLOAD_MANAGEMENT_TOPIC; : } Please contain this into a function that can be reusable for other statestore topic. $ git grep -n "FLAGS_cluster_id.*Statestore::" be/src/scheduling/admission-controller.cc:671: FLAGS_cluster_id + '-' + Statestore::IMPALA_REQUEST_QUEUE_TOPIC; be/src/scheduling/cluster-membership-mgr.cc:97: membership_topic_name_ = FLAGS_cluster_id + '-' + Statestore::IMPALA_MEMBERSHIP_TOPIC; http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.h File be/src/service/workload-management-init.h: http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.h@27 PS4, Line 27: enum InitState { : // Initial state. Would be great if you can draw state transition or synchronization diagram for this InitState. See an example in join-builder.h http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc File be/src/service/workload-management-init.cc: http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@217 PS4, Line 217: VLOG(2) Lets do LOG(INFO) instead, here and below. http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@366 PS4, Line 366: init_ctx->id = PrintId(RandomUniqueID()); Can this be an identifier for an impalad instead? Perhaps "hostname:port"? http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@379 PS4, Line 379: topic_updates->push_back(build_topic_delta(init_ctx->id, INIT_DONE_MSG, : wm_topic_name_)); Please add test with DebugAction where coordinator leader slow to send completion signal. http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@407 PS4, Line 407: VLOG(2) I think it is worth to turn this log into LOG(INFO). Same too for other VLOG(2) below. -- To view, visit http://gerrit.cloudera.org:8080/21653 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96 Gerrit-Change-Number: 21653 Gerrit-PatchSet: 4 Gerrit-Owner: Jason Fehr <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Thu, 08 Aug 2024 22:21:05 +0000 Gerrit-HasComments: Yes
