Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21653 )

Change subject: IMPALA-12737: Limit workload management initialization to one 
coordinator.
......................................................................


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/21653/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21653/4//COMMIT_MSG@13
PS4, Line 13: The current design for this init process is to create both tables 
on
            : each coordinator at startup by running create table if not exists
            : DDLs
I still think the current approach that use CatalogD/HMS as for synchronization 
point is better. If the concern is WM blocking impalad startup, maybe we should 
decouple workload management init from impalad init and keep incoming queries 
in admission control queue until WM is ready. Abort if WM still not ready until 
X seconds.

Reading response from DESCRIBE EXTENDED, CREATE TABLE IF NOT EXIST, ALTER TABLE 
ADD COLUMN IF NOT EXISTS and act upon it seems more robust.


http://gerrit.cloudera.org:8080/#/c/21653/4//COMMIT_MSG@24
PS4, Line 24: The first coordinator that puts a message on the new
            : impala-workload-management statestore topic will be the 
coordinator
            : that runs all the DDLs to create the workload management tables.
This is assuming that all coordinators have the same version?
What will happen in zero downtime upgrade scenario where both new and old 
coordinator can coexist?

To me, it feels like CatalogD is a better daemon to initialize impala_query_log 
table. Have you look into that? (See CatalogOpExecutor::createTable()).
I understand current approach still require SQL parsing by Frontend Planner at 
Coordinator.


http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/impala-server.cc@566
PS4, Line 566: if (FLAGS_cluster_id.empty()) {
             :       wm_topic_name_ = 
Statestore::IMPALA_WORKLOAD_MANAGEMENT_TOPIC;
             :     } else {
             :       wm_topic_name_ =
             :           FLAGS_cluster_id + '-' + 
Statestore::IMPALA_WORKLOAD_MANAGEMENT_TOPIC;
             :     }
Please contain this into a function that can be reusable for other statestore 
topic.

$ git grep -n "FLAGS_cluster_id.*Statestore::"
be/src/scheduling/admission-controller.cc:671:        FLAGS_cluster_id + '-' + 
Statestore::IMPALA_REQUEST_QUEUE_TOPIC;
be/src/scheduling/cluster-membership-mgr.cc:97:    membership_topic_name_ = 
FLAGS_cluster_id + '-' + Statestore::IMPALA_MEMBERSHIP_TOPIC;


http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.h
File be/src/service/workload-management-init.h:

http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.h@27
PS4, Line 27: enum InitState {
            :   // Initial state.
Would be great if you can draw state transition or synchronization diagram for 
this InitState.
See an example in join-builder.h


http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc
File be/src/service/workload-management-init.cc:

http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@217
PS4, Line 217: VLOG(2)
Lets do LOG(INFO) instead, here and below.


http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@366
PS4, Line 366: init_ctx->id = PrintId(RandomUniqueID());
Can this be an identifier for an impalad instead? Perhaps "hostname:port"?


http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@379
PS4, Line 379:     topic_updates->push_back(build_topic_delta(init_ctx->id, 
INIT_DONE_MSG,
             :         wm_topic_name_));
Please add test with DebugAction where coordinator leader slow to send 
completion signal.


http://gerrit.cloudera.org:8080/#/c/21653/4/be/src/service/workload-management-init.cc@407
PS4, Line 407: VLOG(2)
I think it is worth to turn this log into LOG(INFO). Same too for other VLOG(2) 
below.



--
To view, visit http://gerrit.cloudera.org:8080/21653
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id645f94c8da73b91c13a23d7ac0ea026425f0f96
Gerrit-Change-Number: 21653
Gerrit-PatchSet: 4
Gerrit-Owner: Jason Fehr <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Thu, 08 Aug 2024 22:21:05 +0000
Gerrit-HasComments: Yes

Reply via email to