Wang Xixu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19571 )

Change subject: KUDU-3452 Create tablet without enough healthy tservers
......................................................................


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG@7
PS3, Line 7: Create tablet without enough healthy tservers
> Having a Kudu cluster with just 3 tablet servers is already quite limiting
Thank you for your comments. Please visit the related Jira and see the 
discussion.


http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG@10
PS3, Line 10: will retry continuously
> It seems that the catalog manager's 'bgtasks' thread will continuously try
Yes


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@417
PS3, Line 417: 
DEFINE_bool(enable_create_tablet_without_enough_healthy_tservers, false,
> We should also tag this new flag 'hidden', just as the 'catalog_manager_che
Done


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@418
PS3, Line 418: creati
> creating
Done


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@5933
PS3, Line 5933:       // NOTE: if we fail to select replicas on the first pass 
(due to
              :       // insufficient Tablet Servers being online), we will 
still try
              :       // again unless the tablet/table creation is cancelled.
              :       RETURN_NOT_OK_PREPEND(SelectReplicasForTablet(policy, 
tablet.get()),
              :                             Substitute("error selecting 
replicas for tablet $0",
              :                                        tablet->id()));
              :     }
              :   }
              :
              :   // Update the sys catalog with the new set of 
tablets/metadata.
              :   {
              :     SysCatalogTable::Actions actions;
              :     actions.tablets_to_add = deferred.tablets_to_add;
              :     actions.tablets_to_update = deferred.tablets_to_update;
              :     
RETURN_NOT_OK_PREPEND(sys_catalog_->Write(std::move(actions)),
              :                           "error persisting updated tablet 
metadata");
              :   }
              :
              :   // Ex
> How about encapsulate the logic into SelectNReplicasForTablet()? since it's
Done


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@5933
PS3, Line 5933:       // NOTE: if we fail to select replicas on the first pass 
(due to
              :       // insufficient Tablet Servers being online), we will 
still try
              :       // again unless the tablet/table creation is cancelled.
              :       RETURN_NOT_OK_PREPEND(SelectReplicasForTablet(policy, 
tablet.get()),
              :                             Substitute("error selecting 
replicas for tablet $0",
              :                                        tablet->id()));
              :     }
              :   }
              :
              :   // Update the sys catalog with the new set of 
tablets/metadata.
              :   {
              :     SysCatalogTable::Actions actions;
              :     actions.tablets_to_add = deferred.tablets_to_add;
              :     actions.tablets_to_update = deferred.tablets_to_update;
              :     
RETURN_NOT_OK_PREPEND(sys_catalog_->Write(std::move(actions)),
              :                           "error persisting updated tablet 
metadata");
              :   }
              :
              :   // Ex
> +1
Done


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@6006
PS3, Line 6006:                                                 "alive tablet 
servers",
> I didn't find any where else using SelectReplicasForTablet(), why not remov
Done


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@6006
PS3, Line 6006:                                                 "alive tablet 
servers",
> +1
Done


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc@9204
PS3, Line 9204:   NO_FATALS(cluster_->tablet_server(0)->Shutdown());
> nit: use cluster_->WaitForTabletServerCount() to ensure this.
WaitForTabletServerCount is a function of InternalMiniCluster, not a function 
of ExternalMiniCluster.  And I have not found a way to start an 
InternalMiniCluster with specified flags, like 
--catalog_manager_check_ts_count_for_create_table.


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc@9228
PS3, Line 9228:     ASSERT_EQ("UNDER_REPLICATED", items[i]["health"]);
> Would be great to test adding range partitions when there is only 2 tserver
Done


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc@9235
PS3, Line 9235: rer->AddColumn("new_c
> Why do you need RunActionStdoutString() here if there isn't any analysis pe
Done



--
To view, visit http://gerrit.cloudera.org:8080/19571
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I742ba1ff770f5c8b1be5800334c29bec96e195c6
Gerrit-Change-Number: 19571
Gerrit-PatchSet: 4
Gerrit-Owner: Wang Xixu <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: KeDeng <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Mon, 27 Mar 2023 10:28:01 +0000
Gerrit-HasComments: Yes

Reply via email to