Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/19571 )
Change subject: KUDU-3452 Create tablet without enough healthy tservers
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG@7
PS3, Line 7: Create tablet without enough healthy tservers
Having a Kudu cluster with just 3 tablet servers is already quite limiting: if any of these 3 nodes fails, there is no place for its replicas to migrate to. So the minimum recommended number of tablet server nodes in a Kudu cluster is already 4.

Allowing a table with RF=3 to be created when only two tablet servers are alive is asking for disaster, because the table's data becomes unavailable as soon as one more tablet server node goes down.

Why create tables with RF=3 then? You could create tables with RF=1 and have just one tablet server in your cluster.

I'm not sure I understand the reasoning behind this change.


http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG@10
PS3, Line 10: will retry continuously
I'm not sure I understand what will retry continuously here. Which component performs those retries once the request to create a table is rejected by the system catalog?


http://gerrit.cloudera.org:8080/#/c/19571/3//COMMIT_MSG@14
PS3, Line 14: An already created tablet can still be on service even if one
            : of its 3 replicas become unavailable.
What happens if another tablet server goes down?


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@418
PS3, Line 418: create
creating


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/master/catalog_manager.cc@418
PS3, Line 418: enough healthy
What does 'enough' mean here? Please be more specific. Would just 2 tablet servers be enough for a table with a replication factor of 5?
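
To make the 'enough' question concrete, here is a minimal sketch of the majority arithmetic behind it. It assumes the standard Raft rule that a tablet stays available only while a strict majority of its replicas is alive. The names MajoritySize and FailuresTolerated are hypothetical and are not taken from the Kudu code base or from this patch.

```cpp
// Illustrative only: these helpers are assumptions for this discussion,
// not functions from catalog_manager.cc or anywhere else in Kudu.

// Smallest number of live replicas that forms a strict majority for the
// given replication factor (RF).
constexpr int MajoritySize(int replication_factor) {
  return replication_factor / 2 + 1;
}

// Number of additional replica failures a tablet can absorb while still
// keeping a majority. Zero means the next failure makes the tablet
// unavailable; a negative value means there is no majority to begin with.
constexpr int FailuresTolerated(int replication_factor, int live_replicas) {
  return live_replicas - MajoritySize(replication_factor);
}
```

Under these assumptions, FailuresTolerated(3, 2) is 0, which matches the concern above that one more tablet server failure makes an RF=3 table unavailable, and FailuresTolerated(5, 2) is -1, meaning two replicas cannot form a majority for RF=5 at all.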


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc@9172
PS3, Line 9172: TEST_F(ToolTest, TestCreateTabletWithoutEnoughHealthyTServers) {
It would be great to have tests for various replication factors here. In addition to RF=3, the most commonly used ones are 1 and 5.


http://gerrit.cloudera.org:8080/#/c/19571/3/src/kudu/tools/kudu-tool-test.cc@9235
PS3, Line 9235: RunActionStdoutString
Why do you need RunActionStdoutString() here if no analysis is performed on the resulting output?



-- 
To view, visit http://gerrit.cloudera.org:8080/19571
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I742ba1ff770f5c8b1be5800334c29bec96e195c6
Gerrit-Change-Number: 19571
Gerrit-PatchSet: 3
Gerrit-Owner: Wang Xixu <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: KeDeng <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Mon, 20 Mar 2023 18:01:45 +0000
Gerrit-HasComments: Yes
