Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/18057 )
Change subject: KUDU-3341: No further retry when DeleteTabet with WRONG_SERVER_UUID error ...................................................................... Patch Set 1: (6 comments) Thanks for the fix! http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/integration-tests/delete_tablet-itest.cc File src/kudu/integration-tests/delete_tablet-itest.cc: http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/integration-tests/delete_tablet-itest.cc@240 PS1, Line 240: FLAGS_follower_unavailable_considered_failed_sec = 10; To speed up the test, could we reduce the Raft heartbeat interval down to something like 100ms, and set the unavailable time to something like 2 seconds instead of 20? http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/integration-tests/delete_tablet-itest.cc@241 PS1, Line 241: const int kNumTablets = 5; : const int kNumTabletServers = 5; Does this test actually need 5 servers? Would we get the same test coverage for the fix with 3? http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/integration-tests/delete_tablet-itest.cc@251 PS1, Line 251: Status s = v.RunKsck(); : ASSERT_OK(s); nit: could reduce to a single line? ASSERT_OK(v.RunKsck()); ? http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/integration-tests/delete_tablet-itest.cc@267 PS1, Line 267: workload.StopAndJoin(); nit: we never called workload.Start(), so there's no need to call StopAndJoin(). http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/master/catalog_manager.cc@4147 PS1, Line 4147: case TabletServerErrorPB::ALREADY_INPROGRESS: Should we call MarkComplete() for this and TABLET_NOT_FOUND? It's a CAS with kStateRunning anyway, so a subsequent call to MarkFailed() would be a no-op. http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/mini-cluster/internal_mini_cluster.cc File src/kudu/mini-cluster/internal_mini_cluster.cc: http://gerrit.cloudera.org:8080/#/c/18057/1/src/kudu/mini-cluster/internal_mini_cluster.cc@230 PS1, Line 230: push_back(shared_ptr< nit: we could construct in place with mini_tablet_servers_.emplace_back(make_shared<MiniTabletServer>(tablet_server.release()); -- To view, visit http://gerrit.cloudera.org:8080/18057 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieaa36086300bda7f958570c690b951dc090c342a Gerrit-Change-Number: 18057 Gerrit-PatchSet: 1 Gerrit-Owner: Yifan Zhang <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 30 Nov 2021 05:50:37 +0000 Gerrit-HasComments: Yes
