Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17915 )
Change subject: KUDU-1959 - Implement tests for the startup web page for Kudu servers ...................................................................... Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/17915/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17915/1//COMMIT_MSG@11 PS1, Line 11: - We inject latency to bootstrap tablets while reading the webpage every : 10 milliseconds and validating the status for each step. : - Fail a data directory and validate the status of each startup step. : For the master, : - We start it up and validate the status of each startup step. nit: make the spacing consistent? http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/master/master.cc File src/kudu/master/master.cc: http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/master/master.cc@a227 PS1, Line 227: : : : : : : nit: seems like this patch is more than just adding tests. Consider separating this into two patches: - fixing the master timers, which includes master tests - adding tests for tservers http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc File src/kudu/tserver/tablet_server-test.cc: http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@803 PS1, Line 803: bool This will be racy, so we should use an atomic<bool> http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@811 PS1, Line 811: continue; nit: maybe sleep here too so we're not a tight loop while starting up. Same below http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@827 PS1, Line 827: { : is_started = true; : read_startup_page.join(); : }); nit: in our codebase this is usually formatted as SCOPED_CLEANUP({ is_started = true; read_startup_page.join(); }); Same below. http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@846 PS1, Line 846: // Fail a data directory. : FLAGS_env_inject_eio_globs = fs_manager->GetDataRootDirs()[1]; : FLAGS_env_inject_eio = 1.0; nit: it's probably worth moving this down to right below calling Start() http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@862 PS1, Line 862: ASSERT_STR_MATCHES(thread_buf.ToString(), "\"init_status\":[100|0]" : ".*\"read_filesystem_status\":[100|0]" : ".*\"read_instance_metadatafiles_status\":[100|0]" : ".*\"read_data_directories_status\":" : "([0-9]|[1-9][0-9]|100)" : ".*\"start_tablets_status\":([0-9]|[1-9][0-9]|100)" : ".*\"start_rpc_server_status\":[100|0]"); nit: maybe define this in an anonymous namespace as a helper function that validates that an input string matches this, maybe returning a bool? Same with the 100 check. http://gerrit.cloudera.org:8080/#/c/17915/1/src/kudu/tserver/tablet_server-test.cc@890 PS1, Line 890: INSTANTIATE_TEST_SUITE_P(Params, TabletServerStartupWebPageTest, : ::testing::ValuesIn(BlockManager::block_manager_types())); Given the above tests are near identical, consider using ::testing::Combine() to cross test block manager types with whether we're injecting disk failures See TsLocationAssignmentITest for an example -- To view, visit http://gerrit.cloudera.org:8080/17915 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80880ecbda6a9f6db85baaf109af7e701111328d Gerrit-Change-Number: 17915 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Wed, 27 Oct 2021 00:39:02 +0000 Gerrit-HasComments: Yes
