Adar Dembo has posted comments on this change. Change subject: disk failure: test coverage for disk failure recovery ......................................................................
Patch Set 7: (16 comments) I didn't look at the test failures. Looks good, though I'd recommend you go over each "compound" test function (i.e. a function that does a ton of work) and add NO_FATALS and/or ASSERT_OK as needed. http://gerrit.cloudera.org:8080/#/c/7031/7/src/kudu/integration-tests/disk_failure-itest.cc File src/kudu/integration-tests/disk_failure-itest.cc: PS7, Line 51: uint32_t Simple int would be fine here, I think. Line 67: if (FLAGS_block_manager == "file") { Would it be useful to parameterize these tests and run them on both block managers? PS7, Line 68: ** Isn't this middle entry also '??'? I thought block paths were data/ab/cd/abcdefghijklmnop. Line 91: vector<TServerDetails*> tservers; Maybe omit this and iterate on the tablet_servers_ map directly? Line 150: if (counts_after - counts_before > 0) { Would if counts_after > counts_before be simpler? Line 176: workload->StopAndJoin(); IIUC, we're writing just enough to figure out who is writing where, and then we stop? Line 184: ext_tserver->GetInt64Metric(&METRIC_ENTITY_server, nullptr, &METRIC_data_dirs_failed, Need ASSERT_OK here too? Line 200: ASSERT_OK(row->SetInt32(0, i + start_row)); Got some tabs here. Line 204: ignore_result(session->Flush()); Flushing one row at a time? You sure you don't want to flush the whole thing together? Line 229: vector<ExternalTabletServer*> ext_tservers = SetupDefaultCluster(); Need NO_FATALS here? Line 233: workload_a.set_write_pattern(TestWorkload::WritePattern::INSERT_SEQUENTIAL_ROWS); Why is sequential important? PS7, Line 238: this avoid this to avoid Line 243: GetDataDirsWrittenToByWorkload(ext_tservers, &workload_a, 3, &dirs_with_A_data, Need NO_FATALS on these calls, right? PS7, Line 256: MonoDelta::FromSeconds(120) Maybe define this delta once and reuse it? Line 259: SetServerSurvivalFlags(ext_tservers); NO_FATALS here. Line 271: WaitForDiskFailures(ext_tservers[0]); NO_FATALS on these too. -- To view, visit http://gerrit.cloudera.org:8080/7031 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8ff63ec71ab718866484b9f3ec7264bc72ecfe97 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
