Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10591 )
Change subject: KUDU-1038 Deleting a tablet should also delete its log recovery directory, if any ...................................................................... Patch Set 7: (23 comments) http://gerrit.cloudera.org:8080/#/c/10591/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10591/7//COMMIT_MSG@18 PS7, Line 18: . after the next tablet copy. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.h File src/kudu/consensus/log.h: http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.h@143 PS7, Line 143: . add: " if it exists" http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.h@145 PS7, Line 145: RemoveRecoveryDir How about renaming this to RemoveRecoveryDirIfExists() ? http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc File src/kudu/consensus/log.cc: http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@996 PS7, Line 996: not nit: this syntax is neat, but for consistency with the rest of the Kudu C++ code base please use "!" http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@997 PS7, Line 997: nit: extra space, also make this a VLOG(1) to avoid spamming the server logs http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@1002 PS7, Line 1002: LOG All of these logs should also log the prefix "T <tablet id> P <peer uuid>: " with them, which is what LOG_WITH_PREFIX() normally does with non-static methods, so you can calculate that once at the top like: const auto kLogPrefix = Substitute("T $0 P $1: ", tablet_id, fs_manager->permanent_uuid()); Then LOG or VLOG like: LOG(INFO) << kLogPrefix << "some message"; http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@1005 PS7, Line 1005: LOG All of these LOG messages used to be VLOG_WITH_PREFIX(1) and I think we should put them back to be VLOG(1) with kLogPrefix in there. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/consensus/log.cc@1006 PS7, Line 1006: nit: line up your indentation with the previous << here and elsewhere http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc File src/kudu/integration-tests/recover_tablet-itest.cc: http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@43 PS7, Line 43: nit: remove multiple extra lines (one empty line is good) http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@59 PS7, Line 59: KUDU-1038 nit: Regression test for KUDU-1038. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@60 PS7, Line 60: . or is not fsynced before a crash http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@61 PS7, Line 61: On server restart, the replica with the deleted log segment enters a failed state. : // (In the presence of the WAL recovery dir (with deleted log segments), : // the replica enters a FAILED state and fails to bootstrap on subsequent attempts) slight rewrite: On server restart, the replica with the deleted log segment enters a FAILED state after being unable to complete replay of the WAL and leaves the WAL recovery directory in place. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@64 PS7, Line 64: 4. This test shows that the failed replica successfully bootstraps if the recovery dir : // is deleted(if it exists) when we delete (tombstone) the tablet data suggestion: The master should tombstone the FAILED replica, causing its recovery directory to be deleted. A subsequent tablet copy and tablet bootstrap should cause the replica to become healthy again. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@66 PS7, Line 66: TEST_F(TabletRecoveryITest, TestTabletRecoveryAfterSegmentDelete) { Can you move this test to ts_recovery-itest and make this part of TsRecoveryITest? http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@79 PS7, Line 79: Write data to the tablet. This actually just creates a new tablet. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@87 PS7, Line 87: id nit: missing trailing period per style guide per https://google.github.io/styleguide/cppguide.html#Punctuation,_Spelling_and_Grammar http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@96 PS7, Line 96: while (write_workload.rows_inserted() < 100) { : SleepFor(MonoDelta::FromMilliseconds(10)); : } you can get rid of this http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@107 PS7, Line 107: LOG(INFO) << "Stopping workload..."; remove http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@111 PS7, Line 111: LOG(INFO) << "Getting cstate..."; remove this debugging log message http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@117 PS7, Line 117: // Shutdown the cluster add a period at the end of your comment sentences, here and elsewhere in this file http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@133 PS7, Line 133: // Delete WAL segment at index 2 This comment describes the next line of code and is basically superfluous, consider saying why we are doing this, i.e.: // Delete a WAL segment in the middle of other log segments so at tablet startup time we will detect out-of-order WAL segments during log replay and fail to bootstrap. http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@141 PS7, Line 141: //Restart the server remove this comment http://gerrit.cloudera.org:8080/#/c/10591/7/src/kudu/integration-tests/recover_tablet-itest.cc@144 PS7, Line 144: SleepFor(MonoDelta::FromSeconds(3)); remove this sleep -- To view, visit http://gerrit.cloudera.org:8080/10591 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3f017caf41ac4e00bf89421d6e73924b9fdcbd0 Gerrit-Change-Number: 10591 Gerrit-PatchSet: 7 Gerrit-Owner: Anupama Gupta <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Comment-Date: Thu, 07 Jun 2018 01:22:33 +0000 Gerrit-HasComments: Yes
