Mike Percy has posted comments on this change. Change subject: disk failure: coordinate error handling ......................................................................
Patch Set 20: (2 comments) http://gerrit.cloudera.org:8080/#/c/7029/20/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: Line 1072: void TSTabletManager::FailTabletsInDataDir(const string& uuid) { Why don't we just have this be a thunk that says "something failed" and then have the TSTabletManager do something like this: for (const auto& tablet_id : dd_mgr->GetFailedTablets()) { Tablet* t = Lookup(tablet_id); t->Shutdown(); } Maybe I'm missing something for this part, but we need a lock somewhere to track which data dirs are "broken" so that we can ensure that asynchronous things like tablet copy will also abort if it's in the process of copying a tablet that has blocks now on a failed disk. Line 1081: LOG(INFO) << "Shutting down tablet (not implemented in this patch): " << tablet_id; If this doesn't do anything, why not just have an empty body of the function? -- To view, visit http://gerrit.cloudera.org:8080/7029 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib3169deada702c527b70fabf9d2223364f9a9ea6 Gerrit-PatchSet: 20 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
