David Ribeiro Alves has posted comments on this change. Change subject: [catalog_manager] proper handling of catalog shutdown ......................................................................
Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/6170/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: PS2, Line 458: { : bool error_is_not_leader; : const Status& s = : catalog_manager_->TryGenerateNewTskUnlocked(&error_is_not_leader); : if (!s.ok()) { : LOG(WARNING) << "Failed to add new TSK: " << s.ToString(); : if (!error_is_not_leader) { : // Do not expect any other errors besides NOT_THE_LEADER error. : CHECK_OK(s); : } : } : } say that the error is something else (like the machine is partitioned) does this make the master crash? what if it's a timeout of some kind? we can definitely get into TOCTOU situations with the ScopedLeaderSharedLock ctor checks right? i.e. when we created the ScopedLeaderSharedLock it got leader_status().ok() but the machine got partitioned in between and so we can't actually write. are we crashing then, or am I misunderstanding? If we are should we? would a strategy of reversing this work better? that is, could you fail on-site when something bad tks related happens, but always return all other errors (including consensus ones) back to the caller and then just log the error and continue? PS2, Line 751: if (!(s.ok() || s.IsNotFound())) { : return s; : } could you leave a comment that we only handle the OK and NotFound cases below and everything else is an error? -- To view, visit http://gerrit.cloudera.org:8080/6170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I826826049e3c08a6c8345949690cbbedaea32ff8 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
