Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14223 )
Change subject: KUDU-2069 p5: recheck tablet health when exiting maintenance mode ...................................................................... Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/14223/2//COMMIT_MSG Commit Message: PS2: > Not a question regarding this patch as is, but anyways. Definitely; same with the Master web UI. There are many improvements we could add once the core functionality is in. http://gerrit.cloudera.org:8080/#/c/14223/2//COMMIT_MSG@33 PS2, Line 33: : Testing: > I don't think this is hamfisted at all once the enter/exit MM operations wo Done http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/integration-tests/maintenance_mode-itest.cc File src/kudu/integration-tests/maintenance_mode-itest.cc: http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/integration-tests/maintenance_mode-itest.cc@237 PS2, Line 237: > nir: bringing Done http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/catalog_manager.cc@4989 PS2, Line 4989: } > Might be a little clearer if SetAllTServers... were called instead in maste I've since refactored a bunch so most of the logic is in the TSManager, in which case I think having this call here mostly addresses your concern of encapsulating the new state in one place. Now, the TSManager is in charge of updating whether these TSDescriptors need full reports. http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/master_service.cc File src/kudu/master/master_service.cc: http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/master_service.cc@375 PS2, Line 375: > nit: 'Check if'... Done http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/ts_manager.cc File src/kudu/master/ts_manager.cc: http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/ts_manager.cc@234 PS2, Line 234: TServerStatePB ts_state, > Needs to acquire lock_ in shared mode, no? Surprised TSAN didn't catch this Done http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/ts_manager.cc@244 PS2, Line 244: ts_state_by_uuid_.erase(ts_uuid); : // If exiting maintenance mode, make sure that any replica failures that : > Why does this belong here? It's not intuitive since this isn't part of the It doesn't. The question I was trying to answer is, "What should we do when there's a Master leadership change?" Reset all the "should_report_full" state? Do nothing? For now, I'll do nothing because tservers will already send full reports when the master leadership changes. -- To view, visit http://gerrit.cloudera.org:8080/14223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0ab3d78cbc5b1228c01592a00118f11f76e43dd Gerrit-Change-Number: 14223 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Wed, 25 Sep 2019 23:59:53 +0000 Gerrit-HasComments: Yes
