Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14223 )

Change subject: KUDU-2069 p5: recheck tablet health when exiting maintenance 
mode
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14223/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14223/2//COMMIT_MSG@33
PS2, Line 33: While somewhat hamfisted, on a reasonably dense cluster, it 
doesn't seem
            : unlikely that every tserver might be hosting a leader.
I don't think this is hamfisted at all once the enter/exit MM operations work 
on a batch of tservers.


http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/catalog_manager.cc@4989
PS2, Line 4989: Status CatalogManager::SetTServerState(const string& 
tserver_uuid, TServerState state) {
Might be a little clearer if SetAllTServers... were called instead in 
master_service.cc when this function succeeds. That way all of the management 
of the new in-memory state is in just one place.


http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/catalog_manager.cc@5003
PS2, Line 5003: ts_manager->SetAllTServersRequireFullTabletReport();
> Could it be that the tserver already exited maintenance mode?
Wouldn't we have short-circuited out via L4997 then?


http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/ts_manager.cc@234
PS2, Line 234: void TSManager::SetAllTServersRequireFullTabletReport() {
Needs to acquire lock_ in shared mode, no? Surprised TSAN didn't catch this; do 
we have no multi-threaded test coverage including this function and another 
that takes lock_?

Could you update the comments in ts_manager.h to make it more clear when lock_ 
needs to be acquired?


http://gerrit.cloudera.org:8080/#/c/14223/2/src/kudu/master/ts_manager.cc@244
PS2, Line 244:   for (auto& id_and_desc : servers_by_id_) {
             :     id_and_desc.second->UpdateNeedsFullTabletReport(false);
             :   }
Why does this belong here? It's not intuitive since this isn't part of the 
"tserver state".



--
To view, visit http://gerrit.cloudera.org:8080/14223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0ab3d78cbc5b1228c01592a00118f11f76e43dd
Gerrit-Change-Number: 14223
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 13 Sep 2019 04:34:32 +0000
Gerrit-HasComments: Yes

Reply via email to