Mike Percy has posted comments on this change.

Change subject: [WIP] KUDU-1860: ksck doesn't identify tablets that are evicted 
but still in config
......................................................................


Patch Set 1:

(8 comments)

I wonder if we should add an RPC that will return all { committed, pending } 
configs from all replicas hosted on that tablet server. If we can pull that 
from all servers simultaneously (using the Async RPC API) we could turn this on 
by default since it shouldn't be too slow.

http://gerrit.cloudera.org:8080/#/c/6772/1/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

PS1, Line 210: s.ToString()
This s refers to the one from line 204, when I assume it should be the one from 
line 208


PS1, Line 616: shared_ptr<KsckTabletReplica> r
const shared_ptr<KsckTabletReplica>& r


PS1, Line 624: std::shared_ptr<KsckTabletReplica> r
same


Line 716:   } else if (consensus_conflict) {
If we catch a replica in the middle of a config change then we can still see 
this, which makes this warning a little racy. I wonder if we couldn't do 
something to make sure it's "stuck" like add a metric or a flag to the output 
given from the servers to indicate how long a config change has been pending.


http://gerrit.cloudera.org:8080/#/c/6772/1/src/kudu/tools/ksck.h
File src/kudu/tools/ksck.h:

PS1, Line 100: peers
you should also std::move(peers) here


PS1, Line 124:   int64_t term_;
             :   boost::optional<string> leader_uuid_;
             :   vector<string> peers_;
it looks like these can all be const


http://gerrit.cloudera.org:8080/#/c/6772/1/src/kudu/tools/ksck_remote.cc
File src/kudu/tools/ksck_remote.cc:

PS1, Line 114: for (auto it = tablet_status_map_.begin(); it != 
tablet_status_map_.end(); it++)
Since you're not mutating tablet_status_map_, it's more idiomatic to write this 
using C++11 for-loop syntax like this:

  for (const auto& entry : tablet_status_map_) {
    req.set_tablet_id(entry.first);
    // ...


PS1, Line 120: InsertOrDi
It's not safe to trust that multiple remotes won't have the same UUID and 
tablet id in cases where people did weird stuff with restoring backups. It's 
bad, and it indicates a misconfiguration, but it shouldn't crash the client 
tool.


-- 
To view, visit http://gerrit.cloudera.org:8080/6772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-HasComments: Yes

Reply via email to