Will Berkeley has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11508 )
Change subject: [tools] Fix bug in CheckCompleteMove
......................................................................

[tools] Fix bug in CheckCompleteMove

It was possible for the following sequence to happen:
0. We are moving a replica from TS X to TS Y for tablet A. TS X is
   presently the leader.
1. We find the tablet leader (X) and build a proxy to it.
2. To remove X from A, we ask it to step down.
3. Leadership changes quickly and Z != X becomes the leader.
4. Since leadership has changed, we move to remove X from A. To prepare
   we gather consensus state using proxy, thinking we are talking to Z,
   but the proxy is pointed at X, causing a bad status like

   Invalid argument: GetConsensusState: Wrong destination UUID requested. Local UUID: X. Requested UUID: Z

This bug has always been present but was exposed by the follow-up
graceful leadership transfer patch, since #3 was unlikely with abrupt
stepdown, and if CheckCompleteMove was retried after leadership changed
it would not hit the same problem.

This also reorganizes and re-comments CheckCompleteMove a bit, to try
and make it easier to understand.

Change-Id: I227b8f833e8904dd1ac18fbe17345bea13c96c16
---
M src/kudu/tools/tool_replica_util.cc
1 file changed, 69 insertions(+), 37 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/11508/1
--
To view, visit http://gerrit.cloudera.org:8080/11508
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I227b8f833e8904dd1ac18fbe17345bea13c96c16
Gerrit-Change-Number: 11508
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <[email protected]>
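
For readers who want the stale-proxy sequence from the commit message in concrete form, here is a minimal, self-contained C++ sketch. It is not code from tool_replica_util.cc. FakeProxy, QueryWithStaleProxy, QueryWithFreshProxy and DemoStaleVersusFresh are hypothetical names invented for illustration, and the "fresh proxy" remedy is only one plausible way to avoid the mismatch, since the commit message does not say how the patch itself fixes it.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an RPC proxy bound to a single tablet server.
// This is NOT Kudu's proxy class; it only models the destination-UUID check.
struct FakeProxy {
  std::string local_uuid;  // UUID of the server this proxy is pointed at.

  // Returns an empty string on success, or an error message shaped like the
  // one quoted in the commit message when the requested UUID does not match
  // the server that actually received the request.
  std::string GetConsensusState(const std::string& dest_uuid) const {
    if (dest_uuid != local_uuid) {
      return "Invalid argument: GetConsensusState: Wrong destination UUID "
             "requested. Local UUID: " + local_uuid +
             ". Requested UUID: " + dest_uuid;
    }
    return "";
  }
};

// Buggy pattern (step 4): the proxy was built for the old leader X, but the
// request names the new leader Z as its destination.
std::string QueryWithStaleProxy(const FakeProxy& old_leader_proxy,
                                const std::string& new_leader_uuid) {
  return old_leader_proxy.GetConsensusState(new_leader_uuid);
}

// One possible remedy (an assumption, not necessarily what the patch does):
// build a proxy to whichever server is the leader now before querying it.
std::string QueryWithFreshProxy(const std::string& new_leader_uuid) {
  FakeProxy proxy{new_leader_uuid};
  return proxy.GetConsensusState(new_leader_uuid);
}

// Walks through the sequence: leader X steps down, Z takes over.
void DemoStaleVersusFresh() {
  FakeProxy proxy_to_x{"X"};                              // step 1
  assert(!QueryWithStaleProxy(proxy_to_x, "Z").empty());  // step 4 fails
  assert(QueryWithFreshProxy("Z").empty());               // fresh proxy works
}
```

Calling DemoStaleVersusFresh() from a main function exercises both paths: the stale proxy yields the "Wrong destination UUID" status, while a proxy pointed at the current leader succeeds.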
