[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Todd Lipcon has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/3643/8/src/kudu/tserver/heartbeater.cc File src/kudu/tserver/heartbeater.cc: Line 405: resp.leader_master()) { nit: dont really need to check has_leader_master() in either of these cases -- it'll just default to false -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 8: Build Started http://104.196.14.100/job/kudu-gerrit/2604/ -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Hello Dan Burkert, Todd Lipcon, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/3643 to look at the new patch set (#7). Change subject: KUDU-1374: send full tablet report when new leader master is detected .. KUDU-1374: send full tablet report when new leader master is detected This should help prevent missed tablet reports in very specific edge cases, detailed in the bug report. The new integration test fails 100% of the time without the change, and passes 100% of the time with it. Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f --- M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/tserver/heartbeater.cc 3 files changed, 100 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/3643/7 -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Dan Burkert has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/3643/6/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: Line 156: DEFINE_bool(catalog_manager_fail_async_rpcs, false, what about swapping 'async' for 'ts'. When I read the previous test, my first question was "how is the alter table replicated to the other masters if all async RPCs are disabled?" -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Adar Dembo has uploaded a new patch set (#6). Change subject: KUDU-1374: send full tablet report when new leader master is detected .. KUDU-1374: send full tablet report when new leader master is detected This should help prevent missed tablet reports in very specific edge cases, detailed in the bug report. The new integration test fails some fraction of the time without change, and passes 100% of the time with it. Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f --- M src/kudu/integration-tests/master_failover-itest.cc M src/kudu/master/catalog_manager.cc M src/kudu/tserver/heartbeater.cc 3 files changed, 109 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/3643/6 -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Adar Dembo has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/3643/5//COMMIT_MSG Commit Message: Line 13: there was a way to mock a server-side krpc component as easily as it is to > what about injecting a master crash in the AsyncSendFoo() method? that woul Unfortunately, the code path that results in ts_proxy.FooAsync() is synchronous w.r.t. handling the heartbeats. So if we're going to induce a crash, it has to be somewhere in the RPC layer. Or we have to force the first RPC attempt to fail, then crash in the retry which is asynchronous. I banged my head against the wall for a while here but I did produce a test that fails 4/10 times without the new heuristic. Take a look and let me know if you see a way to simplify it, or to make it fail more reliably. Line 15: multi-master tests, though. > is there a test you can loop before/after to verify this? Done http://gerrit.cloudera.org:8080/#/c/3643/5/src/kudu/tserver/heartbeater.cc File src/kudu/tserver/heartbeater.cc: Line 175: // Indicates that the thread should set a full tablet report. Set when > s/set/send/ Done -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Todd Lipcon has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/3643/5//COMMIT_MSG Commit Message: Line 13: there was a way to mock a server-side krpc component as easily as it is to what about injecting a master crash in the AsyncSendFoo() method? that would be likely to trigger this in many cases, right? (not a guarantee that the response was quicker than the starting of the async stuff, but pretty likely I think?) Line 15: multi-master tests, though. is there a test you can loop before/after to verify this? http://gerrit.cloudera.org:8080/#/c/3643/5/src/kudu/tserver/heartbeater.cc File src/kudu/tserver/heartbeater.cc: Line 175: // Indicates that the thread should set a full tablet report. Set when s/set/send/ -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Adar Dembo has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 5: Verified+1 More chrpath() and isolate failures. -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 5: Build Started http://104.196.14.100/job/kudu-gerrit/2472/ -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 4: Build Started http://104.196.14.100/job/kudu-gerrit/2456/ -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Kudu Jenkins has posted comments on this change. Change subject: KUDU-1374: send full tablet report when new leader master is detected .. Patch Set 1: Build Started http://104.196.14.100/job/kudu-gerrit/2409/ -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] KUDU-1374: send full tablet report when new leader master is detected
Adar Dembo has uploaded a new change for review. http://gerrit.cloudera.org:8080/3643 Change subject: KUDU-1374: send full tablet report when new leader master is detected .. KUDU-1374: send full tablet report when new leader master is detected This should help prevent missed tablet reports in very specific edge cases, detailed in the bug report. No new tests, as I'm not quite sure how to test this (it would be easy if there was a way to mock a server-side krpc component as easily as it is to mock a client-side one). It should reduce flakiness in some existing multi-master tests, though. Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f --- M src/kudu/tserver/heartbeater.cc 1 file changed, 30 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/3643/1 -- To view, visit http://gerrit.cloudera.org:8080/3643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ic16fc46736476dba39616e79ecfe79eee48b3d7f Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo