David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9277 )
Change subject: WIP [tests] scenario to repro off-by-one error in TestWorkload
......................................................................

Patch Set 1:

I briefly tried this with my WIP implementation of leader leases and the problem seemed to have gone away. While not definitive, this is encouraging evidence that the problem indeed stems from the lack of leader leases. Conceptually this seems plausible too.

Scenario:
- Two replicas (a, b) dispute leadership of tablet A, i.e. both think they're leaders, but "b" is the actual most recent leader.
- Client writes the last row to replica "b".
- Client then performs a scan at snapshot for all rows on replica "a", sending the timestamp of the write.
- Replica "a", thinking it's the leader, simply considers the current timestamp "safe" and executes a (non-repeatable) scan that doesn't include the last row written to "b".

Possible follow-up steps:
- Retry the scan. If the problem goes away, then it's more likely that the problem stems from the lack of leader leases.
- Even better, hack it so that we retry the scan _at the same snapshot timestamp_.
- Add logging events (test only) that register the timestamps/safe time/chosen replica on the server side (hopefully showing the smoking gun).
- Implement leader leases and put this test on top :)

--
To view, visit http://gerrit.cloudera.org:8080/9277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60666b8b05dce8dd13fcdee6408c0930915ba0c1
Gerrit-Change-Number: 9277
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Wed, 21 Feb 2018 00:54:04 +0000
Gerrit-HasComments: No
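
For illustration only, the scenario described in the comment can be sketched as a toy model. This is not Kudu code: the `Replica` class, the `lease_valid` flag, the `NotSafeYet` exception, and `run_scenario` are all hypothetical names invented for this sketch, and the "retry on the other replica" behavior is an assumption standing in for whatever a real server would do when it cannot declare a timestamp safe.

```python
from dataclasses import dataclass, field


class NotSafeYet(Exception):
    """Hypothetical signal: the replica cannot prove the snapshot timestamp is safe."""


@dataclass
class Replica:
    name: str
    believes_leader: bool          # what the replica thinks about itself
    lease_valid: bool              # whether it would hold a valid leader lease
    rows: dict = field(default_factory=dict)  # row key -> commit timestamp

    def write(self, key, ts):
        self.rows[key] = ts

    def scan_at_snapshot(self, snapshot_ts, use_leases):
        # Without leases, believing to be leader is enough to treat the
        # requested timestamp as "safe". With leases, a valid lease is
        # also required (assumption made for this sketch).
        may_serve = self.believes_leader and (self.lease_valid or not use_leases)
        if not may_serve:
            raise NotSafeYet(self.name)
        return [k for k, ts in self.rows.items() if ts <= snapshot_ts]


def run_scenario(use_leases):
    # Both replicas think they lead; only "b" is the most recent leader.
    a = Replica("a", believes_leader=True, lease_valid=False)
    b = Replica("b", believes_leader=True, lease_valid=True)

    # Rows 1..9 reached both replicas before the leadership dispute.
    for ts in range(1, 10):
        a.write(f"row{ts}", ts)
        b.write(f"row{ts}", ts)

    # The last row is written to "b" only.
    b.write("row10", 10)

    # The client scans on "a" at the timestamp of its last write.
    snapshot_ts = 10
    try:
        return len(a.scan_at_snapshot(snapshot_ts, use_leases))
    except NotSafeYet:
        # Assumed client behavior: retry on the other replica.
        return len(b.scan_at_snapshot(snapshot_ts, use_leases))


if __name__ == "__main__":
    print("rows seen without leases:", run_scenario(use_leases=False))
    print("rows seen with leases:   ", run_scenario(use_leases=True))
```

In this toy model the scan without leases comes back one row short, matching the off-by-one symptom, while the lease check forces the stale replica to decline the scan. It says nothing about how the actual leader-lease implementation behaves.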