David Ribeiro Alves has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9277 )

Change subject: WIP [tests] scenario to repro off-by-one error in TestWorkload
......................................................................


Patch Set 1:

I briefly tried this with my WIP implementation of leader leases and the 
problem seemed to go away. While not definitive, this is encouraging 
evidence that the problem indeed stems from the lack of leader leases.
Conceptually this seems plausible too. Scenario:
- Two replicas ("a" and "b") dispute leadership of tablet A, i.e. both think 
they're leaders, but "b" is the actual most recent leader.
- Client writes the last row to replica "b".
- Client then performs a scan at snapshot for all rows on replica "a", sending 
the timestamp of the write.
- Replica "a", thinking it's the leader, simply considers the requested 
timestamp "safe" and executes a (non-repeatable) scan that doesn't include the 
last row written to "b".
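
To make the scenario above concrete, here is a toy model of it. This is only a 
sketch and not Kudu code: Replica, NotSafeYet, run_scenario and the 
require_lease switch are all made-up names, and "holding a lease" is reduced to 
a boolean. The assumption it illustrates is that a replica which believes it 
leads may declare any requested timestamp safe unless a lease check stops it.

```python
class NotSafeYet(Exception):
    """Raised when a replica cannot yet serve a snapshot at the requested timestamp."""


class Replica:
    """Hypothetical toy replica; not a Kudu class."""

    def __init__(self, name, thinks_it_leads, holds_lease):
        self.name = name
        self.thinks_it_leads = thinks_it_leads
        self.holds_lease = holds_lease
        self.rows = {}      # write timestamp -> row
        self.safe_time = 0  # highest timestamp this replica considers safe to scan

    def apply(self, ts, row):
        """Apply a write at timestamp ts and advance safe time past it."""
        self.rows[ts] = row
        self.safe_time = max(self.safe_time, ts)

    def scan_at_snapshot(self, snapshot_ts, require_lease):
        """Return all rows with timestamp <= snapshot_ts, in timestamp order."""
        if snapshot_ts > self.safe_time:
            # A replica that thinks it leads declares the timestamp safe on its
            # own authority, unless leases are required and it holds none.
            may_declare_safe = self.thinks_it_leads and (
                self.holds_lease or not require_lease)
            if not may_declare_safe:
                raise NotSafeYet(self.name)
            self.safe_time = snapshot_ts
        return [row for ts, row in sorted(self.rows.items()) if ts <= snapshot_ts]


def run_scenario(require_lease):
    """Replay the scenario; return the rows scanned on "a", or None if refused."""
    a = Replica("a", thinks_it_leads=True, holds_lease=False)  # stale leader
    b = Replica("b", thinks_it_leads=True, holds_lease=True)   # actual leader
    for ts in (1, 2):
        a.apply(ts, "row%d" % ts)
        b.apply(ts, "row%d" % ts)
    b.apply(3, "row3")  # the last row reaches only "b"
    try:
        # Client scans "a" at the timestamp of its last write.
        return a.scan_at_snapshot(3, require_lease)
    except NotSafeYet:
        return None
```

In this model run_scenario(require_lease=False) returns two rows and silently 
misses "row3", which is the off-by-one count. With require_lease=True replica 
"a" refuses the scan instead of returning a short result.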

Possible follow-up steps:
- Retry the scan. If the problem goes away, then it's more likely that the 
problem stems from the lack of leader leases.
- Even better, hack it so that we retry the scan _at the same snapshot 
timestamp_.
- Add logging events (test only) that register the timestamps/safe time/chosen 
replica on the server side (hopefully showing the smoking gun).
- Implement leader leases and put this test on top :)
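
The "retry at the same snapshot timestamp" step could look roughly like the 
helper below. It is a sketch under assumptions: rescan_at_same_snapshot is a 
hypothetical name, and scan_fn stands in for whatever the test uses to scan at 
a given timestamp and return the rows. It is not an existing client or 
TestWorkload API.

```python
def rescan_at_same_snapshot(scan_fn, snapshot_ts, attempts=3):
    """Scan repeatedly at one fixed timestamp and compare the row counts.

    scan_fn is a hypothetical callable: scan_fn(snapshot_ts) -> list of rows.
    Returns (counts, non_repeatable), where non_repeatable is True if the
    scans at the same timestamp did not all return the same number of rows.
    """
    counts = [len(scan_fn(snapshot_ts)) for _ in range(attempts)]
    return counts, len(set(counts)) > 1
```

If the counts differ across attempts at an unchanged timestamp, the snapshot 
scan was not repeatable, which would point at safe time being advanced 
incorrectly rather than at the workload's own row accounting.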


--
To view, visit http://gerrit.cloudera.org:8080/9277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60666b8b05dce8dd13fcdee6408c0930915ba0c1
Gerrit-Change-Number: 9277
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Wed, 21 Feb 2018 00:54:04 +0000
Gerrit-HasComments: No
