Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7439 )

Change subject: mvcc: allow tablet shutdown without completing txs
......................................................................


Patch Set 15:

> Patch Set 15:
>
> (1 comment)

The behavior I saw before was that when I stopped MVCC on a leader and 
continued to write to that leader, something would successfully complete and I 
would see a mismatch in attempt numbers, resulting in a check failure.

I think this is being caused by the following race with three nodes A*, B, and 
C:
1. A* gets a client request for rpc0
2. rpc0 gets replicated and the nodes begin applying with attempt 0
3. A* fails. The client will try different nodes, but if A* remains the leader, 
it will eventually cycle back to A* and try again, so it replicates again.
4. B and C begin to apply rpc0 attempt 1, registering it with the result 
tracker and preempting the follower transaction rpc0 attempt 0
5. B and C finish applying rpc0 attempt 0 and fail because attempt 0 is no 
longer the driver
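
As a rough illustration of steps 2 through 5 on a single follower, here is a 
minimal sketch. ToyResultTracker, TrackAsDriver, RecordCompletion, and 
ReplayFollowerRace are hypothetical names made up for this example; this is a 
toy model of the "driver attempt" bookkeeping, not Kudu's actual ResultTracker, 
and it returns false where the real check would fail fatally.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a result tracker (hypothetical, not Kudu's ResultTracker).
// It remembers only which attempt number currently "drives" each RPC.
class ToyResultTracker {
 public:
  // Registers an attempt for rpc_id and makes it the driver, preempting
  // whichever attempt was the driver before.
  void TrackAsDriver(const std::string& rpc_id, int attempt_no) {
    driver_attempt_[rpc_id] = attempt_no;
  }

  // Returns true if attempt_no is still the driver for rpc_id.
  // In the scenario described above, the analogous mismatch is a check
  // failure; the toy model just reports it to the caller.
  bool RecordCompletion(const std::string& rpc_id, int attempt_no) const {
    auto it = driver_attempt_.find(rpc_id);
    assert(it != driver_attempt_.end() && "completion for an untracked RPC");
    return it->second == attempt_no;
  }

 private:
  std::map<std::string, int> driver_attempt_;
};

// Replays the interleaving on one follower (B or C).
bool ReplayFollowerRace() {
  ToyResultTracker tracker;
  tracker.TrackAsDriver("rpc0", 0);            // step 2: attempt 0 starts applying
  tracker.TrackAsDriver("rpc0", 1);            // step 4: retry preempts attempt 0
  return tracker.RecordCompletion("rpc0", 0);  // step 5: attempt 0 finishes late
}
```

In this model ReplayFollowerRace() returns false, which corresponds to the 
attempt number mismatch: attempt 0 completes after attempt 1 has already taken 
over as the driver.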

If I put a sleep in ResultTracker::RecordCompletionAndRespond(), the above 
scenario is triggered consistently. We do this preemption to ensure that we 
don't respond to a leader transaction that may no longer be valid, but I don't 
see where we might be aborting this follower transaction.

Currently doing more testing around this.


--
To view, visit http://gerrit.cloudera.org:8080/7439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I983620f27e7226806a2cca253db7619731914d42
Gerrit-Change-Number: 7439
Gerrit-PatchSet: 15
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Fri, 29 Sep 2017 22:02:46 +0000
Gerrit-HasComments: No
