[
https://issues.apache.org/jira/browse/ZOOKEEPER-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088277#comment-16088277
]
ASF GitHub Bot commented on ZOOKEEPER-2845:
-------------------------------------------
GitHub user lvfangmin opened a pull request:
https://github.com/apache/zookeeper/pull/310
[ZOOKEEPER-2845][Test] Test used to reproduce the data inconsistency issue
due to retain database in leader election
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/lvfangmin/zookeeper ZOOKEEPER-2845-TEST
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/zookeeper/pull/310.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #310
----
commit ff0bc49de51635da1d5bff0e4f260a61acc87db0
Author: Fangmin Lyu <[email protected]>
Date: 2017-07-14T23:02:20Z
reproduce the data inconsistency issue
----
> Data inconsistency issue due to retain database in leader election
> ------------------------------------------------------------------
>
> Key: ZOOKEEPER-2845
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2845
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum
> Affects Versions: 3.4.10, 3.5.3
> Reporter: Fangmin Lv
> Assignee: Fangmin Lv
> Priority: Critical
>
> In ZOOKEEPER-2678, the ZKDatabase is retained to reduce the unavailable time
> during leader election. In a ZooKeeper ensemble, it's possible that the
> snapshot is ahead of the txn file (due to a slow disk on the server, etc.), or
> that the txn file is ahead of the snapshot because no commit message has been
> received yet.
> If the snapshot is ahead of the txn file, the SyncRequestProcessor queue is
> drained during shutdown, so the snapshot and txn file are consistent again
> before leader election happens, and this is not an issue.
> But if the txn file is ahead of the snapshot, the ensemble can end up with
> inconsistent data. Here is a simplified scenario that shows the issue:
> Let's say we have 3 servers in the ensemble, servers A and B are followers
> and C is the leader, and all snapshots and txn files are up to T0:
> 1. A new request to create Node N reaches leader C, and it is converted to
> txn T1
> 2. Txn T1 is synced to disk on C, but just before the proposal reaches
> the followers, A and B restart, so T1 does not exist on A and B
> 3. A and B form a new quorum after the restart; let's say B is the leader
> 4. C changes to the LOOKING state because it no longer has enough followers;
> it then syncs with leader B with last zxid T0, which results in an empty
> diff sync
> 5. Before C takes a snapshot it restarts and replays the txns on disk, which
> include T1; now it has Node N, but A and B don't have it.
> I have also included a test case that reproduces this issue consistently.
> We have a totally different RetainDB version that avoids this issue by
> making the snapshot and txn files agree before leader election; we will
> submit it for review.
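The five-step scenario in the description can be modelled with a small simulation. This is a minimal sketch, not ZooKeeper code: the `Server` class, its fields, and `run_scenario` are hypothetical names, zxids T0 and T1 are modelled as the integers 0 and 1, and it assumes a restart loads the snapshot and then replays every logged txn after it.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy model of one ensemble member (hypothetical, not a ZooKeeper class)."""
    name: str
    txn_log: list = field(default_factory=list)  # txns synced to disk: (zxid, path)
    snapshot: set = field(default_factory=set)   # nodes in the last on-disk snapshot
    snapshot_zxid: int = 0                       # T0
    memory: set = field(default_factory=set)     # retained in-memory database
    last_processed_zxid: int = 0                 # last zxid applied in memory

    def log_proposal(self, zxid, path):
        # The txn is persisted before it is committed, so only the
        # on-disk log changes; the in-memory database stays at T0.
        self.txn_log.append((zxid, path))

    def restart(self):
        # Assumption: a restart rebuilds memory from the snapshot and
        # replays every logged txn that is newer than the snapshot.
        self.memory = set(self.snapshot)
        self.last_processed_zxid = self.snapshot_zxid
        for zxid, path in self.txn_log:
            if zxid > self.last_processed_zxid:
                self.memory.add(path)
                self.last_processed_zxid = zxid

    def diff_sync_from(self, leader):
        # The follower reports the zxid of its retained in-memory database.
        # Nothing here inspects or truncates the follower's own txn log,
        # which is the gap the scenario relies on.
        diff = [t for t in leader.txn_log if t[0] > self.last_processed_zxid]
        for zxid, path in diff:
            self.memory.add(path)
            self.last_processed_zxid = zxid
        return diff


def run_scenario():
    a, b, c = Server("A"), Server("B"), Server("C")
    c.log_proposal(1, "/N")      # steps 1-2: T1 reaches C's disk only
    a.restart()                  # step 2: A and B restart without T1
    b.restart()
    diff = c.diff_sync_from(b)   # steps 3-4: B leads, C syncs at zxid T0
    c.restart()                  # step 5: C replays its log, including T1
    return diff, {s.name: sorted(s.memory) for s in (a, b, c)}
```

Calling `run_scenario()` returns an empty diff together with a final state in which only C holds `/N`, which matches the divergence described in step 5.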
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)