[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangmin Lv updated ZOOKEEPER-3114:
----------------------------------
    Summary: Built-in data consistency check inside ZooKeeper  (was: Real time 
consistency check between leader and learner)

> Built-in data consistency check inside ZooKeeper
> ------------------------------------------------
>
>                 Key: ZOOKEEPER-3114
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3114
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: quorum
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Major
>             Fix For: 3.6.0
>
>
> The correctness of ZooKeeper was kind of proved in theory in ZAB paper, but 
> the implementation is a bit different from the paper, for example, save the 
> currentEpoch and proposals/commits upon to NEWLEADER is not atomic in current 
> implementation, so the correctness of ZooKeeper is not actually proved in 
> reality.
> Also bugs could be introduced during implementation, issues like sending 
> NEWLEADER packet too early reported in ZOOKEEPER-3104 might be there since 
> the beginning (didn't check exactly when this was introduced). 
> More correctness issues were introduced when adding new features, like on 
> disk txn sync, local session, retain database, etc, both of these features 
> added inconsistency bugs on production.
> To catch the consistency issue earlier, internally, we're running external 
> consistency checkers to compare nodes (digest), but that's not efficient 
> (slow and expensive) and there are corner cases we cannot cover in external 
> checker. For example, we don't know the last zxid before epoch change, which 
> makes it's impossible to check it's missing txn or not. Another challenge is 
> the false negative which is hard to avoid due to fuzzy snapshot or expected 
> txn gap during snapshot syncing, etc.
> This Jira is going to propose a built-in real time consistency check by 
> calculating the digest of DataTree after applying each txn, and sending it 
> over to learner during propose time so that it can verify the correctness in 
> real time. 
> The consistency check will cover all phases, including loading time during 
> startup, syncing, and broadcasting. It can help us avoid data lost or data 
> corrupt due to bad disk and catch bugs in code.
> The protocol change will make backward compatible to make sure we can 
> enable/disable this feature transparently.
> As for performance impact, based on our testing, it will add a bit overhead 
> during runtime, but doesn't have obvious impact in general.
> h2.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to