[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824156#comment-13824156
 ] 

Thawan Kooburat commented on ZOOKEEPER-1573:
--------------------------------------------

We probably need comments from other people as well. We disable this check in
our production system because we have another way of detecting data
inconsistency. This check has been shown to catch a real bug, but it can also
raise false positives under certain usage patterns.

> Unable to load database due to missing parent node
> --------------------------------------------------
>
>                 Key: ZOOKEEPER-1573
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1573
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.5.0
>            Reporter: Thawan Kooburat
>         Attachments: ZOOKEEPER-1573.patch
>
>
> While replaying the txnlog onto the data tree, the server has code to detect
> a missing parent node. This code block was last modified as part of
> ZOOKEEPER-1333. In our production environment, we found a case where this
> check returns a false positive.
> The sequence of txns is as follows:
> zxid 1:  create /prefix/a
> zxid 2:  create /prefix/a/b
> zxid 3:  delete /prefix/a/b
> zxid 4:  delete /prefix/a
> The server starts capturing a snapshot at zxid 1. However, by the time it has
> traversed the data tree down to /prefix, txn 4 has already been applied and
> /prefix has no children.
> When the server restores from the snapshot, it processes the txnlog starting
> from zxid 2. This txn generates a missing parent error and the server refuses
> to start up.
> The same check allowed me to discover the bug in ZOOKEEPER-1551, but I don't
> know if we have any option besides removing this check to solve this issue.
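
For illustration, the race described above can be reproduced with a toy model.
This is only a sketch: the set-of-paths tree and the names apply_txn,
fuzzy_snapshot, restore and strict are hypothetical, not ZooKeeper's actual
data tree or snapshot code, and the lenient mode only shows what replay looks
like with the check turned off. It is not the attached patch.

```python
# Toy model of the fuzzy-snapshot race. All names here are hypothetical;
# the "tree" is just a set of paths, not ZooKeeper's real data structure.


class MissingParentError(Exception):
    """Raised when a create txn targets a path whose parent is absent."""


def parent(path):
    head = path.rsplit("/", 1)[0]
    return head or "/"


def apply_txn(tree, op, path, strict):
    if op == "create":
        if parent(path) not in tree:
            if strict:
                raise MissingParentError(path)
            return  # lenient mode: silently skip the create
        tree.add(path)
    elif op == "delete":
        tree.discard(path)  # deleting an absent node is a no-op here


TXNS = [
    (1, "create", "/prefix/a"),
    (2, "create", "/prefix/a/b"),
    (3, "delete", "/prefix/a/b"),
    (4, "delete", "/prefix/a"),
]


def fuzzy_snapshot():
    """Start a snapshot at zxid 1, but serialize /prefix only after txn 4."""
    live = {"/", "/prefix"}
    apply_txn(live, "create", "/prefix/a", strict=True)  # zxid 1
    snapshot_zxid = 1
    snapshot = {"/"}  # the root is written out first
    for _, op, path in TXNS[1:]:  # zxid 2..4 land while the snapshot runs
        apply_txn(live, op, path, strict=True)
    # The traversal now reaches /prefix, which has no children left.
    snapshot |= {p for p in live if p != "/"}
    return snapshot, snapshot_zxid, live


def restore(snapshot, snapshot_zxid, strict):
    """Load the snapshot, then replay every txn after snapshot_zxid."""
    tree = set(snapshot)
    for zxid, op, path in TXNS:
        if zxid > snapshot_zxid:
            apply_txn(tree, op, path, strict)
    return tree


if __name__ == "__main__":
    snap, zxid, live = fuzzy_snapshot()
    try:
        restore(snap, zxid, strict=True)
    except MissingParentError as err:
        print("strict replay failed, missing parent for", err)
    print("lenient replay matches live tree:",
          restore(snap, zxid, strict=False) == live)
```

In this model the strict replay fails on zxid 2 even though the data is
consistent, while the lenient replay converges to the live tree. The trade-off
is the one noted in the comment: with the check off, a genuinely missing parent
would also go unreported.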



--
This message was sent by Atlassian JIRA
(v6.1#6144)