Dimas Shidqi Parikesit created ZOOKEEPER-4837:
-------------------------------------------------

             Summary: Network issue causes ephemeral node unremoved after the 
session expiration
                 Key: ZOOKEEPER-4837
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4837
             Project: ZooKeeper
          Issue Type: Bug
          Components: quorum, server
            Reporter: Dimas Shidqi Parikesit


In our testing cluster running the latest ZooKeeper version (66202cb), we 
observed that an ephemeral node is sometimes never deleted, even after its 
session expires, if there is a network issue during the PROPOSAL request. This 
bug is closely related to ZOOKEEPER-2355, which the previous patch did not 
fully fix. We also tested some related open PRs (e.g., 
[https://github.com/apache/zookeeper/pull/2152] and 
[https://github.com/apache/zookeeper/pull/1925]) and confirmed that the issue 
still exists with the proposed fixes applied.

 

Steps to reproduce this bug:
 # Start a cluster with 3 servers: follower A, leader B, and follower C
 # Open a zk client on server A
 # Create an ephemeral node from the client
 # Inject a network fault on all servers that causes a SocketTimeoutException 
in readPacket whenever the packet is a PROPOSAL
 # Close the client
 # Wait until the cluster is stable (the leader will change between B and C 
several times)
 # Remove the network fault from all servers
 # Check every server for the ephemeral node. The node will still exist on 
server A, but it will no longer exist on servers B and C.
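
The fault in step 4 can be approximated with a small wrapper like the one 
below. This is only a sketch, not the harness we used: PacketSource, 
withFault, and the faultEnabled switch are hypothetical names, and the 
PROPOSAL constant is assumed to mirror Leader.PROPOSAL in the ZooKeeper source.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.function.BooleanSupplier;

public class ProposalFaultSketch {
    // Assumption: mirrors the packet type constant Leader.PROPOSAL.
    public static final int PROPOSAL = 2;

    // Hypothetical stand-in for the code path that reads one quorum packet
    // and returns its type.
    public interface PacketSource {
        int readPacketType() throws IOException;
    }

    // Wraps a packet source so that, while the fault is enabled, reading a
    // PROPOSAL packet fails with a SocketTimeoutException. Other packet
    // types pass through unchanged.
    public static PacketSource withFault(PacketSource delegate,
                                         BooleanSupplier faultEnabled) {
        return () -> {
            int type = delegate.readPacketType();
            if (faultEnabled.getAsBoolean() && type == PROPOSAL) {
                throw new SocketTimeoutException("injected fault on PROPOSAL");
            }
            return type;
        };
    }
}
```

Turning faultEnabled off corresponds to step 7, after which PROPOSAL packets 
are delivered normally again.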

 

Essentially, the bug is caused by loadDatabase loading a snapshot with a higher 
zxid than the truncated log, causing fastForwardFromEdits to fail when trying 
to replay the transactions. For example, if one of the followers has a 
lastProcessedZxid of 0x200000001 and a last snapshot of snapshot.200000001, and 
the leader sends a TRUNC with a zxid of 0x100000002, truncateLog will truncate 
the follower's log to 0x100000002. However, loadDatabase will still load 
snapshot.200000001, so when fastForwardFromEdits runs, it will set the data 
tree to 0x200000001 instead of 0x100000002.
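
The mismatch in this example can be checked with a small standalone sketch. It 
is not ZooKeeper code: TruncMismatchSketch and its methods are hypothetical. It 
only assumes that a zxid keeps the epoch in its high 32 bits and a counter in 
its low 32 bits, and that snapshot files are named snapshot.<zxid in hex>.

```java
public class TruncMismatchSketch {
    private static final String PREFIX = "snapshot.";

    // Epoch is stored in the high 32 bits of a zxid.
    public static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Per-epoch counter is stored in the low 32 bits of a zxid.
    public static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // Parses the hex zxid suffix of a snapshot file name.
    public static long snapshotZxid(String fileName) {
        return Long.parseLong(fileName.substring(PREFIX.length()), 16);
    }

    // True when the snapshot contains state newer than the truncation point,
    // i.e. loading it would undo the effect of the TRUNC.
    public static boolean snapshotIsAheadOfLog(String fileName, long truncZxid) {
        return snapshotZxid(fileName) > truncZxid;
    }

    public static void main(String[] args) {
        long truncZxid = 0x100000002L;
        String snapshot = "snapshot.200000001";
        System.out.println("snapshot epoch: " + epochOf(snapshotZxid(snapshot)));
        System.out.println("trunc epoch:    " + epochOf(truncZxid));
        System.out.println("snapshot ahead of truncated log: "
                + snapshotIsAheadOfLog(snapshot, truncZxid));
    }
}
```

With the values from the example, the snapshot belongs to epoch 2 while the 
truncation point is in epoch 1, so the snapshot is ahead of the truncated log.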

 

We have also attached a test case that reproduces this issue. Note that the 
test case is still fairly flaky at the moment.

 

We propose to fix this case by having truncateLog load the database from the 
last snapshot taken before the last entry of the truncated log. See our 
attached PR. This may not be the ideal solution, and we welcome suggestions. 
Other potential solutions include: 

(1) Disable fastForwardDatabase in shutdown

(2) Run setLastProcessedZxid at the end of Learner's syncWithLeader function if 
the packet is Leader.DIFF 
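
As an illustration of the snapshot selection behind our primary proposal 
(loading from the last snapshot before the truncation point), here is a 
minimal sketch. It is not the code in the PR: SnapshotPickerSketch and 
latestSnapshotAtOrBefore are hypothetical names, the snapshot.<zxid in hex> 
naming is assumed, and treating a snapshot exactly at the truncation zxid as 
usable is our assumption.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class SnapshotPickerSketch {
    private static final String PREFIX = "snapshot.";

    // Parses the hex zxid suffix of a snapshot file name.
    public static long zxidOf(String fileName) {
        return Long.parseLong(fileName.substring(PREFIX.length()), 16);
    }

    // Returns the newest snapshot whose zxid does not exceed the truncation
    // zxid, or an empty Optional if every snapshot is newer than that point.
    public static Optional<String> latestSnapshotAtOrBefore(
            List<String> fileNames, long truncZxid) {
        return fileNames.stream()
                .filter(name -> name.startsWith(PREFIX))
                .filter(name -> zxidOf(name) <= truncZxid)
                .max(Comparator.comparingLong(SnapshotPickerSketch::zxidOf));
    }
}
```

For the example above, a follower holding snapshot.100000001 and 
snapshot.200000001 that receives a TRUNC to 0x100000002 would pick 
snapshot.100000001 and skip the newer snapshot.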

 

Your insights are very much appreciated. We will continue to follow up on this 
issue until it is resolved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
