[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264585#comment-13264585
 ] 

Camille Fournier commented on ZOOKEEPER-1457:
---------------------------------------------

Hi Neha,

I'm trying to dig into this issue a bit, but I'm a little unclear on what the 
problem is. You have a node that was created by session C. When session C 
expires, the node is deleted. You don't expect it to be deleted, but I'm not 
sure why. Is it because you didn't know about session C? Is it because session 
B didn't get the right information about the node?

What does this mean in the context of the bug?
{quote}Since the leader processes create session and create znode for Session C 
first, shouldn't it be the session id that gets returned to the client as 
the create session response? Does this sound like a bug?{quote}

Session C seems to be the owner of the node, and you've got a closeSession for 
C, so is it really deleting the node for an unexpired session?
                
> Ephemeral node deleted for unexpired sessions
> ---------------------------------------------
>
>                 Key: ZOOKEEPER-1457
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1457
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.3.4
>            Reporter: Neha Narkhede
>
> This week, we saw a potential bug with zookeeper 3.3.4. In an attempt to 
> add a separate disk for zookeeper transaction logs, our SysOps team threw 
> new disks at all the zookeeper servers in our production cluster at around 
> the same time. Right after this, we saw degraded performance on our zookeeper 
> cluster. And yes, I agree that this degraded behavior is expected and we 
> could've done a better job and upgraded one server at a time. However, the 
> observed impact was that ephemeral nodes got deleted without session 
> expiration on the zookeeper clients. 
> Let me try to describe what I've observed from the Kafka and ZK server logs. 
> The Kafka client has a session established with ZK, say Session A, that it has 
> been using successfully. At the time of the degraded ZK performance issue, 
> Session A expires. Kafka's ZkClient tries to establish another session with 
> ZK. After 9 seconds, it establishes a session, say Session B and tries to use 
> it for creating a znode. This operation fails with a NodeExists error since 
> another session, say session C, has created that znode. This is considered OK 
> since ZkClient retries an operation transparently if it gets disconnected and 
> sometimes you can get NodeExists. But then later, session C expires and hence 
> the ephemeral node is deleted from ZK. This leads to unexpected errors in 
> Kafka since its session, Session B, is still valid and hence it expects the 
> znode to be there. The issue is that session C was established, created the 
> znode and expired, without the zookeeper client on Kafka ever knowing about 
> it. 
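
To make the sequence in the description easier to follow, here is a toy in-memory model of ephemeral node ownership. It is only a sketch and is not ZooKeeper's implementation: the class name, the session ids 0xB and 0xC, and the path /example/node are all made up for illustration. It shows why a create from session B fails while session C owns the node, and why the node disappears when session C is closed even though session B is still valid.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of ephemeral znode ownership (hypothetical, not ZooKeeper code). */
public class EphemeralOwnerSketch {
    /** Maps a znode path to the id of the session that created it. */
    private final Map<String, Long> ephemeralOwner = new HashMap<>();

    /** Returns true if the node was created; false models a NodeExists error. */
    public boolean createEphemeral(long sessionId, String path) {
        return ephemeralOwner.putIfAbsent(path, sessionId) == null;
    }

    /** Returns the owning session id, or null if the node does not exist. */
    public Long ownerOf(String path) {
        return ephemeralOwner.get(path);
    }

    /** Models session expiry or close: drops every node owned by the session. */
    public void closeSession(long sessionId) {
        ephemeralOwner.values().removeIf(owner -> owner == sessionId);
    }

    public static void main(String[] args) {
        long sessionB = 0xBL;           // assumed id for the session the client knows about
        long sessionC = 0xCL;           // assumed id for the session the client never saw
        String path = "/example/node";  // hypothetical znode path

        EphemeralOwnerSketch zk = new EphemeralOwnerSketch();

        // Session C creates the node first.
        System.out.println("C created: " + zk.createEphemeral(sessionC, path));

        // Session B's create fails, which models the NodeExists error.
        System.out.println("B created: " + zk.createEphemeral(sessionB, path));

        // The node is owned by C, not by the session the client is using.
        System.out.println("owned by C: " + (zk.ownerOf(path) == sessionC));

        // Closing C removes the node although B was never closed.
        zk.closeSession(sessionC);
        System.out.println("node present after C closed: " + (zk.ownerOf(path) != null));
    }
}
```

Against a real ensemble, one way to check for this situation after a NodeExists error might be to compare Stat.getEphemeralOwner() for the node with ZooKeeper.getSessionId() for the current handle. If they differ, the node belongs to some other session and will go away when that session ends.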


