[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262900#comment-13262900
 ] 

Neha Narkhede commented on ZOOKEEPER-1457:
------------------------------------------

I've attempted to serialize the sequence of events according to relevant data 
from txn and log4j logs below. 

Zookeeper cluster size is 5 servers - srvr-84.prod, srvr-85.prod, srvr-88.prod, 
srvr-89.prod, srvr-90.prod
Leader zookeeper is srvr-90.prod 
Session A = 0x9367a5bd54c9078 
Session B = 0x8367a5bd75ea01b 
Session C = 0x8367a5bd75e9dd5 
The znode being created = 
/consumers/kafka-DR/ids/kafka-DR_srvr-0193.prod-1334094226781-6a7f1b76 
Zookeeper session timeout is 6 seconds 

1. Original session A, 0x9367a5bd54c9078 expired at 20:33:09 with zxid 
0x31189e02d8 on the leader. 

2. Leader processed session termination for 0x9367a5bd54c9078 at 20:33.09.000. 

3. Leader realizes client has closed socket, so it closes NIOServerCnxn for 
0x9367a5bd54c9078 at 20:33:09.009 

4. Kafka client gets Expired state at 20:33:09.161 

5. Kafka client starts re-registering its node in ZK at 20:33.09.209 with 
socket connection established to srvr-89.prod 

6. Client session 0x0 times out several times, reconnects to another server and 
repeats session establishment request. 

7. On the leader's txn log, session B 0x8367a5bd75ea01b is created at 20:33.17- 
1. time:4/23/12 20:33:17 PM PDT session:0x8367a5bd75ea01b cxid:0x0 
zxid:0x31189e1985 type:createSession timeOut:6000 

8. On zk server srvr-89, session B 0x8367a5bd75ea01b is established - 
1. zk.log4j.2012-04-23-srvr-89.gz: 1. 2012-04-23 20:33:17,987 - INFO 
[CommitProcessor:8:NIOServerCnxn@1580] - Established session 0x8367a5bd75ea01b 
with negotiated timeout 6000 for client /172.17.87.173:52865 
2. 2012/04/23 20:33:18.022 INFO [ZkClient] [main-EventThread] [kafka] zookeeper 
state changed (SyncConnected) 

9. At 20:33:18.022 on Kafka client, session is established with session id 
0x8367a5bd75ea01b 
1. 2012/04/23 20:33:18.022 INFO [ClientCnxn] 
[main-SendThread(srvr-89.prod:12913)] [kafka] Session establishment complete on 
server srvr-89.prod/172.17.229.105:12913, sessionid = 0x8367a5bd75ea01b, 
negotiated timeout = 6000 

10. Kafka zookeeper client fails to create znode at 20:33.18.358 with Session B 
due to NodeExists exception 
1. 2012/04/23 20:33:18.358 INFO [ZkUtils$] 
[ZkClient-EventThread-81-srvr-85.prod:12913,srvr-88.prod:12913,srvr-84.prod:12913,srvr-89.prod:12913,srvr-90.prod:12913/kafka]
 [kafka] /consumers/kafka-DR/ids/kafka-DR_srvr-0193.prod-1334094226781-6a7f1b76 
exists with value { "firehoseActivity": 1,"firehoseUpdates": 
1,"GenericSearchEvent": 1,"firehoseShares": 1,"ProfileViewEvent": 1 } during 
connection loss; this is ok

11. Kafka client gets Disconnected and SyncConnected with session B at 20:33:26 
1. 2012/04/23 20:33:26.669 INFO [ClientCnxn] 
[main-SendThread(srvr-89.prod:12913)] [kafka] Client session timed out, have 
not heard from server in 4000ms for sessionid 0x8367a5bd75ea01b, closing socket 
connection and attempting reconnect 
2. 2012/04/23 20:33:26.769 INFO [ZkClient] [main-EventThread] [kafka] zookeeper 
state changed (Disconnected) 
3. 2012/04/23 20:33:27.471 INFO [ClientCnxn] 
[main-SendThread(srvr-89.prod:12913)] [kafka] Opening socket connection to 
server srvr-88.prod/172.17.229.104:12913 
4. 2012/04/23 20:33:27.531 INFO [ClientCnxn] 
[main-SendThread(srvr-88.prod:12913)] [kafka] Socket connection established to 
srvr-88.prod/172.17.229.104:12913, initiating session 
5. 2012/04/23 20:33:28.419 INFO [ClientCnxn] 
[main-SendThread(srvr-88.prod:12913)] [kafka] Session establishment complete on 
server srvr-88.prod/172.17.229.104:12913, sessionid = 0x8367a5bd75ea01b, 
negotiated timeout = 6000 

12. Kafka client runs into NoNode exception at 20:33:30 2012/04/23 20:33:30.538 
INFO [ZookeeperConsumerConnector] 
[kafka-DR_srvr-0193.prod-1334094226781-6a7f1b76_watcher_executor] [kafka] 
exception during rebalance 

At around the same time, the session C (0x8367a5bd75e9dd5) gets established and 
also creates the node and later expires. 

1. On leader srvr-90, session C gets created 1. time:4/23/12 20:33:09 PM PDT 
session:0x8367a5bd75e9dd5 cxid:0x0 zxid:0x31189e0709 type:createSession 
timeOut:6000 

2. On server srvr-89, session gets created at 20:33.13.930 
1. zk.log4j.2012-04-23-srvr-89.gz:2012-04-23 20:33:13,930 - INFO 
[CommitProcessor:8:NIOServerCnxn@1580] - Established session 0x8367a5bd75e9dd5 
with negotiated timeout 6000 for client /172.17.87.173:52802 
2. zk.log4j.2012-04-23-srvr-89.gz:2012-04-23 20:33:17,667 - WARN 
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@634] - 
EndOfStreamException: Unable to read additional data from client sessionid 
0x8367a5bd75e9dd5, likely client has closed socket 
3. zk.log4j.2012-04-23-srvr-89.gz:2012-04-23 20:33:17,668 - INFO 
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed socket 
connection for client /172.17.87.173:52802 which had sessionid 
0x8367a5bd75e9dd5 

3. Session C creates the znode - 
1. time:4/23/12 20:33:17 PM PDT session:0x8367a5bd75e9dd5 cxid:0x1 
zxid:0x31189e1464 type:create acl:[31,s{'world,'anyone}] ephemeral:true 
path:/kafka/consumers/kafka-DR/ids/kafka-DR_ech3-app0193.prod-1334094226781-6a7f1b76
 
data:7B202266697265686F73654163746976697479223A20312C2266697265686F736555706461746573223A20312C2247656E657269635365617263684576656E74223A20312C2266697265686F7365536861726573223A20312C2250726F66696C65566965774576656E74223A2031207D
 
2. Session B gets closed on the leader

4. Session C expires on leader time:4/23/12 20:33:29 PM PDT 
session:0x8367a5bd75e9dd5 cxid:0x0 zxid:0x31189e27d0 type:closeSession 

Since the leader processes create session and create znode for Session C first, 
shouldn't it be the session id that gets returned to the client as create 
session response ? Does this sound like a bug ?
                
> Ephemeral node deleted for unexpired sessions
> ---------------------------------------------
>
>                 Key: ZOOKEEPER-1457
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1457
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.3.4
>            Reporter: Neha Narkhede
>
> This week, we saw a potential bug with zookeeper 3.3.4. In an attempt to 
> adding a separate disk for zookeeper transaction logs, our SysOps team threw 
> new disks at all the zookeeper servers in our production cluster at around 
> the same time. Right after this, we saw degraded performance on our zookeeper 
> cluster. And yes, I agree that this degraded behavior is expected and we 
> could've done a better job and upgraded one server at a time. Al though, the 
> observed impact was that ephemeral nodes got deleted without session 
> expiration on the zookeeper clients. 
> Let me try and describe what I've observed from the Kafka and ZK server logs 
> - Kafka client has a session established with ZK, say Session A, that it has 
> been using successfully. At the time of the degraded ZK performance issue, 
> Session A expires. Kafka's ZkClient tries to establish another session with 
> ZK. After 9 seconds, it establishes a session, say Session B and tries to use 
> it for creating a znode. This operation fails with a NodeExists error since 
> another session, say session C, has created that znode. This is considered OK 
> since ZkClient retries an operation transparently if it gets disconnected and 
> sometimes you can get NodeExists. But then later, session C expires and hence 
> the ephemeral node is deleted from ZK. This leads to unexpected errors in 
> Kafka since its session, Session B, is still valid and hence it expects the 
> znode to be there. The issue is that session C was established, created the 
> znode and expired, without the zookeeper client on Kafka ever knowing about 
> it. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


Reply via email to