Kevin Wikant created ZOOKEEPER-4840: ---------------------------------------
Summary: Repeated SessionExpiredException after Zookeeper daemon restart
Key: ZOOKEEPER-4840
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4840
Project: ZooKeeper
Issue Type: Bug
Reporter: Kevin Wikant

## Background

The application uses Zookeeper for leader election & metadata storage. It runs on 3 hosts, each of which also runs 1 Zookeeper daemon.

Previously the application ran on Zookeeper version 3.5.10 & Curator version 4.3.0.

After upgrading to Zookeeper version 3.9.1 & Curator version 5.2.0, a new edge case was observed: after Zookeeper daemons are restarted/failed, the application (i.e. the Zookeeper client) enters a 15+ minute loop of repeatedly logging {{SessionExpiredException}}.

These repeated {{SessionExpiredException}} logs are not indicative of a full Zookeeper client outage, because DEBUG logs show that other Zookeeper sessions are communicating just fine.

The {{SessionExpiredException}} logs unfortunately do not show the associated Session ID.

## Symptoms

When using Zookeeper version 3.9.1 & Curator version 5.2.0, after restarting/failing some of the Zookeeper daemons:

# All 3 Zookeeper clients experience connection failures lasting a few seconds after the Zookeeper daemons are failed/restarted.
# These connection failures resolve shortly afterwards without any action needed.
# Around 1 minute after the Zookeeper daemons are failed/restarted, all 3 Zookeeper clients start repeatedly logging {{SessionExpiredException}}.
# The {{SessionExpiredException}} is repeatedly logged for 15+ minutes. During this time there are no connectivity issues. The Zookeeper server logs show that all 3 Zookeeper servers are receiving regular traffic from the clients.
# Interestingly, each Zookeeper server receives no requests from its local Zookeeper client for the duration of the period in which {{SessionExpiredException}} is repeatedly logged. However, each Zookeeper server does receive regular traffic from the 2 remote Zookeeper clients.

The evidence suggests that this is a client-side issue & that the {{SessionExpiredException}} is being thrown before the request is even sent to the Zookeeper server.
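Since the {{SessionExpiredException}} logs do not show the Session ID, one way to narrow this down might be to attach the client's current session ID to each logged error. The following is a minimal sketch of that idea, not code from the application or from ZooKeeper/Curator: the class {{SessionIdLogging}} and its methods are hypothetical, and it uses only the JDK so it can run on its own. In a real client the {{LongSupplier}} would presumably be backed by something like {{client.getZookeeperClient().getZooKeeper().getSessionId()}} on the Curator client (with the checked exception from {{getZooKeeper()}} wrapped), which is an assumption about how the application is wired.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of ZooKeeper or Curator) that attaches the
 * current session ID to an error description before it is logged.
 */
final class SessionIdLogging {

    private SessionIdLogging() {}

    /** Formats a session ID as 0x-prefixed hex, the form ZooKeeper uses in its own log messages. */
    static String formatSessionId(long sessionId) {
        return "0x" + Long.toHexString(sessionId);
    }

    /**
     * Builds a log line containing the exception type, the session ID and the message.
     * If the session ID cannot be read, "unknown" is used instead of failing.
     */
    static String describe(Throwable error, LongSupplier sessionIdSource) {
        String session;
        try {
            session = formatSessionId(sessionIdSource.getAsLong());
        } catch (RuntimeException e) {
            // The lookup itself may fail while the client is reconnecting.
            session = "unknown";
        }
        return error.getClass().getSimpleName()
                + " (session " + session + "): " + error.getMessage();
    }

    public static void main(String[] args) {
        // Made-up session ID and a stand-in exception, for illustration only.
        LongSupplier fakeSession = () -> 0x100000a1b2c0001L;
        Exception error = new IllegalStateException("stand-in for SessionExpiredException");
        System.out.println(describe(error, fakeSession));
    }
}
```

Logging the session ID next to each exception would make it possible to check whether the repeated errors all belong to one stale session or to a series of newly created ones, and to match them against the session IDs in the Zookeeper server logs.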