Hi, we use a single standalone Zookeeper in Version 3.3.6 with multiple (Java) Clients. Clients and Server are running in one zone of a Solaris SPARC.
The Clients are successfully connecting to the server, sending frequently their heartbeats and are able to execute operations. It seems you could run this forever and no timeout ever happens. But when they have finished their operations and go back to idle, just sending their heartbeats but doesn't execute any operations, after a while (seems like it is always about an hour) some but NOT all clients are starting to timeout. The client is reconnecting in time so the session is not lost but for me this is kind of strange, so my questions are: - Do you have an explanation what can lead to these timeouts which apparently happen on an hourly base? The network is up and running the whole time and as I said it just happens in the 'idle' state, seems like 'load' state would run forever. - Why is the IP switching between ipv4 and ipv6 with each reconnect? I would be glad if someone can put some light in the dark. See the attached log file for more details. Regards, Achim Log: ### Server ### 2015-11-04 11:16:58,367 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /fd00:0:0:c20:0:0:1:2:37188 which had sessionid 0x150cddeb4cd0008 2015-11-04 11:16:58,955 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /10.64.122.26:60066 2015-11-04 11:16:58,956 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client attempting to renew session 0x150cddeb4cd0008 at /10.64.122.26:60066 2015-11-04 11:16:58,957 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established session 0x150cddeb4cd0008 with negotiated timeout 15000 for client / 10.64.122.26:60066 ### Client ### 0x150cddeb4cd0008, closing socket connection and attempting reconnect 2015-11-04 11:16:58,364 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1181 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Client session timed out, have not heard from server in 11216ms for sessionid 0x150cddeb4cd0008, closing socket connection and attempting reconnect 2015-11-04 11:16:58,953 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1058 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Opening socket connection to server xxx.xxx.xxx/10.64.122.26:2181 2015-11-04 11:16:58,955 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 947 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Socket connection established to xxx.xxx.xxx/10.64.122.26:2181, initiating session 2015-11-04 11:16:58,957 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 736 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Session establishment complete on server xxx.xxx.xxx/10.64.122.26:2181, sessionid = 0x150cddeb4cd0008, negotiated timeout = 15000 ### Server ### 2015-11-04 12:17:05,817 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /10.64.122.26:60066 which had sessionid 0x150cddeb4cd0008 2015-11-04 12:17:06,088 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /fd00:0:0:c20:0:0:1:2:40286 2015-11-04 12:17:06,089 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client attempting to renew session 0x150cddeb4cd0008 at /fd00:0:0:c20:0:0:1:2:40286 2015-11-04 12:17:06,090 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established session 0x150cddeb4cd0008 with negotiated timeout 15000 for client /fd00:0:0:c20:0:0:1:2:40286 ### Client ### 2015-11-04 12:17:05,815 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1181 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Client session timed out, have not heard from server in 11449ms for sessionid 0x150cddeb4cd0008, closing socket connection and attempting reconnect 2015-11-04 12:17:06,087 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1058 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Opening socket connection to server xxx.xxx.xxx/fd00:0:0:c20:0:0:1:2:2181 2015-11-04 12:17:06,088 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 947 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Socket connection established to xxx.xxx.xxx/fd00:0:0:c20:0:0:1:2:2181, initiating session 2015-11-04 12:17:06,090 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 736 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Session establishment complete on server xxx.xxx.xxx/fd00:0:0:c20:0:0:1:2:2181, sessionid = 0x150cddeb4cd0008, negotiated timeout = 15000 ### Server ### 2015-11-04 13:17:13,277 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /fd00:0:0:c20:0:0:1:2:40286 which had sessionid 0x150cddeb4cd0008 2015-11-04 13:17:13,736 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /10.64.122.26:47735 2015-11-04 13:17:13,737 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client attempting to renew session 0x150cddeb4cd0008 at /10.64.122.26:47735 2015-11-04 13:17:13,738 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established session 0x150cddeb4cd0008 with negotiated timeout 15000 for client / 10.64.122.26:47735 ### Client ### 2015-11-04 13:17:13,275 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1181 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Client session timed out, have not heard from server in 11787ms for sessionid 0x150cddeb4cd0008, closing socket connection and attempting reconnect 2015-11-04 13:17:13,734 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1058 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Opening socket connection to server xxx.xxx.xxx/10.64.122.26:2181 2015-11-04 13:17:13,736 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 947 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Socket connection established to xxx.xxx.xxx/10.64.122.26:2181, initiating session 2015-11-04 13:17:13,738 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 736 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Session establishment complete on server xxx.xxx.xxx/10.64.122.26:2181, sessionid = 0x150cddeb4cd0008, negotiated timeout = 15000 ### Server ### 2015-11-04 14:17:20,734 - WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] - EndOfStreamException: Unable to read additional data from client sessionid 0x150cddeb4cd0008, likely client has closed socket 2015-11-04 14:17:20,735 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /10.64.122.26:47735 which had sessionid 0x150cddeb4cd0008 2015-11-04 14:17:21,600 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /fd00:0:0:c20:0:0:1:2:65493 2015-11-04 14:17:21,600 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client attempting to renew session 0x150cddeb4cd0008 at /fd00:0:0:c20:0:0:1:2:65493 2015-11-04 14:17:21,601 - INFO [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established session 0x150cddeb4cd0008 with negotiated timeout 15000 for client /fd00:0:0:c20:0:0:1:2:65493 ### Client ### 2015-11-04 14:17:20,732 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1181 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Client session timed out, have not heard from server in 11591ms for sessionid 0x150cddeb4cd0008, closing socket connection and attempting reconnect 2015-11-04 14:17:21,598 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 1058 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Opening socket connection to server xxx.xxx.xxx/fd00:0:0:c20:0:0:1:2:2181 2015-11-04 14:17:21,600 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 947 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Socket connection established to xxx.xxx.xxx/fd00:0:0:c20:0:0:1:2:2181, initiating session 2015-11-04 14:17:21,601 | INFO | xxx.xxx.xxx:2181) | org.apache.zookeeper.ClientCnxn$SendThread 736 | 197 - org.apache.hadoop.zookeeper - 3.3.6 | Session establishment complete on server xxx.xxx.xxx/fd00:0:0:c20:0:0:1:2:2181, sessionid = 0x150cddeb4cd0008, negotiated timeout = 15000
