Hi Prashant, It’s quite difficult to track what could have happened from your log file snippet. Look’s like there’re multiple things going on at the same time: dynamic reconfig, leader election, network issues?
However you’re using an early alpha version of ZooKeeper which is not recommended for production use. I suggest to upgrade to a recent stable release (3.5.5) and see if the problem still occurs. Regards, Andor > On 2019. Jun 20., at 13:44, prashantkumar dhotre > <[email protected]> wrote: > > Hi > We use 3.5.1-alpha version. > We are seeing session expiry issue in VM set up. > This is running in replicated more (two servers + node mastership as one > vote for quorum). > we see client session expired after session timeout (of 10 sec). > This connection was to local zk server. > session timeout is 10 sec. > > This session got established at 17:40:18 and ZK server expired this at > 17:40:57, after 39 seconds of establishment. > > in between this time, i see few errors and warnings in zookeeper server > logs (as shown below). > > I see below errors/warning in between this time before session expiry. > > This issue is not very easy to replicate , so far we have seen only twice. > > Could you please help me identify root cause and let me know if this is > fixed in later release ? > > Thanks, > > Prashant > > > Logs: > > > > ZK server errors and warnings in ZK server logs: > > > > 2711:2019-06-05 17:40:38,766 [myid:2147483652] - INFO [ > 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@637] - Received connection > request /128.0.0.5:38020 > > 2724:2019-06-05 17:40:38,931 [myid:2147483652] - WARN > [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@917] - Connection broken > for id 2147483653, my id = 2147483652, error = > > 2725:java.io.EOFException > > 2726: at java.io.DataInputStream.readInt(Unknown Source) > > 2727: at > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:902) > > 2728:2019-06-05 17:40:38,943 [myid:2147483652] - WARN > [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@920] - Interrupting > SendWorker > > 2730:2019-06-05 17:40:38,950 [myid:2147483652] - WARN > [SendWorker:2147483653:QuorumCnxManager$SendWorker@834] - Interrupted while > waiting for message on queue > > 2731:java.lang.InterruptedException > > 2732: at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown > Source) > > 2733: at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown > Source) > > 2734: at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source) > > 2735: at > org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:986) > > 2736: at > org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:63) > > 2737: at > org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:822) > > 2738:2019-06-05 17:40:38,965 [myid:2147483652] - WARN > [SendWorker:2147483653:QuorumCnxManager$SendWorker@843] - Send worker > leaving thread id 2147483653 my id = 2147483652 > > 2740:2019-06-05 17:40:38,978 [myid:2147483652] - INFO [ > 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@637] - Received connection > request /128.0.0.5:38022 > > 2741:2019-06-05 17:40:38,986 [myid:2147483652] - INFO [ > 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@637] - Received connection > request /128.0.0.5:38024 > > 2742:2019-06-05 17:40:38,987 [myid:2147483652] - WARN > [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@920] - Interrupting > SendWorker > > 2743:2019-06-05 17:40:39,017 [myid:2147483652] - ERROR > [SendWorker:2147483653:QuorumCnxManager$SendWorker@810] - Failed to send > last message. Shutting down thread. > > 2744:java.net.SocketException: Socket closed > > 2745: at java.net.SocketOutputStream.socketWrite(Unknown Source) > > 2746: at java.net.SocketOutputStream.write(Unknown Source) > > 2747: at java.io.DataOutputStream.writeInt(Unknown Source) > > 2748: at > org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:779) > > 2749: at > org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:806) > > 2750:2019-06-05 17:40:39,018 [myid:2147483652] - WARN > [SendWorker:2147483653:QuorumCnxManager$SendWorker@843] - Send worker > leaving thread id 2147483653 my id = 2147483652 > > 2758:2019-06-05 17:40:39,125 [myid:2147483652] - INFO > [LearnerHandler-/128.0.0.5:49888:LearnerHandler@385] - Follower sid: > 2147483653 not in the current config 100000000 > > 2759:2019-06-05 17:40:39,130 [myid:2147483652] - INFO > [LearnerHandler-/128.0.0.5:49888:LearnerHandler@683] - Synchronizing with > Follower sid: 2147483653 maxCommittedLog=0x100000e04 minCommittedLog=0x1000 > > 00c10 lastProcessedZxid=0x100000e04 peerLastZxid=0x0 > > 2760:2019-06-05 17:40:39,132 [myid:2147483652] - WARN > [LearnerHandler-/128.0.0.5:49888:ZKDatabase@310] - Unable to find proposals > from txnlog for zxid: 0 > > 2762:2019-06-05 17:40:39,134 [myid:2147483652] - INFO > [LearnerHandler-/128.0.0.5:49888:LearnerHandler@455] - Sending snapshot > last zxid of peer is 0x0, zxid of leader is 0x100000e0b, send zxid of db as > 0 > > x100000e05, 1 concurrent snapshots, snapshot was exempt from throttle > > 2814:2019-06-05 17:40:39,438 [myid:2147483652] - INFO > [ProcessThread(sid:2147483652 cport:-1)::PrepRequestProcessor@512] - > Incremental reconfig > > 2815:2019-06-05 17:40:39,454 [myid:2147483652] - WARN > [LearnerHandler-/128.0.0.5:49888:QuorumPeer@1370] - Restarting Leader > Election > > 2816:2019-06-05 17:40:39,474 [myid:2147483652] - INFO [ > 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@659] - Leaving listener > > 2818:2019-06-05 17:40:39,494 [myid:2147483652] - WARN > [SendWorker:2147483653:QuorumCnxManager$SendWorker@834] - Interrupted while > waiting for message on queue > > 2819:java.lang.InterruptedException > > 2820: at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown > Source) > > 2821: at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown > Source) > > 2822: at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source) > > 2823: at > org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:986) > > 2824: at > org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:63) > > 2825: at > org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:822) > > 2826:2019-06-05 17:40:39,496 [myid:2147483652] - WARN > [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@917] - Connection broken > for id 2147483653, my id = 2147483652, error = > > 2827:java.net.SocketException: Socket closed > > 2828: at java.net.SocketInputStream.socketRead0(Native Method) > > 2829: at java.net.SocketInputStream.socketRead(Unknown Source) > > 2830: at java.net.SocketInputStream.read(Unknown Source) > > 2831: at java.net.SocketInputStream.read(Unknown Source) > > 2832: at java.net.SocketInputStream.read(Unknown Source) > > 2833: at java.io.DataInputStream.readInt(Unknown Source) > > 2834: at > org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:902) > > 2835:2019-06-05 17:40:39,496 [myid:2147483652] - WARN > [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@920] - Interrupting > SendWorker > > 2836:2019-06-05 17:40:39,499 [myid:2147483652] - WARN > [SendWorker:2147483653:QuorumCnxManager$SendWorker@843] - Send worker > leaving thread id 2147483653 my id = 2147483652 > > 2837:2019-06-05 17:40:39,499 [myid:2147483652] - INFO [ > QuorumPeerListener:QuorumCnxManager$Listener@631] - My election bind port: > 0.0.0.0/0.0.0.0:61864 > > 3072:2019-06-05 17:40:42,014 [myid:2147483652] - INFO > [WorkerSender[myid=2147483652]:FastLeaderElection$Messenger$WorkerSender@470] > - WorkerSender is down > > 3073:2019-06-05 17:40:42,014 [myid:2147483652] - INFO > [WorkerReceiver[myid=2147483652]:FastLeaderElection$Messenger$WorkerReceiver@440] > - WorkerReceiver is down > > 3178:2019-06-05 17:40:42,580 [myid:2147483652] - INFO > [NIOServerCxnFactory.AcceptThread:/0.0.0.0:61862 > :NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from / > 128.0.0.5:35418 > > 3179:2019-06-05 17:40:42,583 [myid:2147483652] - INFO > [NIOWorkerThread-6:ZooKeeperServer@964] - Client attempting to establish > new session at /128.0.0.5:35418
