Hey Pierre, we're currently using the embedded ZK. So as far as I know this is Version 3.5.6
--- Ursprüngliche Nachricht --- Von: Pierre Villard <pierre.villard...@gmail.com> Datum: 16.08.2021 16:59:22 An: users@nifi.apache.org, Axel Schwarz <axelkop...@emailn.de> Betreff: Re: Disfunctional cluster with version 1.13.2 > Hi, > > What's the version of ZK? > > Thanks, > Pierre > > Le jeu. 12 août 2021 à 09:55, Axel Schwarz <axelkop...@emailn.de> a > écrit : > > > Dear all, > > > > after successfully battling the load balancing and installing Version > > > 1.13.2 again in our 3 node production environment, we experienced another > > > failure in the cluster resulting in a complete cut-off of the flow just > > > 1,5h after the update. > > We noticed it just by trying to access the webinterface, which immediately > > > after login showed something like: > > > > "Cannot replicate request to Node nifiHost1.contoso.com:8443 because > the > > node is not connected" > > > > There was nothing we could do through the webinterface aside from staring > > > at this message and when looking at the live logs, there was nothing > > > suspicious. The log moved on as if nothing happened. > > After a restart of the cluster everything was working fine again, but > we > > saw, that the entire flow wasn't working for some period of time. This > > > alone is really uncool, as we running a cluster for exactly that reason: > > > The flow should keep working, even if some node decides to malfunction > for > > whatever reason. > > > > Digging a little deeper into the logs showed two noticable problems: > > > > > 1. The Zookeeper is restarting every few minutes. Which in the log always > > > looks like this: > > > > ------------------------------------(nifiHost1.contoso.com) > > nifi-app.log------------------------------------------------------------------------------------ > > > > > 2021-08-11 12:02:39,187 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > 2021-08-11 12:02:39,187 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer shutting down > > 2021-08-11 12:02:39,194 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.z.server.FinalRequestProcessor shutdown of request processor complete > > > 2021-08-11 12:02:39,196 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.z.server.SyncRequestProcessor Shutting down > > 2021-08-11 12:02:39,196 INFO [SyncThread:1] > > o.a.z.server.SyncRequestProcessor SyncRequestProcessor exited! > > 2021-08-11 12:02:39,199 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer minSessionTimeout set to 4000 > > 2021-08-11 12:02:39,200 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer maxSessionTimeout set to 40000 > > > 2021-08-11 12:02:39,200 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer Created server with tickTime 2000 > > > minSessionTimeout 4000 maxSessionTimeout 40000 datadir > > /opt/nifi/logs/zookeeper/version-2 snapdir > > /opt/nifi/state/zookeeper/version-2 > > 2021-08-11 12:02:39,276 ERROR [Timer-Driven Process Thread-27-EventThread] > > > org.apache.zookeeper.ClientCnxn Error while calling watcher > > java.lang.NullPointerException: null > > at > > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:535) > > > at > > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) > > > 2021-08-11 12:02:39,277 INFO [main-EventThread] > > o.a.c.f.state.ConnectionStateManager State change: SUSPENDED > > 2021-08-11 12:02:40,001 INFO [main-EventThread] > > o.a.c.f.state.ConnectionStateManager State change: RECONNECTED > > 2021-08-11 12:02:40,006 INFO [Curator-ConnectionStateManager-0] > > o.a.n.c.l.e.CuratorLeaderElectionManager > > org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1d37e350 > > > Connection State changed to SUSPENDED > > 2021-08-11 12:02:40,006 INFO [Curator-ConnectionStateManager-0] > > o.a.n.c.l.e.CuratorLeaderElectionManager > > org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1908642e > > > Connection State changed to SUSPENDED > > 2021-08-11 12:02:40,006 INFO [Curator-ConnectionStateManager-0] > > o.a.n.c.l.e.CuratorLeaderElectionManager > > org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1d37e350 > > > Connection State changed to RECONNECTED > > 2021-08-11 12:02:40,006 INFO [Curator-ConnectionStateManager-0] > > o.a.n.c.l.e.CuratorLeaderElectionManager > > org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1908642e > > > Connection State changed to RECONNECTED > > 2021-08-11 12:02:40,014 INFO [main-EventThread] > > o.a.c.framework.imps.EnsembleTracker New config event received: {} > > 2021-08-11 12:02:40,127 INFO [Clustering Tasks Thread-2] > > o.a.n.c.c.ClusterProtocolHeartbeater Heartbeat created at 2021-08-11 > > > 12:02:39,176 and sent to nifiHost3.contoso.com:54321 at 2021-08-11 > > 12:02:40,127; send took 951 millis > > 2021-08-11 12:02:40,186 ERROR [Timer-Driven Process Thread-27-EventThread] > > > org.apache.zookeeper.ClientCnxn Error while calling watcher > > java.lang.NullPointerException: null > > at > > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:535) > > > at > > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510) > > > > > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------ > > > > > This happens, as mentioned, every few minutes and looks like this when > > > condensed through grep: > > > > ------------------------------------(nifiHost1.contoso.com) nifi-app.log > > > | grep "ZooKeeperServer Shutting > > down"--------------------------------------------- > > > > ./nifi-app.log:4237:2021-08-11 12:02:39,187 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:4378:2021-08-11 12:03:22,084 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:4447:2021-08-11 12:03:57,639 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:4563:2021-08-11 12:04:18,921 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:4629:2021-08-11 12:04:33,199 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:4650:2021-08-11 12:05:03,939 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:4801:2021-08-11 12:05:39,348 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:5327:2021-08-11 12:06:56,736 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:7250:2021-08-11 12:21:41,464 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:7703:2021-08-11 12:24:54,244 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:8148:2021-08-11 12:27:21,093 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:8190:2021-08-11 12:27:37,820 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:8350:2021-08-11 12:28:37,417 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:8797:2021-08-11 12:33:25,184 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:8813:2021-08-11 12:33:39,672 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:8971:2021-08-11 12:34:22,003 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:8984:2021-08-11 12:34:36,511 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:9349:2021-08-11 12:36:29,802 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:9376:2021-08-11 12:36:41,541 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:9491:2021-08-11 12:37:12,390 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:9504:2021-08-11 12:37:25,888 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:9676:2021-08-11 12:39:22,775 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:9842:2021-08-11 12:40:49,396 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:10251:2021-08-11 12:43:38,807 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:10397:2021-08-11 12:44:36,326 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:11405:2021-08-11 12:52:26,384 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:11629:2021-08-11 12:53:33,054 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:11827:2021-08-11 12:54:43,783 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:11990:2021-08-11 12:56:23,389 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:12085:2021-08-11 12:57:21,410 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:12823:2021-08-11 13:01:23,674 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:13087:2021-08-11 13:04:44,524 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:14610:2021-08-11 13:16:15,432 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:14635:2021-08-11 13:16:37,724 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:14751:2021-08-11 13:17:38,169 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:15136:2021-08-11 13:17:59,547 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:15203:2021-08-11 13:18:30,695 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:15516:2021-08-11 13:21:43,631 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:15743:2021-08-11 13:22:49,630 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:15950:2021-08-11 13:23:50,574 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:16035:2021-08-11 13:24:30,913 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:16396:2021-08-11 13:28:01,873 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:16470:2021-08-11 13:29:12,550 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:16541:2021-08-11 13:29:38,490 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:16568:2021-08-11 13:29:53,554 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:16678:2021-08-11 13:31:13,437 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > ./nifi-app.log:16705:2021-08-11 13:31:28,195 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > > > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------ > > > > > So my first question is: Is this behaviour of zookeeper normal? Well > it is > > just an INFO, but I would love to here that this is a completely normal > > > phenomenon. :D > > > > 2. After all these zookeeper restarts, nifi finally throws the following > > > error along with the "last" zookeeper restart: > > > > ------------------------------------(nifiHost1.contoso.com) > > nifi-app.log------------------------------------------------------------------------------------ > > > > > 2021-08-11 13:31:13,083 INFO [main-EventThread] > > o.a.c.f.state.ConnectionStateManager State change: SUSPENDED > > 2021-08-11 13:31:13,437 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > 2021-08-11 13:31:13,437 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer shutting down > > 2021-08-11 13:31:13,439 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.z.server.FinalRequestProcessor shutdown of request processor complete > > > 2021-08-11 13:31:13,440 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.z.server.SyncRequestProcessor Shutting down > > 2021-08-11 13:31:13,440 INFO [SyncThread:1] > > o.a.z.server.SyncRequestProcessor SyncRequestProcessor exited! > > 2021-08-11 13:31:13,450 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer minSessionTimeout set to 4000 > > 2021-08-11 13:31:13,450 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer maxSessionTimeout set to 40000 > > > 2021-08-11 13:31:13,450 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer Created server with tickTime 2000 > > > minSessionTimeout 4000 maxSessionTimeout 40000 datadir > > /opt/nifi/logs/zookeeper/version-2 snapdir > > /opt/nifi/state/zookeeper/version-2 > > 2021-08-11 13:31:13,462 ERROR [Load-Balance Server Thread-5] > > o.a.n.c.q.c.s.ConnectionLoadBalanceServer Failed to communicate with > Peer > > nifiHost2.contoso.com/192.168.1.11:19500 > > java.io.EOFException: Expected to receive a Transaction Completion > > Indicator from Peer nifiHost2.contoso.com but encountered EOF > > at > > org.apache.nifi.controller.queue.clustered.server.StandardLoadBalanceProtocol.completeTransaction(StandardLoadBalanceProtocol.java:299) > > > at > > org.apache.nifi.controller.queue.clustered.server.StandardLoadBalanceProtocol.receiveFlowFiles(StandardLoadBalanceProtocol.java:275) > > > at > > org.apache.nifi.controller.queue.clustered.server.StandardLoadBalanceProtocol.receiveFlowFiles(StandardLoadBalanceProtocol.java:133) > > > at > > org.apache.nifi.controller.queue.clustered.server.ConnectionLoadBalanceServer$CommunicateAction.run(ConnectionLoadBalanceServer.java:162) > > > at java.lang.Thread.run(Thread.java:748) > > 2021-08-11 13:31:13,464 ERROR [Load-Balance Server Thread-6] > > o.a.n.c.q.c.s.ConnectionLoadBalanceServer Failed to communicate with > Peer > > nifiHost2.contoso.com/192.168.1.11:19502 > > java.io.EOFException: Expected to receive a Transaction Completion > > Indicator from Peer nifiHost2.contoso.com but encountered EOF > > at > > org.apache.nifi.controller.queue.clustered.server.StandardLoadBalanceProtocol.completeTransaction(StandardLoadBalanceProtocol.java:299) > > > at > > org.apache.nifi.controller.queue.clustered.server.StandardLoadBalanceProtocol.receiveFlowFiles(StandardLoadBalanceProtocol.java:275) > > > at > > org.apache.nifi.controller.queue.clustered.server.StandardLoadBalanceProtocol.receiveFlowFiles(StandardLoadBalanceProtocol.java:133) > > > at > > org.apache.nifi.controller.queue.clustered.server.ConnectionLoadBalanceServer$CommunicateAction.run(ConnectionLoadBalanceServer.java:162) > > > at java.lang.Thread.run(Thread.java:748) > > 2021-08-11 13:31:13,589 INFO [Process Cluster Protocol Request-1] > > o.a.n.c.c.node.NodeClusterCoordinator Status of nifiHost1.contoso.com:8443 > > > changed from NodeConnectionStatus[nodeId=nifiHost1.contoso.com:8443, > > > state=CONNECTED, updateId=16] to NodeConnectionStatus[nodeId= > > nifiHost1.contoso.com:8443, state=CONNECTING, updateId=16] > > 2021-08-11 13:31:13,793 INFO [Process Cluster Protocol Request-7] > > o.a.n.c.p.impl.SocketProtocolListener Finished processing request > > 88980d85-b956-4713-9686-50d76a407871 (type=RECONNECTION_REQUEST, > > length=2209040 bytes) from nifiHost1.contoso.com:8443 in 142 millis > > > 2021-08-11 13:31:13,794 INFO [Reconnect to Cluster] > > o.a.nifi.controller.StandardFlowService Processing reconnection request > > > from cluster coordinator. > > 2021-08-11 13:31:13,794 INFO [Reconnect to Cluster] > > o.a.n.c.l.e.CuratorLeaderElectionManager Election Listener for Role > Primary > > Node disabled > > 2021-08-11 13:31:13,802 INFO [main-EventThread] > > o.a.c.f.state.ConnectionStateManager State change: RECONNECTED > > 2021-08-11 13:31:28,195 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer Shutting down > > 2021-08-11 13:31:28,196 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer shutting down > > 2021-08-11 13:31:28,198 INFO [pool-22-thread-1] > > o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile > > > Repository > > 2021-08-11 13:31:28,202 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.z.server.FinalRequestProcessor shutdown of request processor complete > > > 2021-08-11 13:31:28,203 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.z.server.SyncRequestProcessor Shutting down > > 2021-08-11 13:31:28,203 INFO [SyncThread:1] > > o.a.z.server.SyncRequestProcessor SyncRequestProcessor exited! > > 2021-08-11 13:31:28,204 INFO [Timer-Driven Process Thread-21] > > o.a.n.p.store.WriteAheadStorePartition Successfully rolled over Event > > > Writer for Provenance Event Store Partition[directory=/opt/pr] due to > > > MAX_TIME_REACHED. Event File was 6.7 MB and contained 1626 events. > > 2021-08-11 13:31:28,208 INFO [pool-22-thread-1] > > o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log > with > > 10 Records and 0 Swap Files in 8 milliseconds (Stop-the-world time = > 1 > > milliseconds), max Transaction ID 27772177643 > > 2021-08-11 13:31:28,208 INFO [pool-22-thread-1] > > o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile > > > Repository with 10 records in 8 milliseconds > > 2021-08-11 13:31:28,295 INFO [main-EventThread] > > o.a.c.f.state.ConnectionStateManager State change: SUSPENDED > > 2021-08-11 13:31:29,808 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer minSessionTimeout set to 4000 > > 2021-08-11 13:31:29,809 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer maxSessionTimeout set to 40000 > > > 2021-08-11 13:31:29,809 INFO > > [QuorumPeer[myid=1](plain=/0.0.0.0:2181)(secure=disabled)] > > > o.a.zookeeper.server.ZooKeeperServer Created server with tickTime 2000 > > > minSessionTimeout 4000 maxSessionTimeout 40000 datadir > > /opt/nifi/logs/zookeeper/version-2 snapdir > > /opt/nifi/state/zookeeper/version-2 > > 2021-08-11 13:31:32,291 INFO [main-EventThread] > > o.a.c.f.state.ConnectionStateManager State change: RECONNECTED > > > > > > ------------------------------------------------------------------------------------------------------------------------------------------------------------ > > > > > After that, the log continues with normal messages about heartbeats > and > > provenance events but no error occurs anymore. That exactly was the > time, > > the whole flow cut-off. > > > > I've got to add, that we've never experienced that flow cut-off behaviour > > > with version 1.12.1, but the zookeeper restarts occur there as well > and as > > often as seen above. > > But when first updating to version 1.13.2 back in June, we experienced > > > this flow cut-off multiple times beside the not working load balancing. > > > Back then I thought this was connected to each other, but now we had > a > > fully functional load balancing and the cut-off happened nevertheless. > > > > > I'm thankful for any hint. :) > > > > > > > > >