[ https://issues.apache.org/jira/browse/NIFI-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060948#comment-17060948 ]

Ganesh Banda commented on NIFI-6875:
------------------------------------

Hi Team,

  I am using NiFi 1.11.3 with an external ZooKeeper 5.3.6 and am able to start 
NiFi in cluster mode. When I take ZooKeeper down completely, NiFi throws the 
errors below and the nodes never rejoin the cluster. I have to restart NiFi on 
each node to form the cluster again. Could you please help? Restarting is not a 
good option for a production system. I tried increasing the ZooKeeper timeouts 
to high values, but that did not help.

Logs:

2020-03-17 14:02:19,708 INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
2020-03-17 14:02:19,709 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@a032996 Connection State changed to SUSPENDED
2020-03-17 14:02:19,710 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@e3cbc0e Connection State changed to SUSPENDED

...

2020-03-17 14:19:00,044 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-03-17 14:19:00,045 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background retry gave up
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:972)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-03-17 14:19:00,098 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:862)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:990)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-03-17 14:19:00,098 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background retry gave up
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:972)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:943)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:66)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:346)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

> Nifi Zookeeper Cluster_Mode broken in 1.10.0
> --------------------------------------------
>
>                 Key: NIFI-6875
>                 URL: https://issues.apache.org/jira/browse/NIFI-6875
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework, Flow Versioning
>    Affects Versions: 1.10.0
>         Environment: Kubernetes, Linux
>            Reporter: Glenn Wolfe
>            Priority: Blocker
>              Labels: bug, cluster-mode, kubernetes
>
> Expected: The exact same configuration and setup works perfectly on the prior 
> version (1.9.2). As soon as I upgrade, NiFi is unable to initialize.  
>  
> With an external ZooKeeper (cluster_mode) configuration, NiFi is unable to 
> elect a leader and is stuck in 'Invalid State: The Flow Controller is 
> initializing the Data Flow'. 
>  
> Logs: (Stuck in Loop)
> ```
> 2019-11-15 17:00:05,991 INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: RECONNECTED
> 2019-11-15 17:00:05,991 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@256c961a Connection State changed to RECONNECTED
> 2019-11-15 17:00:06,092 INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
> 2019-11-15 17:00:06,092 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
> 2019-11-15 17:00:06,093 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
