Hi,

We have a 3-node NiFi cluster (with separate ZooKeeper instances running on the same machines) that pulls data from MySQL and writes it to HDFS. I frequently run into problems with the cluster: nodes keep disconnecting from each other, the primary node keeps switching, and sometimes the cluster goes into a zombie state in which I cannot access the UI at all.

I have followed the best practices guide and tweaked parameters in nifi.properties, and I have switched provenanceRepositoryImplementation to volatile because the cluster was not able to keep up with incoming traffic. The data traffic is not high at all (4 Mbps).

This is the message I frequently see in the logs:
*INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: LOST*
*INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@56ebedec Connection State changed to LOST*
*INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1b0e2055 Connection State changed to LOST*
*INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: RECONNECTED*

Am I doing something wrong with the cluster setup? Can someone give me some guidance on how to go about debugging this issue, for example which system metrics to look at?

Ashwin
