Hi,
We have a 3-node NiFi cluster (with separate ZooKeeper instances running on
the same machines) that pulls data from MySQL and writes it to HDFS. I am
frequently running into problems with the cluster: nodes keep disconnecting
from each other, the primary node keeps switching, and sometimes the cluster
goes into a zombie state in which I cannot access the UI at all. I have
followed the best practices guide and tweaked parameters in nifi.properties,
and I have switched provenanceRepositoryImplementation to volatile because
the cluster was not able to keep up with incoming traffic. Data traffic is
not high at all (4 Mbps). This is the message I frequently see in the logs:

INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: LOST
INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@56ebedec Connection State changed to LOST
INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@1b0e2055 Connection State changed to LOST
INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: RECONNECTED

Am I doing something wrong with the cluster setup? Can someone give me some
guidance on how to debug this issue, for example which system metrics to
look at?

Ashwin
