[ 
https://issues.apache.org/jira/browse/NIFI-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne updated NIFI-2406:
-----------------------------
    Status: Patch Available  (was: Open)

> Rare start-up problems resulting in all nodes disconnected
> ----------------------------------------------------------
>
>                 Key: NIFI-2406
>                 URL: https://issues.apache.org/jira/browse/NIFI-2406
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Joseph Percivall
>            Assignee: Mark Payne
>         Attachments: logs.tar.gz
>
>
> While testing PR 678[1], I hit a case where all the nodes were in a 
> disconnected state: each node was still heartbeating but was not 
> connected.
> The logs also contained ~1000 lines of:
> 2016-07-26 11:38:07,841 INFO [Leader Election Notification Thread-1] 
> o.a.n.c.l.e.CuratorLeaderElectionManager 
> org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@24fae8c6
>  This node has been elected Leader for Role 'Cluster Coordinator'
> This message is logged only here[2], in a callback for ZK. There were also 
> many log messages of:
> 2016-07-26 11:54:07,910 WARN [Clustering Tasks Thread-1] 
> o.a.n.c.c.node.NodeClusterCoordinator Failed to determine which node is 
> elected active Cluster Coordinator: ZooKeeper reports the address as 
> localhost:6001, but there is no node with this address
> I believe this is a problem with ZK/NiFi that existed before this PR and is 
> not directly related to the PR being reviewed. I will attach a tar of the 3 
> nodes' logs.
> [1] https://github.com/apache/nifi/pull/678
> [2] 
> https://github.com/apache/nifi/blame/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/leader/election/CuratorLeaderElectionManager.java#L220-L220



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
