[
https://issues.apache.org/jira/browse/NIFI-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410050#comment-15410050
]
ASF GitHub Bot commented on NIFI-2406:
--------------------------------------
Github user YolandaMDavis commented on the issue:
https://github.com/apache/nifi/pull/729
@markap14 thank you for this update. I executed my tests as previously
described and ran into no issues. When a node designated as cluster
coordinator is disconnected, either from the UI or on the command line
(through stop/restart), the remaining nodes behave as expected. They properly
display an error to the user and note in the logs their inability to determine
a cluster coordinator. When the coordinator is back online, the logs indicate
successful reconnection and communication. After a refresh of the screen, the
flow displays properly, and the configured flow (which uses remote process
groups communicating with a standalone NiFi node) operates successfully.
+1
thank you @markap14 for the quick turnaround on this!
> Rare start-up problems resulting in all nodes disconnected
> ----------------------------------------------------------
>
> Key: NIFI-2406
> URL: https://issues.apache.org/jira/browse/NIFI-2406
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Joseph Percivall
> Assignee: Mark Payne
> Fix For: 1.0.0
>
> Attachments: logs.tar.gz
>
>
> While testing PR 678[1], I came across a case where all the nodes were in a
> disconnected state and each was in a weird state of heartbeating but not
> connected.
> Also in the logs there were ~1000 lines of:
> 2016-07-26 11:38:07,841 INFO [Leader Election Notification Thread-1]
> o.a.n.c.l.e.CuratorLeaderElectionManager
> org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@24fae8c6
> This node has been elected Leader for Role 'Cluster Coordinator'
> This message is only logged here[2], which is a callback for ZK. Also there
> were many log messages of:
> 2016-07-26 11:54:07,910 WARN [Clustering Tasks Thread-1]
> o.a.n.c.c.node.NodeClusterCoordinator Failed to determine which node is
> elected active Cluster Coordinator: ZooKeeper reports the address as
> localhost:6001, but there is no node with this address
> I believe this is a problem with ZK/NiFi that existed before this PR and is
> not directly related to the PR being reviewed. I will attach a tar of the 3
> nodes' logs.
> [1] https://github.com/apache/nifi/pull/678
> [2]
> https://github.com/apache/nifi/blame/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/leader/election/CuratorLeaderElectionManager.java#L220-L220