[
https://issues.apache.org/jira/browse/NIFI-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413743#comment-15413743
]
ASF GitHub Bot commented on NIFI-2406:
--------------------------------------
GitHub user markap14 opened a pull request:
https://github.com/apache/nifi/pull/820
NIFI-2406: Addressed regression introduced in NIFI-2406 where the cluster
does not recognize a new Cluster Coordinator when the coordinator is shutdown
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/markap14/nifi NIFI-2406-PART2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/nifi/pull/820.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #820
----
commit 3c9289e6c16b915e2a6dbe2e2d82d89a8d3321b8
Author: Mark Payne <[email protected]>
Date: 2016-08-09T15:11:08Z
NIFI-2406: Addressed regression introduced in NIFI-2406 where the cluster
does not recognize a new Cluster Coordinator when the coordinator is shutdown
----
> Rare start-up problems resulting in all nodes disconnected
> ----------------------------------------------------------
>
> Key: NIFI-2406
> URL: https://issues.apache.org/jira/browse/NIFI-2406
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Joseph Percivall
> Assignee: Mark Payne
> Fix For: 1.0.0
>
> Attachments: logs.tar.gz
>
>
> While testing PR 678[1], I hit a case where all the nodes were in a
> disconnected state: each node was still heartbeating but was not
> connected.
> Also in the logs there were ~1000 lines of:
> 2016-07-26 11:38:07,841 INFO [Leader Election Notification Thread-1]
> o.a.n.c.l.e.CuratorLeaderElectionManager
> org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@24fae8c6
> This node has been elected Leader for Role 'Cluster Coordinator'
> This message is logged only here[2], in a callback for ZK. There were also
> many log messages of:
> 2016-07-26 11:54:07,910 WARN [Clustering Tasks Thread-1]
> o.a.n.c.c.node.NodeClusterCoordinator Failed to determine which node is
> elected active Cluster Coordinator: ZooKeeper reports the address as
> localhost:6001, but there is no node with this address
> I believe this is a problem with ZK/NiFi that existed before this PR and is
> not directly related to the PR being reviewed. I will attach a tar of the 3
> nodes' logs.
> [1] https://github.com/apache/nifi/pull/678
> [2]
> https://github.com/apache/nifi/blame/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/leader/election/CuratorLeaderElectionManager.java#L220-L220
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)