[
https://issues.apache.org/jira/browse/NIFI-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392190#comment-15392190
]
ASF GitHub Bot commented on NIFI-2292:
--------------------------------------
Github user markap14 commented on the issue:
https://github.com/apache/nifi/pull/701
@JPercivall that's a good catch! I updated the PR to address this. The
issue was that whenever there was only 1 node in the cluster, it complained
if we attempted to disconnect ANY node (even one that was already marked
disconnected). And since we'd received a heartbeat from a disconnected node, we
kept trying to disconnect it. I updated the code and unit tests to verify. Thanks!
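
For readers following along, the behavior described above can be sketched roughly as follows. This is only an illustration, not the code in the PR: `ClusterSketch`, `NodeState`, and `requestDisconnect` are hypothetical names, and the sketch assumes that "only 1 node in the cluster" means only one node in the connected state. The idea is that a disconnect request for a node already marked disconnected becomes a no-op before the last-node check runs, so a late heartbeat from that node no longer triggers a complaint.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these names are illustrative and are not taken from the NiFi code base.
enum NodeState { CONNECTING, CONNECTED, DISCONNECTING, DISCONNECTED }

class ClusterSketch {
    private final Map<String, NodeState> nodes = new HashMap<>();

    void addNode(String nodeId, NodeState state) {
        nodes.put(nodeId, state);
    }

    NodeState stateOf(String nodeId) {
        return nodes.get(nodeId);
    }

    /**
     * Requests that a node be disconnected.
     * Returns true if the node's state changed, false if the request was a no-op.
     */
    boolean requestDisconnect(String nodeId) {
        NodeState current = nodes.get(nodeId);
        if (current == null) {
            throw new IllegalArgumentException("Unknown node: " + nodeId);
        }

        // Check this first: a node that is already disconnected (or on its way there)
        // needs no work, so the last-node guard below must not be reached for it.
        if (current == NodeState.DISCONNECTED || current == NodeState.DISCONNECTING) {
            return false;
        }

        long connected = nodes.values().stream()
                .filter(s -> s == NodeState.CONNECTED)
                .count();

        // Assumed guard: refuse to disconnect the only connected node.
        if (current == NodeState.CONNECTED && connected <= 1) {
            throw new IllegalStateException("Cannot disconnect the only connected node: " + nodeId);
        }

        nodes.put(nodeId, NodeState.DISCONNECTED);
        return true;
    }
}
```

With one connected node "a" and one disconnected node "b", calling `requestDisconnect("b")` in this sketch returns false without complaining, while `requestDisconnect("a")` is still rejected.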
> Nodes in cluster sometimes become out-of-sync with actual 'connection state'
> of node
> ------------------------------------------------------------------------------------
>
> Key: NIFI-2292
> URL: https://issues.apache.org/jira/browse/NIFI-2292
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 1.0.0
>
>
> Occasionally I'll see a node that has a different view of the cluster than
> other nodes. Right now I'm actually seeing "node 1" think it's in
> 'CONNECTING' state while nodes 2-5 think we have 5/5 nodes connected.
> This can also result in a node that is elected cluster coordinator and then
> has that role revoked continuing to monitor for heartbeats, even though it
> won't receive them since it's no longer the coordinator. This results in
> continually logging a message like "Failed to retrieve any new heartbeat
> information for nodes. Will not make any decisions based on heartbeats."
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)