Hello,

I am just catching up on this thread now, and I believe I understand the
problem. I have updated the JIRA with details about what I think the cause is
and a potential remedy.

Thanks
-Mark

> On May 18, 2017, at 9:49 AM, Matt Gilman <[email protected]> wrote:
> 
> Thanks for the additional details. They will be helpful when working the 
> JIRA. All nodes, including the coordinator, heartbeat to the active 
> coordinator. This means that the coordinator effectively heartbeats to 
> itself. It appears, based on your log messages, that this is not happening. 
> Because no heartbeats were received from any node, the lack of heartbeats from
> the terminated node is not considered.
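
The behavior described in the paragraph above can be sketched roughly as
follows. This is a minimal illustration and not NiFi's actual code: the class
name, method name, parameters, and threshold are all hypothetical, and the
handling of the non-empty case is an assumption. It only shows how an early
return on "no heartbeats received" means a killed node is never evaluated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and logic are illustrative, not taken from NiFi.
public class HeartbeatMonitorSketch {

    /**
     * Returns the connected nodes that would be flagged for disconnection.
     *
     * @param latestHeartbeats node id -> timestamp (millis) of its last heartbeat
     * @param connectedNodes   node ids the coordinator believes are CONNECTED
     * @param nowMillis        current time in millis
     * @param thresholdMillis  assumed maximum allowed silence before disconnecting
     */
    static List<String> nodesToDisconnect(Map<String, Long> latestHeartbeats,
                                          Set<String> connectedNodes,
                                          long nowMillis,
                                          long thresholdMillis) {
        if (latestHeartbeats.isEmpty()) {
            // Corresponds to the logged message "Received no new heartbeats.
            // Will not disconnect any nodes due to lack of heartbeat".
            // With this guard, a killed node is never checked at all.
            return Collections.emptyList();
        }

        // Assumption: once at least one heartbeat exists, every connected node
        // is checked, and a missing or stale heartbeat flags the node.
        List<String> stale = new ArrayList<>();
        for (String node : connectedNodes) {
            Long last = latestHeartbeats.get(node);
            if (last == null || nowMillis - last > thresholdMillis) {
                stale.add(node);
            }
        }
        return stale;
    }
}
```

In this sketch, if the new coordinator does not record its own heartbeat, the
map stays empty and the early return is always taken, which would be
consistent with centos-a remaining CONNECTED in the log output.
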
> 
> Matt
> 
> Sent from my iPhone
> 
>> On May 18, 2017, at 8:30 AM, ddewaele <[email protected]> wrote:
>> 
>> Found something interesting in the centos-b debug logging.... 
>> 
>> After centos-a (the coordinator) is killed, centos-b takes over. Notice how
>> it "Will not disconnect any nodes due to lack of heartbeat" and how it still
>> sees centos-a as connected despite the fact that there are no heartbeats
>> anymore.
>> 
>> 2017-05-18 12:41:38,010 INFO [Leader Election Notification Thread-2]
>> o.apache.nifi.controller.FlowController This node elected Active Cluster
>> Coordinator
>> 2017-05-18 12:41:38,010 DEBUG [Leader Election Notification Thread-2]
>> o.a.n.c.c.h.ClusterProtocolHeartbeatMonitor Purging old heartbeats
>> 2017-05-18 12:41:38,014 INFO [Leader Election Notification Thread-1]
>> o.apache.nifi.controller.FlowController This node has been elected Primary
>> Node
>> 2017-05-18 12:41:38,353 DEBUG [Heartbeat Monitor Thread-1]
>> o.a.n.c.c.h.AbstractHeartbeatMonitor Received no new heartbeats. Will not
>> disconnect any nodes due to lack of heartbeat
>> 2017-05-18 12:41:41,336 DEBUG [Process Cluster Protocol Request-3]
>> o.a.n.c.c.h.ClusterProtocolHeartbeatMonitor Received new heartbeat from
>> centos-b:8080
>> 2017-05-18 12:41:41,337 DEBUG [Process Cluster Protocol Request-3]
>> o.a.n.c.c.h.ClusterProtocolHeartbeatMonitor 
>> 
>> Calculated diff between current cluster status and node cluster status as
>> follows:
>> Node: [NodeConnectionStatus[nodeId=centos-b:8080, state=CONNECTED,
>> updateId=45], NodeConnectionStatus[nodeId=centos-a:8080, state=CONNECTED,
>> updateId=42]]
>> Self: [NodeConnectionStatus[nodeId=centos-b:8080, state=CONNECTED,
>> updateId=45], NodeConnectionStatus[nodeId=centos-a:8080, state=CONNECTED,
>> updateId=42]]
>> Difference: []
>> 
>> 
>> 2017-05-18 12:41:41,337 INFO [Process Cluster Protocol Request-3]
>> o.a.n.c.p.impl.SocketProtocolListener Finished processing request
>> 410e7db5-8bb0-4f97-8ee8-fc8647c54959 (type=HEARTBEAT, length=2341 bytes)
>> from centos-b:8080 in 3 millis
>> 2017-05-18 12:41:41,339 INFO [Clustering Tasks Thread-2]
>> o.a.n.c.c.ClusterProtocolHeartbeater Heartbeat created at 2017-05-18
>> 12:41:41,330 and sent to centos-b:10001 at 2017-05-18 12:41:41,339; send
>> took 8 millis
>> 2017-05-18 12:41:43,354 INFO [Heartbeat Monitor Thread-1]
>> o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 1 heartbeats in
>> 93276 nanos
>> 2017-05-18 12:41:46,346 DEBUG [Process Cluster Protocol Request-4]
>> o.a.n.c.c.h.ClusterProtocolHeartbeatMonitor Received new heartbeat from
>> centos-b:8080
>> 2017-05-18 12:41:46,346 DEBUG [Process Cluster Protocol Request-4]
>> o.a.n.c.c.h.ClusterProtocolHeartbeatMonitor 
>> 
>> Calculated diff between current cluster status and node cluster status as
>> follows:
>> Node: [NodeConnectionStatus[nodeId=centos-b:8080, state=CONNECTED,
>> updateId=45], NodeConnectionStatus[nodeId=centos-a:8080, state=CONNECTED,
>> updateId=42]]
>> Self: [NodeConnectionStatus[nodeId=centos-b:8080, state=CONNECTED,
>> updateId=45], NodeConnectionStatus[nodeId=centos-a:8080, state=CONNECTED,
>> updateId=42]]
>> Difference: []
>> 
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://apache-nifi-users-list.2361937.n4.nabble.com/Nifi-Cluster-fails-to-disconnect-node-when-node-was-killed-tp1942p1950.html
>> Sent from the Apache NiFi Users List mailing list archive at Nabble.com.
