Github user alopresto commented on the issue:
https://github.com/apache/nifi/pull/1916
Ran `contrib-check` and all tests pass. +1, merging.
I did start a two node cluster and ensure it was communicating, then used
`pf` to temporarily block the cluster port to trigger the error message. I did
not see the duplicated stack trace.
```
# Enabled port blocking
2017-06-23 11:12:55,991 INFO [Clustering Tasks Thread-1]
o.a.n.c.c.ClusterProtocolHeartbeater Heartbeat created at 2017-06-23
11:12:55,987 and sent to localhost:10000 at 2017-06-23 11:12:55,991; send took
3 millis
2017-06-23 11:12:56,561 INFO [Process Cluster Protocol Request-10]
o.a.n.c.p.impl.SocketProtocolListener Finished processing request
3ecf8509-0b0b-4602-ac55-1ec9057dc3c3 (type=HEARTBEAT, length=2344 bytes) from
localhost:8081 in 1 millis
2017-06-23 11:12:57,293 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 6930
nanos
2017-06-23 11:13:00,433 INFO [Write-Ahead Local State Provider Maintenance]
org.wali.MinimalLockingWriteAheadLog
org.wali.MinimalLockingWriteAheadLog@7a80ebcd checkpointed with 1 Records and 0
Swap Files in 12 milliseconds (Stop-the-world time = 4 milliseconds, Clear Edit
Logs time = 4 millis), max Transaction ID 0
2017-06-23 11:13:00,998 INFO [Process Cluster Protocol Request-1]
o.a.n.c.p.impl.SocketProtocolListener Finished processing request
dc71898f-f3e6-4aaa-af2e-b0dae65e4f53 (type=HEARTBEAT, length=2344 bytes) from
localhost:8080 in 1 millis
2017-06-23 11:13:00,999 INFO [Clustering Tasks Thread-1]
o.a.n.c.c.ClusterProtocolHeartbeater Heartbeat created at 2017-06-23
11:13:00,995 and sent to localhost:10000 at 2017-06-23 11:13:00,999; send took
4 millis
2017-06-23 11:13:01,572 INFO [Process Cluster Protocol Request-2]
o.a.n.c.p.impl.SocketProtocolListener Finished processing request
83d6969d-6073-43d4-b8e3-c182d4feea48 (type=HEARTBEAT, length=2344 bytes) from
localhost:8081 in 2 millis
2017-06-23 11:13:02,296 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 6734
nanos
2017-06-23 11:13:07,300 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 5332
nanos
2017-06-23 11:13:12,305 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 6497
nanos
2017-06-23 11:13:17,308 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 5851
nanos
2017-06-23 11:13:22,314 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 5554
nanos
2017-06-23 11:13:27,318 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 5786
nanos
2017-06-23 11:13:32,045 WARN [Clustering Tasks Thread-1]
o.apache.nifi.controller.FlowController Failed to send heartbeat due to:
org.apache.nifi.cluster.protocol.ProtocolException: Failed to send message to
Cluster Coordinator due to: java.net.ConnectException: Operation timed out
2017-06-23 11:13:32,323 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 6357
nanos
2017-06-23 11:13:37,328 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 6451
nanos
2017-06-23 11:13:42,333 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 5380
nanos
2017-06-23 11:13:42,333 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator localhost:8081 requested disconnection
from cluster due to Have not received a heartbeat from node in 40 seconds
2017-06-23 11:13:42,333 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Status of localhost:8081 changed from
NodeConnectionStatus[nodeId=localhost:8081, state=CONNECTED, updateId=125] to
NodeConnectionStatus[nodeId=localhost:8081, state=DISCONNECTED, Disconnect
Code=Lack of Heartbeat, Disconnect Reason=Have not received a heartbeat from
node in 40 seconds, updateId=126]
2017-06-23 11:13:42,334 WARN [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Event Reported for localhost:8081 -- Node
disconnected from cluster due to Have not received a heartbeat from node in 40
seconds
2017-06-23 11:13:42,334 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator localhost:8080 requested disconnection
from cluster due to Have not received a heartbeat from node in 41 seconds
2017-06-23 11:13:42,334 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Status of localhost:8080 changed from
NodeConnectionStatus[nodeId=localhost:8080, state=CONNECTED, updateId=121] to
NodeConnectionStatus[nodeId=localhost:8080, state=DISCONNECTED, Disconnect
Code=Lack of Heartbeat, Disconnect Reason=Have not received a heartbeat from
node in 41 seconds, updateId=127]
2017-06-23 11:13:42,335 WARN [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Event Reported for localhost:8080 -- Node
disconnected from cluster due to Have not received a heartbeat from node in 41
seconds
2017-06-23 11:13:54,349 INFO [NiFi Web Server-183]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:13:54,349 INFO [NiFi Web Server-141]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:13:54,349 INFO [NiFi Web Server-158]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:13:54,352 INFO [NiFi Web Server-211]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:14:03,063 WARN [Clustering Tasks Thread-1]
o.apache.nifi.controller.FlowController Failed to send heartbeat due to:
org.apache.nifi.cluster.protocol.ProtocolException: Failed to send message to
Cluster Coordinator due to: java.net.ConnectException: Operation timed out
# Disabled port blocking
2017-06-23 11:13:42,335 WARN [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Event Reported for localhost:8080 -- Node
disconnected from cluster due to Have not received a heartbeat from node in 41
seconds
2017-06-23 11:13:54,349 INFO [NiFi Web Server-183]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:13:54,349 INFO [NiFi Web Server-141]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:13:54,349 INFO [NiFi Web Server-158]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:13:54,352 INFO [NiFi Web Server-211]
o.a.n.c.m.e.NoConnectedNodesException Cluster failed processing request:
org.apache.nifi.cluster.manager.exception.NoConnectedNodesException. Returning
Internal Server Error response.
2017-06-23 11:14:03,063 WARN [Clustering Tasks Thread-1]
o.apache.nifi.controller.FlowController Failed to send heartbeat due to:
org.apache.nifi.cluster.protocol.ProtocolException: Failed to send message to
Cluster Coordinator due to: java.net.ConnectException: Operation timed out
2017-06-23 11:14:05,343 INFO [pool-12-thread-1]
o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile
Repository
2017-06-23 11:14:05,523 INFO [pool-12-thread-1]
org.wali.MinimalLockingWriteAheadLog
org.wali.MinimalLockingWriteAheadLog@3f4b59fd checkpointed with 0 Records and 0
Swap Files in 180 milliseconds (Stop-the-world time = 70 milliseconds, Clear
Edit Logs time = 93 millis), max Transaction ID -1
2017-06-23 11:14:05,523 INFO [pool-12-thread-1]
o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile
Repository with 0 records in 180 milliseconds
2017-06-23 11:14:34,103 WARN [Clustering Tasks Thread-1]
o.apache.nifi.controller.FlowController Failed to send heartbeat due to:
org.apache.nifi.cluster.protocol.ProtocolException: Failed to send message to
Cluster Coordinator due to: java.net.ConnectException: Operation timed out
2017-06-23 11:14:52,374 INFO [Process Cluster Protocol Request-3]
o.a.n.c.p.impl.SocketProtocolListener Finished processing request
e636ce6d-32b2-43f1-b4f5-57fcf9a31f25 (type=HEARTBEAT, length=2724 bytes) from
localhost:8080 in 1 millis
2017-06-23 11:14:52,376 INFO [Clustering Tasks Thread-1]
o.a.n.c.c.ClusterProtocolHeartbeater Heartbeat created at 2017-06-23
11:14:39,108 and sent to localhost:10000 at 2017-06-23 11:14:52,376; send took
13268 millis
2017-06-23 11:14:52,388 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Event Reported for localhost:8080 --
Received heartbeat from node previously disconnected due to Lack of Heartbeat.
Issuing reconnection request.
2017-06-23 11:14:52,388 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Event Reported for localhost:8080 --
Requesting that node connect to cluster
2017-06-23 11:14:52,388 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Status of localhost:8080 changed from
NodeConnectionStatus[nodeId=localhost:8080, state=DISCONNECTED, Disconnect
Code=Lack of Heartbeat, Disconnect Reason=Have not received a heartbeat from
node in 41 seconds, updateId=127] to
NodeConnectionStatus[nodeId=localhost:8080, state=CONNECTING, updateId=128]
2017-06-23 11:14:52,390 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 1 heartbeats in 1
millis
2017-06-23 11:14:52,395 INFO [Process Cluster Protocol Request-4]
o.a.n.c.p.impl.SocketProtocolListener Finished processing request
c9b89497-1cf4-452e-a7e3-92975222f0b7 (type=RECONNECTION_REQUEST, length=2217
bytes) from localhost:8080 in 2 millis
2017-06-23 11:14:52,395 INFO [Reconnect to Cluster]
o.a.nifi.controller.StandardFlowService Processing reconnection request from
manager.
2017-06-23 11:14:52,395 INFO [Reconnect to Cluster]
o.a.n.c.c.node.NodeClusterCoordinator Resetting cluster node statuses from
{localhost:8081=NodeConnectionStatus[nodeId=localhost:8081, state=DISCONNECTED,
Disconnect Code=Lack of Heartbeat, Disconnect Reason=Have not received a
heartbeat from node in 40 seconds, updateId=126],
localhost:8080=NodeConnectionStatus[nodeId=localhost:8080, state=CONNECTING,
updateId=128]} to {localhost:8081=NodeConnectionStatus[nodeId=localhost:8081,
state=DISCONNECTED, Disconnect Code=Lack of Heartbeat, Disconnect Reason=Have
not received a heartbeat from node in 40 seconds, updateId=126],
localhost:8080=NodeConnectionStatus[nodeId=localhost:8080, state=CONNECTING,
updateId=128]}
2017-06-23 11:14:52,396 INFO [Reconnect localhost:8080]
o.a.n.c.c.node.NodeClusterCoordinator Successfully requested that
localhost:8080 join the cluster
2017-06-23 11:14:52,404 INFO [Reconnect to Cluster]
o.apache.nifi.controller.FlowController Successfully synchronized controller
with proposed flow
2017-06-23 11:14:52,413 INFO [Reconnect to Cluster]
o.a.nifi.controller.StandardFlowService Setting Flow Controller's Node ID:
localhost:8080
2017-06-23 11:14:52,413 INFO [Reconnect to Cluster]
o.a.n.c.c.node.NodeClusterCoordinator This node is now connected to the
cluster. Will no longer require election of DataFlow.
2017-06-23 11:14:52,413 INFO [Reconnect to Cluster]
o.apache.nifi.controller.FlowController Starting 0 processors/ports/funnels
2017-06-23 11:14:52,414 INFO [Reconnect to Cluster]
o.apache.nifi.controller.FlowController Started 0 Remote Group Ports
transmitting
2017-06-23 11:14:52,414 INFO [Reconnect to Cluster]
o.a.n.c.c.node.NodeClusterCoordinator Resetting cluster node statuses from
{localhost:8081=NodeConnectionStatus[nodeId=localhost:8081, state=DISCONNECTED,
Disconnect Code=Lack of Heartbeat, Disconnect Reason=Have not received a
heartbeat from node in 40 seconds, updateId=126],
localhost:8080=NodeConnectionStatus[nodeId=localhost:8080, state=CONNECTING,
updateId=128]} to {localhost:8081=NodeConnectionStatus[nodeId=localhost:8081,
state=DISCONNECTED, Disconnect Code=Lack of Heartbeat, Disconnect Reason=Have
not received a heartbeat from node in 40 seconds, updateId=126],
localhost:8080=NodeConnectionStatus[nodeId=localhost:8080, state=CONNECTING,
updateId=128]}
2017-06-23 11:14:52,416 INFO [Reconnect to Cluster]
o.a.nifi.controller.StandardFlowService Node reconnected.
2017-06-23 11:14:52,977 INFO [Process Cluster Protocol Request-5]
o.a.n.c.p.impl.SocketProtocolListener Finished processing request
c42f9b35-953d-47dc-a2f9-952307c60461 (type=HEARTBEAT, length=2344 bytes) from
localhost:8081 in 1 millis
2017-06-23 11:14:57,396 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Event Reported for localhost:8081 --
Received heartbeat from node previously disconnected due to Lack of Heartbeat.
Issuing reconnection request.
2017-06-23 11:14:57,396 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Event Reported for localhost:8081 --
Requesting that node connect to cluster
2017-06-23 11:14:57,396 INFO [Heartbeat Monitor Thread-1]
o.a.n.c.c.node.NodeClusterCoordinator Status of localhost:8081 changed from
NodeConnectionStatus[nodeId=localhost:8081, state=DISCONNECTED, Disconnect
Code=Lack of Heartbeat, Disconnect Reason=Have not received a heartbeat from
node in 40 seconds, updateId=126] to
NodeConnectionStatus[nodeId=localhost:8081, state=CONNECTING, updateId=130]
```
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---