[
https://issues.apache.org/jira/browse/NIFI-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622580#comment-16622580
]
ASF GitHub Bot commented on NIFI-5585:
--------------------------------------
Github user jtstorck commented on the issue:
https://github.com/apache/nifi/pull/3010
While testing the reconnection of a decommissioned node, with ~24,000 files
split between the two nodes, an error occurred after reconnecting the
decommissioned node and attempting to drop all the flowfiles in the queue
from both nodes:
```
2018-09-20 15:25:22,594 ERROR [Drop FlowFiles for Connection cbbf2971-0165-1000-ffff-ffff94b81269] o.a.n.c.q.c.SocketLoadBalancedFlowFileQueue Failed to drop FlowFiles for org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue@469caf69
java.lang.IllegalArgumentException: null
    at org.apache.nifi.controller.queue.QueueSize.<init>(QueueSize.java:31)
    at org.apache.nifi.controller.queue.QueueSize.add(QueueSize.java:67)
    at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.adjustSize(SocketLoadBalancedFlowFileQueue.java:514)
    at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.dropFlowFiles(SocketLoadBalancedFlowFileQueue.java:903)
    at org.apache.nifi.controller.queue.AbstractFlowFileQueue$2.run(AbstractFlowFileQueue.java:285)
    at java.lang.Thread.run(Thread.java:748)
```
The decommission operation was successful: all flowfiles were moved from
the decommissioned node to the other node. After reconnecting the
decommissioned node, however, I couldn't clear the flowfile queue.
After restarting the cluster (both nodes), the queue showed as empty.
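For anyone reading the trace: the exception carries no message (hence `null`) and is thrown from the `QueueSize` constructor, reached through `QueueSize.add` from `adjustSize`. One possible reading, which is an assumption and not something I have confirmed, is that the constructor rejects negative values and that the drop subtracted more flowfiles than the size counter held after the reconnect. The sketch below uses a hypothetical stand-in class, `SizeSketch` (not the NiFi source), to show how such a guard would produce this kind of failure.

```java
// Hypothetical stand-in for a queue-size value object; NOT the NiFi source.
// Assumption: negative counts or byte totals are rejected in the constructor.
final class SizeSketch {
    final int objectCount;
    final long byteCount;

    SizeSketch(int objectCount, long byteCount) {
        if (objectCount < 0 || byteCount < 0) {
            // No message is supplied, so a logger would print
            // "IllegalArgumentException: null".
            throw new IllegalArgumentException();
        }
        this.objectCount = objectCount;
        this.byteCount = byteCount;
    }

    // Returns a new size adjusted by the given deltas (negative deltas shrink it).
    SizeSketch add(int countDelta, long byteDelta) {
        return new SizeSketch(objectCount + countDelta, byteCount + byteDelta);
    }

    public static void main(String[] args) {
        // The tracked size is smaller than what the drop tries to subtract.
        SizeSketch tracked = new SizeSketch(10, 1_000L);
        try {
            tracked.add(-12, -1_200L);
        } catch (IllegalArgumentException e) {
            System.out.println("drop failed: " + e);
        }
    }
}
```

If that reading is right, it would also be consistent with the queue showing as empty after a restart, since the size would be recomputed then. That part is a guess as well.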
> Decommission Nodes from Cluster
> -------------------------------
>
> Key: NIFI-5585
> URL: https://issues.apache.org/jira/browse/NIFI-5585
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Affects Versions: 1.7.1
> Reporter: Jeff Storck
> Assignee: Jeff Storck
> Priority: Major
>
> Allow a node in the cluster to be decommissioned, rebalancing flowfiles on
> the node to be decommissioned to the other active nodes. This work depends
> on NIFI-5516.
> Similar to the client sending a DISCONNECTING message as a PUT request to
> cluster/nodes/\{id}, a DECOMMISSIONING message can be sent as a PUT request
> to the same URI to initiate a DECOMMISSION for a DISCONNECTED node. The
> DECOMMISSIONING request will be idempotent.
> The steps to decommission a node and remove it from the cluster are:
> # Send request to disconnect the node
> # Once disconnect completes, send request to decommission the node.
> # Once decommission completes, send request to delete node.
> When an error occurs and the node cannot complete decommissioning, the user
> can:
> # Send request to delete the node from the cluster
> # Diagnose why the node had issues with the decommission (out of memory, no
> network connection, etc.) and address the issue
> # Restart NiFi on the node so that it will reconnect to the cluster
> # Go through the steps to decommission and remove a node
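The disconnect, decommission, delete sequence quoted above can be sketched as three HTTP requests. This is a rough illustration only: the base URL, the JSON body shape, the use of HTTP DELETE for the last step, and the class and method names (`DecommissionSketch`, `statusRequest`, `deleteRequest`, `decommissionPlan`) are all assumptions. Only the `cluster/nodes/{id}` path, the PUT method, and the DISCONNECTING and DECOMMISSIONING messages come from the description. The requests are built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

final class DecommissionSketch {
    // Assumption: base URL of the cluster API; adjust for a real deployment.
    static final String BASE = "http://localhost:8080/nifi-api/controller/";

    // Builds the PUT request asking the cluster to move a node to a status.
    // Assumption: this JSON body shape is illustrative, not taken from the issue.
    static HttpRequest statusRequest(String nodeId, String status) {
        String body = "{\"node\":{\"nodeId\":\"" + nodeId
                + "\",\"status\":\"" + status + "\"}}";
        return HttpRequest.newBuilder(URI.create(BASE + "cluster/nodes/" + nodeId))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Assumption: deleting the node is an HTTP DELETE on the same URI.
    static HttpRequest deleteRequest(String nodeId) {
        return HttpRequest.newBuilder(URI.create(BASE + "cluster/nodes/" + nodeId))
                .DELETE()
                .build();
    }

    // The three requests in order. A real client would wait for each step
    // to complete before sending the next one; that polling is omitted here.
    static List<HttpRequest> decommissionPlan(String nodeId) {
        return List.of(
                statusRequest(nodeId, "DISCONNECTING"),
                statusRequest(nodeId, "DECOMMISSIONING"),
                deleteRequest(nodeId));
    }

    public static void main(String[] args) {
        // "node-1" is a placeholder node id.
        for (HttpRequest request : decommissionPlan("node-1")) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Because the DECOMMISSIONING request is described as idempotent, a client following this plan could resend the second request after a timeout without needing to check first whether the earlier attempt was received.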