[ 
https://issues.apache.org/jira/browse/NIFI-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622580#comment-16622580
 ] 

ASF GitHub Bot commented on NIFI-5585:
--------------------------------------

Github user jtstorck commented on the issue:

    https://github.com/apache/nifi/pull/3010
  
    While testing the reconnection of a decommissioned node, with ~24,000 files 
split between the two nodes, I hit an error after reconnecting the 
decommissioned node and attempting to drop all the flowfiles in the queue from 
both nodes:
    ```
    2018-09-20 15:25:22,594 ERROR [Drop FlowFiles for Connection cbbf2971-0165-1000-ffff-ffff94b81269] o.a.n.c.q.c.SocketLoadBalancedFlowFileQueue Failed to drop FlowFiles for org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue@469caf69
    java.lang.IllegalArgumentException: null
        at org.apache.nifi.controller.queue.QueueSize.<init>(QueueSize.java:31)
        at org.apache.nifi.controller.queue.QueueSize.add(QueueSize.java:67)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.adjustSize(SocketLoadBalancedFlowFileQueue.java:514)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.dropFlowFiles(SocketLoadBalancedFlowFileQueue.java:903)
        at org.apache.nifi.controller.queue.AbstractFlowFileQueue$2.run(AbstractFlowFileQueue.java:285)
        at java.lang.Thread.run(Thread.java:748)
    ```
    The decommission operation itself was successful: all flowfiles were moved 
from the decommissioned node to the other node.  After reconnecting the 
decommissioned node, however, I couldn't clear the flowfile queue. 
    
    After restarting the cluster (both nodes), the queue showed as empty.
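
One possible reading of the trace is that `QueueSize.add` builds a new `QueueSize` from the current totals plus a delta, and the constructor rejects the result. Below is a minimal sketch of that failure mode. It assumes the constructor throws on a negative count or byte total, which the trace does not confirm, and `SizeSketch` is a hypothetical stand-in rather than NiFi's `QueueSize` source.

```java
// Hypothetical stand-in for an immutable queue-size value; not NiFi's actual class.
final class SizeSketch {
    final int count;
    final long bytes;

    SizeSketch(int count, long bytes) {
        // Assumption: negative totals are rejected with a message-less
        // exception, like the one in the trace.
        if (count < 0 || bytes < 0) {
            throw new IllegalArgumentException();
        }
        this.count = count;
        this.bytes = bytes;
    }

    // Returns a new value with the deltas applied; deltas may be negative.
    SizeSketch add(int countDelta, long bytesDelta) {
        return new SizeSketch(count + countDelta, bytes + bytesDelta);
    }

    // True if subtracting the dropped amounts throws, i.e. the tracked
    // totals are smaller than what the drop reports as removed.
    static boolean dropUnderflows(SizeSketch current, int dropped, long droppedBytes) {
        try {
            current.add(-dropped, -droppedBytes);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

Under this reading, the drop would fail whenever the size tracked for the queue is lower than the number of flowfiles actually dropped, for example a tracked count of 10 with 12 flowfiles removed.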


> Decommission Nodes from Cluster
> -------------------------------
>
>                 Key: NIFI-5585
>                 URL: https://issues.apache.org/jira/browse/NIFI-5585
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 1.7.1
>            Reporter: Jeff Storck
>            Assignee: Jeff Storck
>            Priority: Major
>
> Allow a node in the cluster to be decommissioned, rebalancing flowfiles on 
> the node to be decommissioned to the other active nodes.  This work depends 
> on NIFI-5516.
> Similar to the client sending a DISCONNECTING message as a PUT request to 
> cluster/nodes/{id}, a DECOMMISSIONING message can be sent as a PUT request 
> to the same URI to initiate a DECOMMISSION for a DISCONNECTED node. The 
> DECOMMISSIONING request will be idempotent.
> The steps to decommission a node and remove it from the cluster are:
> # Send request to disconnect the node
> # Once disconnect completes, send request to decommission the node.
> # Once decommission completes, send request to delete node.
> When an error occurs and the node cannot complete decommissioning, the user 
> can:
> # Send request to delete the node from the cluster
> # Diagnose why the node had issues with the decommission (out of memory, no 
> network connection, etc.) and address the issue
> # Restart NiFi on the node so that it will reconnect to the cluster
> # Go through the steps to decommission and remove a node
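
The disconnect, decommission, delete sequence from the description can be sketched as an ordered list of requests. This is a rough illustration only: the `cluster/nodes/{id}` path and the DISCONNECTING and DECOMMISSIONING messages come from the description, while the JSON body shape, the use of DELETE for the final step, and the `DecommissionSketch` helper are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that lists the three calls; it only builds request
// descriptions and performs no network I/O.
final class DecommissionSketch {

    // Path relative to the REST API root, as given in the issue description.
    static String nodePath(String nodeId) {
        return "cluster/nodes/" + nodeId;
    }

    // Assumption: the status message is carried in a JSON body of this shape.
    static String statusBody(String nodeId, String status) {
        return "{\"node\":{\"nodeId\":\"" + nodeId + "\",\"status\":\"" + status + "\"}}";
    }

    // Each entry is {method, path, body}. A caller should send each request
    // only after the previous step has completed.
    static List<String[]> plan(String nodeId) {
        List<String[]> steps = new ArrayList<>();
        steps.add(new String[] {"PUT", nodePath(nodeId), statusBody(nodeId, "DISCONNECTING")});
        steps.add(new String[] {"PUT", nodePath(nodeId), statusBody(nodeId, "DECOMMISSIONING")});
        // Assumption: deletion is a DELETE on the same URI with no body.
        steps.add(new String[] {"DELETE", nodePath(nodeId), ""});
        return steps;
    }
}
```

Because the DECOMMISSIONING request is described as idempotent, a client following this plan could safely resend the second step if it is unsure whether the first attempt was received.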



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
