[jira] [Commented] (NIFI-5585) Prepare Nodes to be Decommissioned from Cluster

ASF GitHub Bot (JIRA) Wed, 03 Oct 2018 10:30:44 -0700


    [ 
https://issues.apache.org/jira/browse/NIFI-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637257#comment-16637257
 ]


ASF GitHub Bot commented on NIFI-5585:
--------------------------------------

Github user andrewmlim commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/3010#discussion_r222397907
  
    --- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
    @@ -3929,6 +3929,13 @@ from the remote node before considering the 
communication with the node a failur
     to the cluster. It provides an additional layer of security. This value is 
blank by default, meaning that no firewall file is to be used.
     |`nifi.cluster.flow.election.max.wait.time`|Specifies the amount of time 
to wait before electing a Flow as the "correct" Flow. If the number of Nodes 
that have voted is equal to the number specified by the 
`nifi.cluster.flow.election.max.candidates` property, the cluster will not wait 
this long. The default value is `5 mins`. Note that the time starts as soon as 
the first vote is cast.
     |`nifi.cluster.flow.election.max.candidates`|Specifies the number of Nodes 
required in the cluster to cause early election of Flows. This allows the Nodes 
in the cluster to avoid having to wait a long time before starting processing 
if we reach at least this number of nodes in the cluster.
    +|`nifi.cluster.flow.election.max.wait.time`|Specifies the amount of time 
to wait before electing a Flow as the "correct" Flow. If the number of Nodes 
that have voted is equal to the number specified
    + by the `nifi.cluster.flow.election.max.candidates` property, the cluster 
will not wait this long. The default value is `5 mins`. Note that the time 
starts as soon as the first vote is cast.
    +|`nifi.cluster.flow.election.max.candidates`|Specifies the number of Nodes 
required in the cluster to cause early election of Flows. This allows the Nodes 
in the cluster to avoid having to wait a
    +long time before starting processing if we reach at least this number of 
nodes in the cluster.
    +|`nifi.cluster.load.balance.port`|Specifies the port to listen on for 
incoming connections for load balancing data across the cluster. The default 
value is `6342`.|
    +|`nifi.cluster.load.balance.host`|Specifies the hostname to listen on for 
incoming connections for load balancing data across the cluster. If not 
specified, will default to the value used by the `nifi
    --- End diff --
    
    The formatting in the table is off because the "|" at the end of line 3936 
is not needed.


> Prepare Nodes to be Decommissioned from Cluster
> -----------------------------------------------
>
>                 Key: NIFI-5585
>                 URL: https://issues.apache.org/jira/browse/NIFI-5585
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 1.7.1
>            Reporter: Jeff Storck
>            Assignee: Jeff Storck
>            Priority: Major
>
> Allow a node in the cluster to be decommissioned, rebalancing flowfiles on 
> the node to be decommissioned to the other active nodes.  This work depends 
> on NIFI-5516.
> Only nodes that are DISCONNECTED can be transitioned to the OFFLOADING state.
> OFFLOADING nodes will transition to the OFFLOADED state once all flowfiles 
> have been rebalanced to other connected nodes in the cluster.
> OFFLOADING nodes that remain in the OFFLOADING state (due to errors 
> encountered while offloading) can be reconnected to the cluster by restarting 
> NiFi on the node.
> OFFLOADED nodes can be reconnected to the cluster by issuing a connection 
> request via the UI/CLI, or restarting NiFi on the node.
> OFFLOADED nodes can be deleted from the cluster.
> OFFLOADING a node:
> * stops all processors
> * terminates all processors
> * stops transmitting on all remote process groups
> * rebalances flowfiles to other connected nodes in the cluster (via the work 
> done in NIFI-5516)
> The steps to decommission a node and remove it from the cluster are:
>  # Send request to disconnect the node
>  # Once disconnect completes, send request to offload the node.
>  # Once offload completes, send request to delete node.
>  # Once the delete request has finished, the NiFi service on the host can be 
> stopped/removed.
> When an error occurs and the node cannot complete offloading, the user can:
>  # Send request to delete the node from the cluster
>  # Diagnose why the node had issues with the offload (out of memory, no 
> network connection, etc.) and address the issue
>  # Restart NiFi on the node so that it will reconnect to the cluster
>  # Go through the steps to decommission a node
> The OFFLOADING request is idempotent.
> Toolkit CLI commands for retrieving a single node, list of nodes, and 
> connecting/disconnecting/offloading/deleting nodes have been added.
> The cluster table UI has an icon to initiate the OFFLOADING of a DISCONNECTED 
> node.
> Similar to the client sending PUT request with a DISCONNECTING message to 
> cluster/nodes/\{id}, an OFFLOADING message can be sent as a PUT request to 
> the same URI to initiate an OFFLOAD for a DISCONNECTED node.
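
The node lifecycle described in the issue can be read as a small state machine. The following is a minimal sketch of that reading, not NiFi's implementation: the request names, the `REMOVED` marker state, and the `apply_request` and `decommission` helpers are hypothetical, and a connection request is simplified to go straight to CONNECTED.

```python
from enum import Enum


class NodeState(Enum):
    CONNECTED = "CONNECTED"
    DISCONNECTED = "DISCONNECTED"
    OFFLOADING = "OFFLOADING"
    OFFLOADED = "OFFLOADED"
    REMOVED = "REMOVED"  # hypothetical marker for a node deleted from the cluster


# (request, current state) -> next state, following the issue description.
# Request names are illustrative assumptions, not NiFi identifiers.
TRANSITIONS = {
    ("disconnect", NodeState.CONNECTED): NodeState.DISCONNECTED,
    # Only DISCONNECTED nodes can be transitioned to OFFLOADING.
    ("offload", NodeState.DISCONNECTED): NodeState.OFFLOADING,
    # The offload request is idempotent: repeating it changes nothing.
    ("offload", NodeState.OFFLOADING): NodeState.OFFLOADING,
    # All flowfiles have been rebalanced to other connected nodes.
    ("offload_complete", NodeState.OFFLOADING): NodeState.OFFLOADED,
    # OFFLOADED nodes can be reconnected or deleted.
    ("connect", NodeState.OFFLOADED): NodeState.CONNECTED,
    ("delete", NodeState.OFFLOADED): NodeState.REMOVED,
    # Error path: a node stuck in OFFLOADING can be deleted from the cluster.
    ("delete", NodeState.OFFLOADING): NodeState.REMOVED,
}


def apply_request(state, request):
    """Return the next state, or raise ValueError if the request is not allowed."""
    try:
        return TRANSITIONS[(request, state)]
    except KeyError:
        raise ValueError(f"{request!r} is not allowed in state {state.name}") from None


def decommission(state):
    """Walk the documented happy path: disconnect, offload, wait, delete."""
    for request in ("disconnect", "offload", "offload_complete", "delete"):
        state = apply_request(state, request)
    return state
```

In this sketch, `decommission(NodeState.CONNECTED)` walks the four documented steps in order, while requesting an offload for a node that is still CONNECTED raises `ValueError`, mirroring the rule that only DISCONNECTED nodes can be offloaded.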



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
