Denis Jakupovic created NIFI-9598:
-------------------------------------

             Summary: Load Balancing on labeled nodes and/or fixed amount of 
usable nodes in process groups
                 Key: NIFI-9598
                 URL: https://issues.apache.org/jira/browse/NIFI-9598
             Project: Apache NiFi
          Issue Type: Improvement
    Affects Versions: 1.15.3
            Reporter: Denis Jakupovic


One of NiFi's great features is its linear scalability: capacity grows simply 
by adding more nodes. However, the only ways to distribute flowfiles today are 
the DistributeLoad processor and the load balancing strategies on a connection 
(round robin, partition by attribute, or single node). We need a more granular 
way of distributing flowfiles across the cluster. 

Let's assume we have a 10-node NiFi cluster. 
Round Robin: Each node gets 1/10 of the flowfiles.
Single Node: Only one node processes all flowfiles (FF). The chance that 
another process group distributes to the same node is 1/10.
By Attribute: Between 1 and 10 nodes get the data, not evenly partitioned.
Distribute Load Processor: Manual and fixed process, cannot scale with adding 
more nodes to the cluster and needs 
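
The difference between the first and third strategy above can be illustrated 
with a small simulation. This is only a sketch: the node names are made up, and 
hashing the attribute value with CRC32 is an assumed stand-in for however NiFi 
actually maps attribute values to nodes.

```python
from collections import Counter
from itertools import cycle
import zlib

# Hypothetical 10-node cluster; the names are illustrative only.
NODES = [f"node-{i}" for i in range(1, 11)]


def round_robin(flowfiles, nodes):
    """Count flowfiles per node when they are handed out in strict rotation."""
    rotation = cycle(nodes)
    return Counter(next(rotation) for _ in flowfiles)


def by_attribute(flowfiles, nodes, attribute):
    """Count flowfiles per node when the node is chosen by attribute value.

    Assumption: a stable hash modulo the node count. This is not NiFi's
    actual partitioning code.
    """
    counts = Counter()
    for ff in flowfiles:
        index = zlib.crc32(ff[attribute].encode("utf-8")) % len(nodes)
        counts[nodes[index]] += 1
    return counts


# 1000 flowfiles carrying only three distinct values of "customer".
flowfiles = [{"customer": ("a", "b", "c")[i % 3]} for i in range(1000)]

rr = round_robin(flowfiles, NODES)
attr = by_attribute(flowfiles, NODES, "customer")

print("round robin :", dict(rr))
print("by attribute:", dict(attr))
```

With only three distinct attribute values, at most three of the ten nodes 
receive data, while round robin touches all ten evenly.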

When several dataflows serve different use cases with enormous variance in 
computational cost, one or a few dataflows can slow down all the others. 
A solution could therefore be to partition the data to labeled nodes, or to 
set a maximum number of nodes that a process group or a connection may use 
for FF partitioning/load balancing.

In the cluster configuration, each node could be given a label. Round-robin 
distribution would then send FF only to the nodes carrying the matching 
label. Distribution by attribute name, in contrast, would require building the 
attribute accordingly, and it cannot be built dynamically. 
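
A minimal sketch of the labeled round robin idea, under stated assumptions: 
the NODE_LABELS mapping and the labeled_round_robin function are hypothetical 
and do not correspond to any existing NiFi setting or API.

```python
from itertools import cycle

# Hypothetical node labels, as they might appear in a cluster configuration.
NODE_LABELS = {
    "node-1": {"etl"},
    "node-2": {"etl", "ml"},
    "node-3": {"ml"},
    "node-4": set(),
}


def labeled_round_robin(flowfiles, node_labels, required_label):
    """Rotate flowfiles over only those nodes that carry required_label."""
    eligible = sorted(
        node for node, labels in node_labels.items() if required_label in labels
    )
    if not eligible:
        raise ValueError(f"no node carries label {required_label!r}")
    rotation = cycle(eligible)
    return [(ff, next(rotation)) for ff in flowfiles]


assignments = labeled_round_robin(["ff1", "ff2", "ff3"], NODE_LABELS, "etl")
for flowfile, node in assignments:
    print(flowfile, "->", node)
```

Nodes without the label never receive flowfiles from this connection, so a 
heavy dataflow labeled "ml" could not slow down the nodes reserved for "etl".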

Another great feature would be a maximum number of nodes that a process group 
can use to distribute flowfiles.
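
One way such a cap could behave is sketched here. How the subset of nodes is 
chosen is not specified in this request; picking a contiguous window whose 
start depends on a hash of the process group id is purely an assumption, and 
nodes_for_group is a hypothetical helper.

```python
import zlib


def nodes_for_group(group_id, nodes, max_nodes):
    """Return at most max_nodes nodes that a process group may balance over.

    Assumption: the window start is derived from a stable hash of the group
    id, so different groups tend to land on different parts of the cluster.
    """
    if max_nodes < 1:
        raise ValueError("max_nodes must be at least 1")
    size = min(max_nodes, len(nodes))
    start = zlib.crc32(group_id.encode("utf-8")) % len(nodes)
    return [nodes[(start + offset) % len(nodes)] for offset in range(size)]


cluster = [f"node-{i}" for i in range(1, 11)]  # hypothetical 10-node cluster
subset = nodes_for_group("ingest-group", cluster, max_nodes=3)
print(subset)
```

Because the cap is a count rather than a fixed list of nodes, the same 
setting keeps working when nodes are added to the cluster.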



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
