Matthew Clarke created NIFI-3559:
------------------------------------

             Summary: Improve S2S load-balancing
                 Key: NIFI-3559
                 URL: https://issues.apache.org/jira/browse/NIFI-3559
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
    Affects Versions: 1.1.1
            Reporter: Matthew Clarke


The current implementation of S2S sends data continuously to the destination 
NiFi node for 0.5 seconds before closing the connection and opening a new 
connection to another node.

When the source FlowFiles are all very small (0 bytes in the case of list-based 
processors), the entire queue can end up being sent to only one of the target 
NiFi cluster nodes.
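
To illustrate the effect, here is a rough sketch of a time-window model. It is 
not NiFi code: the function name, the fixed per-FlowFile transfer time, and the 
round-robin choice of the next node are all assumptions made for this example.

```python
def simulate_time_window(queue_size, per_file_us, num_nodes, window_us=500_000):
    """Toy model: send to one node until the window elapses, then move on.

    All names and parameters are hypothetical. Times are in whole microseconds
    to keep the arithmetic exact. Returns the FlowFile count sent to each node.
    """
    if num_nodes <= 0 or per_file_us <= 0:
        raise ValueError("num_nodes and per_file_us must be positive")
    counts = [0] * num_nodes
    # How many FlowFiles fit into one window (at least one per connection).
    files_per_window = max(1, window_us // per_file_us)
    node = 0
    remaining = queue_size
    while remaining > 0:
        sent = min(files_per_window, remaining)
        counts[node] += sent
        remaining -= sent
        # Assumption: the next connection goes to the next node in order.
        node = (node + 1) % num_nodes
    return counts


# 1000 tiny FlowFiles at 100 microseconds each all fit into the first window.
print(simulate_time_window(1000, 100, 3))      # [1000, 0, 0]
# Larger FlowFiles at 0.25 s each spread out across the nodes.
print(simulate_time_window(10, 250_000, 3))    # [4, 4, 2]
```

In this model, a queue that drains in under 0.5 seconds never reaches a second 
node, which matches the behavior described above.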

Another common use case for S2S is to have a Remote Process Group (RPG) pointed 
back at the same cluster where the RPG was added.  Since FlowFiles are likely to 
transfer to the same node where the data originates (think primary node data 
redistribution within a cluster) much faster than to other nodes, the primary 
node is likely to always end up with more FlowFiles than any other node.

There needs to be an additional load-balancing strategy that complements the 
existing 0.5-second window to improve load balancing in such cases.  The RPG 
knows how many target nodes there are and how many FlowFiles exist in the queue 
at run time, so perhaps using that information to split the queue more evenly 
among all nodes would help.
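
One possible shape for such a count-based split, as a minimal sketch only. The 
function name and signature are hypothetical, and it assumes the queue size and 
node count are known up front and that all nodes should receive equal shares.

```python
def split_evenly(queue_size, num_nodes):
    """Return how many FlowFiles each node should receive.

    Hypothetical helper, not part of NiFi. Shares differ by at most one;
    any remainder goes to the first nodes in the list.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    if queue_size < 0:
        raise ValueError("queue_size must not be negative")
    base, extra = divmod(queue_size, num_nodes)
    return [base + (1 if i < extra else 0) for i in range(num_nodes)]


print(split_evenly(10, 3))   # [4, 3, 3]
print(split_evenly(2, 3))    # [1, 1, 0]
```

A cap like this could be applied alongside the existing time window, so a 
connection closes when either the window elapses or the node's share is sent.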

This is related to an existing Jira: NIFI-2987



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
