Pierre Villard created NIFI-4026:
------------------------------------

             Summary: SiteToSite Partitioning
                 Key: NIFI-4026
                 URL: https://issues.apache.org/jira/browse/NIFI-4026
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
            Reporter: Pierre Villard


To cover more use cases and make the Site-to-Site mechanism more flexible, 
it would be useful to introduce an S2S partitioning key.

The idea is to add a parameter to the S2S configuration that computes the 
destination node from a flow file attribute. The user would specify which 
attribute to read from incoming flow files, and a hashing function would be 
applied to that attribute's value to produce a number between 1 and N (N being 
the number of nodes in the remote cluster), which selects the destination node.
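
As a rough illustration of the selection step described above, here is a 
minimal Python sketch. It is not NiFi code: the function name select_node, the 
choice of SHA-256, and the handling of a missing attribute are all assumptions 
made for the example.

```python
import hashlib


def select_node(attributes, partition_attribute, node_count):
    """Return a 1-based node index (1..node_count) for one flow file."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Assumption: a missing attribute is treated as the empty string.
    value = attributes.get(partition_attribute, "")
    # A stable digest is used because Python's built-in hash() is salted
    # per process for strings, so it would not be repeatable across runs.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count + 1


# Hypothetical flow file attributes and a three-node remote cluster.
flow_file_attributes = {"customer.id": "42", "filename": "a.csv"}
node = select_node(flow_file_attributes, "customer.id", 3)
```

Two flow files carrying the same attribute value always map to the same node 
index as long as N does not change.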

It could even be possible to let the user write a custom hashing function in a 
scripting language.
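
One way to picture a user-supplied hashing function is a partitioner that 
accepts any callable mapping the attribute value to an integer. The sketch 
below is only an assumption about how this could look; make_partitioner, 
default_hash and first_char_hash are hypothetical names, not an existing API.

```python
import hashlib


def default_hash(value):
    """Default: a stable 64-bit integer derived from SHA-256."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def make_partitioner(partition_attribute, hash_function=default_hash):
    """Build a function mapping (attributes, node_count) to a 1-based node."""
    def partition(attributes, node_count):
        value = attributes.get(partition_attribute, "")
        # Python's % yields a non-negative result for a positive divisor,
        # so a custom function returning a negative number still works.
        return hash_function(value) % node_count + 1
    return partition


# Hypothetical user-written function: route on the first character only.
def first_char_hash(value):
    return ord(value[0]) if value else 0


by_region = make_partitioner("region", first_char_hash)
node = by_region({"region": "eu"}, 4)
```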

This approach would potentially force the “batching” size to 1. Alternatively, 
it could be necessary to create bins so that flow files destined for the same 
node are batched together.
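
The binning alternative could look roughly like the following sketch, which 
groups flow files by destination node so that each bin can be sent as one 
batch. The names node_for and bin_flow_files are hypothetical, and flow files 
are represented here simply as attribute dictionaries.

```python
import hashlib
from collections import defaultdict


def node_for(value, node_count):
    """Map an attribute value to a 1-based node index with a stable hash."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count + 1


def bin_flow_files(flow_files, partition_attribute, node_count):
    """Group flow files (attribute dicts) into one bin per destination node."""
    bins = defaultdict(list)
    for attributes in flow_files:
        node = node_for(attributes.get(partition_attribute, ""), node_count)
        bins[node].append(attributes)
    return dict(bins)


# Hypothetical incoming flow files and a three-node remote cluster.
incoming = [
    {"customer.id": "a"},
    {"customer.id": "b"},
    {"customer.id": "a"},
]
bins = bin_flow_files(incoming, "customer.id", 3)
```

Each bin could then be transferred in a single transaction to its node, which 
keeps batching while still honouring the partitioning key.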

Obviously, this raises the question of how to handle the remote cluster scaling 
up or down. However, I believe this is an edge case and should not block this 
feature.
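
To give a sense of what scaling means for a simple "hash modulo N" scheme, the 
sketch below measures how many keys change node when N changes. It assumes the 
modulo approach and a SHA-256 based hash, neither of which is decided in this 
issue; node_for and fraction_remapped are hypothetical names.

```python
import hashlib


def node_for(value, node_count):
    """Map an attribute value to a 1-based node index with a stable hash."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count + 1


def fraction_remapped(keys, old_count, new_count):
    """Fraction of keys whose destination changes when the cluster resizes."""
    moved = sum(
        1 for key in keys
        if node_for(key, old_count) != node_for(key, new_count)
    )
    return moved / len(keys)


# Hypothetical attribute values; the cluster grows from 3 to 4 nodes.
keys = [f"key-{i}" for i in range(1000)]
ratio = fraction_remapped(keys, 3, 4)
```

With a uniform hash, going from 3 to 4 nodes is expected to move about three 
quarters of the keys, since a key keeps its node only when its hash has the 
same remainder modulo 3 and modulo 4.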



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
