[ 
https://issues.apache.org/jira/browse/NIFI-14236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17925669#comment-17925669
 ] 

ASF subversion and git services commented on NIFI-14236:
--------------------------------------------------------

Commit 9b84dd475ee6561b2fdf2b0c95c2c182ed3f0fb9 in nifi's branch 
refs/heads/main from David Handermann
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=9b84dd475e ]

NIFI-14236 Implement consistent hash for Attribute Partitioner (#9709)

* NIFI-14236 Implemented consistent hash for Attribute Partitioner
- Used implementation from com.google.common.hash.Hashing.consistentHash method


> Load Balance using Partition by Attribute results in poor distribution
> ----------------------------------------------------------------------
>
>                 Key: NIFI-14236
>                 URL: https://issues.apache.org/jira/browse/NIFI-14236
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.16.0, 2.2.0
>            Reporter: Mark Payne
>            Assignee: David Handermann
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> NIFI-9638 aimed to eliminate unnecessary dependencies on the Google Guava 
> library. However, in doing so, it changed the hashing algorithm from 
> Consistent Hashing to a much simpler hashing algorithm that results in poor 
> distribution of data.
> Consistent Hashing was specifically chosen for this use case because it 
> provides excellent distribution of data across a given set of bins and is 
> also designed so that, if the number of bins (in this case the number of 
> NiFi nodes) changes, the number of elements that must be redistributed (in 
> this case the number of FlowFiles) is minimal.
> We need to revert to Consistent Hashing in order to provide better 
> distribution of data across the cluster.
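
To illustrate the property described in the issue, here is a minimal sketch of
jump consistent hashing, the technique behind Guava's
com.google.common.hash.Hashing.consistentHash. It is not the code from the
commit: the class name JumpHashSketch, the method names, and the use of FNV-1a
to turn an attribute value into a 64-bit key are all assumptions made for this
example.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; names and the FNV-1a key step are hypothetical,
// not taken from the NiFi commit.
public final class JumpHashSketch {

    private JumpHashSketch() {
    }

    /**
     * Jump consistent hash: maps a 64-bit key to a bucket in [0, buckets).
     * Growing from n to n + 1 buckets moves a key only into the new bucket n.
     */
    public static int bucketFor(long key, final int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        long current = -1;
        long next = 0;
        while (next < buckets) {
            current = next;
            // Advance a linear congruential generator seeded by the key
            key = key * 2862933555777941757L + 1;
            // Jump forward by a pseudo-random factor that is always >= 1
            final double factor = (double) (1L << 31) / (double) ((key >>> 33) + 1);
            next = (long) ((current + 1) * factor);
        }
        return (int) current;
    }

    /**
     * Hypothetical helper: hashes an attribute value with 64-bit FNV-1a,
     * then selects a partition (node index) for it.
     */
    public static int partitionFor(final String attributeValue, final int nodeCount) {
        long hash = 0xcbf29ce484222325L; // FNV-1a 64-bit offset basis
        for (final byte b : attributeValue.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L; // FNV-1a 64-bit prime
        }
        return bucketFor(hash, nodeCount);
    }
}
```

With this sketch, growing a cluster from 3 to 4 nodes leaves each attribute
value either on its previous partition or on the new partition with index 3.
No value moves between the existing partitions, which is the minimal
redistribution the issue refers to.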



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
