[
https://issues.apache.org/jira/browse/NIFI-14236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Handermann updated NIFI-14236:
------------------------------------
Status: Patch Available (was: Open)
> Load Balance using Partition by Attribute results in poor distribution
> ----------------------------------------------------------------------
>
> Key: NIFI-14236
> URL: https://issues.apache.org/jira/browse/NIFI-14236
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 2.2.0, 1.16.0
> Reporter: Mark Payne
> Assignee: David Handermann
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> NIFI-9638 aimed to eliminate unnecessary dependencies on the Google Guava
> library. However, in doing so, it changed the hashing algorithm from
> Consistent Hashing to a much simpler hashing algorithm that results in poor
> distribution of data.
> Consistent Hashing was specifically chosen for this use case because it
> provides excellent distribution of data across a given set of bins, and
> because it is designed so that when the number of bins (in this case, the
> number of NiFi nodes) changes, the number of elements that must be
> redistributed (in this case, the number of FlowFiles) is minimal.
> We need to revert to Consistent Hashing in order to provide better
> distribution of data across the cluster.
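
To illustrate the redistribution property described above, here is a minimal sketch of one well-known consistent hashing scheme, jump consistent hash. It is not NiFi's code and not Guava's implementation. The class and method names (ConsistentHashSketch, bucketFor, moduloBucketFor, movedKeys) are hypothetical, and plain modulo is used only as an assumed stand-in for "a much simpler hashing algorithm", since the ticket does not say which algorithm replaced Consistent Hashing.

```java
// Illustrative sketch only: hypothetical names, not NiFi or Guava code.
final class ConsistentHashSketch {

    private ConsistentHashSketch() {
    }

    /**
     * Jump consistent hash: maps a 64-bit key to a bucket in [0, buckets).
     * When the bucket count grows from n to n + 1, a key either keeps its
     * bucket or moves to the new bucket n.
     */
    static int bucketFor(long key, int buckets) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        long state = key;
        long bucket = -1;
        long next = 0;
        while (next < buckets) {
            bucket = next;
            // Linear congruential step used as a cheap pseudo-random source.
            state = state * 2862933555777941757L + 1;
            next = (long) ((bucket + 1)
                    * ((double) (1L << 31) / (double) ((state >>> 33) + 1)));
        }
        return (int) bucket;
    }

    /** Plain modulo partitioning, an assumed stand-in for a simpler scheme. */
    static int moduloBucketFor(long key, int buckets) {
        return (int) Math.floorMod(key, (long) buckets);
    }

    /**
     * Counts how many keys in [0, keyCount) change bucket when the bucket
     * count goes from oldBuckets to newBuckets.
     */
    static int movedKeys(int keyCount, int oldBuckets, int newBuckets,
                         boolean consistent) {
        int moved = 0;
        for (long key = 0; key < keyCount; key++) {
            int before = consistent
                    ? bucketFor(key, oldBuckets)
                    : moduloBucketFor(key, oldBuckets);
            int after = consistent
                    ? bucketFor(key, newBuckets)
                    : moduloBucketFor(key, newBuckets);
            if (before != after) {
                moved++;
            }
        }
        return moved;
    }
}
```

Under these assumptions, calling ConsistentHashSketch.movedKeys(12000, 3, 4, true) and then the same call with false compares how many keys change bins when a fourth node joins a three-node cluster. With modulo, three of every four keys move; with the consistent scheme, the only keys that move are those reassigned to the new bin.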