Hello Pushkar,
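The default partitioner picks the partition as "hash(key) % partition_count"
(using a murmur2 hash of the serialized key). With only 8 keys and 6
partitions, several keys can easily land on the same partition, which gives
the uneven distribution you saw.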
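
To get a feel for how likely this is, here is a rough back-of-the-envelope
sketch. It assumes the hash spreads keys uniformly and independently over
the partitions, which is an idealization and not a statement about murmur2
on your actual keys. The class and method names are made up for
illustration. For 8 keys and 6 partitions it comes out at roughly 4.6
partitions used on average, so seeing only 3 is unlucky but not surprising.

```java
final class ExpectedPartitions {
    // Expected number of distinct partitions that receive at least one key,
    // assuming each key is assigned uniformly at random and independently.
    static double expectedUsed(int keys, int partitions) {
        // Probability that one particular partition receives none of the keys.
        double missProbability = Math.pow(1.0 - 1.0 / partitions, keys);
        return partitions * (1.0 - missProbability);
    }

    public static void main(String[] args) {
        // 8 keys, 6 partitions, as in the original question.
        System.out.println(expectedUsed(8, 6));
    }
}
```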

Yes, there is a way to solve your problem: plug in a custom partitioner
through the "partitioner.class" producer config:
https://kafka.apache.org/documentation/#producerconfigs_partitioner.class

You can check the Partitioner javadoc here
<https://kafka.apache.org/30/javadoc/org/apache/kafka/clients/producer/Partitioner.html>
for reference. The built-in partitioners are good examples to read, e.g.
clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java.
The method to focus on is "partition": implement your own algorithm there
to map each key to a partition, e.g. key-1 -> partition-1, key-2 ->
partition-2, and so on. As long as a given key always maps to the same
partition, key-level ordering is preserved.

Thank you.
Luke


On Sat, Nov 20, 2021 at 2:55 PM Pushkar Deole <[email protected]> wrote:

> Hi All,
>
> We are experiencing some uneven distribution of events across topic
> partitions for a small set of unique keys: following are the details:
>
> 1. topic with 6 partitions
> 2. 8 unique keys used to produce events onto the topic
>
> Used 'key' based partitioning while producing events onto the above topic
> Observation: only 3 partitions were utilized for all the events pertaining
> to those 8 unique keys.
>
> Any idea how can the load be even across partitions while using key based
> partitioning strategy? Any help would be greatly appreciated.
>
> Note: we cannot use round robin since key level ordering matters for us
>
