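Hello Pushkar,

For records with a key, the default partitioner chooses the partition as "hash(key) % partition_count" (the hash is a murmur2 hash of the serialized key). With only 8 unique keys, several of them can easily hash to the same partition, which is the uneven distribution you saw.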
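To make the effect concrete, here is a rough, self-contained sketch. It is not Kafka's code: String.hashCode() stands in for the murmur2 hash, and the key names and the class ModuloDistributionDemo are made up for illustration. It only counts how many of 8 keys fall on each of 6 partitions under a hash-modulo rule.

```java
import java.util.Map;
import java.util.TreeMap;

public class ModuloDistributionDemo {

    // Assumption: String.hashCode() is a stand-in for the producer's real key hash
    // (Kafka hashes the serialized key bytes with murmur2). Masking with 0x7fffffff
    // keeps the value non-negative before taking the modulo.
    public static int partitionFor(String key, int partitionCount) {
        return (key.hashCode() & 0x7fffffff) % partitionCount;
    }

    // Counts how many keys land on each partition; partitions that receive
    // no key do not appear in the returned map.
    public static Map<Integer, Integer> countPerPartition(String[] keys, int partitionCount) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(partitionFor(key, partitionCount), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical keys; substitute the 8 real keys to see their spread.
        String[] keys = {"key-1", "key-2", "key-3", "key-4",
                         "key-5", "key-6", "key-7", "key-8"};
        System.out.println(countPerPartition(keys, 6));
    }
}
```

Nothing in a hash-modulo rule forces a small set of keys to spread over all partitions, so a map with fewer than 6 entries here is the same effect as the 3 used partitions in your test.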
Yes, there is a way to solve your problem: a custom partitioner, configured through the producer setting "partitioner.class":
https://kafka.apache.org/documentation/#producerconfigs_partitioner.class

You can check the Partitioner javadoc here for reference:
<https://kafka.apache.org/30/javadoc/org/apache/kafka/clients/producer/Partitioner.html>

You can also look at the built-in partitioners for examples, e.g.:
clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java

Basically, you want to focus on the "partition" method and define your own algorithm there to distribute the keys, e.g. key-1 -> partition-1, key-2 -> partition-2, etc. As long as a given key always maps to the same partition, per-key ordering is preserved.

Thank you.
Luke

On Sat, Nov 20, 2021 at 2:55 PM Pushkar Deole <[email protected]> wrote:

> Hi All,
>
> We are experiencing some uneven distribution of events across topic
> partitions for a small set of unique keys: following are the details:
>
> 1. topic with 6 partitions
> 2. 8 unique keys used to produce events onto the topic
>
> Used 'key' based partitioning while producing events onto the above topic
> Observation: only 3 partitions were utilized for all the events pertaining
> to those 8 unique keys.
>
> Any idea how can the load be even across partitions while using key based
> partitioning strategy? Any help would be greatly appreciated.
>
> Note: we cannot use round robin since key level ordering matters for us
>
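For the question above, a minimal sketch of the kind of fixed key-to-partition table a custom "partition" method could consult. This is only the mapping logic, kept free of Kafka dependencies so it runs on its own. The class name FixedKeyAssignment and the key names are hypothetical, and it assumes the full set of keys is known up front. In a real Partitioner implementation, the partition count would come from cluster.partitionsForTopic(topic).size() and the returned int would be the result of partition(...).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FixedKeyAssignment {

    private final Map<String, Integer> table = new HashMap<>();
    private final int partitionCount;

    // knownKeys must be supplied in the same order by every producer, otherwise
    // different producers would send the same key to different partitions.
    public FixedKeyAssignment(List<String> knownKeys, int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
        for (int i = 0; i < knownKeys.size(); i++) {
            // Spread the known keys over the partitions in list order.
            table.put(knownKeys.get(i), i % partitionCount);
        }
    }

    // Returns the fixed partition for a known key. Unknown keys fall back to a
    // hash-modulo rule (String.hashCode() is an assumption, not Kafka's murmur2).
    public int partitionFor(String key) {
        Integer assigned = table.get(key);
        if (assigned != null) {
            return assigned;
        }
        return (key.hashCode() & 0x7fffffff) % partitionCount;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("key-1", "key-2", "key-3", "key-4",
                                    "key-5", "key-6", "key-7", "key-8");
        FixedKeyAssignment assignment = new FixedKeyAssignment(keys, 6);
        for (String key : keys) {
            System.out.println(key + " -> partition " + assignment.partitionFor(key));
        }
    }
}
```

With 8 keys and 6 partitions, every partition gets at least one key and two partitions get two. The table has to stay identical across all producers, and it would need revisiting if the partition count of the topic changes.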
