Thanks Luke. I am sure many others have faced this problem before, so I
would like to know if there are any existing custom algorithms that can be
reused.

Note that we also have a requirement to maintain key-level ordering, so the
custom partitioner should support that as well.
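
To make the requirement concrete, here is a minimal sketch of the kind of
algorithm I have in mind. It is only an illustration under assumptions, not an
existing Kafka class: KeyAssigner and partitionFor are hypothetical names, and
the idea is that a custom Partitioner's "partition" method would delegate to it
with the topic's current partition count. Each new key gets the next partition
in rotation, and the choice is remembered so the key never moves.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of Kafka): hands out partitions to new keys
// in turn and remembers each choice, so a key never moves once assigned.
final class KeyAssigner {
    private final ConcurrentHashMap<String, Integer> assigned = new ConcurrentHashMap<>();
    private final AtomicInteger nextSlot = new AtomicInteger(0);

    // Returns the partition for `key`; the first call for a key picks the
    // next partition in rotation, later calls return the same partition.
    int partitionFor(String key, int numPartitions) {
        return assigned.computeIfAbsent(
            key, k -> Math.floorMod(nextSlot.getAndIncrement(), numPartitions));
    }
}
```

With 8 keys and 6 partitions this uses every partition, with at most two keys
on any one of them. The catch is that the mapping lives in the memory of a
single producer instance: with several producers, or after a restart, the same
key could land on a different partition, so the mapping would have to be fixed
in configuration or shared somehow for key-level ordering to hold.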

On Sun, Nov 21, 2021, 18:29 Luke Chen <[email protected]> wrote:

> Hello Pushkar,
> The default distribution algorithm is "hash(key) % partition_count", so
> an uneven distribution like the one you saw is possible.
>
> Yes, there's a way to solve your problem: a custom partitioner:
> https://kafka.apache.org/documentation/#producerconfigs_partitioner.class
>
> You can check the partitioner javadoc here
> <https://kafka.apache.org/30/javadoc/org/apache/kafka/clients/producer/Partitioner.html>
> for reference. You can see some examples from the built-in partitioners, e.g.
> clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java.
> Basically, you want to focus on the "partition" method and define your own
> algorithm there to distribute the keys based on the events, e.g. key-1 ->
> partition-1, key-2 -> partition-2, etc.
>
> Thank you.
> Luke
>
>
> On Sat, Nov 20, 2021 at 2:55 PM Pushkar Deole <[email protected]>
> wrote:
>
> > Hi All,
> >
> > We are experiencing uneven distribution of events across topic
> > partitions for a small set of unique keys. The details are as follows:
> >
> > 1. topic with 6 partitions
> > 2. 8 unique keys used to produce events onto the topic
> >
> > We used key-based partitioning while producing events onto the above topic.
> > Observation: only 3 partitions were utilized for all the events pertaining
> > to those 8 unique keys.
> >
> > Any idea how the load can be spread evenly across partitions while using a
> > key-based partitioning strategy? Any help would be greatly appreciated.
> >
> > Note: we cannot use round robin, since key-level ordering matters for us.
> >
>
