Hi,

The current Partitioner interface is the following:

interface Partitioner<T> {
   int partition(T key, int numPartitions);
}

This allows partition selection based solely on the message (or
message set) key and the number of partitions.

Message order guarantees provided by Kafka are per partition, a
partition being a pair (brokerid, partitionid).

With the current Partitioner interface, partition selection is a
blind process: the Partitioner only returns an index and never sees
which (brokerid, partitionid) pair that index refers to. Some use
cases would benefit from a selection process which exposes the
brokerid/partitionid info.

Imagine a Kafka cluster that currently has 2 brokers, each with one
partition. Messages are dispatched to both partitions with a custom
Partitioner, sending messages with the same key to the same
partition.

Now suppose we add a broker with one partition. With the current
Partitioner interface, you cannot ensure that messages with a given
key that were sent to a given partition will still be sent to the same
partition, because you have no info on the brokerid/partitionid that a
given partition index will select. In some use cases this is a
problem, as adding a broker then induces the same kind of disturbance
as a broker failure.
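
To illustrate, here is a minimal sketch. HashPartitioner and RemapDemo
are made-up names for this example, and hashing the key modulo
numPartitions is only an assumption about what a typical custom
Partitioner does. It lists the keys whose partition index changes when
numPartitions goes from 2 to 3:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The interface as quoted at the top of this message.
interface Partitioner<T> {
    int partition(T key, int numPartitions);
}

// Hypothetical custom partitioner: same key -> same index, as long as
// numPartitions does not change.
class HashPartitioner implements Partitioner<String> {
    @Override
    public int partition(String key, int numPartitions) {
        // floorMod keeps the index non-negative for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}

class RemapDemo {
    // Returns the keys whose partition index differs between the two sizes.
    static List<String> movedKeys(Partitioner<String> p, List<String> keys,
                                  int before, int after) {
        List<String> moved = new ArrayList<>();
        for (String key : keys) {
            if (p.partition(key, before) != p.partition(key, after)) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f");
        // Prints [b, c, d, e]: four of the six keys change index.
        System.out.println(movedKeys(new HashPartitioner(), keys, 2, 3));
    }
}
```

Even for the keys whose index stays the same, the Partitioner cannot
tell whether that index still designates the same brokerid/partitionid.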

Could the Partitioner interface be modified to include the sorted list
of (brokerid, partitionid) pairs as an additional parameter?
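
As a rough sketch of what I have in mind (BrokerPartition,
BrokerAwarePartitioner and RendezvousPartitioner are hypothetical
names, not existing Kafka classes, and the exact signature is only a
suggestion), the extra parameter would let a Partitioner score each
(brokerid, partitionid) pair per key and pick the highest score
(rendezvous hashing). With that scheme, adding a broker only moves the
keys that now land on the new broker:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical value type for one (brokerid, partitionid) pair.
final class BrokerPartition {
    final int brokerId;
    final int partitionId;

    BrokerPartition(int brokerId, int partitionId) {
        this.brokerId = brokerId;
        this.partitionId = partitionId;
    }
}

// Hypothetical extended interface: still returns an index, but the
// index now refers to a position in the sorted list passed in.
interface BrokerAwarePartitioner<T> {
    int partition(T key, int numPartitions, List<BrokerPartition> sortedPartitions);
}

// Rendezvous (highest-score) selection over the broker/partition pairs.
class RendezvousPartitioner implements BrokerAwarePartitioner<String> {
    @Override
    public int partition(String key, int numPartitions,
                         List<BrokerPartition> sortedPartitions) {
        int best = 0;
        int bestScore = 0;
        for (int i = 0; i < sortedPartitions.size(); i++) {
            BrokerPartition bp = sortedPartitions.get(i);
            // The score depends only on the key and the pair itself,
            // never on the pair's position in the list.
            int score = Objects.hash(key, bp.brokerId, bp.partitionId);
            if (i == 0 || score > bestScore) {
                best = i;
                bestScore = score;
            }
        }
        return best;
    }
}

class BrokerAwareDemo {
    // True if every key keeps its broker/partition, or moves to newBrokerId.
    static boolean stableExceptNewBroker(List<String> keys,
                                         List<BrokerPartition> before,
                                         List<BrokerPartition> after,
                                         int newBrokerId) {
        RendezvousPartitioner p = new RendezvousPartitioner();
        for (String key : keys) {
            BrokerPartition a = before.get(p.partition(key, before.size(), before));
            BrokerPartition b = after.get(p.partition(key, after.size(), after));
            boolean same = a.brokerId == b.brokerId && a.partitionId == b.partitionId;
            if (!same && b.brokerId != newBrokerId) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<BrokerPartition> two = Arrays.asList(
                new BrokerPartition(0, 0), new BrokerPartition(1, 0));
        List<BrokerPartition> three = Arrays.asList(
                new BrokerPartition(0, 0), new BrokerPartition(1, 0),
                new BrokerPartition(2, 0));
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f");
        System.out.println(stableExceptNewBroker(keys, two, three, 2));
    }
}
```

Objects.hash is used here only to keep the sketch short; a real
implementation would probably want a better-mixing hash function.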
