Hi Kafka Team,

I would like to have your opinion before creating a new JIRA.

I'm working with the Java Consumer API. The current partition assignors use
the consumer ids to sort members before assigning partitions.

This works pretty well as long as all consumers are started in the same
JVM and there are no more consumers than partitions.

However, in many cases consumers are distributed across multiple hosts. To
continue consuming with an optimal number of consumers (even with a host
failure) we can create as many consumers as partitions on each host.

For example, we have two consumer instances, C0 and C1, each on a dedicated
host and each running 4 consumer threads, and one topic with 4 partitions.
With the current assignors it is not possible to have 2 consuming threads
and 2 idle threads per host.

Instead, because the member ids of C0 sort before those of C1, C0 ends up
with 4 consuming threads and C1 with 4 idle threads.
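
To make the example concrete, here is a minimal sketch in plain Java (not
the actual Kafka assignor code) of a range-style assignment that sorts
members by id. The member ids such as "C0-thread-0" are hypothetical names
chosen for illustration; real member ids will look different.

```java
import java.util.*;

class IdSortedAssignment {
    // Range-style assignment: sort members by id, then give each member a
    // contiguous block of partitions. Members at the end of the sorted
    // list stay idle when there are more members than partitions.
    static Map<String, List<Integer>> assign(List<String> memberIds, int numPartitions) {
        List<String> sorted = new ArrayList<>(memberIds);
        Collections.sort(sorted);
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        int perMember = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = perMember + (i < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                partitions.add(next++);
            }
            assignment.put(sorted.get(i), partitions);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical ids: 4 threads on host C0, 4 threads on host C1.
        List<String> members = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            members.add("C1-thread-" + t);
            members.add("C0-thread-" + t);
        }
        // Every "C0-..." id sorts before every "C1-..." id, so the 4
        // partitions all land on host C0.
        assign(members, 4).forEach((m, p) -> System.out.println(m + " -> " + p));
    }
}
```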

One solution could be to record a timestamp the first time a member
subscribes to a topic. This timestamp can then be used to sort members for
the partition assignment. In this way, the partition assignment will be
more predictable as it will not depend on member ids.

One drawback of this solution is that the consumer responsible for
assignments must keep local state.
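
A rough sketch of the idea, again in plain Java rather than against the
real assignor interface. The class name, the method names and the map of
first-seen timestamps are all hypothetical; the map stands for the local
state mentioned above. Note that the outcome depends on the order in which
members first subscribe: the 2 + 2 split only appears if threads from both
hosts subscribe in an interleaved order, which is an assumption of this
example and not something the sketch enforces.

```java
import java.util.*;

class TimestampOrderedAssignor {
    // Local state held by the assigning consumer:
    // member id -> time (ms) of its first subscription.
    private final Map<String, Long> firstSeen = new HashMap<>();

    // Only the first subscription of a member is recorded; later calls
    // for the same member id leave the stored timestamp unchanged.
    void recordSubscription(String memberId, long timestampMs) {
        firstSeen.putIfAbsent(memberId, timestampMs);
    }

    // Sort members by first-seen timestamp (member id only breaks ties),
    // then hand out partitions one by one in that order.
    Map<String, List<Integer>> assign(int numPartitions) {
        List<String> sorted = new ArrayList<>(firstSeen.keySet());
        sorted.sort((a, b) -> {
            int byTime = Long.compare(firstSeen.get(a), firstSeen.get(b));
            return byTime != 0 ? byTime : a.compareTo(b);
        });
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String member : sorted) {
            assignment.put(member, new ArrayList<>());
        }
        if (sorted.isEmpty()) {
            return assignment;
        }
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(sorted.get(p % sorted.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        TimestampOrderedAssignor assignor = new TimestampOrderedAssignor();
        long ts = 1;
        // Assumed interleaved start order: C0-thread-0, C1-thread-0, ...
        for (int t = 0; t < 4; t++) {
            assignor.recordSubscription("C0-thread-" + t, ts++);
            assignor.recordSubscription("C1-thread-" + t, ts++);
        }
        assignor.assign(4).forEach((m, p) -> System.out.println(m + " -> " + p));
    }
}
```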

Thanks,

-- 
Florian
