Hi Florian,

I'm not totally sure I understand the problem. The consumer id consists of the clientId configured by the user with a UUID appended to it. If the clientId has not been passed in the configuration, we use "consumer-{n}" for it, where n is incremented for every new consumer instance. Is the problem basically that this default, when applied independently on different JVMs, is giving you less than ideal ordering?
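
To check that we mean the same thing, here is a rough sketch of how I read the scenario. It is a simplified simulation of range-style assignment (sort the member ids, then hand out contiguous partitions in that order), not the actual assignor code. The class name and the "C0-thread-n" / "C1-thread-n" client ids are made up for illustration, and the UUID suffix is left out because the host prefix alone already decides the sort order here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RangeAssignmentSketch {

    // Simplified range-style assignment for a single topic:
    // sort member ids lexicographically, then give each member a
    // contiguous block of partitions in that sorted order.
    public static Map<String, List<Integer>> assign(List<String> memberIds, int numPartitions) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        if (memberIds.isEmpty()) {
            return assignment;
        }
        List<String> sorted = new ArrayList<>(memberIds);
        Collections.sort(sorted);

        int perMember = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            // The first `extra` members in sort order get one more partition.
            int count = perMember + (i < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                partitions.add(next++);
            }
            assignment.put(sorted.get(i), partitions);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical setup: two hosts, four consumers each, one topic with 4 partitions.
        List<String> members = new ArrayList<>();
        for (String host : new String[] {"C0", "C1"}) {
            for (int n = 1; n <= 4; n++) {
                members.add(host + "-thread-" + n);
            }
        }
        assign(members, 4).forEach((member, partitions) ->
                System.out.println(member + " -> " + partitions));
    }
}
```

With these ids, every "C0-..." member sorts ahead of every "C1-..." member, so the four partitions all land on the first host and the second host stays idle. If that is the behavior you are seeing, then the ordering of the ids is the thing to look at.
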

For what it's worth, I know that Kafka Streams is a little more clever in how partitions are assigned. It uses a custom assignor which takes into account the consumer's host information.

Thanks,
Jason

On Thu, Sep 1, 2016 at 9:00 AM, Florian Hussonnois <fhussonn...@gmail.com> wrote:

> Hi Kafka Team,
>
> I would like to have your opinion before creating a new JIRA.
>
> I'm working with the Java Consumer API. The current partition assignors use
> the consumer ids to sort members before assigning partitions.
>
> This works pretty well as long as all consumers are started in the same
> JVM and no more consumers than the number of partitions are created.
>
> However, in many cases consumers are distributed across multiple hosts. To
> continue consuming with an optimal number of consumers (even with a host
> failure) we can create as many consumers as partitions on each host.
>
> For example, we have two consumers C0, C1 each on a dedicated host and one
> topic with 4 partitions. With current assignors it is not possible to have
> 2 consuming threads and 2 idle threads per host.
>
> Instead of that, C0 will have 4 consuming threads and C1 will have 4 idle
> threads.
>
> One solution could be to keep a timestamp the first time a member
> subscribes to a topic. This timestamp can then be used to sort members for
> the partition assignment. In this way, the partition assignment will be
> more predictable as it will not depend on member ids.
>
> One drawback of this solution is that the consumer responsible for
> assignments will keep a local state.
>
> Thanks,
>
> --
> Florian
>