Hi Jason,

The downside of using client ids is that there is no guarantee they will
be increasing.
So when there are already as many consumers as partitions, an additional
consumer can change the partition assignment.
This leads to rebalances that are unnecessary, since the partitions are
already well balanced.
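
To illustrate, here is a minimal sketch in Python. It is a simplified model
of range-style assignment, not the actual Kafka assignor code, and the member
ids ("client-a" to "client-e") are hypothetical. It compares sorting members
by id with sorting them by join order, which is what a first-subscription
timestamp would give.

```python
def range_assign(members, num_partitions):
    """Give contiguous partition ranges to members, in the order provided."""
    base, extra = divmod(num_partitions, len(members))
    assignment, start = {}, 0
    for index, member in enumerate(members):
        # The first `extra` members each receive one additional partition.
        count = base + (1 if index < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


# Hypothetical member ids, listed in the order they joined the group.
join_order = ["client-b", "client-c", "client-d", "client-e"]
before = range_assign(sorted(join_order), 4)

# A fifth consumer joins; its id happens to sort before all the others.
join_order.append("client-a")
after_by_id = range_assign(sorted(join_order), 4)   # sort by member id
after_by_join = range_assign(join_order, 4)         # sort by join order

print("before:       ", before)
print("sorted by id: ", after_by_id)
print("by join order:", after_by_join)
```

In this model, sorting by id moves every partition to a different member and
leaves "client-e" idle, while sorting by join order keeps the existing
assignment and leaves the newcomer idle.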

Thanks,

2016-09-01 20:29 GMT+02:00 Jason Gustafson <ja...@confluent.io>:

> Hi Florian,
>
> I'm not totally sure I understand the problem. The consumer id consists of
> the clientId configured by the user with a UUID appended to it. If the
> clientId has not been passed in configuration, we use "consumer-{n}" for it
> where n is incremented for every new consumer instance. Is the problem
> basically that this default when applied independently on different jvms is
> giving you less than ideal ordering?
>
> For what it's worth, I know that Kafka Streams is a little more clever in
> how partitions are assigned. It uses a custom assignor which takes into
> account the consumer's host information.
>
> Thanks,
> Jason
>
> On Thu, Sep 1, 2016 at 9:00 AM, Florian Hussonnois <fhussonn...@gmail.com>
> wrote:
>
> > Hi Kafka Team,
> >
> > I would like to have your opinion before creating a new JIRA.
> >
> > I'm working with the Java Consumer API. The current partition assignors
> > use the consumer ids to sort members before assigning partitions.
> >
> > This works pretty well as long as all consumers are started in the same
> > JVM and no more consumers than the number of partitions are created.
> >
> > However, in many cases consumers are distributed across multiple hosts.
> > To continue consuming with an optimal number of consumers (even with a
> > host failure) we can create as many consumers as partitions on each host.
> >
> > For example, we have two consumers C0, C1 each on a dedicated host and
> > one topic with 4 partitions. With current assignors it is not possible
> > to have 2 consuming threads and 2 idle threads per host.
> >
> > Instead of that, C0 will have 4 consuming threads and C1 will have 4 idle
> > threads.
> >
> > One solution could be to keep a timestamp the first time a member
> > subscribes to a topic. This timestamp can then be used to sort members
> > for the partition assignment. In this way, the partition assignment will
> > be more predictable as it will not depend on member ids.
> >
> > One drawback of this solution is that the consumer responsible for
> > assignments will keep a local state.
> >
> > Thanks,
> >
> > --
> > Florian
> >
>



-- 
Florian HUSSONNOIS
