Hi Florian,

I'm not totally sure I understand the problem. The consumer id consists of
the clientId configured by the user with a UUID appended to it. If the
clientId has not been passed in configuration, we use "consumer-{n}" for it
where n is incremented for every new consumer instance. Is the problem
basically that this default, when applied independently on different JVMs,
gives you a less-than-ideal ordering of members?
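
To make the ordering concrete, here is a minimal sketch of the scenario
quoted below. It is not the actual assignor code. The class and helper names
are made up for illustration, the member id is modeled simply as
clientId + "-" + UUID, and the assignment is a simplified range-style split
over a single topic. It also assumes the client ids were set to something
like "C0-1" ... "C1-4" (host name plus thread number), which may not match
the real setup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

public class MemberIdOrdering {
    // Hypothetical helper: models a member id as "<clientId>-<uuid>".
    public static String memberId(String clientId) {
        return clientId + "-" + UUID.randomUUID();
    }

    // Simplified range-style assignment for one topic: sort the member ids,
    // then hand out contiguous partition ranges in that order. The first
    // (numPartitions % members) members each receive one extra partition.
    public static Map<String, List<Integer>> assign(List<String> memberIds,
                                                    int numPartitions) {
        List<String> sorted = new ArrayList<>(memberIds);
        Collections.sort(sorted);
        int perMember = numPartitions / sorted.size();
        int extra = numPartitions % sorted.size();
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = perMember + (i < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                partitions.add(next++);
            }
            result.put(sorted.get(i), partitions);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two hosts (C0, C1), four consumers each, one topic with 4 partitions.
        List<String> members = new ArrayList<>();
        for (String host : new String[] {"C1", "C0"}) {
            for (int t = 1; t <= 4; t++) {
                members.add(memberId(host + "-" + t)); // assumed client ids
            }
        }
        assign(members, 4).forEach((m, p) -> System.out.println(m + " -> " + p));
    }
}
```

With those assumed client ids, the four members on C0 sort first and take
one partition each, while the four members on C1 get nothing.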

For what it's worth, I know that Kafka Streams is a little more clever in
how partitions are assigned. It uses a custom assignor which takes into
account the consumer's host information.
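
As a rough illustration of what "taking host information into account"
could look like, here is a sketch that orders members by interleaving
hosts instead of sorting by member id alone. This is not the Kafka Streams
implementation. The class name, the method, and the member-to-host map are
all hypothetical, and in a real assignor the host would have to reach the
group leader some other way (for example in the subscription's user data).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HostAwareOrdering {
    // Returns member ids ordered so that consecutive entries rotate across
    // hosts. Hosts are visited in sorted order, and members within a host
    // are taken in sorted order.
    public static List<String> interleaveByHost(Map<String, String> hostByMember) {
        Map<String, Deque<String>> byHost = new TreeMap<>();
        // Iterating a TreeMap copy yields member ids in sorted order,
        // so each per-host queue ends up sorted as well.
        for (Map.Entry<String, String> e : new TreeMap<>(hostByMember).entrySet()) {
            byHost.computeIfAbsent(e.getValue(), h -> new ArrayDeque<>())
                  .add(e.getKey());
        }
        List<String> ordered = new ArrayList<>();
        while (ordered.size() < hostByMember.size()) {
            for (Deque<String> queue : byHost.values()) {
                if (!queue.isEmpty()) {
                    ordered.add(queue.poll());
                }
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Hypothetical input: member id -> host it runs on.
        Map<String, String> hostByMember = new LinkedHashMap<>();
        for (String host : new String[] {"C0", "C1"}) {
            for (int t = 1; t <= 4; t++) {
                hostByMember.put(host + "-" + t, host);
            }
        }
        System.out.println(interleaveByHost(hostByMember));
    }
}
```

Feeding this order into a range-style or round-robin split of 4 partitions
would give the first four entries (two per host) one partition each.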

Thanks,
Jason

On Thu, Sep 1, 2016 at 9:00 AM, Florian Hussonnois <fhussonn...@gmail.com>
wrote:

> Hi Kafka Team,
>
> I would like to have your opinion before creating a new JIRA.
>
> I'm working with the Java Consumer API. The current partition assignors use
> the consumer ids to sort members before assigning partitions.
>
> This works pretty well as long as all consumers are started in the same
> JVM and no more consumers than the number of partitions are created.
>
> However, in many cases consumers are distributed across multiple hosts. To
> continue consuming with an optimal number of consumers (even with a host
> failure) we can create as many consumers as partitions on each host.
>
> For example, we have two consumers C0, C1 each on a dedicated host and one
> topic with 4 partitions. With current assignors it is not possible to have
> 2 consuming threads and 2 idle threads per host.
>
> Instead of that, C0 will have 4 consuming threads and C1 will have 4 idle
> threads.
>
> One solution could be to keep a timestamp the first time a member
> subscribes to a topic. This timestamp can then be used to sort members for
> the partition assignment. In this way, the partition assignment will be
> more predictable as it will not depend on member ids.
>
> One drawback of this solution is that the consumer responsible for
> assignments will have to keep local state.
>
> Thanks,
>
> --
> Florian
>
