Partition assignment currently offers only a few built-in options -- see the
partition.assignment.strategy consumer option (which appears to be listed
twice in the docs; the second entry has the more detailed explanation). There
has been some discussion of making assignment strategies user-extensible to
support use cases like this.
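
As a rough sketch (not something from this thread), selecting one of the
built-in strategies is just a consumer config property. Property names and
values below assume the high-level consumer as of 0.8.x, and the ZooKeeper
address and group id are placeholders -- check the docs for your version:

    import java.util.Properties;

    public class ConsumerConfigSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // placeholder address
            props.put("group.id", "my-consumer-group");       // hypothetical group id
            // Built-in choices are "range" (the default) and "roundrobin".
            // Neither looks at partition size, which is the limitation
            // discussed above -- they only balance partition *counts*.
            props.put("partition.assignment.strategy", "roundrobin");
            // Pass these props into the consumer constructor for your client version.
            System.out.println(props);
        }
    }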

Is there an avoidable reason your data is unbalanced? Ideally, good hashing
of keys combined with a large enough number of keys and a reasonable data
distribution across them (not necessarily uniform) leads to reasonably
balanced partitions, although there are certainly workloads skewed enough
that this doesn't work out.
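
To illustrate the intuition (just a sketch, not Kafka's exact code -- the
default partitioner hashes the serialized key bytes, murmur2 in the new
producer, then takes it modulo the partition count), here's a quick
simulation with a hypothetical partition count and synthetic keys showing
how many-keys-per-partition tends to even out:

    import java.util.HashMap;
    import java.util.Map;

    public class KeyDistributionSketch {
        public static void main(String[] args) {
            int numPartitions = 12;                     // hypothetical partition count
            Map<Integer, Integer> counts = new HashMap<>();
            for (int i = 0; i < 100_000; i++) {
                String key = "user-" + i;               // synthetic keys
                // Approximation of hash-based partitioning: positive hash mod partition count.
                int partition = (key.hashCode() & 0x7fffffff) % numPartitions;
                counts.merge(partition, 1, Integer::sum);
            }
            counts.forEach((p, c) ->
                System.out.println("partition " + p + ": " + c + " keys"));
        }
    }

With many keys and no single dominant key, the counts come out close to
uniform; heavy skew toward a handful of hot keys is where this breaks down.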



On Tue, Jun 23, 2015 at 7:34 PM, Joel Ohman <maelstrom.thunderb...@gmail.com
> wrote:

> Hello!
>
> I'm working with a topic whose partition sizes vary widely. My biggest
> concern is that I have no control over which keys are assigned to which
> consumers in my consumer group, and the amount of data a consumer sees is
> directly reflected in its workload. Is there a way to distribute
> partitions to consumers evenly based on the size of each partition? The
> provided Consumer Rebalancing Algorithm prioritizes assigning consumers
> even numbers of partitions, regardless of their size.
>
> Regards,
> Joel
>



-- 
Thanks,
Ewen
