Current partition assignment offers only a few built-in options -- see the partition.assignment.strategy consumer option (which seems to be listed twice; see the second version for a more detailed explanation). There has been some discussion of making assignment strategies user-extensible to support use cases like this.
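For reference, selecting one of the built-in strategies looks roughly like this in the consumer configuration (property name from the Kafka consumer config docs; note the newer Java consumer takes assignor class names for this property instead of these short names):

```properties
# Pick one of the built-in strategies ("range" is the default)
partition.assignment.strategy=roundrobin
```

Neither strategy looks at partition size; both balance only the *count* of partitions per consumer.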
Is there a reason your data is unbalanced that might be avoidable? Ideally, good hashing of keys combined with a large enough number of keys and reasonable data distribution across keys (not necessarily uniform) leads to a reasonable balance, although there are certainly some workloads so skewed that this doesn't work out.

On Tue, Jun 23, 2015 at 7:34 PM, Joel Ohman <maelstrom.thunderb...@gmail.com> wrote:
> Hello!
>
> I'm working with a topic of largely variable partition sizes. My biggest
> concern is that I have no control over which keys are assigned to which
> consumers in my consumer group, as the amount of data my consumer sees is
> directly reflected in its workload. Is there a way to distribute
> partitions to consumers evenly based on the size of each partition? The
> provided Consumer Rebalancing Algorithm prioritizes assigning consumers
> even numbers of partitions, regardless of their size.
>
> Regards,
> Joel

--
Thanks,
Ewen
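The size-aware assignment Joel asks about isn't built in, but the underlying idea is essentially longest-processing-time-first bin packing: sort partitions by size descending, then repeatedly hand the next partition to the currently least-loaded consumer. A minimal sketch of that greedy algorithm (the partition names, sizes, and consumer ids are made up for illustration; a real assignor would plug into the client's rebalancing machinery):

```python
import heapq

def assign_by_size(partition_sizes, consumers):
    """Greedy LPT assignment: largest partitions first, each given to the
    consumer with the smallest total assigned size so far.
    Returns {consumer: [partition, ...]}."""
    # Min-heap of (total_bytes_assigned, consumer); lightest load pops first.
    heap = [(0, c) for c in consumers]
    heapq.heapify(heap)
    assignment = {c: [] for c in consumers}
    # Largest partitions first so big ones can't pile up on one consumer.
    for partition, size in sorted(partition_sizes.items(),
                                  key=lambda kv: kv[1], reverse=True):
        load, consumer = heapq.heappop(heap)
        assignment[consumer].append(partition)
        heapq.heappush(heap, (load + size, consumer))
    return assignment

# Example: skewed partition sizes (bytes) spread over two consumers.
sizes = {"topic-0": 900, "topic-1": 100, "topic-2": 450, "topic-3": 400}
print(assign_by_size(sizes, ["c1", "c2"]))
```

With the sample sizes above, one consumer ends up with just the 900-byte partition and the other with the remaining three (total 950), which is far closer to balanced by data volume than an even split by partition count would be.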