[ 
https://issues.apache.org/jira/browse/KAFKA-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738011#comment-16738011
 ] 

BDeus commented on KAFKA-5337:
------------------------------

I think it's an interesting feature to add.

When a consumer group reads multiple topics with large offset differences, a 
single worker can end up reading all of the "big" topics.

> Partition assignment strategy that distributes lag evenly across consumers in 
> each group
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5337
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5337
>             Project: Kafka
>          Issue Type: New Feature
>          Components: consumer
>    Affects Versions: 0.10.2.1
>            Reporter: Grant Neale
>            Priority: Minor
>
> Existing partition assignment strategies (RangeAssignor and 
> RoundRobinAssignor) do not account for the current consumer group lag on each 
> partition.  This can result in sub-optimal assignments when the distribution 
> of lags for a given topic and consumer group is skewed.
> The LagBasedAssignor operates on a per-topic basis, and attempts to assign 
> partitions such that lag is distributed as evenly as possible across a 
> consumer group.
> h4. Algorithm:
> For each topic, we first obtain the lag on all partitions. Lag on a given 
> partition is the difference between the end offset and the last offset 
> committed by the consumer group. If no offsets have been committed for a 
> partition we determine the lag based on the auto.offset.reset property. 
> If auto.offset.reset=latest, we assign a lag of 0. If 
> auto.offset.reset=earliest (or any other value) we assign a lag equal to 
> the total number of messages currently available in that partition.
> We then create a map storing the current total lag of all partitions assigned 
> to each member of the consumer group. Partitions are assigned in decreasing 
> order of lag, with each partition assigned to the consumer with least total 
> number of assigned partitions, breaking ties by assigning to the consumer 
> with the least total assigned lag.
> Distributing partitions evenly across consumers (by count) ensures that the 
> partition assignment is balanced when all partitions have a current lag of 0 
> or if the distribution of lags is heavily skewed. It also gives the consumer 
> group the best possible chance of remaining balanced if the assignment is 
> retained for a long period.
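
The algorithm quoted above can be sketched roughly as follows. This is only an
illustration in Python, not the proposed LagBasedAssignor code: the function
names (partition_lag, assign_topic), the begin_offset parameter, the clamping
of negative lag to 0, and the tie-breaking by partition id and consumer id are
all assumptions added to keep the example deterministic.

```python
def partition_lag(end_offset, committed_offset, begin_offset=0,
                  auto_offset_reset="latest"):
    """Lag of one partition, following the rules in the issue description."""
    if committed_offset is not None:
        # Assumption: clamp at 0 in case the committed offset is ahead.
        return max(end_offset - committed_offset, 0)
    if auto_offset_reset == "latest":
        return 0
    # "earliest" (or any other value): all currently available messages.
    return max(end_offset - begin_offset, 0)


def assign_topic(partition_lags, consumers):
    """Assign the partitions of a single topic.

    partition_lags maps partition id -> lag; consumers is a list of ids.
    """
    assignment = {c: [] for c in consumers}
    total_lag = {c: 0 for c in consumers}
    # Visit partitions in decreasing order of lag (partition id breaks ties;
    # that secondary key is an assumption, not part of the description).
    ordered = sorted(partition_lags.items(), key=lambda kv: (-kv[1], kv[0]))
    for partition, lag in ordered:
        # Fewest assigned partitions first, then least total assigned lag.
        target = min(consumers,
                     key=lambda c: (len(assignment[c]), total_lag[c], c))
        assignment[target].append(partition)
        total_lag[target] += lag
    return assignment


if __name__ == "__main__":
    lags = {0: 100, 1: 50, 2: 30, 3: 0}
    print(assign_topic(lags, ["c1", "c2"]))
    # prints {'c1': [0, 3], 'c2': [1, 2]}
```

In this example both consumers end up with two partitions each, with total
lags of 100 and 80.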



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
