Sophie Blee-Goldman created KAFKA-8767:
------------------------------------------

             Summary: Optimize StickyAssignor for Cooperative mode
                 Key: KAFKA-8767
                 URL: https://issues.apache.org/jira/browse/KAFKA-8767
             Project: Kafka
          Issue Type: Sub-task
            Reporter: Sophie Blee-Goldman


In some rare cases, the StickyAssignor will fail to balance an assignment 
without violating stickiness despite a balanced and sticky assignment being 
possible. The implications of this for cooperative rebalancing are that an 
unnecessary additional rebalance will be triggered.

This was seen to happen for example when each consumer is subscribed to some 
random subset of all topics and all their subscriptions change to a different 
random subset, as occurs in 
AbstractStickyAssignorTest#testReassignmentWithRandomSubscriptionsAndChanges.

The initial assignment after the random subscription change obviously involved 
migrating partitions, so following the cooperative protocol those partitions 
are removed from the balanced first assignment, and a second rebalance is 
triggered. In some cases, during the second rebalance the assignor was unable 
to reach a balanced assignment without migrating a few partitions, even though 
one must have been possible (since the first assignment was balanced). A third 
rebalance was needed to reach a stable balanced state.

Under the conditions in the previously mentioned test (between 20-40 consumers, 
10-20 topics (with 0-20 partitions) this third rebalance was required roughly 
30% of the time. Some initial improvements to the sticky assignment logic 
reduced this to under 15%, but we should consider closing this gap and 
optimizing the cooperative sticky assignment

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to