Sophie Blee-Goldman created KAFKA-8951:
------------------------------------------

             Summary: Avoid unnecessary rebalances and downtime for "safe" 
partitions
                 Key: KAFKA-8951
                 URL: https://issues.apache.org/jira/browse/KAFKA-8951
             Project: Kafka
          Issue Type: Improvement
          Components: clients, streams
            Reporter: Sophie Blee-Goldman


With cooperative rebalancing, any partition that is encoded in one consumer's 
Subscription cannot be re-assigned to a different consumer during that 
rebalance. The partition must be removed from the assignment and revoked by its 
old owner before triggering a second rebalance during which it can be assigned. 
This enforces a synchronization barrier so that no two consumers can ever own 
the same partition at the same time.
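
To make the barrier concrete, here is a minimal sketch of the first-rebalance 
adjustment described above. It is not the actual Kafka assignor code: the class 
name, the `adjust` helper, and the use of plain strings for partitions are all 
assumptions made for illustration.

```java
import java.util.*;

public class CooperativeBarrierSketch {
    // Hypothetical helper: "owned" is what each consumer reported in its
    // Subscription, "target" is what the assignor would ideally hand out.
    // Returns what may actually be assigned in THIS rebalance.
    static Map<String, Set<String>> adjust(Map<String, Set<String>> owned,
                                           Map<String, Set<String>> target) {
        // Index each partition by its current owner.
        Map<String, String> ownerOf = new HashMap<>();
        owned.forEach((consumer, parts) -> parts.forEach(p -> ownerOf.put(p, consumer)));

        Map<String, Set<String>> result = new TreeMap<>();
        target.forEach((consumer, parts) -> {
            Set<String> allowed = new TreeSet<>();
            for (String p : parts) {
                String previous = ownerOf.get(p);
                // A partition still owned by a different consumer is withheld;
                // it can only be assigned after that owner has revoked it.
                if (previous == null || previous.equals(consumer)) {
                    allowed.add(p);
                }
            }
            result.put(consumer, allowed);
        });
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> owned =
            Map.of("A", Set.of("t-0", "t-1"), "B", Set.<String>of());
        Map<String, Set<String>> target =
            Map.of("A", Set.of("t-0"), "B", Set.of("t-1"));
        // t-1 is withheld from B until A revokes it, so a second rebalance follows.
        System.out.println(adjust(owned, target)); // prints {A=[t-0], B=[]}
    }
}
```

In this example t-1 is owned by nobody between the two rebalances, which is the 
downtime this ticket aims to avoid for "safe" partitions.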

This leads to downtime for that partition plus a second rebalance, which may 
not always be necessary. In Streams, for example, the consumer pauses all 
partitions of an active task until the task is running (i.e., has been 
initialized and restored). It should be safe to give these partitions away, 
provided they are not resumed between sending the joinGroup request and 
receiving the syncGroup response.
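
The "not resumed in between" condition could be enforced with a small guard 
like the one sketched below. This is only an illustration of the invariant: 
`SafePartitionGuard` and its methods are hypothetical names, not part of the 
consumer or Streams, and partitions are again modeled as strings.

```java
import java.util.*;

public class SafePartitionGuard {
    private final Set<String> paused = new HashSet<>();
    // Partitions offered up for the rebalance that is currently in flight.
    private Set<String> offered = Collections.emptySet();

    public void pause(String partition) {
        paused.add(partition);
    }

    // Assumed to be called just before the joinGroup request is sent:
    // every partition paused at that moment is treated as safe to give away.
    public Set<String> onJoinGroupSent() {
        offered = new HashSet<>(paused);
        return Collections.unmodifiableSet(offered);
    }

    // Resuming is refused while the partition is offered, since another
    // consumer may be handed it in the pending assignment.
    public boolean tryResume(String partition) {
        if (offered.contains(partition)) {
            return false;
        }
        return paused.remove(partition);
    }

    // Assumed to be called once the syncGroup response arrives: partitions
    // that were not assigned back are forgotten, the rest may be resumed.
    public void onSyncGroupReceived(Set<String> assigned) {
        paused.retainAll(assigned);
        offered = Collections.emptySet();
    }
}
```

With this guard, a task that finishes restoring mid-rebalance simply waits for 
the syncGroup response before its partitions are resumed.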

One proposal would be to modify two methods in the ConsumerPartitionAssignor 
interface.

1) ConsumerPartitionAssignor#subscriptionUserData would be passed the set of 
`ownedPartitions` that will be included in the subscription, allowing it to 
remove any that it knows are safe to give away.

2) ConsumerPartitionAssignor#onAssignment would be passed the set of revoked 
partitions, allowing it to remove any that it knows were already reassigned and 
should not trigger another rebalance.
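
A rough sketch of how the two modified hooks might look follows. The 
`ProposedAssignorHooks` interface and the `PausedAwareAssignor` class are 
hypothetical stand-ins for the proposal, not the existing 
ConsumerPartitionAssignor API, and partitions are modeled as strings rather 
than TopicPartition objects to keep the example self-contained.

```java
import java.nio.ByteBuffer;
import java.util.*;

public class ProposedAssignorSketch {
    // Hypothetical shape of the two modified methods from the proposal.
    public interface ProposedAssignorHooks {
        // May remove entries from ownedPartitions before they are encoded.
        ByteBuffer subscriptionUserData(Set<String> topics, Set<String> ownedPartitions);
        // May remove entries from revoked that need no follow-up rebalance.
        void onAssignment(Set<String> assigned, Set<String> revoked);
    }

    public static class PausedAwareAssignor implements ProposedAssignorHooks {
        private final Set<String> paused;
        private final Set<String> givenAway = new HashSet<>();

        public PausedAwareAssignor(Set<String> paused) {
            this.paused = new HashSet<>(paused);
        }

        @Override
        public ByteBuffer subscriptionUserData(Set<String> topics, Set<String> ownedPartitions) {
            givenAway.clear();
            // Paused partitions are not being processed, so stop claiming them.
            for (Iterator<String> it = ownedPartitions.iterator(); it.hasNext();) {
                String p = it.next();
                if (paused.contains(p)) {
                    it.remove();
                    givenAway.add(p);
                }
            }
            return ByteBuffer.allocate(0); // no user data in this sketch
        }

        @Override
        public void onAssignment(Set<String> assigned, Set<String> revoked) {
            // Partitions we offered up may already have a new owner, so
            // revoking them should not trigger another rebalance.
            revoked.removeAll(givenAway);
            givenAway.clear();
        }
    }

    public static void main(String[] args) {
        PausedAwareAssignor assignor = new PausedAwareAssignor(Set.of("t-1"));
        Set<String> owned = new TreeSet<>(Set.of("t-0", "t-1"));
        assignor.subscriptionUserData(Set.of("t"), owned);
        System.out.println(owned); // prints [t-0]

        Set<String> revoked = new TreeSet<>(Set.of("t-1"));
        assignor.onAssignment(Set.of("t-0"), revoked);
        System.out.println(revoked.isEmpty()); // prints true
    }
}
```

Passing mutable sets keeps the sketch close to the wording of the proposal 
("allowing it to remove any"); returning a filtered copy would be an equally 
plausible design.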



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
