philipnee opened a new pull request, #13550:
URL: https://github.com/apache/kafka/pull/13550

   This is a long story, but the incident started with KAFKA-13419, where we 
observed a member reporting a topic partition it had owned in a previous 
generation after it missed a rebalance cycle due to REBALANCE_IN_PROGRESS.
   
   Ideally, the member should continue holding onto its partition as long as 
there's no other owner with a younger generation; however, we need to be 
defensive about this approach because we aren't sure if the partition has 
already been assigned to other members.  Therefore, it is safest for us to only 
honor the members with the highest generation and the previous generation 
during the assignment phase.
   
   In this PR, I made 2 major changes:
   1. In the assignor: we now honor a partition owner that is on generation 
max - 1, as long as no other owner of that partition has a younger generation 
(younger = higher generationId).
   2. After getting a REBALANCE_IN_PROGRESS sync group error, the member 
immediately resets its generation, so that all of its owned partitions are 
claimed as lost if it doesn't rejoin in a timely manner.
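
   The assignor rule in change 1 can be sketched roughly as follows. This is 
only an illustration under stated assumptions, not the code in this PR: the 
`OwnershipSketch` class, the `Claim` type, and `resolveOwners` are hypothetical 
names, partitions are plain strings, and the handling of two members claiming 
the same partition at the same generation (neither claim is honored) is an 
assumption of the sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the ownership rule; not the actual assignor code.
class OwnershipSketch {

    // What a member reports when joining: its id, its last known generation,
    // and the partitions it believes it owns.
    static final class Claim {
        final String memberId;
        final int generation;
        final List<String> ownedPartitions;

        Claim(String memberId, int generation, List<String> ownedPartitions) {
            this.memberId = memberId;
            this.generation = generation;
            this.ownedPartitions = ownedPartitions;
        }
    }

    // Returns partition -> member whose ownership claim is honored.
    static Map<String, String> resolveOwners(List<Claim> claims) {
        Map<String, String> owners = new HashMap<>();
        if (claims.isEmpty()) {
            return owners;
        }

        int maxGeneration = Integer.MIN_VALUE;
        for (Claim claim : claims) {
            maxGeneration = Math.max(maxGeneration, claim.generation);
        }

        // Pass 1: per partition, find the youngest (highest) generation among
        // claims that are on max or max - 1. Older claims are ignored.
        Map<String, Integer> youngest = new HashMap<>();
        for (Claim claim : claims) {
            if (claim.generation < maxGeneration - 1) {
                continue;
            }
            for (String partition : claim.ownedPartitions) {
                youngest.merge(partition, claim.generation, Integer::max);
            }
        }

        // Pass 2: honor the claim that matches the youngest generation.
        // Assumption of this sketch: a tie at the same generation is a
        // conflict, and the partition is left without an honored owner.
        Set<String> conflicted = new HashSet<>();
        for (Claim claim : claims) {
            if (claim.generation < maxGeneration - 1) {
                continue;
            }
            for (String partition : claim.ownedPartitions) {
                if (claim.generation != youngest.get(partition)) {
                    continue; // a younger owner exists for this partition
                }
                if (owners.containsKey(partition)) {
                    conflicted.add(partition);
                } else {
                    owners.put(partition, claim.memberId);
                }
            }
        }
        owners.keySet().removeAll(conflicted);
        return owners;
    }
}
```

   In this sketch, a member on generation max - 1 keeps a partition only when 
no member on generation max also claims it, and any claim older than max - 1 
is dropped, so that partition is free to be reassigned.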

