[ 
https://issues.apache.org/jira/browse/KAFKA-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Schofield resolved KAFKA-20563.
--------------------------------------
    Fix Version/s: 4.4.0
       Resolution: Fixed

> Flaky test ShareConsumerRackAwareTest.testShareConsumerWithRackAwareAssignor
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-20563
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20563
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 4.3.0
>            Reporter: Sushant Mahajan
>            Assignee: Andrew Schofield
>            Priority: Minor
>             Fix For: 4.4.0
>
>
> [https://develocity.apache.org/scans/tests?search.buildToolType=gradle&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Asia%2FCalcutta&tests.container=org.apache.kafka.clients.consumer.ShareConsumerRackAwareTest&tests.test=testShareConsumerWithRackAwareAssignor(ClusterInstance)%5B1%5D]
> Preliminary analysis:
>
>   1. Test calls alterPartitionReassignments to move partitions between
>      brokers/racks.
>   2. During the transition, a share group heartbeat fires. Share groups have
>      an extra trigger, initializedAssignmentPending()
>      (GroupMetadataManager.java), that forces assignment recomputation on
>      every heartbeat when there are unassigned initialized partitions.
>      Combined with SHARE_GROUP_ASSIGNMENT_INTERVAL_MS_CONFIG=0, this means
>      every heartbeat triggers the assignor.
>   3. The RackAwareAssignor runs against transitional metadata where a
>      partition's rack set doesn't match any member. It throws
>      PartitionAssignorException.
>
>      [2026-05-10 21:15:09,514] ERROR [GroupCoordinator id=0] Operation
>      share-group-heartbeat with
>      ShareGroupHeartbeatRequestData(groupId='group0',
>      memberId='mMvKOe5MR0aBoDlFTKTTnA', memberEpoch=10, rackId=null,
>      subscribedTopicNames=null) hit an unexpected exception:
>      org.apache.kafka.common.errors.UnknownServerException: Failed to
>      compute a new target assignment for epoch 11: No member found for
>      racks [rack2] for partition 0 of topic TDeVaIP_Q2OWvedEfXl_ng.
>      (org.apache.kafka.coordinator.group.GroupCoordinatorService:54)
>      java.util.concurrent.CompletionException:
>      org.apache.kafka.common.errors.UnknownServerException: Failed to
>      compute a new target assignment for epoch 11: No member found for
>      racks [rack2] for partition 0 of topic TDeVaIP_Q2OWvedEfXl_ng
>
>   4. GroupMetadataManager.maybeUpdateTargetAssignment wraps it as
>      UnknownServerException.
>   5. AbstractHeartbeatRequestManager treats this as a fatal error,
>      transitioning the consumer member to FATAL state permanently.
>   6. On test teardown, ShareConsumerImpl.close() tries to leave the group,
>      encounters the UnknownServerException in the background event queue,
>      and throws KafkaException("Failed to close Kafka share consumer").
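
For illustration, steps 3 and 4 of the analysis can be sketched in plain Java. This is a minimal stand-alone sketch and not the Kafka code: RackMismatchSketch, pickMember, computeTarget, AssignorFailure and WrappedServerError are hypothetical names standing in for the assignor, GroupMetadataManager.maybeUpdateTargetAssignment, PartitionAssignorException and UnknownServerException, and the first-match selection is an assumed simplification of what a rack-aware assignor does.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of the failure chain; none of these names exist in Kafka.
final class RackMismatchSketch {

    /** Stand-in for the assignor-level failure (step 3). */
    static final class AssignorFailure extends RuntimeException {
        AssignorFailure(String message) {
            super(message);
        }
    }

    /** Stand-in for the coordinator-level wrapper error (step 4). */
    static final class WrappedServerError extends RuntimeException {
        WrappedServerError(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Returns the first member (in map iteration order) whose rack hosts a
     * replica of the partition. Throws if no member's rack matches, which is
     * the situation assumed to arise while a reassignment is in flight.
     */
    static String pickMember(int partition,
                             Set<String> partitionRacks,
                             Map<String, String> memberRacks) {
        for (Map.Entry<String, String> entry : memberRacks.entrySet()) {
            if (partitionRacks.contains(entry.getValue())) {
                return entry.getKey();
            }
        }
        throw new AssignorFailure("No member found for racks " + partitionRacks
                + " for partition " + partition);
    }

    /** Runs the assignor and wraps any assignor failure in a generic server error. */
    static String computeTarget(int epoch,
                                int partition,
                                Set<String> partitionRacks,
                                Map<String, String> memberRacks) {
        try {
            return pickMember(partition, partitionRacks, memberRacks);
        } catch (AssignorFailure e) {
            throw new WrappedServerError("Failed to compute a new target assignment for epoch "
                    + epoch + ": " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // TreeMap keeps member iteration order deterministic.
        Map<String, String> members = new TreeMap<>();
        members.put("m0", "rack0");
        members.put("m1", "rack1");

        // Steady state: the partition has a replica in a rack that has a member.
        System.out.println(computeTarget(10, 0, Set.of("rack1"), members));

        // Transitional state: the only replica rack has no member.
        try {
            computeTarget(11, 0, Set.of("rack2"), members);
        } catch (WrappedServerError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With members in rack0 and rack1 and a partition whose only replica rack is rack2, computeTarget throws the wrapped error. The client-side transition to FATAL (step 5) and the close() failure (step 6) are not modelled here.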



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
