[
https://issues.apache.org/jira/browse/KAFKA-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Schofield resolved KAFKA-20563.
--------------------------------------
Fix Version/s: 4.4.0
Resolution: Fixed
> Flaky test ShareConsumerRackAwareTest.testShareConsumerWithRackAwareAssignor
> ----------------------------------------------------------------------------
>
> Key: KAFKA-20563
> URL: https://issues.apache.org/jira/browse/KAFKA-20563
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 4.3.0
> Reporter: Sushant Mahajan
> Assignee: Andrew Schofield
> Priority: Minor
> Fix For: 4.4.0
>
>
> [https://develocity.apache.org/scans/tests?search.buildToolType=gradle&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.timeZoneId=Asia%2FCalcutta&tests.container=org.apache.kafka.clients.consumer.ShareConsumerRackAwareTest&tests.test=testShareConsumerWithRackAwareAssignor(ClusterInstance)%5B1%5D]
> Preliminary analysis:
>
>
> 1. Test calls alterPartitionReassignments to move partitions between
> brokers/racks.
> 2. During the transition, a share group heartbeat fires. Share groups have
> an extra trigger, initializedAssignmentPending() (GroupMetadataManager.java),
> that forces assignment recomputation on every heartbeat when there are
> unassigned initialized partitions. Combined with
> SHARE_GROUP_ASSIGNMENT_INTERVAL_MS_CONFIG=0, this means every heartbeat
> triggers the assignor.
> 3. The RackAwareAssignor runs against transitional metadata where a
> partition's rack set doesn't match any member. It throws
> PartitionAssignorException.
>
> [2026-05-10 21:15:09,514] ERROR [GroupCoordinator id=0] Operation
> share-group-heartbeat with ShareGroupHeartbeatRequestData(groupId='group0',
> memberId='mMvKOe5MR0aBoDlFTKTTnA', memberEpoch=10, rackId=null,
> subscribedTopicNames=null) hit an unexpected exception:
> org.apache.kafka.common.errors.UnknownServerException: Failed to compute a
> new target assignment for epoch 11: No member found for racks [rack2] for
> partition 0 of topic TDeVaIP_Q2OWvedEfXl_ng.
> (org.apache.kafka.coordinator.group.GroupCoordinatorService:54)
> java.util.concurrent.CompletionException:
> org.apache.kafka.common.errors.UnknownServerException: Failed to compute a
> new target assignment for epoch 11: No member found for racks [rack2] for
> partition 0 of topic TDeVaIP_Q2OWvedEfXl_ng
>
> 4. GroupMetadataManager.maybeUpdateTargetAssignment wraps the
> PartitionAssignorException as UnknownServerException.
> 5. AbstractHeartbeatRequestManager treats this as a fatal error and moves
> the consumer member permanently into the FATAL state.
> 6. On test teardown, ShareConsumerImpl.close() tries to leave the group,
> encounters the UnknownServerException in the background event queue, and
> throws KafkaException("Failed to close Kafka share consumer").
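
The failure chain in steps 3 to 5 can be modelled with a small standalone sketch. This is not Kafka code: every class, enum, and method name below (RackMismatchSketch, AssignorException, WrappedServerException, MemberState, assign, computeTarget, heartbeat) is a hypothetical stand-in, and the "first member on a matching rack" rule is an assumed simplification of what a rack-aware assignor does. It only illustrates how a partition whose racks match no member can turn a single heartbeat into a fatal member state.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class RackMismatchSketch {

    /** Hypothetical stand-in for the exception thrown by the assignor (step 3). */
    static class AssignorException extends RuntimeException {
        AssignorException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the generic server error seen by the client (step 4). */
    static class WrappedServerException extends RuntimeException {
        WrappedServerException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Simplified member states; the real client has more. */
    enum MemberState { STABLE, FATAL }

    /** Returns the first member whose rack is one of the partition's racks. */
    static String assign(int partition, Set<String> partitionRacks,
                         Map<String, String> memberRacks) {
        for (Map.Entry<String, String> member : memberRacks.entrySet()) {
            if (partitionRacks.contains(member.getValue())) {
                return member.getKey();
            }
        }
        // No member sits on any rack that hosts this partition.
        throw new AssignorException("No member found for racks " + partitionRacks
                + " for partition " + partition);
    }

    /** Models the coordinator wrapping the assignor failure in a generic error. */
    static String computeTarget(int epoch, int partition, Set<String> partitionRacks,
                                Map<String, String> memberRacks) {
        try {
            return assign(partition, partitionRacks, memberRacks);
        } catch (AssignorException e) {
            throw new WrappedServerException(
                    "Failed to compute a new target assignment for epoch " + epoch
                            + ": " + e.getMessage(), e);
        }
    }

    /** Models the client treating the generic error as fatal (step 5). */
    static MemberState heartbeat(int epoch, int partition, Set<String> partitionRacks,
                                 Map<String, String> memberRacks) {
        try {
            computeTarget(epoch, partition, partitionRacks, memberRacks);
            return MemberState.STABLE;
        } catch (WrappedServerException e) {
            return MemberState.FATAL;
        }
    }

    public static void main(String[] args) {
        Map<String, String> members = new LinkedHashMap<>();
        members.put("member-a", "rack0");
        members.put("member-b", "rack1");

        // Settled metadata: the partition lives on a rack that has a member.
        System.out.println(heartbeat(11, 0, Set.of("rack0"), members));
        // Transitional metadata: the partition is only on rack2, which has no member.
        System.out.println(heartbeat(11, 0, Set.of("rack2"), members));
    }
}
```

In this model the second heartbeat is the one that hits the reassignment window: the mismatch is transient, but the member state it produces is not.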