Re: [PR] KAFKA-18757: Create full-function SimpleAssignor to match KIP-932 description [kafka]

via GitHub Sun, 23 Feb 2025 23:38:35 -0800


adixitconfluent commented on code in PR #18864:
URL: https://github.com/apache/kafka/pull/18864#discussion_r1967144265



##########
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/assignor/SimpleAssignor.java:
##########
@@ -67,42 +71,210 @@ private GroupAssignment assignHomogenous(
         GroupSpec groupSpec,
         SubscribedTopicDescriber subscribedTopicDescriber
     ) {
-        Set<Uuid> subscribeTopicIds = groupSpec.memberSubscription(groupSpec.memberIds().iterator().next())
+        Set<Uuid> subscribedTopicIds = groupSpec.memberSubscription(groupSpec.memberIds().iterator().next())
             .subscribedTopicIds();
-        if (subscribeTopicIds.isEmpty())
-            return new GroupAssignment(Collections.emptyMap());
+        if (subscribedTopicIds.isEmpty())
+            return new GroupAssignment(Map.of());
 
-        Map<Uuid, Set<Integer>> targetPartitions = computeTargetPartitions(
-            subscribeTopicIds, subscribedTopicDescriber);
+        // Subscribed topic partitions for the share group.
+        List<TopicIdPartition> targetPartitions = computeTargetPartitions(
+            subscribedTopicIds, subscribedTopicDescriber);
 
-        return new GroupAssignment(groupSpec.memberIds().stream().collect(Collectors.toMap(
-            Function.identity(), memberId -> new MemberAssignmentImpl(targetPartitions))));
+        // The current assignment from topic partition to members.
+        Map<TopicIdPartition, List<String>> currentAssignment = currentAssignment(groupSpec);
+        return newAssignmentHomogeneous(groupSpec, subscribedTopicIds, targetPartitions, currentAssignment);
     }
 
     private GroupAssignment assignHeterogeneous(
         GroupSpec groupSpec,
         SubscribedTopicDescriber subscribedTopicDescriber
     ) {
-        Map<String, MemberAssignment> members = new HashMap<>();
+        Map<String, List<TopicIdPartition>> memberToPartitionsSubscription = new HashMap<>();
         for (String memberId : groupSpec.memberIds()) {
             MemberSubscription spec = groupSpec.memberSubscription(memberId);
             if (spec.subscribedTopicIds().isEmpty())
                 continue;
 
-            Map<Uuid, Set<Integer>> targetPartitions = computeTargetPartitions(
+            // Subscribed topic partitions for the share group member.
+            List<TopicIdPartition> targetPartitions = computeTargetPartitions(
                 spec.subscribedTopicIds(), subscribedTopicDescriber);
+            memberToPartitionsSubscription.put(memberId, targetPartitions);
+        }
+
+        // The current assignment from topic partition to members.
+        Map<TopicIdPartition, List<String>> currentAssignment = currentAssignment(groupSpec);
+        return newAssignmentHeterogeneous(groupSpec, memberToPartitionsSubscription, currentAssignment);
+    }
+
+    /**
+     * Get the current assignment by topic partitions.
+     * @param groupSpec - The group metadata specifications.
+     * @return the current assignment for subscribed topic partitions to memberIds.
+     */
+    private Map<TopicIdPartition, List<String>> currentAssignment(GroupSpec groupSpec) {
+        Map<TopicIdPartition, List<String>> assignment = new HashMap<>();
 
-            members.put(memberId, new MemberAssignmentImpl(targetPartitions));
+        for (String member : groupSpec.memberIds()) {
+            Map<Uuid, Set<Integer>> assignedTopicPartitions = groupSpec.memberAssignment(member).partitions();
+            assignedTopicPartitions.forEach((topicId, partitions) -> partitions.forEach(
+                partition -> assignment.computeIfAbsent(new TopicIdPartition(topicId, partition), k -> new ArrayList<>()).add(member)));
         }
+        return assignment;
+    }
+
+    private GroupAssignment newAssignmentHomogeneous(
+        GroupSpec groupSpec,
+        Set<Uuid> subscribedTopicIds,
+        List<TopicIdPartition> targetPartitions,
+        Map<TopicIdPartition, List<String>> currentAssignment
+    ) {
+
+        Map<TopicIdPartition, List<String>> newAssignment = new HashMap<>();
+
+        // Step 1: Hash member IDs to partitions.
+        memberHashAssignment(targetPartitions, groupSpec.memberIds(), newAssignment);
+
+        // Step 2: Round-robin assignment for unassigned partitions which do not have members already assigned in the current assignment.
+        Set<TopicIdPartition> assignedPartitions = newAssignment.keySet();
+        List<TopicIdPartition> unassignedPartitions = targetPartitions.stream()
+            .filter(targetPartition -> !assignedPartitions.contains(targetPartition))
+            .filter(targetPartition -> !currentAssignment.containsKey(targetPartition))
+            .toList();
+
+        roundRobinAssignment(groupSpec.memberIds(), unassignedPartitions, newAssignment);
+
+        // Step 3: We combine current assignment and new assignment.
+        Map<String, Set<TopicIdPartition>> finalAssignment = new HashMap<>();
+
+        // When combining current assignment, we need to only consider the topics in current assignment that are also being
+        // subscribed in the new assignment as well.
+        currentAssignment.forEach((targetPartition, members) -> {
+            if (subscribedTopicIds.contains(targetPartition.topicId()))
+                members.forEach(member -> {
+                    if (groupSpec.memberIds().contains(member))
+                        finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition);
+                });
+        });
+        newAssignment.forEach((targetPartition, members) -> members.forEach(member ->
+            finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition)));
+
+        return groupAssignment(finalAssignment, groupSpec.memberIds());
+    }
+
+    private GroupAssignment newAssignmentHeterogeneous(
+        GroupSpec groupSpec,
+        Map<String, List<TopicIdPartition>> memberToPartitionsSubscription,
+        Map<TopicIdPartition, List<String>> currentAssignment
+    ) {
+
+        // Exhaustive set of all subscribed topic partitions.
+        Set<TopicIdPartition> targetPartitions = new LinkedHashSet<>();
+        memberToPartitionsSubscription.values().forEach(targetPartitions::addAll);
+
+        // Create a map for topic to members subscription.
+        Map<Uuid, Set<String>> topicToMemberSubscription = new HashMap<>();
+        memberToPartitionsSubscription.forEach((member, partitions) ->
+            partitions.forEach(partition -> topicToMemberSubscription.computeIfAbsent(partition.topicId(), k -> new LinkedHashSet<>()).add(member)));
+
+        Map<TopicIdPartition, List<String>> newAssignment = new HashMap<>();
+
+        // Step 1: Hash member IDs to partitions.
+        memberToPartitionsSubscription.forEach((member, partitions) ->
+            memberHashAssignment(partitions, List.of(member), newAssignment));
+
+        // Step 2: Round-robin assignment for unassigned partitions which do not have members already assigned in the current assignment.
+        Set<TopicIdPartition> assignedPartitions = new LinkedHashSet<>(newAssignment.keySet());
+        Map<Uuid, List<TopicIdPartition>> unassignedPartitions = new HashMap<>();
+        targetPartitions.forEach(targetPartition -> {
+            if (!assignedPartitions.contains(targetPartition) && !currentAssignment.containsKey(targetPartition))
+                unassignedPartitions.computeIfAbsent(targetPartition.topicId(), k -> new ArrayList<>()).add(targetPartition);
+        });
+
+        unassignedPartitions.keySet().forEach(unassignedTopic ->
+            roundRobinAssignment(topicToMemberSubscription.get(unassignedTopic), unassignedPartitions.get(unassignedTopic), newAssignment));
+
+        // Step 3: We combine current assignment and new assignment.
+        Map<String, Set<TopicIdPartition>> finalAssignment = new HashMap<>();
+
+        // When combining current assignment, we need to only consider the member topic subscription in current assignment
+        // which is being subscribed in the new assignment as well.
+        currentAssignment.forEach((targetPartition, members) -> members.forEach(member -> {
+            if (topicToMemberSubscription.getOrDefault(targetPartition.topicId(), Collections.emptySet()).contains(member))
+                finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition);
+        }));
+        newAssignment.forEach((targetPartition, members) -> members.forEach(member ->
+            finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition)));
+
+        return groupAssignment(finalAssignment, groupSpec.memberIds());
+    }
+
+    private GroupAssignment groupAssignment(
+        Map<String, Set<TopicIdPartition>> assignmentByMember,
+        Collection<String> allGroupMembers
+    ) {
+        Map<String, MemberAssignment> members = new HashMap<>();
+        for (Map.Entry<String, Set<TopicIdPartition>> entry : assignmentByMember.entrySet()) {
+            Map<Uuid, Set<Integer>> targetPartitions = new HashMap<>();
+            entry.getValue().forEach(targetPartition -> targetPartitions.computeIfAbsent(targetPartition.topicId(), k -> new HashSet<>()).add(targetPartition.partitionId()));
+            members.put(entry.getKey(), new MemberAssignmentImpl(targetPartitions));
+        }
+        allGroupMembers.forEach(member -> {
+            if (!members.containsKey(member))
+                members.put(member, new MemberAssignmentImpl(new HashMap<>()));
+        });
+
         return new GroupAssignment(members);
     }
 
-    private Map<Uuid, Set<Integer>> computeTargetPartitions(
-        Set<Uuid> subscribeTopicIds,
+    /**
+     * This function updates assignment by hashing the member IDs of the members and maps the partitions assigned to the members based on the hash. This gives approximately even balance.
+     * @param targetPartitions - the subscribed topic partitions.
+     * @param memberIds - the member ids to which the topic partitions need to be assigned.
+     * @param assignment - the existing assignment by topic partition. We need to pass it as a parameter because this
+     *                   function would be called multiple times for heterogeneous assignment.
+     */
+    void memberHashAssignment(
+        List<TopicIdPartition> targetPartitions,
+        Collection<String> memberIds,
+        Map<TopicIdPartition, List<String>> assignment
+    ) {
+        if (!targetPartitions.isEmpty())
+            for (String memberId : memberIds) {
+                int topicPartitionIndex = Math.abs(memberId.hashCode() % targetPartitions.size());

Review Comment:
   @apoorvmittal10, IMO stickiness will be affected whenever there is a change
in the members or in the subscribed topic partitions. A future PR will aim to
`Improve simple assignor by having a max no. of partitions limit and min no. of
partitions threshold assignment to members`. The description of the
[simple assignor rules](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255070434#KIP932:QueuesforKafka-TheSimpleAssignor)
does not factor in stickiness, so I think that should be addressed in a future
PR as well.
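
To make the stickiness concern concrete, here is a minimal sketch, not the PR's code, that isolates the index computation from the line under review and reports which members land on a different index when the number of target partitions changes. The class name `HashStickinessSketch`, the helper names, and the member IDs are hypothetical and chosen only for illustration; the sketch covers only the hash step, not the round-robin step or the merge with the current assignment.

```java
import java.util.ArrayList;
import java.util.List;

public class HashStickinessSketch {

    // Same arithmetic as the reviewed line: hash the member ID and reduce it
    // modulo the number of target partitions. The remainder's magnitude is
    // always below partitionCount, so Math.abs cannot overflow here.
    static int partitionIndex(String memberId, int partitionCount) {
        return Math.abs(memberId.hashCode() % partitionCount);
    }

    // Returns the members whose hashed index differs between two partition
    // counts, i.e. the members that would not be "sticky" in the hash step.
    static List<String> movedMembers(List<String> memberIds, int before, int after) {
        List<String> moved = new ArrayList<>();
        for (String memberId : memberIds) {
            if (partitionIndex(memberId, before) != partitionIndex(memberId, after))
                moved.add(memberId);
        }
        return moved;
    }

    public static void main(String[] args) {
        // Hypothetical member IDs, for illustration only.
        List<String> members = List.of("member-a", "member-b", "member-c", "member-d");
        for (String member : members) {
            System.out.printf("%s: index with 4 partitions = %d, with 5 partitions = %d%n",
                member, partitionIndex(member, 4), partitionIndex(member, 5));
        }
        System.out.println("Members whose index changed: " + movedMembers(members, 4, 5));
    }
}
```

Because the index depends on the size of the target partition list, growing or shrinking that list can move a member to a different partition even though the member itself did not change.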
   
   @TaiJuWu, sorry, I couldn't understand your comment. Do you mean that we
shouldn't use `hashCode` and should instead use something like `hash128` from
`Murmur3`? Is there a reason for that suggestion?
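
For reference, the choice of hash function can be separated from the index computation. The sketch below is an illustration under stated assumptions, not a proposal for the PR: it passes the hash as a parameter and uses `java.util.zip.CRC32` purely as a JDK-only stand-in for an alternative hash, since Murmur3 is not part of the JDK. The class name `PluggableHashSketch` and its helpers are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.ToIntFunction;
import java.util.zip.CRC32;

public class PluggableHashSketch {

    // Same reduction as the reviewed line, but with the hash function
    // supplied by the caller instead of hard-coding String.hashCode().
    static int partitionIndex(String memberId, int partitionCount, ToIntFunction<String> hash) {
        return Math.abs(hash.applyAsInt(memberId) % partitionCount);
    }

    // Stand-in alternative hash (assumption: CRC32 is used here only because
    // it ships with the JDK; it is not what the review comment suggests).
    static int crc32Hash(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return (int) crc.getValue();
    }

    public static void main(String[] args) {
        String memberId = "member-a"; // hypothetical member ID
        int partitions = 4;
        System.out.println("hashCode index: "
            + partitionIndex(memberId, partitions, String::hashCode));
        System.out.println("crc32 index:    "
            + partitionIndex(memberId, partitions, PluggableHashSketch::crc32Hash));
    }
}
```

Either hash yields an index in the range `[0, partitionCount)`; the two may simply disagree on which index a given member gets, so swapping the hash changes the distribution but not the stickiness behaviour discussed above.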



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
