showuon commented on a change in pull request #11347:
URL: https://github.com/apache/kafka/pull/11347#discussion_r713720965



##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamsPartitionAssignor.java
##########
@@ -351,6 +351,17 @@ public GroupAssignment assign(final Cluster metadata, final GroupSubscription gr
 
             // add the consumer and any info in its subscription to the client
             clientMetadata.addConsumer(consumerId, subscription.ownedPartitions());
+            if (allOwnedPartitions.stream().anyMatch(t -> subscription.ownedPartitions().contains(t))) {

Review comment:
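   I think running this check for every consumer is quite expensive when the number of partitions is large, because it streams over all of `allOwnedPartitions` each time. So I think we can do a "lazy check" instead. Because `allOwnedPartitions` is a `Set`, adding partitions that are already present does not grow it, so we can detect duplicated partitions by comparing the size after the add with the sum of the two sizes before it. That is: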
   
   ```java
   // get the partition size we're going to add
   int consumerOwnedSize = subscription.ownedPartitions().size();
   int prevSize = allOwnedPartitions.size();
   
   allOwnedPartitions.addAll(subscription.ownedPartitions());
   
   if (allOwnedPartitions.size() < prevSize + consumerOwnedSize) {
       // duplicated partitions found in 2 consumers
       // log a warning here
       // if we want to find out which partitions caused the problem, we can also iterate over them here.
   }
   ```
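
To make the idea concrete, here is a minimal standalone sketch of the size comparison. It rests on several assumptions that are not in the PR: the class and method names are hypothetical, partitions are modeled as plain `String`s instead of `TopicPartition` so it runs without Kafka on the classpath, and each consumer's owned list is assumed to contain no repeats of its own. The second method is a variant rather than the size check itself: it relies on the return value of `Set.add` to report which partitions were already claimed, in the same single pass.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of StreamsPartitionAssignor.
class OwnedPartitionsCheck {

    // Size-sum check: returns true if at least one element of `owned`
    // was already present in `allOwned`. Assumes `owned` has no internal repeats.
    static boolean addAllDetectingOverlap(final Set<String> allOwned,
                                          final Collection<String> owned) {
        final int prevSize = allOwned.size();
        allOwned.addAll(owned);
        // A Set does not grow when an existing element is added again,
        // so a shortfall against the size sum means some partition was claimed twice.
        return allOwned.size() < prevSize + owned.size();
    }

    // Variant: Set.add returns false for an element that is already present,
    // so the duplicated partitions can be collected while adding.
    static Set<String> addAndCollectDuplicates(final Set<String> allOwned,
                                               final Collection<String> owned) {
        final Set<String> duplicates = new HashSet<>();
        for (final String partition : owned) {
            if (!allOwned.add(partition)) {
                duplicates.add(partition);
            }
        }
        return duplicates;
    }

    public static void main(final String[] args) {
        final Set<String> allOwned = new HashSet<>();
        // Consumer A claims t-0 and t-1; consumer B claims t-1 and t-2.
        System.out.println(addAllDetectingOverlap(allOwned, List.of("t-0", "t-1")));
        System.out.println(addAllDetectingOverlap(allOwned, List.of("t-1", "t-2")));

        final Set<String> fresh = new HashSet<>();
        addAndCollectDuplicates(fresh, List.of("t-0", "t-1"));
        System.out.println(addAndCollectDuplicates(fresh, List.of("t-1", "t-2")));
    }
}
```

Both methods do one hash-set insertion per owned partition, so the cost per consumer is proportional to the size of its own owned list and not to the size of `allOwnedPartitions`.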
   
   WDYT? Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


