ableegoldman commented on code in PR #13993: URL: https://github.com/apache/kafka/pull/13993#discussion_r1264175800
########## clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java: ########## @@ -119,6 +119,7 @@ public final class ConsumerCoordinator extends AbstractCoordinator { private Set<String> joinedSubscription; private MetadataSnapshot metadataSnapshot; private MetadataSnapshot assignmentSnapshot; + private boolean metadataUpdated; Review Comment: I think I would personally need to see benchmarks (somehow) to be convinced that just changing it to a `LinkedHashSet` is sufficient as an optimization, or even just guaranteed not to be a regression. Speaking from experience, I once tried to optimize the cache in Streams for better range-scan performance which involved swapping out the underlying data structure for a `LinkedHashMap`. The consequences were pretty dire, with the `LinkedHashMap` having comparably terrible performance characteristics when scaling up the keyspace. Granted it's not the exact same scenario because that was a map and also the Concurrent variant which surely means the scaling characteristics are not going to be exactly the same. But it has made me incredibly suspicious of the `LinkedHashXXX` data structures for use cases like this. All that said, if this is literally only used for an equality check and nothing else, why not just hash it or something like that? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org