[ https://issues.apache.org/jira/browse/KAFKA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790080#comment-16790080 ]
Sophie Blee-Goldman commented on KAFKA-8094:
--------------------------------------------

That said, maybe this is another occasion to question whether a returned iterator *should* reflect the current state of the cache/store, or the state at the time it was created (i.e. when it was queried) as a snapshot. Personally I still believe the snapshot is more appropriate, and if it allows us to make this improvement I am all the more in favor of it (of course, this might not be a *huge* improvement, as it only saves us a factor of log(N)). WDYT [~guozhang]

> Iterating over cache with get(key) is inefficient
> --------------------------------------------------
>
>                 Key: KAFKA-8094
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8094
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Sophie Blee-Goldman
>            Priority: Major
>              Labels: streams
>
> Currently, range queries in the caching layer are implemented by creating an
> iterator over the subset of keys in the range, and calling get() on the
> underlying TreeMap for each key. While this protects against
> ConcurrentModificationException, we can improve performance by replacing the
> TreeMap with a concurrent data structure such as ConcurrentSkipListMap and
> then just iterating over a subMap.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
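
For illustration, the two range-query strategies contrasted in the issue description might be sketched as follows. This is a minimal sketch and not the actual Kafka Streams cache code: the class name RangeIterationSketch, the methods rangeViaGet and rangeViaSubMap, and the String key/value types are all hypothetical. The first method copies the keys in the range and then calls get() for each one, which costs an O(log N) lookup per key; the second walks a subMap view of a ConcurrentSkipListMap directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch; names and types are assumptions, not Kafka Streams classes.
class RangeIterationSketch {

    // Strategy described as the current one: copy the keys in [from, to],
    // then look each key up again with get(). Each get() is an O(log N)
    // tree lookup, but no live TreeMap iterator is held between lookups.
    static List<String> rangeViaGet(TreeMap<String, String> cache, String from, String to) {
        List<String> keys;
        synchronized (cache) {
            keys = new ArrayList<>(cache.subMap(from, true, to, true).keySet());
        }
        List<String> values = new ArrayList<>();
        for (String key : keys) {
            String value;
            synchronized (cache) {
                value = cache.get(key);
            }
            if (value != null) { // the key may have been removed in the meantime
                values.add(value);
            }
        }
        return values;
    }

    // Proposed strategy: iterate a subMap view of a ConcurrentSkipListMap.
    // Its iterators are weakly consistent and never throw
    // ConcurrentModificationException, so no per-key lookup is needed.
    static List<String> rangeViaSubMap(ConcurrentSkipListMap<String, String> cache,
                                       String from, String to) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> entry : cache.subMap(from, true, to, true).entrySet()) {
            values.add(entry.getValue());
        }
        return values;
    }

    public static void main(String[] args) {
        TreeMap<String, String> tree = new TreeMap<>();
        ConcurrentSkipListMap<String, String> skipList = new ConcurrentSkipListMap<>();
        String[][] data = {{"a", "1"}, {"b", "2"}, {"c", "3"}, {"d", "4"}};
        for (String[] kv : data) {
            tree.put(kv[0], kv[1]);
            skipList.put(kv[0], kv[1]);
        }
        System.out.println(rangeViaGet(tree, "b", "c"));
        System.out.println(rangeViaSubMap(skipList, "b", "c"));
    }
}
```

Note that the subMap iterator in the second method is weakly consistent: it may or may not reflect writes made after it was created. If snapshot semantics are wanted, as suggested in the comment above, the range would have to be copied when the iterator is created.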