[ 
https://issues.apache.org/jira/browse/KAFKA-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790080#comment-16790080
 ] 

Sophie Blee-Goldman commented on KAFKA-8094:
--------------------------------------------

That said, maybe this is another occasion to question whether a returned 
iterator *should* reflect the current state of the cache/store, or a snapshot 
of the state at the time it was created (i.e., when the store was queried).

Personally, I still believe the snapshot is more appropriate, and if it allows 
us to make this improvement I am all the more in favor of it (of course, this 
might not be a *huge* improvement, as it only saves us a factor of log(N)). 
WDYT, [~guozhang]?
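
To make the two semantics concrete, here is a minimal sketch. It is not the 
actual Streams cache code: the class and method names are hypothetical, and a 
plain ConcurrentSkipListMap<String, String> stands in for the cache. The live 
variant returns a view backed by the map; the snapshot variant copies the range 
at query time.

```java
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration only; not Kafka Streams code.
public class SnapshotVsLiveSketch {

    // Live semantics: the returned view is backed by the map, so writes
    // that land in [from, to] after the query are visible through it.
    static NavigableMap<String, String> liveRange(
            ConcurrentSkipListMap<String, String> cache, String from, String to) {
        return cache.subMap(from, true, to, true);
    }

    // Snapshot semantics: copy the range when the query is made, so later
    // writes are not visible. Note the copy itself is not atomic if other
    // threads write to the map while it is being made.
    static NavigableMap<String, String> snapshotRange(
            ConcurrentSkipListMap<String, String> cache, String from, String to) {
        return new TreeMap<>(cache.subMap(from, true, to, true));
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> cache = new ConcurrentSkipListMap<>();
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("d", "4");

        NavigableMap<String, String> live = liveRange(cache, "a", "d");
        NavigableMap<String, String> snapshot = snapshotRange(cache, "a", "d");

        cache.put("c", "3"); // write that happens after both queries

        System.out.println(live.containsKey("c"));     // true
        System.out.println(snapshot.containsKey("c")); // false
    }
}
```

The snapshot pays an up-front copy proportional to the size of the range, in 
exchange for results that do not change while the caller iterates.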

> Iterating over cache with get(key) is inefficient 
> --------------------------------------------------
>
>                 Key: KAFKA-8094
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8094
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Sophie Blee-Goldman
>            Priority: Major
>              Labels: streams
>
> Currently, range queries in the caching layer are implemented by creating an 
> iterator over the subset of keys in the range, and calling get() on the 
> underlying TreeMap for each key. While this protects against 
> ConcurrentModificationException, we can improve performance by replacing the 
> TreeMap with a concurrent data structure such as ConcurrentSkipListMap and 
> then just iterating over a subMap.
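
The difference described in the issue can be sketched as follows. This is a 
simplified illustration under assumptions, not the real caching-layer code: 
the class and method names are hypothetical, String keys and values stand in 
for the cache's actual types, and synchronization around the TreeMap is 
omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration only; not Kafka Streams code.
public class RangeScanSketch {

    // Approach described as current: copy the keys in the range, then look
    // each one up again. Every get() on the TreeMap costs O(log N).
    static List<String> rangeViaGet(TreeMap<String, String> cache, String from, String to) {
        // Copy the keys so the loop never iterates the map's own view.
        List<String> keys = new ArrayList<>(cache.subMap(from, true, to, true).keySet());
        List<String> values = new ArrayList<>();
        for (String key : keys) {
            String value = cache.get(key);
            if (value != null) { // the entry may have been removed since the copy
                values.add(value);
            }
        }
        return values;
    }

    // Proposed approach: iterate the subMap view of a ConcurrentSkipListMap
    // directly. Its iterators are weakly consistent and never throw
    // ConcurrentModificationException, so no per-key lookup is needed.
    static List<String> rangeViaSubMap(ConcurrentSkipListMap<String, String> cache,
                                       String from, String to) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> entry : cache.subMap(from, true, to, true).entrySet()) {
            values.add(entry.getValue());
        }
        return values;
    }

    public static void main(String[] args) {
        TreeMap<String, String> tree = new TreeMap<>();
        ConcurrentSkipListMap<String, String> skipList = new ConcurrentSkipListMap<>();
        for (String[] kv : new String[][] {{"a", "1"}, {"b", "2"}, {"c", "3"}, {"d", "4"}}) {
            tree.put(kv[0], kv[1]);
            skipList.put(kv[0], kv[1]);
        }
        System.out.println(rangeViaGet(tree, "b", "c"));        // [2, 3]
        System.out.println(rangeViaSubMap(skipList, "b", "c")); // [2, 3]
    }
}
```

Both methods return the same values for an unmodified map; the second avoids 
the extra O(log N) lookup per key.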



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
