[ https://issues.apache.org/jira/browse/KAFKA-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guozhang Wang updated KAFKA-4608:
---------------------------------
    Comment: was deleted

(was: I have filed https://issues.apache.org/jira/browse/KAFKA-6560 to tackle this issue; it aims to use only point queries for window stores rather than range queries.)

> RocksDBWindowStore.fetch() is inefficient for large ranges
> ----------------------------------------------------------
>
>                 Key: KAFKA-4608
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4608
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.1.1
>            Reporter: Elias Levy
>            Priority: Major
>
> It is not unreasonable for a user to call {{RocksDBWindowStore.fetch}} to
> scan for a key across a large time range. For instance, someone may call it
> with a {{timeFrom}} of zero or a {{timeTo}} of max long in an attempt to
> fetch all entries matching a key across all time, forwards or backwards.
> But if you do so, {{fetch}} will peg the CPU, as it attempts to iterate over
> every single segment id in the requested range, whether or not a segment
> with that id exists. That is very inefficient.
> {{fetch}} should trim the {{timeFrom}}/{{timeTo}} range based on the
> available time range in the {{segments}} hash map, so that it only iterates
> over the available time range.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
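
The trimming proposed in the description could look roughly like the sketch below. It is not the RocksDBWindowStore code: the class SegmentRangeTrimmer, the method segmentsToScan, the fixed segmentIntervalMs, and the use of a sorted NavigableMap in place of the {{segments}} hash map are all assumptions made for illustration. The idea is to clamp the requested segment ids to the lowest and highest ids that actually exist, then visit only the segments present in that range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; names and structure are assumptions,
// not the actual Kafka Streams implementation.
class SegmentRangeTrimmer {

    // Maps a timestamp to a segment id, assuming fixed-width segments.
    static long segmentId(long timestampMs, long segmentIntervalMs) {
        return Math.floorDiv(timestampMs, segmentIntervalMs);
    }

    // Returns the ids of existing segments that overlap [timeFrom, timeTo].
    // The cost depends on the number of live segments, not on the width
    // of the requested time range.
    static List<Long> segmentsToScan(NavigableMap<Long, String> segments,
                                     long timeFrom, long timeTo,
                                     long segmentIntervalMs) {
        List<Long> result = new ArrayList<>();
        if (segments.isEmpty() || timeFrom > timeTo) {
            return result;
        }
        // Trim the requested id range to the available id range.
        long fromId = Math.max(segmentId(timeFrom, segmentIntervalMs), segments.firstKey());
        long toId = Math.min(segmentId(timeTo, segmentIntervalMs), segments.lastKey());
        if (fromId > toId) {
            return result; // requested range lies entirely outside the available segments
        }
        // Visit only segments that exist within the trimmed range.
        result.addAll(segments.subMap(fromId, true, toId, true).keySet());
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> segments = new TreeMap<>();
        segments.put(5L, "segment-5");
        segments.put(6L, "segment-6");
        segments.put(8L, "segment-8");
        // A fetch across "all time" touches three segments instead of
        // iterating over every possible segment id up to max long.
        System.out.println(segmentsToScan(segments, 0L, Long.MAX_VALUE, 1000L)); // prints [5, 6, 8]
    }
}
```

With an unsorted hash map, the same bounds could be obtained by tracking the minimum and maximum segment id as segments are created and dropped; the sorted map is used here only to keep the sketch short.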