Elias Levy created KAFKA-4608:
---------------------------------

             Summary: RocksDBWindowStore.fetch() is inefficient for large ranges
                 Key: KAFKA-4608
                 URL: https://issues.apache.org/jira/browse/KAFKA-4608
             Project: Kafka
          Issue Type: Improvement
          Components: streams
    Affects Versions: 0.10.1.1
            Reporter: Elias Levy


It is not unreasonable for a user to call {{RocksDBWindowStore.fetch}} to scan 
for a key across a large time range.  For instance, someone may call it with a 
{{timeFrom}} of zero or a {{timeTo}} of {{Long.MAX_VALUE}} in an attempt to fetch 
every entry for a key, from the beginning of time onward or up to the end of time.  

But if you do so, {{fetch}} will peg the CPU, as it attempts to iterate over 
every single segment id in the requested range, however few segments actually 
exist. That is very inefficient.  

{{fetch}} should trim the {{timeFrom}}/{{timeTo}} range based on the available 
time range in the {{segments}} hash map, so that it only iterates over the 
available time range.
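
A minimal sketch of the proposed trimming, not the actual Kafka Streams code. 
The class {{SegmentRangeSketch}}, both method names, the fixed-width 
{{segmentIntervalMs}}, and the {{liveSegmentIds}} set (standing in for the keys 
of the {{segments}} map) are all hypothetical and exist only for illustration.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Kafka Streams.
class SegmentRangeSketch {

    // Assumption: segments are fixed-width, so an id is the timestamp divided
    // by the interval. floorDiv keeps negative timestamps in the right bucket.
    static long segmentId(long timestamp, long segmentIntervalMs) {
        return Math.floorDiv(timestamp, segmentIntervalMs);
    }

    // Returns {firstId, lastId} clamped to the segments that actually exist,
    // or null when the requested range overlaps no live segment.
    static long[] trimmedSegmentRange(long timeFrom, long timeTo,
                                      long segmentIntervalMs,
                                      Set<Long> liveSegmentIds) {
        if (liveSegmentIds.isEmpty() || timeFrom > timeTo) {
            return null;
        }
        long minLive = Collections.min(liveSegmentIds);
        long maxLive = Collections.max(liveSegmentIds);

        // Clamp both ends so the loop length is bounded by the number of
        // live segments rather than by the width of the requested range.
        long from = Math.max(segmentId(timeFrom, segmentIntervalMs), minLive);
        long to = Math.min(segmentId(timeTo, segmentIntervalMs), maxLive);

        return from <= to ? new long[] {from, to} : null;
    }
}
```

With this clamping, a call with a {{timeFrom}} of zero and a {{timeTo}} of 
{{Long.MAX_VALUE}} visits only the ids between the oldest and newest live 
segment, instead of every id in the requested range.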



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
