[ 
https://issues.apache.org/jira/browse/KAFKA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497236#comment-17497236
 ] 

Zhongda Zhao edited comment on KAFKA-4212 at 2/24/22, 9:29 AM:
---------------------------------------------------------------

Through: 
[https://users.kafka.apache.narkive.com/DLplc75h/keyvaluestore-implementation-that-allows-retention-policy]

We had assumed that {{Materialized.withRetention}} could be applied to a normal 
{{KeyValueStore}}, giving us a compacted change-log topic with a matching 
{{retention.ms}} and a RocksDB store with a matching TTL in seconds.

Our use case: we want to use kafka-streams to count a huge number of records, 
grouped by different key combinations, on a monthly basis. Because months 
differ in length, we did not use time windows, whose size is fixed (we also 
have to deal with out-of-order records without a deterministic grace period). 
Our current workaround is to make the year-month of the event time part of the 
group key. For interactive queries we just do a prefix scan. This worked for 
our purpose until we found out that {{withRetention}} is not applicable to a 
normal {{KeyValueStore}} and that the change-log topic does not get a matching 
retention either (the latter might be possible with the Processor API).
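
The workaround described above can be sketched roughly as follows. This is only an illustration in plain Java, not the code from the comment: the class {{MonthlyCounts}}, its methods, the {{|}} separator and the use of UTC are all assumptions, and a sorted in-memory map stands in for the state store. Expiry is done by hand here, which is exactly the part a store-level retention setting would take over.

```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for a sorted key-value store; not a Kafka Streams API.
class MonthlyCounts {
    private static final char SEP = '|';
    private final TreeMap<String, Long> store = new TreeMap<>();

    // "2022-02|user-a": ISO year-month first, so keys of one month are contiguous.
    static String compositeKey(YearMonth month, String groupKey) {
        return month.toString() + SEP + groupKey;
    }

    // Count one record under the year-month of its event time (UTC assumed).
    void record(Instant eventTime, String groupKey) {
        YearMonth month = YearMonth.from(eventTime.atZone(ZoneOffset.UTC));
        store.merge(compositeKey(month, groupKey), 1L, Long::sum);
    }

    // Prefix scan: every key that starts with "<year-month>|".
    SortedMap<String, Long> scanMonth(YearMonth month) {
        String from = month.toString() + SEP;
        String to = month.toString() + (char) (SEP + 1); // first string past the prefix
        return store.subMap(from, to);
    }

    // Manual retention: drop all months before oldestToKeep, return the number of keys removed.
    int expireBefore(YearMonth oldestToKeep) {
        SortedMap<String, Long> expired = store.headMap(oldestToKeep.toString() + SEP);
        int removed = expired.size();
        expired.clear(); // clearing the view removes the entries from the backing map
        return removed;
    }

    public static void main(String[] args) {
        MonthlyCounts counts = new MonthlyCounts();
        counts.record(Instant.parse("2022-01-31T23:00:00Z"), "user-a");
        counts.record(Instant.parse("2022-02-24T09:29:00Z"), "user-a");
        counts.record(Instant.parse("2022-02-25T09:29:00Z"), "user-a");
        System.out.println(counts.scanMonth(YearMonth.of(2022, 2)));
        System.out.println(counts.expireBefore(YearMonth.of(2022, 2)));
    }
}
```

Because the year-month is the key prefix, a late record simply lands in the month it belongs to, so no grace period is needed; the cost is that old months stay in the store until something deletes them.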

That said, we would prefer a Kafka-level retention policy, so that the 
change-log topic and the underlying store have consistent settings. 
Implementation details such as the RocksDB TTL in seconds could then be hidden. 
(Or perhaps this should be a separate ticket.)

Any alternative solution suggestions for our use case are more than welcome.


was (Author: kenix):
[Same text as above, without the parenthetical "(or this should be another ticket)".]

> Add a key-value store that is a TTL persistent cache
> ----------------------------------------------------
>
>                 Key: KAFKA-4212
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4212
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.0.1
>            Reporter: Elias Levy
>            Priority: Major
>              Labels: api
>
> Some jobs need to maintain as state a large set of key-values for some 
> period of time, i.e. they need to maintain a TTL cache of values potentially 
> larger than memory. 
> Currently Kafka Streams provides non-windowed and windowed key-value stores.  
> Neither is an exact fit for this use case.  
> The {{RocksDBStore}}, a {{KeyValueStore}}, stores one value per key as 
> required, but does not support expiration.  The TTL option of RocksDB is 
> explicitly not used.
> The {{RocksDBWindowStore}}, a {{WindowStore}}, can expire items via segment 
> dropping, but it stores multiple items per key, based on their timestamp.  
> However, this store can be repurposed as a cache by fetching the items in 
> reverse chronological order and returning the first item found.
> KAFKA-2594 introduced a fixed-capacity in-memory LRU caching store, but here 
> we desire a variable-capacity memory-overflowing TTL caching store.
> Although {{RocksDBWindowStore}} can be repurposed as a cache, it would be 
> useful to have an official and proper TTL cache API and implementation.
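
The repurposing idea in the issue description (expire by dropping whole segments, read by walking backwards in time and taking the first hit) might look roughly like the sketch below. It is plain Java over in-memory maps, not the Kafka Streams window store: the class {{SegmentedTtlCache}}, its fields and the rule that expiry follows the newest timestamp seen so far are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical TTL cache built on time segments; not a Kafka Streams class.
class SegmentedTtlCache<K, V> {
    private final long segmentMs;
    private final long retentionMs;
    // segment id -> key -> (timestamp -> value); segments are sorted by time
    private final TreeMap<Long, Map<K, TreeMap<Long, V>>> segments = new TreeMap<>();
    private long newestTimestampMs = Long.MIN_VALUE;

    SegmentedTtlCache(long segmentMs, long retentionMs) {
        this.segmentMs = segmentMs;
        this.retentionMs = retentionMs;
    }

    void put(K key, V value, long timestampMs) {
        newestTimestampMs = Math.max(newestTimestampMs, timestampMs);
        // Expiry by segment dropping: remove every segment older than the retention.
        segments.headMap(minLiveSegment()).clear();
        long segmentId = Math.floorDiv(timestampMs, segmentMs);
        if (segmentId < minLiveSegment()) {
            return; // record is already past retention, ignore it
        }
        segments.computeIfAbsent(segmentId, id -> new HashMap<>())
                .computeIfAbsent(key, k -> new TreeMap<>())
                .put(timestampMs, value);
    }

    // Walk segments newest-first and return the first (most recent) value found.
    V get(K key) {
        for (Map<K, TreeMap<Long, V>> segment : segments.descendingMap().values()) {
            TreeMap<Long, V> versions = segment.get(key);
            if (versions != null) {
                return versions.lastEntry().getValue();
            }
        }
        return null; // never written, or expired with its segment
    }

    private long minLiveSegment() {
        return Math.floorDiv(newestTimestampMs - retentionMs, segmentMs);
    }

    public static void main(String[] args) {
        SegmentedTtlCache<String, String> cache = new SegmentedTtlCache<>(10, 30);
        cache.put("k", "v1", 5);
        cache.put("k", "v2", 12);
        System.out.println(cache.get("k"));
        cache.put("other", "x", 100); // advances time far enough to drop the old segments
        System.out.println(cache.get("k"));
    }
}
```

Expiry in this sketch is coarse: an entry disappears only when its whole segment is dropped, so it can outlive the retention by up to one segment length, and every lookup may touch several segments. Those costs are part of why a dedicated TTL store would be preferable to the repurposed one.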



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
