Hi Lake

Flink leverages RocksDB's background compaction mechanism to filter out entries 
whose TTL has expired (by comparing each entry's timestamp with the current 
timestamp provided by RocksDB's time_provider), so that expired entries do not 
survive into the newly compacted data.

The compaction filter iterates over data entries via 
FlinkCompactionFilter::FilterV2 [1], and the parameter 'queryTimeAfterNumEntries' 
on the Flink side corresponds to the threshold 'query_time_after_num_entries_' in 
FRocksDB [2]. Once RocksDB has iterated over that many entries, e.g. 1000, it 
calls time_provider again to refresh the current timestamp, which makes the 
cleanup process more eager and accurate.

[1] 
https://github.com/dataArtisans/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/utilities/flink/flink_compaction_filter.cc#L107
[2] 
https://github.com/dataArtisans/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/utilities/flink/flink_compaction_filter.h#L140
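
To make this concrete, here is a minimal Java sketch of that logic. It is for 
illustration only: the real implementation is C++ inside FRocksDB (see FilterV2 
in [1]), and the class and field names below are hypothetical.

    // Hypothetical sketch of the compaction filter's timestamp handling;
    // the actual code is C++ in FRocksDB [1].
    public class TtlFilterSketch {

        private final long queryTimeAfterNumEntries; // Flink's 'queryTimeAfterNumEntries'
        private final long ttlMillis;                // the configured state TTL

        private long entriesSinceLastQuery = 0;
        private long cachedCurrentTime = System.currentTimeMillis();

        public TtlFilterSketch(long queryTimeAfterNumEntries, long ttlMillis) {
            this.queryTimeAfterNumEntries = queryTimeAfterNumEntries;
            this.ttlMillis = ttlMillis;
        }

        // Called once for every state entry the compaction filter processes.
        public boolean shouldDropEntry(long entryTimestamp) {
            // Refresh the cached timestamp only after every
            // 'queryTimeAfterNumEntries' entries, to limit time queries.
            if (++entriesSinceLastQuery >= queryTimeAfterNumEntries) {
                cachedCurrentTime = System.currentTimeMillis();
                entriesSinceLastQuery = 0;
            }
            // Drop the entry if its TTL has expired relative to the cached time.
            return entryTimestamp + ttlMillis <= cachedCurrentTime;
        }
    }

Note that the counter in the sketch advances once per state entry processed 
during compaction, not per distinct key, so with 
cleanupInRocksdbCompactFilter(1000) the current timestamp is re-queried after 
every 1000 processed entries.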

Best
Yun Tang

________________________________
From: LakeShen <shenleifight...@gmail.com>
Sent: Tuesday, March 17, 2020 15:30
To: dev <d...@flink.apache.org>; user-zh <user...@flink.apache.org>; user 
<user@flink.apache.org>
Subject: Question about RocksDBStateBackend Compaction Filter state cleanup

Hi community,

I am looking at the Flink RocksDBStateBackend state cleanup; the code currently 
looks like this:


import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;

StateTtlConfig ttlConfig = StateTtlConfig
    .newBuilder(Time.seconds(1))
    .cleanupInRocksdbCompactFilter(1000)
    .build();


The default background cleanup for the RocksDB backend queries the current 
timestamp each time 1000 entries have been processed.

What is the meaning of 1000 entries here? Is it 1000 different keys?

Thanks for your reply.

Best regards,
LakeShen
