Myasuka commented on code in PR #21835:
URL: https://github.com/apache/flink/pull/21835#discussion_r1095402976
##########
flink-core/src/main/java/org/apache/flink/api/common/state/StateTtlConfig.java:
##########
@@ -298,12 +298,40 @@ public Builder cleanupInRocksdbCompactFilter(long queryTimeAfterNumEntries) {
return this;
}
+ /**
+ * Cleanup expired state while Rocksdb compaction is running.
+ *
+ * <p>RocksDB compaction filter will query current timestamp, used to check expiration, from
+ * Flink every time after processing {@code queryTimeAfterNumEntries} number of state
+ * entries. Updating the timestamp more often can improve cleanup speed but it decreases
+ * compaction performance because it uses JNI call from native code.
+ *
+ * <p>Periodic compaction could speed up expired state entries cleanup, especially for state
+ * entries rarely accessed. Files older than this value will be picked up for compaction,
+ * and re-written to the same level as they were before. It makes sure a file goes through
+ * compaction filters periodically.
+ *
+ * @param queryTimeAfterNumEntries number of state entries to process by compaction filter
+ * before updating current timestamp
+ * @param periodicCompactionSeconds periodic compaction per seconds which could speed up
+ * expired state cleanup. 0 means turning off periodic compaction.
+ */
+ @Nonnull
+ public Builder cleanupInRocksdbCompactFilter(
+ long queryTimeAfterNumEntries, long periodicCompactionSeconds) {
Review Comment:
Why is the interface designed with `long` seconds?
From my point of view, a `Duration` parameter would be easier for humans to read.
The current default value of `0xfffffffffffffffeL` looks really strange.
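To illustrate the suggestion, here is a minimal sketch of what a `Duration`-based variant could look like. The class name `CompactFilterCleanupSketch` and its members are hypothetical and are not part of `StateTtlConfig`; the sketch only assumes, as the Javadoc above states, that a value of 0 turns periodic compaction off.

```java
import java.time.Duration;

/** Hypothetical sketch only; this is not the actual StateTtlConfig.Builder API. */
final class CompactFilterCleanupSketch {
    private final long queryTimeAfterNumEntries;
    private final Duration periodicCompactionTime;

    CompactFilterCleanupSketch(long queryTimeAfterNumEntries, Duration periodicCompactionTime) {
        if (queryTimeAfterNumEntries <= 0) {
            throw new IllegalArgumentException("queryTimeAfterNumEntries must be positive");
        }
        if (periodicCompactionTime == null || periodicCompactionTime.isNegative()) {
            throw new IllegalArgumentException("periodicCompactionTime must be non-negative");
        }
        this.queryTimeAfterNumEntries = queryTimeAfterNumEntries;
        this.periodicCompactionTime = periodicCompactionTime;
    }

    long getQueryTimeAfterNumEntries() {
        return queryTimeAfterNumEntries;
    }

    /** Seconds value that would be handed to the native side; 0 means disabled. */
    long getPeriodicCompactionSeconds() {
        return periodicCompactionTime.getSeconds();
    }

    /** Duration.ZERO plays the role of "0 means turning off periodic compaction". */
    boolean isPeriodicCompactionEnabled() {
        return !periodicCompactionTime.isZero();
    }
}
```

A caller would then write `new CompactFilterCleanupSketch(1000L, Duration.ofDays(30))` instead of passing a raw number of seconds, and `Duration.ZERO` to switch the feature off.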
##########
flink-core/src/main/java/org/apache/flink/api/common/state/StateTtlConfig.java:
##########
@@ -437,18 +465,43 @@ public boolean runCleanupForEveryRecord() {
DEFAULT_ROCKSDB_COMPACT_FILTER_CLEANUP_STRATEGY =
new RocksdbCompactFilterCleanupStrategy(1000L);
+ /**
+ * Default value lets RocksDB control this feature as needed. For now, RocksDB will change
+ * this value to 30 days (i.e 30 * 24 * 60 * 60) so that every file goes through the
+ * compaction process at least once every 30 days if not compacted sooner.
+ */
+ static final long DEFAULT_PERIODIC_COMPACTION_SECONDS = 0xfffffffffffffffeL;
Review Comment:
Why do we enable this feature by default?
If it is enabled, I think a default value of 31 days looks better.
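For context on why the constant reads oddly, the small sketch below shows how Java interprets `0xfffffffffffffffeL` and how that compares with a day-based default expressed in seconds. The class and method names are hypothetical; only the literal and the 30 * 24 * 60 * 60 arithmetic come from the diff above.

```java
import java.time.Duration;

/** Hypothetical helper for illustration; not part of Flink. */
final class PeriodicCompactionDefaultSketch {
    /** The literal from the diff; as a signed Java long this is -2. */
    static final long RAW_DEFAULT = 0xfffffffffffffffeL;

    /** Renders the same 64 bits as an unsigned decimal number. */
    static String asUnsignedDecimal(long value) {
        return Long.toUnsignedString(value);
    }

    /** Converts a whole number of days into seconds. */
    static long daysToSeconds(long days) {
        return Duration.ofDays(days).getSeconds();
    }
}
```

Here `RAW_DEFAULT` equals `-2L`, `asUnsignedDecimal(RAW_DEFAULT)` returns `"18446744073709551614"`, `daysToSeconds(30)` returns `2592000`, and `daysToSeconds(31)` returns `2678400`. An explicit day-based constant would state its meaning directly, whereas the hex literal is only meaningful to readers who know it is a sentinel.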