[
https://issues.apache.org/jira/browse/FLINK-35738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Weijie Guo updated FLINK-35738:
-------------------------------
Affects Version/s: 1.20.0
> Release Testing: Verify FLINK-26050 Too many small sst files in rocksdb state
> backend when using time window created in ascending order
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-35738
> URL: https://issues.apache.org/jira/browse/FLINK-35738
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / State Backends
> Affects Versions: 1.20.0
> Reporter: Rui Fan
> Assignee: Samrat Deb
> Priority: Major
>
> The problem occurs when using RocksDB with specific queries/jobs (please see
> FLINK-26050 for the detailed description).
> To test the solution, run the following query with RocksDB as the state backend:
>
> {code:sql}
> INSERT INTO top_5_highest_view_time
> SELECT *
> FROM (
>     SELECT *,
>            ROW_NUMBER() OVER (PARTITION BY window_start, window_end
>                               ORDER BY view_time DESC) AS rownum
>     FROM (
>         SELECT window_start,
>                window_end,
>                product_id,
>                SUM(view_time) AS view_time,
>                COUNT(*) AS cnt
>         FROM TABLE(TUMBLE(TABLE `shoe_clickstream`, DESCRIPTOR($rowtime), INTERVAL '10' MINUTES))
>         GROUP BY window_start,
>                  window_end,
>                  product_id))
> WHERE rownum <= 5;{code}
>
> With the feature disabled (the default), the number of files in the RocksDB
> working directory (as well as in the checkpoint) should grow indefinitely.
>
> With the feature enabled, the number of files should stay constant (as they
> should get merged with each other).
> To enable the feature, set
> {code:java}
> state.backend.rocksdb.manual-compaction.min-interval{code}
> to, for example, 1 minute.
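>
> As a rough sketch, the setting might look like this in the Flink
> configuration file. The state.backend.type line is an assumption about how
> RocksDB is selected in the test cluster, and 1 min is only the example value
> mentioned above:
> {code:yaml}
> # Assumption: RocksDB is selected cluster-wide as the state backend.
> state.backend.type: rocksdb
> # Enables the feature under test; 1 min is the example interval from this ticket.
> state.backend.rocksdb.manual-compaction.min-interval: 1 min{code}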
>
> Please consult
> [https://github.com/apache/flink/blob/e7d7db3b6f87e53d9bace2a16cf95e5f7a79087a/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/sstmerge/RocksDBManualCompactionOptions.java#L29]
> for other options if necessary.
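
One way to observe the behavior described above is to sample the number of
.sst files under the RocksDB working directory while the job runs. The script
below is a minimal sketch, not part of the ticket: the function names, the
sampling parameters, and the directory passed in are all hypothetical, and it
only assumes that RocksDB table files carry the .sst extension.

```python
import os
import time


def count_sst_files(root):
    """Count files ending in .sst anywhere under root (recursively)."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.endswith(".sst"))
    return total


def watch(root, samples=5, interval_seconds=60.0):
    """Take `samples` counts, `interval_seconds` apart, and return them.

    A steadily increasing list suggests files are accumulating; a roughly
    flat list suggests they are being merged.
    """
    counts = []
    for i in range(samples):
        counts.append(count_sst_files(root))
        print(f"{time.strftime('%H:%M:%S')} sst_files={counts[-1]}")
        if i < samples - 1:
            time.sleep(interval_seconds)
    return counts
```

For example, calling watch() on the TaskManager's RocksDB working directory
(the exact path depends on the deployment) once with the feature disabled and
once with it enabled should make the difference visible.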
--
This message was sent by Atlassian Jira
(v8.20.10#820010)