[
https://issues.apache.org/jira/browse/FLINK-27504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537788#comment-17537788
]
Alexis Sarda-Espinosa commented on FLINK-27504:
-----------------------------------------------
But that wouldn't be a problem, would it? It's OK to have most of the state in
L0+L1 as long as L1 is cleaned up periodically, avoiding L2 entirely.
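For context, here is a rough, untested sketch of how one could dump the per-level
file layout from a copy of a task manager's local RocksDB working directory with
plain rocksdbjni, to check whether the SST files really stay in L0/L1 (the class
name and path handling are placeholders of mine):
{code:java}
// Dumps per-column-family level statistics ("rocksdb.levelstats") from a copy of
// a RocksDB instance directory, to see whether SST files stay in L0/L1 or have
// reached L2. Run it against a copy, not the live working directory.
import org.rocksdb.ColumnFamilyDescriptor;
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.DBOptions;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LevelStatsDump {
    public static void main(String[] args) throws Exception {
        RocksDB.loadLibrary();
        String path = args[0]; // copy of the instance directory (the one containing CURRENT)

        try (Options opts = new Options()) {
            List<byte[]> names = RocksDB.listColumnFamilies(opts, path);
            List<ColumnFamilyDescriptor> descriptors = new ArrayList<>();
            for (byte[] name : names) {
                descriptors.add(new ColumnFamilyDescriptor(name));
            }

            List<ColumnFamilyHandle> handles = new ArrayList<>();
            try (DBOptions dbOpts = new DBOptions();
                 RocksDB db = RocksDB.openReadOnly(dbOpts, path, descriptors, handles)) {
                for (int i = 0; i < handles.size(); i++) {
                    System.out.println("=== " + new String(names.get(i), StandardCharsets.UTF_8));
                    System.out.println(db.getProperty(handles.get(i), "rocksdb.levelstats"));
                }
                handles.forEach(ColumnFamilyHandle::close);
            }
        }
    }
}
{code}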
In a different experiment [1], I noticed that the biggest contributors to the
state's disk utilization were RocksDB's MANIFEST files. Maybe I'm looking at the
wrong thing, and the growth has nothing to do with SST files and compaction but
rather with the MANIFEST itself. Based on the code in the repo I mentioned in
the ticket, do you see a reason for those MANIFEST files to grow without bound?
[1] [https://lists.apache.org/thread/xmrvmwyc0cbo7vxnxsch7zdt46ppk2pb]
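In case it is relevant, one way to test that could be to cap the MANIFEST size
with a custom options factory. This is an untested sketch assuming Flink 1.14's
RocksDBOptionsFactory interface and RocksDB's max_manifest_file_size option; the
class name and the 10 MB limit are arbitrary choices of mine:
{code:java}
// Untested sketch: cap the MANIFEST size through a custom options factory, to see
// whether the growth observed in [1] becomes bounded. RocksDB rolls over to a new
// MANIFEST once the current one exceeds this size; 10 MB is an arbitrary value.
import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

import java.util.Collection;

public class ManifestCappingOptionsFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions,
                                     Collection<AutoCloseable> handlesToClose) {
        return currentOptions.setMaxManifestFileSize(10L * 1024 * 1024);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions,
                                                   Collection<AutoCloseable> handlesToClose) {
        return currentOptions;
    }
}

// Wiring it in (job's main method):
// EmbeddedRocksDBStateBackend backend = new EmbeddedRocksDBStateBackend(true);
// backend.setRocksDBOptions(new ManifestCappingOptionsFactory());
// env.setStateBackend(backend);
{code}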
> State compaction not happening with sliding window and incremental RocksDB
> backend
> ----------------------------------------------------------------------------------
>
> Key: FLINK-27504
> URL: https://issues.apache.org/jira/browse/FLINK-27504
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.14.4
> Environment: Local Flink cluster on Arch Linux.
> Reporter: Alexis Sarda-Espinosa
> Priority: Major
> Attachments: duration_trend_52ca77c.png, duration_trend_67c76bb.png,
> duration_trend_c5dd5d2.png, image-2022-05-06-10-34-35-007.png,
> size_growth_52ca77c.png, size_growth_67c76bb.png, size_growth_c5dd5d2.png
>
>
> Hello,
> I'm trying to estimate an upper bound for RocksDB's state size in my
> application. For that purpose, I have created a small job with faster timings
> whose code you can find on GitHub:
> [https://github.com/asardaes/flink-rocksdb-ttl-test]. You can see some of the
> results there, but I summarize them here as well (a rough code sketch of this
> setup follows the list):
> * Approximately 20 events per second; 10 pre-specified unique keys are used
> for partitioning.
> * Sliding window of 11 seconds with a 1-second slide.
> * Allowed lateness of 11 seconds.
> * State TTL configured to 1 minute and compaction after 1000 entries.
> * Both window-specific and window-global state used.
> * Checkpoints every 2 seconds.
> * Parallelism of 4 in stateful tasks.
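> As a rough, self-contained sketch of that setup (the real job is in the
> repository linked above; the source and window function below are simplified
> stand-ins I wrote for illustration):
> {code:java}
> // Rough sketch of the setup summarized above; names, source, and window
> // function are stand-ins, the actual job is in the linked repository.
> import org.apache.flink.api.common.eventtime.WatermarkStrategy;
> import org.apache.flink.api.common.state.StateTtlConfig;
> import org.apache.flink.api.common.state.ValueState;
> import org.apache.flink.api.common.state.ValueStateDescriptor;
> import org.apache.flink.api.java.functions.KeySelector;
> import org.apache.flink.api.java.tuple.Tuple2;
> import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> import org.apache.flink.streaming.api.functions.source.SourceFunction;
> import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
> import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
> import org.apache.flink.streaming.api.windowing.time.Time;
> import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
> import org.apache.flink.util.Collector;
>
> public class RocksDbTtlSketch {
>
>     public static void main(String[] args) throws Exception {
>         StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
>         env.setStateBackend(new EmbeddedRocksDBStateBackend(true)); // incremental checkpoints
>         env.enableCheckpointing(2_000L);                            // checkpoint every 2 seconds
>
>         env.addSource(new TenKeySource()) // ~20 events/s over 10 pre-specified keys
>                 .assignTimestampsAndWatermarks(
>                         WatermarkStrategy.<Tuple2<Integer, Long>>forMonotonousTimestamps()
>                                 .withTimestampAssigner((event, ts) -> event.f1))
>                 .keyBy(new KeySelector<Tuple2<Integer, Long>, Integer>() {
>                     @Override
>                     public Integer getKey(Tuple2<Integer, Long> value) {
>                         return value.f0;
>                     }
>                 })
>                 .window(SlidingEventTimeWindows.of(Time.seconds(11), Time.seconds(1)))
>                 .allowedLateness(Time.seconds(11))
>                 .process(new CountingWindowFunction()) // stateful task
>                 .setParallelism(4)
>                 .print();
>
>         env.execute("rocksdb-ttl-sketch");
>     }
>
>     /** Emits roughly 20 events per second, cycling through 10 keys. */
>     public static class TenKeySource implements SourceFunction<Tuple2<Integer, Long>> {
>         private volatile boolean running = true;
>
>         @Override
>         public void run(SourceContext<Tuple2<Integer, Long>> ctx) throws Exception {
>             int i = 0;
>             while (running) {
>                 ctx.collect(Tuple2.of(i++ % 10, System.currentTimeMillis()));
>                 Thread.sleep(50);
>             }
>         }
>
>         @Override
>         public void cancel() {
>             running = false;
>         }
>     }
>
>     /** Uses per-window state plus window-global state with a 1-minute TTL. */
>     public static class CountingWindowFunction
>             extends ProcessWindowFunction<Tuple2<Integer, Long>, String, Integer, TimeWindow> {
>
>         private static final StateTtlConfig TTL = StateTtlConfig
>                 .newBuilder(org.apache.flink.api.common.time.Time.minutes(1))
>                 .cleanupInRocksdbCompactFilter(1_000L) // run TTL filter after 1000 processed entries
>                 .build();
>
>         @Override
>         public void process(Integer key, Context ctx, Iterable<Tuple2<Integer, Long>> elements,
>                             Collector<String> out) throws Exception {
>             ValueStateDescriptor<Long> perWindow = new ValueStateDescriptor<>("perWindow", Long.class);
>             ValueStateDescriptor<Long> global = new ValueStateDescriptor<>("global", Long.class);
>             global.enableTimeToLive(TTL);
>
>             ValueState<Long> windowCount = ctx.windowState().getState(perWindow);
>             ValueState<Long> globalCount = ctx.globalState().getState(global);
>
>             long inWindow = 0;
>             for (Tuple2<Integer, Long> ignored : elements) {
>                 inWindow++;
>             }
>             windowCount.update(inWindow);
>             globalCount.update(globalCount.value() == null ? inWindow : globalCount.value() + inWindow);
>             out.collect("key=" + key + ", window count=" + inWindow + ", total=" + globalCount.value());
>         }
>     }
> }
> {code}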
> The goal is to let the job run and analyze state compaction behavior with
> RocksDB. I should note that global state is cleaned manually inside the
> functions; the TTL on that state is only a safeguard in case some keys are no
> longer seen in the actual production environment.
> I have been running the job on a local cluster (outside the IDE); the
> configuration YAML is also available in the repository. After running for
> approximately 1.6 days, the state size is currently 2.3 GiB (see attachments).
> I understand that state can retain expired data for a while, but since the TTL
> is 1 minute, this seems excessive to me.