[
https://issues.apache.org/jira/browse/FLINK-36429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892488#comment-17892488
]
Eaugene Thomas commented on FLINK-36429:
----------------------------------------
Hi, if no one is working on this, I am happy to take it up.
> Enhancing Flink History Server File Storage and Retrieval with RocksDB
> ----------------------------------------------------------------------
>
> Key: FLINK-36429
> URL: https://issues.apache.org/jira/browse/FLINK-36429
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / State Backends
> Affects Versions: 1.20.0
> Reporter: Xiaowen Sun
> Priority: Major
> Labels: historyserver
> Original Estimate: 2,016h
> Remaining Estimate: 2,016h
>
> Currently, when a Flink job finishes, it writes an archive as a single file
> that maps paths to JSON files. The Flink History Server (FHS) pulls these
> job archives to the local file system of the machine it runs on and expands
> each archive file into a local directory tree. Because of how the FHS
> stores the files, a large number of directories are created in the local
> file system. This layout can become inefficient and slow as the volume of
> job archives increases, creating bottlenecks in job data navigation and
> retrieval.
> To illustrate the problem of inode usage, let’s consider a scenario where
> there are 5000 subtasks. Each subtask creates its own directory, and within
> each subtask directory, there are additional directories that might store
> only a single file. This structure rapidly increases the number of inodes
> consumed.
> Integrating RocksDB, a high-performance embedded database for key-value data,
> aims to resolve these issues by offering faster data access and better
> scalability. This integration is expected to significantly enhance the
> operational efficiency of FHS by allowing faster data retrieval and enabling
> a larger cache on local Kubernetes deployments, thus overcoming inode
> limitations.
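
The inode argument in the description above can be made concrete with a rough sketch. It is not taken from the issue or from Flink's code. It assumes the archive is a JSON document of the form {"archive": [{"path": ..., "json": ...}]}, uses made-up REST-style paths, and uses a plain Python dict as a stand-in for RocksDB. All function names (load_entries, count_fs_inodes, store_as_keys, make_archive) are hypothetical.

```python
import json
from pathlib import PurePosixPath


def load_entries(archive_text):
    """Parse an archive into (path, json_payload) pairs.

    Assumed layout: {"archive": [{"path": "...", "json": "..."}]}.
    """
    doc = json.loads(archive_text)
    return [(entry["path"], entry["json"]) for entry in doc["archive"]]


def count_fs_inodes(entries):
    """Count the directories plus files created if each path becomes a file."""
    root = PurePosixPath("/")
    dirs = set()
    for path, _ in entries:
        # Every ancestor of the file (except the root) becomes a directory.
        for parent in PurePosixPath(path).parents:
            if parent != root:
                dirs.add(parent)
    return len(dirs) + len(entries)


def store_as_keys(entries, job_id, store):
    """Write each archived path as one key; `store` stands in for RocksDB."""
    for path, payload in entries:
        store[f"{job_id}{path}".encode()] = payload.encode()
    return store


def make_archive(num_subtasks):
    """Build a synthetic archive with one entry per subtask (made-up paths)."""
    entries = [
        {
            "path": f"/jobs/job1/vertices/v1/subtasks/{i}/attempts/0",
            "json": json.dumps({"subtask": i}),
        }
        for i in range(num_subtasks)
    ]
    return json.dumps({"archive": entries})


if __name__ == "__main__":
    entries = load_entries(make_archive(5000))
    kv = store_as_keys(entries, "job1", {})
    print("inodes if expanded to files:", count_fs_inodes(entries))
    print("keys in the key-value store:", len(kv))
```

With these synthetic paths, each subtask adds two directories and one file, so the inode count grows about three times as fast as the number of entries. A key-value store holds one key per entry inside a small, roughly fixed set of database files.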
--
This message was sent by Atlassian Jira
(v8.20.10#820010)