Jufang He created FLINK-34334:
---------------------------------

             Summary: Add sub-task level RocksDB file count metric
                 Key: FLINK-34334
                 URL: https://issues.apache.org/jira/browse/FLINK-34334
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / State Backends
    Affects Versions: 1.18.0
            Reporter: Jufang He
         Attachments: img_v3_027i_7ed0b8ba-3f12-48eb-aab3-cc368ac47cdg.jpg

In our production environment, we ran into task deployment failures. The root 
cause was that a single sub-task had too many SST files, which made the task 
deployment information (OperatorSubtaskState) too large and in turn caused 
Akka request timeouts during the task deployment phase. I would therefore like 
to add a sub-task level RocksDB file count metric. It would make it easy to 
detect, in time, performance problems caused by too many SST files.

RocksDB already exposes this through its JNI API 
(https://javadoc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.html#getColumnFamilyMetaData()), 
so we can easily retrieve the file count and report it via a metrics reporter.
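
As a rough sketch of the idea (not an actual patch), a per-sub-task gauge could 
sum the SST file counts of all column families owned by that sub-task. The class 
name SstFileCountGauge and its methods are hypothetical and exist in neither 
Flink nor RocksDB. The RocksDB call is passed in as a supplier so that the sketch 
compiles without rocksdbjni on the classpath. It assumes that 
ColumnFamilyMetaData#fileCount() returns the number of SST files of one column 
family.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration only; not part of Flink or RocksDB.
final class SstFileCountGauge {

    // One file-count supplier per column family (i.e. per registered state).
    private final Map<String, LongSupplier> perColumnFamily = new LinkedHashMap<>();

    // With rocksdbjni available, the supplier would presumably look like:
    //   () -> db.getColumnFamilyMetaData(columnFamilyHandle).fileCount()
    void register(String columnFamilyName, LongSupplier fileCount) {
        perColumnFamily.put(columnFamilyName, fileCount);
    }

    // Total number of SST files for this sub-task, evaluated on each call
    // so that a metrics reporter always sees the current value.
    long getValue() {
        long total = 0;
        for (LongSupplier supplier : perColumnFamily.values()) {
            total += supplier.getAsLong();
        }
        return total;
    }
}
```

In Flink, such an object would most likely be wrapped in a Gauge and registered 
on the sub-task's metric group. How that fits into the existing RocksDB native 
metrics is left open here.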



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
