[
https://issues.apache.org/jira/browse/FLINK-36655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17899412#comment-17899412
]
Gabor Somogyi commented on FLINK-36655:
---------------------------------------
[~guanghua] can you please elaborate on what the issue was and how it was
resolved by `reduce rocksdb writeBufferSize`?
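For reference, one reading of "reduce rocksdb writeBufferSize" is lowering
Flink's documented `state.backend.rocksdb.writebuffer.size` option; a minimal
sketch is below (the 32mb value is purely illustrative, not a recommendation
from this ticket):
{code:java}
import org.apache.flink.configuration.Configuration;

public class WriteBufferExample {
    public static void main(String[] args) {
        // Illustrative only: shrink RocksDB's per-column-family write buffer
        // via Flink's documented option key. 32mb is an example value.
        Configuration conf = new Configuration();
        conf.setString("state.backend.rocksdb.writebuffer.size", "32mb");
        System.out.println(conf);
    }
}
{code}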
> using flink state processor api to process big state in rocksdb is very slow
> -----------------------------------------------------------------------------
>
> Key: FLINK-36655
> URL: https://issues.apache.org/jira/browse/FLINK-36655
> Project: Flink
> Issue Type: Technical Debt
> Components: API / DataStream
> Affects Versions: 1.12.7, 1.13.6, 1.14.6
> Reporter: guanghua pi
> Priority: Critical
> Labels: State, processor, rocksdb
> Attachments: image-2024-11-04-17-06-24-614.png
>
>
> My streaming job's state backend is RocksDB. A savepoint generates 65 GB of
> data. I am using the State Processor API to read the state from RocksDB. My
> demo program is very simple: it reads the original state and writes it to
> another HDFS directory (a sketch follows the configuration below). I use the
> RocksDB predefined options SPINNING_DISK_OPTIMIZED_HIGH_MEM. The
> configuration in my flink-conf file is as follows:
> taskmanager.memory.managed.fraction: 0.1
> taskmanager.memory.jvm-overhead.fraction: 0.05
> taskmanager.memory.jvm-overhead.max: 128mb
> taskmanager.memory.jvm-overhead.min: 64mb
> taskmanager.memory.framework.off-heap.size: 64mb
> taskmanager.memory.jvm-metaspace.size: 128m
> taskmanager.memory.network.max: 128mb
> taskmanager.memory.network.fraction: 0.1
> taskmanager.memory.managed.size: 32mb
> taskmanager.memory.task.off-heap.size: 2253mb
> state.backend.rocksdb.memory.managed: false
> state.backend.rocksdb.metrics.block-cache-capacity: true
> state.backend.rocksdb.metrics.block-cache-pinned-usage: true
> state.backend.rocksdb.metrics.block-cache-usage: true
> state.backend.rocksdb.metrics.bloom-filter-full-positive: true
> state.backend.rocksdb.memory.write-buffer-ratio: 0.5
> state.backend.rocksdb.memory.high-prio-pool-ratio: 0.2
> state.backend.rocksdb.memory.fixed-per-slot: 1024mb
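> A minimal sketch of such a read-and-dump demo against the State Processor
> API as it existed in Flink 1.12–1.14 (DataSet-based); the uid "my-operator",
> the state name "my-state", the key/value types, and the HDFS paths are
> placeholders for whatever the real job uses:
> {code:java}
> import org.apache.flink.api.common.state.ValueState;
> import org.apache.flink.api.common.state.ValueStateDescriptor;
> import org.apache.flink.api.java.DataSet;
> import org.apache.flink.api.java.ExecutionEnvironment;
> import org.apache.flink.configuration.Configuration;
> import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
> import org.apache.flink.state.api.ExistingSavepoint;
> import org.apache.flink.state.api.Savepoint;
> import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
> import org.apache.flink.util.Collector;
>
> public class ReadSavepointDemo {
>
>     // Placeholder reader: emits "key,value" for one ValueState per key.
>     static class MyReader extends KeyedStateReaderFunction<String, String> {
>         private transient ValueState<String> state;
>
>         @Override
>         public void open(Configuration parameters) {
>             state = getRuntimeContext().getState(
>                     new ValueStateDescriptor<>("my-state", String.class));
>         }
>
>         @Override
>         public void readKey(String key, Context ctx, Collector<String> out)
>                 throws Exception {
>             out.collect(key + "," + state.value());
>         }
>     }
>
>     public static void main(String[] args) throws Exception {
>         ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>
>         // Load the existing savepoint with a RocksDB backend.
>         ExistingSavepoint savepoint = Savepoint.load(
>                 env,
>                 "hdfs:///savepoints/savepoint-xxxx",
>                 new RocksDBStateBackend("hdfs:///checkpoints"));
>
>         DataSet<String> rows =
>                 savepoint.readKeyedState("my-operator", new MyReader());
>
>         // Dump the extracted state to another HDFS directory.
>         rows.writeAsText("hdfs:///tmp/state-dump");
>         env.execute("read-savepoint-demo");
>     }
> }
> {code}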
> This is my TM memory figure:
> !image-2024-11-04-17-06-24-614.png!
> JM and TM memory are set with -yjm 1G -ytm 3G.
> My current problems are listed below:
> 1. After running the program for about 4 hours, the container is killed
> with: "Diagnostics: [2024-11-04 03:00:48.539] Container
> [pid=8166,containerID=container_1728961635507_3104_01_000007] is running
> 765952B beyond the 'PHYSICAL' memory limit. Current usage: 3.0 GB of 3 GB
> physical memory used; 10.1 GB of 6.2 GB virtual memory used. Killing
> container."
> 2. The read speed from RocksDB keeps slowing down over time. For example,
> about 600,000 records are read in the first hour, but only about 500,000 in
> the next hour, and it keeps declining.
> 3. In the log file I find: "Obtained shared RocksDB cache of size 67108864
> bytes", but I set state.backend.rocksdb.memory.fixed-per-slot: 1024mb. The
> values do not match: 67108864 bytes is 64 MiB, whereas 1024mb would be
> 1073741824 bytes.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)