Hi Arpith

If you restore RocksDB state from a savepoint, the restore phase inserts the 
original binary key-value pairs into an empty RocksDB instance, which is slow 
if the state is large. There have been several discussions about optimizing 
this phase [1] [2].

If you want to work around this issue quickly, you could restore the RocksDB 
state from an incremental checkpoint instead, as that just opens the DB with 
the existing SST files rather than loading the data. Moreover, RocksDB 
incremental checkpoints currently also support changing the job's parallelism.
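
A minimal sketch of that setup, assuming the Flink 1.11 Java API. The class 
name, checkpoint path and checkpoint interval are placeholders for 
illustration and are not taken from your job; the predefined options are the 
ones from your mail.

```java
import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.runtime.state.StateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IncrementalCheckpointSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder values (assumptions): adjust to your cluster and job.
        String checkpointDir = "hdfs:///flink/checkpoints";
        env.enableCheckpointing(60_000L); // checkpoint every 60 seconds

        // The second constructor argument 'true' enables incremental checkpoints.
        RocksDBStateBackend backend = new RocksDBStateBackend(checkpointDir, true);
        backend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);
        env.setStateBackend((StateBackend) backend);

        // Retain the latest checkpoint when the job is cancelled, so that it
        // can be used as the restore point instead of a savepoint.
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // ... define the job topology here, then call env.execute(...)
    }
}
```

With a retained checkpoint available, the job can be resubmitted with 
"flink run -s <path-to-retained-checkpoint> ..." in the same way as with a 
savepoint path.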

[1] https://issues.apache.org/jira/browse/FLINK-17971
[2] https://issues.apache.org/jira/browse/FLINK-17288

Best
Yun Tang
________________________________
From: Arpith P <[email protected]>
Sent: Thursday, October 15, 2020 0:50
To: user <[email protected]>
Subject: Large state RocksDb backend increases app start time

Hi,

I'm currently storing around 70GB of data in map state backed by the RocksDB 
backend. When I restore the application from a savepoint, it currently takes 
more than 4 minutes to start processing events. How can I speed this up, or is 
there any other recommended approach?

I'm using the following predefined options with RocksDB.

RocksDBStateBackend backend = new RocksDBStateBackend(checkpointDir, incrementalCheckpoints);
backend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM);

Thanks,
Arpith
