[jira] [Commented] (FLINK-8845) Introduce `parallel recovery` mode for full checkpoint (savepoint)
[ https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389179#comment-16389179 ]

Sihua Zhou commented on FLINK-8845:
---

Even though {{SstFileWriter}} does not help us load data into RocksDB faster, {{WriteBatch}} does. In a benchmark, I found that using {{WriteBatch}} performs 30% ~ 50% better than using {{RocksDB.put()}}. So I would like to repurpose this issue and change it to "Using WriteBatch to improve performance for recovery in RocksDB backend".

> Introduce `parallel recovery` mode for full checkpoint (savepoint)
> ---
>
>                 Key: FLINK-8845
>                 URL: https://issues.apache.org/jira/browse/FLINK-8845
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.6.0
>
>
> Based on {{ingestExternalFile()}} and {{SstFileWriter}} provided by RocksDB,
> we can restore from a full checkpoint (savepoint) in parallel. This can also
> be extended to incremental checkpoints easily, but for the sake of simplicity, we
> do this in two separate tasks.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
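The batching idea in the comment above (group many puts and apply them together instead of issuing one {{RocksDB.put()}} per entry) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Flink implementation: {{BatchSink}}, {{BatchedRestore}} and {{InMemorySink}} are hypothetical names, and the in-memory sink stands in for RocksDB so the sketch runs without the native library. With RocksDB's Java API, {{put}} would presumably map to {{WriteBatch.put(byte[], byte[])}} and {{flush}} to {{RocksDB.write(WriteOptions, WriteBatch)}} followed by {{WriteBatch.clear()}}. The sketch does not reproduce the benchmark numbers quoted in the comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a store that accepts grouped writes. */
interface BatchSink {
    /** Buffer one entry (with RocksDB: WriteBatch.put(byte[], byte[])). */
    void put(byte[] key, byte[] value);

    /** Apply all buffered entries at once (with RocksDB: RocksDB.write(...) then WriteBatch.clear()). */
    void flush();
}

final class BatchedRestore {
    /**
     * Replays key/value pairs into the sink and flushes whenever the buffered
     * payload reaches maxBatchBytes. Returns the number of flushes performed.
     */
    static int restore(Iterable<byte[][]> entries, BatchSink sink, long maxBatchBytes) {
        long bufferedBytes = 0;
        int pendingEntries = 0;
        int flushes = 0;
        for (byte[][] kv : entries) {
            sink.put(kv[0], kv[1]);
            bufferedBytes += kv[0].length + kv[1].length;
            pendingEntries++;
            if (bufferedBytes >= maxBatchBytes) {
                sink.flush();
                flushes++;
                bufferedBytes = 0;
                pendingEntries = 0;
            }
        }
        // Flush the remainder that never reached the size threshold.
        if (pendingEntries > 0) {
            sink.flush();
            flushes++;
        }
        return flushes;
    }
}

/** In-memory sink used only to make the sketch runnable without RocksDB. */
final class InMemorySink implements BatchSink {
    final List<byte[][]> pending = new ArrayList<>();
    final List<byte[][]> applied = new ArrayList<>();

    @Override
    public void put(byte[] key, byte[] value) {
        pending.add(new byte[][] {key, value});
    }

    @Override
    public void flush() {
        applied.addAll(pending);
        pending.clear();
    }
}
```

The size threshold is a design choice of this sketch: a larger batch amortizes the per-write overhead over more entries but holds more data in memory before it is applied.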
[jira] [Commented] (FLINK-8845) Introduce `parallel recovery` mode for full checkpoint (savepoint)
[ https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389123#comment-16389123 ]

Sihua Zhou commented on FLINK-8845:
---

Unfortunately, although the RocksDB [wiki|https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ] says that the best way to load data into RocksDB is "Generate SST files (using {{SstFileWriter}}) with non-overlapping ranges in parallel and bulk load the SST files.", after implementing this and testing it with a simple benchmark I found that the performance is not as good as expected: it is almost the same as, or worse than, using {{RocksDB.put()}}. After some analysis I found that building the SST files spends a lot of time creating {{DirectSlice}} objects, and currently we cannot reuse a {{DirectSlice}} in the Java API. Even though this approach may give better performance in C++, I think we cannot use it to improve performance in Java at the moment (maybe someday RocksDB will improve this so that Java can get performance close to C++) ...