[
https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389123#comment-16389123
]
Sihua Zhou commented on FLINK-8845:
-----------------------------------
Unfortunately, even though the RocksDB
[wiki|https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ] says the best way to
load data into RocksDB is to "Generate SST files (using {{SstFileWriter}}) with
non-overlapping ranges in parallel and bulk load the SST files.", after
implementing this and testing it with a simple benchmark I found that the
performance is not as good as expected: it is almost the same as, or even worse
than, using {{RocksDB.put()}}. After some analysis I found that building the SST
files spends a lot of time creating {{DirectSlice}} objects, and the Java API
currently gives us no way to reuse a {{DirectSlice}}. In C++ this approach can
deliver a clear performance win, but in Java I don't think we can use it to
improve performance at the moment (maybe some day RocksDB will improve the Java
API so that we can get performance close to C++) ...
> Introduce `parallel recovery` mode for full checkpoint (savepoint)
> -------------------------------------------------------------------
>
> Key: FLINK-8845
> URL: https://issues.apache.org/jira/browse/FLINK-8845
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Affects Versions: 1.5.0
> Reporter: Sihua Zhou
> Assignee: Sihua Zhou
> Priority: Major
> Fix For: 1.6.0
>
>
> Based on {{ingestExternalFile()}} and {{SstFileWriter}} provided by RocksDB,
> we can restore from a full checkpoint (savepoint) in parallel, as sketched
> below. This can also be extended to incremental checkpoints easily, but for
> the sake of simplicity we do that in two separate tasks.
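> For illustration, a minimal sketch of this idea (not the actual restore path
> of the state backend): non-overlapping key ranges are written to separate SST
> files in parallel and then ingested with a single {{ingestExternalFile()}}
> call. It assumes a RocksDB JNI version that exposes
> {{SstFileWriter#put(byte[], byte[])}}; the key ranges, thread pool and file
> paths are illustrative only.
>
> {code:java}
> import org.rocksdb.*;
>
> import java.nio.charset.StandardCharsets;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.concurrent.Callable;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.Future;
>
> public class ParallelRestoreSketch {
>
>     /** Writes one non-overlapping key range [start, end) into its own SST file. */
>     static String writeRange(int rangeIndex, int start, int end) throws RocksDBException {
>         String path = "/tmp/restore-range-" + rangeIndex + ".sst";  // illustrative path
>         try (Options options = new Options();
>              EnvOptions envOptions = new EnvOptions();
>              SstFileWriter writer = new SstFileWriter(envOptions, options)) {
>             writer.open(path);
>             for (int i = start; i < end; i++) {
>                 // Keys must be appended in ascending order within each file.
>                 writer.put(String.format("key-%08d", i).getBytes(StandardCharsets.UTF_8),
>                            ("value-" + i).getBytes(StandardCharsets.UTF_8));
>             }
>             writer.finish();
>         }
>         return path;
>     }
>
>     public static void main(String[] args) throws Exception {
>         RocksDB.loadLibrary();
>
>         // Build four non-overlapping SST files concurrently.
>         ExecutorService pool = Executors.newFixedThreadPool(4);
>         List<Future<String>> futures = new ArrayList<>();
>         for (int r = 0; r < 4; r++) {
>             final int rangeIndex = r;
>             Callable<String> task =
>                     () -> writeRange(rangeIndex, rangeIndex * 100_000, (rangeIndex + 1) * 100_000);
>             futures.add(pool.submit(task));
>         }
>         List<String> sstFiles = new ArrayList<>();
>         for (Future<String> future : futures) {
>             sstFiles.add(future.get());
>         }
>         pool.shutdown();
>
>         // Ingest all files into the restored instance with a single call.
>         try (Options options = new Options().setCreateIfMissing(true);
>              RocksDB db = RocksDB.open(options, "/tmp/restored-db");  // illustrative path
>              IngestExternalFileOptions ingestOptions = new IngestExternalFileOptions().setMoveFiles(true)) {
>             db.ingestExternalFile(sstFiles, ingestOptions);
>         }
>     }
> }
> {code}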
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)