[jira] [Commented] (FLINK-8845) Introduce `parallel recovery` mode for full checkpoint (savepoint)

2018-03-06 Thread Sihua Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389179#comment-16389179
 ] 

Sihua Zhou commented on FLINK-8845:
---

Even though {{SstFileWriter}} cannot help us improve the performance of
loading data into RocksDB, {{WriteBatch}} can. In a benchmark, I found that
using {{WriteBatch}} gives roughly 30% ~ 50% better performance than using
{{RocksDB.put()}}. So I would like to repurpose this issue and change it to
"Using WriteBatch to improve performance for recovery in RocksDB backend".
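
To illustrate the idea, here is a minimal sketch of batched versus per-record writes during restore. It deliberately does not use the RocksDB API: {{CountingStore}}, {{restoreWithPut}}, {{restoreWithBatch}} and the batch size of 128 are hypothetical names and values chosen for illustration, and the store is only an in-memory map that counts how often it is called. In RocksDB's Java API the corresponding calls would be {{WriteBatch.put()}} followed by {{RocksDB.write()}}. The sketch shows only that batching reduces the number of store calls; it does not reproduce the benchmark numbers above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a key-value store; this is NOT the RocksDB API.
final class BatchedRestoreSketch {

    /** In-memory store that counts how often its write path is entered. */
    static final class CountingStore {
        final TreeMap<String, String> data = new TreeMap<>();
        int storeCalls = 0;

        // One store call per entry, analogous to a per-record put().
        void put(String key, String value) {
            storeCalls++;
            data.put(key, value);
        }

        // One store call for a whole batch of entries.
        void write(List<String[]> batch) {
            storeCalls++;
            for (String[] kv : batch) {
                data.put(kv[0], kv[1]);
            }
        }
    }

    /** Restores all entries with one store call per entry; returns the call count. */
    static int restoreWithPut(CountingStore store, List<String[]> entries) {
        for (String[] kv : entries) {
            store.put(kv[0], kv[1]);
        }
        return store.storeCalls;
    }

    /** Buffers entries and flushes every batchSize entries; returns the call count. */
    static int restoreWithBatch(CountingStore store, List<String[]> entries, int batchSize) {
        List<String[]> batch = new ArrayList<>(batchSize);
        for (String[] kv : entries) {
            batch.add(kv);
            if (batch.size() >= batchSize) {
                store.write(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            store.write(batch); // flush the remainder
        }
        return store.storeCalls;
    }

    /** Generates n illustrative key/value pairs. */
    static List<String[]> sampleEntries(int n) {
        List<String[]> entries = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            entries.add(new String[] {"key-" + i, "value-" + i});
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String[]> entries = sampleEntries(1000);
        CountingStore perPut = new CountingStore();
        CountingStore batched = new CountingStore();
        System.out.println("put calls:   " + restoreWithPut(perPut, entries));
        System.out.println("batch calls: " + restoreWithBatch(batched, entries, 128));
        System.out.println("same contents: " + perPut.data.equals(batched.data));
    }
}
```

Both paths leave the store with the same contents; the batched path just crosses the store boundary far less often, which is the kind of per-call overhead a write batch is meant to amortize.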

>  Introduce `parallel recovery` mode for full checkpoint (savepoint)
> ---
>
> Key: FLINK-8845
> URL: https://issues.apache.org/jira/browse/FLINK-8845
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Affects Versions: 1.5.0
> Reporter: Sihua Zhou
> Assignee: Sihua Zhou
> Priority: Major
> Fix For: 1.6.0
>
>
> Based on {{ingestExternalFile()}} and {{SstFileWriter}} provided by RocksDB,
> we can restore from a full checkpoint (savepoint) in parallel. This can also
> be extended to incremental checkpoints easily, but for the sake of
> simplicity, we do this in two separate tasks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8845) Introduce `parallel recovery` mode for full checkpoint (savepoint)

2018-03-06 Thread Sihua Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389123#comment-16389123
 ] 

Sihua Zhou commented on FLINK-8845:
---

Unfortunately, although the RocksDB
[wiki|https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ] says that the best
way to load data into RocksDB is to "Generate SST files (using
{{SstFileWriter}}) with non-overlapping ranges in parallel and bulk load the
SST files", after implementing this and testing it with a simple benchmark, I
found that the performance is not as good as expected: it is about the same
as, or worse than, using {{RocksDB.put()}}. After a bit of analysis, I found
that building the SST files spends a lot of time creating {{DirectSlice}}
instances, and currently we can't reuse a {{DirectSlice}} in the Java API.
Even though this approach could perform better in C++, I don't think we can
use it to improve performance in Java at the moment (maybe someday RocksDB
will improve this so that Java gets performance close to that of C++) ...
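
For reference, the "non-overlapping ranges in parallel" part of the wiki advice can be sketched as follows. This is only an illustration under stated assumptions: it does not call {{SstFileWriter}} or {{ingestExternalFile()}}, each "file" is simulated by an in-memory list, and {{ParallelRangeSketch}}, {{splitSorted}}, {{buildFilesInParallel}} and {{nonOverlapping}} are hypothetical names. It shows how sorted keys can be cut into contiguous ranges that are built concurrently, so that the results never overlap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; the "files" are plain lists, not real SST files.
final class ParallelRangeSketch {

    /** Splits sorted keys into at most `parts` contiguous, non-overlapping ranges. */
    static List<List<String>> splitSorted(List<String> sortedKeys, int parts) {
        List<List<String>> ranges = new ArrayList<>();
        int n = sortedKeys.size();
        int chunk = (n + parts - 1) / parts; // ceiling division
        for (int start = 0; start < n; start += chunk) {
            int end = Math.min(n, start + chunk);
            ranges.add(new ArrayList<>(sortedKeys.subList(start, end)));
        }
        return ranges;
    }

    /** Builds one simulated file per range on a thread pool, keeping range order. */
    static List<List<String>> buildFilesInParallel(List<List<String>> ranges, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> range : ranges) {
                // Stand-in for writing one file from one key range.
                Callable<List<String>> task = () -> new ArrayList<>(range);
                futures.add(pool.submit(task));
            }
            List<List<String>> files = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                files.add(future.get());
            }
            return files;
        } catch (Exception e) {
            throw new RuntimeException("building simulated files failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** True if each file's last key sorts strictly before the next file's first key. */
    static boolean nonOverlapping(List<List<String>> files) {
        for (int i = 0; i + 1 < files.size(); i++) {
            List<String> current = files.get(i);
            List<String> next = files.get(i + 1);
            String last = current.get(current.size() - 1);
            if (last.compareTo(next.get(0)) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add(String.format("key-%04d", i)); // zero-padded, so already sorted
        }
        List<List<String>> files = buildFilesInParallel(splitSorted(keys, 3), 3);
        System.out.println("files: " + files.size());
        System.out.println("non-overlapping: " + nonOverlapping(files));
    }
}
```

The sketch says nothing about the cost of creating {{DirectSlice}} objects described above; it only shows the partitioning step.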



