[
https://issues.apache.org/jira/browse/FLINK-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496400#comment-16496400
]
ASF GitHub Bot commented on FLINK-8790:
---------------------------------------
Github user sihuazhou commented on a diff in the pull request:
https://github.com/apache/flink/pull/5582#discussion_r192065096
--- Diff:
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeySerializationUtils.java
---
@@ -138,4 +138,12 @@ private static void writeVariableIntBytes(
value >>>= 8;
} while (value != 0);
}
+
+	public static byte[] serializeKeyGroup(int keyGroup, int keyGroupPrefixBytes) {
+		byte[] startKeyGroupPrefixBytes = new byte[keyGroupPrefixBytes];
--- End diff --
Yes, I did notice the "sstable ingestion feature" and also did some experiments
with it. You are right that the ingestion feature currently only works for
sstables written by the sstable writer. I tried to use the sstable writer to
generate external sstables in parallel and ingest them into the target db, but
unfortunately the performance of the sstable writer is quite poor in
RocksJava... I left the conclusion of the experiment in
[FLINK-8845](https://issues.apache.org/jira/browse/FLINK-8845) (that is the
reason why I took a step back and used `WriteBatch` to speed up recovery from
full checkpoints). I pasted the comments below:
**Unfortunately, according to the RocksDB wiki, the best way to load data into
RocksDB is to "Generate SST files (using SstFileWriter) with non-overlapping
ranges in parallel and bulk load the SST files." But after implementing this
and testing it with a simple benchmark, I found that the performance is not as
good as expected; it is almost the same as, or worse than, using RocksDB.put().
After a bit of analysis I found that building the SST files spends a lot of
time creating DirectSlice objects, and currently we can't reuse a DirectSlice
in the Java API. Even though in C++ this approach can give a clear performance
win, in Java I don't think we can use it to improve performance right now
(maybe someday RocksDB will improve this so that Java can reach roughly the
same performance as C++) ...**
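For illustration, a minimal RocksJava sketch of the bulk-load approach described above could look like the following (this is not the actual benchmark code; the file path, key/value arrays, and options are placeholders, and the per-entry `Slice` wrapping is where the overhead shows up):

```java
import java.util.Collections;

import org.rocksdb.EnvOptions;
import org.rocksdb.IngestExternalFileOptions;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.Slice;
import org.rocksdb.SstFileWriter;

public class SstBulkLoadSketch {

	/** Writes sorted, non-overlapping key/value pairs into one SST file and ingests it. */
	public static void bulkLoad(RocksDB db, String sstPath, byte[][] keys, byte[][] values)
			throws RocksDBException {

		try (Options options = new Options();
				EnvOptions envOptions = new EnvOptions();
				SstFileWriter writer = new SstFileWriter(envOptions, options)) {

			writer.open(sstPath);
			for (int i = 0; i < keys.length; i++) {
				// each put has to wrap key and value in a (Direct)Slice that cannot be reused
				try (Slice key = new Slice(keys[i]); Slice value = new Slice(values[i])) {
					writer.put(key, value);
				}
			}
			writer.finish();
		}

		// ingest the finished SST file into the target db
		try (IngestExternalFileOptions ingestOptions = new IngestExternalFileOptions()) {
			db.ingestExternalFile(Collections.singletonList(sstPath), ingestOptions);
		}
	}
}
```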
And regarding https://github.com/facebook/rocksdb/issues/499, if I'm not
mistaken, I think we also can't use `repairDB()` because we have many column
families, and the other suggestions in that thread are quite similar to the
approach I already tried, namely building the sstables in parallel, which
turned out not to work well with the Java API.
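As a side note, the `WriteBatch` fallback mentioned above follows roughly this pattern (a simplified sketch, not the actual Flink code; the batch size of 500 and disabling the WAL are illustrative choices):

```java
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.WriteBatch;
import org.rocksdb.WriteOptions;

public class WriteBatchRestoreSketch {

	private static final int BATCH_SIZE = 500;

	/** Bulk-inserts restored key/value pairs using batched writes without the WAL. */
	public static void restore(RocksDB db, ColumnFamilyHandle handle,
			Iterable<byte[][]> keyValuePairs) throws RocksDBException {

		try (WriteOptions writeOptions = new WriteOptions().setDisableWAL(true);
				WriteBatch batch = new WriteBatch()) {

			int count = 0;
			for (byte[][] kv : keyValuePairs) {
				batch.put(handle, kv[0], kv[1]);
				if (++count % BATCH_SIZE == 0) {
					db.write(writeOptions, batch);
					batch.clear();
				}
			}
			// flush the remaining entries
			if (batch.count() > 0) {
				db.write(writeOptions, batch);
			}
		}
	}
}
```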
> Improve performance for recovery from incremental checkpoint
> ------------------------------------------------------------
>
> Key: FLINK-8790
> URL: https://issues.apache.org/jira/browse/FLINK-8790
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Affects Versions: 1.5.0
> Reporter: Sihua Zhou
> Assignee: Sihua Zhou
> Priority: Major
> Fix For: 1.6.0
>
>
> When there are multiple state handles to be restored, we can improve the
> performance as follows:
> 1. Choose the best state handle to initialize the target db.
> 2. Use the other state handles to create temp dbs, and clip each temp db to
> the target key group range (via rocksdb.deleteRange()). This lets us get rid
> of the `key group check` in the `data insertion loop` and also avoids
> traversing useless records.
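For illustration, the "clip the db" step described above could look roughly like the following sketch (a hypothetical helper, assuming keys are prefixed with the serialized key group and all key-group ranges are treated as [start, end)):

```java
import org.apache.flink.contrib.streaming.state.RocksDBKeySerializationUtils;

import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class ClipDbSketch {

	/** Deletes all entries of the temp db whose key group falls outside the target range. */
	public static void clipToTargetKeyGroupRange(
			RocksDB tempDb,
			ColumnFamilyHandle handle,
			int dbStartKeyGroup, int dbEndKeyGroup,
			int targetStartKeyGroup, int targetEndKeyGroup,
			int keyGroupPrefixBytes) throws RocksDBException {

		// drop the key groups below the target range: [dbStart, targetStart)
		if (dbStartKeyGroup < targetStartKeyGroup) {
			tempDb.deleteRange(handle,
				RocksDBKeySerializationUtils.serializeKeyGroup(dbStartKeyGroup, keyGroupPrefixBytes),
				RocksDBKeySerializationUtils.serializeKeyGroup(targetStartKeyGroup, keyGroupPrefixBytes));
		}

		// drop the key groups above the target range: [targetEnd, dbEnd)
		if (targetEndKeyGroup < dbEndKeyGroup) {
			tempDb.deleteRange(handle,
				RocksDBKeySerializationUtils.serializeKeyGroup(targetEndKeyGroup, keyGroupPrefixBytes),
				RocksDBKeySerializationUtils.serializeKeyGroup(dbEndKeyGroup, keyGroupPrefixBytes));
		}
	}
}
```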
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)