[
https://issues.apache.org/jira/browse/FLINK-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488360#comment-16488360
]
ASF GitHub Bot commented on FLINK-8790:
---------------------------------------
Github user sihuazhou commented on the issue:
https://github.com/apache/flink/pull/5582
Unfortunately, after confirming with RocksDB, `deleteRange()` is still an
experimental feature, and it may currently hurt read performance (even
though we could use the `ReadOptions` to reduce the impact).
In practice, I tested the impact of `deleteRange()` on read performance in
our case (deleting at most 2 ranges) and did not find any impact. TiKV also
already uses it to delete entire shards. Still, to be on the safe side, I
think the current PR should be frozen. I do think the `deleteRange()`-based
implementation in this PR is the better one (especially when the user scales
up the job: in that case we only need to clip the RocksDB instance without
iterating over any records, which is super fast) once `deleteRange()` is no
longer experimental.
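To illustrate why clipping needs at most 2 range deletes, here is a minimal sketch in Python. It is not the code in the PR: the function names, the inclusive key-group bounds, and the 2-byte big-endian key-group prefix are all assumptions made for the illustration. It only computes the half-open `[begin, end)` intervals that would be handed to a range delete.

```python
from typing import List, Tuple

PREFIX_BYTES = 2  # assumption: the key-group id is a 2-byte big-endian key prefix


def key_group_prefix(key_group: int) -> bytes:
    """Hypothetical helper: encode a key-group id as the leading bytes of a key."""
    return key_group.to_bytes(PREFIX_BYTES, "big")


def ranges_to_delete(
    db_start: int, db_end: int, target_start: int, target_end: int
) -> List[Tuple[int, int]]:
    """Return half-open key-group intervals [begin, end) that fall outside the target.

    All four arguments are inclusive key-group bounds. At most two intervals
    are returned: one below the target range and one above it.
    """
    ranges = []
    if db_start < target_start:
        # Everything below the target, capped at the end of the db's own range.
        ranges.append((db_start, min(target_start, db_end + 1)))
    if db_end > target_end:
        # Everything above the target, starting no earlier than the db's own range.
        ranges.append((max(target_end + 1, db_start), db_end + 1))
    return ranges


# A db holding key groups 0..127 clipped to the target 32..63.
print(ranges_to_delete(0, 127, 32, 63))  # [(0, 32), (64, 128)]
```

Each returned interval would map to one range delete over the corresponding key prefixes, so no record has to be read while clipping.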
Anyway, even though we can't use `deleteRange()` currently, we can still
improve the performance of restoring from an incremental checkpoint in the
following way: if the key-group range of one of the state handles is a
sub-range of the target key-group range, we can open it directly to avoid
the overhead of iterating over it. @StefanRRichter What do you think? If you
don't object, I will update the PR to follow the above approach.
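The sub-range condition proposed above could look roughly like the following Python sketch. The `KeyGroupRange` tuple and both function names are hypothetical, and picking the widest candidate is my own tie-break, not something decided in this thread.

```python
from typing import NamedTuple, Optional, Sequence


class KeyGroupRange(NamedTuple):
    """Hypothetical stand-in for a key-group range; both bounds are inclusive."""
    start: int
    end: int


def is_sub_range(handle: KeyGroupRange, target: KeyGroupRange) -> bool:
    """True if every key group of the handle lies inside the target range."""
    return target.start <= handle.start and handle.end <= target.end


def pick_directly_openable(
    handles: Sequence[KeyGroupRange], target: KeyGroupRange
) -> Optional[KeyGroupRange]:
    """Pick a handle that needs no clipping, so it can be opened as-is.

    Assumption: when several handles qualify, prefer the one covering the
    most key groups. Returns None if no handle is a sub-range of the target.
    """
    candidates = [h for h in handles if is_sub_range(h, target)]
    if not candidates:
        return None
    return max(candidates, key=lambda h: h.end - h.start)


# Scaling down: two old handles (0..31 and 32..63) merge into the target 0..63.
target = KeyGroupRange(0, 63)
handles = [KeyGroupRange(0, 31), KeyGroupRange(32, 63)]
print(pick_directly_openable(handles, target))
```

A handle chosen this way contains no key outside the target range, which is why neither `deleteRange()` nor a per-record key-group check is needed for it.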
> Improve performance for recovery from incremental checkpoint
> ------------------------------------------------------------
>
> Key: FLINK-8790
> URL: https://issues.apache.org/jira/browse/FLINK-8790
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Affects Versions: 1.5.0
> Reporter: Sihua Zhou
> Assignee: Sihua Zhou
> Priority: Major
> Fix For: 1.6.0
>
>
> When there are multiple state handles to be restored, we can improve the
> performance as follows:
> 1. Choose the best state handle to init the target db
> 2. Use the other state handles to create temp dbs, and clip each db according
> to the target key group range (via rocksdb.deleteRange()); this helps us
> get rid of the `key group check` in the `data insertion loop` and also
> spares us from traversing useless records.
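The issue text does not say what makes a state handle the "best" one in step 1. One plausible reading, shown here as a Python sketch and explicitly an assumption, is to pick the handle whose key-group range overlaps the target range the most, so the fewest records have to be copied in from temp dbs. The function names and the `(start, end)` tuples with inclusive bounds are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Range = Tuple[int, int]  # (start, end), both inclusive; hypothetical representation


def overlap(a: Range, b: Range) -> int:
    """Number of key groups shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def choose_initial_handle(handles: Sequence[Range], target: Range) -> Optional[Range]:
    """Assumed heuristic: the handle with the largest overlap inits the target db."""
    if not handles:
        return None
    return max(handles, key=lambda h: overlap(h, target))


# Target 16..79 restored from three old handles.
print(choose_initial_handle([(0, 31), (32, 63), (64, 95)], (16, 79)))  # (32, 63)
```

The remaining handles would then go through step 2, each opened as a temp db and clipped to the target range.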