[
https://issues.apache.org/jira/browse/FLINK-21321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453778#comment-17453778
]
Yuan Mei edited comment on FLINK-21321 at 12/6/21, 4:50 AM:
------------------------------------------------------------
Hey [~legojoey17], thanks for contributing!
Just a kind ask: are you still interested in working on this issue?
We are planning to include this `delete range` optimization in Flink Release
1.15 (this release cycle), now that the RocksDB version has been upgraded
(in 1.14). A few other optimizations/tests depend on your PR as well.
Would you be able to continue this effort within the next month? Code freeze
for the 1.15 release is in February.
> Change RocksDB incremental checkpoint re-scaling to use deleteRange
> -------------------------------------------------------------------
>
> Key: FLINK-21321
> URL: https://issues.apache.org/jira/browse/FLINK-21321
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / State Backends
> Reporter: Joey Pereira
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.15.0
>
>
> In FLINK-8790, it was suggested to use RocksDB's {{deleteRange}} API to more
> efficiently clip the databases for the desired target group.
> During the PR for that ticket,
> [#5582|https://github.com/apache/flink/pull/5582], the change did not end up
> using the {{deleteRange}} method as it was an experimental feature in
> RocksDB.
> At this point {{deleteRange}} is far less experimental, though I believe it
> is still formally marked "experimental". It is heavily used by others such
> as CockroachDB and TiKV, which have teased out several bugs in complex
> interactions over the years.
> For certain re-scaling situations where restores trigger
> {{restoreWithScaling}} and the DB clipping logic, this would likely reduce an
> O(N) operation (N = state size/records) to O(1). For large-state apps, this
> could represent a non-trivial amount of time spent re-scaling. In the case
> of my workplace, we have an operator with 100s of billions of records in
> state, and re-scaling was taking a long time (>>30 min, though it has been a
> while since we last did it).
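To make the O(N)-vs-O(1) claim concrete, here is a toy sketch (in Python, not RocksDB internals) of why a range tombstone makes clipping cheap: point deletes write one entry per record, while a range delete writes a single tombstone regardless of how many records fall in the range. The `ToyStore` class and the (key_group, user_key) key layout are illustrative assumptions, not Flink or RocksDB code.

```python
# Toy illustration (NOT RocksDB's actual implementation): point deletes
# cost one write per key, so clipping N records is O(N); a range tombstone
# is a single write, so clipping is O(1) in writes.

class ToyStore:
    """A store with point deletes and LSM-style range tombstones."""

    def __init__(self, items):
        self.data = dict(items)
        self.tombstones = []  # list of (begin, end) half-open key ranges

    def delete_point(self, key):
        # One write per key -> clipping a key-group range costs O(N).
        self.data.pop(key, None)

    def delete_range(self, begin, end):
        # A single tombstone write, regardless of how many keys it covers.
        self.tombstones.append((begin, end))

    def get(self, key):
        # Reads check tombstones first, as an LSM read path would.
        for begin, end in self.tombstones:
            if begin <= key < end:
                return None
        return self.data.get(key)


# Keys are (key_group, user_key) tuples, mimicking the key-group prefix.
# After re-scaling, this instance keeps only key groups in [2, 4).
store = ToyStore({(g, k): g * 10 + k for g in range(8) for k in range(3)})
store.delete_range((0, 0), (2, 0))  # two tombstone writes total,
store.delete_range((4, 0), (8, 0))  # no matter how many records exist
```

With per-key deletes, the same clipping would have required one `delete_point` call for every record in the dropped key groups.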
--
This message was sent by Atlassian Jira
(v8.20.1#820001)