[
https://issues.apache.org/jira/browse/FLINK-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16884101#comment-16884101
]
Yu Li commented on FLINK-13034:
-------------------------------
bq. Modify RocksDBMapState back to previous design which would first load one
element and then load more elements in the follow-up queries
This looks like a regression according to the above description. Could you tell
us which JIRA introduced the issue and in which version? Asking because
if this is a performance regression, I would suggest marking it as a blocker
and fixing it in the 1.9.0 release ([~tzulitai] [~ykt836] FYI.)
And please also paste the
[benchmark|https://github.com/dataArtisans/flink-benchmarks] result for the
state backend before/after your proposed changes. Thanks.
> Improve the performance when checking whether mapstate is empty for
> RocksDBStateBackend
> ---------------------------------------------------------------------------------------
>
> Key: FLINK-13034
> URL: https://issues.apache.org/jira/browse/FLINK-13034
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / State Backends
> Reporter: Yun Tang
> Assignee: Yun Tang
> Priority: Major
>
> Currently, several places in the Flink source code check whether a map state
> is empty,
> e.g. [TemporalRowTimeJoinOperator|https://github.com/apache/flink/blob/8315f38e89f897e32cfa0f23990cb3fb44db0d72/flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/join/temporal/TemporalRowTimeJoinOperator.java#L192],
> [AbstractRowTimeUnboundedPrecedingOver|#L160)].
> Developers use the following statement to check whether the map state is empty:
> {code:java}
> boolean noRecordsToProcess = !inputState.keys().iterator().hasNext();
> {code}
> However, if we use {{RocksDBStateBackend}},
> {{inputState.keys().iterator().hasNext()}} would actually trigger 1 {{seek}} and
> 128 {{next}} calls in
> [RocksDBMapState|https://github.com/apache/flink/blob/8315f38e89f897e32cfa0f23990cb3fb44db0d72/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBMapState.java#L483],
> and the {{next}} calls are redundant for an emptiness check.
> I have two options to improve this:
> * Modify {{RocksDBMapState}} back to previous design which would first load
> one element and then load more elements in the follow-up queries. However,
> this would affect the performance of other map state methods.
> * Add an {{isEmpty()}} method to the public evolving interface {{MapState}},
> so that we could use it to check whether the map state is empty without any
> redundant RocksDB actions.
> I prefer the 2nd option.
>
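To make the cost difference in the description concrete, here is a minimal, self-contained sketch. It is not Flink or RocksDB code: {{CountingStore}}, {{MapStateEmptinessSketch}}, {{batchLoadingKeys}} and the static {{isEmpty}} helper are hypothetical names, and the store is a toy stand-in that only counts {{seek}} and {{next}} calls. The cache size of 128 is taken from the issue text, and the batch is loaded eagerly here for simplicity.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy stand-in for a RocksDB iterator that counts seek/next calls (hypothetical, not Flink code). */
class CountingStore {
    private final List<String> keys;
    private int pos = 0;
    int seeks = 0;
    int nexts = 0;

    CountingStore(List<String> sortedKeys) {
        this.keys = sortedKeys;
    }

    void seekToFirst() { seeks++; pos = 0; }
    boolean isValid() { return pos < keys.size(); }
    String key() { return keys.get(pos); }
    void next() { nexts++; pos++; }
}

class MapStateEmptinessSketch {
    // Batch size assumed from the issue description ("1 seek and 128 next").
    static final int CACHE_SIZE = 128;

    /** Batch-loading iteration: one seek, then up to CACHE_SIZE entries are cached before any answer. */
    static Iterator<String> batchLoadingKeys(CountingStore store) {
        List<String> cache = new ArrayList<>();
        store.seekToFirst();
        while (store.isValid() && cache.size() < CACHE_SIZE) {
            cache.add(store.key());
            store.next();
        }
        return cache.iterator();
    }

    /** Dedicated emptiness check (option 2): one seek, no next calls. */
    static boolean isEmpty(CountingStore store) {
        store.seekToFirst();
        return !store.isValid();
    }
}
```

With a store holding 128 or more keys, {{!batchLoadingKeys(store).hasNext()}} costs one seek plus 128 next calls in this sketch, while {{isEmpty(store)}} costs a single seek. How the real {{RocksDBMapState}} would implement such a method is left to the actual patch.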