[
https://issues.apache.org/jira/browse/FLINK-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360123#comment-16360123
]
ASF GitHub Bot commented on FLINK-8297:
---------------------------------------
Github user je-ik commented on the issue:
https://github.com/apache/flink/pull/5185
@aljoscha I (partly) reworked this PR as you suggested. There are still some
unresolved questions though:
1) I'm not 100% sure how to cleanly support migration between list
state savepoints. Would you have any pointers on how I should address this?
2) I haven't tested the new version on an actual Flink job yet; it only passes
the tests.
I think some more modifications will be needed, so I will test this
on real data once there is agreement on the actual implementation.
Thanks in advance for any comments!
> RocksDBListState stores whole list in single byte[]
> ---------------------------------------------------
>
> Key: FLINK-8297
> URL: https://issues.apache.org/jira/browse/FLINK-8297
> Project: Flink
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.4.0, 1.3.2
> Reporter: Jan Lukavský
> Priority: Major
>
> RocksDBListState currently keeps the whole list of data in a single RocksDB
> key-value pair, which implies that the list must fit into memory.
> Larger lists are not supported and end up with an OOME or another error. The
> RocksDBListState could be modified so that individual items in the list are
> stored under separate keys in RocksDB and can then be iterated over. A simple
> implementation could reuse the existing RocksDBMapState, with the list index
> as the key, and a single RocksDBValueState keeping track of how many items have
> already been added to the list. Because this implementation might be less
> efficient in some cases, it would be good to make it opt-in by a construct
> like
> {{new RocksDBStateBackend().enableLargeListsPerKey()}}
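For illustration, the layout proposed in the description (one entry per list
item, keyed by its index, plus a separate item counter) could be sketched
roughly as follows. This is only a minimal sketch and not the code in the PR:
the class name IndexedListSketch is hypothetical, a TreeMap stands in for
RocksDBMapState, and a plain long field stands in for RocksDBValueState.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

/**
 * Hypothetical sketch: a list stored as one map entry per element.
 * The TreeMap is a stand-in for a map state keyed by list index, and
 * the long field is a stand-in for a value state holding the item count.
 */
class IndexedListSketch<T> implements Iterable<T> {
    // index -> element; each element lives under its own key
    private final Map<Long, T> entries = new TreeMap<>();
    // number of items added so far; also the index of the next item
    private long count = 0L;

    /** Appends one element without reading or rewriting the existing ones. */
    void add(T value) {
        entries.put(count, value);
        count++;
    }

    /** Returns the number of stored elements. */
    long size() {
        return count;
    }

    /** Removes all elements and resets the counter. */
    void clear() {
        entries.clear();
        count = 0L;
    }

    /** Iterates in insertion order, fetching one element per step. */
    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private long position = 0L;

            @Override
            public boolean hasNext() {
                return position < count;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return entries.get(position++);
            }
        };
    }
}
```

In this sketch an append touches only one new key and the counter, and
iteration reads one item at a time, so the whole list never has to be held as
a single value. The price is one lookup per item, which is presumably why the
description suggests making the behaviour opt-in.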
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)