[ https://issues.apache.org/jira/browse/FLINK-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400570#comment-16400570 ]
Jan Lukavský commented on FLINK-8297:
-------------------------------------

Yes, that seems related. I'm not 100% convinced that simply overcoming the size limitation of `Integer.MAX_VALUE` solves the actual problem, because the whole list would still have to be held in memory and could therefore cause various OOM errors, or containers being killed (e.g. on YARN).

> RocksDBListState stores whole list in single byte[]
> ---------------------------------------------------
>
> Key: FLINK-8297
> URL: https://issues.apache.org/jira/browse/FLINK-8297
> Project: Flink
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.4.0, 1.3.2
> Reporter: Jan Lukavský
> Priority: Major
>
> RocksDBListState currently keeps the whole list of data in a single RocksDB
> key-value pair, which implies that the list must actually fit into memory.
> Larger lists are not supported and end up with an OOME or another error. The
> RocksDBListState could be modified so that individual items in the list are
> stored under separate keys in RocksDB and can then be iterated over. A simple
> implementation could reuse the existing RocksDBMapState, with the key as the
> index into the list and a single RocksDBValueState keeping track of how many
> items have already been added to the list. Because this implementation might
> be less efficient in some cases, it would be good to make it opt-in by a
> construct like
> {{new RocksDBStateBackend().enableLargeListsPerKey()}}
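
The layout proposed in the description (one entry per list item keyed by its index, plus a separate counter) can be sketched roughly as follows. This is only an illustration of the idea, not Flink code: `IndexedListSketch` is a hypothetical name, the `TreeMap` merely stands in for RocksDBMapState and the `long` field for RocksDBValueState, and nothing here touches RocksDB or Flink's actual state API.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.TreeMap;

/**
 * Sketch only: a list stored as (index -> element) entries plus a counter.
 * The TreeMap is an in-memory stand-in for a map state backed by a sorted
 * key-value store; the size field is a stand-in for a single value state.
 */
final class IndexedListSketch<T> implements Iterable<T> {
    // Stand-in for the map state: one entry per list element.
    private final TreeMap<Long, T> entries = new TreeMap<>();
    // Stand-in for the value state: number of elements added so far.
    private long size = 0L;

    /** Appends by writing a single new entry; existing entries are not rewritten. */
    void add(T value) {
        entries.put(size, value);
        size++;
    }

    long size() {
        return size;
    }

    /** Drops all entries and resets the counter. */
    void clear() {
        entries.clear();
        size = 0L;
    }

    /** Reads elements one at a time by index, so no full copy of the list is built. */
    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private long next = 0L;

            @Override
            public boolean hasNext() {
                return next < size;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return entries.get(next++);
            }
        };
    }
}
```

In this sketch an append costs one write and iteration fetches one element per step, which is the property the description is after. The trade-off it mentions also shows up: reading a short list takes one lookup per element instead of a single read, which is a reason to keep such a layout opt-in.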