[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] Optionally store elements of...
Github user je-ik commented on the issue:

    https://github.com/apache/flink/pull/5185

@StefanRRichter I think that was exactly the initial idea, but then we ran into trouble with savepoints and with changing the list type. Also, as @aljoscha mentioned, it can be confusing for users to see `MapState` instead of `ListState` when inspecting the savepoint. Unfortunately, I currently don't have time to work on this, so if anyone is interested in getting this done, that would be awesome.

---
[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] Optionally store elements of...
Github user je-ik commented on the issue:

    https://github.com/apache/flink/pull/5185

@StephanEwen I think it should be configurable. As Aljoscha pointed out, we need to ensure that the two representations have the same serialized form in checkpoints, because that way users can switch back and forth between the implementations across application restarts. Unfortunately, I haven't had time to dive into that so far. :-(

---
[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] Optionally store elements of...
Github user je-ik commented on the issue:

    https://github.com/apache/flink/pull/5185

@aljoscha I updated the title. I'm a little concerned about the serialization in the savepoint. If the serialization is *exactly* the same, doesn't that mean that the whole list will again be stored in a single byte[], which will cause an OOME in exactly the cases the user wanted to solve by activating the "large list" implementation? Or am I missing something?

---
[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] optionally use RocksDBMapSta...
Github user je-ik commented on the issue:

    https://github.com/apache/flink/pull/5185

@aljoscha I (partly) reworked this PR as you suggested. There are still some unresolved questions, though:

 1) I'm not 100% sure how to cleanly support migration between list state savepoints. Would you have any pointers on how I should address this?

 2) I haven't tested the new version on an actual Flink job yet; it just passes the tests. I think some more modifications will be needed, so I will test this on real data once there is agreement on the actual implementation.

Thanks in advance for any comments!

---
[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] optionally use RocksDBMapSta...
Github user je-ik commented on the issue:

    https://github.com/apache/flink/pull/5185

I think that the failed test is not related to this PR.

---
[GitHub] flink pull request #5185: [FLINK-8297] [flink-rocksdb] optionally use RocksD...
GitHub user je-ik opened a pull request:

    https://github.com/apache/flink/pull/5185

    [FLINK-8297] [flink-rocksdb] optionally use RocksDBMapState internally for storing lists

## What is the purpose of the change

Enable storing lists that do not fit into memory for a single key.

## Brief change log

## Verifying this change

This change added tests and can be verified as follows:

passes additional tests for RocksDBStateBackend.enableLargeListsPerKey()

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): no
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no, backward compatible
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? yes
  - If yes, how is the feature documented? JavaDocs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/datadrivencz/flink rocksdb-backend-memory-optimization

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5185.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5185

commit f1bbaa30901ba8a54b02908fd3eb3615301b4400
Author: Jan Lukavsky <je...@seznam.cz>
Date:   2017-12-14T20:42:06Z

    [FLINK-8297] [flink-rocksdb] optionally use RocksDBMapState internally for storing lists

---
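For readers unfamiliar with the idea in the PR title, the following is a rough sketch in plain Java (no Flink or RocksDB dependency) of how a list can be stored as individual map entries instead of one serialized value. It is not the code from this PR: the class `ListAsMapSketch`, its methods, and the `userKey/index` key layout are hypothetical, and a `TreeMap` merely stands in for RocksDB's sorted key space. It also assumes user keys never contain `/`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the implementation proposed in the PR.
class ListAsMapSketch {
    // Sorted store: composite key "userKey/index" -> a single list element.
    // Each element is its own entry, so no value ever holds the whole list.
    private final TreeMap<String, String> store = new TreeMap<>();

    // Next free index per user key (a real backend would have to persist this too).
    private final Map<String, Long> nextIndex = new HashMap<>();

    private static String compositeKey(String userKey, long index) {
        // Zero padding makes lexicographic order match numeric order.
        return userKey + "/" + String.format("%019d", index);
    }

    // Appending writes one small entry; existing elements are not read or rewritten.
    void add(String userKey, String element) {
        long index = nextIndex.getOrDefault(userKey, 0L);
        store.put(compositeKey(userKey, index), element);
        nextIndex.put(userKey, index + 1);
    }

    // Returns a lazy view over the key range ["userKey/", "userKey0"),
    // where '0' is the character that follows '/' in ASCII.
    Iterable<String> get(String userKey) {
        return store.subMap(userKey + "/", userKey + "0").values();
    }
}
```

Iterating over `get("someKey")` walks the entries in insertion order one at a time, which is the property that lets a per-key list grow beyond what fits in memory. How such a layout should be serialized into savepoints is exactly the open question discussed in the comments above, and this sketch does not address it.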