[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] Optionally store elements of...

2018-05-02 Thread je-ik
Github user je-ik commented on the issue:

https://github.com/apache/flink/pull/5185
  
@StefanRRichter I think that was exactly the initial idea, but then we ran 
into trouble with savepoints and with changing the list type. Also, as 
@aljoscha mentioned, it can be confusing for users to see `MapState` instead 
of `ListState` when inspecting the savepoint. Unfortunately, I currently don't 
have time to work on this, so if anyone is interested in getting this 
done, that would be awesome.


---


[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] Optionally store elements of...

2018-03-09 Thread je-ik
Github user je-ik commented on the issue:

https://github.com/apache/flink/pull/5185
  
@StephanEwen I think it should be configurable. As Aljoscha pointed out, we 
need to ensure that these two representations have the same serialized 
form in checkpoints, so that users can switch back and forth between the 
implementations across application restarts. Unfortunately, I haven't had time 
to dive into that so far. :-(
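
The requirement above (one serialized form shared by both representations) can be illustrated with a minimal sketch. This is not Flink's checkpoint format or API: the class `SnapshotFormatSketch`, its methods, and the "count followed by elements" layout are all hypothetical, and plain strings stand in for serialized list elements.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

public class SnapshotFormatSketch {

    // Hypothetical shared snapshot format: element count, then each element in order.
    static byte[] write(Iterable<String> elementsInOrder, int count) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(count);
            for (String element : elementsInOrder) {
                out.writeUTF(element);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Snapshot taken from the list-style representation.
    static byte[] snapshotFromList(List<String> list) {
        return write(list, list.size());
    }

    // Snapshot taken from the map-style representation (index -> element).
    // TreeMap.values() iterates in ascending key order, i.e. list order.
    static byte[] snapshotFromMap(TreeMap<Long, String> map) {
        return write(map.values(), map.size());
    }

    // Restore into the list-style representation.
    static List<String> restoreAsList(byte[] snapshot) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(snapshot));
            int count = in.readInt();
            List<String> list = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                list.add(in.readUTF());
            }
            return list;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Restore into the map-style representation, one entry per element.
    static TreeMap<Long, String> restoreAsMap(byte[] snapshot) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(snapshot));
            int count = in.readInt();
            TreeMap<Long, String> map = new TreeMap<>();
            for (long i = 0; i < count; i++) {
                map.put(i, in.readUTF());
            }
            return map;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("x", "y", "z");
        TreeMap<Long, String> map = restoreAsMap(snapshotFromList(list));
        System.out.println(Arrays.equals(snapshotFromList(list), snapshotFromMap(map)));
    }
}
```

Because both writers emit the same bytes, a snapshot taken with one representation can be restored into the other. The sketch buffers the snapshot in memory only to keep it short; an implementation aimed at large lists would presumably write to and read from a stream one element at a time.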


---


[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] Optionally store elements of...

2018-02-14 Thread je-ik
Github user je-ik commented on the issue:

https://github.com/apache/flink/pull/5185
  
@aljoscha I updated the title. I'm a little concerned about the 
serialization in savepoints. If the serialization is *exactly* the same, doesn't 
that mean that, again, the whole List will be stored in a single byte[], 
which will cause an OOME in exactly the cases the user wanted to solve by 
activating the "large list" implementation? Or am I missing something?
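
To make the concern concrete, here is a minimal sketch of the two layouts being compared. It does not use Flink or RocksDB: the class `ListLayoutSketch` and its methods are hypothetical, strings stand in for list elements, and a `TreeMap` stands in for the ordered key space of the store.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

public class ListLayoutSketch {

    // Single-value layout: every element is length-prefixed and concatenated
    // into one byte[], so the whole list must fit in memory at once.
    static byte[] encodeAsSingleValue(List<String> elements) {
        List<byte[]> encoded = new ArrayList<>();
        int total = 0;
        for (String element : elements) {
            byte[] bytes = element.getBytes(StandardCharsets.UTF_8);
            encoded.add(bytes);
            total += Integer.BYTES + bytes.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(total);
        for (byte[] bytes : encoded) {
            buffer.putInt(bytes.length).put(bytes);
        }
        return buffer.array();
    }

    // Per-element layout: one entry per element, keyed by its list index.
    static TreeMap<Long, byte[]> encodePerElement(List<String> elements) {
        TreeMap<Long, byte[]> entries = new TreeMap<>();
        long index = 0;
        for (String element : elements) {
            entries.put(index++, element.getBytes(StandardCharsets.UTF_8));
        }
        return entries;
    }

    // Size of the largest single value that has to be materialized.
    static int largestValueSize(TreeMap<Long, byte[]> entries) {
        int max = 0;
        for (byte[] value : entries.values()) {
            max = Math.max(max, value.length);
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList("a", "bb", "ccc");
        System.out.println("single value bytes: " + encodeAsSingleValue(elements).length);
        System.out.println("largest per-element bytes: "
                + largestValueSize(encodePerElement(elements)));
    }
}
```

In the single-value layout the buffer grows with the total size of the list, while in the per-element layout the largest value is bounded by the largest element. A savepoint format that reuses the single-value encoding verbatim would bring back the first behavior.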


---


[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] optionally use RocksDBMapSta...

2018-02-11 Thread je-ik
Github user je-ik commented on the issue:

https://github.com/apache/flink/pull/5185
  
@aljoscha I (partly) reworked this PR as you suggested. There are still some 
unresolved questions though:
 1) I'm not 100% sure how to cleanly support migration between list 
state savepoints. Would you have any pointers on how I should address this?
 2) I haven't tested the new version on an actual Flink job yet; it only 
passes the tests.
I think some more modifications will be needed, so I will test this 
on real data once there is agreement on the actual implementation.
Thanks in advance for any comments!


---


[GitHub] flink issue #5185: [FLINK-8297] [flink-rocksdb] optionally use RocksDBMapSta...

2017-12-21 Thread je-ik
Github user je-ik commented on the issue:

https://github.com/apache/flink/pull/5185
  
I think that the failed test is not related to this PR.


---


[GitHub] flink pull request #5185: [FLINK-8297] [flink-rocksdb] optionally use RocksD...

2017-12-19 Thread je-ik
GitHub user je-ik opened a pull request:

https://github.com/apache/flink/pull/5185

[FLINK-8297] [flink-rocksdb] optionally use RocksDBMapState internally for 
storing lists

## What is the purpose of the change

Enable storing per-key lists that do not fit into memory.

## Brief change log

## Verifying this change

This change added tests and can be verified as follows:
  passes additional tests for RocksDBStateBackend.enableLargeListsPerKey()

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): no
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no, backward 
compatible
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? yes
  - If yes, how is the feature documented? JavaDocs


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/datadrivencz/flink rocksdb-backend-memory-optimization

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5185.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5185


commit f1bbaa30901ba8a54b02908fd3eb3615301b4400
Author: Jan Lukavsky <je...@seznam.cz>
Date:   2017-12-14T20:42:06Z

[FLINK-8297] [flink-rocksdb] optionally use RocksDBMapState internally for 
storing lists




---