[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread sihuazhou
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5979 Thank you all @StefanRRichter @StephanEwen @bowenli86 ---

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread StefanRRichter
Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/5979 LGTM, will merge. ---

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread sihuazhou
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5979 Hi @StefanRRichter I rebased the PR, could you please have a look? ---

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread sihuazhou
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5979 Hi @StefanRRichter I'm not sure whether we can do that without `seek()`, because the length of the `key bytes` is not fixed, which may lead to wrong deletions. What do you think? Sure,

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread StefanRRichter
Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/5979 With the approach I outlined, we would not require any `seek()` to the last key, we can simply create the exclusive end key. Nevertheless, you are right about the comment that is only in the
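One way to build such an exclusive end key, sketched here as an assumption and not as the code from the PR: treat the serialized prefix as an unsigned big-endian number and add one, dropping any trailing 0xFF bytes that would overflow. Under unsigned byte-wise ordering (RocksDB's default comparator), every key that starts with the prefix then falls in `[prefix, endKey)`. The class and method names below are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Flink or RocksDB. */
final class PrefixRange {

    /**
     * Returns the smallest key that sorts after every key starting with
     * {@code prefix} under unsigned byte-wise ordering, or {@code null}
     * if no such key exists (empty prefix, or all bytes are 0xFF).
     */
    static byte[] exclusiveEndKey(byte[] prefix) {
        // Skip trailing 0xFF bytes: they cannot be incremented without a carry.
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return null; // the range has no finite upper bound
        }
        // Keep bytes 0..i and increment the last kept byte.
        byte[] end = Arrays.copyOf(prefix, i + 1);
        end[i]++;
        return end;
    }

    /** Unsigned lexicographic comparison; a key sorts before its own extensions. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int k = 0; k < n; k++) {
            int cmp = (a[k] & 0xFF) - (b[k] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] prefix = {0x01, 0x02, (byte) 0xFF};
        System.out.println(Arrays.toString(exclusiveEndKey(prefix))); // prints [1, 3]
    }
}
```

This only shows how an end key can be formed from a prefix. Whether a prefix range is safe when the serialized keys have variable length is a separate question that this sketch does not settle.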

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread sihuazhou
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5979 Hmm... there is another reason: the main performance overhead is indeed the `seek()`. Even if we use `deleteRange()` to implement this, we still need to get the last key of the entries

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread sihuazhou
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5979 @StefanRRichter, the reason I prefer this approach is: - From the comment in RocksDB's source, `deleteRange()` should be used for deleting big ranges. What if the

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread StefanRRichter
Github user StefanRRichter commented on the issue: https://github.com/apache/flink/pull/5979 @sihuazhou I wonder why you would choose iterator + batched write over simply calling `db.deleteRange(...)` where the start key is `serializeCurrentKeyAndNamespace()` and the end key is increasing the
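To make the two alternatives concrete, here is a minimal sketch on an in-memory `TreeMap`, used as a stand-in for a sorted key-value store. It is not RocksDB and not the PR's code. String keys, the `#` separator, and all names are assumptions chosen for readability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a sorted store; not RocksDB. */
final class ClearSketch {

    /** Option A: iterate over the prefix, collect deletes, then apply them together. */
    static int clearWithIteratorAndBatch(TreeMap<String, String> store, String prefix) {
        List<String> batch = new ArrayList<>();
        // tailMap starts at the first key >= prefix (the analogue of a seek).
        for (String key : store.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // iteration has left the prefix range
            }
            batch.add(key);
        }
        for (String key : batch) {
            store.remove(key);
        }
        return batch.size();
    }

    /** Option B: delete the half-open range [prefix, endKey) in one call. */
    static void clearWithRange(TreeMap<String, String> store, String prefix) {
        // Assumes a non-empty prefix whose last char is below Character.MAX_VALUE.
        char last = prefix.charAt(prefix.length() - 1);
        String endKey = prefix.substring(0, prefix.length() - 1) + (char) (last + 1);
        store.subMap(prefix, true, endKey, false).clear();
    }

    /** Small sample: two entries under prefix "k1#", one under "k2#". */
    static TreeMap<String, String> sample() {
        TreeMap<String, String> store = new TreeMap<>();
        store.put("k1#a", "1");
        store.put("k1#b", "2");
        store.put("k2#a", "3");
        return store;
    }

    public static void main(String[] args) {
        TreeMap<String, String> a = sample();
        clearWithIteratorAndBatch(a, "k1#");
        TreeMap<String, String> b = sample();
        clearWithRange(b, "k1#");
        System.out.println(a.equals(b)); // prints true
    }
}
```

Both options leave the same entries behind in this toy setting. The sketch says nothing about their relative cost in RocksDB, which is the actual point under discussion.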

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-17 Thread StephanEwen
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/5979 Okay, looks really good from my side. Would be good if @StefanRRichter or @azagrebin could double check the change, otherwise good to go. ---

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-16 Thread sihuazhou
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5979 @StephanEwen, I ran a micro-benchmark; here is the result ``` -> Batch VS Put < BATCH: end insert - duration:255 PUT: end insert - duration:545

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-16 Thread StephanEwen
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/5979 Could you share some micro-benchmark numbers? When we change something that we know works well to something new, would be good to understand what benefits we are talking about. ---

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-10 Thread bowenli86
Github user bowenli86 commented on the issue: https://github.com/apache/flink/pull/5979 LGTM +1 ---

[GitHub] flink issue #5979: [FLINK-9070][state]improve the performance of RocksDBMapS...

2018-05-09 Thread sihuazhou
Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5979 cc @StefanRRichter (This is for 1.6; I just completed it now since I currently have time) ---