Re: Improved performance when using incremental checkpoints

Aljoscha Krettek Tue, 16 Jun 2020 04:54:14 -0700

Hi,

it might be that the operations that Flink performs on RocksDB during checkpointing will "poke" RocksDB somehow and make it clean up its internal hierarchies of storage more. Other than that, I'm also a bit surprised by this.
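
One way to picture this hypothesis is the toy model below. It is only a sketch under stated assumptions, not RocksDB's real implementation: the class TombstoneSketch, its methods, and the entriesScanned counter are all hypothetical names made up for illustration. It assumes that a delete leaves a marker (a "tombstone") in the write buffer instead of removing the entry, that an emptiness check has to step over those markers, and that the cleanup step simply drops them. What RocksDB actually does on flush and compaction is more involved.

```java
import java.util.Optional;
import java.util.TreeMap;

/** Toy model of a write buffer with delete markers; NOT RocksDB's real code. */
class TombstoneSketch {
    // Key -> value; Optional.empty() marks a deleted key (a "tombstone").
    private final TreeMap<String, Optional<String>> memtable = new TreeMap<>();
    // Counts entries inspected by isEmpty(), as a stand-in for read cost.
    long entriesScanned = 0;

    void put(String key, String value) {
        memtable.put(key, Optional.of(value));
    }

    // A delete does not remove the entry; it overwrites it with a tombstone.
    void delete(String key) {
        memtable.put(key, Optional.empty());
    }

    // Walks entries in key order until the first live one is found.
    boolean isEmpty() {
        for (Optional<String> v : memtable.values()) {
            entriesScanned++;
            if (v.isPresent()) {
                return false;
            }
        }
        return true;
    }

    // Assumed stand-in for the cleanup a checkpoint might trigger.
    void flush() {
        memtable.values().removeIf(v -> !v.isPresent());
    }

    public static void main(String[] args) {
        TombstoneSketch db = new TombstoneSketch();
        // Same access pattern as the process function: check, put, clear.
        for (int i = 0; i < 1000; i++) {
            db.isEmpty();
            db.put("k" + i, "v");
            db.delete("k" + i);
        }
        long before = db.entriesScanned;
        db.isEmpty();
        long costWithTombstones = db.entriesScanned - before;

        db.flush();
        before = db.entriesScanned;
        db.isEmpty();
        long costAfterFlush = db.entriesScanned - before;

        System.out.println(costWithTombstones + " vs " + costAfterFlush);
    }
}
```

In this model the emptiness check gets more expensive with every put/clear cycle until something cleans up the markers. Whether that is what happens in the reported job would have to be confirmed against RocksDB itself.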


Maybe Yun Tang will come up with another idea.

Best,
Aljoscha

On 16.06.20 12:42, nick toker wrote:

Hi,

We used Flink versions 1.9.1 and 1.10.1.
We used the default RocksDB configuration.
The streaming pipeline is very simple:

1. Kafka consumer
2. Process function
3. Kafka producer

The code of the process function is listed below:

private transient MapState<String, Object> testMapState;

@Override
public void processElement(Map<String, Object> value, Context ctx,
        Collector<Map<String, Object>> out) throws Exception {
    // Only act when no state is stored for the current key.
    if (testMapState.isEmpty()) {
        testMapState.putAll(value);
        out.collect(value);
        // The state is cleared right away, so each record writes and then deletes its entries.
        testMapState.clear();
    }
}

We used the same code with ValueState and observed the same results.


BR,

Nick


On Tue, 16 Jun 2020 at 11:56, Yun Tang <myas...@live.com> wrote:

Hi Nick

It's really strange that performance could improve when checkpointing is
enabled.
In general, enabling checkpointing might slightly degrade the performance of
the whole job.

Could you give more details, e.g. the Flink version, the RocksDB configuration,
and simple code that reproduces this problem?

Best
Yun Tang
------------------------------
*From:* nick toker <nick.toker....@gmail.com>
*Sent:* Tuesday, June 16, 2020 15:44
*To:* user@flink.apache.org <user@flink.apache.org>
*Subject:* Improved performance when using incremental checkpoints

Hello,

We are using RocksDB as the state backend.
At first we didn't enable the checkpointing mechanism.

We observed the following behaviour and are wondering why.

When using RocksDB *without* checkpointing, the performance was
extremely bad.
When we enabled checkpointing, the performance improved by *a
factor of 10*.

Could you please explain whether this behaviour is expected?
Could you please explain why enabling checkpointing significantly
improves the performance?

BR,
Nick
