Hi Avi,

The maximum parallelism is not easy to change once a job has been started.
Changing it requires migrating the job's checkpoints/savepoints so that the
keyed state entries are rehashed into the new number of key groups (the unit
of keyed state storage). You can try the Bravo tool for this [1].
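
To illustrate why the state has to be rehashed, here is a minimal sketch of how a key ends up in a key group and how key groups are spread over parallel subtasks. It is not Flink's actual code: the class and method names are made up for this example, and spread() is only a stand-in for the murmur hash that Flink applies to key.hashCode(). The point is that the key group depends on the maximum parallelism, so a different value places the same key in a different group.

```java
// Hypothetical, simplified sketch of key-group assignment (not Flink's classes).
class KeyGroupSketch {

    // Stand-in bit mixer; Flink uses its own murmur hash here (assumption: any
    // well-distributed, non-negative hash is enough to show the idea).
    static int spread(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        h *= 0xc2b2ae35;
        h ^= (h >>> 16);
        return h & Integer.MAX_VALUE; // clear the sign bit so the modulo is non-negative
    }

    // The key group is derived from the key's hash and the maximum parallelism.
    static int keyGroupFor(Object key, int maxParallelism) {
        return spread(key.hashCode()) % maxParallelism;
    }

    // Key groups are assigned to subtasks in contiguous ranges.
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        String[] keys = {"user-1", "user-2", "user-3"};
        int parallelism = 4;
        for (int maxParallelism : new int[] {128, 256}) {
            for (String key : keys) {
                int group = keyGroupFor(key, maxParallelism);
                int subtask = operatorIndexFor(group, maxParallelism, parallelism);
                System.out.println("maxParallelism=" + maxParallelism
                        + " key=" + key + " keyGroup=" + group + " subtask=" + subtask);
            }
        }
    }
}
```

Changing only the parallelism (the last argument) moves whole key groups between subtasks, which is why rescaling works without touching individual entries. Changing the maximum parallelism changes the key groups themselves, which is why the stored state needs a migration.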

As for the number of keys, you can try enabling the RocksDB metrics in
Flink [2], which are available since 1.7.
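
As a sketch, the option that [2] points to could be enabled like this. The key name is taken from the anchor of that documentation link; placing it in flink-conf.yaml is my assumption about how your cluster is configured. Note that the value RocksDB reports is an estimate, not an exact count.

```yaml
# flink-conf.yaml (assumed location of the cluster configuration)
# Expose RocksDB's estimated number of keys as a Flink metric.
state.backend.rocksdb.metrics.estimate-num-keys: true
```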

Best,
Andrey

[1] https://github.com/king/bravo
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#state-backend-rocksdb-metrics-estimate-num-keys

On Wed, Feb 13, 2019 at 4:58 PM Avi Levi <avi.l...@bluevoyant.com> wrote:

> Hi
> Looking at the production readiness
> <https://ci.apache.org/projects/flink/flink-docs-stable/ops/production_ready.html#set-maximum-parallelism-for-operators-explicitly>
> checklist - is there any rule of thumb for determining the maximum
> parallelism? We have a stateful pipeline with high throughput (4k
> requests/sec) running on Google Cloud (YARN).
> I understand that if we do not set it, the default is 128, although that
> may change in the future, and that once we set it, it cannot be changed
> later. Is that correct?
>
> Is there any way to get information about the state (RocksDB), e.g. the
> number of keys or a list of keys?
>
> Regards
> Avi
>
