I cannot say much about the concrete value, but if our users have problems
with the existing default value, then it makes sense to me to change it.

One thing to check could be whether it is possible to provide a meaningful
exception in case the state size exceeds the frame size. At the moment,
Flink should fail with a message saying that an RPC message exceeds the
maximum frame size. Maybe it is also possible to point the user towards
"state.backend.fs.memory-threshold" if the message exceeds the frame size
because of too much state.
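For context, a user running into this could tune the two relevant options in
flink-conf.yaml. This is a sketch only; the exact value syntax and defaults
vary by Flink version, and the values below are illustrative:

```yaml
# States below this size are stored inline in the checkpoint metadata
# instead of as separate files (current default 1K, proposed 100K).
state.backend.fs.memory-threshold: 100kb

# Maximum RPC message size; checkpoint acks carrying inlined state
# must fit within a single frame (default 10 MB).
akka.framesize: 10485760b
```

Raising the threshold trades fewer small files against larger RPC messages
and more JM heap usage, which is exactly the trade-off discussed below.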

Cheers,
Till

On Thu, May 14, 2020 at 2:34 PM Stephan Ewen <se...@apache.org> wrote:

> The parameter "state.backend.fs.memory-threshold" decides when a state will
> become a file and when it will be stored inline with the metadata (to avoid
> excessive amounts of small files).
>
> By default, this threshold is 1K - so every state above that size becomes a
> file. For many cases, this threshold seems to be too low.
> There is an interesting talk with background on this from Scott Kidder:
> https://www.youtube.com/watch?v=gycq0cY3TZ0
>
> I wanted to discuss increasing this to 100K by default.
>
> Advantage:
>   - This should help many users out of the box, which otherwise see
> checkpointing problems on systems like S3, GCS, etc.
>
> Disadvantage:
>   - For very large jobs, this increases the required heap memory on the JM
> side, because more state needs to be kept in-line when gathering the acks
> for a pending checkpoint.
>   - If tasks have many states and each state is roughly at this
> threshold, we increase the chance of exceeding the RPC frame size and
> failing the job.
>
> What do you think?
>
> Best,
> Stephan
>