Hi Yun.  The UI was not useful for this case.  I had a feeling beforehand
about what the issue was.  We refactored the state and now the checkpoint
is 10x faster.
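
The thread does not say how the state was refactored. As a rough
illustration of why appending to one very large value per key can get
expensive, here is a toy model in plain Python (not Flink code). It assumes
the store re-serializes a value in full every time it is updated, which may
or may not match a given state backend; the function names, the "user-1"
key, and the sample item are all hypothetical.

```python
import json


def append_to_list_value(store, key, item):
    """One big list value per key: read, decode, append, re-encode everything."""
    items = json.loads(store[key]) if key in store else []
    items.append(item)
    blob = json.dumps(items).encode("utf-8")
    store[key] = blob
    return len(blob)  # bytes rewritten by this single append


def append_as_own_entry(store, key, index, item):
    """One small entry per element, stored under a composite (key, index)."""
    blob = json.dumps(item).encode("utf-8")
    store[(key, index)] = blob
    return len(blob)  # only the new element is written


def total_bytes_written(n_items, item):
    """Total bytes written by n_items appends under each layout."""
    big, small = {}, {}
    big_total = sum(
        append_to_list_value(big, "user-1", item) for _ in range(n_items)
    )
    small_total = sum(
        append_as_own_entry(small, "user-1", i, item) for i in range(n_items)
    )
    return big_total, small_total


if __name__ == "__main__":
    big_total, small_total = total_bytes_written(200, {"id": 1, "payload": "x" * 50})
    print("one list value per key:", big_total, "bytes written")
    print("one entry per element: ", small_total, "bytes written")
```

In this model the bytes written grow quadratically with the number of
appends for the single-list layout and linearly for the per-element layout,
which is one plausible reason very large appended values hurt.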

On Mon, Jun 14, 2021 at 5:47 AM Yun Gao <yungao...@aliyun.com> wrote:

> Hi Dan,
>
> Flink already has a tool integrated into the web UI to monitor
> detailed checkpoint statistics [1]. It shows the time consumed in
> each part and each task, so it can be used to debug the checkpoint
> timeout.
>
> Best,
> Yun
>
>
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/monitoring/checkpoint_monitoring/
>
> ------------------Original Mail ------------------
> *Sender:*Dan Hill <quietgol...@gmail.com>
> *Send Date:*Sat Jun 12 09:15:50 2021
> *Recipients:*user <user@flink.apache.org>
> *Subject:*Checkpoint is timing out - inspecting state
>
>> Hi.
>>
>> We're doing something bad with our Flink state.  We just launched a
>> feature that creates very big values (lists of objects that we append to)
>> in MapState.
>>
>> Our checkpoints time out (10 minutes).  I'm assuming the values are too
>> big.  Backpressure is okay and cpu+memory metrics look okay.
>>
>> Questions
>>
>> 1. Is there an easy tool for inspecting the Flink state?
>>
>> I found this post about drilling into Flink state
>> <https://flink.apache.org/news/2020/01/29/state-unlocked-interacting-with-state-in-apache-flink.html>.
>> I was hoping for something more like a CLI.
>>
>> 2. Is there a way to break down the time spent during a checkpoint if it
>> times out?
>>
>> Thanks!
>> - Dan
>>
>>
>>
