Hi Congxian,
Since this morning we have been having more issues with checkpointing in 
production. What we see is that the sync and async durations for some subtasks 
are very long, but what is strange is that the sum of the sync and async 
durations is much less than the total end-to-end duration. Please check the 
following snapshot:



For example, for subtask 14: the sync duration is 4 mins and the async duration 
is 3 mins, but the end-to-end duration is 53 mins!
We have a very long timeout value (1 hour) for checkpointing, but many 
checkpoints are still failing: some subtasks cannot finish checkpointing 
within 1 hour.

We would really appreciate your help here; this is a critical production 
problem for us at the moment.

Regards,
Bekir


> On 17 Jul 2019, at 17:46, Bekir Oguz <bekir.o...@persgroep.net> wrote:
> 
> 
> And I also extracted events fr
