Hi Henry, @Kien is right. Take a thread dump to see what the TaskManager was doing while the checkpoint was stuck. Also check whether GC is happening frequently.
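On a running TaskManager the usual route is the JDK command-line tools: `jstack <pid>` for the thread dump and `jstat -gcutil <pid> 1000` for GC activity. If it is easier to collect the same information from inside a JVM, the standard `java.lang.management` beans expose it too. The class below is only a rough sketch of that idea: the class name `JvmDiagnostics` and its methods are made up for illustration, and it inspects the JVM it runs in, not a remote TaskManager.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Flink: reports on the JVM it is running in.
public class JvmDiagnostics {

    /** Returns the name, state and stack trace of every live thread in this JVM. */
    public static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: do not collect locked monitors or ownable synchronizers.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    /** Sums the collection counts of all garbage collectors since JVM start. */
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sums the approximate accumulated GC time in milliseconds since JVM start. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(threadDump());
        System.out.println("GC collections: " + totalGcCount()
                + ", total GC time: " + totalGcTimeMillis() + " ms");
    }
}
```

Taking two or three dumps a few seconds apart and comparing the stack of the subtask that never acknowledges usually shows whether it is blocked in the same place each time. Sampling the GC counters twice and comparing the difference shows how much time went to GC in that interval.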
Best,
Hequn

On Wed, Oct 24, 2018 at 5:03 PM 徐涛 <happydexu...@gmail.com> wrote:

> Hi,
> I am running a Flink application with parallelism 64. I left the
> checkpoint timeout at its default value of 10 minutes. The state size is
> less than 1 MB, and I am using the FsStateBackend.
> The application triggers some checkpoints, but all of them fail
> with "Checkpoint expired before completing". In the checkpoint
> history I found that 63 subtasks acknowledged but one is left as n/a, and
> the alignment duration is also quite long, about 5m27s.
> Why does that one subtask not acknowledge? And since the alignment
> duration is long, what influences the alignment duration?
> Thanks a lot.
>
> Best
> Henry