Hi Andrey,
Yes. Setting `setFailOnCheckpointingErrors(false)` solved the problem.
But in the meantime I am getting this error:
2019-01-16 21:07:26,979 ERROR
org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerDetailsHandler
- Implementation error: Unhandled exception.
Hi Sohi,
Could it be that you configured your job tasks to fail if checkpoint fails
(streamExecutionEnvironment.getCheckpointConfig().setFailOnCheckpointingErrors(true))?
Could you send the complete job master log?
If checkpoint 470 has been subsumed by 471, it could be that its directory
is
Hi Sohimankotia,
you can control Flink's failure behaviour in case of a checkpoint failure
via `ExecutionConfig#setFailTaskOnCheckpointError(boolean)`. By default it
is set to true, which means that a Flink task will fail if a
checkpoint error occurs. If you set it to false, then the job
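As a minimal sketch of the configuration being discussed (assuming a standard Flink 1.5 streaming job; class and interval are illustrative, not from the original thread):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointFailureConfig {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every minute, as in the job described below.
        env.enableCheckpointing(60_000);

        // Do not fail the task when a checkpoint fails; the failed
        // checkpoint is declined and the next one is attempted instead.
        env.getCheckpointConfig().setFailOnCheckpointingErrors(false);
    }
}
```

This is a job-configuration fragment; it requires the Flink streaming dependencies on the classpath to compile.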
Hi, Sohi
You can check out the docs [1][2] to find the answer.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/restart_strategies.html
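For reference, a hedged sketch combining what the two linked doc pages cover, enabling checkpointing [1] and configuring a restart strategy [2] (the attempt count and delays are illustrative values, not from the thread):

```java
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartStrategyExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // [1] Enable checkpointing with a 60 s interval.
        env.enableCheckpointing(60_000);

        // [2] Restart the job up to 3 times, waiting 10 s between attempts,
        // instead of failing it permanently on the first error.
        env.setRestartStrategy(
                RestartStrategies.fixedDelayRestart(3, Time.seconds(10)));
    }
}
```

This too is a configuration fragment that needs the Flink dependencies to compile; see the linked docs for the full set of options.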
Yes. The file got deleted.
2019-01-15 10:40:41,360 INFO FSNamesystem.audit: allowed=true ugi=hdfs
(auth:SIMPLE) ip=/192.168.3.184 cmd=delete
src=/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19
dst=null perm=null
Hi, Sohi
Seems like the checkpoint file
`hdfs:/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19`
did not exist for some reason. You can check the life cycle of this file
in the HDFS audit log and find out why it did not exist. Maybe the
Hi,
Flink - 1.5.5
My streaming job checkpoints every minute. I am getting the following
exception.
2019-01-15 01:59:04,680 INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed
checkpoint 469 for job e9a08c0661a6c31b5af540cf352e1265 (2736 bytes in 124
ms).
2019-01-15