Re: Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-16 Thread sohimankotia
Hi Andrey, Yes. Setting setFailOnCheckpointingErrors(false) solved the problem. But in the meantime I am getting this error: 2019-01-16 21:07:26,979 ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerDetailsHandler - Implementation error: Unhandled exception.
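
For reference, the setting mentioned in this message can be sketched roughly as follows. This is a minimal configuration fragment, not the poster's actual job: it assumes a Flink 1.5.x DataStream program with the Flink streaming dependency on the classpath, and the class name and job name are hypothetical. The one-minute interval is taken from the original question in this thread.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical class name, for illustration only.
public class CheckpointToleranceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every minute, as described in the original question.
        env.enableCheckpointing(60_000L);

        // Do not fail the task when a checkpoint fails; the failed
        // checkpoint is declined and the job keeps running.
        env.getCheckpointConfig().setFailOnCheckpointingErrors(false);

        // Placeholder pipeline so the fragment is a complete program.
        env.fromElements(1, 2, 3).print();

        env.execute("checkpoint-tolerance-sketch"); // hypothetical job name
    }
}
```

With this setting a failed checkpoint no longer restarts the job, so it is worth watching the checkpoint statistics to notice if checkpoints start failing repeatedly.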

Re: Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-16 Thread Andrey Zagrebin
Hi Sohi, Could it be that you configured your job tasks to fail if a checkpoint fails (streamExecutionEnvironment.getCheckpointConfig().setFailOnCheckpointingErrors(true))? Could you send the complete job master log? If checkpoint 470 has been subsumed by 471, it could be that its directory is

Re: Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-16 Thread Till Rohrmann
Hi Sohimankotia, you can control Flink's failure behaviour in case of a checkpoint failure via `ExecutionConfig#setFailTaskOnCheckpointError(boolean)`. By default it is set to true, which means that a Flink task will fail if a checkpoint error occurs. If you set it to false, then the job

Re: Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-15 Thread Congxian Qiu
Hi Sohi, you can check the docs [1][2] to find the answer. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing [2] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/restart_strategies.html

Re: Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-15 Thread sohimankotia
Yes, the file got deleted. 2019-01-15 10:40:41,360 INFO FSNamesystem.audit: allowed=true ugi=hdfs (auth:SIMPLE) ip=/192.168.3.184 cmd=delete src=/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19 dst=null perm=null
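
A search like the one that turned up this audit entry could be scripted along the following lines. This is only a sketch: the class and method names are hypothetical, it assumes the audit fields are whitespace-separated key=value pairs as in the line quoted in this message, and it would not handle paths that contain spaces. The sample line is the one from this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for scanning HDFS audit log lines.
class AuditLogScan {

    // Splits the part after "FSNamesystem.audit:" into key=value fields.
    // Tokens without '=' (such as "(auth:SIMPLE)") are skipped.
    static Map<String, String> parseFields(String line) {
        String marker = "FSNamesystem.audit:";
        int at = line.indexOf(marker);
        String body = at >= 0 ? line.substring(at + marker.length()) : line;
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : body.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) {
                fields.put(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        return fields;
    }

    // Returns the src paths of all delete commands whose path
    // contains the given fragment (for example "/chk-470/").
    static List<String> deletedPaths(List<String> lines, String pathFragment) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            Map<String, String> f = parseFields(line);
            String src = f.get("src");
            if ("delete".equals(f.get("cmd")) && src != null
                    && src.contains(pathFragment)) {
                hits.add(src);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2019-01-15 10:40:41,360 INFO FSNamesystem.audit: allowed=true "
            + "ugi=hdfs (auth:SIMPLE) ip=/192.168.3.184 cmd=delete "
            + "src=/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265"
            + "/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19 dst=null perm=null");
        for (String path : deletedPaths(lines, "/chk-470/")) {
            System.out.println(path);
        }
    }
}
```

Matching on the `ugi` and `ip` fields of the hits would then show which user and host issued the delete.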

Re: Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-15 Thread Congxian Qiu
Hi Sohi, it seems that the checkpoint file `hdfs:/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19` did not exist for some reason. You can check the life cycle of this file in the HDFS audit log to find out why it did not exist. Maybe the

Streaming Checkpoint - Could not materialize checkpoint Exception

2019-01-14 Thread sohimankotia
Hi, Flink version: 1.5.5. My streaming job checkpoints every minute. I am getting the following exception: 2019-01-15 01:59:04,680 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 469 for job e9a08c0661a6c31b5af540cf352e1265 (2736 bytes in 124 ms). 2019-01-15