Henrik created FLINK-12381:
------------------------------
Summary: Without failover (aka "HA") configured, checkpointing crashes
after a full restart
Key: FLINK-12381
URL: https://issues.apache.org/jira/browse/FLINK-12381
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing
Affects Versions: 1.8.0
Environment: Same as FLINK-12379, FLINK-12377, and FLINK-12376
Reporter: Henrik
{code:java}
Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 'gs://example_bucket/flink/checkpoints/00000000000000000000000000000000/chk-16/_metadata' already exists
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.createChannel(GoogleHadoopOutputStream.java:85)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:74)
    at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.create(GoogleHadoopFileSystemBase.java:797)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:929)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:910)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:807)
    at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
    at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
    at org.apache.flink.runtime.state.filesystem.FsCheckpointMetadataOutputStream.<init>(FsCheckpointMetadataOutputStream.java:65)
    at org.apache.flink.runtime.state.filesystem.FsCheckpointStorageLocation.createMetadataOutputStream(FsCheckpointStorageLocation.java:104)
    at org.apache.flink.runtime.checkpoint.PendingCheckpoint.finalizeCheckpoint(PendingCheckpoint.java:259)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.completePendingCheckpoint(CheckpointCoordinator.java:829)
    ... 8 more
{code}
Instead, Flink should either overwrite the existing checkpoint or refuse to
start the job at all. A partial, undefined failure is not acceptable.
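
To make the two requested behaviours concrete, here is a minimal sketch that uses java.nio.file as a stand-in for the Hadoop/GCS file system. It is not Flink code: the class MetadataWritePolicy and all of its methods are hypothetical names invented for illustration, and the "checkpoint directory is non-empty" check is only an assumption about what a fail-fast test at job start could look like.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.stream.Stream;

/** Hypothetical illustration only; none of these names exist in Flink. */
public class MetadataWritePolicy {

    /** Roughly the reported behaviour: a create-only write that fails late if the file exists. */
    public static void writeCreateOnly(Path metadata, byte[] contents) throws IOException {
        Files.createDirectories(metadata.getParent());
        // CREATE_NEW throws java.nio.file.FileAlreadyExistsException when the file is present.
        Files.write(metadata, contents, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }

    /** Requested option 1: overwrite a leftover _metadata file. */
    public static void writeOverwriting(Path metadata, byte[] contents) throws IOException {
        Files.createDirectories(metadata.getParent());
        Files.write(metadata, contents,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    /** Requested option 2: refuse to start when the checkpoint directory already has content. */
    public static void failFastIfNotEmpty(Path checkpointDir) throws IOException {
        if (!Files.isDirectory(checkpointDir)) {
            return; // nothing left over from an earlier run
        }
        try (Stream<Path> entries = Files.list(checkpointDir)) {
            if (entries.findAny().isPresent()) {
                throw new FileAlreadyExistsException(checkpointDir.toString());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path jobDir = Files.createTempDirectory("checkpoints");
        Path metadata = jobDir.resolve("chk-16").resolve("_metadata");

        failFastIfNotEmpty(jobDir);                      // passes: directory is empty
        writeCreateOnly(metadata, new byte[] {1});       // first run succeeds
        writeOverwriting(metadata, new byte[] {2, 3});   // option 1: replaces the file

        try {
            failFastIfNotEmpty(jobDir);                  // option 2: rejects the restart up front
        } catch (FileAlreadyExistsException e) {
            System.out.println("refusing to start, leftover data in " + e.getFile());
        }
    }
}
```

Either policy turns the failure into a defined outcome: the overwrite variant never throws on a leftover file, while the fail-fast variant surfaces the conflict before any checkpoint is attempted.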
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)