[ https://issues.apache.org/jira/browse/FLINK-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214912#comment-16214912 ]
ASF GitHub Bot commented on FLINK-7783:
---------------------------------------
Github user aljoscha commented on a diff in the pull request:
https://github.com/apache/flink/pull/4879#discussion_r146218580
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java ---
@@ -302,4 +302,32 @@ public void registerSharedStatesAfterRestored(SharedStateRegistry sharedStateReg
     public String toString() {
         return String.format("Checkpoint %d @ %d for %s", checkpointID, timestamp, job);
     }
+
+    @Override
+    public boolean equals(Object o) {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        CompletedCheckpoint that = (CompletedCheckpoint) o;
+
+        if (checkpointID != that.checkpointID) {
+            return false;
+        }
+        if (timestamp != that.timestamp) {
--- End diff --
Also, a checkpoint store only ever contains checkpoints for a single JobId,
so including the JobId in equals() should not be necessary either.
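To illustrate the point, here is a minimal sketch of an equals()/hashCode() pair that leaves the job out of the comparison. `CheckpointSketch` is a hypothetical stand-in, not the real `CompletedCheckpoint`; it models only the three fields visible in the diff, and uses a plain `String` where the real class presumably holds a JobID. It is not the code that was merged.

```java
/** Hypothetical stand-in for CompletedCheckpoint; models only the fields visible in the diff. */
final class CheckpointSketch {
    private final String job;        // assumption: a String stands in for the JobID
    private final long checkpointID;
    private final long timestamp;

    CheckpointSketch(String job, long checkpointID, long timestamp) {
        this.job = job;
        this.checkpointID = checkpointID;
        this.timestamp = timestamp;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        CheckpointSketch that = (CheckpointSketch) o;
        // The job is deliberately not compared: within one store it is the same for every entry.
        return checkpointID == that.checkpointID && timestamp == that.timestamp;
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields compared in equals() to keep the two methods consistent.
        return 31 * Long.hashCode(checkpointID) + Long.hashCode(timestamp);
    }

    @Override
    public String toString() {
        return String.format("Checkpoint %d @ %d for %s", checkpointID, timestamp, job);
    }
}
```

With this definition, two instances that differ only in the job compare as equal, which is harmless only under the stated assumption that a store never mixes jobs.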
> Don't always remove checkpoints in ZooKeeperCompletedCheckpointStore#recover()
> ------------------------------------------------------------------------------
>
> Key: FLINK-7783
> URL: https://issues.apache.org/jira/browse/FLINK-7783
> Project: Flink
> Issue Type: Sub-task
> Components: State Backends, Checkpointing
> Affects Versions: 1.4.0, 1.3.2
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
> Priority: Blocker
> Fix For: 1.4.0, 1.3.3
>
>
> Currently, we always delete checkpoint handles if they (or the data from the
> DFS) cannot be read:
> https://github.com/apache/flink/blob/91a4b276171afb760bfff9ccf30593e648e91dfb/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/ZooKeeperCompletedCheckpointStore.java#L180
> This can cause problems if the DFS is temporarily unavailable: we could
> inadvertently delete all checkpoints even though they are still valid.
> A user reported this problem on the mailing list:
> https://lists.apache.org/thread.html/9dc9b719cf8449067ad01114fedb75d1beac7b4dff171acdcc24903d@%3Cuser.flink.apache.org%3E
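One possible direction for the problem described above can be sketched as follows. This is only an illustration under stated assumptions, not the change made in the pull request: `RecoverSketch`, `Handle`, and the map of handles are all hypothetical and do not correspond to the `ZooKeeperCompletedCheckpointStore` API. The sketch shows a recovery loop that skips unreadable handles but leaves them in place, so a temporary DFS outage does not discard valid checkpoints.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical recovery loop; none of these names exist in Flink. */
final class RecoverSketch {

    /** Hypothetical handle whose read may fail, e.g. while the DFS is unreachable. */
    interface Handle {
        String read() throws IOException;
    }

    /**
     * Returns the checkpoints that could be read. Handles that fail to read are
     * skipped but NOT removed from the map, since the failure may be transient.
     */
    static List<String> recover(Map<String, Handle> handles) {
        List<String> recovered = new ArrayList<>();
        for (Map.Entry<String, Handle> entry : handles.entrySet()) {
            try {
                recovered.add(entry.getValue().read());
            } catch (IOException e) {
                // Keep the handle so a later recovery attempt can retry it.
                System.err.println("Could not read " + entry.getKey() + ": " + e.getMessage());
            }
        }
        return recovered;
    }
}
```

A real implementation would also have to decide when an unreadable handle is permanently broken and can be discarded; the sketch leaves that question open.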