[ 
https://issues.apache.org/jira/browse/FLINK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15708912#comment-15708912
 ] 

ASF GitHub Bot commented on FLINK-5158:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2873#discussion_r90258544
  
    --- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
 ---
    @@ -731,46 +700,100 @@ public boolean 
receiveAcknowledgeMessage(AcknowledgeCheckpoint message) throws E
     
                                                
discardState(message.getState());
                                }
    +
    +                           return true;
                        }
                        else if (checkpoint != null) {
                                // this should not happen
                                throw new IllegalStateException(
                                                "Received message for discarded 
but non-removed checkpoint " + checkpointId);
                        }
                        else {
    +                           boolean wasPendingCheckpoint;
    +
                                // message is for an unknown checkpoint, or 
comes too late (checkpoint disposed)
                                if 
(recentPendingCheckpoints.contains(checkpointId)) {
    -                                   isPendingCheckpoint = true;
    +                                   wasPendingCheckpoint = true;
                                        LOG.warn("Received late message for now 
expired checkpoint attempt {}.", checkpointId);
                                }
                                else {
                                        LOG.debug("Received message for an 
unknown checkpoint {}.", checkpointId);
    -                                   isPendingCheckpoint = false;
    +                                   wasPendingCheckpoint = false;
                                }
     
                                // try to discard the state so that we don't 
have lingering state lying around
                                discardState(message.getState());
    +
    +                           return wasPendingCheckpoint;
    +                   }
    +           }
    +   }
    +
    +   private void completePendingCheckpoint(PendingCheckpoint 
pendingCheckpoint) throws CheckpointException {
    --- End diff --
    
    Missing JavaDocs, maybe add that this needs to be called in checkpoint lock 
scope


> Handle ZooKeeperCompletedCheckpointStore exceptions in CheckpointCoordinator
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-5158
>                 URL: https://issues.apache.org/jira/browse/FLINK-5158
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> The checkpoint coordinator does not properly handle exceptions when trying to 
> store completed checkpoints. As a result, completed checkpoints are not 
> properly cleaned up and even worse, the {{CheckpointCoordinator}} might get 
> stuck stopping triggering checkpoints.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to