[ https://issues.apache.org/jira/browse/FLINK-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670029#comment-15670029 ]
ASF GitHub Bot commented on FLINK-5073:
---------------------------------------
Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/2816#discussion_r88204552
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/util/ZooKeeperUtils.java ---
@@ -237,11 +242,12 @@ public static CompletedCheckpointStore createCompletedCheckpoints(
checkpointsPath += ZooKeeperSubmittedJobGraphStore.getPathForJob(jobId);
return new ZooKeeperCompletedCheckpointStore(
- maxNumberOfCheckpointsToRetain,
+ maxNumberOfCheckpointsToRetain,
userClassLoader,
--- End diff --
Inconsistent indentation.
> ZooKeeperCompletedCheckpointStore executes blocking delete operation in
> ZooKeeper client thread
> ----------------------------------------------------------------------------------------------
>
> Key: FLINK-5073
> URL: https://issues.apache.org/jira/browse/FLINK-5073
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.2.0, 1.1.3
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Fix For: 1.2.0, 1.1.4
>
>
> When deleting completed checkpoints from the
> {{ZooKeeperCompletedCheckpointStore}}, we first try to delete the meta
> state handle from ZooKeeper and then delete the actual checkpoint in a
> callback of that delete operation. The callback is executed by the
> ZooKeeper client's main thread, which is problematic because it blocks the
> client. If a delete operation takes longer than it takes to complete a
> checkpoint, delete operations for outdated checkpoints can pile up,
> because they are effectively executed sequentially.
> I propose executing the delete operations with a dedicated {{Executor}} so
> that the client's main thread stays free for ZooKeeper-related work.
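The proposal above can be illustrated with a minimal sketch. This is not the code from the pull request: the `Checkpoint` interface, `onMetadataRemoved`, `runDiscards`, and the thread names are hypothetical, and a single-threaded executor stands in for the ZooKeeper client's event thread so that the example needs only the JDK. It shows the callback doing nothing but handing the slow discard to a dedicated `Executor`.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncDiscardSketch {

    /** Hypothetical stand-in for a completed checkpoint with slow-to-delete state. */
    interface Checkpoint {
        void discardState() throws Exception;
    }

    /**
     * Hypothetical callback for "metadata node removed". It runs on the
     * client's event thread, so it only schedules the slow work and returns.
     */
    static Runnable onMetadataRemoved(Checkpoint checkpoint, Executor discardExecutor) {
        return () -> discardExecutor.execute(() -> {
            try {
                checkpoint.discardState();
            } catch (Exception e) {
                System.err.println("Could not discard checkpoint state: " + e);
            }
        });
    }

    /** Simulates n removals and returns the names of the threads that ran the discards. */
    static List<String> runDiscards(int n) throws InterruptedException {
        // Assumption: one thread plays the role of the ZooKeeper client's event thread.
        ExecutorService clientThread =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "client-event-thread"));
        ExecutorService discardExecutor =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "discard-executor"));
        List<String> threads = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(n);
        try {
            for (int i = 0; i < n; i++) {
                Checkpoint checkpoint = () -> {
                    threads.add(Thread.currentThread().getName());
                    done.countDown();
                };
                // The simulated client thread fires the callback, as the real client would.
                clientThread.execute(onMetadataRemoved(checkpoint, discardExecutor));
            }
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("discards did not finish in time");
            }
            return threads;
        } finally {
            clientThread.shutdown();
            discardExecutor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDiscards(3));
    }
}
```

Every recorded thread name is that of the dedicated executor, so the simulated client thread is never occupied by a discard. Whether the real fix should use a single thread or a pool for the executor is a separate design choice that this sketch does not settle.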