[
https://issues.apache.org/jira/browse/FLINK-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210263#comment-15210263
]
ASF GitHub Bot commented on FLINK-3257:
---------------------------------------
Github user uce commented on a diff in the pull request:
https://github.com/apache/flink/pull/1668#discussion_r57319737
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java
---
@@ -450,112 +450,121 @@ else if (operator != null) {
}
}
- @Override
- public boolean triggerCheckpoint(final long checkpointId, final long
timestamp) throws Exception {
- LOG.debug("Starting checkpoint {} on task {}", checkpointId,
getName());
-
- synchronized (lock) {
- if (isRunning) {
-
- // since both state checkpointing and
downstream barrier emission occurs in this
- // lock scope, they are an atomic operation
regardless of the order in which they occur
- // we immediately emit the checkpoint barriers,
so the downstream operators can start
- // their checkpoint work as soon as possible
-
operatorChain.broadcastCheckpointBarrier(checkpointId, timestamp);
-
- // now draw the state snapshot
- final StreamOperator<?>[] allOperators =
operatorChain.getAllOperators();
- final StreamTaskState[] states = new
StreamTaskState[allOperators.length];
+ /**
+ * Checkpoints all operator states of the current StreamTask.
+ * Thread-safety must be handled outside the scope of this function
+ */
+ protected boolean checkpointStatesInternal(final long checkpointId,
long timestamp) throws Exception {
--- End diff --
What about naming this as in the comments `drawStateSnapshot`? That it is
internal is more or less communicated by the fact that it is a `protected`
method.
> Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs
> -------------------------------------------------------------------
>
> Key: FLINK-3257
> URL: https://issues.apache.org/jira/browse/FLINK-3257
> Project: Flink
> Issue Type: Improvement
> Reporter: Paris Carbone
> Assignee: Paris Carbone
>
> The current snapshotting algorithm cannot support cycles in the execution
> graph. An alternative scheme can potentially include records in-transit
> through the back-edges of a cyclic execution graph (ABS [1]) to achieve the
> same guarantees.
> One straightforward implementation of ABS for cyclic graphs can work as
> follows along the lines:
> 1) Upon triggering a barrier in an IterationHead from the TaskManager start
> block output and start upstream backup of all records forwarded from the
> respective IterationSink.
> 2) The IterationSink should eventually forward the current snapshotting epoch
> barrier to the IterationSource.
> 3) Upon receiving a barrier from the IterationSink, the IterationSource
> should finalize the snapshot, unblock its output and emit all records
> in-transit in FIFO order and continue the usual execution.
> --
> Upon restart the IterationSource should emit all records from the injected
> snapshot first and then continue its usual execution.
> Several optimisations and slight variations can be potentially achieved but
> this can be the initial implementation take.
> [1] http://arxiv.org/abs/1506.08603
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)