[ https://issues.apache.org/jira/browse/FLINK-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15628527#comment-15628527 ]
ASF GitHub Bot commented on FLINK-4939: --------------------------------------- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/2707#discussion_r86108131 --- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/GenericWriteAheadSink.java --- @@ -77,63 +86,96 @@ public GenericWriteAheadSink(CheckpointCommitter committer, TypeSerializer<IN> s public void open() throws Exception { super.open(); committer.setOperatorId(id); - committer.setOperatorSubtaskId(getRuntimeContext().getIndexOfThisSubtask()); committer.open(); - cleanState(); - checkpointStreamFactory = - getContainingTask().createCheckpointStreamFactory(this); + + checkpointStreamFactory = getContainingTask() + .createCheckpointStreamFactory(this); + + cleanRestoredHandles(); } public void close() throws Exception { committer.close(); } /** - * Saves a handle in the state. + * Called when a checkpoint barrier arrives. + * Closes any open streams to the backend and marks them as pending for + * committing to the final output system, e.g. Cassandra. * - * @param checkpointId - * @throws IOException + * @param checkpointId the id of the latest received checkpoint. + * @throws IOException in case something went wrong when handling the stream to the backend. */ private void saveHandleInState(final long checkpointId, final long timestamp) throws Exception { + Preconditions.checkNotNull(this.pendingHandles, "The operator has not been properly initialized."); --- End diff -- this seems unnecessary. > GenericWriteAheadSink: Decouple the creating from the committing subtask for > a pending checkpoint > ------------------------------------------------------------------------------------------------- > > Key: FLINK-4939 > URL: https://issues.apache.org/jira/browse/FLINK-4939 > Project: Flink > Issue Type: Improvement > Components: Cassandra Connector > Reporter: Kostas Kloudas > Assignee: Kostas Kloudas > Fix For: 1.2.0 > > > So far the GenericWriteAheadSink expected that > the subtask that wrote a pending checkpoint to the > state backend, will be also the one to commit it to > the third-party storage system. > This issue targets at removing this assumption. To do this > the CheckpointCommitter has to be able to dynamically > take the subtaskIdx as a parameter when asking > if a checkpoint was committed and also change the > state kept by the GenericWriteAheadSink to also > include that subtask index of the subtask that wrote > the pending checkpoint. > This change is also necessary for making the operator rescalable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)