[ https://issues.apache.org/jira/browse/FLINK-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15628537#comment-15628537 ]
ASF GitHub Bot commented on FLINK-4939: --------------------------------------- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/2707#discussion_r86110695 --- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/GenericWriteAheadSink.java --- @@ -201,27 +259,83 @@ public void processElement(StreamRecord<IN> element) throws Exception { serializer.serialize(value, new DataOutputViewStreamWrapper(out)); } - /** - * This state is used to keep a list of all StateHandles (essentially references to past OperatorStates) that were - * used since the last completed checkpoint. - **/ - public static class ExactlyOnceState implements Serializable { + private static final class PendingCheckpointId implements Comparable<PendingCheckpointId>, Serializable { + + private static final long serialVersionUID = -3571036395734603443L; + + private final long checkpointId; + private final int subtaskId; + + PendingCheckpointId(long checkpointId, int subtaskId) { + this.checkpointId = checkpointId; + this.subtaskId = subtaskId; + } + + void serialize(DataOutputViewStreamWrapper outputStream) throws IOException { + outputStream.writeLong(this.checkpointId); + outputStream.writeInt(this.subtaskId); + } - private static final long serialVersionUID = -3571063495273460743L; + static PendingCheckpointId restore(DataInputViewStreamWrapper inputStream) throws IOException, ClassNotFoundException { + long checkpointId = inputStream.readLong(); + int subtaskId = inputStream.readInt(); + return new PendingCheckpointId(checkpointId, subtaskId); + } + + @Override + public int compareTo(PendingCheckpointId o) { + if (o == null) { + return -1; + } - protected TreeMap<Long, Tuple2<Long, StreamStateHandle>> pendingHandles; + long res = this.checkpointId - o.checkpointId; --- End diff -- you could use `Long#compare(long, long)` instead of all this. > GenericWriteAheadSink: Decouple the creating from the committing subtask for > a pending checkpoint > ------------------------------------------------------------------------------------------------- > > Key: FLINK-4939 > URL: https://issues.apache.org/jira/browse/FLINK-4939 > Project: Flink > Issue Type: Improvement > Components: Cassandra Connector > Reporter: Kostas Kloudas > Assignee: Kostas Kloudas > Fix For: 1.2.0 > > > So far the GenericWriteAheadSink expected that > the subtask that wrote a pending checkpoint to the > state backend, will be also the one to commit it to > the third-party storage system. > This issue targets at removing this assumption. To do this > the CheckpointCommitter has to be able to dynamically > take the subtaskIdx as a parameter when asking > if a checkpoint was committed and also change the > state kept by the GenericWriteAheadSink to also > include that subtask index of the subtask that wrote > the pending checkpoint. > This change is also necessary for making the operator rescalable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)