[
https://issues.apache.org/jira/browse/FLINK-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512533#comment-16512533
]
ASF GitHub Bot commented on FLINK-9487:
---------------------------------------
Github user azagrebin commented on a diff in the pull request:
https://github.com/apache/flink/pull/6159#discussion_r195441098
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/AbstractKeyGroupPartitionedSnapshot.java ---
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.apache.flink.core.memory.DataOutputView;
+
+import javax.annotation.Nonnull;
+
+import java.io.IOException;
+
+
+/**
+ * Abstract base class for implementations of
+ * {@link org.apache.flink.runtime.state.StateSnapshot.KeyGroupPartitionedSnapshot} based on the result of a
+ * {@link AbstractKeyGroupPartitioner}.
+ *
+ * @param <T> type of the written elements.
+ */
+public abstract class AbstractKeyGroupPartitionedSnapshot<T> implements StateSnapshot.KeyGroupPartitionedSnapshot {
+
+ /** The partitioning result to be written by key-group. */
+ @Nonnull
+ private final AbstractKeyGroupPartitioner.PartitioningResult<T> partitioningResult;
+
+ public AbstractKeyGroupPartitionedSnapshot(
+ @Nonnull AbstractKeyGroupPartitioner.PartitioningResult<T> partitioningResult) {
+ this.partitioningResult = partitioningResult;
+ }
+
+ @Override
+ public void writeMappingsInKeyGroup(@Nonnull DataOutputView dov, int keyGroupId) throws IOException {
+
+ final T[] groupedOut = partitioningResult.getPartitionedElements();
+
+ int startOffset = partitioningResult.getKeyGroupStartOffsetInclusive(keyGroupId);
+ int endOffset = partitioningResult.getKeyGroupEndOffsetExclusive(keyGroupId);
+
+ // write number of mappings in key-group
+ dov.writeInt(endOffset - startOffset);
+
+ // write mappings
+ for (int i = startOffset; i < endOffset; ++i) {
+ if(groupedOut[i] == null) {
+ throw new IllegalStateException();
+ }
+ writeElement(groupedOut[i], dov);
+ groupedOut[i] = null; // free asap for GC
+ }
+ }
+
+ /**
+ * This method defines how to write a single element to the output.
+ *
+ * @param element the element to be written.
+ * @param dov the output view to write the element.
+ * @throws IOException on write-related problems.
+ */
+ protected abstract void writeElement(@Nonnull T element, @Nonnull DataOutputView dov) throws IOException;
--- End diff --
`AbstractKeyGroupPartitionedSnapshot.writeElement` looks like a strategy for
`writeMappingsInKeyGroup`. It could be injected, either in the constructor or
into `writeMappingsInKeyGroup`, as a lambda implementing an `ElementWriter`
interface, which would eliminate the inheritance-based abstraction.
The partitioner could then just output a `PartitionedSnapshot`, i.e.
`partitioningResult` plus `writeMappingsInKeyGroup`. The result seems to have
only one purpose: to be written as a keyed snapshot.
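A rough sketch of what this could look like, not the actual Flink code: the `ElementWriter` signature, the `PartitionedSnapshot` class, and the `writeKeyGroupToBytes` helper are assumptions made for illustration, and `java.io.DataOutput` stands in for `DataOutputView` so the example has no Flink dependency.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PartitionedSnapshotSketch {

    /** Hypothetical strategy interface; only the name comes from the review comment. */
    @FunctionalInterface
    interface ElementWriter<T> {
        void writeElement(T element, DataOutput out) throws IOException;
    }

    /** Simplified stand-in for "partitioningResult + writeMappingsInKeyGroup". */
    static final class PartitionedSnapshot<T> {
        private final T[] partitionedElements;
        private final int[] endOffsetsExclusive; // one entry per key-group (assumed layout)
        private final ElementWriter<T> elementWriter;

        PartitionedSnapshot(T[] partitionedElements, int[] endOffsetsExclusive, ElementWriter<T> elementWriter) {
            this.partitionedElements = partitionedElements;
            this.endOffsetsExclusive = endOffsetsExclusive;
            this.elementWriter = elementWriter;
        }

        void writeMappingsInKeyGroup(DataOutput out, int keyGroupId) throws IOException {
            int startOffset = keyGroupId == 0 ? 0 : endOffsetsExclusive[keyGroupId - 1];
            int endOffset = endOffsetsExclusive[keyGroupId];

            // write number of mappings in key-group, then the mappings themselves
            out.writeInt(endOffset - startOffset);
            for (int i = startOffset; i < endOffset; ++i) {
                elementWriter.writeElement(partitionedElements[i], out);
                partitionedElements[i] = null; // free asap for GC, as in the diff
            }
        }
    }

    /** Illustration-only helper: serializes one key-group into a byte array. */
    static byte[] writeKeyGroupToBytes(PartitionedSnapshot<?> snapshot, int keyGroupId) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            snapshot.writeMappingsInKeyGroup(out, keyGroupId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String[] elements = {"a", "b", "c"};
        int[] endOffsets = {1, 3}; // key-group 0 holds "a", key-group 1 holds "b" and "c"
        PartitionedSnapshot<String> snapshot = new PartitionedSnapshot<String>(
            elements, endOffsets, (element, out) -> out.writeUTF(element));
        byte[] written = writeKeyGroupToBytes(snapshot, 1);
        System.out.println(written.length + " bytes written for key-group 1");
    }
}
```

With the writer passed in as a lambda, no subclass is needed per element type; the trade-off is that the snapshot class then carries one extra field.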
> Prepare InternalTimerHeap for asynchronous snapshots
> ----------------------------------------------------
>
> Key: FLINK-9487
> URL: https://issues.apache.org/jira/browse/FLINK-9487
> Project: Flink
> Issue Type: Sub-task
> Components: State Backends, Checkpointing, Streaming
> Reporter: Stefan Richter
> Assignee: Stefan Richter
> Priority: Major
> Fix For: 1.6.0
>
>
> When we want to snapshot timers with the keyed backend state, this must
> happen as part of an asynchronous snapshot.
> The data structure {{InternalTimerHeap}} needs to offer support for this
> through a lightweight copy mechanism (e.g. arraycopy of the timer queue,
> because timers are immutable w.r.t. serialization).
> We can also stop keeping the dedup maps in {{InternalTimerHeap}} separated by
> key-group; all timers can go into one map.
> Instead, we can implement online-partitioning as part of the asynchronous
> operation, similar to what we do in {{CopyOnWriteStateTable}} snapshots.
> Notice that in this intermediate state, the code will still run in the
> synchronous part until we are integrated with the backends for async
> snapshotting (next subtask of this jira).
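
The copy-then-partition idea in the issue description can be sketched as follows. This is a simplified illustration, not the `InternalTimerHeap` or `CopyOnWriteStateTable` implementation: the class name, the `partition` method, and the modulo-based key-group function are all assumptions, and plain `Integer` values stand in for timers.

```java
import java.util.Arrays;
import java.util.function.ToIntFunction;

final class KeyGroupPartitioningSketch {

    /**
     * Copies the elements of source into target, grouped by key-group
     * (a counting sort), and returns the exclusive end offset of each key-group.
     */
    static <T> int[] partition(T[] source, T[] target, int numKeyGroups, ToIntFunction<T> keyGroupOf) {
        int[] counts = new int[numKeyGroups];
        int[] groupOfElement = new int[source.length];

        // Pass 1: determine the key-group of every element and count per group.
        for (int i = 0; i < source.length; i++) {
            int group = keyGroupOf.applyAsInt(source[i]);
            groupOfElement[i] = group;
            counts[group]++;
        }

        // Prefix sums turn the counts into start and end offsets.
        int[] nextFree = new int[numKeyGroups];
        int[] endOffsetsExclusive = new int[numKeyGroups];
        int running = 0;
        for (int group = 0; group < numKeyGroups; group++) {
            nextFree[group] = running;
            running += counts[group];
            endOffsetsExclusive[group] = running;
        }

        // Pass 2: place each element into the next free slot of its key-group.
        for (int i = 0; i < source.length; i++) {
            target[nextFree[groupOfElement[i]]++] = source[i];
        }
        return endOffsetsExclusive;
    }

    public static void main(String[] args) {
        Integer[] queue = {5, 2, 7, 4, 1, null, null}; // backing array with spare capacity
        int size = 5;

        // Lightweight copy taken in the synchronous part of the snapshot.
        Integer[] copy = Arrays.copyOf(queue, size);

        // Partitioning can then run on the copy, outside the synchronous part.
        Integer[] partitioned = new Integer[size];
        int[] ends = partition(copy, partitioned, 2, value -> value % 2); // assumed key-group function
        System.out.println(Arrays.toString(partitioned) + " " + Arrays.toString(ends));
    }
}
```

Because partitioning works only on the copied array, the live queue can keep changing while the copy is grouped and written.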