[
https://issues.apache.org/jira/browse/FLINK-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512296#comment-16512296
]
ASF GitHub Bot commented on FLINK-9487:
---------------------------------------
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/6159#discussion_r195376144
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/state/AbstractKeyGroupPartitioner.java ---
@@ -0,0 +1,227 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.state;
+
+import org.apache.flink.util.Preconditions;
+
+import javax.annotation.Nonnegative;
+import javax.annotation.Nonnull;
+
+/**
+ * Abstract class that contains the base algorithm for partitioning data into key-groups. This algorithm currently works
+ * with two array (input, output) for optimal algorithmic complexity. Notice that this could also be implemented over a
+ * single array, using some cuckoo-hashing-style element replacement. This would have worse algorithmic complexity but
+ * better space efficiency. We currently prefer the trade-off in favor of better algorithmic complexity.
+ */
--- End diff --
👍
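
For readers without the full diff at hand, the two-array approach described in the Javadoc above can be sketched as a counting-sort-style partitioning. This is only an illustration under stated assumptions, not the code from the pull request: the class name KeyGroupPartitionSketch, the int element type, and the modulo-based keyGroupOf function are all hypothetical stand-ins.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; NOT the AbstractKeyGroupPartitioner from the PR. */
final class KeyGroupPartitionSketch {

    /**
     * Copies all elements of {@code input} into {@code output} so that elements of the
     * same key-group end up contiguous. Returns the start offset of every key-group in
     * {@code output}; the array has numKeyGroups + 1 entries and the last one equals
     * input.length.
     */
    static int[] partition(int[] input, int[] output, int numKeyGroups) {
        int[] offsets = new int[numKeyGroups + 1];

        // Pass 1: count how many elements fall into each key-group.
        for (int element : input) {
            offsets[keyGroupOf(element, numKeyGroups) + 1]++;
        }

        // Prefix sum turns the counts into start offsets.
        for (int i = 0; i < numKeyGroups; i++) {
            offsets[i + 1] += offsets[i];
        }

        // Pass 2: scatter each element to the next free slot of its key-group.
        int[] cursor = Arrays.copyOf(offsets, numKeyGroups);
        for (int element : input) {
            output[cursor[keyGroupOf(element, numKeyGroups)]++] = element;
        }
        return offsets;
    }

    /** Assumed key-group assignment for the sketch: a plain non-negative modulo. */
    static int keyGroupOf(int element, int numKeyGroups) {
        return Math.floorMod(element, numKeyGroups);
    }
}
```

Both passes are linear in the input size, at the cost of a second array of the same length. That is the trade-off the Javadoc names when it contrasts this with a single-array, in-place variant.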
> Prepare InternalTimerHeap for asynchronous snapshots
> ----------------------------------------------------
>
> Key: FLINK-9487
> URL: https://issues.apache.org/jira/browse/FLINK-9487
> Project: Flink
> Issue Type: Sub-task
> Components: State Backends, Checkpointing, Streaming
> Reporter: Stefan Richter
> Assignee: Stefan Richter
> Priority: Major
> Fix For: 1.6.0
>
>
> When we want to snapshot timers with the keyed backend state, this must
> happen as part of an asynchronous snapshot.
> The data structure {{InternalTimerHeap}} needs to offer support for this
> through a lightweight copy mechanism (e.g. arraycopy of the timer queue,
> because timers are immutable w.r.t. serialization).
> We can also stop keeping the dedup maps in {{InternalTimerHeap}} separated by
> key-group; all timers can go into one map.
> Instead, we can implement online-partitioning as part of the asynchronous
> operation, similar to what we do in {{CopyOnWriteStateTable}} snapshots.
> Notice that in this intermediate state, the code will still run in the
> synchronous part until we are integrated with the backends for async
> snapshotting (next subtask of this jira).
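
The "lightweight copy mechanism" mentioned in the issue description can be illustrated with a shallow array copy. The following is a minimal sketch under assumptions, not the InternalTimerHeap implementation: the class TimerQueueSnapshotSketch, its Timer type, and its methods are hypothetical, and heap ordering is left out to keep the example short.

```java
import java.util.Arrays;

/** Hypothetical sketch of a shallow-copy snapshot; NOT the real InternalTimerHeap. */
final class TimerQueueSnapshotSketch {

    /** Immutable timer, so queue and snapshot can safely share references. */
    static final class Timer {
        final long timestamp;
        final String key;

        Timer(long timestamp, String key) {
            this.timestamp = timestamp;
            this.key = key;
        }
    }

    private Timer[] queue = new Timer[4];
    private int size;

    /** Appends a timer, growing the backing array when it is full. */
    void add(Timer timer) {
        if (size == queue.length) {
            queue = Arrays.copyOf(queue, size * 2);
        }
        queue[size++] = timer;
    }

    /** Removes and returns the most recently added timer. */
    Timer removeLast() {
        Timer removed = queue[--size];
        queue[size] = null;
        return removed;
    }

    int size() {
        return size;
    }

    /**
     * Shallow copy of the live part of the queue. Only references are copied,
     * so the cost is a single arraycopy regardless of timer contents.
     */
    Timer[] snapshot() {
        Timer[] copy = new Timer[size];
        System.arraycopy(queue, 0, copy, 0, size);
        return copy;
    }
}
```

Because the timers are immutable, a background thread could serialize the returned array while the owning thread keeps adding and removing timers; later changes to the queue do not show up in the copy.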