Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/4353#discussion_r127708906
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/OperatorSubtaskState.java
---
@@ -18,20 +18,40 @@
package org.apache.flink.runtime.checkpoint;
+import org.apache.flink.annotation.VisibleForTesting;
import org.apache.flink.runtime.state.CompositeStateHandle;
import org.apache.flink.runtime.state.KeyedStateHandle;
import org.apache.flink.runtime.state.OperatorStateHandle;
import org.apache.flink.runtime.state.SharedStateRegistry;
import org.apache.flink.runtime.state.StateObject;
import org.apache.flink.runtime.state.StateUtil;
import org.apache.flink.runtime.state.StreamStateHandle;
+import org.apache.flink.util.Preconditions;
+
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
-import java.util.Arrays;
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
/**
- * Container for the state of one parallel subtask of an operator. This is part of the {@link OperatorState}.
+ * This class encapsulates the state for one parallel instance of an operator. The complete state of a (logical)
+ * operator (e.g. a flatmap operator) consists of the union of all {@link OperatorSubtaskState}s from all
+ * parallel tasks that physically execute parallelized, physical instances of the operator.
+ * <p>The full state of the logical operator is represented by {@link OperatorState} which consists of
+ * {@link OperatorSubtaskState}s.
+ * <p>Typically, we expect all collections in this class to be of size 0 or 1, because there up to one state handle
+ * produced per state type (e.g. managed-keyed, raw-operator, ...). In particular, this holds when taking a snapshot.
+ * The purpose of having the state handles in collections is that this class is also reused in restoring state.
+ * Under normal circumstances, the expected size of each collection is still 0 or 1, except for scale-down. In
+ * scale-down, one operator subtask can become responsible for the state of multiple previous subtasks. The collections
+ * can then store all the state handles that are relevant to build up the new subtask state.
+ * <p>There is no collection for legacy state because it is nor rescalable.
--- End diff --
Typo in the last Javadoc line: "nor rescalable" should be "not rescalable".
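For readers following the Javadoc, here is a minimal sketch of why the collections can grow beyond size 1 on scale-down. It is not taken from the pull request: `ScaleDownSketch`, `Handle`, and `reassign` are hypothetical names, and the contiguous-range assignment rule is an assumption chosen only for illustration (it does not model how Flink actually redistributes state, and it only covers unchanged or reduced parallelism).

```java
import java.util.ArrayList;
import java.util.List;

public class ScaleDownSketch {

    /** Hypothetical stand-in for a state handle; not a Flink class. */
    static final class Handle {
        final int oldSubtaskIndex;

        Handle(int oldSubtaskIndex) {
            this.oldSubtaskIndex = oldSubtaskIndex;
        }
    }

    /**
     * Distributes one handle per old subtask over the new subtasks.
     * The contiguous-range rule is an assumption made for illustration only.
     */
    static List<List<Handle>> reassign(int oldParallelism, int newParallelism) {
        List<List<Handle>> perNewSubtask = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            perNewSubtask.add(new ArrayList<>());
        }
        for (int old = 0; old < oldParallelism; old++) {
            // Map each old subtask index onto a contiguous range of new subtasks.
            int target = old * newParallelism / oldParallelism;
            perNewSubtask.get(target).add(new Handle(old));
        }
        return perNewSubtask;
    }

    public static void main(String[] args) {
        // Same parallelism: every collection has size 1.
        System.out.println(reassign(2, 2).get(0).size());
        // Scale-down from 4 to 2: each new subtask holds 2 handles.
        System.out.println(reassign(4, 2).get(0).size());
    }
}
```

With unchanged parallelism each new subtask ends up with a single handle, while going from 4 to 2 subtasks leaves each new subtask holding the handles of two previous subtasks, which is the case the Javadoc describes.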