[
https://issues.apache.org/jira/browse/BEAM-10760?focusedWorklogId=478414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478414
]
ASF GitHub Bot logged work on BEAM-10760:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 03/Sep/20 09:36
Start Date: 03/Sep/20 09:36
Worklog Time Spent: 10m
Work Description: mxm commented on a change in pull request #12759:
URL: https://github.com/apache/beam/pull/12759#discussion_r482841893
##########
File path:
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkStateInternals.java
##########
@@ -75,7 +80,14 @@
public class FlinkStateInternals<K> implements StateInternals {
private final KeyedStateBackend<ByteBuffer> flinkStateBackend;
- private Coder<K> keyCoder;
+ private final Coder<K> keyCoder;
+
+ /**
+ * A set which contains all state descriptors created in the global window. Used for cleanup on
Review comment:
Yes, that's right, but it doesn't matter because all other state should
also be cleaned up with the global window.
##########
File path:
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkStateInternals.java
##########
@@ -75,7 +80,14 @@
public class FlinkStateInternals<K> implements StateInternals {
private final KeyedStateBackend<ByteBuffer> flinkStateBackend;
- private Coder<K> keyCoder;
+ private final Coder<K> keyCoder;
+
+ /**
+ * A set which contains all state descriptors created in the global window. Used for cleanup on
Review comment:
I'll remove "global window".
##########
File path:
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkStateInternals.java
##########
@@ -139,17 +163,27 @@ private FlinkStateBinder(StateNamespace namespace, StateContext<?> stateContext)
@Override
public <T2> ValueState<T2> bindValue(
String id, StateSpec<ValueState<T2>> spec, Coder<T2> coder) {
- return new FlinkValueState<>(flinkStateBackend, id, namespace, coder);
+ ValueStateDescriptor<T2> valueStateDescriptor =
+ new ValueStateDescriptor<>(id, new CoderTypeSerializer<>(coder));
+ globalWindowStateDescriptors.add(valueStateDescriptor);
+ return new FlinkValueState<>(flinkStateBackend, id, namespace, valueStateDescriptor);
}
@Override
public <T2> BagState<T2> bindBag(String id, StateSpec<BagState<T2>> spec, Coder<T2> elemCoder) {
- return new FlinkBagState<>(flinkStateBackend, id, namespace, elemCoder);
+ ListStateDescriptor<T2> listStateDescriptor =
+ new ListStateDescriptor<>(id, new CoderTypeSerializer<>(elemCoder));
+ globalWindowStateDescriptors.add(listStateDescriptor);
Review comment:
This is unrelated to the changes here. Previously, this object was created
one layer down, in FlinkBagState. Let's handle such optimizations in a
follow-up.
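
For readers following the diff, the pattern under review (recording each state descriptor when it is bound so that a later cleanup can enumerate all of them) can be sketched in plain Java. This is only an illustration under stated assumptions: SketchStateStore, bind, write, clearKey and entryCount are hypothetical names, and a nested HashMap stands in for the Flink keyed state backend. It is not the code in FlinkStateInternals.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a keyed state backend; not the Flink API. */
final class SketchStateStore {
  // stateId -> (key -> value); models state cells partitioned by key.
  private final Map<String, Map<String, Object>> cells = new HashMap<>();

  // Every state id that was ever bound, remembered so cleanup can visit all of them.
  private final Set<String> boundStateIds = new HashSet<>();

  /** Registers a state id at bind time, analogous to adding a descriptor to a set. */
  void bind(String stateId) {
    boundStateIds.add(stateId);
    cells.computeIfAbsent(stateId, id -> new HashMap<>());
  }

  /** Writes a value for the given key, binding the state id first if needed. */
  void write(String stateId, String key, Object value) {
    bind(stateId);
    cells.get(stateId).put(key, value);
  }

  /** Removes the given key's entry from every state id that was bound. */
  void clearKey(String key) {
    for (String stateId : boundStateIds) {
      cells.get(stateId).remove(key);
    }
  }

  /** Total number of (stateId, key) entries currently held. */
  int entryCount() {
    int total = 0;
    for (Map<String, Object> perKey : cells.values()) {
      total += perKey.size();
    }
    return total;
  }
}
```

Because the set of bound ids is maintained as a side effect of binding, clearKey does not need to know in advance which states a key has touched.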
Issue Time Tracking
-------------------
Worklog Id: (was: 478414)
Time Spent: 3h 40m (was: 3.5h)
> Cleanup timers lead to unbounded state accumulation in global window
> --------------------------------------------------------------------
>
> Key: BEAM-10760
> URL: https://issues.apache.org/jira/browse/BEAM-10760
> Project: Beam
> Issue Type: Bug
> Components: runner-core, runner-flink
> Affects Versions: 2.21.0
> Reporter: Thomas Weise
> Assignee: Thomas Weise
> Priority: P2
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> For each key, the runner sets a cleanup timer that is designed to garbage
> collect state at the end of a window. For a global window, these timers will
> stay around until the pipeline terminates. Depending on the key cardinality,
> this can lead to unbounded state growth, which in the case of the Flink
> runner is observable in the growth of checkpoint size.
> https://lists.apache.org/thread.html/rae268806035688b77646195505e5b7a56568a38feb1e52d6341feedd%40%3Cdev.beam.apache.org%3E
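
The growth described above can be modelled with a small plain-Java sketch. It assumes one cleanup timer per key, set to the end of the window, and uses Long.MAX_VALUE as a stand-in for the end of the global window. CleanupTimerSketch and its methods are hypothetical names for illustration, not Beam or Flink code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of per-key cleanup timers; not Beam or Flink code. */
final class CleanupTimerSketch {
  // Assumed stand-in for the end of the global window: not reached while the pipeline runs.
  static final long END_OF_GLOBAL_WINDOW = Long.MAX_VALUE;

  // key -> timestamp at which that key's cleanup timer should fire.
  private final Map<String, Long> timersByKey = new HashMap<>();

  /** Registers one cleanup timer for the key if none is pending yet. */
  void onElement(String key) {
    timersByKey.putIfAbsent(key, END_OF_GLOBAL_WINDOW);
  }

  /** Fires and removes every timer at or before the watermark; returns how many fired. */
  int advanceWatermark(long watermark) {
    int before = timersByKey.size();
    timersByKey.values().removeIf(timestamp -> timestamp <= watermark);
    return before - timersByKey.size();
  }

  /** Number of timers still held, which is what ends up in a checkpoint in this model. */
  int pendingTimers() {
    return timersByKey.size();
  }
}
```

In this model the number of pending timers equals the number of distinct keys seen, and advancing the watermark to any ordinary value releases none of them, which mirrors the checkpoint growth reported in the issue.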