[
https://issues.apache.org/jira/browse/FLINK-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346476#comment-15346476
]
ASF GitHub Bot commented on FLINK-3974:
---------------------------------------
Github user tillrohrmann commented on a diff in the pull request:
https://github.com/apache/flink/pull/2110#discussion_r68239970
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/CopyingDirectedOutput.java
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.collector.selector;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.streaming.api.graph.StreamEdge;
+import org.apache.flink.streaming.api.operators.Output;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+
+import java.util.List;
+import java.util.Set;
+
+
+/**
+ * Special version of {@link DirectedOutput} that performs a shallow copy
of the
+ * {@link StreamRecord} to ensure that multi-chaining works correctly.
+ */
+public class CopyingDirectedOutput<OUT> extends DirectedOutput<OUT> {
+
+ @SuppressWarnings({"unchecked", "rawtypes"})
+ public CopyingDirectedOutput(
+ List<OutputSelector<OUT>> outputSelectors,
+ List<Tuple2<Output<StreamRecord<OUT>>, StreamEdge>>
outputs) {
+ super(outputSelectors, outputs);
+ }
+
+ @Override
+ public void collect(StreamRecord<OUT> record) {
+ Set<Output<StreamRecord<OUT>>> selectedOutputs =
selectOutputs(record);
+
+ for (Output<StreamRecord<OUT>> out : selectedOutputs) {
+ StreamRecord<OUT> shallowCopy =
record.copy(record.getValue());
+ out.collect(shallowCopy);
+ }
--- End diff --
Can't we save one copy operation by giving `record` to the last selected
output?
> enableObjectReuse fails when an operator chains to multiple downstream
> operators
> --------------------------------------------------------------------------------
>
> Key: FLINK-3974
> URL: https://issues.apache.org/jira/browse/FLINK-3974
> Project: Flink
> Issue Type: Bug
> Components: DataStream API
> Affects Versions: 1.0.3
> Reporter: B Wyatt
> Attachments: ReproFLINK3974.java, bwyatt-FLINK3974.1.patch
>
>
> Given a topology that looks like this:
> {code:java}
> DataStream<A> input = ...
> input
> .map(MapFunction<A,B>...)
> .addSink(...);
> input
> .map(MapFunction<A,C>...)
> .addSink(...);
> {code}
> enableObjectReuse() will cause an exception in the form of
> {{"java.lang.ClassCastException: B cannot be cast to A"}} to be thrown.
> It looks like the input operator calls {{Output<StreamRecord<A>>.collect}}
> which attempts to loop over the downstream operators and process them.
> However, the first map operation will call {{StreamRecord<>.replace}} which
> mutates the value stored in the StreamRecord<>.
> As a result, when the {{Output<StreamRecord<A>>.collect}} call passes the
> {{StreamRecord<A>}} to the second map operation it is actually a
> {{StreamRecord<B>}} and behaves as if the two map operations were serial
> instead of parallel.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)