[ 
https://issues.apache.org/jira/browse/FLINK-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346476#comment-15346476
 ] 

ASF GitHub Bot commented on FLINK-3974:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2110#discussion_r68239970
  
    --- Diff: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/collector/selector/CopyingDirectedOutput.java
 ---
    @@ -0,0 +1,51 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.streaming.api.collector.selector;
    +
    +import org.apache.flink.api.java.tuple.Tuple2;
    +import org.apache.flink.streaming.api.graph.StreamEdge;
    +import org.apache.flink.streaming.api.operators.Output;
    +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
    +
    +import java.util.List;
    +import java.util.Set;
    +
    +
    +/**
    + * Special version of {@link DirectedOutput} that performs a shallow copy 
of the
    + * {@link StreamRecord} to ensure that multi-chaining works correctly.
    + */
    +public class CopyingDirectedOutput<OUT> extends DirectedOutput<OUT> {
    +
    +   @SuppressWarnings({"unchecked", "rawtypes"})
    +   public CopyingDirectedOutput(
    +                   List<OutputSelector<OUT>> outputSelectors,
    +                   List<Tuple2<Output<StreamRecord<OUT>>, StreamEdge>> 
outputs) {
    +           super(outputSelectors, outputs);
    +   }
    +
    +   @Override
    +   public void collect(StreamRecord<OUT> record) {
    +           Set<Output<StreamRecord<OUT>>> selectedOutputs = 
selectOutputs(record);
    +
    +           for (Output<StreamRecord<OUT>> out : selectedOutputs) {
    +                   StreamRecord<OUT> shallowCopy = 
record.copy(record.getValue());
    +                   out.collect(shallowCopy);
    +           }
    --- End diff --
    
    Can't we save one copy operation by giving `record` to the last selected 
output?


> enableObjectReuse fails when an operator chains to multiple downstream 
> operators
> --------------------------------------------------------------------------------
>
>                 Key: FLINK-3974
>                 URL: https://issues.apache.org/jira/browse/FLINK-3974
>             Project: Flink
>          Issue Type: Bug
>          Components: DataStream API
>    Affects Versions: 1.0.3
>            Reporter: B Wyatt
>         Attachments: ReproFLINK3974.java, bwyatt-FLINK3974.1.patch
>
>
> Given a topology that looks like this:
> {code:java}
> DataStream<A> input = ...
> input
>     .map(MapFunction<A,B>...)
>     .addSink(...);
> input
>     .map(MapFunction<A,C>...)
>     ​.addSink(...);
> {code}
> enableObjectReuse() will cause an exception in the form of 
> {{"java.lang.ClassCastException: B cannot be cast to A"}} to be thrown.
> It looks like the input operator calls {{Output<StreamRecord<A>>.collect}} 
> which attempts to loop over the downstream operators and process them.
> However, the first map operation will call {{StreamRecord<>.replace}} which 
> mutates the value stored in the StreamRecord<>.  
> As a result, when the {{Output<StreamRecord<A>>.collect}} call passes the 
> {{StreamRecord<A>}} to the second map operation it is actually a 
> {{StreamRecord<B>}} and behaves as if the two map operations were serial 
> instead of parallel.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to