[ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812967#comment-16812967
 ] 

yankai zhang commented on FLINK-12113:
--------------------------------------

Yes, _fromCollection(Iterator, Class)_ works as expected when the iterator is 
not an anonymous class.

The problem is that an anonymous class created in an instance method 
implicitly references the outer _this_ (even though it is never used), and the 
outer _this_ is not serializable. Removing such unused references is exactly 
what _StreamExecutionEnvironment#clean_ is supposed to do.
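
The capture described above can be reproduced without Flink. The following is 
a minimal sketch using only the JDK; the class name _OuterThisCapture_ and the 
helper _isSerializable_ are made up for illustration and are not part of Flink:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Iterator;

// Illustrative only: the outer class is deliberately NOT Serializable.
public class OuterThisCapture {

    interface IS<E> extends Iterator<E>, Serializable { }

    // Created in an instance method: the anonymous class keeps a hidden
    // reference to OuterThisCapture.this, although it never uses it.
    IS<Object> fromInstanceMethod() {
        return new IS<Object>() {
            @Override public boolean hasNext() { return false; }
            @Override public Object next() { return null; }
        };
    }

    // Created in a static method: there is no enclosing instance to capture.
    static IS<Object> fromStaticMethod() {
        return new IS<Object>() {
            @Override public boolean hasNext() { return false; }
            @Override public Object next() { return null; }
        };
    }

    // Hypothetical helper: true if Java serialization accepts the object.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // false: the hidden outer reference is not serializable
        System.out.println(isSerializable(new OuterThisCapture().fromInstanceMethod()));
        // true: nothing is captured
        System.out.println(isSerializable(fromStaticMethod()));
    }
}
```

The two iterators are identical apart from where they are created, so the 
difference in behavior comes only from the implicit outer reference.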

In fact, the iterator passed by the user is wrapped in a 
_FromIteratorFunction_, and _StreamExecutionEnvironment#clean_ is then called 
on that wrapper instance, not on the iterator itself. However, the current 
implementation of _StreamExecutionEnvironment#clean_ is not recursive, so it 
cannot find and clean the _this_ reference nested deeper in the closure.
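
To illustrate the difference between a shallow and a recursive clean, here is 
a rough sketch in plain Java. It is not Flink's ClosureCleaner: _Wrapper_ is a 
hypothetical stand-in for _FromIteratorFunction_, and _shallowClean_ and 
_deepClean_ are made-up helpers that null the synthetic _this$N_ fields via 
reflection. It only looks at fields declared on each object's own class and 
skips JDK classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.Set;

// Illustrative only: the outer class is deliberately NOT Serializable.
public class ShallowVsDeepClean {

    interface IS<E> extends Iterator<E>, Serializable { }

    // Hypothetical stand-in for the wrapper around the user's iterator.
    static class Wrapper implements Serializable {
        final Iterator<Object> iterator;
        Wrapper(Iterator<Object> iterator) { this.iterator = iterator; }
    }

    // Instance method, so the anonymous iterator captures the outer this.
    Wrapper wrapAnonymousIterator() {
        return new Wrapper(new IS<Object>() {
            @Override public boolean hasNext() { return false; }
            @Override public Object next() { return null; }
        });
    }

    // Nulls synthetic this$N fields declared on obj's own class only.
    static void shallowClean(Object obj) {
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (f.getName().startsWith("this$")) {
                try {
                    f.setAccessible(true);
                    f.set(obj, null);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }

    // Same as shallowClean, but also descends into reference-typed fields.
    static void deepClean(Object obj) {
        deepClean(obj, Collections.newSetFromMap(new IdentityHashMap<>()));
    }

    private static void deepClean(Object obj, Set<Object> seen) {
        if (obj == null
                || obj.getClass().getName().startsWith("java.")
                || !seen.add(obj)) {
            return; // nothing to do, JDK class, or already visited
        }
        shallowClean(obj);
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) {
                continue;
            }
            try {
                f.setAccessible(true);
                deepClean(f.get(obj), seen);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Hypothetical helper: true if Java serialization accepts the object.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Wrapper w = new ShallowVsDeepClean().wrapAnonymousIterator();
        shallowClean(w);                       // wrapper itself has no this$N field
        System.out.println(isSerializable(w)); // still false
        deepClean(w);                          // reaches the nested iterator
        System.out.println(isSerializable(w)); // true
    }
}
```

Cleaning only the wrapper leaves the nested iterator untouched, which matches 
the behavior described above; only the recursive variant reaches the captured 
outer reference.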

Here is my fully reproducible code:
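{code:java}
import java.io.Serializable;
import java.util.Iterator;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.junit.Test;

public class MainTest {

    // Serializable iterator type, so only the captured outer this can fail.
    interface IS<E> extends Iterator<E>, Serializable {
    }

    @Test
    public void cleanTest() {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // The anonymous class is created in an instance method, so it
        // implicitly references the non-serializable MainTest.this.
        env.fromCollection(new IS<Object>() {
            @Override
            public boolean hasNext() {
                return false;
            }

            @Override
            public Object next() {
                return null;
            }
        }, Object.class);
    }
}{code}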

> User code passing to fromCollection(Iterator, Class) not cleaned
> ----------------------------------------------------------------
>
>                 Key: FLINK-12113
>                 URL: https://issues.apache.org/jira/browse/FLINK-12113
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.7.2
>            Reporter: yankai zhang
>            Priority: Major
>         Attachments: image-2019-04-07-21-52-37-264.png, 
> image-2019-04-08-23-19-27-359.png
>
>
>  
> {code:java}
> interface IS<E> extends Iterator<E>, Serializable { }
> StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
> env.fromCollection(new IS<Object>() {
>     @Override
>     public boolean hasNext() {
>         return false;
>     }
>     @Override
>     public Object next() {
>         return null;
>     }
> }, Object.class);
> {code}
> The code above throws an exception:
> {code:java}
> org.apache.flink.api.common.InvalidProgramException: The implementation of 
> the SourceFunction is not serializable. The object probably contains or 
> references non serializable fields.
>   at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
> ....{code}
> My workaround is to call clean on the iterator instance before passing it in, like this:
>  
> {code:java}
> StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
> env.fromCollection(env.clean(new IS<Object>() {
>     @Override
>     public boolean hasNext() {
>         return false;
>     }
>     @Override
>     public Object next() {
>         return null;
>     }
> }), Object.class);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
