[
https://issues.apache.org/jira/browse/FLINK-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837427#comment-15837427
]
ASF GitHub Bot commented on FLINK-5612:
---------------------------------------
GitHub user mushketyk opened a pull request:
https://github.com/apache/flink/pull/3206
[FLINK-5612] Fix GlobPathFilter not-serializable exception
Thanks for contributing to Apache Flink. Before you open your pull request,
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your
pull request. For more information and/or questions please refer to the [How To
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful
description of your changes.
- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the JIRA id)
- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added
- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis build has passed
Fixed the GlobFilePathFilter serialization exception. As suggested in the JIRA
issue, I made the instantiation of the PathMatcher objects lazy so that they are
never serialized.
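The lazy-instantiation idea can be sketched roughly as follows. This is not the code from the pull request: the class name `LazyGlobFilter`, its fields, and the `roundTrip` helper are hypothetical, and the sketch works on plain strings rather than Flink's `Path`. It assumes the filter keeps only the glob pattern strings as serializable state, marks the `PathMatcher` lists `transient`, and builds them on the first `filterPath` call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a glob-based path filter; not Flink's actual class.
class LazyGlobFilter implements Serializable {
    private static final long serialVersionUID = 1L;

    // Only the pattern strings are part of the serialized state.
    private final List<String> includePatterns;
    private final List<String> excludePatterns;

    // PathMatcher instances are not serializable, so they are transient
    // and rebuilt on demand (also after deserialization, when they are null).
    private transient List<PathMatcher> includeMatchers;
    private transient List<PathMatcher> excludeMatchers;

    LazyGlobFilter(List<String> includePatterns, List<String> excludePatterns) {
        this.includePatterns = new ArrayList<>(includePatterns);
        this.excludePatterns = new ArrayList<>(excludePatterns);
    }

    private static List<PathMatcher> compile(List<String> patterns) {
        List<PathMatcher> matchers = new ArrayList<>();
        for (String pattern : patterns) {
            matchers.add(FileSystems.getDefault().getPathMatcher("glob:" + pattern));
        }
        return matchers;
    }

    private static boolean matchesAny(List<PathMatcher> matchers, Path path) {
        for (PathMatcher matcher : matchers) {
            if (matcher.matches(path)) {
                return true;
            }
        }
        return false;
    }

    // Returns true if the path should be filtered out (ignored).
    boolean filterPath(String pathString) {
        if (includeMatchers == null) {
            includeMatchers = compile(includePatterns);
            excludeMatchers = compile(excludePatterns);
        }
        Path path = Paths.get(pathString);
        boolean accepted = matchesAny(includeMatchers, path) && !matchesAny(excludeMatchers, path);
        return !accepted;
    }

    // Serializes and deserializes the filter, as a job submission would.
    static LazyGlobFilter roundTrip(LazyGlobFilter filter) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(filter);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LazyGlobFilter) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LazyGlobFilter filter = new LazyGlobFilter(
            Arrays.asList("**"),
            Arrays.asList("**/another_file.bin", "**/dataFile1.txt"));
        LazyGlobFilter copy = roundTrip(filter);
        System.out.println(copy.filterPath("dir/dataFile1.txt"));
        System.out.println(copy.filterPath("dir/other.txt"));
    }
}
```

Because the matchers are `transient`, the round trip succeeds whether or not `filterPath` was called before serialization. The lazy initialization here is not synchronized; that is acceptable only under the assumption that a filter instance is used from a single thread.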
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mushketyk/flink fix-serialization
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/3206.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3206
----
commit 47ab75792340191a0bbfdc8e9a3b527f819181a3
Author: Ivan Mushketyk <[email protected]>
Date: 2017-01-25T09:24:21Z
[FLINK-5612] Fix GlobPathFilter not-serializable exception
----
> GlobPathFilter not-serializable exception
> -----------------------------------------
>
> Key: FLINK-5612
> URL: https://issues.apache.org/jira/browse/FLINK-5612
> Project: Flink
> Issue Type: Bug
> Components: Batch Connectors and Input/Output Formats
> Affects Versions: 1.2.0, 1.3.0
> Reporter: Chesnay Schepler
> Assignee: Ivan Mushketyk
> Priority: Blocker
>
> A user reported on the mailing list a NotSerializableException when using
> the GlobFilePathFilter.
> It appears that the PathMatchers are all created as anonymous inner classes
> and thus contain a reference to the enclosing, non-serializable
> FileSystem class.
> We can fix this by moving the Matcher instantiation into filterPath(...).
> {code}
> public static void main(String[] args) throws Exception {
>     final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     final TextInputFormat format = new TextInputFormat(new Path("/temp"));
>     format.setFilesFilter(new GlobFilePathFilter(
>         Collections.singletonList("**"),
>         Arrays.asList("**/another_file.bin", "**/dataFile1.txt")
>     ));
>     DataSet<String> result = env.readFile(format, "/tmp");
>     result.writeAsText("/temp/out");
>     env.execute("GlobFilePathFilter-Test");
> }
> {code}
> {code}
> Exception in thread "main" org.apache.flink.optimizer.CompilerException: Error translating node 'Data Source "at readFile(ExecutionEnvironment.java:520) (org.apache.flink.api.java.io.TextInputFormat)" : NONE [[ GlobalProperties [partitioning=RANDOM_PARTITIONED] ]] [[ LocalProperties [ordering=null, grouped=null, unique=null] ]]': Could not write the user code wrapper class org.apache.flink.api.common.operators.util.UserCodeObjectWrapper : java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
>     at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:381)
>     at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:106)
>     at org.apache.flink.optimizer.plan.SourcePlanNode.accept(SourcePlanNode.java:86)
>     at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
>     at org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:128)
>     at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:192)
>     at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:188)
>     at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:91)
>     at com.apsaltis.EventDetectionJob.main(EventDetectionJob.java:75)
> Caused by: org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could not write the user code wrapper class org.apache.flink.api.common.operators.util.UserCodeObjectWrapper : java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
>     at org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:281)
>     at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.createDataSourceVertex(JobGraphGenerator.java:888)
>     at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:281)
>     ... 8 more
> Caused by: java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
>     at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>     at java.util.ArrayList.writeObject(ArrayList.java:747)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:483)
>     at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
>     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
>     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>     at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>     at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>     at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>     at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>     at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:317)
>     at org.apache.flink.util.InstantiationUtil.writeObjectToConfig(InstantiationUtil.java:254)
>     at org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:279)
>     ... 10 more
> {code}