[
https://issues.apache.org/jira/browse/FLINK-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276952#comment-16276952
]
ASF GitHub Bot commented on FLINK-8186:
---------------------------------------
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/5120
Looks good to me. This is actually two fixes in one:
1. Avro should not be in `flink-dist`: because its package name starts with
`org.apache.flink`, it would always be loaded *"parent first"*. That causes
problems because the Avro classes then exist multiple times, both in the parent
and in the child classloader, so equality and `instanceof` comparisons between
the two copies fail.
2. The Avro Utils were never using any code in the user code jar, only
code in the classpath.
This is good for now, but it shows that the reverse classloading has some
subtle implications as soon as Flink dependencies occur both in the user code
jar and in `/lib`.
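
To illustrate the duplicate-class problem from point 1, here is a minimal sketch that uses only the JDK. `DuplicateClassDemo`, `Payload`, and `ChildFirstLoader` are hypothetical names; `Payload` stands in for an Avro class that is visible to both the parent and the child classloader. This is not Flink's actual user-code classloader, only a simplified child-first loader under that assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class DuplicateClassDemo {

    /** Stand-in for a library class that is present in both classloaders. */
    public static class Payload {}

    /** Hypothetical child-first loader: defines the target class itself instead of delegating. */
    static class ChildFirstLoader extends ClassLoader {
        private final String target;

        ChildFirstLoader(ClassLoader parent, String target) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                // Everything else (java.lang.Object, ...) is delegated to the parent.
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the same .class file the parent uses, so both loaders see identical bytecode. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads a second copy of Payload through a fresh child-first loader. */
    public static Class<?> loadIsolatedCopy() throws ClassNotFoundException {
        String name = Payload.class.getName();
        ClassLoader parent = Payload.class.getClassLoader();
        return new ChildFirstLoader(parent, name).loadClass(name);
    }

    public static void main(String[] args) throws Exception {
        Class<?> parentCopy = Payload.class;
        Class<?> childCopy = loadIsolatedCopy();
        Object fromChild = childCopy.getDeclaredConstructor().newInstance();

        System.out.println("same name:  " + parentCopy.getName().equals(childCopy.getName()));
        System.out.println("same class: " + (parentCopy == childCopy));
        System.out.println("instanceof: " + (fromChild instanceof Payload));
    }
}
```

Both copies carry the same fully qualified name, yet the JVM treats them as unrelated types because a class's identity includes its defining classloader. That is why the identity and `instanceof` checks in the sketch come out false even though the bytecode is identical.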
> AvroInputFormat regression: fails to deserialize GenericRecords on standalone
> cluster with hadoop27 compat
> ----------------------------------------------------------------------------------------------------------
>
> Key: FLINK-8186
> URL: https://issues.apache.org/jira/browse/FLINK-8186
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.4.0
> Reporter: Sebastian Klemke
> Assignee: Aljoscha Krettek
> Priority: Blocker
> Fix For: 1.4.0
>
> Attachments: GenericRecordCount.java, pom.xml
>
>
> The following job runs fine on a Flink 1.3.2 cluster, but fails on a Flink
> 1.4.0 RC2 standalone cluster, "hadoop27" flavour:
> {code}
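> import org.apache.avro.generic.GenericRecord;
> import org.apache.flink.api.java.ExecutionEnvironment;
> import org.apache.flink.api.java.utils.ParameterTool;
> import org.apache.flink.core.fs.Path;
> import org.apache.flink.formats.avro.AvroInputFormat;
>
> public class GenericRecordCount {
>     public static void main(String[] args) throws Exception {
>         String input = ParameterTool.fromArgs(args).getRequired("input");
>         ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>         // Read the Avro file as GenericRecords and count them.
>         long count = env.readFile(new AvroInputFormat<>(new Path(input), GenericRecord.class), input)
>                 .count();
>         System.out.printf("Counted %d records\n", count);
>     }
> }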
> {code}
> Runs fine in LocalExecutionEnvironment and also on no-hadoop flavour
> standalone cluster, though. Exception thrown in Flink 1.4.0 hadoop27:
> {code}
> 12/01/2017 13:22:09 DataSource (at readFile(ExecutionEnvironment.java:514) (org.apache.flink.formats.avro.AvroInputFormat))(4/4) switched to FAILED
> java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.avro.generic.GenericRecord.<init>()
>     at org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:353)
>     at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:369)
>     at org.apache.avro.reflect.ReflectData.newRecord(ReflectData.java:901)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:212)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
>     at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>     at org.apache.flink.formats.avro.AvroInputFormat.nextRecord(AvroInputFormat.java:165)
>     at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:167)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NoSuchMethodException: org.apache.avro.generic.GenericRecord.<init>()
>     at java.lang.Class.getConstructor0(Class.java:3082)
>     at java.lang.Class.getDeclaredConstructor(Class.java:2178)
>     at org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:347)
>     ... 11 more
> {code}
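
The `NoSuchMethodException` for `GenericRecord.<init>()` in the trace above is what reflection reports whenever a no-arg constructor is looked up on an interface, and `GenericRecord` is an interface. The sketch below reproduces only that reflective lookup with JDK types; `InterfaceConstructorDemo` and `hasNoArgConstructor` are hypothetical names, and `java.util.List` is used as a stand-in for `GenericRecord`. It does not model why Avro took the reflective path in the first place.

```java
import java.util.ArrayList;
import java.util.List;

public class InterfaceConstructorDemo {

    /** Returns true if reflection can find a declared no-arg constructor on the given type. */
    public static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            // Interfaces declare no constructors, so the lookup always ends up here for them.
            return false;
        }
    }

    public static void main(String[] args) {
        // List is an interface (stand-in for GenericRecord); ArrayList is a concrete class.
        System.out.println("List:      " + hasNoArgConstructor(List.class));
        System.out.println("ArrayList: " + hasNoArgConstructor(ArrayList.class));
    }
}
```

The lookup fails for the interface and succeeds for the concrete class, which suggests the reader in the failing job was trying to instantiate the record type reflectively rather than building a generic record from the schema.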
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)