[ 
https://issues.apache.org/jira/browse/AVRO-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860724#comment-17860724
 ] 

Oscar Westra van Holthe - Kind commented on AVRO-2787:
------------------------------------------------------

AFAIK, the JVM only throws errors like this when the library version on the 
runtime classpath differs from the version the code was compiled against.
Because the original reporter is no longer able to confirm or refute this 
(having moved on to a new field of work), I'm closing this issue.
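
For anyone hitting the same error, here is a minimal sketch of how the mismatch 
could be checked from inside the job's runtime environment. The class 
AvroClasspathCheck and its two helper methods are hypothetical names made up 
for this illustration; only the Avro class names and the constructor signature 
are taken from the stack trace in the report. It uses plain JDK reflection, so 
it compiles without Avro on the classpath.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Avro or Hadoop.
class AvroClasspathCheck {

    /** Where the class was loaded from (jar or directory), or a note if the JVM does not report it. */
    public static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "unknown (no code source, e.g. a JDK class)";
        }
        return source.getLocation().toString();
    }

    /** True if cls declares a constructor with exactly these parameter types. */
    public static boolean hasConstructor(Class<?> cls, Class<?>... parameterTypes) {
        try {
            cls.getDeclaredConstructor(parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = AvroClasspathCheck.class.getClassLoader();
        try {
            // Load without initializing, so missing transitive dependencies do not get in the way.
            Class<?> schema = Class.forName("org.apache.avro.Schema", false, loader);
            Class<?> field = Class.forName("org.apache.avro.Schema$Field", false, loader);
            System.out.println("Schema.Field loaded from: " + locationOf(field));
            // Signature from the NoSuchMethodError: (String, Schema, String, Object)
            System.out.println("Field(String, Schema, String, Object) present: "
                    + hasConstructor(field, String.class, schema, String.class, Object.class));
        } catch (ClassNotFoundException e) {
            System.out.println("Avro is not on the classpath: " + e.getMessage());
        }
    }
}
```

Run with the same classpath the job uses (for example the one printed by 
`hadoop classpath` inside the container), this should show which jar actually 
supplies Schema.Field at runtime. If that is not the Avro jar the code was 
compiled against, the version mismatch described above is the likely cause.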

> Hadoop Mapreduce job fails when creating Writer
> -----------------------------------------------
>
>                 Key: AVRO-2787
>                 URL: https://issues.apache.org/jira/browse/AVRO-2787
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.9.2
>         Environment: Development
>  * OS: Fedora 31
>  * Java version 8
>  * Gradle version 6.2.2
>  * Avro version 1.9.2
>  * Shadow version 5.2.0
>  * Gradle-avro-plugin version 0.19.1
> Running in a Podman container
>  * OS: Ubuntu 18.04
>  * Podman 1.8.2
>  * Hadoop version 3.2.1
>  * Java version 8
>            Reporter: Anton Oellerer
>            Priority: Blocker
>         Attachments: CategoryData.avsc, CategoryTokensReducer.java, 
> TextprocessingfundamentalsApplication.java
>
>
> Hey,
> I am trying to create a Hadoop pipeline that computes the chi-squared value 
> for tokens in reviews saved in JSON.
> For this, I created multiple Hadoop jobs, and the communication between them 
> happens, partly, with Avro Data containers.
> When trying to run this pipeline, I get the following error at the end of the 
> first reduce Job (Signature
> {code:java}
> public class CategoryTokensReducer extends Reducer<Text, StringArrayWritable, 
> AvroKey<CharSequence>, AvroValue<CategoryData>>{code}
> )
> Error:
> {code:java}
> java.lang.Exception: java.lang.NoSuchMethodError: org.apache.avro.Schema$Field.<init>(Ljava/lang/String;Lorg/apache/avro/Schema;Ljava/lang/String;Ljava/lang/Object;)V
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:559)
> Caused by: java.lang.NoSuchMethodError: org.apache.avro.Schema$Field.<init>(Ljava/lang/String;Lorg/apache/avro/Schema;Ljava/lang/String;Ljava/lang/Object;)V
>         at org.apache.avro.hadoop.io.AvroKeyValue.getSchema(AvroKeyValue.java:111)
>         at org.apache.avro.mapreduce.AvroKeyValueRecordWriter.<init>(AvroKeyValueRecordWriter.java:84)
>         at org.apache.avro.mapreduce.AvroKeyValueOutputFormat.getRecordWriter(AvroKeyValueOutputFormat.java:70)
>         at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:542)
>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:615)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:347)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> The Job is set up like this:
> {code:java}
> Job jsonToCategoryTokensJob = Job.getInstance(conf, "json to category data");
> AvroJob.setOutputKeySchema(jsonToCategoryTokensJob, Schema.create(Schema.Type.STRING));
> AvroJob.setOutputValueSchema(jsonToCategoryTokensJob, CategoryData.getClassSchema());
> jsonToCategoryTokensJob.setJarByClass(TextprocessingfundamentalsApplication.class);
> jsonToCategoryTokensJob.setMapperClass(JsonToCategoryTokensMapper.class);
> jsonToCategoryTokensJob.setMapOutputKeyClass(Text.class);
> jsonToCategoryTokensJob.setMapOutputValueClass(StringArrayWritable.class);
> jsonToCategoryTokensJob.setReducerClass(CategoryTokensReducer.class);
> jsonToCategoryTokensJob.setOutputFormatClass(AvroKeyValueOutputFormat.class);
> String in = otherArgs.get(0);
> String out = otherArgs.get(1);
> FileInputFormat.addInputPath(jsonToCategoryTokensJob, new Path(in));
> FileOutputFormat.setOutputPath(jsonToCategoryTokensJob, new Path(out, "outCategoryData"));
> {code}
> The pipeline is run by first building a shadowJar from the source in the 
> development environment and then running it in a podman container.
> With Avro 1.8.2 and gradle plugin 0.16.0 the reduce job works. 
> Does someone know what the problem here might be?
> Best regards
> Anton



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
