From Spark's root pom.xml: <avro.version>1.7.7</avro.version>
FYI

On Wed, Dec 16, 2015 at 3:06 PM, Igor Berman <igor.ber...@gmail.com> wrote:

> check version compatibility
> I think avro lib should be 1.7.4
> check that no other lib brings transitive dependency of other avro version
>
>
> On 16 December 2015 at 09:44, Jinyuan Zhou <zhou.jiny...@gmail.com> wrote:
>
>> Hi, I tried to load avro files in hdfs but keep getting NPE.
>> I am using AvroKeyValueInputFormat inside newAPIHadoopFile method.
>> Anyone have any clue? Here is stack trace
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>> due to stage failure: Task 4 in stage 0.0 failed 4 times, most recent
>> failure: Lost task 4.3 in stage 0.0 (TID 11, xyz.abc.com):
>> java.lang.NullPointerException
>>
>> at org.apache.avro.Schema.getAliases(Schema.java:1415)
>> at org.apache.avro.Schema.getAliases(Schema.java:1429)
>> at org.apache.avro.Schema.applyAliases(Schema.java:1340)
>> at org.apache.avro.generic.GenericDatumReader.getResolver(GenericDatumReader.java:125)
>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
>> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233)
>> at org.apache.avro.mapreduce.AvroRecordReaderBase.nextKeyValue(AvroRecordReaderBase.java:118)
>> at org.apache.avro.mapreduce.AvroKeyValueRecordReader.nextKeyValue(AvroKeyValueRecordReader.java:62)
>> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:143)
>> at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>> at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1626)
>> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1099)
>> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1099)
>> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
>> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>> at org.apache.spark.scheduler.Task.run(Task.scala:70)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:744)
>>
>>
>> Thanks,
>>
>> Jack
>> Jinyuan (Jack) Zhou
>>
>
>
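
One way to act on the "check that no other lib brings transitive dependency of other avro version" advice is to ask the JVM which jar a class was actually loaded from. The following is only a minimal sketch, not something taken from this thread: the class name ClasspathProbe and its describe method are hypothetical helpers, and the reported version is "unknown" whenever the jar's manifest does not carry an Implementation-Version entry. It uses only standard JDK calls (Class.forName, getProtectionDomain, getCodeSource).

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Spark or Avro): reports where a class was loaded from.
class ClasspathProbe {

    // Returns "<class>: <jar or directory URL> (version <x>)",
    // or "<class>: not on classpath" if the class cannot be loaded.
    static String describe(String className) {
        try {
            // Load without running static initializers; we only need the Class object.
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes typically have no code source, so handle null explicitly.
            String where = (src == null || src.getLocation() == null)
                    ? "bootstrap or JDK runtime"
                    : src.getLocation().toString();
            Package p = c.getPackage();
            // Implementation-Version comes from the jar manifest and may be absent.
            String version = (p == null) ? null : p.getImplementationVersion();
            return className + ": " + where
                    + " (version " + (version == null ? "unknown" : version) + ")";
        } catch (ClassNotFoundException | LinkageError e) {
            return className + ": not on classpath";
        }
    }

    public static void main(String[] args) {
        // The two classes below appear in the stack trace quoted above.
        System.out.println(describe("org.apache.avro.Schema"));
        System.out.println(describe("org.apache.avro.mapreduce.AvroKeyValueRecordReader"));
    }
}
```

Run with the same classpath as the job. If the two classes resolve to jars of different Avro versions, that would point to a conflict of the kind described above. The failure in the trace happens on an executor, so the driver's classpath may not tell the whole story; for Maven builds, `mvn dependency:tree -Dincludes=org.apache.avro` is another way to see which artifacts pull Avro in.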