Using Avro 1.7.3 with hadoop-0.20.2 under MapR M3.
I have successfully used Avro in non-MapReduce jobs to read and write
Avro files on HDFS. Now I'm running code based on the
AvroGenericMaxTemperature example in Chapter 4 of Tom White's "Hadoop:
The Definitive Guide" to convert many large .tsv files to Avro format.
My code uses the DistributedCache.addFileToClassPath() method to pass
avro-1.7.3.jar and avro-mapred-1.7.3-hadoop1.jar to the TaskTrackers.
When the job runs, all the map tasks fail with:
ClassNotFoundException: com.thoughtworks.paranamer.Paranamer
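To narrow down which classes the task JVM can actually see, and which jar each one comes from, a small diagnostic along these lines might help. This is only a sketch: the ClasspathCheck class and its locate() helper are names made up for illustration, and it is only meaningful if it is called from inside the failing task (for example from the mapper's configure() method), not from the client that submits the job.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    // Hypothetical helper: reports where a class was loaded from,
    // or "NOT FOUND" if this JVM's classpath cannot supply it.
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK core classes have no code source
            return src == null ? "bootstrap/JDK" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        // Classes this job appears to need; the list is an assumption based on the error above.
        String[] names = {
            "com.thoughtworks.paranamer.Paranamer",
            "org.apache.avro.Schema",
            "org.apache.avro.mapred.AvroJob"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If two different jars show up for Avro classes, or Paranamer is reported as NOT FOUND, that points at a missing or conflicting jar on the task classpath rather than at the job code itself.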
I did a lot of digging around and wound up also using the DistributedCache
to pass avro-tools-1.6.1.jar in an attempt to resolve this. That fails
like this:
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:401)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:336)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
Any suggestions as to what I am doing wrong here?
Thanks