I'm trying to build a simple MapReduce job that reads Avro files and outputs
plain text.

I pulled data from a MySQL database with Sqoop and wrote the files out as
Snappy-compressed Avro files. I've operated on the files using AvroStorage in
Pig, but I think the current task would be better suited to a plain MR job.
I'm using Avro 1.7.4, but I think the version used to generate the files is
1.5.3 (whatever ships with HDP 1.2). I've tried depending on Avro 1.5.3 in my
project, but I get the same error (at a different line).

When I try to execute my job the following exception is printed:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.avro.Schema.access$1400()Ljava/lang/ThreadLocal;
        at org.apache.avro.Schema$Parser.parse(Schema.java:924)
        at org.apache.avro.Schema$Parser.parse(Schema.java:917)
        at my.classpath.jobs.CandidatesFor.run(CandidatesFor.java:44)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at my.classpath.jobs.CandidatesFor.main(CandidatesFor.java:72)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

Below is the setup portion of my MapReduce job. The failure occurs when I call
parse on the Schema.Parser object.

public class CandidatesFor extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), getClass());
        conf.setJobName("CandidatesFor");

        InputStream is =
                getClass().getClassLoader().getResourceAsStream("avro/data.avsc");
        assert null != is;
        String schemaString = IOUtils.toString(is);
        System.out.println(schemaString);
        Schema.Parser parser = new Schema.Parser();
        Schema schema = parser.parse(schemaString); // exception is thrown here
        …

Looking in the code for Schema, the line in question is:

        boolean saved = validateNames.get();

I wrote a unit test to help me understand how things hang together. The test 
passes:

    @Test
    public void canReadSchema() throws Exception {
        ClassLoader loader = getClass().getClassLoader();
        InputStream schema_is = loader.getResourceAsStream("avro/data.avsc");
        File data = new File(loader.getResource("avro/part-m-00000.avro").toURI());
        assertNotNull(data);
        Schema schema = new Schema.Parser().parse(schema_is);
        DatumReader<GenericRecord> datumReader =
                new GenericDatumReader<GenericRecord>(schema);
        DataFileReader<GenericRecord> dataFileReader =
                new DataFileReader<GenericRecord>(data, datumReader);
        GenericRecord case_ = null;
        while (dataFileReader.hasNext()) {
            case_ = dataFileReader.next(case_);
            assertNotNull(case_.get("subject"));
            assertNotNull(case_.get("description"));
        }
    }

I'm sure I'm missing something obvious but I don't know enough to recognize it. 
Any help would be greatly appreciated. 
Thanks
