Avro-MapRed: Provide a fallback using avro beans instead of schema in job 
configuration
---------------------------------------------------------------------------------------

                 Key: AVRO-923
                 URL: https://issues.apache.org/jira/browse/AVRO-923
             Project: Avro
          Issue Type: Improvement
          Components: java
    Affects Versions: 1.5.4
         Environment: any
            Reporter: Julien Muller
             Fix For: 1.6.0


The current implementation of Avro MapRed is designed to be configured through 
JobConf. While it is possible to use a job.xml file instead, this is painful 
because you have to copy and paste all the schemas for input and output. This 
is error-prone and time-consuming. In addition, any update to a bean requires 
copying and pasting the schema again, whereas with JobConf a simple recompile 
would be enough.

A proposal to improve this while staying backward compatible would be to 
introduce new keys in AvroJob that reference the actual Avro bean class used. 
This can be implemented as a fallback for when no schema key is set.

New keys would be created (existing schema key > new class key):
- avro.input.schema > avro.input.class
- avro.map.output.schema > avro.map.output.class
- avro.output.schema > avro.output.class


Only 3 methods would be impacted in AvroJob:
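- getInputSchema(Configuration job) {
        // Use the schema from the configuration when it is present.
        String s = job.get(INPUT_SCHEMA);
        if (s != null) return Schema.parse(s);
        // Otherwise fall back to the static SCHEMA$ field of the bean class.
        // On generated classes SCHEMA$ is already a Schema, so it is returned
        // as-is rather than cast to String and parsed. The checked reflection
        // exceptions (ClassNotFoundException, NoSuchFieldException,
        // IllegalAccessException) still need to be handled by the method.
        return (Schema) Class.forName(job.get(INPUT_CLASS))
            .getDeclaredField("SCHEMA$").get(null);
  }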
- getMapOutputSchema()
- getOutputSchema()
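
The fallback described above can be sketched without any Avro or Hadoop 
dependency. This is only an illustration under stated assumptions: a plain Map 
stands in for the Hadoop Configuration, the schema is handled as a JSON string 
instead of an org.apache.avro.Schema, and UserBean, SchemaFallback and 
resolveInputSchema are hypothetical names that do not exist in Avro.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an Avro-generated bean. Real generated classes
// expose SCHEMA$ as an org.apache.avro.Schema; a String is used here only to
// keep the sketch dependency-free.
class UserBean {
    public static final String SCHEMA$ =
        "{\"type\":\"record\",\"name\":\"UserBean\",\"fields\":[]}";
}

class SchemaFallback {
    static final String INPUT_SCHEMA = "avro.input.schema"; // existing key
    static final String INPUT_CLASS = "avro.input.class";   // proposed key

    // Returns the schema JSON from the configuration, or, when that key is
    // absent, the value of the static SCHEMA$ field of the configured class.
    static String resolveInputSchema(Map<String, String> conf) {
        String schema = conf.get(INPUT_SCHEMA);
        if (schema != null) {
            return schema; // the explicit schema always wins
        }
        String className = conf.get(INPUT_CLASS);
        if (className == null) {
            throw new IllegalStateException(
                "Neither " + INPUT_SCHEMA + " nor " + INPUT_CLASS + " is set");
        }
        try {
            Field field = Class.forName(className).getDeclaredField("SCHEMA$");
            return String.valueOf(field.get(null)); // static field, no instance
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Cannot read SCHEMA$ from " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(INPUT_CLASS, UserBean.class.getName());
        System.out.println(resolveInputSchema(conf));
    }
}
```

Checking the schema key first keeps existing jobs working unchanged, which is 
what makes the class key a pure fallback.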

Also, for consistency, matching setters could be added to AvroJob. This is not 
mandatory, since in this use case the new keys are set directly in the job 
configuration rather than through AvroJob.
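
As a rough sketch of what such setters might look like, assuming they simply 
store the fully qualified class name under the proposed keys: a plain Map again 
stands in for the Hadoop Configuration, and ClassKeySetters, EventBean and the 
method names are hypothetical, not part of AvroJob.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bean class used only to have something to register.
class EventBean {
}

class ClassKeySetters {
    // Key names proposed in this issue.
    static final String INPUT_CLASS = "avro.input.class";
    static final String MAP_OUTPUT_CLASS = "avro.map.output.class";
    static final String OUTPUT_CLASS = "avro.output.class";

    // Each setter records the bean's class name so that a getter can later
    // load the class and read its schema.
    static void setInputClass(Map<String, String> conf, Class<?> bean) {
        conf.put(INPUT_CLASS, bean.getName());
    }

    static void setMapOutputClass(Map<String, String> conf, Class<?> bean) {
        conf.put(MAP_OUTPUT_CLASS, bean.getName());
    }

    static void setOutputClass(Map<String, String> conf, Class<?> bean) {
        conf.put(OUTPUT_CLASS, bean.getName());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setInputClass(conf, EventBean.class);
        System.out.println(conf.get(INPUT_CLASS));
    }
}
```

Taking a Class object rather than a String lets the compiler catch a misspelled 
bean name, which a hand-edited job.xml cannot do.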


        
