Avro-MapRed: Provide a fallback using avro beans instead of schema in job
configuration
---------------------------------------------------------------------------------------
Key: AVRO-923
URL: https://issues.apache.org/jira/browse/AVRO-923
Project: Avro
Issue Type: Improvement
Components: java
Affects Versions: 1.5.4
Environment: any
Reporter: Julien Muller
Fix For: 1.6.0
The current implementation of Avro MapRed is designed to be configured through
JobConf. While it is possible to use a job.xml file instead, it is painful
because you have to copy and paste all of the input and output schemas into it.
This is error-prone and time-consuming. In addition, any change to a bean means
copying and pasting the schema again (with JobConf, a simple recompile would be
enough).
A proposal to improve this while staying backward compatible is to introduce
new keys in AvroJob that reference the actual Avro bean class used. This can be
implemented as a fallback.
New keys would be created, one alongside each existing schema key
(existing key > new key):
- avro.input.schema > avro.input.class
- avro.map.output.schema > avro.map.output.class
- avro.output.schema > avro.output.class
Only 3 methods would be impacted in AvroJob:
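- getInputSchema(Configuration job) {
    // Use the configured schema if present; otherwise fall back to the bean
    // class. SCHEMA$ holds a Schema object (not a String), so return it as is.
    // The reflective calls throw checked exceptions that must be handled.
    String s = job.get(INPUT_SCHEMA);
    if (s != null) return Schema.parse(s);
    return (Schema) Class.forName(job.get(INPUT_CLASS)).getDeclaredField("SCHEMA$").get(null);
  }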
- getMapOutputSchema()
- getOutputSchema()
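The same fallback would apply to all three getters, so it can be sketched once
as a shared helper. The example below is only an illustration under stated
assumptions: a plain Map stands in for the Hadoop Configuration, the nested
User class is a hypothetical stand-in for an Avro-generated bean, and
resolveSchemaJson is an invented name, not part of AvroJob.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaFallbackSketch {
    // Key names taken from the proposal above.
    public static final String INPUT_SCHEMA = "avro.input.schema";
    public static final String INPUT_CLASS = "avro.input.class";

    /** Hypothetical stand-in for a generated bean exposing a static SCHEMA$ field. */
    public static class User {
        // In a real generated class this field is an org.apache.avro.Schema;
        // here an Object whose toString() yields the schema JSON plays that role.
        public static final Object SCHEMA$ = new Object() {
            @Override
            public String toString() {
                return "{\"type\":\"record\",\"name\":\"User\",\"fields\":[]}";
            }
        };
    }

    /** Returns the schema JSON from schemaKey if set, else from the class named by classKey. */
    public static String resolveSchemaJson(Map<String, String> job,
                                           String schemaKey, String classKey) {
        String json = job.get(schemaKey);
        if (json != null) {
            return json; // existing behaviour: explicit schema wins
        }
        String className = job.get(classKey);
        if (className == null) {
            throw new IllegalStateException(
                "Neither " + schemaKey + " nor " + classKey + " is set");
        }
        try {
            // Read the static SCHEMA$ field reflectively (null target = static field).
            Object schema = Class.forName(className).getDeclaredField("SCHEMA$").get(null);
            return String.valueOf(schema);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read SCHEMA$ from " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> job = new HashMap<>();
        job.put(INPUT_CLASS, User.class.getName());
        System.out.println(resolveSchemaJson(job, INPUT_SCHEMA, INPUT_CLASS));
    }
}
```

Checking the schema key first keeps existing jobs working unchanged; the class
key is consulted only when no schema has been set.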
Also, it would be more consistent to add new setters. This is not mandatory,
since in this use case the new keys are set directly in the job configuration
rather than through AvroJob.
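If such setters were added, they could be as simple as storing the class name
under the new keys. This is a rough sketch only: the method names
(setInputClass and so on) are hypothetical, and a Map again stands in for the
job configuration.

```java
import java.util.HashMap;
import java.util.Map;

public class AvroJobClassSetters {
    // New keys from the proposal above.
    public static final String INPUT_CLASS = "avro.input.class";
    public static final String MAP_OUTPUT_CLASS = "avro.map.output.class";
    public static final String OUTPUT_CLASS = "avro.output.class";

    // Hypothetical setters: each stores the fully qualified bean class name.
    public static void setInputClass(Map<String, String> job, Class<?> beanClass) {
        job.put(INPUT_CLASS, beanClass.getName());
    }

    public static void setMapOutputClass(Map<String, String> job, Class<?> beanClass) {
        job.put(MAP_OUTPUT_CLASS, beanClass.getName());
    }

    public static void setOutputClass(Map<String, String> job, Class<?> beanClass) {
        job.put(OUTPUT_CLASS, beanClass.getName());
    }

    public static void main(String[] args) {
        Map<String, String> job = new HashMap<>();
        setInputClass(job, String.class); // any class works for the demo
        System.out.println(job.get(INPUT_CLASS));
    }
}
```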