[ https://issues.apache.org/jira/browse/MAPREDUCE-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798970#action_12798970 ]
Aaron Kimball commented on MAPREDUCE-815:
-----------------------------------------

Now that MAPREDUCE-1126 is in, I'm going to attack this and complete the loop.

Given that TextInputFormat yields a semi-arbitrary key and encapsulates the file contents in the value, I plan to follow suit here -- the value produced by the AvroRecordReader will contain the next object in the file.

As for output: I think that it's best to leave the output format accepting a single value only (rather than explicitly making a hybrid of key and value pair). Users can implement their own UnionAvroOutputFormat (or whatever) if they need both, but I think the basic version should only do the most straightforward thing. I plan to make this write the user's key to the file, and drop the value. That way InverseMapper -> IdentityReducer should emit it all in sorted order.

> Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-815
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Ravi Gummadi
>            Assignee: Aaron Kimball
>
> MapReduce needs AvroInputFormat similar to other InputFormats like TextInputFormat to be able to use avro serialization in hadoop. Similarly AvroOutputFormat is needed.
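
The output plan in the comment above (write the key, drop the value, and rely on InverseMapper -> IdentityReducer for sorted output) can be illustrated with a small, dependency-free model. This is only a sketch under stated assumptions: it uses no Hadoop or Avro classes, and the names KeyOnlyOutputSketch, KeyOnlyWriter, and run are hypothetical stand-ins, not the API of the eventual patch. A TreeMap stands in for the shuffle's sort by key.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the plan; none of these names come from Hadoop or Avro.
class KeyOnlyOutputSketch {

    // Stand-in for a record writer that keeps the key and ignores the value.
    static final class KeyOnlyWriter<K, V> {
        private final List<K> written = new ArrayList<>();

        void write(K key, V value) {
            written.add(key); // the value is intentionally dropped
        }

        List<K> contents() {
            return written;
        }
    }

    // Input records carry the object in the value (as with TextInputFormat).
    // Swapping key and value, then sorting by the new key, models
    // InverseMapper followed by the shuffle and an identity reduce.
    static List<String> run(List<Map.Entry<Long, String>> input) {
        TreeMap<String, List<Long>> shuffled = new TreeMap<>();
        for (Map.Entry<Long, String> rec : input) {
            // Swap: the former value becomes the key the shuffle sorts on.
            shuffled.computeIfAbsent(rec.getValue(), k -> new ArrayList<>())
                    .add(rec.getKey());
        }

        KeyOnlyWriter<String, Long> writer = new KeyOnlyWriter<>();
        for (Map.Entry<String, List<Long>> group : shuffled.entrySet()) {
            for (Long formerKey : group.getValue()) {
                // Identity reduce: pass each pair through unchanged.
                writer.write(group.getKey(), formerKey);
            }
        }
        return writer.contents();
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, String>> input = new ArrayList<>();
        input.add(new AbstractMap.SimpleEntry<Long, String>(0L, "pear"));
        input.add(new AbstractMap.SimpleEntry<Long, String>(5L, "apple"));
        input.add(new AbstractMap.SimpleEntry<Long, String>(11L, "mango"));
        System.out.println(run(input)); // prints [apple, mango, pear]
    }
}
```

In this model the semi-arbitrary input keys (0, 5, 11) never reach the output; only the objects do, in sorted order, which is the behaviour the comment describes for the basic key-only output format.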