[jira] Commented: (MAPREDUCE-815) Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization

Aaron Kimball (JIRA) Mon, 11 Jan 2010 17:03:19 -0800

    [ https://issues.apache.org/jira/browse/MAPREDUCE-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798970#action_12798970 ]


Aaron Kimball commented on MAPREDUCE-815:
-----------------------------------------

Now that MAPREDUCE-1126 is in, I'm going to attack this and complete the loop.

Given that TextInputFormat yields a semi-arbitrary key (the byte offset of 
each line) and puts the file contents in the value, I plan to follow suit 
here -- the value produced by the AvroRecordReader will contain the next 
object in the file. 
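
To illustrate the input side, here is a minimal sketch in plain Java. It 
does not use the Hadoop or Avro APIs: RecordReaderSketch is a hypothetical 
stand-in for the planned AvroRecordReader, and using the record's index as 
the key is an assumption made for the example, not a decision from this 
issue.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the planned AvroRecordReader (no Hadoop/Avro classes).
final class RecordReaderSketch<T> {
    private final Iterator<T> records; // stands in for the objects stored in one file
    private long position = 0;         // assumed placeholder key: index of the record
    private Long currentKey;
    private T currentValue;

    RecordReaderSketch(List<T> fileContents) {
        this.records = fileContents.iterator();
    }

    // Advance to the next object; returns false once the input is exhausted.
    boolean nextKeyValue() {
        if (!records.hasNext()) {
            return false;
        }
        currentKey = position++;
        currentValue = records.next(); // the value carries the next object in the file
        return true;
    }

    Long getCurrentKey() { return currentKey; }

    T getCurrentValue() { return currentValue; }

    // Convenience helper: drain the reader into a list of (key, value) pairs.
    static <T> List<Map.Entry<Long, T>> readAll(List<T> fileContents) {
        RecordReaderSketch<T> reader = new RecordReaderSketch<>(fileContents);
        List<Map.Entry<Long, T>> pairs = new ArrayList<>();
        while (reader.nextKeyValue()) {
            pairs.add(new SimpleEntry<>(reader.getCurrentKey(), reader.getCurrentValue()));
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (Map.Entry<Long, String> pair : readAll(List.of("alice", "bob"))) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

A mapper consuming this reader would ignore the key and work on the value 
alone, as is typical with TextInputFormat.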

As for output: I think it's best to have the output format accept a single 
value only (rather than explicitly writing a hybrid of the key and value). 
Users can implement their own UnionAvroOutputFormat (or whatever) if they 
need both, but I think the basic version should do only the most 
straightforward thing. I plan to make this write the user's key to the file 
and drop the value. That way InverseMapper -> IdentityReducer should emit 
all the records in sorted order.
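
The InverseMapper -> IdentityReducer flow described above can be modeled 
roughly as follows. This is a plain-Java simulation rather than Hadoop code: 
KeyOnlyOutputSketch and its run method are hypothetical names, a TreeMap 
stands in for the framework's sort by key, and String records stand in for 
Avro objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of: input format -> InverseMapper -> sort -> IdentityReducer -> key-only output.
final class KeyOnlyOutputSketch {

    static List<String> run(List<String> records) {
        // The TreeMap models the shuffle, which groups and sorts map output by key.
        TreeMap<String, List<Long>> shuffled = new TreeMap<>();
        long position = 0;
        for (String record : records) {
            long placeholderKey = position++; // input side: record is the value, key is a placeholder
            // InverseMapper step: (placeholderKey, record) becomes (record, placeholderKey).
            shuffled.computeIfAbsent(record, k -> new ArrayList<>()).add(placeholderKey);
        }

        List<String> written = new ArrayList<>();
        for (Map.Entry<String, List<Long>> group : shuffled.entrySet()) {
            // IdentityReducer step: every (key, value) pair passes through unchanged.
            for (Long droppedValue : group.getValue()) {
                // Output step: write the key to the file and drop the value.
                written.add(group.getKey());
            }
        }
        return written;
    }

    public static void main(String[] args) {
        // prints [apple, apple, fig, pear]
        System.out.println(run(List.of("pear", "apple", "fig", "apple")));
    }
}
```

Because the record travels as the key after the swap, the sort applies to 
the records themselves, and writing only the key loses nothing but the 
placeholder.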



> Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro 
> Serialization
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-815
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Ravi Gummadi
>            Assignee: Aaron Kimball
>
> MapReduce needs an AvroInputFormat, similar to other InputFormats such as 
> TextInputFormat, to be able to use Avro serialization in Hadoop. Similarly, 
> an AvroOutputFormat is needed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
