[jira] Commented: (MAPREDUCE-815) Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization

Aaron Kimball (JIRA) Thu, 14 Jan 2010 11:12:20 -0800

    [ https://issues.apache.org/jira/browse/MAPREDUCE-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800309#action_12800309 ]


Aaron Kimball commented on MAPREDUCE-815:
-----------------------------------------

The only reason I can think of to use the position would be to build some 
sort of index over an Avro file, which probably doesn't make much sense 
here. That said, we can't use null or we'll break the identity 
mapper. (The MapOutputBuffer expects non-null keys and values only; a 
{{context.write(k, null)}} from the mapper will throw a NullPointerException.) 

I think this is why Writables included NullWritable. We could add a type, e.g. 
"Empty", which implements AvroReflectSerializable and whose toString method 
returns the empty string. I think this would work fairly transparently and be 
entirely Avro-based.
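
A rough sketch of what such an "Empty" type might look like, to make the idea 
concrete. This is not code from the attached patch. To keep it runnable without 
Hadoop on the classpath, the AvroReflectSerializable interface below is a local 
stand-in for the Hadoop marker interface of the same name, and 
NonNullCollector is a hypothetical toy that only imitates the "no null keys or 
values" rule described above; it is not the real MapOutputBuffer. The names 
EmptyKeyDemo, get() and records() are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class EmptyKeyDemo {

    /** Local stand-in (assumption) for Hadoop's marker interface of the same name. */
    interface AvroReflectSerializable {}

    /** Hypothetical placeholder type: carries no data and prints as "". */
    static final class Empty implements AvroReflectSerializable {
        private static final Empty INSTANCE = new Empty();

        /** No-arg constructor, since reflection-based serializers usually need one. */
        Empty() {}

        static Empty get() { return INSTANCE; }

        @Override public String toString() { return ""; }

        // All Empty instances are interchangeable.
        @Override public boolean equals(Object o) { return o instanceof Empty; }
        @Override public int hashCode() { return 0; }
    }

    /** Toy collector (assumption) that rejects nulls, imitating the rule above. */
    static final class NonNullCollector<K, V> {
        private final List<String> records = new ArrayList<>();

        void write(K key, V value) {
            Objects.requireNonNull(key, "key");
            Objects.requireNonNull(value, "value");
            records.add(key + "\t" + value);
        }

        List<String> records() { return records; }
    }

    public static void main(String[] args) {
        NonNullCollector<Object, Object> out = new NonNullCollector<>();

        // Empty fills the slot that has nothing to carry, so the write succeeds.
        out.write("record-1", Empty.get());

        // A null in the same slot is rejected.
        try {
            out.write("record-2", null);
        } catch (NullPointerException e) {
            System.out.println("null value rejected");
        }

        System.out.println(out.records().size()); // prints 1
    }
}
```

Because toString returns the empty string, a text-style output would show 
nothing for that slot, which is roughly the behaviour NullWritable gives on the 
Writable side.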


> Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro 
> Serialization
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-815
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Ravi Gummadi
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-815.patch
>
>
> MapReduce needs AvroInputFormat similar to other InputFormats like 
> TextInputFormat to be able to use avro serialization in hadoop. Similarly 
> AvroOutputFormat is needed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
