[jira] Commented: (AVRO-593) Avro mapreduce apis incompatible with hadoop 0.20.2

Garrett Wu (JIRA) Fri, 29 Oct 2010 13:28:41 -0700

    [ 
https://issues.apache.org/jira/browse/AVRO-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926441#action_12926441
 ]


Garrett Wu commented on AVRO-593:
---------------------------------

I'm also interested in using the newer MapReduce API with Avro, so I'm writing 
an AvroWritable and some input and output format classes that know how to 
handle the schemas.  I should have a patch next week, but the idea is:

- Introduce new classes AvroKey and AvroValue that implement Writable.
- Users can call AvroJob.setInputKeySchema(), AvroJob.setInputValueSchema(), 
AvroJob.setMapOutputKeySchema(), AvroJob.setMapOutputValueSchema(), 
AvroJob.setReduceOutputKeySchema(), AvroJob.setReduceOutputValueSchema() as 
needed.
- Provide AvroContainerFileInputFormat, AvroContainerFileOutputFormat, 
AvroSequenceFileInputFormat, and AvroSequenceFileOutputFormat, which read and 
write the schemas for the data appropriately.  The schema for a sequence file 
can be stored in the file header's metadata.
- Users can write Mappers and Reducers as they normally would.  Note that this 
differs slightly from the org.apache.avro.mapred.* way of doing things: I 
don't plan to supply special AvroMapper and AvroReducer base classes or a new 
Serialization, since the AvroKey/AvroValue classes are Writable just like any 
other Hadoop key/value type.
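
To make the proposal above concrete, here is a minimal sketch of the two core 
ideas: a wrapper type that implements Writable, and an AvroJob helper that 
keeps the schema in the job configuration instead of in every record.  This is 
not the pending patch, and nearly everything in it is an assumption: the local 
Writable interface only mirrors the two methods of 
org.apache.hadoop.io.Writable so the sketch compiles without Hadoop, a plain 
Map stands in for the Hadoop Configuration, a String stands in for an 
Avro-encoded datum, and the config key name, getInputKeySchema(), serialize(), 
and deserializeKey() are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class AvroWritableSketch {

    // Local stand-in with the same two methods as org.apache.hadoop.io.Writable,
    // declared here only so the sketch has no Hadoop dependency.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical AvroKey: wraps a datum and serializes the datum only.
    // A String stands in for an Avro-encoded record (assumption).
    static class AvroKey implements Writable {
        private String datum;

        AvroKey() { }                         // no-arg constructor for deserialization
        AvroKey(String datum) { this.datum = datum; }

        String datum() { return datum; }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(datum);              // the schema is NOT written per record
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            datum = in.readUTF();
        }
    }

    // Hypothetical AvroJob helper: the schema JSON lives in the job configuration.
    static class AvroJob {
        static final String INPUT_KEY_SCHEMA = "avro.schema.input.key"; // assumed key name

        static void setInputKeySchema(Map<String, String> conf, String schemaJson) {
            conf.put(INPUT_KEY_SCHEMA, schemaJson);
        }

        static String getInputKeySchema(Map<String, String> conf) {
            return conf.get(INPUT_KEY_SCHEMA);
        }
    }

    // Serialize any Writable to bytes, as a framework would between map and reduce.
    static byte[] serialize(Writable w) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            w.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild an AvroKey from bytes via the no-arg constructor plus readFields().
    static AvroKey deserializeKey(byte[] bytes) {
        try {
            AvroKey key = new AvroKey();
            key.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return key;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        AvroJob.setInputKeySchema(conf, "\"string\"");

        AvroKey copy = deserializeKey(serialize(new AvroKey("user-42")));
        System.out.println(AvroJob.getInputKeySchema(conf) + " -> " + copy.datum());
    }
}
```

Because the wrapper is an ordinary Writable in this sketch, a Mapper or Reducer 
would treat it like any other key type, which is why no special base classes 
are needed under this design.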

> Avro mapreduce apis incompatible with hadoop 0.20.2
> ---------------------------------------------------
>
>                 Key: AVRO-593
>                 URL: https://issues.apache.org/jira/browse/AVRO-593
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.3.2, 1.3.3
>         Environment: Avro 1.3.3, Hadoop 0.20.2
>            Reporter: Steve Severance
>
> The Avro APIs for Hadoop use the Hadoop mapreduce API that has been 
> deprecated. A new Avro mapreduce API should be implemented for Hadoop 0.20 
> and higher.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
