[ https://issues.apache.org/jira/browse/AVRO-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-662:
------------------------------

    Attachment: AVRO-662.patch

Here's a patch that adds this feature.  It adds a SequenceFileInputFormat that 
presents sequence file data in a form compatible with Avro's MapReduce API.  
Primitive Writable types (LongWritable, Text, etc.) are converted to the 
corresponding Avro types (Long, CharSequence, etc.), while reflection is used 
to infer a schema for complex Writables.  The Writable implementation must, of 
course, be available on the classpath at runtime.
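
To make the conversion rule concrete, here is a minimal standalone sketch.  It 
is not the code in AVRO-662.patch.  The class and method names 
(WritableTypeMapping, avroTypeFor) and the "reflect" marker are hypothetical, 
and only LongWritable and Text are named in this message; the remaining 
entries are my assumption about what "etc." plausibly covers.  Class names are 
kept as strings so the sketch compiles without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Illustrative only: maps Hadoop Writable class names to Avro primitive
 * schema type names. Hypothetical helper, not part of Avro or the patch.
 */
final class WritableTypeMapping {
    private static final Map<String, String> PRIMITIVES = new LinkedHashMap<>();

    static {
        // The two conversions named in the message (Long, CharSequence).
        PRIMITIVES.put("org.apache.hadoop.io.LongWritable", "long");
        PRIMITIVES.put("org.apache.hadoop.io.Text", "string");
        // Assumed entries: a plausible reading of "etc.", not confirmed here.
        PRIMITIVES.put("org.apache.hadoop.io.IntWritable", "int");
        PRIMITIVES.put("org.apache.hadoop.io.FloatWritable", "float");
        PRIMITIVES.put("org.apache.hadoop.io.DoubleWritable", "double");
        PRIMITIVES.put("org.apache.hadoop.io.BooleanWritable", "boolean");
        PRIMITIVES.put("org.apache.hadoop.io.BytesWritable", "bytes");
        PRIMITIVES.put("org.apache.hadoop.io.NullWritable", "null");
    }

    /**
     * Returns the Avro primitive type name for a known Writable, or the
     * marker "reflect" when a schema would have to be inferred by reflection.
     */
    static String avroTypeFor(String writableClassName) {
        return PRIMITIVES.getOrDefault(writableClassName, "reflect");
    }

    public static void main(String[] args) {
        System.out.println(avroTypeFor("org.apache.hadoop.io.LongWritable"));
        // com.example.MyWritable is a made-up complex Writable.
        System.out.println(avroTypeFor("com.example.MyWritable"));
    }
}
```

A real implementation would presumably key on Class objects and return Avro 
Schema instances rather than strings; strings are used here only to keep the 
sketch dependency-free.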

I also abstracted a FileReader interface and added a SequenceFileReader 
implementation.  This permits easier integration of SequenceFile and other 
formats into Avro tools.  For example, it would now be a simple matter to 
extend Avro's 'tojson' command to also dump SequenceFile data as JSON.
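
As a rough illustration of why a reader abstraction helps tools, the sketch 
below writes a dump routine against a format-neutral interface.  Everything in 
it is hypothetical (RecordReaderSketch, InMemoryReader, JsonDump); it does not 
reproduce the FileReader interface from the patch, and the in-memory reader 
merely stands in for a SequenceFile-backed one.

```java
import java.io.Closeable;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical format-neutral reader: iterate over records, then close. */
interface RecordReaderSketch<D> extends Iterator<D>, Closeable {
    /** Name of the underlying container format, e.g. "sequencefile". */
    String format();

    /** Narrowed to throw nothing, which keeps the sketch short. */
    @Override
    void close();
}

/** Stand-in for a real file-backed reader; serves records from a list. */
final class InMemoryReader implements RecordReaderSketch<String> {
    private final String format;
    private final Iterator<String> records;
    private boolean closed;

    InMemoryReader(String format, List<String> records) {
        this.format = format;
        this.records = records.iterator();
    }

    @Override public String format() { return format; }
    @Override public boolean hasNext() { return !closed && records.hasNext(); }
    @Override public String next() { return records.next(); }
    @Override public void close() { closed = true; }
}

/** A tool written once against the interface works for any format. */
final class JsonDump {
    /**
     * Emits one JSON string literal per record, one per line. Only
     * backslashes and double quotes are escaped; control characters are
     * not handled in this sketch.
     */
    static String dump(RecordReaderSketch<String> reader) {
        StringBuilder out = new StringBuilder();
        try (RecordReaderSketch<String> r = reader) {
            while (r.hasNext()) {
                String escaped = r.next().replace("\\", "\\\\").replace("\"", "\\\"");
                out.append('"').append(escaped).append("\"\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("first", "second");
        System.out.print(dump(new InMemoryReader("sequencefile", records)));
    }
}
```

The point is only that JsonDump.dump never mentions a concrete file format, so 
supporting another container means adding a reader, not changing the tool.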

> Java: Add InputFormat for SequenceFiles using Reflect API
> ---------------------------------------------------------
>
>                 Key: AVRO-662
>                 URL: https://issues.apache.org/jira/browse/AVRO-662
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.4.1
>
>         Attachments: AVRO-662.patch
>
>
> It would be useful to be able to read SequenceFile-based data into an 
> Avro-based Java MapReduce program.  Once the reflect, specific, and generic 
> representations are fully compatible (AVRO-638), a RecordReader for 
> SequenceFiles could be added that uses Avro's reflect representation.  
> AvroOutputFormat could also be changed to accept such reflected data.

