[jira] Updated: (AVRO-592) Pig to Avro translation -- Pig DatumReader/Writer

Scott Carey (JIRA) Tue, 06 Jul 2010 14:57:50 -0700

     [ https://issues.apache.org/jira/browse/AVRO-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Scott Carey updated AVRO-592:
-----------------------------

    Attachment: AVRO-592.patch

Incomplete patch, work in progress!

Missing:  I still have to learn Ivy and make it pull in the Pig dependencies.
Until then, you will need the Pig 0.7 jar file on your classpath to build
this.  Be warned, however, that one of the Pig jars bundles all of its
dependencies -- including ALL of Hadoop and Jackson 1.0.1, which will break
the build.

To do:
* Remove org.apache.avro.mapreduce and put that code into PigAvroOutputFormat 
and PigAvroInputFormat.
* Do a pass over code comments, potential name changes, and other
clarifications.
* Test the Hadoop-related bits and the AvroStorage class.
* Address any concerns raised here in review.

> Pig to Avro translation -- Pig DatumReader/Writer
> -------------------------------------------------
>
>                 Key: AVRO-592
>                 URL: https://issues.apache.org/jira/browse/AVRO-592
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Scott Carey
>            Assignee: Scott Carey
>             Fix For: 1.4.0
>
>         Attachments: AVRO-592.patch
>
>
> It would be great to use Avro to store Pig outputs.  Because Avro persists 
> the schema along with the data, one can store data in one script, then load 
> it in another and preserve the schema.
> Additionally, one can serialize Pig Tuples to Avro and read Avro into Pig 
> Tuples.  Avro schemas are significantly richer than Pig schemas, but a 
> limited translation is possible.
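
As a rough illustration of what such a limited translation might look like, here is a minimal sketch that maps a flat Pig schema onto an Avro record schema. It is not taken from AVRO-592.patch: the function name `pig_to_avro`, the (name, type) pair representation of a Pig schema, the default record name, and the decision to wrap every field in a union with "null" (on the grounds that Pig fields may be null) are all assumptions made for this example. Nested Pig types (tuple, bag, map) are deliberately left out.

```python
import json

# Pig scalar type name -> Avro primitive type name.
# Assumed mapping for illustration; the actual patch may differ.
PIG_TO_AVRO_PRIMITIVES = {
    "int": "int",
    "long": "long",
    "float": "float",
    "double": "double",
    "chararray": "string",
    "bytearray": "bytes",
}


def pig_to_avro(fields, record_name="PigTuple"):
    """Translate (name, pig_type) pairs into an Avro record schema (as a dict).

    Each field becomes a ["null", <type>] union, since a Pig field may be null.
    Types outside the table above are rejected rather than guessed at.
    """
    avro_fields = []
    for name, pig_type in fields:
        if pig_type not in PIG_TO_AVRO_PRIMITIVES:
            raise ValueError(f"no translation for Pig type {pig_type!r} in this sketch")
        avro_fields.append(
            {"name": name, "type": ["null", PIG_TO_AVRO_PRIMITIVES[pig_type]]}
        )
    return {"type": "record", "name": record_name, "fields": avro_fields}


# Hypothetical Pig schema: (user:chararray, clicks:long)
schema = pig_to_avro([("user", "chararray"), ("clicks", "long")])
print(json.dumps(schema, indent=2))
```

The resulting dict serializes to the JSON form in which Avro schemas are declared, which is what would let a later script recover the field names and types.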

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
