Jeff Hammerbacher commented on PIG-833:


Raghu, you mention that a design document is forthcoming. It would be great to 
have a PDF design document, like Matei's for the fair scheduler, in addition to 
the Javadoc and wiki page. Any progress on that front? I'm quite interested in 
learning more about Zebra's use and implementation.

On a larger note, it would be great if Pig moved to the Hadoop model for new 
features, where a design document and test plan are required to commit. See 
https://issues.apache.org/jira/browse/HADOOP-5587. It's tough to digest the 
bulk dumps of Owl, Zebra, and Giraffe, though we certainly appreciate the work 
Yahoo has done on these projects!


> Storage access layer
> --------------------
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>         Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, 
> PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2, 
> TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz
>
> A layer is needed to provide a high level data access abstraction and a 
> tabular view of data in Hadoop, and could free Pig users from implementing 
> their own data storage/retrieval code.  This layer should also include a 
> columnar storage format in order to provide fast data projection, 
> CPU/space-efficient data serialization, and a schema language to manage 
> physical storage metadata.  Eventually it could also support predicate 
> pushdown for further performance improvement.  Initially, this layer could be 
> a contrib project in Pig and become a Hadoop subproject later on.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
