[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raghu Angadi updated PIG-833:
-----------------------------

    Attachment: PIG-833-zebra.patch

The first cut of contrib/zebra. The patch is very large, so subsequent versions of it should probably be compressed. More documentation on design and usage will be added to the jira.

How to compile :
----------------------

* check out latest PIG trunk
* Apply the latest patch from PIG-660
* copy attached hadoop20.jar to ./lib (it is attached as hadoop20.jar.bz2, so decompress it first)
* run '{{ant jar}}' (and '{{ant -Dtestcase=none test-core}}' for zebra tests).
* cd contrib/zebra
* ant jar
* ant test (for tests).

Currently there are compile-time deprecation warnings related to use of the deprecated mapred API (JobConf). These will be fixed later.

> Storage access layer
> --------------------
>
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>         Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch
>
>
> A layer is needed to provide a high level data access abstraction and a
> tabular view of data in Hadoop, and could free Pig users from implementing
> their own data storage/retrieval code. This layer should also include a
> columnar storage format in order to provide fast data projection,
> CPU/space-efficient data serialization, and a schema language to manage
> physical storage metadata. Eventually it could also support predicate
> pushdown for further performance improvement. Initially, this layer could be
> a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.