[jira] Updated: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-833:

Attachment: PIG-833-zebra.patch.bz2

Updated patch. The only change is that ant now prints a descriptive error to the user if hadoop20.jar does not exist in the top-level lib directory. The error lists the basic steps needed to get this built until PIG-660 is committed.

Storage access layer
Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang
Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2, zebra-javadoc.tgz

A layer is needed to provide a high-level data access abstraction and a tabular view of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval code. This layer should also include a columnar storage format in order to provide fast data projection, CPU/space-efficient data serialization, and a schema language to manage physical storage metadata. Eventually it could also support predicate pushdown for further performance improvement.

Initially, this layer could be a contrib project in Pig and become a Hadoop subproject later on.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
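The missing-jar check described in this update lives in the Ant build of the patch, which is not shown in this thread. As a rough illustration only, the same idea can be sketched as a shell function. The function name `check_hadoop_jar` and the wording of the messages are hypothetical; only the jar location (lib/hadoop20.jar under the top-level directory) and the dependency on PIG-660 come from the messages here.

```shell
# Sketch only: a shell version of the "fail early with a helpful message" idea.
# The real check is an Ant target in the patch; names and wording here are assumptions.
check_hadoop_jar() {
    pig_top=$1                          # top-level PIG directory
    jar="$pig_top/lib/hadoop20.jar"     # location named in this thread

    if [ ! -f "$jar" ]; then
        {
            echo "error: $jar not found"
            echo "until PIG-660 is committed, the basic steps are:"
            echo "  1. apply the latest patch from PIG-660 to PIG trunk"
            echo "  2. copy the hadoop20.jar attached to PIG-833 into $pig_top/lib"
        } >&2
        return 1
    fi
    return 0
}
```

A caller might run `check_hadoop_jar . && ant jar` from the top-level directory, so the build only starts once the jar is in place.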
[jira] Updated: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-833:

Attachment: PIG-833-zebra.patch.bz2

Storage access layer
Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang
Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2, zebra-javadoc.tgz
[jira] Updated: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-833:

Attachment: TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt

Okay, now that I've first built Pig's test, I run the tests and I get:

{code}
   [delete] Deleting directory /Users/gates/src/pig/apache/top/zebra/trunk/build/contrib/zebra/test/logs
    [mkdir] Created dir: /Users/gates/src/pig/apache/top/zebra/trunk/build/contrib/zebra/test/logs
    [junit] Running org.apache.hadoop.zebra.io.TestCheckin
    [junit] Tests run: 125, Failures: 0, Errors: 0, Time elapsed: 16.894 sec
    [junit] Running org.apache.hadoop.zebra.mapred.TestCheckin
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 158.741 sec
    [junit] Running org.apache.hadoop.zebra.pig.TestCheckin1
    [junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.13 sec
    [junit] Test org.apache.hadoop.zebra.pig.TestCheckin1 FAILED
    [junit] Running org.apache.hadoop.zebra.pig.TestCheckin2
    [junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.131 sec
    [junit] Test org.apache.hadoop.zebra.pig.TestCheckin2 FAILED
    [junit] Running org.apache.hadoop.zebra.pig.TestCheckin3
    [junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.133 sec
    [junit] Test org.apache.hadoop.zebra.pig.TestCheckin3 FAILED
    [junit] Running org.apache.hadoop.zebra.pig.TestCheckin4
    [junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.128 sec
    [junit] Test org.apache.hadoop.zebra.pig.TestCheckin4 FAILED
    [junit] Running org.apache.hadoop.zebra.pig.TestCheckin5
    [junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.128 sec
    [junit] Test org.apache.hadoop.zebra.pig.TestCheckin5 FAILED
    [junit] Running org.apache.hadoop.zebra.types.TestCheckin
    [junit] Tests run: 45, Failures: 0, Errors: 0, Time elapsed: 0.253 sec
{code}

I've attached the output from one of the tests.
Storage access layer
Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang
Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2, TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz
[jira] Updated: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-833:

Attachment: hadoop20.jar.bz2

Attaching hadoop20.jar, which needs to be placed in the lib/ directory under the top-level PIG directory. Will include specific instructions later in the jira.

Storage access layer
Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang
Attachments: hadoop20.jar.bz2
[jira] Updated: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-833:

Attachment: PIG-833-zebra.patch

The first cut of contrib/zebra. The patch is very large, so subsequent versions of it should probably be compressed. More documentation on design and usage will be added to the jira.

How to compile:

* check out the latest PIG trunk
* apply the latest patch from PIG-660
* copy the attached hadoop20.jar to ./lib
* run '{{ant jar}}' (and '{{ant -Dtestcase=none test-core}}' for zebra tests)
* cd contrib/zebra
* ant jar
* ant test (for tests)

Currently there are compile-time deprecation warnings related to use of the deprecated mapred API (JobConf). These will be fixed later.

Storage access layer
Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang
Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch
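The compile steps listed in this message can be collected into one shell function, shown here as a sketch rather than a tested recipe. The function name `build_zebra` and its argument order are hypothetical. Applying PIG-833-zebra.patch itself and using `patch -p0` are assumptions: the message does not say how the patches are applied, and the attached hadoop20.jar.bz2 is assumed to be decompressed already. The trunk checkout is expected to exist before the function is called.

```shell
# Sketch of the build sequence from this message. Pass absolute paths,
# because the function changes directory before using them.
# The body runs in a subshell "( ... )" so the caller's directory is unchanged.
build_zebra() (
    set -eu
    pig_trunk=$1       # existing checkout of the latest PIG trunk
    pig660_patch=$2    # latest patch downloaded from PIG-660
    zebra_patch=$3     # PIG-833-zebra.patch (assumed to be applied the same way)
    hadoop_jar=$4      # decompressed hadoop20.jar from PIG-833

    cd "$pig_trunk"
    patch -p0 < "$pig660_patch"       # -p0 is an assumption about the diff format
    patch -p0 < "$zebra_patch"
    cp "$hadoop_jar" lib/

    ant jar                           # build Pig
    ant -Dtestcase=none test-core     # build Pig's test classes, needed for zebra tests

    cd contrib/zebra
    ant jar                           # build zebra
    ant test                          # run zebra tests
)
```

A call might look like `build_zebra "$PWD/trunk" "$PWD/PIG-660.patch" "$PWD/PIG-833-zebra.patch" "$PWD/hadoop20.jar"`, where the file names are placeholders for whatever was downloaded from the two issues.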
[jira] Updated: (PIG-833) Storage access layer
[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-833:

Attachment: zebra-javadoc.tgz

Storage access layer
Key: PIG-833
URL: https://issues.apache.org/jira/browse/PIG-833
Project: Pig
Issue Type: New Feature
Reporter: Jay Tang
Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, zebra-javadoc.tgz