[jira] Updated: (PIG-833) Storage access layer

2009-08-11 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-833:
-

Attachment: PIG-833-zebra.patch.bz2

Updated patch. Only change is that ant prints a descriptive error to user if 
hadoop20.jar does not exist in top level lib directory. It lists basic steps to 
get this built until PIG-660 is committed.


 Storage access layer
 

 Key: PIG-833
 URL: https://issues.apache.org/jira/browse/PIG-833
 Project: Pig
  Issue Type: New Feature
Reporter: Jay Tang
 Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, 
 PIG-833-zebra.patch.bz2, zebra-javadoc.tgz


 A layer is needed to provide a high level data access abstraction and a 
 tabular view of data in Hadoop, and could free Pig users from implementing 
 their own data storage/retrieval code.  This layer should also include a 
 columnar storage format in order to provide fast data projection, 
 CPU/space-efficient data serialization, and a schema language to manage 
 physical storage metadata.  Eventually it could also support predicate 
 pushdown for further performance improvement.  Initially, this layer could be 
 a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-833) Storage access layer

2009-08-11 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-833:
-

Attachment: PIG-833-zebra.patch.bz2

 Storage access layer
 

 Key: PIG-833
 URL: https://issues.apache.org/jira/browse/PIG-833
 Project: Pig
  Issue Type: New Feature
Reporter: Jay Tang
 Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, 
 PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2, zebra-javadoc.tgz


 A layer is needed to provide a high level data access abstraction and a 
 tabular view of data in Hadoop, and could free Pig users from implementing 
 their own data storage/retrieval code.  This layer should also include a 
 columnar storage format in order to provide fast data projection, 
 CPU/space-efficient data serialization, and a schema language to manage 
 physical storage metadata.  Eventually it could also support predicate 
 pushdown for further performance improvement.  Initially, this layer could be 
 a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-833) Storage access layer

2009-08-11 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-833:
---

Attachment: TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt

Okay, now that I've first built Pig's test, I run the tests and I get:

{code}
 [delete] Deleting directory 
/Users/gates/src/pig/apache/top/zebra/trunk/build/contrib/zebra/test/logs
[mkdir] Created dir: 
/Users/gates/src/pig/apache/top/zebra/trunk/build/contrib/zebra/test/logs
[junit] Running org.apache.hadoop.zebra.io.TestCheckin
[junit] Tests run: 125, Failures: 0, Errors: 0, Time elapsed: 16.894 sec
[junit] Running org.apache.hadoop.zebra.mapred.TestCheckin
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 158.741 sec
[junit] Running org.apache.hadoop.zebra.pig.TestCheckin1
[junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.13 sec
[junit] Test org.apache.hadoop.zebra.pig.TestCheckin1 FAILED
[junit] Running org.apache.hadoop.zebra.pig.TestCheckin2
[junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.131 sec
[junit] Test org.apache.hadoop.zebra.pig.TestCheckin2 FAILED
[junit] Running org.apache.hadoop.zebra.pig.TestCheckin3
[junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.133 sec
[junit] Test org.apache.hadoop.zebra.pig.TestCheckin3 FAILED
[junit] Running org.apache.hadoop.zebra.pig.TestCheckin4
[junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.128 sec
[junit] Test org.apache.hadoop.zebra.pig.TestCheckin4 FAILED
[junit] Running org.apache.hadoop.zebra.pig.TestCheckin5
[junit] Tests run: 0, Failures: 0, Errors: 2, Time elapsed: 0.128 sec
[junit] Test org.apache.hadoop.zebra.pig.TestCheckin5 FAILED
[junit] Running org.apache.hadoop.zebra.types.TestCheckin
[junit] Tests run: 45, Failures: 0, Errors: 0, Time elapsed: 0.253 sec
{code}

I've attached the output from one of the tests.

 Storage access layer
 

 Key: PIG-833
 URL: https://issues.apache.org/jira/browse/PIG-833
 Project: Pig
  Issue Type: New Feature
Reporter: Jay Tang
 Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, 
 PIG-833-zebra.patch.bz2, PIG-833-zebra.patch.bz2, 
 TEST-org.apache.hadoop.zebra.pig.TestCheckin1.txt, test.out, zebra-javadoc.tgz


 A layer is needed to provide a high level data access abstraction and a 
 tabular view of data in Hadoop, and could free Pig users from implementing 
 their own data storage/retrieval code.  This layer should also include a 
 columnar storage format in order to provide fast data projection, 
 CPU/space-efficient data serialization, and a schema language to manage 
 physical storage metadata.  Eventually it could also support predicate 
 pushdown for further performance improvement.  Initially, this layer could be 
 a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-833) Storage access layer

2009-07-28 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-833:
-

Attachment: hadoop20.jar.bz2

Attaching hadoop20.jar that needs to be placed under lib/ directory under the 
top level PIG directory. will included specific instructions later in the jira.

 Storage access layer
 

 Key: PIG-833
 URL: https://issues.apache.org/jira/browse/PIG-833
 Project: Pig
  Issue Type: New Feature
Reporter: Jay Tang
 Attachments: hadoop20.jar.bz2


 A layer is needed to provide a high level data access abstraction and a 
 tabular view of data in Hadoop, and could free Pig users from implementing 
 their own data storage/retrieval code.  This layer should also include a 
 columnar storage format in order to provide fast data projection, 
 CPU/space-efficient data serialization, and a schema language to manage 
 physical storage metadata.  Eventually it could also support predicate 
 pushdown for further performance improvement.  Initially, this layer could be 
 a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-833) Storage access layer

2009-07-28 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-833:
-

Attachment: PIG-833-zebra.patch

The first cut of contrib/zebra. The patch is very large and should probably 
compress the subsequent versions of it.

More documentation on design and usage will be added to the jira.

How to compile :
--
 * check out latest PIG trunk
 * Apply the latest patch from PIG-660
 * copy attached hadoop20.jar to ./lib
 * run '{{ant jar}}' (and {{'ant -Dtestcase=none test-core'}} for zebra tests).
 * cd contrib/zebra
 * ant jar
 * ant test (for tests).

Currently there are compile time deprecation warnings related to use of 
deprecated mapred API (JobConf). There is will be fixed later.


 Storage access layer
 

 Key: PIG-833
 URL: https://issues.apache.org/jira/browse/PIG-833
 Project: Pig
  Issue Type: New Feature
Reporter: Jay Tang
 Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch


 A layer is needed to provide a high level data access abstraction and a 
 tabular view of data in Hadoop, and could free Pig users from implementing 
 their own data storage/retrieval code.  This layer should also include a 
 columnar storage format in order to provide fast data projection, 
 CPU/space-efficient data serialization, and a schema language to manage 
 physical storage metadata.  Eventually it could also support predicate 
 pushdown for further performance improvement.  Initially, this layer could be 
 a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-833) Storage access layer

2009-07-28 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-833:
-

Attachment: zebra-javadoc.tgz

 Storage access layer
 

 Key: PIG-833
 URL: https://issues.apache.org/jira/browse/PIG-833
 Project: Pig
  Issue Type: New Feature
Reporter: Jay Tang
 Attachments: hadoop20.jar.bz2, PIG-833-zebra.patch, zebra-javadoc.tgz


 A layer is needed to provide a high level data access abstraction and a 
 tabular view of data in Hadoop, and could free Pig users from implementing 
 their own data storage/retrieval code.  This layer should also include a 
 columnar storage format in order to provide fast data projection, 
 CPU/space-efficient data serialization, and a schema language to manage 
 physical storage metadata.  Eventually it could also support predicate 
 pushdown for further performance improvement.  Initially, this layer could be 
 a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.