[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863750#action_12863750 ]

Jeff Hammerbacher commented on PIG-1331:

Hey,

Does this issue make PIG-823 a duplicate?

Thanks,
Jeff

Owl Hadoop Table Management Service

Key: PIG-1331
URL: https://issues.apache.org/jira/browse/PIG-1331
Project: Pig
Issue Type: New Feature
Affects Versions: 0.8.0
Reporter: Jay Tang
Assignee: Ajay Kidave
Fix For: 0.8.0
Attachments: anttestoutput.tgz, build.log, ivy_version.patch, owl.contrib.3.tgz, owl.contrib.4.tar.gz

This JIRA is a proposal to create a Hadoop table management service: Owl. Today, MapReduce and Pig applications interact directly with HDFS directories and files and must deal with low-level data management issues such as storage format, serialization/compression schemes, data layout, and efficient data access, often with different solutions. Owl aims to provide a standard way to address these issues and to abstract away the complexities of reading and writing huge amounts of data from and to HDFS. Owl has a data access API that is modeled after the traditional Hadoop InputFormat and a management API to manipulate Owl objects. This JIRA is related to PIG-823 (Hadoop Metadata Service) as Owl has an internal metadata store. Owl integrates with different storage modules such as Zebra through a pluggable architecture. Initially, the proposal is to submit Owl as a Pig contrib project. Over time, it makes sense to move it to a Hadoop subproject.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863835#action_12863835 ]

Jay Tang commented on PIG-1331:

Yes, Jeff. Owl, as a table management service, has a metadata module. Please see http://wiki.apache.org/pig/owl for more information.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864088#action_12864088 ]

Jeff Hammerbacher commented on PIG-1331:

Okay, seems like PIG-823 should be closed then. I don't have the ability to do that.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856550#action_12856550 ]

Allen Wittenauer commented on PIG-1331:

"Initially, the proposal is to submit Owl as a Pig contrib project. Over time, it makes sense to move it to a Hadoop subproject."

Then shouldn't it start out in Hadoop contrib? I don't understand why this is under Pig if it is meant for Hadoop-general.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856553#action_12856553 ]

Olga Natkovich commented on PIG-1331:

Pig will be the first consumer of it because of SQL that we are planning to open source shortly. We are open for it to move out once we have an agreement on where it should reside permanently.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856208#action_12856208 ]

Alan Gates commented on PIG-1331:

Now that I've figured out how to read directions I've run the tests on the new patch and they pass. I've also run javadocs and findbugs and all looks good. I'll start a vote on whether to accept this as a contrib.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856239#action_12856239 ]

Alan Gates commented on PIG-1331:

After looking through the rules it looks like we don't need a vote on contrib projects. This JIRA serves as the place for people to voice their concerns and vote for or against. I'd like to commit this within a few days unless I hear otherwise.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853603#action_12853603 ]

Alan Gates commented on PIG-1331:

Given that hudson can't apply this patch (because it's a .tgz file) I've run the parts of the patch test manually. There are issues with ant test, javadoc, and findbugs.

When I try to do ant test I get:

{code}
war:
      [war] Building war: /home/gates/src/pig/owl/trunk/contrib/owl/build/owl-0.8.0-dev.war

jetty-stop:

BUILD FAILED
/home/gates/src/pig/owl/trunk/contrib/owl/build.xml:908: Execute failed: java.io.IOException: Cannot run program /home/gates/src/pig/owl/trunk/contrib/owl/ci//jetty_stop.sh: java.io.IOException: error=13, Permission denied
{code}

I see ten warnings when I build. These are all in the parser so it looks like they're ok.

javadoc generates five warnings, which need to be addressed.

findbugs generates one warning:

{code}
Se  org.apache.hadoop.owl.protocol.OwlSchema$OwlColumnSchemaComparable implements Comparator but not Serializable

Bug type SE_COMPARATOR_SHOULD_BE_SERIALIZABLE (click for details)
In class org.apache.hadoop.owl.protocol.OwlSchema$OwlColumnSchemaComparable
At OwlSchema.java:[lines 180-185]

Details

SE_COMPARATOR_SHOULD_BE_SERIALIZABLE: Comparator doesn't implement Serializable

This class implements the Comparator interface. You should consider whether or not it should also implement the Serializable interface. If a comparator is used to construct an ordered collection such as a TreeMap, then the TreeMap will be serializable only if the comparator is also serializable. As most comparators have little or no state, making them serializable is generally easy and good defensive programming.
{code}

Rat only complains about files produced by javacc, so that should be fine.
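The findbugs warning quoted above is usually resolved by having the comparator also implement Serializable. The following is a minimal sketch of that pattern, not the actual Owl fix: ColumnNameComparator and SerializableComparatorDemo are hypothetical stand-ins, and the real OwlSchema$OwlColumnSchemaComparable class is not shown in this thread. It round-trips a TreeMap through Java serialization, which only works because the comparator is serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical stand-in for a column comparator; not Owl's actual class.
// Implementing Serializable is what addresses SE_COMPARATOR_SHOULD_BE_SERIALIZABLE.
final class ColumnNameComparator implements Comparator<String>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public int compare(String left, String right) {
        // Stateless, case-insensitive ordering of column names.
        return left.compareToIgnoreCase(right);
    }
}

final class SerializableComparatorDemo {
    // Serializes the map to a byte array and reads it back.
    // This fails with NotSerializableException if the map's comparator is not Serializable.
    static TreeMap<String, Integer> roundTrip(TreeMap<String, Integer> map) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(map);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                TreeMap<String, Integer> copy = (TreeMap<String, Integer>) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("TreeMap round trip failed", e);
        }
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> columns = new TreeMap<>(new ColumnNameComparator());
        columns.put("userId", 0);
        columns.put("Country", 1);
        // The copy keeps the same comparator and therefore the same ordering.
        System.out.println(roundTrip(columns).keySet());
    }
}
```

Because the comparator holds no state, adding Serializable and a serialVersionUID is the whole change; the ordering logic is untouched.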
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853606#action_12853606 ]

Alan Gates commented on PIG-1331:

Alright, after reading the directions and doing the chmod before running the tests I get the tests to run. However, I see a number of failures. I've attached the logs from the run. It looks like the jetty server isn't running or isn't accepting connections. Did I miss another step?
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851998#action_12851998 ]

Jay Tang commented on PIG-1331:

There seems to be an issue with the Maven repo. We'll attach jar files and update build scripts.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852023#action_12852023 ]

Ajay Kidave commented on PIG-1331:

The build output shows that Alan's build used ivy-2.0.0-rc2.jar (which is what Pig uses), whereas Owl includes ivy-2.1.0.jar. The Owl build works with ivy 2.1.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852205#action_12852205 ]

Ajay Kidave commented on PIG-1331:

If ant still uses the older ivy version, this might be because an older version of the ivy jar file is present in $ANT_HOME/lib or ~/.ant/lib. Removing the older jar from these locations should cause ivy 2.1 to be used.
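One way to check for a stale ivy jar in the two locations mentioned above is to list any files matching ivy*.jar in each directory. The following is a rough sketch only: IvyJarScan and findIvyJars are hypothetical helper names, and the glob pattern ivy*.jar is an assumption about how the jars are named. It reports what it finds and deletes nothing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Ant, Ivy, Pig, or Owl.
final class IvyJarScan {
    // Returns every file in dir whose name matches ivy*.jar (an assumed naming pattern).
    // A missing directory yields an empty list.
    static List<Path> findIvyJars(Path dir) {
        List<Path> found = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return found;
        }
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "ivy*.jar")) {
            for (Path jar : jars) {
                found.add(jar);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // The two locations named in the comment: $ANT_HOME/lib and ~/.ant/lib.
        List<Path> dirs = new ArrayList<>();
        String antHome = System.getenv("ANT_HOME");
        if (antHome != null) {
            dirs.add(Paths.get(antHome, "lib"));
        }
        dirs.add(Paths.get(System.getProperty("user.home"), ".ant", "lib"));

        for (Path dir : dirs) {
            for (Path jar : findIvyJars(dir)) {
                System.out.println(jar);
            }
        }
    }
}
```

Any jar it prints with a version older than 2.1 is a candidate for removal before rebuilding.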
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852207#action_12852207 ]

Alan Gates commented on PIG-1331:

After applying the ivy_version patch and removing my cached version of ivy from ~/.ant/lib the patch compiles.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851602#action_12851602 ]

Alan Gates commented on PIG-1331:

Patch as provided doesn't build. It gets an ivy error. I've attached a copy of the stdout and stderr from the build run.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851040#action_12851040 ]

Ashish Thusoo commented on PIG-1331:

Actually Jay, all those things are also already provided by the Hive metastore + compiler. I thought I would just mention that here - perhaps I am missing something.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851230#action_12851230 ]

Jay Tang commented on PIG-1331:

Ashish, the goal of Owl is to provide a table-like abstraction to manage Hadoop data. The design would allow any customer MapReduce applications, Pig Latin, and even Hive query language to consume data via Owl's interface. Our vision is to build a full data life cycle management stack that encompasses data creation, notification, consumption, retention, security management, and so on. Owl would make things easier for a MapReduce application writer or for someone building another query processing language on top of it. We will update the Owl wiki page with more detailed information.
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850688#action_12850688 ]

Jay Tang commented on PIG-1331:

Carl, from a serialization/deserialization perspective, the functionality appears similar. Owl also handles other storage layer interactions like data pruning. Owl supports partition and column pruning; we plan to support row pruning via predicate pushdown. The goal is to push data filtering work down. If a storage layer does not support a certain filter capability, Owl would provide an implementation.
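The fallback idea in the comment above (push a filter down when the storage layer can evaluate it, otherwise apply it in the service layer) can be sketched as follows. This is an illustration under stated assumptions, not Owl's design or API: RowSource, InMemorySource, and FilteringReader are hypothetical names, and rows are modeled as plain maps for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical storage-layer contract; these names are illustrative, not Owl's API.
interface RowSource {
    // True if this storage layer can evaluate a row filter itself.
    boolean supportsRowFilter();

    // Returns all rows, or only matching rows when a non-null filter is pushed down
    // to a source that supports it.
    List<Map<String, Object>> scan(Predicate<Map<String, Object>> pushedFilter);
}

// Toy in-memory storage layer whose pushdown capability can be switched on or off.
final class InMemorySource implements RowSource {
    private final List<Map<String, Object>> rows;
    private final boolean pushdown;

    InMemorySource(List<Map<String, Object>> rows, boolean pushdown) {
        this.rows = rows;
        this.pushdown = pushdown;
    }

    @Override
    public boolean supportsRowFilter() {
        return pushdown;
    }

    @Override
    public List<Map<String, Object>> scan(Predicate<Map<String, Object>> pushedFilter) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            // Without pushdown support the filter is ignored and every row is returned.
            if (!pushdown || pushedFilter == null || pushedFilter.test(row)) {
                out.add(row);
            }
        }
        return out;
    }
}

final class FilteringReader {
    // Pushes the filter down when possible; otherwise scans everything and filters here.
    static List<Map<String, Object>> read(RowSource source,
                                          Predicate<Map<String, Object>> filter) {
        if (source.supportsRowFilter()) {
            return source.scan(filter);
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : source.scan(null)) {
            if (filter.test(row)) {
                out.add(row);
            }
        }
        return out;
    }
}
```

Callers of FilteringReader.read get the same rows either way; only where the filtering work happens changes, which is the point of pushing it down.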
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850485#action_12850485 ]
Hadoop QA commented on PIG-1331:
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12439938/owl.contrib.3.tgz against trunk revision 928080.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/253/console
This message is automatically generated.
Owl Hadoop Table Management Service
Key: PIG-1331 URL: https://issues.apache.org/jira/browse/PIG-1331 Project: Pig Issue Type: New Feature Affects Versions: 0.8.0 Reporter: Jay Tang Attachments: owl.contrib.3.tgz
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850638#action_12850638 ]
Jay Tang commented on PIG-1331:
Owl's data access API, OwlInputFormat, provides a uniform API to access data stored in different storage formats such as Zebra, RCFile, and SequenceFile. It's a single data access abstraction on top of disparate data.
Owl Hadoop Table Management Service
Key: PIG-1331 URL: https://issues.apache.org/jira/browse/PIG-1331 Project: Pig Issue Type: New Feature Affects Versions: 0.8.0 Reporter: Jay Tang Attachments: owl.contrib.3.tgz
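The idea in comment 12850638, one access path in front of several storage formats, might look like the sketch below. It is not OwlInputFormat: the `READERS` registry, `read_table`, and the table dictionaries are hypothetical, and CSV and JSON Lines stand in for Zebra, RCFile, and SequenceFile purely so that the example runs with the Python standard library.

```python
import csv
import io
import json
from typing import Callable, Dict, List

Row = Dict[str, str]


def read_csv(data: str) -> List[Row]:
    """Decode CSV text with a header line into a list of rows."""
    return [dict(row) for row in csv.DictReader(io.StringIO(data))]


def read_jsonl(data: str) -> List[Row]:
    """Decode JSON Lines text (one JSON object per line) into a list of rows."""
    return [json.loads(line) for line in data.splitlines() if line.strip()]


# Hypothetical registry: each storage format plugs in its own reader.
READERS: Dict[str, Callable[[str], List[Row]]] = {
    "csv": read_csv,
    "jsonl": read_jsonl,
}


def read_table(table: Dict[str, str]) -> List[Row]:
    """Single entry point: look up the table's format and delegate to its reader."""
    fmt = table["format"]
    if fmt not in READERS:
        raise ValueError(f"no reader registered for format {fmt!r}")
    return READERS[fmt](table["data"])


# The same logical table stored in two different formats.
csv_table = {"format": "csv", "data": "id,name\n1,a\n2,b\n"}
jsonl_table = {
    "format": "jsonl",
    "data": '{"id": "1", "name": "a"}\n{"id": "2", "name": "b"}\n',
}
```

In this sketch the caller only ever uses `read_table`; supporting another format means registering one more reader, which is roughly what a pluggable storage architecture implies.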
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850639#action_12850639 ]
Carl Steinbach commented on PIG-1331:
bq. Owl's data access API, OwlInputFormat, provides a uniform API to access data stored in different storage format like Zebra, RCFile, SequenceFile, etc. Its a single data access abstraction on top of disparate data.
This sounds like Hive's SerDe interface. Are there any differences?
Owl Hadoop Table Management Service
Key: PIG-1331 URL: https://issues.apache.org/jira/browse/PIG-1331 Project: Pig Issue Type: New Feature Affects Versions: 0.8.0 Reporter: Jay Tang Attachments: owl.contrib.3.tgz
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850342#action_12850342 ]
Santhosh Srinivasan commented on PIG-1331:
Jay, In PIG-823 there was a discussion around how Owl is different from Hive's metastore. Is that still true today? If not, can you elaborate on the key differences between the two systems? Thanks, Santhosh
Owl Hadoop Table Management Service
Key: PIG-1331 URL: https://issues.apache.org/jira/browse/PIG-1331 Project: Pig Issue Type: New Feature Reporter: Jay Tang
[jira] Commented: (PIG-1331) Owl Hadoop Table Management Service
[ https://issues.apache.org/jira/browse/PIG-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12850347#action_12850347 ]
Jay Tang commented on PIG-1331:
Owl has an internal metastore whose relational table and partition model is similar to that of Hive's metastore. Owl goes beyond this and provides a uniform data access mechanism on top of multiple storage formats. This interface can be leveraged by Pig and MapReduce applications. There is room for collaboration between Owl and Hive so that we could eventually converge on a common metastore for Hadoop.
Owl Hadoop Table Management Service
Key: PIG-1331 URL: https://issues.apache.org/jira/browse/PIG-1331 Project: Pig Issue Type: New Feature Reporter: Jay Tang