[Pig Wiki] Update of "owl" by jaytang

Apache Wiki Mon, 22 Mar 2010 16:21:55 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.


The "owl" page has been changed by jaytang.
http://wiki.apache.org/pig/owl?action=diff&rev1=4&rev2=5

--------------------------------------------------

  
  = Apache Owl Wiki =
  
- The goal of Owl is to provide a higher level data management abstraction than 
that provided by HDFS directories and files.  Applications written in MapReduce 
and Pig scripts must deal with low level data management issues such as storage 
format, serialization/compression schemes, data layout, and efficient data 
access paths, often with different solutions. Owl attempts to provide a 
standard way to address these issues.
+ The goal of Owl is to provide a higher level data management abstraction than 
that provided by HDFS directories and files.  Applications written in 
!MapReduce and Pig scripts must deal with low level data management issues such 
as storage format, serialization/compression schemes, data layout, and 
efficient data access paths, often with different solutions. Owl attempts to 
provide a standard way to address these issues.
  
  Owl supports the notion of "Owl Tables", a basic unit of data management.  An 
Owl Table has these characteristics:
  
     * lives in an Owl database name space and could contain multiple partitions
     * has columns and rows and supports a unified table level schema
-    * supports MapReduce and Pig Latin and potentially other languages
+    * supports !MapReduce and Pig Latin and potentially other languages
     * designed for batch read/write operations
     * supports external tables (data already exists on file system)
     * pluggable architecture for different storage formats such as Zebra
@@ -20, +20 @@

     * efficient data access mechanisms via partition and projection pruning
  
  
- Owl supports two major public APIs.  "Owl Driver" provides management APIs 
against "Owl Table", "Owl Database", and "Partition".  This API is backed by 
an internal Owl metadata store that runs on Tomcat and a relational database.  
"OwlInputFormat" provides a data access API and is modeled after the 
traditional Hadoop InputFormat.  In the future, we plan to support 
"OwlOutputFormat" and thus the notion of "Owl Managed Table" where Owl controls 
the data flow into and out of "Owl Tables".  Owl supports Pig integration with 
the OwlPigLoader/Storer module.
+ Owl supports two major public APIs.  Owl Driver provides management APIs 
against "Owl Table", "Owl Database", and "Partition".  This API is backed by 
an internal Owl metadata store that runs on Tomcat and a relational database.  
!OwlInputFormat provides a data access API and is modeled after the traditional 
Hadoop !InputFormat.  In the future, we plan to support !OwlOutputFormat and 
thus the notion of "Owl Managed Table" where Owl controls the data flow into 
and out of "Owl Tables".  Owl supports Pig integration with the 
!OwlPigLoader/Storer module.
  
  
  == Prerequisite ==
  
- Owl depends on Pig for its tuple classes, which serve as the basic data 
container, and Hadoop 20 for "OwlInputFormat".  Owl supports Zebra integration 
out of the box.
+ Owl depends on Pig for its tuple classes, which serve as the basic data 
container, and Hadoop 20 for !OwlInputFormat.
  
  == Getting Owl ==
  
@@ -38, +38 @@

     * JDK 1.6
     * Ant 1.7.1
     * download [[http://dev.mysql.com/downloads/connector/j/5.1.html|MySQL 5.1 
JDBC driver]]
-    * Oracle
+    * or Oracle 11g JDBC driver
  
  How to compile
