Dear Wiki user, You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.
The following page has been changed by jaytang:
http://wiki.apache.org/pig/zebra

------------------------------------------------------------------------------
  #language en
  #pragma section-numbers off
- = Apache Pig-Zebra Wiki =
+ = Apache Zebra Wiki =
+ 
+ == Introduction ==

  Zebra is a storage layer that provides a high-level data access abstraction and a tabular view of data in Hadoop, and could free Pig users from implementing their own data storage/retrieval code. It provides
@@ -14, +16 @@

  In the future, it could also support predicate pushdown for further performance improvement.

  Initially, Zebra is released as a contrib project in Pig and can become a Hadoop subproject later on.
+ 
+ == Prerequisite ==
+ 
+ Zebra requires a Hadoop 20 build that supports TFile (as of July 24th, 2009, Hadoop 20 with Hadoop patch 6150) and works with Pig 0.3.0 with patch PIG-660 applied. This patch makes Pig work with Hadoop 20. Zebra has been submitted as PIG-833.
+ 
+ == Getting Zebra ==
+ 
+ Zebra has been committed as a Pig contrib project at:
+ 
+ [http://svn.apache.org/viewvc/hadoop/pig/trunk/contrib/zebra /Zebra source code]
+ 
+ Compilation prerequisites:
+ 
+  * JDK 1.6
+  * Ant 1.7.1
+  * Javacc 4.2
+ 
+ How to compile:
+ 
+  * check out the latest Pig trunk
+  * apply the latest patch from PIG-660
+  * copy [https://issues.apache.org/jira/secure/attachment/12414823/hadoop20.jar.bz2 /hadoop20.jar] attached to PIG-833 to Pig's top-level ./lib
+  * run 'ant jar' (generates a Pig binary compatible with Hadoop 20)
+  * run 'ant -Dtestcase=none test-core' (for Zebra tests)
+  * cd contrib/zebra
+  * ant jar
+  * ant test (for tests)
+ 
+ The Zebra jar will be generated in the build/contrib/zebra directory.
+ 
+ 
+ == Running Zebra ==
+ 
+ Sample MapReduce code and Pig scripts are attached to this wiki. Javadoc is available at [https://issues.apache.org/jira/secure/attachment/12414838/zebra-javadoc.tgz /Zebra JavaDoc]
+ 
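The "How to compile" steps above could be collected into a small shell helper along the following lines. This is only a sketch, not part of the wiki page: the paths `PIG_HOME`, `PIG660_PATCH` and `HADOOP20_JAR`, the patch file name, and the `-p0` strip level are assumptions to adjust for your checkout. It prints the commands by default rather than running them.

```shell
#!/bin/sh
# Sketch of the "How to compile" steps. Dry-run by default: each command is
# printed instead of executed. Set DRY_RUN=0 to actually run the build.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical local paths (not given on the wiki page); use absolute paths.
PIG_HOME="${PIG_HOME:-$PWD/pig-trunk}"               # checked-out Pig trunk
PIG660_PATCH="${PIG660_PATCH:-$PWD/PIG-660.patch}"   # latest patch from PIG-660
HADOOP20_JAR="${HADOOP20_JAR:-$PWD/hadoop20.jar}"    # unpacked hadoop20.jar.bz2 from PIG-833

# Print the command in dry-run mode, otherwise execute it in the current shell.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

build_zebra() {
    run cd "$PIG_HOME" || return 1
    # Assumption: the patch was created relative to the trunk root (-p0).
    run patch -p0 -i "$PIG660_PATCH" || return 1
    run cp "$HADOOP20_JAR" lib/ || return 1
    run ant jar || return 1                          # Pig binary compatible with Hadoop 20
    run ant -Dtestcase=none test-core || return 1    # needed for the Zebra tests
    run cd contrib/zebra || return 1
    run ant jar || return 1                          # jar ends up under build/contrib/zebra
    run ant test || return 1
}
```

To use it, source the file and call `build_zebra`; review the printed commands first, then set `DRY_RUN=0` and call it again to run the build.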
