Author: olga
Date: Fri Jun 5 18:20:55 2009
New Revision: 782088
URL: http://svn.apache.org/viewvc?rev=782088&view=rev
Log:
missing doc file
Added:
hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/getstarted.xml
Added: hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/getstarted.xml
URL:
http://svn.apache.org/viewvc/hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/getstarted.xml?rev=782088&view=auto
==============================================================================
--- hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/getstarted.xml
(added)
+++ hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/getstarted.xml
Fri Jun 5 18:20:55 2009
@@ -0,0 +1,232 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
"http://forrest.apache.org/dtd/document-v20.dtd">
+<document>
+ <header>
+ <title>Pig Getting Started Guide</title>
+ </header>
+ <body>
+
+ <section id="req">
+ <title>Requirements</title>
+
+ <p><strong>Unix</strong> and <strong>Windows</strong> users need the
following:</p>
+ <ol>
+ <li> <strong>Hadoop 0.18</strong>: <a
href="http://hadoop.apache.org/core/">http://hadoop.apache.org/core/</a></li>
+ <li> <strong>Java 1.6</strong>, preferably from Sun: <a
href="http://java.sun.com/javase/downloads/index.jsp">http://java.sun.com/javase/downloads/index.jsp</a>.
Set JAVA_HOME to the root of your Java installation.</li>
+ <li> <strong>Ant</strong> for builds: <a
href="http://ant.apache.org/">http://ant.apache.org/</a>.</li>
+ <li> <strong>JUnit</strong> for unit tests: <a
href="http://junit.sourceforge.net/">http://junit.sourceforge.net/</a>.</li>
+ </ol>
+ <p><strong>Windows</strong> users need to install Cygwin and the Perl
package: <a href="http://www.cygwin.com/"> http://www.cygwin.com/</a>.</p>
+ </section>
+
+ <section>
+ <title>Download Pig</title>
+ <p>To get a Pig distribution, download a recent stable release from one
of the Apache Download Mirrors (see <a
href="http://hadoop.apache.org/pig/releases.html"> Pig Releases</a>).</p>
+ <p>Unpack the downloaded Pig distribution. You can find the Pig script
in the bin directory (/pig-n.n.n/bin/pig).</p>
+ <p>Add /pig-n.n.n/bin to your path. Use export (bash,sh,ksh) or setenv
(tcsh,csh). For example: </p>
+<source>
+$ export PATH=/<my-path-to-pig>/pig-n.n.n/bin:$PATH
+</source>
+ <p>Try the following command to get a listing of all Pig commands:</p>
+<source>
+$ pig -help
+</source>
+ <p>Try the following command to start the Grunt shell:</p>
+<source>
+$ pig
+</source>
+
+
+ </section>
+
+ <section>
+ <title>Build Pig</title>
+ <p>(Optional) To build Pig, do the following:</p>
+ <ol>
+ <li> Check out the Pig code from SVN: <em>svn co
http://svn.apache.org/repos/asf/hadoop/pig/trunk</em>. </li>
+ <li> Build the code from the top directory: <em>ant</em>. If the
build is successful, you should see the <em>pig.jar</em> created in that
directory. </li>
+ <li> Validate your <em>pig.jar</em> by running a unit test: <em>ant
test</em></li>
+ </ol>
+ </section>
+
+<section>
+ <title>Run Pig</title>
+ <p>Pig has two run modes or exectypes: </p>
+ <ul>
+ <li><p> Local Mode: To run Pig in local mode, you need access to a
single machine. </p></li>
+ <li><p> Mapreduce Mode: To run Pig in mapreduce mode, you need access to
a Hadoop cluster and HDFS installation.
+ Pig will automatically allocate and deallocate a 15-node
cluster.</p></li>
+ </ul>
+
+
+<section>
+<title>Grunt Shell</title>
+<p>Use Pig's interactive shell, Grunt, to enter Pig commands manually.
+(You can also run or execute script files from the Grunt shell. See the RUN
and EXEC commands in the <a href="piglatin.html">Pig Latin Manual</a>.) </p>
+<p>Local mode:
+</p>
+<source>
+$ pig -x local
+</source>
+<p>Mapreduce mode:
+</p>
+<source>
+$ pig
+or
+$ pig -x mapreduce
+</source>
+<p>The Grunt shell is invoked and you can enter commands at the prompt.
+</p>
+<source>
+grunt> A = load 'passwd' using PigStorage(':');
+grunt> B = foreach A generate $0 as id;
+grunt> dump B;
+</source>
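+<p>As a rough illustration of what these statements compute, the following plain-Java sketch (no Pig involved) splits each line on ':' and keeps the first field, which is the field that PigStorage(':') and $0 select. The class and method names (IdSketch, firstFields) are hypothetical and are not part of Pig.</p>

```java
public class IdSketch {
    // Returns the first ':'-separated field of every input line,
    // mirroring "B = foreach A generate $0 as id;" on passwd-style data.
    public static String[] firstFields(String[] lines) {
        String[] ids = new String[lines.length];
        int i = 0;
        for (String line : lines) {
            // Limit 2: only the first field matters, so split at most once.
            ids[i++] = line.split(":", 2)[0];
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines in /etc/passwd format.
        String[] sample = {
            "root:x:0:0:root:/root:/bin/bash",
            "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin"
        };
        for (String id : firstFields(sample)) {
            System.out.println(id);
        }
    }
}
```

+<p>The sketch only mirrors the field selection. It does not reproduce how Grunt formats the output of DUMP.</p>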
+</section>
+
+<section>
+<title>Script Files</title>
+<p>Use script files to run Pig commands as batch jobs. See the sample code for
the script file (id.pig) used in the examples.</p>
+<p>Local mode:</p>
+<source>
+$ pig -x local id.pig
+</source>
+<p>Mapreduce mode: </p>
+<source>
+$ pig id.pig
+or
+$ pig -x mapreduce id.pig
+</source>
+<p>The Pig Latin statements are executed and the results are displayed to your
terminal screen (if DUMP is used) or to a file (if STORE is used).</p>
+</section>
+
+<section>
+ <title>Embedded Programs</title>
+<p>Embed Pig commands in a host language and run the program.
+See the sample code for the Java files (idlocal.java, idhadoop.java) used
in the examples.</p>
+ <section>
+<title> Local Mode</title>
+<p>From your current working directory, compile the program: </p>
+<source>
+$ javac -cp pig.jar idlocal.java
+</source>
+<p>Note: idlocal.class is written to your current working directory. Include
"." in the class path when you run the program. </p>
+<p>From your current working directory, run the program:
+</p>
+<source>
+Unix: $ java -cp pig.jar:. idlocal
+Cygwin: $ java -cp ".;pig.jar" idlocal
+</source>
+<p>To view the results, check the output file, id.out. </p>
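+<p>The Unix and Cygwin commands above differ only in the class path separator (":" versus ";"). A minimal sketch of how a Java program can build the same class path string for either platform is shown here. The class name ClassPathSketch and its join method are hypothetical helpers, not part of Pig.</p>

```java
import java.io.File;

public class ClassPathSketch {
    // Joins class path entries with the given separator.
    public static String join(String separator, String... entries) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        // File.pathSeparator is ":" on Unix JVMs and ";" on Windows JVMs,
        // so the same call produces the right form on either platform.
        System.out.println(join(File.pathSeparator, "pig.jar", "."));
    }
}
```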
+</section>
+<section>
+<title>Mapreduce Mode</title>
+<p>Point $HADOOPDIR to the directory that contains the hadoop-site.xml file.
Example:
+</p>
+<source>
+$ export HADOOPDIR=/yourHADOOPsite/conf
+</source>
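+<p>Before compiling, it can help to confirm that the directory really contains hadoop-site.xml. The following is a minimal sketch of such a check. The class name HadoopDirCheck and the method hasHadoopSite are hypothetical and are not part of Pig or Hadoop.</p>

```java
import java.io.File;

public class HadoopDirCheck {
    // True if the directory contains a regular file named hadoop-site.xml.
    public static boolean hasHadoopSite(File dir) {
        return new File(dir, "hadoop-site.xml").isFile();
    }

    public static void main(String[] args) {
        // Reads the same environment variable that the export command sets.
        String dir = System.getenv("HADOOPDIR");
        if (dir == null) {
            System.out.println("HADOOPDIR is not set");
        } else {
            System.out.println(dir + " has hadoop-site.xml: "
                    + hasHadoopSite(new File(dir)));
        }
    }
}
```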
+<p>From your current working directory, compile the program:
+</p>
+<source>
+$ javac -cp pig.jar idhadoop.java
+</source>
+<p>Note: idhadoop.class is written to your current working directory. Include
"." in the class path when you run the program. </p>
+<p>From your current working directory, run the program:
+</p>
+<source>
+Unix: $ java -cp pig.jar:.:$HADOOPDIR idhadoop
+Cygwin: $ java -cp ".;pig.jar;$HADOOPDIR" idhadoop
+</source>
+<p>To view the results, check the idout directory on your Hadoop system. </p>
+
+</section>
+</section>
+</section>
+
+<section>
+<title>Sample Code</title>
+
+<p>The sample code is based on Pig Latin statements that extract all user IDs
from the /etc/passwd file. </p>
+<p>Copy the /etc/passwd file to your local working directory.</p>
+
+<section>
+<title>id.pig</title>
+<p>For the Grunt Shell and script files. </p>
+<source>
+A = load 'passwd' using PigStorage(':');
+B = foreach A generate $0 as id;
+dump B;
+store B into 'id.out';
+</source>
+</section>
+
+<section>
+<title>idlocal.java</title>
+<p>For embedded programs. </p>
+<source>
+import java.io.IOException;
+import org.apache.pig.PigServer;
+public class idlocal {
+    public static void main(String[] args) {
+        try {
+            // "local" runs Pig on this machine, without a Hadoop cluster
+            PigServer pigServer = new PigServer("local");
+            runIdQuery(pigServer, "passwd");
+        }
+        catch (Exception e) {
+            // report the failure instead of silently swallowing it
+            e.printStackTrace();
+        }
+    }
+    public static void runIdQuery(PigServer pigServer, String inputFile) throws IOException {
+        pigServer.registerQuery("A = load '" + inputFile + "' using PigStorage(':');");
+        pigServer.registerQuery("B = foreach A generate $0 as id;");
+        pigServer.store("B", "id.out");
+    }
+}
+</source>
+</section>
+
+<section>
+<title>idhadoop.java</title>
+<p>For embedded programs. </p>
+<source>
+import java.io.IOException;
+import org.apache.pig.PigServer;
+public class idhadoop {
+    public static void main(String[] args) {
+        try {
+            // "mapreduce" runs Pig against the configured Hadoop cluster
+            PigServer pigServer = new PigServer("mapreduce");
+            runIdQuery(pigServer, "passwd");
+        }
+        catch (Exception e) {
+            // report the failure instead of silently swallowing it
+            e.printStackTrace();
+        }
+    }
+    public static void runIdQuery(PigServer pigServer, String inputFile) throws IOException {
+        pigServer.registerQuery("A = load '" + inputFile + "' using PigStorage(':');");
+        pigServer.registerQuery("B = foreach A generate $0 as id;");
+        pigServer.store("B", "idout");
+    }
+}
+</source>
+</section>
+
+
+</section>
+
+</body>
+</document>