Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

------------------------------------------------------------------------------
- This page provides the information you need to get started running Pig. You should alread have access to a Hadoop cluster, and you should have built pig ([BuildPig]).
+ This page provides the information you need to get started running Pig. You should have access to a Hadoop cluster and you should have Pig set up (see BuildPig ).

  === Environment ===

  First we need to set up a few things.

- Unix and Windows users need to install and set up Java (including $JAVA_HOME). Use Sun Java 6 if at all possible.
+ '''Unix''' and '''Windows''' users need to install and set up Java, including $JAVA_HOME.

- Windows users need to install Cygwin and the Perl package (http://www.cygwin.com/)
+ '''Windows''' users need to install Cygwin and the Perl package (http://www.cygwin.com/)

  To set environment variables, use the right command for your shell (The examples use the bash flavor):
  * setenv PIGDIR /pig (tcsh, csh)
  * export PIGDIR=/pig (bash, sh, ksh)

- In newer versions of pig, you may also need to set some properties in the
+ In newer versions of Pig, you may also need to set some properties in the
  conf/pig.properties file (in the main pig directory). You may wish to set
  verbose=true until things are up and running.

@@ -24, +24 @@

  * Script file: attachment:id.pig
  * Embedded program: attachment:idlocal.java and attachment:idhadoop.java

- To start, we're going to parse a small text file, namely the /etc/passwd file. (Don't worry -- for arcane reasons there are no passwords in the etc/passwd file, only user names and public info. Windows users, just paste from the snippet below) Copy that file into the local directory: `cp /etc/passwd .`
+ To start, we're going to parse a small text file, namely the /etc/passwd file. (Don't worry -- for arcane reasons there are no passwords in the etc/passwd file, only user names and public info.) Copy the passwd file into your local directory: `cp /etc/passwd .`

- Yours may looks something like this:
+ Your file may look something like this. Fields are separated by colons (:).

  {{{
  games:x:5:60:games:/usr/games:/bin/sh

@@ -47, +47 @@

  = Local Mode =

- Start simple: Pig in local mode, using the Grunt shell. Later we'll look at running a Pig script, then an embedded program, then all of the above across a hadoop cluster.
+ This section shows you how to run Pig in local mode.

- To run Pig in local mode, you only need access to a single machine. To make things simple, copy these files to your current working directory (you may want to create a temp directory and move to it):
+ To run Pig in local mode, you '''do not''' need access to a hadoop cluster - you only need access to a single machine. To make things simple, copy these files to your current working directory (you may want to create a temp directory and move to it):

  * The /etc/passwd file (again: this is just a handy file to parse, it doesn't configure anything in pig).
  * The pig.jar file, created when you build Pig (see BuildPig)
