[Pig Wiki] Trivial Update of RunPig by CorinneC
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Pig Wiki for change notification.

The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  * '''Script File''': Place Pig commands in a script file and run the script.
  * '''Embedded Program''': Embed Pig commands in a host language and run the program.
- Note: The script file mentioned above is a script file that you create and which contains the Pig commands that you want to run using Pig (we provide a sample script file in the next section). However, Pig, itself, is also a script file (pig.sh), and is referred to below as The Pig Script.
+ Note: The script file mentioned above is a script file that you create and which contains the Pig commands that you want to run using Pig (we provide a sample script file in the next section). Please note, however, that Pig, itself, is also a script (pig.sh), and is referred to below as The Pig Script.
  === Sample Code ===
  The sample code files you need to run the examples on this page include:
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  * '''Script File''': Place Pig commands in a script file and run the script.
  * '''Embedded Program''': Embed Pig commands in a host language and run the program.
- Note: The script file mentioned above is a script file that you create and which contains the Pig commands that you want to run using Pig (we provide a sample script file in the next section). Please note, however, that Pig, itself, is also a script (pig.sh), and is referred to below as The Pig Script.
+ Note: The script file mentioned above is a script that you create and which contains the Pig commands that you want to run using Pig (we provide a sample script in the next section). Please note, however, that Pig, itself, is also a script (pig.sh), and is referred to below as The Pig Script.
  === Sample Code ===
  The sample code files you need to run the examples on this page include:
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  === Sample Code ===
  The sample code files you need to run the examples on this page include:
- * Script file: XXX.pig
- * Embedded program: XXX.java
+ * Script file: attachment:id.pig
+ * Embedded program: attachment:idlocal.java and attachment:idhadoop.java
  The examples are based on these Pig commands, which extract all user IDs from the /etc/passwd file.
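
Editor's sketch (not part of the wiki page): the diff above says the sample Pig commands extract all user IDs from the /etc/passwd file. A quick way to cross-check that kind of output without Pig is to pull the first colon-separated field with `cut`. The function name `list_user_ids` and the default file name `passwd` are assumptions made for illustration.

```shell
# Hypothetical helper: print the first colon-separated field (the user ID)
# of every line in a passwd-style file. Defaults to ./passwd if no file is given.
list_user_ids() {
  cut -d: -f1 "${1:-passwd}"
}
```

For example, `list_user_ids /etc/passwd | sort` gives a list that can be compared by eye with whatever the Pig script dumps.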
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  === Environment ===
- Unix and Windows users need to install and set up Java (including $JAVA_HOME and $PATH).
+ Unix and Windows users need to install and set up Java (including $JAVA_HOME).
  Windows users need to install Cygwin and the Perl package (http://www.cygwin.com/)
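
Editor's sketch (not part of the wiki page): the Environment section asks for a working Java setup with $JAVA_HOME. A minimal check, assuming a conventional layout where the launcher sits at `$JAVA_HOME/bin/java`, might look like this. The function name `check_java_home` is hypothetical.

```shell
# Hypothetical helper: report whether JAVA_HOME is set and contains bin/java.
# Assumes the usual JDK/JRE layout; adjust if your install differs.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "no executable found at $JAVA_HOME/bin/java"
    return 1
  fi
  echo "JAVA_HOME looks usable: $JAVA_HOME"
}
```

Running `check_java_home` before starting Pig catches the most common setup problem early.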
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
- You can run Pig locally or on a Hadoop. To run Pig locally (in local mode), no Hadoop cluster is required. To run Pig on Hadoop, you need access to a Hadoop cluster: [http://lucene.apache.org/hadoop/].
+ This page provides the information you need to get started running Pig.
- == Running Pig Programs ==
+ == Run Modes ==
- This section will be updated shortly ...
+ Pig has two run modes or exectypes, local and hadoop (currently called mapreduce). To run Pig in local mode, you need access to a single machine. To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
- [Pi] This needs to be changed. Now users can start Pig by simply running ./bin/pig and all the configuration things can be set at ./conf/pig.properties.
+ To get a listing of all Pig commands use {{{$pig -help}}}
- There are two ways to run pig. The first way is by using `pig.pl` that can be found in the scripts directory of your source tree. Using the script would require having Perl installed on your machine. You can use it by issuing the following command: `pig.pl -cp pig.jar:HADOOPSITEPATH` where HADOOPSITEPATH is the directory in which `hadoop-site.xml` file for your Hadoop cluster is located. Example:
+ Note: A ticket has been entered to change {{{-x, -exectype local|mapreduce}}} to {{{-x, -exectype local|hadoop}}}
- `pig.pl -cp pig.jar:/hadoop/conf`
-
- The second way to do this is by using java directly:
-
- `java -cp pig.jar:HADOOPSITEPATH org.apache.pig.Main`
-
- This starts pig in the default map-reduce mode. You can also start pig in local mode:
-
- `java -cp pig.jar org.apache.pig.Main -x local`
-
- Or
-
- `java -jar pig.jar -x local`
-
- Regardless of how you invoke pig, the commands that are specified above will take you to an interactive shell called grunt where you can run DFS and pig commands. The documentation about grunt will be posted on wiki soon. If you want to run Pig in batch mode, you can append your pig script to either of the commands above. Example:
-
- {{{pig.pl -cp pig.jar:/hadoop/conf myscript.pig}}}
-
- or
-
- {{{java -cp pig.jar:/hadoop/conf org.apache.pig.Main myscript.pig}}}
-
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  * setenv PIGDIR /pig (tcsh, csh)
  * export PIGDIR=/pig (bash, sh, ksh)
- To make things simple, copy all files to your current working directory (you may want to create a temp directory and move to it):
+ To make things simple, copy these files to your current working directory (you may want to create a temp directory and move to it):
- * Copy the /etc/passwd file to your current working directory ('''.'''):
+ * The /etc/passwd file:
  {{{
  $ cp /etc/passwd .
  }}}
- * Copy the pig.jar file from your SVN tree (see BuildPig) to your current working directory.
+ * Copy the pig.jar file from your SVN tree (see BuildPig):
  {{{
  $ cp /yourSVNtree/pig.jar .
  }}}
- * Copy the sample code files (XXX.pig and XXX.java) to your current working directory.
+ * The sample code files (XXX.pig and XXX.java) on this page.
  == Local Mode ==
  This section shows you how to run Pig in local mode, using the Grunt shell, a Pig script, and an embedded program.
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  You can run Pig three ways - using either local mode or hadoop (mapreduce) mode:
  * '''Grunt Shell''': Enter Pig commands manually using Pig's interactive shell, Grunt.
  * '''Script File''': Place Pig commands in a script file and run the script.
- * '''Embedded Program''': Embed Pig commands in a host language (Java) and run the program.
+ * '''Embedded Program''': Embed Pig commands in a host language and run the program.
+
+ Note: Pig, itself, is a script file (pig.sh), and is referred to as The Pig Script File. The Run Ways script file mentioned above is a script file that you create and which contains the Pig commands that you want to run (we provide a sample in the next section).
  == Sample Code ==
- The sample code files you need to run the examples on this page include: XXX.pig and XXX.java.
+ The sample code files you need to run the examples on this page include:
+ * Script file: XXX.pig
+ * Embedded program: XXX.java
  The examples are based on these Pig commands, which extract all user IDs from the /etc/passwd file.
@@ -71, +75 @@
  $ export PIG_CLASSPATH=./pig.jar
  }}}
- (1) With Pig Script
+ (1) With the Pig Script
  From your local directory, run:
  {{{
@@ -85, +89 @@
  grunt> dump B;
  }}}
- (2) Without Pig Script
+ (2) Without the Pig Script
  From your current working directory, run:
  {{{
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  == Run Modes ==
- Pig has two run modes or exectypes, local and hadoop (currently called mapreduce). To run Pig in local mode, you need access to a single machine. To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
+ Pig has two run modes or exectypes, local and hadoop (currently called mapreduce).
+ * '''Local Mode''': To run Pig in local mode, you need access to a single machine.
+ * '''Hadoop (mapreduce) Mode''': To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
- To get a listing of all Pig commands, including the run modes, use
+ To get a listing of all Pig commands, including the run modes, use:
  {{{
  $ pig -help
  }}}
@@ -62, +64 @@
  * The sample code files (XXX.pig and XXX.java) on this page.
- == Local Mode ==
+ = Local Mode =
  This section shows you how to run Pig in local mode, using the Grunt shell, a Pig script, and an embedded program.
  To run Pig in local mode, you only need access to a single machine.
@@ -70, +72 @@
  === Grunt Shell ===
  To run Pig's Grunt shell in local mode, follow these instructions.
- First, point $PIG_CLASSPATH to the pig.jar file (in your current working directory). Example:
+ First, point $PIG_CLASSPATH to the pig.jar file (in your current working directory):
  {{{
  $ export PIG_CLASSPATH=./pig.jar
  }}}
@@ -100, +102 @@
  The Grunt shell is invoked and you can enter commands at the prompt.
+ === Script File ===
+
+ To run a Pig script file in local mode, follow these instructions (which are the same as the Grunt Shell instructions above - you just include the script file).
- === Script File ===
+ First, point $PIG_CLASSPATH to the pig.jar file (in your current working directory):
+ {{{
+ $ export PIG_CLASSPATH=./pig.jar
+ }}}
+
+ (1) With the Pig Script
+
+ From your current working directory, run:
+
+ {{{
+ $ pig -x local XXX.pig
+ }}}
+
+ The results are displayed to your terminal screen.
+
+ (2) Without the Pig Script
+
+ From your current working directory, run:
+ {{{
+ $ java -cp pig.jar org.apache.pig.Main -x local XXX.pig
+ Or
+ $ java -jar pig.jar -x local XXX.pig
+ }}}
+
+ The results are displayed to your terminal screen.
+
  === Embedded Program ===
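
Editor's sketch (not part of the wiki page): the Script File instructions above give two ways to run a script in local mode, with the Pig Script (`pig -x local ...`) or directly through Java (`java -cp pig.jar org.apache.pig.Main -x local ...`). The helper below only prints which of those two command lines applies, based on whether a `pig` launcher is on the PATH. The function name `pig_local_command` is hypothetical, and it assumes pig.jar sits in the current working directory as the page describes.

```shell
# Hypothetical helper: print the local-mode command line for a given script.
# Uses the "pig" launcher when it is on PATH, otherwise falls back to the
# direct Java invocation quoted in the instructions above.
pig_local_command() {
  script_file="$1"
  if command -v pig >/dev/null 2>&1; then
    echo "pig -x local $script_file"
  else
    echo "java -cp pig.jar org.apache.pig.Main -x local $script_file"
  fi
}
```

It prints rather than executes, so you can inspect the command first, for example with `pig_local_command id.pig`.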
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  To view the results, check the output file, XXX.out.
+ = Hadoop Mode =
+
+ This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig in Hadoop mode, you need access to a Hadoop cluster.
+
+ === Grunt Shell ===
+
+ === Script File ===
+
+ === Embedded Program ===
+
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  This page provides the information you need to get started running Pig.
- == Run Modes ==
+ === Run Modes ===
  Pig has two run modes or exectypes, local and hadoop (currently called mapreduce).
  * '''Local Mode''': To run Pig in local mode, you need access to a single machine.
@@ -15, +15 @@
  Note: A ticket has been entered to change {{{-x, -exectype local|mapreduce}}} to {{{-x, -exectype local|hadoop}}}
- == Run Ways ==
+ === Run Ways ===
  You can run Pig three ways - using either local mode or hadoop (mapreduce) mode:
  * '''Grunt Shell''': Enter Pig commands manually using Pig's interactive shell, Grunt.
@@ -24, +24 @@
  Note: Pig, itself, is a script file (pig.sh), and is referred to as The Pig Script File. The Run Ways script file mentioned above is a script file that you create and which contains the Pig commands that you want to run (we provide a sample in the next section).
- == Sample Code ==
+ === Sample Code ===
  The sample code files you need to run the examples on this page include:
  * Script file: XXX.pig
  * Embedded program: XXX.java
@@ -40, +40 @@
- == Environment ==
+ === Environment ===
  Unix and Windows users need to install and set up Java (including $JAVA_HOME and $PATH).
@@ -155, +155 @@
  = Hadoop Mode =
- This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig in Hadoop mode, you need access to a Hadoop cluster.
+ This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
  === Grunt Shell ===
+ To run Pig's Grunt shell in hadoop (mapreduce) mode, follow these instructions. When you begin the session, Pig will allocate a 15-node cluster. When you quit the session, Pig will deallocate the nodes.
- To run Pig's Grunt shell in hadoop (mapreduce) mode, follow these instructions.
-
- Note: When you begin the session, Pig will allocate a 15-node cluster. When you quit the session, Pig will automatically deallocate the nodes.
  From your current working directory, run:
  {{{
@@ -177, +175 @@
  === Script File ===
- To run Pig script files in hadoop (mapreduce) mode, follow these instructions (which are the same as the Grunt Shell instructions above - you just include the script file).
+ To run Pig script files in hadoop (mapreduce) mode, follow these instructions (which are the same as the Grunt Shell instructions above - you just include the script file). Again, Pig will automatically allocate and deallocate a 15-node cluster.
-
- Note: Again, Pig will automatically allocate and deallocate a 15-node cluster.
  From your current working directory, run:
  {{{
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  = Hadoop Mode =
- This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
+ This section shows you how to run Pig in hadoop (mapreduce) mode, using the Grunt shell, a Pig script, and an embedded program.
+
+ To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
  === Grunt Shell ===
  To run Pig's Grunt shell in hadoop (mapreduce) mode, follow these instructions. When you begin the session, Pig will allocate a 15-node cluster. When you quit the session, Pig will deallocate the nodes.
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  * setenv PIGDIR /pig (tcsh, csh)
  * export PIGDIR=/pig (bash, sh, ksh)
+ The examples use export.
- To make things simple, copy these files to your current working directory (you may want to create a temp directory and move to it):
-
- * The /etc/passwd file:
- {{{
- $ cp /etc/passwd .
- }}}
-
- * Copy the pig.jar file from your SVN tree (see BuildPig):
- {{{
- $ cp /yourSVNtree/pig.jar .
- }}}
-
- * The sample code files (XXX.pig and XXX.java) on this page.
  = Local Mode =
- This section shows you how to run Pig in local mode, using the Grunt shell, a Pig script, and an embedded program.
+ This section shows you how to run Pig in local mode, using the Grunt shell, a Pig script, and an embedded program.
- To run Pig in local mode, you only need access to a single machine.
+ To run Pig in local mode, you only need access to a single machine. To make things simple, copy these files to your current working directory (you may want to create a temp directory and move to it):
+
+ * The /etc/passwd file
+ * The pig.jar file, located in your SVN tree (see BuildPig)
+ * The sample code files (XXX.pig and XXX.java) located on this page
  === Grunt Shell ===
  To run Pig's Grunt shell in local mode, follow these instructions.
@@ -157, +149 @@
  This section shows you how to run Pig in hadoop (mapreduce) mode, using the Grunt shell, a Pig script, and an embedded program.
- To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
+ To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster. You will also need to copy these files to your current working directory.
+
+ * The /etc/passwd file
+ * The pig.jar file, located in your SVN tree (see BuildPig)
+ * The sample code files (XXX.pig and XXX.java) located on this page
  === Grunt Shell ===
  To run Pig's Grunt shell in hadoop (mapreduce) mode, follow these instructions. When you begin the session, Pig will allocate a 15-node cluster. When you quit the session, Pig will deallocate the nodes.
@@ -190, +186 @@
  === Embedded Program ===
  To compile and run an embedded Java/Pig program in hadoop (mapreduce) mode, follow these instructions.
- First, point $HADOOPDIR to the directory that contains the hadoop-site.xml file.
+ First, point $HADOOPDIR to the directory that contains the hadoop-site.xml file. Example:
+ {{{
+ $ export HADOOPDIR=/yourHADOOPsite/conf
+ }}}
  From your current working directory, compile the program:
  {{{
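
Editor's sketch (not part of the wiki page): the Embedded Program instructions above say $HADOOPDIR must point to the directory that contains the hadoop-site.xml file. A minimal pre-flight check, with the hypothetical function name `check_hadoop_conf`, could confirm that before compiling or running anything.

```shell
# Hypothetical helper: verify that HADOOPDIR is set and that the directory
# it names actually contains hadoop-site.xml.
check_hadoop_conf() {
  if [ -z "${HADOOPDIR:-}" ]; then
    echo "HADOOPDIR is not set"
    return 1
  fi
  if [ ! -f "$HADOOPDIR/hadoop-site.xml" ]; then
    echo "hadoop-site.xml not found in $HADOOPDIR"
    return 1
  fi
  echo "found $HADOOPDIR/hadoop-site.xml"
}
```

A failing check usually means the export line points at the Hadoop install root rather than its configuration directory.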
[Pig Wiki] Trivial Update of RunPig by CorinneC
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  * '''Script File''': Place Pig commands in a script file and run the script.
  * '''Embedded Program''': Embed Pig commands in a host language and run the program.
- Note: Pig, itself, is a script file (pig.sh), and is referred to as The Pig Script File. The Run Ways script file mentioned above is a script file that you create and which contains the Pig commands that you want to run (we provide a sample in the next section).
+ Note: The script file mentioned above is a script file that you create and which contains the Pig commands that you want to run using Pig (we provide a sample script file in the next section). However, Pig, itself, is also a script file (pig.sh), and is referred to below as The Pig Script.
  === Sample Code ===
  The sample code files you need to run the examples on this page include: