[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-11 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Pig Wiki for change 
notification.

The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
* '''Script File''': Place Pig commands in a script file and run the script.
* '''Embedded Program''': Embed Pig commands in a host language and run the 
program.
  
- Note: The script file mentioned above is a script file that you create and 
which contains the Pig commands that you want to run using Pig (we provide a 
sample script file in the next section). However, Pig, itself, is also a script 
file (pig.sh), and is referred to below as The Pig Script.
+ Note: The script file mentioned above is a script file that you create and 
which contains the Pig commands that you want to run using Pig (we provide a 
sample script file in the next section). Please note, however, that Pig, 
itself, is also a script (pig.sh), and is referred to below as The Pig Script.
  
  === Sample Code ===
  The sample code files you need to run the examples on this page include: 


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-11 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
* '''Script File''': Place Pig commands in a script file and run the script.
* '''Embedded Program''': Embed Pig commands in a host language and run the 
program.
  
- Note: The script file mentioned above is a script file that you create and 
which contains the Pig commands that you want to run using Pig (we provide a 
sample script file in the next section). Please note, however, that Pig, 
itself, is also a script (pig.sh), and is referred to below as The Pig Script.
+ Note: The script file mentioned above is a script that you create and which 
contains the Pig commands that you want to run using Pig (we provide a sample 
script in the next section). Please note, however, that Pig, itself, is also a 
script (pig.sh), and is referred to below as The Pig Script.
  
  === Sample Code ===
  The sample code files you need to run the examples on this page include: 


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-11 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  
  === Sample Code ===
  The sample code files you need to run the examples on this page include: 
-   * Script file: XXX.pig 
-   * Embedded program: XXX.java
+   * Script file: attachment:id.pig 
+   * Embedded program: attachment:idlocal.java and attachment:idhadoop.java
   
  The examples are based on these Pig commands, which extract all user IDs from 
the /etc/passwd file. 
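
For readers who do not have the sample script at hand, the result of those commands can be approximated in plain shell. This is only a sketch of the expected output, not the Pig script itself: the function name `list_user_ids` is made up for illustration, and it assumes the user ID is the first colon-separated field of each line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): print the first colon-separated
# field of every line in a passwd-style file, which is the user ID.
list_user_ids() {
    # $1: path to a passwd-style file; falls back to /etc/passwd
    cut -d: -f1 "${1:-/etc/passwd}"
}
```

Calling `list_user_ids ./passwd` on the copied file gives a list that can be compared by eye with what the Pig examples print.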
  


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-11 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  
  === Environment ===
  
- Unix and Windows users need to install and set up Java (including $JAVA_HOME 
and $PATH).
+ Unix and Windows users need to install and set up Java (including $JAVA_HOME).
  
  Windows users need to install Cygwin and the Perl package 
(http://www.cygwin.com/)
  


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
- You can run Pig locally or on a Hadoop. To run Pig locally (in local mode), 
no Hadoop cluster is required. To run Pig on Hadoop, you  need access to a 
Hadoop cluster: [http://lucene.apache.org/hadoop/]. 
+ This page provides the information you need to get started running Pig.
  
- == Running Pig Programs ==
+ == Run Modes ==
  
- This section will be updated shortly ...
+ Pig has two run modes or exectypes, local and hadoop (currently called 
mapreduce). To run Pig in local mode, you need access to a single machine. To 
run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
  
- [Pi] This needs to be changed. Now users can start Pig by simply running 
./bin/pig and all the configuration things can be set at ./conf/pig.properties.
+ To get a listing of all Pig commands, use {{{$ pig -help}}}
  
- There are two ways to run pig. The first way is by using `pig.pl` that can be 
found in the scripts directory of your source tree. Using the script would 
require having Perl installed on your machine. You can use it by issuing the 
following command: `pig.pl -cp pig.jar:HADOOPSITEPATH` where HADOOPSITEPATH is 
the directory in which `hadoop-site.xml` file for your Hadoop cluster is 
located. Example:
+ Note: A ticket has been entered to change {{{-x, -exectype local|mapreduce}}} 
 to  {{{-x, -exectype local|hadoop}}}
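
The exectype choice can be made explicit in a small wrapper. The following is a minimal sketch, not part of Pig: `pig_command` is a hypothetical helper that accepts only the two values the page currently names (`local` and `mapreduce`) and prints the resulting command line instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper: validate the exectype, then print the pig command
# line rather than executing it. Only "local" and "mapreduce" are accepted
# because those are the values the -x flag is documented with here.
pig_command() {
    mode=$1
    shift
    case "$mode" in
        local|mapreduce) ;;
        *) echo "unknown exectype: $mode" >&2; return 1 ;;
    esac
    if [ $# -gt 0 ]; then
        echo "pig -x $mode $*"      # batch mode: script file appended
    else
        echo "pig -x $mode"         # no script: interactive session
    fi
}
```

For example, `pig_command local` prints `pig -x local`, and `pig_command mapreduce myscript.pig` prints the batch form. If the ticket mentioned above is resolved, the accepted values in the `case` statement would need to change accordingly.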
  
- `pig.pl -cp pig.jar:/hadoop/conf`
- 
- The second way to do this is by using java directly:
- 
- `java -cp pig.jar:HADOOPSITEPATH org.apache.pig.Main`
- 
- This starts pig in the default map-reduce mode. You can also start pig in 
local mode:
- 
- `java -cp pig.jar org.apache.pig.Main -x local`
- 
- Or
- 
- `java -jar pig.jar -x local`
- 
- Regardless of how you invoke pig, the commands that are specified above will 
take you to an interactive shell called grunt where you can run DFS and pig 
commands. The documentation about grunt will be posted on wiki soon. If you 
want to run Pig in batch mode, you can append your pig script to either of the 
commands above. Example:
- 
- {{{pig.pl -cp pig.jar:/hadoop/conf myscript.pig}}}
- 
- or
- 
- {{{java -cp pig.jar:/hadoop/conf org.apache.pig.Main myscript.pig}}}
- 


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
* setenv PIGDIR /pig  (tcsh, csh) 
* export PIGDIR=/pig (bash, sh, ksh)
  
- To make things simple, copy all files to your current working directory (you 
may want to create a temp directory and move to it):
+ To make things simple, copy these files to your current working directory 
(you may want to create a temp directory and move to it):
  
-   * Copy the /etc/passwd file to your current working directory ('''.'''):
+   * The /etc/passwd file:
  {{{
  $ cp /etc/passwd .
  }}}
  
-   * Copy the pig.jar file from your SVN tree (see BuildPig) to your current 
working directory.
+   * Copy the pig.jar file from your SVN tree (see BuildPig):
  {{{ 
  $ cp /yourSVNtree/pig.jar .
  }}}
  
-   * Copy the sample code files (XXX.pig and XXX.java) to your current working 
directory.
+   * The sample code files (XXX.pig and XXX.java) on this page.
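
The checklist above can be turned into a quick pre-flight check before running any example. This is a minimal sketch under stated assumptions: `missing_files` is a hypothetical helper, and it looks only for `passwd` and `pig.jar`, because the sample code file names on the page are still placeholders.

```shell
#!/bin/sh
# Hypothetical helper: list which of the files the examples expect are
# missing from a directory. Prints nothing when both files are present.
missing_files() {
    dir=${1:-.}                     # directory to check; defaults to cwd
    for f in passwd pig.jar; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

Running `missing_files .` in the temp directory after the copy steps should print nothing; any file name it prints still needs to be copied.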
  
  == Local Mode ==
  This section shows you how to run Pig in local mode, using the Grunt shell, a 
Pig script, and an embedded program.


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  You can run Pig three ways – using either local mode or hadoop (mapreduce) 
mode:
* '''Grunt Shell''': Enter Pig commands manually using Pig’s interactive 
shell, Grunt. 
* '''Script File''': Place Pig commands in a script file and run the script.
-   * '''Embedded Program''': Embed Pig commands in a host language (Java) and 
run the program.
+   * '''Embedded Program''': Embed Pig commands in a host language and run the 
program.
+ 
+ Note: Pig, itself, is a script file (pig.sh), and is referred to as The Pig 
Script File. The Run Ways script file mentioned above is a script file that 
you create and which contains the Pig commands that you want to run (we provide 
a sample in the next section).
  
  == Sample Code ==
- The sample code files you need to run the examples on this page include: 
XXX.pig and XXX.java.
+ The sample code files you need to run the examples on this page include: 
+   * Script file: XXX.pig 
+   * Embedded program: XXX.java
   
  The examples are based on these Pig commands, which extract all user IDs from 
the /etc/passwd file. 
  
@@ -71, +75 @@

  $ export PIG_CLASSPATH=./pig.jar
  }}} 
  
- (1) With Pig Script
+ (1) With the Pig Script
  
  From your local directory, run:
  {{{
@@ -85, +89 @@

  grunt> dump B; 
  }}}
  
- (2) Without Pig Script
+ (2) Without the Pig Script
  
  From your current working directory, run:
  {{{


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  
  == Run Modes ==
  
- Pig has two run modes or exectypes, local and hadoop (currently called 
mapreduce). To run Pig in local mode, you need access to a single machine. To 
run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster.
+ Pig has two run modes or exectypes, local and hadoop (currently called 
mapreduce). 
+   * '''Local Mode''': To run Pig in local mode, you need access to a single 
machine. 
+   * '''Hadoop (mapreduce) Mode''': To run Pig in hadoop (mapreduce) mode, you 
need access to a Hadoop cluster.
  
- To get a listing of all Pig commands, including the run modes, use 
+ To get a listing of all Pig commands, including the run modes, use: 
  {{{
  $ pig -help
  }}}
@@ -62, +64 @@

  
* The sample code files (XXX.pig and XXX.java) on this page.
  
- == Local Mode ==
+ = Local Mode =
  This section shows you how to run Pig in local mode, using the Grunt shell, a 
Pig script, and an embedded program.
  
  To run Pig in local mode, you only need access to a single machine. 
@@ -70, +72 @@

  === Grunt Shell ===
  To run Pig’s Grunt shell in local mode, follow these instructions.
  
- First, point $PIG_CLASSPATH to the pig.jar file (in your current working 
directory). Example:
+ First, point $PIG_CLASSPATH to the pig.jar file (in your current working 
directory):
  {{{
  $ export PIG_CLASSPATH=./pig.jar
  }}} 
@@ -100, +102 @@

  
  The Grunt shell is invoked and you can enter commands at the prompt.
  
+ === Script File ===
+ 
+ To run a Pig script file in local mode, follow these instructions (which are 
the same as the Grunt Shell instructions above – you just include the script 
file).
  
  
- === Script File ===
+ First, point $PIG_CLASSPATH to the pig.jar file (in your current working 
directory):
+ {{{
+ $ export PIG_CLASSPATH=./pig.jar
+ }}}
+ 
+ (1) With the Pig Script
+ 
+ From your current working directory, run:
+ 
+ {{{ 
+ $ pig -x local XXX.pig
+ }}}
+ 
+ The results are displayed to your terminal screen.
+ 
+ (2) Without the Pig Script
+ 
+ From your current working directory, run:
+ {{{
+ $ java -cp pig.jar org.apache.pig.Main -x local XXX.pig
+ # Or
+ $ java -jar pig.jar -x local XXX.pig
+ }}}
+ 
+ The results are displayed to your terminal screen.
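
The two variants above differ only in the launcher. A minimal sketch that makes the choice explicit follows; `local_run_command` is a hypothetical helper that prints the command rather than running it, and the two command lines are the ones given in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the local-mode command for a script file,
# either through the Pig script ("pig") or through plain java ("java").
local_run_command() {
    launcher=$1   # "pig" or "java"
    script=$2     # e.g. XXX.pig
    case "$launcher" in
        pig)  echo "pig -x local $script" ;;
        java) echo "java -cp pig.jar org.apache.pig.Main -x local $script" ;;
        *)    echo "unknown launcher: $launcher" >&2; return 1 ;;
    esac
}
```

Printing the command first is a deliberate choice: it lets you confirm the launcher and script name before anything is executed.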
+ 
  
  === Embedded Program ===
  


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  
  To view the results, check the output file, XXX.out.
  
+ = Hadoop Mode =
+ 
+ This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig 
in Hadoop mode, you need access to a Hadoop cluster. 
+ 
+ === Grunt Shell ===
+ 
+ === Script File ===
+ 
+ === Embedded Program ===
+ 


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  This page provides the information you need to get started running Pig.
  
- == Run Modes ==
+ === Run Modes ===
  
  Pig has two run modes or exectypes, local and hadoop (currently called 
mapreduce). 
* '''Local Mode''': To run Pig in local mode, you need access to a single 
machine. 
@@ -15, +15 @@

  
  Note: A ticket has been entered to change {{{-x, -exectype local|mapreduce}}} 
 to  {{{-x, -exectype local|hadoop}}}
  
- == Run Ways ==
+ === Run Ways ===
  
  You can run Pig three ways – using either local mode or hadoop (mapreduce) 
mode:
* '''Grunt Shell''': Enter Pig commands manually using Pig’s interactive 
shell, Grunt. 
@@ -24, +24 @@

  
  Note: Pig, itself, is a script file (pig.sh), and is referred to as The Pig 
Script File. The Run Ways script file mentioned above is a script file that 
you create and which contains the Pig commands that you want to run (we provide 
a sample in the next section).
  
- == Sample Code ==
+ === Sample Code ===
  The sample code files you need to run the examples on this page include: 
* Script file: XXX.pig 
* Embedded program: XXX.java
@@ -40, +40 @@

  
  
  
- == Environment ==
+ === Environment ===
  
  Unix and Windows users need to install and set up Java (including $JAVA_HOME 
and $PATH).
  
@@ -155, +155 @@

  
  = Hadoop Mode =
  
- This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig 
in Hadoop mode, you need access to a Hadoop cluster. 
+ This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig 
in hadoop (mapreduce) mode, you need access to a Hadoop cluster. 
  
  === Grunt Shell ===
+ To run Pig’s Grunt shell in hadoop (mapreduce) mode, follow these 
instructions. When you begin the session, Pig will allocate a 15-node cluster. 
When you quit the session, Pig will deallocate the nodes.
- To run Pig’s Grunt shell in hadoop (mapreduce) mode, follow these 
instructions. 
- 
- Note: When you begin the session, Pig will allocate a 15-node cluster. When 
you quit the session, Pig will automatically deallocate the nodes.
  
  From your current working directory, run:
  {{{
@@ -177, +175 @@

  
  
  === Script File ===
- To run Pig script files in hadoop (mapreduce) mode, follow these instructions 
(which are the same as the Grunt Shell instructions above – you just include 
the script file).
+ To run Pig script files in hadoop (mapreduce) mode, follow these instructions 
(which are the same as the Grunt Shell instructions above – you just include 
the script file). Again, Pig will automatically allocate and deallocate a 
15-node cluster.
- 
- Note: Again, Pig will automatically allocate and deallocate a 15-node cluster.
  
  From your current working directory, run:
  {{{


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
  
  = Hadoop Mode =
  
- This section shows you how to run Pig in hadoop (mapreduce) mode. To run Pig 
in hadoop (mapreduce) mode, you need access to a Hadoop cluster. 
+ This section shows you how to run Pig in hadoop (mapreduce) mode, using the 
Grunt shell, a Pig script, and an embedded program.
+ 
+ To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster. 
  
  === Grunt Shell ===
  To run Pig’s Grunt shell in hadoop (mapreduce) mode, follow these 
instructions. When you begin the session, Pig will allocate a 15-node cluster. 
When you quit the session, Pig will deallocate the nodes.


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
* setenv PIGDIR /pig  (tcsh, csh) 
* export PIGDIR=/pig (bash, sh, ksh)
  
+ The examples use export.
- To make things simple, copy these files to your current working directory 
(you may want to create a temp directory and move to it):
- 
-   * The /etc/passwd file:
- {{{
- $ cp /etc/passwd .
- }}}
- 
-   * Copy the pig.jar file from your SVN tree (see BuildPig):
- {{{ 
- $ cp /yourSVNtree/pig.jar .
- }}}
- 
-   * The sample code files (XXX.pig and XXX.java) on this page.
  
  = Local Mode =
- This section shows you how to run Pig in local mode, using the Grunt shell, a 
Pig script, and an embedded program.
+ This section shows you how to run Pig in local mode, using the Grunt shell, a 
Pig script, and an embedded program. 
  
- To run Pig in local mode, you only need access to a single machine. 
+ To run Pig in local mode, you only need access to a single machine. To make 
things simple, copy these files to your current working directory (you may want 
to create a temp directory and move to it):
+ 
+   * The /etc/passwd file
+   * The pig.jar file, located in your SVN tree (see BuildPig)
+   * The sample code files (XXX.pig and XXX.java) located on this page
  
  === Grunt Shell ===
  To run Pig’s Grunt shell in local mode, follow these instructions.
@@ -157, +149 @@

  
  This section shows you how to run Pig in hadoop (mapreduce) mode, using the 
Grunt shell, a Pig script, and an embedded program.
  
- To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster. 
+ To run Pig in hadoop (mapreduce) mode, you need access to a Hadoop cluster. 
You will also need to copy these files to your current working directory.
+ 
+   * The /etc/passwd file
+   * The pig.jar file, located in your SVN tree (see BuildPig)
+   * The sample code files (XXX.pig and XXX.java) located on this page
  
  === Grunt Shell ===
  To run Pig’s Grunt shell in hadoop (mapreduce) mode, follow these 
instructions. When you begin the session, Pig will allocate a 15-node cluster. 
When you quit the session, Pig will deallocate the nodes.
@@ -190, +186 @@

  === Embedded Program ===
  To compile and run an embedded Java/Pig program in hadoop (mapreduce) mode, 
follow these instructions. 
  
- First, point $HADOOPDIR to the directory that contains the hadoop-site.xml 
file. 
+ First, point $HADOOPDIR to the directory that contains the hadoop-site.xml 
file. Example:
+ {{{
+ $ export HADOOPDIR=/yourHADOOPsite/conf 
+ }}}
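
Before compiling, it can help to confirm that the directory really contains the file. A minimal sketch, assuming a POSIX shell; `has_hadoop_site` is a hypothetical helper, not something Pig or Hadoop provides.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the given directory (or $HADOOPDIR
# when no argument is passed) contains a hadoop-site.xml file.
has_hadoop_site() {
    [ -f "${1:-$HADOOPDIR}/hadoop-site.xml" ]
}
```

A typical use would be `has_hadoop_site || echo "hadoop-site.xml not found in $HADOOPDIR"` right after the export.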
  
  From your current working directory, compile the program:
  {{{


[Pig Wiki] Trivial Update of RunPig by CorinneC

2008-09-10 Thread Apache Wiki
The following page has been changed by CorinneC:
http://wiki.apache.org/pig/RunPig

--
* '''Script File''': Place Pig commands in a script file and run the script.
* '''Embedded Program''': Embed Pig commands in a host language and run the 
program.
  
- Note: Pig, itself, is a script file (pig.sh), and is referred to as The Pig 
Script File. The Run Ways script file mentioned above is a script file that 
you create and which contains the Pig commands that you want to run (we provide 
a sample in the next section).
+ Note: The script file mentioned above is a script file that you create and 
which contains the Pig commands that you want to run using Pig (we provide a 
sample script file in the next section). However, Pig, itself, is also a script 
file (pig.sh), and is referred to below as The Pig Script.
  
  === Sample Code ===
  The sample code files you need to run the examples on this page include: