Author: olga
Date: Thu Oct 22 16:10:31 2009
New Revision: 828767
URL: http://svn.apache.org/viewvc?rev=828767&view=rev
Log:
PIG-1039: documentation update (chandec via olgan)
Modified:
    hadoop/pig/trunk/CHANGES.txt
    hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml
    hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tabs.xml
    hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tutorial.xml

Modified: hadoop/pig/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/pig/trunk/CHANGES.txt?rev=828767&r1=828766&r2=828767&view=diff
==============================================================================
--- hadoop/pig/trunk/CHANGES.txt (original)
+++ hadoop/pig/trunk/CHANGES.txt Thu Oct 22 16:10:31 2009
@@ -110,6 +110,8 @@
 IMPROVEMENTS
+PIG-1039: documentation update (chandec via olgan)
+
 OPTIMIZATIONS
 BUG FIXES

Modified: hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml
URL: http://svn.apache.org/viewvc/hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml?rev=828767&r1=828766&r2=828767&view=diff
==============================================================================
--- hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml (original)
+++ hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/piglatin_reference.xml Thu Oct 22 16:10:31 2009
@@ -2129,6 +2129,7 @@
 <para>(condition ? value_if_true : value_if_false) </para>
 <para>The bincond should be enclosed in parenthesis. </para>
 <para>The schemas for the two conditional outputs of the bincond should match.</para>
+ <para>Use expressions only (relational operators are not allowed).</para>
 </entry>
 </row></tbody></tgroup>
 </informaltable>
@@ -4750,9 +4751,9 @@
 <section>
 <title>Relational Operators</title>
-<section>
+ <section id="COGROUP">
 <title>COGROUP</title>
- <para>COGROUP is the same as GROUP, but for readability purposes programmers usually use GROUP when only one relation is involved and COGROUP with multiple relations. See <xref linkend="GROUP" /> for more information.</para>
+ <para>COGROUP is the same as GROUP.
For readability, programmers usually use GROUP when only one relation is involved and COGROUP with multiple relations re involved. See <xref linkend="GROUP" /> for more information.</para>
 </section>
 <section>
 <title>CROSS</title>
@@ -5403,7 +5404,8 @@
 <section id="GROUP">
 <title>GROUP</title>
- <para>Groups the data in a one or multiple relations. For readability COGROUP is usually used with multiple relations and group is used with a single relation, but they are the same operator.</para>
+ <para>Groups the data in one or multiple relations. GROUP is the same as <xref linkend="COGROUP" />. For
+readability, programmers usually use GROUP when only one relation is involved and COGROUP with multiple relations are involved. </para>
 <section>
 <title>Syntax</title>
@@ -7674,7 +7676,7 @@
 </row></tbody></tgroup>
 </informaltable></section></section>
- <section>
+ <section >
 <title>COUNT</title>
 <para>Computes the number of elements in a bag. </para>
 <section>
@@ -7682,7 +7684,7 @@
 <informaltable frame="all">
 <tgroup cols="1"><tbody><row>
 <entry>
- <para>COUNT(expression)    </para>
+ <para>COUNT(expression) </para>
 </entry>
 </row></tbody></tgroup>
 </informaltable></section>
@@ -7702,11 +7704,11 @@
 <section>
 <title>Usage</title>
- <para>Use the COUNT function to compute the number of elements in a bag.
+ <para>Use the COUNT function to compute the number of elements in a bag.
 COUNT requires a preceding GROUP ALL statement for global counts and a GROUP BY statement for group counts.</para>
 <para>
- The COUNT function now ignores NULL values. If you want to include NULL values in the count computation, see
+ The COUNT function ignores NULL values. If you want to include NULL values in the count computation, use
 <ulink url="piglatin_reference.html#COUNT_STAR">COUNT_STAR</ulink>.
 </para>
@@ -7833,7 +7835,9 @@
 <title>Usage</title>
 <para>Use the COUNT_STAR function to compute the number of elements in a bag.
 COUNT_STAR requires a preceding GROUP ALL statement for global counts and a GROUP BY statement for group counts.</para>
- <para>COUNT_STAR includes NULL values in the count computation (unlike COUNT, which ignores NULL values).</para>
+ <para>COUNT_STAR includes NULL values in the count computation
+ (unlike <ulink url="piglatin_reference.html#COUNT">COUNT</ulink>, which ignores NULL values).
+ </para>
 </section>
 <section>
@@ -8180,7 +8184,7 @@
 <section>
 <title>SIZE</title>
- <para>Computes the number of elements based on the data type.</para>
+ <para>Computes the number of elements based on any Pig data type. </para>
 <section>
 <title>Syntax</title>
@@ -8207,7 +8211,9 @@
 <section>
 <title>Usage</title>
- <para>Use the SIZE function to compute the number of elements based on the data type (see the Types Tables below).</para></section>
+ <para>Use the SIZE function to compute the number of elements based on the data type (see the Types Tables below).
+ SIZE includes NULL values in the size computation.
SIZE is not algebraic.</para>
+ </section>
 <section>
 <title>Example</title>

Modified: hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tabs.xml
URL: http://svn.apache.org/viewvc/hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tabs.xml?rev=828767&r1=828766&r2=828767&view=diff
==============================================================================
--- hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tabs.xml (original)
+++ hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tabs.xml Thu Oct 22 16:10:31 2009
@@ -32,6 +32,6 @@
 -->
 <tab label="Project" href="http://hadoop.apache.org/pig/" type="visible" />
 <tab label="Wiki" href="http://wiki.apache.org/pig/" type="visible" />
- <tab label="Pig 0.4.0 Documentation" dir="" type="visible" />
+ <tab label="Pig 0.5.0 Documentation" dir="" type="visible" />
 </tabs>

Modified: hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tutorial.xml
URL: http://svn.apache.org/viewvc/hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tutorial.xml?rev=828767&r1=828766&r2=828767&view=diff
==============================================================================
--- hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tutorial.xml (original)
+++ hadoop/pig/trunk/src/docs/src/documentation/content/xdocs/tutorial.xml Thu Oct 22 16:10:31 2009
@@ -27,16 +27,16 @@
 <section>
 <title>Overview</title>
-<p>The Pig tutorial shows you how to run two Pig scripts in local mode and hadoop mode. </p>
+<p>The Pig tutorial shows you how to run two Pig scripts in local mode and mapreduce mode. </p>
 <ul>
 <li><p> <strong>Local Mode</strong>: To run the scripts in local mode, no Hadoop or HDFS installation is required. All files are installed and run from your local host and file system. </p>
 </li>
-<li><p> <strong>Hadoop Mode</strong>: To run the scripts in hadoop (mapreduce) mode, you need access to a Hadoop cluster and HDFS installation.
</p>
+<li><p> <strong>Mapreduce Mode</strong>: To run the scripts in mapreduce mode, you need access to a Hadoop cluster and HDFS installation. </p>
 </li>
 </ul>
 <p>The Pig tutorial file (tutorial/pigtutorial.tar.gz file in the pig distribution) includes the Pig JAR file (pig.jar) and the tutorial files (tutorial.jar, Pigs scripts, log files).
-These files work with Hadoop 0.18 and provide everything you need to run the Pig scripts.</p>
+These files work with Hadoop 0.20 and provide everything you need to run the Pig scripts.</p>
 <p>To get started, follow these basic steps: </p>
 <ol>
@@ -119,9 +119,9 @@
 </section>
 <section>
-<title> Running the Pig Scripts in Hadoop Mode</title>
+<title> Running the Pig Scripts in Mapreduce Mode</title>

-<p>To run the Pig scripts in hadoop (mapreduce) mode, do the following: </p>
+<p>To run the Pig scripts in mapreduce mode, do the following: </p>
 <ol>
 <li><p>Move to the pigtmp directory. </p>
 </li>
@@ -135,14 +135,14 @@
 $ hadoop fs -copyFromLocal excite.log.bz2 .
 </source>
 <ol>
-<li><p>Set the HADOOPSITEPATH environment variable to the location of your hadoop-site.xml file. </p>
+<li><p>Set the HADOOP_CONF_DIR environment variable to the location of your core-site.xml, hdfs-site.xml and mapred-site.xml files. </p>
 </li>
 <li><p>Execute the following command (using either script1-hadoop.pig or script2-hadoop.pig): </p>
 </li>
 </ol>
 <source>
-$ java -cp $PIGDIR/pig.jar:$HADOOPSITEPATH org.apache.pig.Main script1-hadoop.pig
+$ java -cp $PIGDIR/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main script1-hadoop.pig
 </source>
 <ol>
 <li><p>Review the result files (located in either the script1-hadoop-results or script2-hadoop-results HDFS directory): </p>