Author: ddas
Date: Tue Jul 8 07:17:19 2008
New Revision: 674838
URL: http://svn.apache.org/viewvc?rev=674838&view=rev
Log:
Merge -r 674833:674834 from trunk onto 0.18 branch. Fixes HADOOP-3691.
Modified:
hadoop/core/branches/branch-0.18/CHANGES.txt
hadoop/core/branches/branch-0.18/docs/changes.html
hadoop/core/branches/branch-0.18/docs/mapred_tutorial.html
hadoop/core/branches/branch-0.18/docs/mapred_tutorial.pdf
hadoop/core/branches/branch-0.18/docs/streaming.html
hadoop/core/branches/branch-0.18/docs/streaming.pdf
hadoop/core/branches/branch-0.18/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
hadoop/core/branches/branch-0.18/src/docs/src/documentation/content/xdocs/streaming.xml
Modified: hadoop/core/branches/branch-0.18/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.18/CHANGES.txt?rev=674838&r1=674837&r2=674838&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.18/CHANGES.txt (original)
+++ hadoop/core/branches/branch-0.18/CHANGES.txt Tue Jul 8 07:17:19 2008
@@ -729,6 +729,8 @@
HADOOP-3692. Fix documentation for Cluster setup and Quick start guides.
(Amareshwari Sriramadasu via ddas)
+ HADOOP-3691. Fix streaming and tutorial docs. (Jothi Padmanabhan via ddas)
+
Release 0.17.1 - Unreleased
INCOMPATIBLE CHANGES
Modified: hadoop/core/branches/branch-0.18/docs/changes.html
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.18/docs/changes.html?rev=674838&r1=674837&r2=674838&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.18/docs/changes.html (original)
+++ hadoop/core/branches/branch-0.18/docs/changes.html Tue Jul 8 07:17:19 2008
@@ -305,7 +305,7 @@
</ol>
</li>
<li><a
href="javascript:toggleList('release_0.18.0_-_unreleased_._bug_fixes_')"> BUG FIXES
-</a> (123)
+</a> (124)
<ol id="release_0.18.0_-_unreleased_._bug_fixes_">
<li><a
href="http://issues.apache.org/jira/browse/HADOOP-2905">HADOOP-2905</a>. 'fsck -move' triggers NPE in NameNode.<br />(Lohit Vjayarenu via rangadi)</li>
<li>Increment ClientProtocol.versionID missed by <a
href="http://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a>.<br />(shv)</li>
@@ -550,6 +550,7 @@
<li><a
href="http://issues.apache.org/jira/browse/HADOOP-3653">HADOOP-3653</a>. Fix
test-patch target to properly account for Eclipse
classpath jars.<br />(Brice Arnould via nigel)</li>
<li><a
href="http://issues.apache.org/jira/browse/HADOOP-3692">HADOOP-3692</a>. Fix
documentation for Cluster setup and Quick start guides.<br />(Amareshwari
Sriramadasu via ddas)</li>
+ <li><a href="http://issues.apache.org/jira/browse/HADOOP-3691">HADOOP-3691</a>. Fix streaming and tutorial docs.<br />(Jothi Padmanabhan via ddas)</li>
</ol>
</li>
</ul>
Modified: hadoop/core/branches/branch-0.18/docs/mapred_tutorial.html
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.18/docs/mapred_tutorial.html?rev=674838&r1=674837&r2=674838&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.18/docs/mapred_tutorial.html (original)
+++ hadoop/core/branches/branch-0.18/docs/mapred_tutorial.html Tue Jul 8 07:17:19 2008
@@ -5,7 +5,7 @@
<meta content="Apache Forrest" name="Generator">
<meta name="Forrest-version" content="0.8">
<meta name="Forrest-skin-name" content="pelt">
-<title>Hadoop Map-Reduce Tutorial</title>
+<title>Hadoop Map/Reduce Tutorial</title>
<link type="text/css" href="skin/basic.css" rel="stylesheet">
<link media="screen" type="text/css" href="skin/screen.css" rel="stylesheet">
<link media="print" type="text/css" href="skin/print.css" rel="stylesheet">
@@ -187,7 +187,7 @@
<a class="dida" href="mapred_tutorial.pdf"><img alt="PDF -icon"
src="skin/images/pdfdoc.gif" class="skin"><br>
PDF</a>
</div>
-<h1>Hadoop Map-Reduce Tutorial</h1>
+<h1>Hadoop Map/Reduce Tutorial</h1>
<div id="minitoc-area">
<ul class="minitoc">
<li>
@@ -217,7 +217,7 @@
</ul>
</li>
<li>
-<a href="#Map-Reduce+-+User+Interfaces">Map-Reduce - User Interfaces</a>
+<a href="#Map%2FReduce+-+User+Interfaces">Map/Reduce - User Interfaces</a>
<ul class="minitoc">
<li>
<a href="#Payload">Payload</a>
@@ -328,7 +328,7 @@
<h2 class="h3">Purpose</h2>
<div class="section">
<p>This document comprehensively describes all user-facing facets of the
- Hadoop Map-Reduce framework and serves as a tutorial.
+ Hadoop Map/Reduce framework and serves as a tutorial.
</p>
</div>
@@ -356,11 +356,11 @@
<a name="N10032"></a><a name="Overview"></a>
<h2 class="h3">Overview</h2>
<div class="section">
-<p>Hadoop Map-Reduce is a software framework for easily writing
+<p>Hadoop Map/Reduce is a software framework for easily writing
applications which process vast amounts of data (multi-terabyte
data-sets)
in-parallel on large clusters (thousands of nodes) of commodity
hardware in a reliable, fault-tolerant manner.</p>
-<p>A Map-Reduce <em>job</em> usually splits the input data-set into
+<p>A Map/Reduce <em>job</em> usually splits the input data-set into
independent chunks which are processed by the <em>map tasks</em> in a
completely parallel manner. The framework sorts the outputs of the maps,
which are then input to the <em>reduce tasks</em>. Typically both the
@@ -368,12 +368,12 @@
takes care of scheduling tasks, monitoring them and re-executes the
failed
tasks.</p>
<p>Typically the compute nodes and the storage nodes are the same, that is,
- the Map-Reduce framework and the <a href="hdfs_design.html">Distributed
+ the Map/Reduce framework and the <a href="hdfs_design.html">Distributed
FileSystem</a> are running on the same set of nodes. This configuration
allows the framework to effectively schedule tasks on the nodes where
data
is already present, resulting in very high aggregate bandwidth across
the
cluster.</p>
-<p>The Map-Reduce framework consists of a single master
+<p>The Map/Reduce framework consists of a single master
<span class="codefrag">JobTracker</span> and one slave <span class="codefrag">TaskTracker</span> per
cluster-node. The master is responsible for scheduling the jobs'
component
tasks on the slaves, monitoring them and re-executing the failed tasks.
The
@@ -388,7 +388,7 @@
scheduling tasks and monitoring them, providing status and diagnostic
information to the job-client.</p>
<p>Although the Hadoop framework is implemented in Java<sup>TM</sup>,
- Map-Reduce applications need not be written in Java.</p>
+ Map/Reduce applications need not be written in Java.</p>
<ul>
<li>
@@ -403,7 +403,7 @@
<a href="api/org/apache/hadoop/mapred/pipes/package-summary.html">
Hadoop Pipes</a> is a <a href="http://www.swig.org/">SWIG</a>-
- compatible <em>C++ API</em> to implement Map-Reduce applications (non
+ compatible <em>C++ API</em> to implement Map/Reduce applications (non
JNI<sup>TM</sup> based).
</li>
@@ -414,7 +414,7 @@
<a name="N1008B"></a><a name="Inputs+and+Outputs"></a>
<h2 class="h3">Inputs and Outputs</h2>
<div class="section">
-<p>The Map-Reduce framework operates exclusively on
+<p>The Map/Reduce framework operates exclusively on
<span class="codefrag"><key, value></span> pairs, that is, the
framework views the
input to the job as a set of <span class="codefrag"><key, value></span> pairs and
produces a set of <span class="codefrag"><key, value></span> pairs as the output of
@@ -426,7 +426,7 @@
<a href="api/org/apache/hadoop/io/WritableComparable.html">
WritableComparable</a> interface to facilitate sorting by the framework.
</p>
-<p>Input and Output types of a Map-Reduce job:</p>
+<p>Input and Output types of a Map/Reduce job:</p>
<p>
(input) <span class="codefrag"><k1, v1></span>
->
@@ -448,7 +448,7 @@
<a name="N100CD"></a><a name="Example%3A+WordCount+v1.0"></a>
<h2 class="h3">Example: WordCount v1.0</h2>
<div class="section">
-<p>Before we jump into the details, lets walk through an example Map-Reduce
+<p>Before we jump into the details, let's walk through an example Map/Reduce
application to get a flavour for how they work.</p>
<p>
<span class="codefrag">WordCount</span> is a simple application that counts
the number of
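[Editor's note: the map → sort/shuffle → reduce flow that the WordCount hunks above describe can be sketched without Hadoop at all. This is an illustrative plain-Python simulation, not the Hadoop API; the function names `map_phase`, `reduce_phase`, `wc_map`, and `wc_reduce` are invented for the example.]

```python
from itertools import groupby

def map_phase(records, mapper):
    """Apply the mapper to every <k1, v1> record, yielding <k2, v2> pairs."""
    for k1, v1 in records:
        yield from mapper(k1, v1)

def reduce_phase(pairs, reducer):
    """Sort by key (standing in for the framework's shuffle/sort),
    then hand each key and its list of values to the reducer."""
    for k2, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield from reducer(k2, [v for _, v in group])

# WordCount: the map emits <word, 1>; the reduce sums the ones.
def wc_map(offset, line):
    for word in line.split():
        yield word, 1

def wc_reduce(word, counts):
    yield word, sum(counts)

lines = enumerate(["Hello World Bye World", "Hello Hadoop Goodbye Hadoop"])
result = dict(reduce_phase(map_phase(lines, wc_map), wc_reduce))
# result == {'Bye': 1, 'Goodbye': 1, 'Hadoop': 2, 'Hello': 2, 'World': 2}
```

The real framework differs mainly in scale: the sort is external and distributed, and one `map_phase` runs per InputSplit rather than over the whole input.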
@@ -1226,11 +1226,11 @@
</div>
-<a name="N105A3"></a><a name="Map-Reduce+-+User+Interfaces"></a>
-<h2 class="h3">Map-Reduce - User Interfaces</h2>
+<a name="N105A3"></a><a name="Map%2FReduce+-+User+Interfaces"></a>
+<h2 class="h3">Map/Reduce - User Interfaces</h2>
<div class="section">
<p>This section provides a reasonable amount of detail on every user-facing
- aspect of the Map-Reduce framwork. This should help users implement,
+ aspect of the Map/Reduce framework. This should help users implement,
configure and tune their jobs in a fine-grained manner. However, please
note that the javadoc for each class/interface remains the most
comprehensive documentation available; this is only meant to be a
tutorial.
@@ -1260,7 +1260,7 @@
intermediate records. The transformed intermediate records do not
need
to be of the same type as the input records. A given input pair may
map to zero or many output pairs.</p>
-<p>The Hadoop Map-Reduce framework spawns one map task for each
+<p>The Hadoop Map/Reduce framework spawns one map task for each
<span class="codefrag">InputSplit</span> generated by the <span class="codefrag">InputFormat</span> for
the job.</p>
<p>Overall, <span class="codefrag">Mapper</span> implementations are passed the
@@ -1423,7 +1423,7 @@
<h4>Reporter</h4>
<p>
<a href="api/org/apache/hadoop/mapred/Reporter.html">
- Reporter</a> is a facility for Map-Reduce applications to report
+ Reporter</a> is a facility for Map/Reduce applications to report
progress, set application-level status messages and update
<span class="codefrag">Counters</span>.</p>
<p>
@@ -1443,20 +1443,20 @@
<p>
<a href="api/org/apache/hadoop/mapred/OutputCollector.html">
OutputCollector</a> is a generalization of the facility provided by
- the Map-Reduce framework to collect data output by the
+ the Map/Reduce framework to collect data output by the
<span class="codefrag">Mapper</span> or the <span class="codefrag">Reducer</span> (either the
intermediate outputs or the output of the job).</p>
-<p>Hadoop Map-Reduce comes bundled with a
+<p>Hadoop Map/Reduce comes bundled with a
<a href="api/org/apache/hadoop/mapred/lib/package-summary.html">
library</a> of generally useful mappers, reducers, and
partitioners.</p>
<a name="N107B6"></a><a name="Job+Configuration"></a>
<h3 class="h4">Job Configuration</h3>
<p>
<a href="api/org/apache/hadoop/mapred/JobConf.html">
- JobConf</a> represents a Map-Reduce job configuration.</p>
+ JobConf</a> represents a Map/Reduce job configuration.</p>
<p>
<span class="codefrag">JobConf</span> is the primary interface for a user to describe
- a map-reduce job to the Hadoop framework for execution. The framework
+ a Map/Reduce job to the Hadoop framework for execution. The framework
tries to faithfully execute the job as described by <span class="codefrag">JobConf</span>,
however:</p>
<ul>
@@ -1747,7 +1747,7 @@
with the <span class="codefrag">JobTracker</span>.</p>
<p>
<span class="codefrag">JobClient</span> provides facilities to submit jobs,
track their
- progress, access component-tasks' reports/logs, get the Map-Reduce
+ progress, access component-tasks' reports and logs, get the Map/Reduce
cluster's status information and so on.</p>
<p>The job submission process involves:</p>
<ol>
@@ -1762,7 +1762,7 @@
</li>
<li>
- Copying the job's jar and configuration to the map-reduce system
+ Copying the job's jar and configuration to the Map/Reduce system
directory on the <span class="codefrag">FileSystem</span>.
</li>
@@ -1802,8 +1802,8 @@
<span class="codefrag">JobClient</span> to submit the job and monitor
its progress.</p>
<a name="N10A48"></a><a name="Job+Control"></a>
<h4>Job Control</h4>
-<p>Users may need to chain map-reduce jobs to accomplish complex
- tasks which cannot be done via a single map-reduce job. This is fairly
+<p>Users may need to chain Map/Reduce jobs to accomplish complex
+ tasks which cannot be done via a single Map/Reduce job. This is fairly
easy since the output of the job typically goes to distributed
file-system, and the output, in turn, can be used as the input for
the
next job.</p>
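[Editor's note: the chaining pattern described in this hunk amounts to "a job is a dataset-in, dataset-out transformation, so job N+1 reads what job N wrote". A minimal framework-free sketch in Python; `run_job` is a made-up stand-in, not `JobClient`.]

```python
from itertools import groupby

def run_job(records, mapper, reducer):
    """One simulated job: map every record, sort by key, reduce each group."""
    mapped = [kv for rec in records for kv in mapper(*rec)]
    return [out
            for key, grp in groupby(sorted(mapped), key=lambda kv: kv[0])
            for out in reducer(key, [v for _, v in grp])]

# Job 1: word count.
job1_out = run_job(
    [(0, "to be or not to be")],
    lambda off, line: [(w, 1) for w in line.split()],
    lambda word, ones: [(word, sum(ones))])

# Job 2 consumes job 1's output directly: how many words occurred n times?
job2_out = run_job(
    job1_out,
    lambda word, n: [(n, 1)],
    lambda n, ones: [(n, sum(ones))])
# job2_out == [(1, 2), (2, 2)]  -- two words seen once, two seen twice
```

In real Hadoop the hand-off happens through the distributed file-system (job 2's input path is job 1's output path) rather than through an in-memory list.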
@@ -1840,9 +1840,9 @@
<h3 class="h4">Job Input</h3>
<p>
<a href="api/org/apache/hadoop/mapred/InputFormat.html">
- InputFormat</a> describes the input-specification for a Map-Reduce job.
+ InputFormat</a> describes the input-specification for a Map/Reduce job.
</p>
-<p>The Map-Reduce framework relies on the <span class="codefrag">InputFormat</span> of
+<p>The Map/Reduce framework relies on the <span class="codefrag">InputFormat</span> of
the job to:</p>
<ol>
@@ -1914,9 +1914,9 @@
<h3 class="h4">Job Output</h3>
<p>
<a href="api/org/apache/hadoop/mapred/OutputFormat.html">
- OutputFormat</a> describes the output-specification for a Map-Reduce
+ OutputFormat</a> describes the output-specification for a Map/Reduce
job.</p>
-<p>The Map-Reduce framework relies on the <span
class="codefrag">OutputFormat</span> of
+<p>The Map/Reduce framework relies on the <span
class="codefrag">OutputFormat</span> of
the job to:</p>
<ol>
@@ -1946,7 +1946,7 @@
application-writer will have to pick unique names per task-attempt
(using the attemptid, say <span class="codefrag">attempt_200709221812_0001_m_000000_0</span>),
not just per task.</p>
-<p>To avoid these issues the Map-Reduce framework maintains a special
+<p>To avoid these issues the Map/Reduce framework maintains a special
<span class="codefrag">${mapred.output.dir}/_temporary/_${taskid}</span> sub-directory
accessible via <span class="codefrag">${mapred.work.output.dir}</span>
for each task-attempt on the <span class="codefrag">FileSystem</span> where the output
@@ -1966,7 +1966,7 @@
<p>Note: The value of <span class="codefrag">${mapred.work.output.dir}</span>
during
execution of a particular task-attempt is actually
<span class="codefrag">${mapred.output.dir}/_temporary/_${taskid}</span>, and this value is
- set by the map-reduce framework. So, just create any side-files in the
+ set by the Map/Reduce framework. So, just create any side-files in the
path returned by
<a
href="api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)">
FileOutputFormat.getWorkOutputPath() </a>from map/reduce
@@ -1988,7 +1988,7 @@
<h4>Counters</h4>
<p>
<span class="codefrag">Counters</span> represent global counters, defined either by
- the Map-Reduce framework or applications. Each <span class="codefrag">Counter</span> can
+ the Map/Reduce framework or applications. Each <span class="codefrag">Counter</span> can
be of any <span class="codefrag">Enum</span> type. Counters of a particular
<span class="codefrag">Enum</span> are bunched into groups of type
<span class="codefrag">Counters.Group</span>.</p>
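[Editor's note: the grouping rule in this hunk — counters of one Enum type are bunched into one Counters.Group — can be mimicked in a few lines. This is a toy stand-in in Python, not the Hadoop Counters class; `incr_counter` and `group` are invented names that only echo the idea of incrementing a counter from a task.]

```python
from collections import defaultdict
from enum import Enum

class WordStats(Enum):          # one Enum type = one counter group
    TOTAL_WORDS = 1
    LONG_WORDS = 2

class Counters:
    """Toy stand-in: a two-level map of group name -> counter name -> count."""
    def __init__(self):
        self._groups = defaultdict(lambda: defaultdict(int))

    def incr_counter(self, key, amount=1):
        # The group is derived from the counter's Enum type.
        self._groups[type(key).__name__][key.name] += amount

    def group(self, enum_cls):
        return dict(self._groups[enum_cls.__name__])

c = Counters()
for word in "counting some reasonably long words".split():
    c.incr_counter(WordStats.TOTAL_WORDS)
    if len(word) > 6:
        c.incr_counter(WordStats.LONG_WORDS)
# c.group(WordStats) == {'TOTAL_WORDS': 5, 'LONG_WORDS': 2}
```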
@@ -2009,7 +2009,7 @@
files efficiently.</p>
<p>
<span class="codefrag">DistributedCache</span> is a facility provided by the
- Map-Reduce framework to cache files (text, archives, jars and so on)
+ Map/Reduce framework to cache files (text, archives, jars and so on)
needed by applications.</p>
<p>Applications specify the files to be cached via urls (hdfs://)
in the <span class="codefrag">JobConf</span>. The <span class="codefrag">DistributedCache</span>
@@ -2078,7 +2078,7 @@
interface supports the handling of generic Hadoop command-line
options.
</p>
<p>
-<span class="codefrag">Tool</span> is the standard for any Map-Reduce tool or
+<span class="codefrag">Tool</span> is the standard for any Map/Reduce tool or
application. The application should delegate the handling of
standard command-line options to
<a href="api/org/apache/hadoop/util/GenericOptionsParser.html">
@@ -2116,7 +2116,7 @@
<h4>IsolationRunner</h4>
<p>
<a href="api/org/apache/hadoop/mapred/IsolationRunner.html">
- IsolationRunner</a> is a utility to help debug Map-Reduce programs.</p>
+ IsolationRunner</a> is a utility to help debug Map/Reduce programs.</p>
<p>To use the <span class="codefrag">IsolationRunner</span>, first set
<span class="codefrag">keep.failed.tasks.files</span> to <span class="codefrag">true</span>
(also see <span class="codefrag">keep.tasks.files.pattern</span>).</p>
@@ -2219,11 +2219,11 @@
<h4>JobControl</h4>
<p>
<a href="api/org/apache/hadoop/mapred/jobcontrol/package-summary.html">
- JobControl</a> is a utility which encapsulates a set of Map-Reduce jobs
+ JobControl</a> is a utility which encapsulates a set of Map/Reduce jobs
and their dependencies.</p>
<a name="N10D57"></a><a name="Data+Compression"></a>
<h4>Data Compression</h4>
-<p>Hadoop Map-Reduce provides facilities for the application-writer to
+<p>Hadoop Map/Reduce provides facilities for the application-writer to
specify compression for both intermediate map-outputs and the
job-outputs i.e. output of the reduces. It also comes bundled with
<a href="api/org/apache/hadoop/io/compress/CompressionCodec.html">
@@ -2268,7 +2268,7 @@
<h2 class="h3">Example: WordCount v2.0</h2>
<div class="section">
<p>Here is a more complete <span class="codefrag">WordCount</span> which uses many of the
- features provided by the Map-Reduce framework we discussed so far.</p>
+ features provided by the Map/Reduce framework we discussed so far.</p>
<p>This needs the HDFS to be up and running, especially for the
<span class="codefrag">DistributedCache</span>-related features. Hence
it only works with a
<a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
@@ -3655,7 +3655,7 @@
<a name="N1160B"></a><a name="Highlights"></a>
<h3 class="h4">Highlights</h3>
<p>The second version of <span class="codefrag">WordCount</span> improves upon the
- previous one by using some features offered by the Map-Reduce framework:
+ previous one by using some features offered by the Map/Reduce framework:
</p>
<ul>