Modified: hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod.xml
URL: 
http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod.xml?rev=629361&r1=629360&r2=629361&view=diff
==============================================================================
--- hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod.xml 
(original)
+++ hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod.xml Tue Feb 
19 21:17:48 2008
@@ -31,636 +31,17 @@
     <section>
       <title> Introduction </title>
       <p>
-      The Hadoop On Demand (<acronym title="Hadoop On Demand">HOD</acronym>) 
project is a system for provisioning and managing independent Hadoop MapReduce 
instances on a shared cluster of nodes. HOD uses a resource manager for 
allocation. At present it supports <a 
href="http://www.clusterresources.com/pages/products/torque-resource-manager.php">Torque</a>
 out of the box.
+Hadoop On Demand (HOD) is a system for provisioning virtual Hadoop clusters 
over a large physical cluster. It uses the Torque resource manager to do node 
allocation. On the allocated nodes, it can start Hadoop Map/Reduce and HDFS 
daemons. It automatically generates the appropriate configuration files 
(hadoop-site.xml) for the Hadoop daemons and client. HOD also has the 
capability to distribute Hadoop to the nodes in the virtual cluster that it 
allocates. In short, HOD makes it easy for administrators and users to quickly 
set up and use Hadoop. It is also a very useful tool for Hadoop developers and 
testers who need to share a physical cluster for testing their own Hadoop 
versions.
       </p>
-    </section>
-
-    <section> 
-      <title> Feature List </title>
-
-      <section> 
-        <title> Simplified Interface for Provisioning Hadoop Clusters </title>
-        <p>
-        By far, the biggest advantage of HOD is to quickly setup a Hadoop 
cluster. The user interacts with the cluster through a simple command line 
interface, the HOD client. HOD brings up a virtual MapReduce cluster with the 
required number of nodes, which the user can use for running Hadoop jobs. When 
done, HOD will automatically clean up the resources and make the nodes 
available again.
-        </p>
-      </section>
- 
-      <section> 
-        <title> Automatic installation of Hadoop </title>
-        <p>
-        With HOD, Hadoop does not need to be even installed on the cluster. 
The user can provide a Hadoop tarball that HOD will automatically distribute to 
all the nodes in the cluster.
-        </p>
-      </section>
-
-      <section> 
-        <title> Configuring Hadoop </title>
-        <p>
-        Dynamic parameters of Hadoop configuration, such as the NameNode and 
JobTracker addresses and ports, and file system temporary directories are 
generated and distributed by HOD automatically to all nodes in the cluster. In 
addition, HOD allows the user to configure Hadoop parameters at both the server 
(for e.g. JobTracker) and client (for e.g. JobClient) level, including 'final' 
parameters, that were introduced with Hadoop 0.15.
-        </p>
-      </section>
- 
-      <section> 
-        <title> Auto-cleanup of Unused Clusters </title>
-        <p>
-        HOD has an automatic timeout so that users cannot misuse resources 
they aren't using. The timeout applies only when there is no MapReduce job 
running. 
-        </p>
-      </section>
- 
-      <section> 
-        <title> Log Services </title>
-        <p>
-        HOD can be used to collect all MapReduce logs to a central location 
for archiving and inspection after the job is completed.
-        </p>
-      </section>
-    </section>
-
-    <section>
-      
-      <title> HOD Components </title>
-      <p>
-      This is a brief overview of the various components of HOD and how they 
interact to provision Hadoop.
-      </p>
-
-      <section>
-        <title> HOD Client </title>
-        <p>
-        The HOD client is a Unix command that users use to allocate Hadoop 
MapReduce clusters. The command provides other options to list allocated 
clusters and deallocate them. The HOD client generates the 
<em>hadoop-site.xml</em> in a user specified directory. The user can point to 
this configuration file while running Map/Reduce jobs on the allocated cluster.
-        </p>
-        <p>
-        The nodes from where the HOD Client is run are called <em>submit 
nodes</em> because jobs are submitted to the resource manager system for 
allocating and running clusters from these nodes.
-        </p>
       </section>
-
       <section>
-        <title> RingMaster </title>
-        <p>
-        The RingMaster is a HOD process that is started on one node per every 
allocated cluster. It is submitted as a 'job' to the resource manager by the 
HOD client. It controls which Hadoop daemons start on which nodes. It provides 
this information to other HOD processes, such as the HOD client, so users can 
also determine this information. The RingMaster is responsible for hosting and 
distributing the Hadoop tarball to all nodes in the cluster. It also 
automatically cleans up unused clusters.
-        </p>
-        <p>
-        </p>
-      </section>
-
-      <section>
-        <title> HodRing </title>
-        <p>
-        The HodRing is a HOD process that runs on every allocated node in the 
cluster. These processes are run by the RingMaster through the resource 
manager, using a facility of parallel execution. The HodRings are responsible 
for launching Hadoop commands on the nodes to bring up the Hadoop daemons. They 
get the command to launch from the RingMaster.
-        </p>
-      </section>
-
-      <section>
-        <title> Hodrc / HOD configuration file </title>
-        <p>
-        An INI style configuration file where the users configure various 
options for the HOD system, including install locations of different software, 
resource manager parameters, log and temp file directories, parameters for 
their MapReduce jobs, etc.
-        </p>
-      </section>
-
-      <section>
-        <title> Submit Nodes and Compute Nodes </title>
-        <p>
-        The nodes from where the <em>HOD Client</em> is run are referred as 
<em>submit nodes</em> because jobs are submitted to the resource manager system 
for allocating and running clusters from these nodes.
-        </p>
-        <p>
-        The nodes where the <em>Ringmaster</em> and <em>HodRings</em> run are 
called the Compute nodes. These are the nodes that get allocated by a resource 
manager, and on which the Hadoop daemons are provisioned and started.
-        </p>
-      </section>
-    </section>
-
-    <section>
-      <title> Getting Started with HOD </title>
-
-      <section>
-        <title> Pre-Requisites </title>
-
-        <section>
-          <title> Hardware </title>
-          <p>
-          HOD requires a minimum of 3 nodes configured through a resource 
manager.
-          </p>          
-        </section>
-
-        <section>
-          <title> Software </title>
-          <p>
-          The following components are assumed to be installed before using 
HOD:
-          </p>
-          <ul>
-            <li>
-              <em>Torque:</em> Currently HOD supports Torque out of the box. 
We assume that you are familiar with configuring Torque. You can get 
information about this from <a 
href="http://www.clusterresources.com/wiki/doku.php?id=torque:torque_wiki">here</a>.
-            </li>
-            <li>
-              <em>Python:</em> We require version 2.5.1, which can be 
downloaded from <a href="http://www.python.org/">here</a>.
-            </li>
-          </ul>
-          <p>
-          The following components can be optionally installed for getting 
better functionality from HOD:
-          </p>
-          <ul>
-            <li>
-              <em>Twisted Python:</em> This can be used for improving the 
scalability of HOD. Twisted Python is available <a 
href="http://twistedmatrix.com/trac/">here</a>.
-            </li>
-            <li>
-            <em>Hadoop:</em> HOD can automatically distribute Hadoop to all 
nodes in the cluster. However, it can also use a pre-installed version of 
Hadoop, if it is available on all nodes in the cluster. HOD currently supports 
Hadoop 0.15 and above.
-            </li>
-          </ul>
-          <p>
-          HOD configuration requires the location of installs of these 
components to be the same on all nodes in the cluster. It will also make the 
configuration simpler to have the same location on the submit nodes.
-          </p>
-        </section>
-   
-        <section>
-          <title>Resource Manager Configuration Pre-requisites</title>
-          <p>
-          For using HOD with Torque:
-          </p>
-          <ul>
-            <li>
-            Install Torque components: pbs_server on a head node, pbs_moms on 
all compute nodes, and PBS client tools on all compute nodes and submit nodes.
-            </li>
-            <li>
-            Create a queue for submitting jobs on the pbs_server.
-            </li>
-            <li>
-            Specify a name for all nodes in the cluster, by setting a 'node 
property' to all the nodes. This can be done by using the 'qmgr' command. For 
example:
-            <em>qmgr -c "set node node properties=cluster-name"</em>
-            </li>
-            <li>
-            Ensure that jobs can be submitted to the nodes. This can be done 
by using the 'qsub' command. For example:
-            <em>echo "sleep 30" | qsub -l nodes=3</em>
-            </li>
-          </ul>
-          <p>
-          More information about setting up Torque can be found by referring 
to the documentation <a 
href="http://www.clusterresources.com/pages/products/torque-resource-manager.php">here.</a>
-          </p>
-        </section>
-      </section>
-
-      <section>
-        <title>Setting up HOD</title>
-        <ul>
-          <li>
-          HOD is available in the 'contrib' section of Hadoop under the root 
directory 'hod'. Distribute the files under this directory to all the nodes in 
the cluster.
-          </li>
-          <li>
-          On the node from where you want to run hod, edit the file hodrc 
which can be found in the <em>install dir/conf</em> directory. This file 
contains the minimal set of values required for running hod.
-          </li>
-          <li>
-          Specify values suitable to your environment for the following 
variables defined in the configuration file. Note that some of these variables 
are defined at more than one place in the file.
-          </li>
+        <title>Documentation</title>
+      <p>Please go through the following documents to learn more about using 
HOD:</p>
+      <ul>
+        <li><a href="hod_admin_guide.html">Hod Admin Guide</a> : This guide 
walks you through an overview of the HOD architecture, prerequisites, 
installation of the various components and dependent software, and 
configuration of HOD to get it up and running.</li>
+        <li><a href="hod_user_guide.html">Hod User Guide</a> : This guide 
explains how to get started with running HOD, its various features and command 
line options, and gives detailed help on troubleshooting.</li>
+        <li><a href="hod_config_guide.html">Hod Configuration Guide</a> : This 
guide discusses configuring HOD, describing the various configuration sections 
and parameters and their purpose in detail.</li>
       </ul>
-        <table>
-          <tr>
-            <th> Variable Name </th>
-            <th> Meaning </th>
-          </tr>
-          <tr>
-            <td> ${JAVA_HOME} </td>
-            <td> Location of Java for Hadoop. Hadoop supports Sun JDK 1.5.x 
</td>
-          </tr>
-          <tr>
-            <td> ${CLUSTER_NAME} </td>
-            <td> Name of the cluster which is specified in the 'node property' 
as mentioned in resource manager configuration. </td>
-          </tr>
-          <tr>
-            <td> ${HADOOP_HOME} </td>
-            <td> Location of Hadoop installation on the compute and submit 
nodes. </td>
-          </tr>
-          <tr>
-            <td> ${RM_QUEUE} </td>
-            <td> Queue configured for submiting jobs in the resource manager 
configuration. </td>
-          </tr>
-          <tr>
-            <td> ${RM_HOME} </td>
-            <td> Location of the resource manager installation on the compute 
and submit nodes. </td>
-          </tr>
-        </table>
-        <ul>
-          <li>
-          The following environment variables *may* need to be set depending 
on your environment. These variables must be defined where you run the HOD 
client, and also be specified in the HOD configuration file as the value of the 
key resource_manager.env-vars. Multiple variables can be specified as a comma 
separated list of key=value pairs.
-          </li>
-        </ul>
-        <table>
-          <tr>
-            <th> Variable Name </th>
-            <th> Meaning </th>
-          </tr>
-          <tr>
-            <td>HOD_PYTHON_HOME</td>
-            <td>
-            If you install python to a non-default location of the compute 
nodes, or submit nodes, then, this variable must be defined to point to the 
python executable in the non-standard   location.
-            </td>
-          </tr>
-        </table>
-        <p>
-        You can also review other configuration options in the file and modify 
them to suit your needs. Refer to the the section on configuration below for 
information about the HOD configuration.
-        </p>
-      </section>
-    </section>
-
-    <section>
-      <title>Running HOD</title>
-
-      <section>
-        <title>Overview</title>
-        <p>
-        A typical session of HOD will involve atleast three steps: allocate, 
run hadoop jobs, deallocate.
-        </p>
-        <section>
-          <title>Operation allocate</title>
-          <p>
-          The allocate operation is used to allocate a set of nodes and 
install and provision Hadoop on them. It has the following syntax:
-          </p>
-          <table>
-            <tr>
-              <td>hod -c config_file -t hadoop_tarball_location -o "allocate   
              cluster_dir number_of_nodes"</td>
-            </tr>
-          </table>
-          <p>
-          The hadoop_tarball_location must be a location on a shared file 
system accesible from all nodes in the cluster. Note, the cluster_dir must 
exist before running the command. If the command completes successfully then 
cluster_dir/hadoop-site.xml will be generated and will contain information 
about the allocated cluster's JobTracker and NameNode.
-          </p>
-          <p>
-          For example, the following command uses a hodrc file in 
~/hod-config/hodrc and allocates Hadoop (provided by the tarball 
~/share/hadoop.tar.gz) on 10 nodes, storing the generated Hadoop configuration 
in a directory named <em>~/hadoop-cluster</em>:
-          </p>
-          <table>
-            <tr>
-              <td>$ hod -c ~/hod-config/hodrc -t ~/share/hadoop.tar.gz -o 
"allocate ~/hadoop-cluster 10"</td>
-            </tr>
-          </table>
-          <p>
-          HOD also supports an environment variable called 
<em>HOD_CONF_DIR</em>. If this is defined, HOD will look for a default hodrc 
file at $HOD_CONF_DIR/hodrc. Defining this allows the above command to also be 
run as follows:
-          </p>
-          <table>
-            <tr>
-              <td>
-                <p>$ export HOD_CONF_DIR=~/hod-config</p>
-                <p>$ hod -t ~/share/hadoop.tar.gz -o "allocate 
~/hadoop-cluster 10"</p>
-              </td>
-            </tr>
-          </table>
-        </section>
-        
-        <section>
-          <title>Running Hadoop jobs using the allocated cluster</title>
-          <p>
-          Now, one can run Hadoop jobs using the allocated cluster in the 
usual manner:
-          </p>
-          <table>
-            <tr>
-              <td>hadoop --config cluster_dir hadoop_command 
hadoop_command_args</td>
-            </tr>
-          </table>
-          <p>
-          Continuing our example, the following command will run a wordcount 
example on the allocated cluster:
-          </p>
-          <table>
-            <tr>
-              <td>$ hadoop --config ~/hadoop-cluster jar 
/path/to/hadoop/hadoop-examples.jar wordcount /path/to/input 
/path/to/output</td>
-            </tr>
-          </table>
-        </section>
-
-        <section>
-          <title>Operation deallocate</title>
-          <p>
-          The deallocate operation is used to release an allocated cluster. 
When finished with a cluster, deallocate must be run so that the nodes become 
free for others to use. The deallocate operation has the following syntax:
-          </p>
-          <table>
-            <tr>
-              <td>hod -o "deallocate cluster_dir"</td>
-            </tr>
-          </table>
-          <p>
-          Continuing our example, the following command will deallocate the 
cluster:
-          </p>
-          <table>
-            <tr>
-              <td>$ hod -o "deallocate ~/hadoop-cluster"</td>
-            </tr>
-          </table>
-        </section>
-      </section>
-
-      <section>
-        <title>Command Line Options</title>
-        <p>
-        This section covers the major command line options available via the 
hod command:
-        </p>
-
-        <p>
-        <em>--help</em>
-        </p>
-        <p>
-        Prints out the help message to see the basic options.
-        </p>
-
-        <p>
-        <em>--verbose-help</em>
-        </p>
-        <p>
-        All configuration options provided in the hodrc file can be passed on 
the command line, using the syntax --section_name.option_name[=value]. When 
provided this way, the value provided on command line overrides the option 
provided in hodrc. The verbose-help command lists all the available options in 
the hodrc file. This is also a nice way to see the meaning of the configuration 
options.
-        </p>
-
-        <p>
-        <em>-c config_file</em>
-        </p>
-        <p>
-        Provides the configuration file to use. Can be used with all other 
options of HOD. Alternatively, the HOD_CONF_DIR environment variable can be 
defined to specify a directory that contains a file named hodrc, alleviating 
the need to specify the configuration file in each HOD command.
-        </p>
-
-        <p>
-        <em>-b 1|2|3|4</em>
-        </p>
-        <p>
-        Enables the given debug level. Can be used with all other options of 
HOD. 4 is most verbose.
-        </p>
-
-        <p>
-        <em>-o "help"</em>
-        </p>
-        <p>
-        Lists the operations available in the operation mode.
-        </p>
-
-        <p>
-        <em>-o "allocate cluster_dir number_of_nodes"</em>
-        </p>
-        <p>
-        Allocates a cluster on the given number of cluster nodes, and store 
the allocation information in cluster_dir for use with subsequent hadoop 
commands. Note that the cluster_dir must exist before running the command.
-        </p>
-
-        <p>
-        <em>-o "list"</em>
-        </p>
-        <p>
-        Lists the clusters allocated by this user. Information provided 
includes the Torque job id corresponding to the cluster, the cluster directory 
where the allocation information is stored, and whether the Map/Reduce daemon 
is still active or not.
-        </p>
-
-        <p>
-        <em>-o "info cluster_dir"</em>
-        </p>
-        <p>
-        Lists information about the cluster whose allocation information is 
stored in the specified cluster directory.
-        </p>
-
-        <p>
-        <em>-o "deallocate cluster_dir"</em>
-        </p>
-        <p>
-       Deallocates the cluster whose allocation information is stored in the 
specified cluster directory.
-        </p>
-
-        <p>
-        <em>-t hadoop_tarball</em>
-        </p>
-        <p>
-        Provisions Hadoop from the given tar.gz file. This option is only 
applicable to the allocate operation. For better distribution performance it is 
recommended that the Hadoop tarball contain only the libraries and binaries, 
and not the source or documentation. 
-        </p>
-
-        <p>
-        <em>-Mkey1=value1 -Mkey2=value2</em>
-        </p>
-        <p>
-        Provides configuration parameters for the provisioned Map/Reduce 
daemons (JobTracker and TaskTrackers). A hadoop-site.xml is generated with 
these values on the cluster nodes
-        </p>
-
-        <p>
-        <em>-Hkey1=value1 -Hkey2=value2</em>
-        </p>
-        <p>
-        Provides configuration parameters for the provisioned HDFS daemons 
(NameNode and DataNodes). A hadoop-site.xml is generated with these values on 
the cluster nodes
-        </p>
-
-        <p>
-        <em>-Ckey1=value1 -Ckey2=value2</em>
-        </p>
-        <p>
-        Provides configuration parameters for the client from where jobs can 
be submitted. A hadoop-site.xml is generated with these values on the submit 
node.
-        </p>
-
-      </section>
-    </section>
-    <section>
-      <title> HOD Configuration </title>
-      <section>
-        <title> Introduction to HOD Configuration </title>
-        <p>
-        Configuration options for HOD are organized as sections and options 
within them. They can be specified in two ways: a configuration file in the INI 
format, and as command line options to the HOD shell, specified in the format 
--section.option[=value]. If the same option is specified in both places, the 
value specified on the command line overrides the value in the configuration 
file.
-        </p>
-        <p>
-        To get a simple description of all configuration options, you can type 
<em>hod --verbose-help</em>
-        </p>
-        <p>
-        This section explains some of the most important or commonly used 
configuration options in some more detail.
-        </p>
-      </section>
-      <section>
-        <title> Categories / Sections in HOD Configuration </title>
-        <p>
-        The following are the various sections in the HOD configuration:
-        </p>
-        <table>
-          <tr>
-            <th> Section Name </th>
-            <th> Description </th>
-          </tr>
-          <tr>
-            <td>hod</td>
-            <td>Options for the HOD client</td>
-          </tr>
-          <tr>
-            <td>resource_manager</td>
-            <td>Options for specifying which resource manager to use, and 
other parameters for using that resource manager</td>
-          </tr>
-          <tr>
-            <td>ringmaster</td>
-            <td>Options for the RingMaster process</td>
-          </tr>
-          <tr>
-            <td>hodring</td>
-            <td>Options for the HodRing process</td>
-          </tr>
-          <tr>
-            <td>gridservice-mapred</td>
-            <td>Options for the MapReduce daemons</td>
-          </tr>
-          <tr>
-            <td>gridservice-hdfs</td>
-            <td>Options for the HDFS daemons</td>
-          </tr>
-        </table>
-      </section>
-
-      <section>
-        <title> Important and Commonly Used Configuration Options </title>
-        
-        <section>
-          <title> Common configuration options </title>
-          <p>
-          Certain configuration options are defined in most of the sections of 
the HOD configuration. Options defined in a section, are used by the process 
for which that section applies. These options have the same meaning, but can 
have different values in each section.
-          </p>
-          <table>
-            <tr>
-              <th> Option Name </th>
-              <th> Description </th>
-            </tr>
-            <tr>
-              <td>temp-dir</td>
-              <td>Temporary directory for usage by the HOD processes. Make 
sure that the users who will run hod have rights to create directories under 
the directory specified here.</td>
-            </tr>
-            <tr>
-              <td>debug</td>
-              <td>A numeric value from 1-4. 4 produces the most log 
information, and 1 the least.</td>
-            </tr>
-            <tr>
-              <td>log-dir</td>
-              <td>Directory where log files are stored. By default, this is 
<em>install-location/logs/</em>. The restrictions and notes for the temp-dir 
variable apply here too.</td>
-            </tr>
-            <tr>
-              <td>xrs-port-range</td>
-              <td>A range of ports, among which an available port shall be 
picked for use to run any XML-RPC based server daemon processes of HOD.</td>
-            </tr>
-            <tr>
-              <td>http-port-range</td>
-              <td>A range of ports, among which an available port shall be 
picked for use to run any HTTP based server daemon processes of HOD.</td>
-            </tr>
-          </table>
-        </section>
-
-        <section>
-          <title> hod options </title>
-          <table>
-            <tr>
-              <th> Option Name </th>
-              <th> Description </th>
-            </tr>
-            <tr>
-              <td>cluster</td>
-              <td>A descriptive name given to the cluster. For Torque, this is 
specified as a 'Node property' for every node in the cluster. HOD uses this 
value to compute the number of available nodes.</td>
-            </tr>
-            <tr>
-              <td>client-params</td>
-              <td>A comma-separated list of hadoop config parameters specified 
as key-value pairs. These will be used to generate a hadoop-site.xml on the 
submit node that should be used for running MapReduce jobs.</td>
-            </tr>
-          </table>
-        </section>
-
-        <section>
-          <title> resource_manager options </title>
-          <table>
-            <tr>
-              <th> Option Name </th>
-              <th> Description </th>
-            </tr>
-            <tr>
-              <td>queue</td>
-              <td>Name of the queue configured in the resource manager to 
which jobs are to be submitted.</td>
-            </tr>
-            <tr>
-              <td>batch-home</td>
-              <td>Install directory to which 'bin' is appended and under which 
the executables of the resource manager can be found. </td>
-            </tr>
-            <tr>
-              <td>env-vars</td>
-              <td>This is a comma separated list of key-value pairs, expressed 
as key=value, which would be passed to the jobs launched on the compute nodes. 
For example, if the python installation is in a non-standard location, one can 
set the environment variable 'HOD_PYTHON_HOME' to the path to the python 
executable. The HOD processes launched on the compute nodes can then use this 
variable.</td>
-            </tr>
-          </table>
-        </section>
-
-        <section>
-          <title> ringmaster options </title>
-          <table>
-            <tr>
-              <th> Option Name </th>
-              <th> Description </th>
-            </tr>
-            <tr>
-              <td>work-dirs</td>
-              <td>These are a list of comma separated paths that will serve as 
the root for directories that HOD generates and passes to Hadoop for use to 
store DFS / MapReduce data. For e.g. this is where DFS data blocks will be 
stored. Typically, as many paths are specified as there are disks available to 
ensure all disks are being utilized. The restrictions and notes for the 
temp-dir variable apply here too.</td>
-            </tr>
-          </table>
-        </section>
-
-        <section>
-          <title> gridservice-hdfs options </title>
-          <table>
-            <tr>
-              <th> Option Name </th>
-              <th> Description </th>
-            </tr>
-            <tr>
-              <td>external</td>
-              <td>
-              <p> If false, this indicates that a HDFS cluster must be bought 
up by the HOD system, on the nodes which it allocates via the allocate command. 
Note that in that case, when the cluster is de-allocated, it will bring down 
the HDFS cluster, and all the data will be lost. If true, it will try and 
connect to an externally configured HDFS system. </p>
-              <p>Typically, because input for jobs are placed into HDFS before 
jobs are run, and also the output from jobs in HDFS is required to be 
persistent, an internal HDFS cluster is of little value in a production system. 
However, it allows for quick testing.</p>
-              </td>
-            </tr>
-            <tr>
-              <td>host</td>
-              <td>Hostname of the externally configured NameNode, if any.</td>
-            </tr>
-            <tr>
-              <td>fs_port</td>
-              <td>Port to which NameNode RPC server is bound.</td>
-            </tr>
-            <tr>
-              <td>info_port</td>
-              <td>Port to which the NameNode web UI server is bound.</td>
-            </tr>
-            <tr>
-              <td>pkgs</td>
-              <td>Installation directory, under which bin/hadoop executable is 
located. This can be used to use a pre-installed version of Hadoop on the 
cluster.</td>
-            </tr>
-            <tr>
-              <td>server-params</td>
-              <td>A comma-separated list of hadoop config parameters specified 
key-value pairs. These will be used to generate a hadoop-site.xml that will be 
used by the NameNode and DataNodes.</td>
-            </tr>
-            <tr>
-              <td>final-server-params</td>
-              <td>Same as above, except they will be marked final.</td>
-            </tr>
-          </table>
-        </section>
-
-        <section>
-          <title> gridservice-mapred options </title>
-          <table>
-            <tr>
-              <th> Option Name </th>
-              <th> Description </th>
-            </tr>
-            <tr>
-              <td>external</td>
-              <td>
-              <p> If false, this indicates that a MapReduce cluster must be 
bought up by the HOD system on the nodes which it allocates via the allocate 
command. If true, if will try and connect to an externally configured MapReduce 
system.</p>
-              </td>
-            </tr>
-            <tr>
-              <td>host</td>
-              <td>Hostname of the externally configured JobTracker, if 
any.</td>
-            </tr>
-            <tr>
-              <td>tracker_port</td>
-              <td>Port to which the JobTracker RPC server is bound.</td>
-            </tr>
-            <tr>
-              <td>info_port</td>
-              <td>Port to which the JobTracker web UI server is bound.</td>
-            </tr>
-            <tr>
-              <td>pkgs</td>
-              <td>Installation directory, under which bin/hadoop executable is 
located. This can be used to use a pre-installed version of Hadoop on the 
cluster.</td>
-            </tr>
-            <tr>
-              <td>server-params</td>
-              <td>A comma-separated list of hadoop config parameters specified 
key-value pairs. These will be used to generate a hadoop-site.xml that will be 
used by the JobTracker and TaskTrackers.</td>
-            </tr>
-            <tr>
-              <td>final-server-params</td>
-              <td>Same as above, except they will be marked final.</td>
-            </tr>
-          </table>
-        </section>
-      </section>
     </section>
   </body>
 </document>
-
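For reference, the allocate / run / deallocate cycle that this commit moves out
of hod.xml (now covered by the new user guide) can be sketched as the session
below. This is a dry-run sketch: the `hod` and `hadoop` invocations are stubbed
with `echo` so it runs anywhere, and the paths, tarball location, and node
count are illustrative, not defaults.

```shell
#!/bin/sh
# Sketch of the allocate / run / deallocate cycle formerly documented in
# hod.xml. 'hod' and 'hadoop' are stubbed with echo here; drop the stubs
# on a real Torque-backed cluster to execute the commands for real.
hod()    { echo "hod $*"; }
hadoop() { echo "hadoop $*"; }

export HOD_CONF_DIR=~/hod-config     # HOD looks for $HOD_CONF_DIR/hodrc
CLUSTER_DIR=~/hadoop-cluster         # must exist before allocation
mkdir -p "$CLUSTER_DIR"

# 1. Allocate 10 nodes, provisioning Hadoop from a tarball on a shared
#    file system; $CLUSTER_DIR/hadoop-site.xml is generated on success.
hod -t ~/share/hadoop.tar.gz -o "allocate $CLUSTER_DIR 10"

# 2. Run jobs against the generated configuration.
hadoop --config "$CLUSTER_DIR" jar hadoop-examples.jar wordcount in out

# 3. Release the nodes so they become free for others.
hod -o "deallocate $CLUSTER_DIR"
```
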

Added: 
hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml
URL: 
http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml?rev=629361&view=auto
==============================================================================
--- 
hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml 
(added)
+++ 
hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_admin_guide.xml 
Tue Feb 19 21:17:48 2008
@@ -0,0 +1,238 @@
+<?xml version="1.0"?>
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
+          "http://forrest.apache.org/dtd/document-v20.dtd";>
+
+
+<document>
+
+  <header>
+    <title> 
+      Hadoop On Demand
+    </title>
+  </header>
+
+  <body>
+<section>
+<title>Overview</title>
+
+<p>The Hadoop On Demand (HOD) project is a system for provisioning and
+managing independent Hadoop MapReduce and HDFS instances on a shared cluster 
+of nodes. HOD is a tool that makes it easy for administrators and users to 
+quickly set up and use Hadoop. It is also a very useful tool for Hadoop 
developers 
+and testers who need to share a physical cluster for testing their own Hadoop 
+versions.
+</p>
+
+<p>HOD relies on a resource manager (RM) for allocation of nodes that it can 
use for
+running Hadoop instances. At present it runs with the <a 
href="ext:hod/torque">Torque
+resource manager</a>.
+</p>
+
+<p>
+The basic system architecture of HOD includes the following components:</p>
+<ul>
+  <li>A Resource manager (possibly together with a scheduler),</li>
+  <li>HOD components, and </li>
+  <li>Hadoop Map/Reduce and HDFS daemons.</li>
+</ul>
+
+<p>
+HOD provisions and maintains Hadoop Map/Reduce and, optionally, HDFS instances 
+through interaction with the above components on a given cluster of nodes. A 
cluster of
+nodes can be thought of as comprising two sets of nodes:</p>
+<ul>
+  <li>Submit nodes: Users use the HOD client on these nodes to allocate 
clusters, and then
+use the Hadoop client to submit Hadoop jobs. </li>
+  <li>Compute nodes: Using the resource manager, HOD components are run on 
these nodes to 
+provision the Hadoop daemons. After that Hadoop jobs run on them.</li>
+</ul>
+
+<p>
+Here is a brief description of the sequence of operations in allocating a 
cluster and
+running jobs on it.
+</p>
+
+<ul>
+  <li>The user uses the HOD client on the Submit node to allocate a required 
number of
+cluster nodes, and provision Hadoop on them.</li>
+  <li>The HOD client uses a Resource Manager interface, (qsub, in Torque), to 
submit a HOD
+process, called the RingMaster, as a Resource Manager job, requesting the 
user's desired number 
+of nodes. This job is submitted to the central server of the Resource Manager 
(pbs_server, in Torque).</li>
+  <li>On the compute nodes, the resource manager slave daemons, (pbs_moms in 
Torque), accept
+and run jobs that they are given by the central server (pbs_server in Torque). 
The RingMaster 
+process is started on one of the compute nodes (mother superior, in 
Torque).</li>
+  <li>The RingMaster then uses another Resource Manager interface, (pbsdsh, in 
Torque), to run
+the second HOD component, HodRing, as distributed tasks on each of the compute
+nodes allocated.</li>
+  <li>The HodRings, after initializing, communicate with the RingMaster to get 
Hadoop commands, 
+and run them accordingly. Once the Hadoop commands are started, they register 
with the RingMaster,
+giving information about the daemons.</li>
+  <li>All the configuration files needed for Hadoop instances are generated by 
HOD itself, 
+some based on options given by the user in the HOD configuration file.</li>
+  <li>The HOD client keeps communicating with the RingMaster to find out the 
location of the 
+JobTracker and HDFS daemons.</li>
+</ul>
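The allocation sequence above can be caricatured as a toy simulation in Python. This is purely illustrative: none of these classes exist in HOD, and the names only mirror the message flow (HOD client, RingMaster started via qsub, HodRings started via pbsdsh, daemons registering back).

```python
# Toy simulation of the HOD allocation sequence described above.
# All names here are illustrative, not HOD's actual code.

class RingMaster:
    def __init__(self):
        self.daemons = {}          # daemon name -> host, filled in by HodRings

    def register(self, name, host):
        self.daemons[name] = host  # HodRings report started daemons here

class HodRing:
    def __init__(self, host, ringmaster):
        self.host, self.ringmaster = host, ringmaster

    def start_daemon(self, name):
        # In real HOD the HodRing obtains the Hadoop command from the
        # RingMaster and runs it; here we only register the "daemon".
        self.ringmaster.register(name, self.host)

ringmaster = RingMaster()                      # submitted via qsub in Torque
rings = [HodRing("node%d" % i, ringmaster) for i in range(3)]  # via pbsdsh
rings[0].start_daemon("jobtracker")
for r in rings:
    r.start_daemon("tasktracker@" + r.host)

# The HOD client would now query the RingMaster for daemon locations.
jobtracker_host = ringmaster.daemons["jobtracker"]
```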
+
+<p>The rest of the document deals with the steps needed to setup HOD on a 
physical cluster of nodes.</p>
+
+</section>
+
+<section>
+<title>Pre-requisites</title>
+
+<p>Operating System: HOD is currently tested on RHEL4.<br/>
+Nodes: HOD requires a minimum of 3 nodes configured through a resource 
manager.<br/></p>
+
+<p> Software </p>
+<p>The following components are to be installed on *ALL* the nodes before 
using HOD:</p>
+<ul>
+ <li>Torque: Resource manager</li>
+ <li><a href="ext:hod/python">Python</a> : HOD requires version 2.5.1 of 
Python.</li>
+</ul>
+
+<p>The following components can be optionally installed for getting better
+functionality from HOD:</p>
+<ul>
+ <li><a href="ext:hod/twisted-python">Twisted Python</a>: This can be
+  used for improving the scalability of HOD. If this module is detected to be
+  installed, HOD uses it; otherwise it falls back to the default modules.</li>
+ <li><a href="ext:site">Hadoop</a>: HOD can automatically
+ distribute Hadoop to all nodes in the cluster. However, it can also use a
+ pre-installed version of Hadoop, if it is available on all nodes in the 
cluster.
+  HOD currently supports Hadoop 0.15 and above.</li>
+</ul>
+
+<p>NOTE: HOD configuration requires that these components be installed in the
+same location on all nodes in the cluster. Configuration is also simpler if
+they are installed in the same location on the submit
+nodes.
+</p>
+</section>
+
+<section>
+<title>Resource Manager</title>
+<p>  Currently HOD works with the Torque resource manager, which it uses for 
its node
+  allocation and job submission. Torque is an open source resource manager from
+  <a href="ext:hod/cluster-resources">Cluster Resources</a>, a community effort
+  based on the PBS project. It provides control over batch jobs and 
distributed compute nodes. Torque is
+  freely available for download from <a 
href="ext:hod/torque-download">here</a>.
+  </p>
+
+<p>  All documentation related to Torque can be seen under
+  the section TORQUE Resource Manager <a
+  href="ext:hod/torque-docs">here</a>. You can
+  get wiki documentation from <a
+  href="ext:hod/torque-wiki">here</a>.
+  Users may wish to subscribe to TORQUE's mailing list or view the archive 
for questions
+  and comments <a
+  href="ext:hod/torque-mailing-list">here</a>.
+</p>
+
+<p>For using HOD with Torque:</p>
+<ul>
+ <li>Install Torque components: pbs_server on one node(head node), pbs_mom on 
all
+  compute nodes, and PBS client tools on all compute nodes and submit
+  nodes. Perform at least a basic configuration so that the Torque system is 
up and
+  running, i.e., pbs_server knows which machines to talk to. Look <a
+  href="ext:hod/torque-basic-config">here</a>
+  for basic configuration.
+
+  For advanced configuration, see <a
+  href="ext:hod/torque-advanced-config">here</a></li>
+ <li>Create a queue for submitting jobs on the pbs_server. The name of the 
queue is the
+  same as the HOD configuration parameter, resource_manager.queue. The HOD 
client uses this queue to
+  submit the RingMaster process as a Torque job.</li>
+ <li>Specify a 'cluster name' as a 'property' for all nodes in the cluster.
+  This can be done by using the 'qmgr' command. For example:
+  qmgr -c "set node node properties=cluster-name". The name of the cluster is 
the same as
+  the HOD configuration parameter, hod.cluster. </li>
+ <li>Ensure that jobs can be submitted to the nodes. This can be done by
+  using the 'qsub' command. For example:
+  echo "sleep 30" | qsub -l nodes=3</li>
+</ul>
+
+</section>
+
+<section>
+<title>Installing HOD</title>
+
+<p>Now that the resource manager set up is done, we proceed on to obtaining and
+installing HOD.</p>
+<ul>
+ <li>If you are getting HOD from the Hadoop tarball, it is available under the 
+  'contrib' section of Hadoop, under the root directory 'hod'.</li>
+ <li>If you are building from source, you can run 'ant tar' from the Hadoop root
+  directory, to generate the Hadoop tarball, and then pick HOD from there,
+  as described in the point above.</li>
+ <li>Distribute the files under this directory to all the nodes in the
+  cluster. Note that the location where the files are copied should be
+  the same on all the nodes.</li>
+  <li>Note that compiling Hadoop builds HOD with appropriate permissions 
+  set on all the required script files in HOD.</li>
+</ul>
+</section>
+
+<section>
+<title>Configuring HOD</title>
+
+<p>After HOD installation is done, it has to be configured before we start 
using
+it.</p>
+<section>
+  <title>Minimal Configuration to get started</title>
+<ul>
+ <li>On the node from where you want to run hod, edit the file hodrc
+  which can be found in the &lt;install dir&gt;/conf directory. This file
+  contains the minimal set of values required for running hod.</li>
+ <li>
+<p>Specify values suitable to your environment for the following
+  variables defined in the configuration file. Note that some of these
+  variables are defined at more than one place in the file.</p>
+
+  <ul>
+   <li>${JAVA_HOME}: Location of Java for Hadoop. Hadoop supports Sun JDK
+    1.5.x and above.</li>
+   <li>${CLUSTER_NAME}: Name of the cluster which is specified in the
+    'node property' as mentioned in resource manager configuration.</li>
+   <li>${HADOOP_HOME}: Location of Hadoop installation on the compute and
+    submit nodes.</li>
+   <li>${RM_QUEUE}: Queue configured for submitting jobs in the resource
+    manager configuration.</li>
+   <li>${RM_HOME}: Location of the resource manager installation on the
+    compute and submit nodes.</li>
+    </ul>
+</li>
+
+<li>
+<p>The following environment variables *may* need to be set depending on
+  your environment. These variables must be defined where you run the
+  HOD client, and also be specified in the HOD configuration file as the
+  value of the key resource_manager.env-vars. Multiple variables can be
+  specified as a comma-separated list of key=value pairs.</p>
+
+  <ul>
+   <li>HOD_PYTHON_HOME: If you install python to a non-default location
+    on the compute nodes or submit nodes, then this variable must be
+    defined to point to the python executable in the non-standard
+    location.</li>
+    </ul>
+</li>
+</ul>
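To make the resource_manager.env-vars format concrete, here is a small illustrative sketch (not HOD's actual code; the paths are examples) of how a comma-separated list of key=value pairs can be split into environment variables:

```python
# Hypothetical helper: split an env-vars value such as
# "HOD_PYTHON_HOME=/opt/python/bin/python,FOO=bar" into a dict of
# environment variables to pass to jobs on the compute nodes.
def parse_env_vars(value):
    env = {}
    for pair in value.split(","):
        name, _, val = pair.strip().partition("=")
        if name:
            env[name] = val
    return env

env = parse_env_vars("HOD_PYTHON_HOME=/opt/python/bin/python,FOO=bar")
```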
+</section>
+
+  <section>
+    <title>Advanced Configuration</title>
+    <p> You can review other configuration options in the file and modify them 
to suit
+ your needs. Refer to the <a href="hod_config_guide.html">Configuration 
Guide</a> for information about the HOD
+ configuration.
+    </p>
+  </section>
+</section>
+
+  <section>
+    <title>Running HOD</title>
+    <p>You can now proceed to the <a href="hod_user_guide.html">HOD User 
Guide</a> for information on how to run HOD,
+    its various features and options, and for help in 
troubleshooting.</p>
+  </section>
+</body>
+</document>

Added: 
hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml
URL: 
http://svn.apache.org/viewvc/hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml?rev=629361&view=auto
==============================================================================
--- 
hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml 
(added)
+++ 
hadoop/core/trunk/src/docs/src/documentation/content/xdocs/hod_config_guide.xml 
Tue Feb 19 21:17:48 2008
@@ -0,0 +1,207 @@
+<?xml version="1.0"?>
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
+          "http://forrest.apache.org/dtd/document-v20.dtd";>
+
+
+<document>
+
+  <header>
+    <title> 
+      Hadoop On Demand: Configuration Guide
+    </title>
+  </header>
+
+  <body>
+    <section>
+      <title>1. Introduction</title>
+    
+      <p>Configuration options for HOD are organized as sections and options 
+      within them. They can be specified in two ways: a configuration file 
+      in the INI format, and as command line options to the HOD shell, 
+      specified in the format --section.option[=value]. If the same option is 
+      specified in both places, the value specified on the command line 
+      overrides the value in the configuration file.</p>
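The precedence rule can be illustrated with a small sketch using Python's standard configparser module. This is not HOD's actual parsing code; the section and option names are just examples:

```python
# Sketch of HOD-style option resolution: values from an INI-format file,
# overridden by command-line options of the form --section.option[=value].
import configparser

def resolve_options(ini_text, cli_args):
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    options = {s: dict(parser[s]) for s in parser.sections()}
    for arg in cli_args:
        if arg.startswith("--") and "." in arg:
            key, _, value = arg[2:].partition("=")
            section, _, option = key.partition(".")
            # Command line wins over the configuration file.
            options.setdefault(section, {})[option] = value or "true"
    return options

ini = """
[hod]
cluster = dev

[resource_manager]
queue = batch
"""
merged = resolve_options(ini, ["--hod.cluster=prod"])
```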
+      
+      <p>
+        To get a simple description of all configuration options, you can type
+      </p>
+      <table><tr><td><code>$ hod --verbose-help</code></td></tr></table>
+      
+      <p>This document explains some of the most important or commonly used
+      configuration options in some more detail.</p>
+    </section>
+    
+    <section>
+      <title>2. Sections</title>
+    
+      <p>The following are the various sections in the HOD configuration:</p>
+      
+      <ul>
+        <li>  hod:                  Options for the HOD client</li>
+        <li>  resource_manager:     Options for specifying which resource 
manager
+         to use, and other parameters for using that resource manager</li>
+        <li>  ringmaster:           Options for the RingMaster process</li>
+        <li>  hodring:              Options for the HodRing processes</li>
+        <li>  gridservice-mapred:   Options for the MapReduce daemons</li>
+        <li>  gridservice-hdfs:     Options for the HDFS daemons.</li>
+      </ul>
+    
+      
+      <p>The next section deals with some of the important options in the HOD 
+        configuration.</p>
+    </section>
+    
+    <section>
+      <title>3. Important / Commonly Used Configuration Options</title>
+  
+  
+      <section> 
+        <title>3.1 Common configuration options</title>
+        
+        <p>Certain configuration options are defined in most of the sections 
of 
+        the HOD configuration. Options defined in a section are used by the
+        process to which that section applies. These options have the same
+        meaning, but can have different values in each section.
+        </p>
+        
+        <ul>
+          <li>temp-dir: Temporary directory for usage by the HOD processes. 
Make 
+                      sure that the users who will run hod have rights to 
create 
+                      directories under the directory specified here.</li>
+          
+          <li>debug: A numeric value from 1-4. 4 produces the most log 
information,
+                   and 1 the least.</li>
+          
+          <li>log-dir: Directory where log files are stored. By default, this 
is
+                     &lt;install-location&gt;/logs/. The restrictions and 
notes for the
+                     temp-dir variable apply here too.
+          </li>
+          
+          <li>xrs-port-range: A range of ports, among which an available port 
shall
+                            be picked for use to run an XML-RPC server.</li>
+          
+          <li>http-port-range: A range of ports, among which an available port 
shall
+                             be picked for use to run an HTTP server.</li>
+          
+          <li>java-home: Location of Java to be used by Hadoop.</li>
+        </ul>
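The port-range options can be illustrated with a short sketch (again, not HOD's code; the range is an example) that scans a "low-high" range and binds the first free port:

```python
# Illustrative sketch of how an option such as xrs-port-range could be
# used: scan the range and bind the first free port for a server.
import socket

def pick_free_port(port_range):
    low, high = (int(p) for p in port_range.split("-"))
    for port in range(low, high + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("127.0.0.1", port))
            except OSError:
                continue  # port in use; try the next one
            # Socket closes on return; a real server would keep it bound.
            return port
    raise RuntimeError("no free port in range %s" % port_range)

port = pick_free_port("8030-8050")
```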
+      </section>
+      
+      <section>
+        <title>3.2 hod options</title>
+        
+        <ul>
+          <li>cluster: A descriptive name given to the cluster. For Torque, 
this is
+                     specified as a 'Node property' for every node in the 
cluster.
+                     HOD uses this value to compute the number of available 
nodes.</li>
+          
+          <li>client-params: A comma-separated list of hadoop config parameters
+                           specified as key-value pairs. These will be used to
+                           generate a hadoop-site.xml on the submit node that 
+                           should be used for running MapReduce jobs.</li>
+         </ul>
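As an illustration of what client-params feeds into, here is a minimal sketch (not HOD's actual generator; parameter names are examples) of turning such a list into a hadoop-site.xml document:

```python
# Sketch: render a client-params style string, e.g.
# "mapred.reduce.tasks=3,io.sort.mb=100", as a minimal hadoop-site.xml.
def hadoop_site_xml(params):
    props = []
    for pair in params.split(","):
        name, _, value = pair.partition("=")
        props.append(
            "  <property>\n    <name>%s</name>\n    <value>%s</value>\n"
            "  </property>" % (name.strip(), value.strip()))
    return "<configuration>\n%s\n</configuration>" % "\n".join(props)

xml = hadoop_site_xml("mapred.reduce.tasks=3,io.sort.mb=100")
```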
+      </section>
+      
+      <section>
+        <title>3.3 resource_manager options</title>
+      
+        <ul>
+          <li>queue: Name of the queue configured in the resource manager to 
which
+                   jobs are to be submitted.</li>
+          
+          <li>batch-home: Install directory to which 'bin' is appended and 
under 
+                        which the executables of the resource manager can be 
+                        found.</li> 
+          
+          <li>env-vars: This is a comma-separated list of key-value pairs, 
+                      expressed as key=value, which would be passed to the 
jobs 
+                      launched on the compute nodes. 
+                      For example, if the python installation is 
+                      in a non-standard location, one can set the environment
+                      variable 'HOD_PYTHON_HOME' to the path to the python 
+                      executable. The HOD processes launched on the compute 
nodes
+                      can then use this variable.</li>
+        </ul>
+      </section>
+      
+      <section>
+        <title>3.4 ringmaster options</title>
+        
+        <ul>
+          <li>work-dirs: This is a comma-separated list of paths that will 
serve
+                       as the root for directories that HOD generates and 
passes
+                       to Hadoop for use to store DFS / MapReduce data. For 
example,
+                       this is where DFS data blocks will be stored. Typically,
+                       as many paths are specified as there are disks available
+                       to ensure all disks are being utilized. The restrictions
+                       and notes for the temp-dir variable apply here too.</li>
+        </ul>
+      </section>
+      
+      <section>
+        <title>3.5 gridservice-hdfs options</title>
+        
+        <ul>
+          <li>external: If false, this indicates that an HDFS cluster must be 
+                      brought up by the HOD system, on the nodes which it 
+                      allocates via the allocate command. Note that in that 
case,
+                      when the cluster is de-allocated, it will bring down the 
+                      HDFS cluster, and all the data will be lost.
+                      If true, it will try to connect to an externally 
configured
+                      HDFS system.
+                      Typically, because input for jobs is placed into HDFS
+                      before jobs are run, and the output from jobs also 
needs 
+                      to persist in HDFS, an internal HDFS cluster is 
+                      of little value in a production system. However, it 
allows 
+                      for quick testing.</li>
+          
+          <li>host: Hostname of the externally configured NameNode, if any.</li>
+          
+          <li>fs_port: Port to which NameNode RPC server is bound.</li>
+          
+          <li>info_port: Port to which the NameNode web UI server is 
bound.</li>
+          
+          <li>pkgs: Installation directory, under which the bin/hadoop 
executable is 
+                  located. This can be used to point HOD at a pre-installed 
+                  version of Hadoop on the cluster.</li>
+          
+          <li>server-params: A comma-separated list of hadoop config parameters
+                           specified as key-value pairs. These will be used to
+                           generate a hadoop-site.xml that will be used by the
+                           NameNode and DataNodes.</li>
+          
+          <li>final-server-params: Same as above, except they will be marked 
final.</li>
+        </ul>
+      </section>
+      
+      <section>
+        <title>3.6 gridservice-mapred options</title>
+        
+        <ul>
+          <li>external: If false, this indicates that a MapReduce cluster must 
be
+                      brought up by the HOD system on the nodes which it 
allocates
+                      via the allocate command.
+                      If true, it will try to connect to an externally 
+                      configured MapReduce system.</li>
+          
+          <li>host: Hostname of the externally configured JobTracker, if 
any.</li>
+          
+          <li>tracker_port: Port to which the JobTracker RPC server is 
bound.</li>
+          
+          <li>info_port: Port to which the JobTracker web UI server is 
bound.</li>
+          
+          <li>pkgs: Installation directory, under which the bin/hadoop 
executable 
+                  is located.</li>
+          
+          <li>server-params: A comma-separated list of hadoop config parameters
+                           specified as key-value pairs. These will be used to
+                           generate a hadoop-site.xml that will be used by the
+                           JobTracker and TaskTrackers.</li>
+          
+          <li>final-server-params: Same as above, except they will be marked 
final.</li>
+        </ul>
+      </section>
+    </section>
+  </body>
+</document>

