Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by ChristopheTaton:
http://wiki.apache.org/hadoop/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

------------------------------------------------------------------------------

= What we want to do =
- In this short tutorial, I will describe the required steps for setting up a single-node [http://lucene.apache.org/hadoop/ Hadoop] cluster using the [http://lucene.apache.org/hadoop/hdfs_design.html Hadoop Distributed File System (HDFS)] on [http://www.ubuntu.com/ Ubuntu Linux].
+ In this short tutorial, I will describe the required steps for setting up a single-node [http://hadoop.apache.org/core/ Hadoop] cluster using the [http://hadoop.apache.org/core/hdfs_design.html Hadoop Distributed File System (HDFS)] on [http://www.ubuntu.com/ Ubuntu Linux].

- [http://lucene.apache.org/hadoop/ Hadoop] is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the [http://en.wikipedia.org/wiki/Google_File_System Google File System] and of [http://en.wikipedia.org/wiki/MapReduce MapReduce]. [http://lucene.apache.org/hadoop/hdfs_design.html HDFS] is a highly fault-tolerant distributed file system and like Hadoop designed to be deployed on low-cost hardware. It provides high throughput access to application data and is suitable for applications that have large data sets.
+ [http://hadoop.apache.org/core/ Hadoop] is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the [http://en.wikipedia.org/wiki/Google_File_System Google File System] and of [http://en.wikipedia.org/wiki/MapReduce MapReduce]. [http://hadoop.apache.org/core/hdfs_design.html HDFS] is a highly fault-tolerant distributed file system and like Hadoop designed to be deployed on low-cost hardware.
It provides high throughput access to application data and is suitable for applications that have large data sets.

The main goal of this tutorial is to get a ''simple'' Hadoop installation up and running so that you can play around with the software and learn more about it. This tutorial has been tested with the following software versions:

 * [http://www.ubuntu.com/ Ubuntu Linux] 7.10, 7.04
- * [http://lucene.apache.org/hadoop/ Hadoop] 0.14.2, released October 2007 (also works with 0.13.0)
+ * [http://hadoop.apache.org/core/ Hadoop] 0.14.2, released October 2007 (also works with 0.13.0)

You can find the time of the last document update at the very bottom of this page.

@@ -161, +161 @@

=== hadoop-site.xml ===
- Any site-specific configuration of Hadoop is configured in {{{<HADOOP_INSTALL>/conf/hadoop-site.xml}}}. Here we will configure the directory where Hadoop will store its data files, the ports it listens to, etc. Our setup will use Hadoop's Distributed File System, [http://lucene.apache.org/hadoop/hdfs_design.html HDFS], even though our little "cluster" only contains our single local machine.
+ Any site-specific configuration of Hadoop is configured in {{{<HADOOP_INSTALL>/conf/hadoop-site.xml}}}. Here we will configure the directory where Hadoop will store its data files, the ports it listens to, etc. Our setup will use Hadoop's Distributed File System, [http://hadoop.apache.org/core/hdfs_design.html HDFS], even though our little "cluster" only contains our single local machine.

You can leave the settings below as is, with the exception of the {{{hadoop.tmp.dir}}} variable, which you have to change to the directory of your choice, for example {{{/usr/local/hadoop-datastore/hadoop-${user.name}}}}. Hadoop will expand {{{${user.name}}}} to the system user which is running Hadoop, so in our case this will be {{{hadoop}}} and thus the final path will be {{{/usr/local/hadoop-datastore/hadoop-hadoop}}}.
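The {{{hadoop.tmp.dir}}} override described above can be sketched as a {{{hadoop-site.xml}}} fragment. This is a sketch only, not the page's full listing (which sits outside this diff hunk); the {{{fs.default.name}}} and {{{mapred.job.tracker}}} host/port values are the conventional single-node settings of the Hadoop 0.13/0.14 era and are assumptions here:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- <HADOOP_INSTALL>/conf/hadoop-site.xml: site-specific overrides.
     A sketch; the port numbers below are assumed, not quoted from the diff. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-datastore/hadoop-${user.name}</value>
    <description>Base for Hadoop's data directories; ${user.name}
    expands to the system user running Hadoop.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>localhost:54310</value>
    <description>Host and port of the name node (assumed value).</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>Host and port of the job tracker (assumed value).</description>
  </property>
</configuration>
```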
@@ -212, +212 @@

 </configuration>
}}}

- See GettingStartedWithHadoop and the documentation in [http://lucene.apache.org/hadoop/api/overview-summary.html Hadoop's API Overview] if you have any questions about Hadoop's configuration options.
+ See GettingStartedWithHadoop and the documentation in [http://hadoop.apache.org/core/api/overview-summary.html Hadoop's API Overview] if you have any questions about Hadoop's configuration options.

== Formatting the name node ==

@@ -352, +352 @@

=== Copy local example data to HDFS ===
- Before we run the actual MapReduce job, we first have to copy the files from our local file system to Hadoop's [http://lucene.apache.org/hadoop/hdfs_design.html HDFS]. See ImportantConcepts for more information about this step.
+ Before we run the actual MapReduce job, we first have to copy the files from our local file system to Hadoop's [http://hadoop.apache.org/core/hdfs_design.html HDFS]. See ImportantConcepts for more information about this step.

{{{
[EMAIL PROTECTED]:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg
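The name-node formatting and HDFS copy steps touched by these hunks can be sketched as a short command transcript. A sketch under stated assumptions: it presumes the tutorial's {{{hadoop}}} user, an install in {{{/usr/local/hadoop}}}, and the {{{/tmp/gutenberg}}} sample data named in the hunk; the commands use the 0.14-era {{{bin/hadoop}}} CLI, which requires a running Hadoop installation.

```
# Sketch, assuming a Hadoop 0.14-era install in /usr/local/hadoop,
# run as the dedicated "hadoop" user.
cd /usr/local/hadoop

# One-time step: format the HDFS name node.
# Warning: this erases any data already stored in HDFS.
bin/hadoop namenode -format

# Copy the local example data into HDFS...
bin/hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg

# ...and verify that it arrived.
bin/hadoop dfs -ls
```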
