Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by ChristopheTaton:
http://wiki.apache.org/hadoop/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

------------------------------------------------------------------------------

= What we want to do =
- In this short tutorial, I will describe the required steps for setting up a single-node [http://lucene.apache.org/hadoop/ Hadoop] cluster using the [http://lucene.apache.org/hadoop/hdfs_design.html Hadoop Distributed File System (HDFS)] on [http://www.ubuntu.com/ Ubuntu Linux].
+ In this short tutorial, I will describe the required steps for setting up a single-node [http://hadoop.apache.org/core/ Hadoop] cluster using the [http://hadoop.apache.org/core/hdfs_design.html Hadoop Distributed File System (HDFS)] on [http://www.ubuntu.com/ Ubuntu Linux].

- [http://lucene.apache.org/hadoop/ Hadoop] is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the [http://en.wikipedia.org/wiki/Google_File_System Google File System] and of [http://en.wikipedia.org/wiki/MapReduce MapReduce]. [http://lucene.apache.org/hadoop/hdfs_design.html HDFS] is a highly fault-tolerant distributed file system and like Hadoop designed to be deployed on low-cost hardware. It provides high throughput access to application data and is suitable for applications that have large data sets.
+ [http://hadoop.apache.org/core/ Hadoop] is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the [http://en.wikipedia.org/wiki/Google_File_System Google File System] and of [http://en.wikipedia.org/wiki/MapReduce MapReduce]. [http://hadoop.apache.org/core/hdfs_design.html HDFS] is a highly fault-tolerant distributed file system and like Hadoop designed to be deployed on low-cost hardware.
It provides high throughput access to application data and is suitable for applications that have large data sets.

The main goal of this tutorial is to get a ''simple'' Hadoop installation up and running so that you can play around with the software and learn more about it. This tutorial has been tested with the following software versions:

 * [http://www.ubuntu.com/ Ubuntu Linux] 7.10, 7.04
- * [http://lucene.apache.org/hadoop/ Hadoop] 0.14.2, released October 2007 (also works with 0.13.0)
+ * [http://hadoop.apache.org/core/ Hadoop] 0.14.2, released October 2007 (also works with 0.13.0)

You can find the time of the last document update at the very bottom of this page.

@@ -161, +161 @@

=== hadoop-site.xml ===
- Any site-specific configuration of Hadoop is configured in {{{<HADOOP_INSTALL>/conf/hadoop-site.xml}}}. Here we will configure the directory where Hadoop will store its data files, the ports it listens to, etc. Our setup will use Hadoop's Distributed File System, [http://lucene.apache.org/hadoop/hdfs_design.html HDFS], even though our little "cluster" only contains our single local machine.
+ Any site-specific configuration of Hadoop is configured in {{{<HADOOP_INSTALL>/conf/hadoop-site.xml}}}. Here we will configure the directory where Hadoop will store its data files, the ports it listens to, etc. Our setup will use Hadoop's Distributed File System, [http://hadoop.apache.org/core/hdfs_design.html HDFS], even though our little "cluster" only contains our single local machine.

You can leave the settings below as is, with the exception of the {{{hadoop.tmp.dir}}} variable, which you have to change to the directory of your choice, for example {{{/usr/local/hadoop-datastore/hadoop-${user.name}}}}. Hadoop will expand {{{${user.name}}}} to the system user which is running Hadoop, so in our case this will be {{{hadoop}}} and thus the final path will be {{{/usr/local/hadoop-datastore/hadoop-hadoop}}}.
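The {{{hadoop.tmp.dir}}} override described above can be sketched as a {{{hadoop-site.xml}}} fragment. This is a sketch only, not the page's full listing (which sits outside this diff hunk); the {{{fs.default.name}}} and {{{mapred.job.tracker}}} host/port values are the conventional single-node settings of the Hadoop 0.13/0.14 era and are assumptions here:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- <HADOOP_INSTALL>/conf/hadoop-site.xml: site-specific overrides.
     A sketch; the port numbers below are assumed, not quoted from the diff. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-datastore/hadoop-${user.name}</value>
    <description>Base for Hadoop's data directories; ${user.name}
    expands to the system user running Hadoop.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>localhost:54310</value>
    <description>Host and port of the name node (assumed value).</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>Host and port of the job tracker (assumed value).</description>
  </property>
</configuration>
```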
@@ -212, +212 @@

 </configuration>
}}}

- See GettingStartedWithHadoop and the documentation in [http://lucene.apache.org/hadoop/api/overview-summary.html Hadoop's API Overview] if you have any questions about Hadoop's configuration options.
+ See GettingStartedWithHadoop and the documentation in [http://hadoop.apache.org/core/api/overview-summary.html Hadoop's API Overview] if you have any questions about Hadoop's configuration options.

== Formatting the name node ==

@@ -352, +352 @@

=== Copy local example data to HDFS ===
- Before we run the actual MapReduce job, we first have to copy the files from our local file system to Hadoop's [http://lucene.apache.org/hadoop/hdfs_design.html HDFS]. See ImportantConcepts for more information about this step.
+ Before we run the actual MapReduce job, we first have to copy the files from our local file system to Hadoop's [http://hadoop.apache.org/core/hdfs_design.html HDFS]. See ImportantConcepts for more information about this step.

{{{
[EMAIL PROTECTED]:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg
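The name-node formatting and HDFS copy steps touched by these hunks can be sketched as a short command transcript. A sketch under stated assumptions: it presumes the tutorial's {{{hadoop}}} user, an install in {{{/usr/local/hadoop}}}, and the {{{/tmp/gutenberg}}} sample data named in the hunk; the commands use the 0.14-era {{{bin/hadoop}}} CLI, which requires a running Hadoop installation.

```
# Sketch, assuming a Hadoop 0.14-era install in /usr/local/hadoop,
# run as the dedicated "hadoop" user.
cd /usr/local/hadoop

# One-time step: format the HDFS name node.
# Warning: this erases any data already stored in HDFS.
bin/hadoop namenode -format

# Copy the local example data into HDFS...
bin/hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg

# ...and verify that it arrived.
bin/hadoop dfs -ls
```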
