> Has anyone successfully tried running Hadoop on two systems?
Of course! We have Hadoop running on clusters of 900 nodes. 

> On the master node, I have a user name called "jaya"... Is it necessary to
> create a user named "jaya" on the slave system also, or can we simply use
> the user name that already exists on the slave machine?
You should ideally run Hadoop as the same user on all machines in the
cluster. The shell scripts for starting/stopping the Hadoop daemons use ssh to
connect to the machines listed in the slaves file. Although you can probably
work around that, I would recommend that you use the same user everywhere.
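To see why, here is roughly what the start scripts do for each entry in
conf/slaves (a simplified sketch, not the exact contents of bin/slaves.sh --
the real script also forwards ssh options, HADOOP_CONF_DIR, and so on):

    # simplified sketch of how start-all.sh reaches the slave machines
    for slave in $(cat "${HADOOP_CONF_DIR}/slaves"); do
      # the remote command runs as the same user that ran start-all.sh on the
      # master, so that account (and the Hadoop install path) must exist on
      # the slave as well
      ssh "$slave" "cd /opt/hadoop-0.11.0 && bin/hadoop-daemon.sh start datanode" &
    done
    wait

That remote command is executed under whatever account invoked start-all.sh,
which is why having the same user (with passwordless ssh) everywhere keeps
things simple.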

From the log messages, it looks like the host 10.229.62.6 could not
communicate with the other host in order to start the Hadoop daemons. Please
address that issue first.
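For example, from the master (10.229.62.6) you could run a couple of quick
checks against the slave address that appears in your log:

    # assuming the slave is 10.229.62.56, as in the errors below
    ping -c 3 10.229.62.56      # "No route to host" is usually a network or
                                # firewall/routing problem, not a Hadoop problem
    ssh 10.229.62.56 hostname   # passwordless ssh as the same user must succeed
                                # before start-all.sh can launch the remote daemons

Once both of those work, rerun bin/start-all.sh and check that DataNode and
TaskTracker logs appear on the slave.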

> -----Original Message-----
> From: jaylac [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 02, 2007 1:17 PM
> To: [email protected]
> Subject: Detailed steps to run Hadoop in distributed system...
> 
> 
> Hi Hadoop-Users.....
> 
> Has anyone successfully tried running Hadoop on two systems?
> 
> I've tried running the wordcount example on one system... It works fine...
> But when I try to add nodes to the cluster and run the wordcount example, I
> get errors...
> 
> So please let me know the detailed steps to be followed...
> 
> Though the steps are given on the Hadoop website, I need some help from you
> people...
> 
> They might have considered some steps obvious and not stated them on the
> website...
> 
> I'm a new user... So I simply followed the instructions given... I might have
> overlooked some steps that are necessary to run it...
> 
> Another important question...
> 
> On the master node, I have a user name called "jaya"... Is it necessary to
> create a user named "jaya" on the slave system also, or can we simply use
> the user name that already exists on the slave machine?
> 
> 
> 
> I'm using two Red Hat Linux machines... one master (10.229.62.6) and the
> other slave (10.229.62.56)
> On the master node, the user name is jaya
> On the slave node, the user name is 146736
> 
> The steps which I follow are...
> 
> Edit the /home/jaya/.bashrc file
>           Here I'll set the HADOOP_CONF_DIR environment variable
> 
> MASTER NODE
> 
> 1. Edit conf/slaves file....
>         Contents
>         ====================
>          localhost
>           [EMAIL PROTECTED]
>          ====================
> 
> 2. Edit the conf/hadoop-env.sh file
>          Here I'll set the JAVA_HOME environment variable
>          That's it... No other changes in this file...
>          PLEASE LET ME KNOW IF I SHOULD ADD ANYTHING HERE
> 
> 3. Edit conf/hadoop-site.xml file
>        Contents
>         ===========================================
>          <?xml version="1.0"?>
>          <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> 
>          <!-- Put site-specific property overrides in this file. -->
> 
>          <configuration>
> 
>          <property>
>          <name>fs.default.name</name>
>          <value>10.229.62.6:50010</value>
>          </property>
> 
>          <property>
>          <name>mapred.job.tracker</name>
>          <value>10.229.62.6:50011</value>
>          </property>
> 
>          <property>
>          <name>dfs.replication</name>
>          <value>2</value>
>          </property>
> 
>          </configuration>
>          ====================================
> 
>          LET ME KNOW IF I NEED TO ADD ANYTHING HERE....
> 
> SLAVE NODE
> 
> 1. Edit conf/masters file....
>         Contents
>         ====================
>          localhost
>           [EMAIL PROTECTED]
>          ====================
> 
> 2. Edit the conf/hadoop-env.sh file
>          Here I'll set the JAVA_HOME environment variable
>          That's it... No other changes in this file...
>          PLEASE LET ME KNOW IF I SHOULD ADD ANYTHING HERE
> 
> 3. Edit conf/hadoop-site.xml file
>        Contents
>         ===========================================
>          <?xml version="1.0"?>
>          <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> 
>          <!-- Put site-specific property overrides in this file. -->
> 
>          <configuration>
> 
>          <property>
>          <name>fs.default.name</name>
>          <value>10.229.62.6:50010</value>
>          </property>
> 
>          <property>
>          <name>mapred.job.tracker</name>
>          <value>10.229.62.6:50011</value>
>          </property>
> 
>          <property>
>          <name>dfs.replication</name>
>          <value>2</value>
>          </property>
> 
>          </configuration>
>          ====================================
> 
>          LET ME KNOW IF I NEED TO ADD ANYTHING HERE....
> 
> I've already done the steps for passwordless ssh login
> 
> That's all... Then I'll perform the following operations...
> 
> In the HADOOP_HOME directory,
> 
> [EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop namenode -format
> Re-format filesystem in /tmp/hadoop-146736/dfs/name ? (Y or N) Y
> Formatted /tmp/hadoop-146736/dfs/name
> [EMAIL PROTECTED] hadoop-0.11.0]$
> 
> Then
> 
> [EMAIL PROTECTED] hadoop-0.11.0]$ bin/start-all.sh
> starting namenode, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-namenode-localhost.localdomain.out
> localhost: starting datanode, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-datanode-localhost.localdomain.out
> [EMAIL PROTECTED]: ssh: connect to host 10.229.62.56 port 22: No route to host
> localhost: starting secondarynamenode, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-secondarynamenode-localhost.localdomain.out
> starting jobtracker, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-jobtracker-localhost.localdomain.out
> localhost: starting tasktracker, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-tasktracker-localhost.localdomain.out
> [EMAIL PROTECTED]: ssh: connect to host 10.229.62.56 port 22: No route to host
> [EMAIL PROTECTED] hadoop-0.11.0]$
> 
> [EMAIL PROTECTED] hadoop-0.11.0]$ mkdir input
> [EMAIL PROTECTED] hadoop-0.11.0]$ cp conf/*.xml input
> [EMAIL PROTECTED] hadoop-0.11.0]$
> 
> [EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -put input input
> [EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -lsr /
> /tmp    <dir>
> /tmp/hadoop-jaya        <dir>
> /tmp/hadoop-jaya/mapred <dir>
> /tmp/hadoop-jaya/mapred/system  <dir>
> /user   <dir>
> /user/jaya      <dir>
> /user/jaya/input        <dir>
> /user/jaya/input/hadoop-default.xml     <r 2>   21708
> /user/jaya/input/hadoop-site.xml        <r 2>   1333
> /user/jaya/input/mapred-default.xml     <r 2>   180
> [EMAIL PROTECTED] hadoop-0.11.0]$
> 
> 
> 
> [EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -ls input
> Found 3 items
> /user/jaya/input/hadoop-default.xml     <r 2>   21708
> /user/jaya/input/hadoop-site.xml        <r 2>   1333
> /user/jaya/input/mapred-default.xml     <r 2>   180
> [EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -ls output
> Found 0 items
> [EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop jar hadoop-0.11.0-examples.jar wordcount input output
> java.net.SocketTimeoutException: timed out waiting for rpc response
>         at org.apache.hadoop.ipc.Client.call(Client.java:469)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:164)
>         at $Proxy1.getProtocolVersion(Unknown Source)
>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:248)
>         at org.apache.hadoop.mapred.JobClient.init(JobClient.java:200)
>         at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:192)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:381)
>         at org.apache.hadoop.examples.WordCount.main(WordCount.java:143)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
>         at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:40)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
> [EMAIL PROTECTED] hadoop-0.11.0]$
> 
> 
> I don't know where the problem is...
> 
> I have not created any directory called output... If we need to create one
> at all, where should we create it?
> Should I configure some more settings? Please explain in detail...
> 
> Please do help me.....
> 
> Thanks in advance
> Jaya
> --
> View this message in context: http://www.nabble.com/Detailed-steps-to-run-Hadoop-in-distributed-system...-tf3332250.html#a9265480
> Sent from the Hadoop Users mailing list archive at Nabble.com.

