Your hadoop-site.xml is missing a "<" in the last line, but it looks like
you're having a network problem.  Can you ping the machines from each other?
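Something like this, run from the master, usually narrows it down (a sketch; substitute your actual slave address and account):

    # basic reachability of the slave from the master
    ping -c 3 10.229.62.56
    # can you reach sshd at all?  "No route to host" on port 22 is often
    # a host firewall rather than a Hadoop problem
    ssh 146736@10.229.62.56 'echo ok'
    # on Red Hat, check the firewall on the slave (run as root there)
    /sbin/service iptables status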

On 3/1/07, jaylac <[EMAIL PROTECTED]> wrote:


Hi Hadoop users,

Has anyone successfully run Hadoop across two systems?

I've tried running the wordcount example on one system, and it works fine.
But when I try to add nodes to the cluster and run the wordcount example, I
get errors.

So please let me know the detailed steps to be followed.

Though the steps are given on the Hadoop website, I need some help from you
people: they might have considered some steps obvious and not spelled them
out on the website.

I'm a new user, so I simply followed the instructions given; I might have
overlooked some step that is necessary to run it.

Another important question: on the master node I have a user named "jaya".
Is it necessary to create a user named "jaya" on the slave system as well,
or can we simply use the user name that already exists on the slave machine?



I'm using two Red Hat Linux machines: one master (10.229.62.6) and one
slave (10.229.62.56).
On the master node the user name is jaya.
On the slave node the user name is 146736.

The steps I follow are:

Edit the /home/jaya/.bashrc file
          Here I set the HADOOP_CONF_DIR environment variable (see the
          sketch below).
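          For example, the line I add looks something like this (assuming
          Hadoop lives under /opt/hadoop-0.11.0, as in the logs further
          down):

          # in /home/jaya/.bashrc: tell Hadoop where its conf directory is
          export HADOOP_CONF_DIR=/opt/hadoop-0.11.0/conf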

MASTER NODE

1. Edit the conf/slaves file
        Contents
        ====================
         localhost
          [EMAIL PROTECTED]
         ====================

2. Edit the conf/hadoop-env.sh file
         Here I set the JAVA_HOME environment variable; that's it, no other
         changes in this file (a sketch follows).
         PLEASE LET ME KNOW IF I SHOULD ADD ANYTHING HERE
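         For reference, the only line I change there is something like this
         (the JDK path below is just an example; use wherever your JDK is
         actually installed):

         # in conf/hadoop-env.sh: point Hadoop at the JDK
         export JAVA_HOME=/usr/java/jdk1.6.0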

3. Edit the conf/hadoop-site.xml file
       Contents
        ===========================================
         <?xml version="1.0"?>
         <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

         <!-- Put site-specific property overrides in this file. -->

         <configuration>

         <property>
         <name>fs.default.name</name>
         <value>10.229.62.6:50010</value>
         </property>

         <property>
         <name>mapred.job.tracker</name>
         <value>10.229.62.6:50011</value>
         </property>

         <property>
         <name>dfs.replication</name>
         <value>2</value>
         </property>

         /configuration>
         ====================================

         LET ME KNOW IF I NEED TO ADD ANYTHING HERE.
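         (One quick sanity check for this file is to run it through an XML
         parser; xmllint ships with libxml2 on Red Hat and will flag a
         mistyped or unclosed tag:)

         xmllint --noout conf/hadoop-site.xml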

SLAVE NODE

1. Edit the conf/masters file
        Contents
        ====================
         localhost
          [EMAIL PROTECTED]
         ====================

2. Edit the conf/hadoop-env.sh file
         Here I set the JAVA_HOME environment variable, the same as on the
         master; no other changes in this file.
         PLEASE LET ME KNOW IF I SHOULD ADD ANYTHING HERE

3. Edit the conf/hadoop-site.xml file
       Contents
        ===========================================
         <?xml version="1.0"?>
         <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

         <!-- Put site-specific property overrides in this file. -->

         <configuration>

         <property>
         <name>fs.default.name</name>
         <value>10.229.62.6:50010</value>
         </property>

         <property>
         <name>mapred.job.tracker</name>
         <value>10.229.62.6:50011</value>
         </property>

         <property>
         <name>dfs.replication</name>
         <value>2</value>
         </property>

         /configuration>
         ====================================

         LET ME KNOW IF I NEED TO ADD ANYTHING HERE.

I've already done the steps for passwordless SSH login.
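For reference, what I did was roughly the usual key setup (a sketch; the
account name and address are the ones from my machines):

    # on the master, as jaya: generate a key pair (empty passphrase)
    ssh-keygen -t rsa
    # install the public key into the slave account's authorized_keys
    # (or append ~/.ssh/id_rsa.pub there by hand if ssh-copy-id is missing)
    ssh-copy-id 146736@10.229.62.56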

That's all. Then I perform the following operations.

In the HADOOP_HOME directory,

[EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop namenode -format
Re-format filesystem in /tmp/hadoop-146736/dfs/name ? (Y or N) Y
Formatted /tmp/hadoop-146736/dfs/name
[EMAIL PROTECTED] hadoop-0.11.0]$

Then

[EMAIL PROTECTED] hadoop-0.11.0]$ bin/start-all.sh
starting namenode, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-datanode-localhost.localdomain.out
[EMAIL PROTECTED]: ssh: connect to host 10.229.62.56 port 22: No route to host
localhost: starting secondarynamenode, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-tasktracker-localhost.localdomain.out
[EMAIL PROTECTED]: ssh: connect to host 10.229.62.56 port 22: No route to host
[EMAIL PROTECTED] hadoop-0.11.0]$
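To see which daemons actually came up after this, something like the
following helps (jps ships with the JDK under $JAVA_HOME/bin and lists
running Java processes):

    # on the master: expect NameNode, SecondaryNameNode, JobTracker,
    # plus the localhost DataNode and TaskTracker
    jps
    # on the slave (once ssh works): expect DataNode and TaskTracker
    ssh 146736@10.229.62.56 jps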

[EMAIL PROTECTED] hadoop-0.11.0]$ mkdir input
[EMAIL PROTECTED] hadoop-0.11.0]$ cp conf/*.xml input
[EMAIL PROTECTED] hadoop-0.11.0]$

[EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -put input input
[EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -lsr /
/tmp    <dir>
/tmp/hadoop-jaya        <dir>
/tmp/hadoop-jaya/mapred <dir>
/tmp/hadoop-jaya/mapred/system  <dir>
/user   <dir>
/user/jaya      <dir>
/user/jaya/input        <dir>
/user/jaya/input/hadoop-default.xml     <r 2>   21708
/user/jaya/input/hadoop-site.xml        <r 2>   1333
/user/jaya/input/mapred-default.xml     <r 2>   180
[EMAIL PROTECTED] hadoop-0.11.0]$



[EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -ls input
Found 3 items
/user/jaya/input/hadoop-default.xml     <r 2>   21708
/user/jaya/input/hadoop-site.xml        <r 2>   1333
/user/jaya/input/mapred-default.xml     <r 2>   180
[EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop dfs -ls output
Found 0 items
[EMAIL PROTECTED] hadoop-0.11.0]$ bin/hadoop jar hadoop-0.11.0-examples.jar wordcount input output
java.net.SocketTimeoutException: timed out waiting for rpc response
        at org.apache.hadoop.ipc.Client.call(Client.java:469)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:164)
        at $Proxy1.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:248)
        at org.apache.hadoop.mapred.JobClient.init(JobClient.java:200)
        at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:192)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:381)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:143)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
[EMAIL PROTECTED] hadoop-0.11.0]$


I don't know where the problem is.

I've not created any directory called output; if we need to create one,
where should it be created?
Should I configure some more settings? Please explain in detail.

Please do help me.

Thanks in advance
Jaya

