Hi Df,
Check that the w1153435 user exists on all machines in the cluster,
and use the same configuration on every machine. Use the IP address instead of the hostname
(you already said that you don't have root permission), e.g.:
<property>
  <name>fs.default.name</name>
  <value>hdfs://109.9.3.101:3000</value>   <!-- example IP -->
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>109.9.3.101:3001</value>          <!-- example IP -->
</property>
Also check that passwordless SSH login works from the master to every node.
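If it is not already set up, a minimal sketch for passwordless SSH (assuming the default
key paths and the example IP above) would be:

    ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa   # generate a key pair if one does not already exist
    ssh-copy-id w1153435@109.9.3.101           # repeat for every node in the cluster (example IP)
    ssh w1153435@109.9.3.101 hostname          # should print the hostname without a password prompt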
Regards,
Shanmuganathan
---- On Wed, 17 Aug 2011 17:12:25 +0530 A Df
<[email protected]> wrote ----
Hello Everyone:
I am adding the contents of my config files in the hope that someone will be
able to help. See inline for the discussion. I really don't understand why it
works in pseudo-distributed mode but causes so many problems in cluster mode. I have
tried the instructions from the Apache cluster setup, the Yahoo Developer Network and
Michael Noll's tutorial.
w1153435@ngs:~/hadoop-0.20.2_cluster/conf> cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://ngs.uni.ac.uk:3000</value>
</property>
<property>
<name>HADOOP_LOG_DIR</name>
<value>/home/w1153435/hadoop-0.20.2_cluster/var/log/hadoop</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop</value>
</property>
</configuration>
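Note: HADOOP_LOG_DIR is normally an environment variable set in conf/hadoop-env.sh rather
than a core-site.xml property; a minimal sketch using the same path would be:

    # conf/hadoop-env.sh
    export HADOOP_LOG_DIR=/home/w1153435/hadoop-0.20.2_cluster/var/log/hadoop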
w1153435@ngs:~/hadoop-0.20.2_cluster/conf> cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.http.address</name>
<value>0.0.0.0:3500</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/w1153435/hadoop-0.20.2_cluster/dfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/w1153435/hadoop-0.20.2_cluster/dfs/name</value>
<final>true</final>
</property>
</configuration>
w1153435@ngs:~/hadoop-0.20.2_cluster/conf> cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>ngs.uni.ac.uk:3001</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/w1153435/hadoop-0.20.2_cluster/mapred/system</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>80</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>16</value>
</property>
</configuration>
In addition:
w1153435@ngs:~/hadoop-0.20.2_cluster> bin/hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: �%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Name: 161.74.12.36:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Aug 17 12:40:17 BST 2011
Cheers,
A Df
>________________________________
>From: A Df <[email protected]>
>To: "[email protected]" <[email protected]>; "[email protected]" <[email protected]>
>Sent: Tuesday, 16 August 2011, 16:20
>Subject: Re: hadoop cluster mode not starting up
>
>
>
>See inline:
>
>
>>________________________________
>>From: shanmuganathan.r <[email protected]>
>>To: [email protected]
>>Sent: Tuesday, 16 August 2011, 13:35
>>Subject: Re: hadoop cluster mode not starting up
>>
>>Hi Df,
>>
>> Are you using IPs instead of names in conf/masters and conf/slaves? For running
>>the secondary namenode on a separate machine, refer to the following link:
>>
>>
>>=Yes, I use the names in those files, but the IP addresses are mapped to
>>the names in the /extras/hosts file. Does this cause problems?
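For reference, a minimal sketch of what IP-based conf/masters and conf/slaves could look
like, with placeholder addresses in the style of the example IP above:

    conf/masters   (host that runs the secondary namenode):
        109.9.3.101
    conf/slaves    (one datanode/tasktracker host per line):
        109.9.3.102
        109.9.3.103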
>>
>>
>>http://www.hadoop-blog.com/2010/12/secondarynamenode-process-is-starting.html
>>
>>
>>=I don't want to make too many changes, so I will stick to having the master
>>be both namenode and secondary namenode. I tried starting up HDFS and
>>MapReduce, but the jobtracker is not running on the master and there are still
>>errors regarding the datanodes, because only 5 of 7 datanodes have a tasktracker.
>>I ran both commands to start HDFS and MapReduce, so why is the
>>jobtracker missing?
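One quick way to check which daemons actually came up on a node is the JDK's jps tool,
for example:

    jps              # on the master: expect NameNode, SecondaryNameNode and JobTracker
    ssh privn51 jps  # on one slave (hostname taken from further down the thread): expect DataNode and TaskTracker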
>>
>>Regards,
>>
>>Shanmuganathan
>>
>>
>>
>>---- On Tue, 16 Aug 2011 17:06:04 +0530 A Df <[email protected]> wrote ----
>>
>>
>>I already used a few tutorials, as follows:
>> * The Hadoop Tutorial on the Yahoo Developer Network, which uses an old version of
>>   Hadoop and thus older conf files.
>> * http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
>>   which only has two nodes, and the master acts as both namenode and secondary
>>   namenode. I need one with more than that.
>>
>>
>>Is there a way to prevent the nodes from using the central file system? I don't have
>>root permission, and my user folder is on a central file system which is replicated
>>on all the nodes.
>>
>>See inline too for my responses
>>
>>
>>
>>>________________________________
>>>From: Steve Loughran <[email protected]>
>>>To: [email protected]
>>>Sent: Tuesday, 16 August 2011, 12:08
>>>Subject: Re: hadoop cluster mode not starting up
>>>
>>>On 16/08/11 11:19, A Df wrote:
>>>> See inline
>>>>
>>>>
>>>>
>>>>> ________________________________
>>>>> From: Steve Loughran <[email protected]>
>>>>> To: [email protected]
>>>>> Sent: Tuesday, 16 August 2011, 11:08
>>>>> Subject: Re: hadoop cluster mode not starting up
>>>>>
>>>>> On 16/08/11 11:02, A Df wrote:
>>>>>> Hello All:
>>>>>>
>>>>>> I used a combination of tutorials to set up Hadoop, but most seem to use either an
>>>>>> old version of Hadoop or only 2 machines for the cluster, which isn't really a
>>>>>> cluster. Does anyone know of a good tutorial which sets up multiple nodes for a
>>>>>> cluster? I already looked at the Apache website, but it does not give sample values
>>>>>> for the conf files. Also, each set of tutorials seems to have a different set of
>>>>>> parameters which they indicate should be changed, so now it's a bit confusing. For
>>>>>> example, my configuration sets a dedicated namenode, a secondary namenode and 8
>>>>>> slave nodes, but when I run the start command it gives an error. Should I install
>>>>>> Hadoop in my user directory or on the root? I have it in my directory, but all the
>>>>>> nodes share a central file system as opposed to a distributed one, so whatever I do
>>>>>> in my user folder on one node affects all the others. How do I set the paths to
>>>>>> ensure that it uses a distributed system?
>>>>>>
>>>>>> For the errors below, I checked the directories and the files are there. I am not
>>>>>> sure what went wrong or how to set the conf to not use the central file system.
>>>>>> Thank you.
>>>>>>
>>>>>> Error message
>>>>>> CODE
>>>>>> w1153435@n51:~/hadoop-0.20.2_cluster> bin/start-dfs.sh
>>>>>> bin/start-dfs.sh: line 28: /w1153435/hadoop-0.20.2_cluster/bin/hadoop-config.sh: No such file or directory
>>>>>> bin/start-dfs.sh: line 50: /w1153435/hadoop-0.20.2_cluster/bin/hadoop-daemon.sh: No such file or directory
>>>>>> bin/start-dfs.sh: line 51: /w1153435/hadoop-0.20.2_cluster/bin/hadoop-daemons.sh: No such file or directory
>>>>>> bin/start-dfs.sh: line 52: /w1153435/hadoop-0.20.2_cluster/bin/hadoop-daemons.sh: No such file or directory
>>>>>> CODE
>>>>>
>>>>> there's No such file or directory as
>>>>> /w1153435/hadoop-0.20.2_cluster/bin/hadoop-daemons.sh
>>>>>
>>>>>
>>>>> There is, I checked as shown:
>>>>> w1153435@n51:~/hadoop-0.20.2_cluster> ls bin
>>>>> hadoop             rcc                  start-dfs.sh      stop-dfs.sh
>>>>> hadoop-config.sh   slaves.sh            start-mapred.sh   stop-mapred.sh
>>>>> hadoop-daemon.sh   start-all.sh         stop-all.sh
>>>>> hadoop-daemons.sh  start-balancer.sh    stop-balancer.sh
>>>
>>>try "pwd" to print out where the OS thinks you are, as it doesn't seem
>>>to be where you think you are
>>>
>>>
>>>w1153435@ngs:~/hadoop-0.20.2_cluster> pwd
>>>/home/w1153435/hadoop-0.20.2_cluster
>>>
>>>
>>>w1153435@ngs:~/hadoop-0.20.2_cluster/bin> pwd
>>>/home/w1153435/hadoop-0.20.2_cluster/bin
>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> I had tried running this command below earlier but also got problems:
>>>>>> CODE
>>>>>> w1153435@ngs:~/hadoop-0.20.2_cluster> export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
>>>>>> w1153435@ngs:~/hadoop-0.20.2_cluster> export HADOOP_SLAVES=${HADOOP_CONF_DIR}/slaves
>>>>>> w1153435@ngs:~/hadoop-0.20.2_cluster> ${HADOOP_HOME}/bin/slaves.sh "mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop"
>>>>>> -bash: /bin/slaves.sh: No such file or directory
>>>>>> w1153435@ngs:~/hadoop-0.20.2_cluster> export HADOOP_HOME=/home/w1153435/hadoop-0.20.2_cluster
>>>>>> w1153435@ngs:~/hadoop-0.20.2_cluster> ${HADOOP_HOME}/bin/slaves.sh "mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop"
>>>>>> cat: /conf/slaves: No such file or directory
>>>>>> CODE
>>>>>>
>>>>> there's No such file or directory as /conf/slaves because you set
>>>>> HADOOP_HOME after setting the other env variables, which are expanded at
>>>>> set-time, not run-time.
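A minimal illustration of that expansion-order point, with hypothetical variable names:

    FOO=/tmp
    BAR=${FOO}/conf     # BAR is expanded to /tmp/conf at this moment
    FOO=/opt            # changing FOO afterwards does not update BAR
    echo $BAR           # still prints /tmp/conf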
>>>>>
>>>>> I redid the command but still have errors on the slaves
>>>>>
>>>>>
>>>>> w1153435@n51:~/hadoop-0.20.2_cluster> export HADOOP_HOME=/home/w1153435/hadoop-0.20.2_cluster
>>>>> w1153435@n51:~/hadoop-0.20.2_cluster> export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
>>>>> w1153435@n51:~/hadoop-0.20.2_cluster> export HADOOP_SLAVES=${HADOOP_CONF_DIR}/slaves
>>>>> w1153435@n51:~/hadoop-0.20.2_cluster> ${HADOOP_HOME}/bin/slaves.sh "mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop"
>>>>> privn51: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>>> privn58: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>>> privn52: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>>> privn55: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>>> privn57: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>>> privn54: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>>> privn53: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>>> privn56: bash: mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop: No such file or directory
>>>
>>>try ssh-ing in, do it by hand, make sure you have the right permissions etc.
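For example, something along these lines would exercise the same steps by hand on one
slave (privn51 is taken from the output above):

    ssh w1153435@privn51
    mkdir -p /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop
    ls -ld /home/w1153435/hadoop-0.20.2_cluster/tmp/hadoop   # confirm it exists and is writable by this user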
>>>
>>>
>>>I reset the above path variables again, checked that they existed, and tried the
>>>command above, but I got the same error. I used ssh with no problems and no password
>>>request, so that is fine. What else could be wrong?
>>>w1153435@ngs:~/hadoop-0.20.2_cluster> echo $HADOOP_HOME
>>>/home/w1153435/hadoop-0.20.2_cluster
>>>w1153435@ngs:~/hadoop-0.20.2_cluster> echo $HADOOP_CONF_DIR
>>>/home/w1153435/hadoop-0.20.2_cluster/conf
>>>w1153435@ngs:~/hadoop-0.20.2_cluster> echo $HADOOP_SLAVES
>>>/home/w1153435/hadoop-0.20.2_cluster/conf/slaves
>>>w1153435@ngs:~/hadoop-0.20.2_cluster>