In my opinion, the configuration files should be identical on the master and
slave nodes. In particular, conf/slaves should have the same contents on
every node in your small cluster.
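A quick way to check this is to compare the host lists from the two nodes. The sketch below is only an illustration: the function names are made up for this example, the host name "slave-host" is a placeholder (the real entry is redacted in the post), and it assumes you have already copied the text of each conf/slaves file to one machine.

```python
def read_hosts(text):
    """Return the non-empty lines of a conf/slaves-style file, stripped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def slaves_differ(master_text, slave_text):
    """Return (hosts only in the master's file, hosts only in the slave's file)."""
    on_master = set(read_hosts(master_text))
    on_slave = set(read_hosts(slave_text))
    return sorted(on_master - on_slave), sorted(on_slave - on_master)


# Placeholder contents; "slave-host" stands in for the redacted entry.
master_slaves = "localhost\nslave-host\n"
slave_slaves = "localhost\n"

only_master, only_slave = slaves_differ(master_slaves, slave_slaves)
print("only on master:", only_master)
print("only on slave:", only_slave)
```

If both lists come back empty, the two files name the same hosts.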
-----Original Message-----
From: Gaurav Agarwal [mailto:[EMAIL PROTECTED]
Sent: March 7, 2007 16:22
To: [email protected]
Subject: Hadoop 'wordcount' program hanging in the Reduce phase.
Hi Everyone!
I am a new user of Hadoop and am trying to set up a small cluster,
but I am facing some issues doing so.
I am trying to run the Hadoop 'wordcount' example program which comes bundled
with it. I can run the program successfully on a single-node cluster
(that is, using my local machine only). But when I try to run the same
program on a cluster of two machines, it hangs in the 'reduce'
phase.
Settings:
Master Node: 192.168.1.150 (dennis-laptop)
Slave Node: 192.168.1.201 (traal)
The user account on both Master and Slave is named: Hadoop
Password-less ssh login to Slave from the Master is working.
JAVA_HOME is set appropriately in the hadoop-env.sh file on both
Master/Slave.
MASTER
1) conf/slaves
localhost
[EMAIL PROTECTED]
2) conf/master
localhost
3) conf/hadoop-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>192.168.1.150:50000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>192.168.1.150:50001</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
SLAVE
1) conf/slaves
localhost
2) conf/master
[EMAIL PROTECTED]
3) conf/hadoop-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>192.168.1.150:50000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>192.168.1.150:50001</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
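To compare the hadoop-site.xml files from the two nodes property by property, something like the following could be used. It is a minimal sketch: the helper names are hypothetical, the XML text is the file as posted above, and it assumes both files have been copied to one machine. It uses only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Map each <property>'s <name> to its <value> in a hadoop-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_properties(a, b):
    """Return {name: (value_in_a, value_in_b)} for every property that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# The hadoop-site.xml as posted (the same text appears for master and slave).
SITE_XML = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>192.168.1.150:50000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>192.168.1.150:50001</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>"""

master_props = load_properties(SITE_XML)
slave_props = load_properties(SITE_XML)
print(diff_properties(master_props, slave_props))
```

Since the two hadoop-site.xml files posted here are identical, a check like this would report no differences for them; the files that differ between the nodes are conf/slaves and conf/master.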
CONSOLE OUTPUT
bin/hadoop jar hadoop-*-examples.jar wordcount -m 10 -r 2 input output
07/03/06 23:17:17 INFO mapred.InputFormatBase: Total input paths to process : 1
07/03/06 23:17:18 INFO mapred.JobClient: Running job: job_0001
07/03/06 23:17:19 INFO mapred.JobClient: map 0% reduce 0%
07/03/06 23:17:29 INFO mapred.JobClient: map 20% reduce 0%
07/03/06 23:17:30 INFO mapred.JobClient: map 40% reduce 0%
07/03/06 23:17:32 INFO mapred.JobClient: map 80% reduce 0%
07/03/06 23:17:33 INFO mapred.JobClient: map 100% reduce 0%
07/03/06 23:17:42 INFO mapred.JobClient: map 100% reduce 3%
07/03/06 23:17:43 INFO mapred.JobClient: map 100% reduce 5%
07/03/06 23:17:44 INFO mapred.JobClient: map 100% reduce 8%
07/03/06 23:17:52 INFO mapred.JobClient: map 100% reduce 10%
07/03/06 23:17:53 INFO mapred.JobClient: map 100% reduce 13%
07/03/06 23:18:03 INFO mapred.JobClient: map 100% reduce 16%
The only exception I can see from the log files is in the 'TaskTracker' log
file:
2007-03-06 23:17:32,214 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000000_0 Copying task_0001_m_000002_0 output from traal.
2007-03-06 23:17:32,221 INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000000_0 Copying task_0001_m_000001_0 output from dennis-laptop.
2007-03-06 23:17:32,368 WARN org.apache.hadoop.mapred.TaskRunner: task_0001_r_000000_0 copy failed: task_0001_m_000002_0 from traal
2007-03-06 23:17:32,368 WARN org.apache.hadoop.mapred.TaskRunner: java.io.IOException: File /tmp/hadoop-hadoop/mapred/local/task_0001_r_000000_0/map_2.out-0 not created
    at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.copyOutput(ReduceTaskRunner.java:301)
    at org.apache.hadoop.mapred.ReduceTaskRunner$MapOutputCopier.run(ReduceTaskRunner.java:262)
2007-03-06 23:17:32,369 WARN org.apache.hadoop.mapred.TaskRunner: task_0001_r_000000_0 adding host traal to penalty box, next contact in 99 seconds
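To see at a glance which hosts the failed copies come from, the TaskTracker log can be scanned for the "copy failed" warnings. This is just a sketch: the regular expression is written against the lines quoted here and is an assumption about the log format, not something taken from the Hadoop documentation, and the function name is made up for this example.

```python
import re
from collections import Counter

# Written against lines of the form:
#   task_0001_r_000000_0 copy failed: task_0001_m_000002_0 from traal
COPY_FAILED = re.compile(r"(task_\S+) copy failed: (task_\S+) from (\S+)")


def failed_copies_by_host(log_text):
    """Count 'copy failed' warnings per source host in TaskTracker log text."""
    # Collapse all whitespace so entries wrapped over several lines still match.
    flat = " ".join(log_text.split())
    return Counter(m.group(3) for m in COPY_FAILED.finditer(flat))


sample = """2007-03-06 23:17:32,368 WARN org.apache.hadoop.mapred.TaskRunner:
task_0001_r_000000_0 copy failed: task_0001_m_000002_0 from traal
2007-03-06 23:17:32,369 WARN org.apache.hadoop.mapred.TaskRunner:
task_0001_r_000000_0 adding host traal to penalty box, next contact in 99
seconds
"""

for host, count in failed_copies_by_host(sample).items():
    print(host, count)
```

In the excerpt quoted here, the only failed copy is the one fetching map output from traal.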
I am attaching the master log files just in case anyone wants to check them.
Any help will be greatly appreciated!
-gaurav
http://www.nabble.com/file/7013/hadoop-hadoop-tasktracker-dennis-laptop.log
http://www.nabble.com/file/7012/hadoop-hadoop-jobtracker-dennis-laptop.log
http://www.nabble.com/file/7011/hadoop-hadoop-namenode-dennis-laptop.log
http://www.nabble.com/file/7010/hadoop-hadoop-datanode-dennis-laptop.log
--
View this message in context:
http://www.nabble.com/Hadoop-%27wordcount%27-program-hanging-in-the-Reduce-phase.-tf3360661.html#a9348424
Sent from the Hadoop Users mailing list archive at Nabble.com.