Looks like the reducer is able to fetch map output files from the local box but fails to fetch them from the remote box. Can you check that there is no firewall issue and that the /etc/hosts entries are correct?
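
For example (hostnames below are illustrative, not taken from your cluster), a quick check that no node's hostname is bound to 127.0.0.1 — a TaskTracker on such a host advertises an address that other nodes cannot reach, which shows up as exactly these fetch failures:

```shell
# Flag cluster hostnames that an /etc/hosts-style file maps to 127.0.0.1.
check_hosts() {
    hosts_file=$1; shift
    for h in "$@"; do
        if grep "^127\.0\.0\.1[[:space:]]" "$hosts_file" | grep -qw "$h"; then
            echo "WARNING: $h maps to 127.0.0.1 in $hosts_file"
        fi
    done
}

# Demo against a sample file; on a real node you would pass /etc/hosts:
cat > /tmp/sample_hosts <<'EOF'
127.0.0.1    localhost fedora3
192.168.0.11 fedora1
192.168.0.13 fedora4
EOF
check_hosts /tmp/sample_hosts fedora1 fedora2 fedora3 fedora4
```

For the firewall side, `iptables -L -n` on each node shows the active rules.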
Amar
Jagadesh_Doddi wrote:
Hi

I have changed the configuration to run the name node and job tracker on the same
system.
The daemons are started with bin/start-all.sh on the NN.
With a single slave node, the job completes in 12 seconds, and the console
output is shown below:

[r...@fedora1 hadoop-0.18.3]# bin/hadoop jar samples/wordcount.jar 
org.myorg.WordCount input output1
09/02/23 17:19:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
the arguments. Applications should implement Tool for the same.
09/02/23 17:19:30 INFO mapred.FileInputFormat: Total input paths to process : 1
09/02/23 17:19:30 INFO mapred.FileInputFormat: Total input paths to process : 1
09/02/23 17:19:30 INFO mapred.JobClient: Running job: job_200902231717_0001
09/02/23 17:19:31 INFO mapred.JobClient:  map 0% reduce 0%
09/02/23 17:19:37 INFO mapred.JobClient:  map 100% reduce 0%
09/02/23 17:19:42 INFO mapred.JobClient: Job complete: job_200902231717_0001
09/02/23 17:19:42 INFO mapred.JobClient: Counters: 16
09/02/23 17:19:42 INFO mapred.JobClient:   Job Counters
09/02/23 17:19:42 INFO mapred.JobClient:     Data-local map tasks=2
09/02/23 17:19:42 INFO mapred.JobClient:     Launched reduce tasks=1
09/02/23 17:19:42 INFO mapred.JobClient:     Launched map tasks=2
09/02/23 17:19:42 INFO mapred.JobClient:   Map-Reduce Framework
09/02/23 17:19:42 INFO mapred.JobClient:     Map output records=25
09/02/23 17:19:42 INFO mapred.JobClient:     Reduce input records=23
09/02/23 17:19:42 INFO mapred.JobClient:     Map output bytes=238
09/02/23 17:19:42 INFO mapred.JobClient:     Map input records=5
09/02/23 17:19:42 INFO mapred.JobClient:     Combine output records=46
09/02/23 17:19:42 INFO mapred.JobClient:     Map input bytes=138
09/02/23 17:19:42 INFO mapred.JobClient:     Combine input records=48
09/02/23 17:19:42 INFO mapred.JobClient:     Reduce input groups=23
09/02/23 17:19:42 INFO mapred.JobClient:     Reduce output records=23
09/02/23 17:19:42 INFO mapred.JobClient:   File Systems
09/02/23 17:19:42 INFO mapred.JobClient:     HDFS bytes written=175
09/02/23 17:19:42 INFO mapred.JobClient:     Local bytes written=648
09/02/23 17:19:42 INFO mapred.JobClient:     HDFS bytes read=208
09/02/23 17:19:42 INFO mapred.JobClient:     Local bytes read=281

With two slave nodes, the job completes in 13 minutes, and the console output 
is shown below:

[r...@fedora1 hadoop-0.18.3]# bin/hadoop jar samples/wordcount.jar 
org.myorg.WordCount input output2
09/02/23 17:25:38 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
the arguments. Applications should implement Tool for the same.
09/02/23 17:25:38 INFO mapred.FileInputFormat: Total input paths to process : 1
09/02/23 17:25:38 INFO mapred.FileInputFormat: Total input paths to process : 1
09/02/23 17:25:39 INFO mapred.JobClient: Running job: job_200902231722_0001
09/02/23 17:25:40 INFO mapred.JobClient:  map 0% reduce 0%
09/02/23 17:25:42 INFO mapred.JobClient:  map 50% reduce 0%
09/02/23 17:25:43 INFO mapred.JobClient:  map 100% reduce 0%
09/02/23 17:25:58 INFO mapred.JobClient:  map 100% reduce 16%
09/02/23 17:38:31 INFO mapred.JobClient: Task Id : 
attempt_200902231722_0001_m_000000_0, Status : FAILED
Too many fetch-failures
09/02/23 17:38:31 WARN mapred.JobClient: Error reading task outputNo route to 
host
09/02/23 17:38:31 WARN mapred.JobClient: Error reading task outputNo route to 
host
09/02/23 17:38:43 INFO mapred.JobClient: Job complete: job_200902231722_0001
09/02/23 17:38:43 INFO mapred.JobClient: Counters: 16
09/02/23 17:38:43 INFO mapred.JobClient:   Job Counters
09/02/23 17:38:43 INFO mapred.JobClient:     Data-local map tasks=3
09/02/23 17:38:43 INFO mapred.JobClient:     Launched reduce tasks=1
09/02/23 17:38:43 INFO mapred.JobClient:     Launched map tasks=3
09/02/23 17:38:43 INFO mapred.JobClient:   Map-Reduce Framework
09/02/23 17:38:43 INFO mapred.JobClient:     Map output records=25
09/02/23 17:38:43 INFO mapred.JobClient:     Reduce input records=23
09/02/23 17:38:43 INFO mapred.JobClient:     Map output bytes=238
09/02/23 17:38:43 INFO mapred.JobClient:     Map input records=5
09/02/23 17:38:43 INFO mapred.JobClient:     Combine output records=46
09/02/23 17:38:43 INFO mapred.JobClient:     Map input bytes=138
09/02/23 17:38:43 INFO mapred.JobClient:     Combine input records=48
09/02/23 17:38:43 INFO mapred.JobClient:     Reduce input groups=23
09/02/23 17:38:43 INFO mapred.JobClient:     Reduce output records=23
09/02/23 17:38:43 INFO mapred.JobClient:   File Systems
09/02/23 17:38:43 INFO mapred.JobClient:     HDFS bytes written=175
09/02/23 17:38:43 INFO mapred.JobClient:     Local bytes written=648
09/02/23 17:38:43 INFO mapred.JobClient:     HDFS bytes read=208
09/02/23 17:38:43 INFO mapred.JobClient:     Local bytes read=281

Thanks

Jagadesh



-----Original Message-----
From: Jothi Padmanabhan [mailto:[email protected]]
Sent: Monday, February 23, 2009 4:57 PM
To: [email protected]
Subject: Re: Reducer hangs at 16%

OK. I am guessing that your problem arises from having two entries in
conf/masters. The master should be the node where the JT runs (for
start-mapred.sh) and where the NN runs (for start-dfs.sh). This might need a bit
more effort to set up. To start with, you might want to try running both
the JT and NN on the same machine (the node designated as master) and then
try start-all.sh. You need to configure your hadoop-site.xml correctly as
well.
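
As a sketch, with fedora1 as the single master running both NN and JT (the hostnames, ports, and conf location below are examples for illustration, not your actual values), the conf files would look roughly like this:

```shell
# Write a minimal single-master configuration (Hadoop 0.18-era keys).
# HADOOP_CONF_DIR, hostnames and ports are placeholders for this sketch.
conf=${HADOOP_CONF_DIR:-./conf}
mkdir -p "$conf"

echo fedora1 > "$conf/masters"               # only the master node
printf 'fedora3\nfedora4\n' > "$conf/slaves" # task tracker / datanode hosts

cat > "$conf/hadoop-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>        <!-- NameNode address -->
    <value>hdfs://fedora1:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>     <!-- JobTracker address -->
    <value>fedora1:9001</value>
  </property>
</configuration>
EOF
```

After that, bin/start-all.sh on fedora1 should bring up the NN and JT on the master and a DN and TT on each slave.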

Jothi




On 2/23/09 4:36 PM, "Jagadesh_Doddi" <[email protected]> wrote:

Hi

I have set up as per the documentation on the Hadoop site.
On the namenode, I am running bin/start-dfs.sh, and on the job tracker, I am
running bin/start-mapred.sh

Thanks and Regards

Jagadesh Doddi
Telephone: 040-30657556
Mobile: 9949497414



-----Original Message-----
From: Jothi Padmanabhan [mailto:[email protected]]
Sent: Monday, February 23, 2009 4:00 PM
To: [email protected]
Subject: Re: Reducer hangs at 16%

Hi,

This looks like a setup issue. See
http://hadoop.apache.org/core/docs/current/cluster_setup.html#Configuration+Files
on how to set this up correctly.

As an aside, how are you bringing up the hadoop daemons (JobTracker,
Namenode, TT and Datanodes)?  Are you manually bringing them up or are you
using bin/start-all.sh?

Jothi


On 2/23/09 3:14 PM, "Jagadesh_Doddi" <[email protected]> wrote:

I have set up a distributed environment on Fedora OS to run Hadoop.
System Fedora1 is the name node, Fedora2 is the job tracker, and Fedora3 and
Fedora4 are task trackers.
conf/masters contains the entries Fedora1, Fedora2, and conf/slaves contains
the entries Fedora3, Fedora4.
When I run the sample wordcount example with a single task tracker (either
Fedora3 or Fedora4), it works fine and the job completes in a few seconds.
However, when I add the other task tracker in conf/slaves, the reducer stops at
16% and the job completes only after 13 minutes.
The same problem exists in versions 0.16.4, 0.17.2.1 and 0.18.3. The output on
the namenode console is shown below:

[r...@fedora1 hadoop-0.17.2.1Cluster]# bin/hadoop jar samples/wordcount.jar
org.myorg.WordCount input output
09/02/19 17:43:18 INFO mapred.FileInputFormat: Total input paths to process :
1
09/02/19 17:43:19 INFO mapred.JobClient: Running job: job_200902191741_0001
09/02/19 17:43:20 INFO mapred.JobClient:  map 0% reduce 0%
09/02/19 17:43:26 INFO mapred.JobClient:  map 50% reduce 0%
09/02/19 17:43:27 INFO mapred.JobClient:  map 100% reduce 0%
09/02/19 17:43:35 INFO mapred.JobClient:  map 100% reduce 16%
09/02/19 17:56:15 INFO mapred.JobClient: Task Id :
task_200902191741_0001_m_000001_0, Status : FAILED
Too many fetch-failures
09/02/19 17:56:15 WARN mapred.JobClient: Error reading task outputNo route to
host
09/02/19 17:56:18 WARN mapred.JobClient: Error reading task outputNo route to
host
09/02/19 17:56:25 INFO mapred.JobClient:  map 100% reduce 81%
09/02/19 17:56:26 INFO mapred.JobClient:  map 100% reduce 100%
09/02/19 17:56:27 INFO mapred.JobClient: Job complete: job_200902191741_0001
09/02/19 17:56:27 INFO mapred.JobClient: Counters: 16
09/02/19 17:56:27 INFO mapred.JobClient:   Job Counters
09/02/19 17:56:27 INFO mapred.JobClient:     Launched map tasks=3
09/02/19 17:56:27 INFO mapred.JobClient:     Launched reduce tasks=1
09/02/19 17:56:27 INFO mapred.JobClient:     Data-local map tasks=3
09/02/19 17:56:27 INFO mapred.JobClient:   Map-Reduce Framework
09/02/19 17:56:27 INFO mapred.JobClient:     Map input records=5
09/02/19 17:56:27 INFO mapred.JobClient:     Map output records=25
09/02/19 17:56:27 INFO mapred.JobClient:     Map input bytes=138
09/02/19 17:56:27 INFO mapred.JobClient:     Map output bytes=238
09/02/19 17:56:27 INFO mapred.JobClient:     Combine input records=25
09/02/19 17:56:27 INFO mapred.JobClient:     Combine output records=23
09/02/19 17:56:27 INFO mapred.JobClient:     Reduce input groups=23
09/02/19 17:56:27 INFO mapred.JobClient:     Reduce input records=23
09/02/19 17:56:27 INFO mapred.JobClient:     Reduce output records=23
09/02/19 17:56:27 INFO mapred.JobClient:   File Systems
09/02/19 17:56:27 INFO mapred.JobClient:     Local bytes read=522
09/02/19 17:56:27 INFO mapred.JobClient:     Local bytes written=1177
09/02/19 17:56:27 INFO mapred.JobClient:     HDFS bytes read=208
09/02/19 17:56:27 INFO mapred.JobClient:     HDFS bytes written=175

Appreciate any help on this.
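
Since reducers pull map output over HTTP from each TaskTracker (port 50060 by default in this Hadoop era), the "No route to host" errors above could be narrowed down by testing that port from the other nodes. A minimal bash sketch (hostnames and port are assumptions, not values confirmed from this cluster):

```shell
# Return success iff host:port accepts a TCP connection
# (uses bash's /dev/tcp pseudo-device, so no extra tools are needed).
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# From fedora3 one might check fedora4's TaskTracker HTTP port, e.g.:
#   port_open fedora4 50060 && echo reachable || echo blocked
```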

Thanks

Jagadesh

DISCLAIMER:
This email (including any attachments) is intended for the sole use of the
intended recipient/s and may contain material that is CONFIDENTIAL AND
PRIVATE
COMPANY INFORMATION. Any review or reliance by others or copying or
distribution or forwarding of any or all of the contents in this message is
STRICTLY PROHIBITED. If you are not the intended recipient, please contact
the
sender by email and delete all copies; your cooperation in this regard is
appreciated.
