Thanks Amareshwari,
    That explains the spurious extra tasks in the log. However, I am not getting 
the userlogs for the failed setup task because the JVM it tries to run in exits 
immediately.
I get only tasktracker log entries like this:
2010-10-15 03:46:53,397 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201010140533_0157_m_-1758278022
2010-10-15 03:46:53,398 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201010140533_0157_m_-1758278022 spawned.
2010-10-15 03:46:53,918 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201010140533_0157_m_-1758278022 exited. Number of tasks it ran: 0
2010-10-15 03:46:56,946 INFO org.apache.hadoop.mapred.TaskRunner: attempt_201010140533_0157_m_000005_1 done; removing files.
2010-10-15 03:46:56,946 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2
2010-10-15 03:46:58,050 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201010140533_0157_m_000005_2 task's state:UNASSIGNED

It is tough to figure out what is going wrong in my setup task without 
userlogs. This is a series of runs of the same job with different input. Usually 
the first 2 jobs succeed and the 3rd job fails. What exactly gets run in the 
setup task? I guess the split calculation etc. Since the JVM exits within about 
half a second of being spawned according to the above log, I am not sure whether 
it is reaching the application's code at all.
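
A minimal sketch for finding every such short-lived JVM in a tasktracker log. 
It assumes each entry sits on one line and that the JvmManager messages are 
worded exactly as in the excerpt above; the function name find_idle_jvms is 
made up for illustration and is not part of Hadoop.

```python
import re
from datetime import datetime

# Patterns copied from the JvmManager messages quoted above (an assumption:
# other Hadoop versions may word these messages differently).
SPAWNED = re.compile(
    r"^(\S+ \S+) INFO \S+JvmManager: JVM Runner (jvm_\S+) spawned\.")
EXITED = re.compile(
    r"^(\S+ \S+) INFO \S+JvmManager: JVM : (jvm_\S+) exited\. "
    r"Number of tasks it ran: (\d+)")


def parse_ts(text):
    """Parse a log timestamp such as '2010-10-15 03:46:53,397'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def find_idle_jvms(lines):
    """Return (jvm_id, lifetime_seconds) for JVMs that exited after 0 tasks.

    lifetime_seconds is None when no matching 'spawned' entry was seen.
    """
    spawned_at = {}
    idle = []
    for line in lines:
        m = SPAWNED.match(line)
        if m:
            spawned_at[m.group(2)] = parse_ts(m.group(1))
            continue
        m = EXITED.match(line)
        if m and int(m.group(3)) == 0:
            jvm_id = m.group(2)
            start = spawned_at.get(jvm_id)
            lifetime = None
            if start is not None:
                lifetime = (parse_ts(m.group(1)) - start).total_seconds()
            idle.append((jvm_id, lifetime))
    return idle
```

Passing the lines of the tasktracker log file to find_idle_jvms gives the 
JVM IDs to look for when checking whether anything was written for them 
before they exited.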



Thanks,
Murali Krishna




________________________________
From: Amareshwari Sri Ramadasu <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Fri, 15 October, 2010 5:17:38 PM
Subject: Re: Hadoop starting extra map tasks and eventually failing

These extra tasks are job-setup and job-cleanup tasks, which use map/reduce 
slots to run.
It looks like the job-setup task failed for your second job even after retries, 
so no maps were scheduled. But you should see task logs for the failed tasks.

Thanks
Amareshwari

On 10/15/10 5:11 PM, "Murali Krishna. P" <[email protected]> wrote:

Hi,
    I have attached the relevant part of the jobtracker log. Job1 had 3 splits, 
but it started 5 map tasks, m_00000 through m_00004 (I have speculative 
execution turned off). The job somehow succeeds, but the log files for the 4th 
and 5th tasks didn't get any records. However, the next job again has 3 splits, 
but this time it schedules only m_00003 and m_00004, and both of them fail. 
There are no userlogs created for these 2 tasks. The tasktracker log mentions 
that the JVM was spawned and exited immediately. It does not schedule the first 
3 map tasks, and the job fails since the 4th and 5th tasks fail even after 
retries.

Why are extra tasks getting scheduled?
How did those tasks pass in the first case?
Why are the right tasks not scheduled in the second job?

This is easily reproducible; please take a look at the JT log and advise.

Thanks,
Murali Krishna
