[ https://issues.apache.org/jira/browse/MAPREDUCE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757008#action_12757008 ]
Sreekanth Ramakrishnan commented on MAPREDUCE-964: -------------------------------------------------- The reason for the negative values being present in the JobSummary results from the taskStatus not having the finishTime field not being set in the taskStatus. This happens especially when the tasks which are being killed when they just have finished, resulting in a race condition. If look at the attempt which is mentioned has actually finished and same time a kill task action has been recv for same attempt at the same time. Had modified the code to put in the debug statement following is example of task which upset the metering: {noformat} 2009-09-18 06:18:32,001 INFO org.apache.hadoop.mapred.JobInProgress: TaskDebug attemptId : attempt_200909180346_0004_m_000575_0 slots : SLOTS_MILLIS_MAPS tip.numslots is: 1 difference to add : -1253254705577 status start : 1253254705577 status end time : 0 {noformat} TT logs : {noformat} 2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200909180346_0004_m_000575_0 task's state:UNASSIGNED 2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:01,200 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 1 and trying to launch attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:02,750 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_200909180346_0004_m_1883468461 given task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:09,034 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.18553433% xxxx/my_reliability_test_input/part-00032:335544320+67108864 2009-09-18 06:18:12,040 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.25399202% xxx/my_reliability_test_input/part-00032:335544320+67108864 2009-09-18 06:18:15,317 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.3378629% xxx/my_reliability_test_input/part-00032:335544320+67108864 2009-09-18 06:18:15,319 INFO org.apache.hadoop.mapred.TaskTracker: Received KillTaskAction for task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:15,319 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:20,413 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. 2009-09-18 06:18:20,415 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200909180346_0004_m_000575_0 not found in cache 2009-09-18 06:18:25,511 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. 2009-09-18 06:18:25,515 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_200909180346_0004_m_000575_0 task's state:FAILED_UNCLEAN 2009-09-18 06:18:25,516 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:25,516 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_200909180346_0004_m_000575_0 which needs 1 slots 2009-09-18 06:18:26,651 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_200909180346_0004_m_-1834354161 given task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:26,888 INFO org.apache.hadoop.mapred.TaskTracker: Received KillTaskAction for task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:26,888 INFO org.apache.hadoop.mapred.TaskTracker: About to purge task: attempt_200909180346_0004_m_000575_0 2009-09-18 06:18:31,986 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 0.0% 2009-09-18 06:18:31,986 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200909180346_0004_m_000575_0 Ignoring status-update since runState: FAILED 2009-09-18 06:18:31,989 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. 2009-09-18 06:18:31,990 WARN org.apache.hadoop.ipc.Server: IPC Server Responder, call statusUpdate(attempt_200909180346_0004_m_000575_0, org.apache.hadoop.mapred.maptasksta...@4e69b84c) from 127.0.0.1:51146: output error 2009-09-18 06:18:31,995 INFO org.apache.hadoop.mapred.IndexCache: Map ID attempt_200909180346_0004_m_000575_0 not found in cache 2009-09-18 06:18:34,992 INFO org.apache.hadoop.mapred.TaskRunner: attempt_200909180346_0004_m_000575_0 done; removing files. {noformat} > Inaccurate values in jobSummary logs > ------------------------------------ > > Key: MAPREDUCE-964 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-964 > Project: Hadoop Map/Reduce > Issue Type: Bug > Affects Versions: 0.20.1 > Reporter: Rajiv Chittajallu > Priority: Critical > > For some jobs the mapSlotSeconds is incorrect. > negative value > 09/09/01 18:31:44 INFOmapred.JobInProgress$JobSummary: > jobId=job_200908270718_4568,submitTime=1251823543976,launchTime=1251823554310,finishTime=1251829904565, > > numMaps=7965,numSlotsPerMap=1,numReduces=40,numSlotsPerReduce=1,user=wile,queue=runner,status=SUCCEEDED, > > mapSlotSeconds=-2503133523,reduceSlotsSeconds=186536,clusterMapCapacity=11262,clusterReduceCapacity=3754 > or too high > 09/09/02 23:59:57 INFO mapred.JobInProgress$JobSummary: > jobId=job_200908270718_5861,submitTime=1251935672924,launchTime=1251935687698,finishTime=1251935997949, > > numMaps=1026,numSlotsPerMap=1,numReduces=10,numSlotsPerReduce=1,user=dfsload,queue=gridops,status=SUCCEEDED, > > mapSlotSeconds=1251949742,reduceSlotsSeconds=537,clusterMapCapacity=11262,clusterReduceCapacity=3754 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.