[jira] Commented: (HADOOP-1764) Inconsistancy between Mapper/Reducer book keeping

Arun C Murthy (JIRA) Sat, 25 Aug 2007 00:50:03 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522731
 ]


Arun C Murthy commented on HADOOP-1764:
---------------------------------------

Srikanth, thanks for the detailed logs, appreciate it.

Looking at them, here is my version of events: 

a) task_200708210155_0003_m_002598_0 succeeds on node50 some time before 
09:38:49 (actually before 09:28:49 since it takes 10mins to detect a lost 
tasktracker).
b) task_200708210155_0003_r_000006_2 is told that 
task_200708210155_0003_m_002598_0 is complete and it schedules map-output fetch.
c) At 09:38:49 the JobTracker detects that the TaskTracker on node50 is *lost*.
d) task_200708210155_0003_r_000006_2 tried (unsuccessfully) to fetch 
map-outputs from node50 (with backoffs i.e. at 09:47 and 09:57). *
e) task_200708210155_0003_m_002598_1 succeeded on node75 somewhere before 
10:00:56.
f) task_200708210155_0003_r_000006_2 is told by JobTracker that 
task_200708210155_0003_m_002598_1 is complete, schedules the copy and 
successfully copies at 10:00:56.
g) At 10:02:17 task_200708210155_0003_r_000006_2 discarded the scheduled fetch 
from task_200708210155_0003_m_002598_0 since it had already copied the output 
from task_200708210155_0003_m_002598_1.
h) At 11:22:46 the JobTracker detects that the TaskTracker on node75 is *lost*, 
fails the completed map task task_200708210155_0003_m_002598_1, and re-executes 
the tip (task_200708210155_0003_m_002598_2) on node55.
i) The reduce task task_200708210155_0003_r_000006_2 doesn't copy output of 
task_200708210155_0003_m_002598_2, this could be due to:
   * task_200708210155_0003_r_000006_2 has already completed (success or 
failure).
   * Even if task_200708210155_0003_r_000006_2 is still running, it sees that 
task_200708210155_0003_m_002598_2 has completed but ignores that event since it 
already has the output of task_200708210155_0003_m_002598_1 which ran 
successfully. Clearly this is safe since the output of 
task_200708210155_0003_m_002598_1 & task_200708210155_0003_m_002598_2 are 
identical.


Rebuttals:

bq. 1. Even finally the reducer seems to fetch data from the incorrect 
TaskTracker, it is not checking with the job tracker for the final/correct map 
output
bq. 3. An obvious solution may be to go to the job tracker and directly get the 
correct map output (I was able to get the correct map output from node55 using 
http, without any errors)

This, as noted above, is due to the fact that the 
task_200708210155_0003_r_000006_2 doesn't reschedule a copy of output of 
task_200708210155_0003_m_002598_2 since it already successfully copied it from 
task_200708210155_0003_m_002598_1 (before that failed due to lost TaskTracker 
on node75).


bq. 2. It seems to retry more times and sleeps for longer time (looking at the 
interval of log messages)

The reduce task, on a failed output-fetch doesn't immediately schedule another 
fetch but backs-off for a period greater than 5mins.

The one minor issue here is that reduces do not process 'task failed' events, 
i.e. ignore them and continue to schedule copies from the failed map tasks, 
with backoffs (which is why you see the exceptions while trying to fetch from 
the failed task_200708210155_0003_m_002598_0). Devaraj can chime in on this, my 
take is that this is 'ok' since we back-off for a reasonably long period 
(5mins) and meanwhile we'd schedule the fetch from the other successful 
task-attempt and then proceed to ignore the failed one.

Overall, my take after looking at the logs is that there isn't anything that 
hits me as a *correctness issue* and follows what we've designed for and expect 
in such scenarios.

Given that I'm inclined to close this bug... thoughts?


> Inconsistancy between Mapper/Reducer book keeping
> -------------------------------------------------
>
>                 Key: HADOOP-1764
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1764
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>         Environment: Related: HADOOP-1763 (Same environment)
> Version: 0.15.0-dev, r565628
> Compiled: Tue Aug 14 20:55:37 UTC 2007 by hadoopqa
> 1400 Nodes
>            Reporter: Srikanth Kakani
>            Assignee: Arun C Murthy
>            Priority: Blocker
>
> Refer to HADOOP-1763
> This occurs in that scenario once many job trackers are lost, reducers do not 
> know where the map outputs are present. They keep retrying the wrong node 
> causing the reducers to run forever without failures.
> Relevant logs:
> Reducer output:
> 2007-08-21 09:47:47,046 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 Copying task_200708210155_0003_m_002598_0 
> output from node50
> 2007-08-21 09:47:53,643 WARN org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 copy failed: 
> task_200708210155_0003_m_002598_0 from node50
> 2007-08-21 09:47:53,643 WARN org.apache.hadoop.mapred.ReduceTask: 
> java.io.FileNotFoundException: 
> http://wm511750.inktomisearch.com:50060/mapOutput?map=task_200708210155_0003_m_002598_0&reduce=6
>       at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
>       at 
> org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:207)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:673)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:631)
> 2007-08-21 09:53:02,327 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 Copying task_200708210155_0003_m_002598_0 
> output from node50
> 2007-08-21 09:53:02,333 WARN org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 copy failed: 
> task_200708210155_0003_m_002598_0 from node50
> 2007-08-21 09:53:02,333 WARN org.apache.hadoop.mapred.ReduceTask: 
> java.io.FileNotFoundException: 
> http://node50:50060/mapOutput?map=task_200708210155_0003_m_002598_0&reduce=6
>       at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
>       at 
> org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:207)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:673)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:631)
> 2007-08-21 09:57:33,899 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 Copying task_200708210155_0003_m_002598_0 
> output from node50.inktomisearch.com.
> 2007-08-21 09:57:33,908 WARN org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 copy failed: 
> task_200708210155_0003_m_002598_0 from node50.inktomisearch.com
> 2007-08-21 09:57:33,908 WARN org.apache.hadoop.mapred.ReduceTask: 
> java.io.FileNotFoundException: 
> http://node50:50060/mapOutput?map=task_200708210155_0003_m_002598_0&reduce=6
>       at 
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
>       at 
> org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:207)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:673)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:631)
> 2007-08-21 10:00:56,337 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 Copying task_200708210155_0003_m_002598_1 
> output from node75.inktomisearch.com.
> 2007-08-21 10:00:56,342 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 done copying 
> task_200708210155_0003_m_002598_1 output from node75
> 2007-08-21 10:02:17,486 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200708210155_0003_r_000006_2 Ignoring obsolete copy result for Map Task: 
> task_200708210155_0003_m_002598_0 from host: node50
> Looking at TIP task_200708210155_0003_m_002598:
> task_200708210155_0003_m_002598_0     node50  KILLED  0.00%           
> 21-Aug-2007 09:38:49    Lost task tracker
> task_200708210155_0003_m_002598_1     node75  KILLED  0.00%           
> 21-Aug-2007 11:22:42    Lost task tracker
> task_200708210155_0003_m_002598_2     node55  SUCCEEDED       100.00% 
> 21-Aug-2007 11:22:46    21-Aug-2007 11:27:19 (4mins, 33sec)     
> task_200708210155_0003_m_002598_3     node49  KILLED  100.00% 21-Aug-2007 
> 11:22:48    21-Aug-2007 11:27:48 (4mins, 59sec)     Already completed TIP
> Notes:
> 1. Even finally the reducer seems to fetch data from the incorrect 
> TaskTracker, it is not checking with the job tracker for the final/correct 
> map output
> 2. It seems to retry more times and sleeps for longer time (looking at the 
> interval of log messages)
> 3. An obvious solution may be to go to the job tracker and directly get the 
> correct map output (I was able to get the correct map output from node55 
> using http, without any errors)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1764) Inconsistancy between Mapper/Reducer book keeping

Reply via email to