[ https://issues.apache.org/jira/browse/MAPREDUCE-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771769#action_12771769 ]
Qi Liu commented on MAPREDUCE-1171:
-----------------------------------

This is caused by a behavioral change in Hadoop 0.20.1. In 0.18.3, every map output fetch would retry N (by default 6) times. In 0.20.1 this is no longer the case: for a particular map output, only the first fetch attempt gets N retries. Once those first N retries fail, each subsequent fetch attempt, even one directed at a different node, gets only 2 retries before it fails. This greatly increases the chance of hitting "too many fetch failures". The relevant code is in src/mapred/org/apache/hadoop/mapred/ReduceTask.java, lines 2090 to 2104.

I would argue that if a subsequent fetch attempt goes to the same mapper node, 2 retries are enough. However, if the map output now comes from a different mapper node (basically a different map attempt), the fetch should again get the full N retries.
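For illustration, a minimal sketch of the retry policy proposed above, keyed on the map attempt ID. The class, method, and field names are invented for this example and do not come from ReduceTask.java; only MAX_FETCH_RETRIES_PER_MAP appears in the log quoted below, and the defaults of 6 and 2 retries are the values stated in the comment.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual ReduceTask source: choose a retry budget
// per fetch, granting the full budget whenever the map output comes from a new
// map attempt (e.g. after the map was re-executed on another node).
class FetchRetryPolicy {
    private static final int MAX_FETCH_RETRIES_PER_MAP = 6; // "N" above
    private static final int SUBSEQUENT_RETRIES = 2;        // 0.20.1's reduced budget

    // Last map attempt we tried to fetch each map's output from.
    private final Map<Integer, String> lastAttemptForMap =
        new HashMap<Integer, String>();

    int retriesFor(int mapId, String mapAttemptId) {
        String previous = lastAttemptForMap.put(mapId, mapAttemptId);
        if (previous == null || !previous.equals(mapAttemptId)) {
            // First fetch for this map, or the map was re-run as a new attempt:
            // allow the full N retries, as every fetch did in 0.18.3.
            return MAX_FETCH_RETRIES_PER_MAP;
        }
        // Re-fetching from the same attempt that already burned its N retries:
        // a short budget before reporting to the JobTracker is enough.
        return SUBSEQUENT_RETRIES;
    }
}
{code}

Under such a policy, a copier thread would ask retriesFor(...) for its budget before each copy, instead of the fixed behavior around lines 2090 to 2104.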
> Lots of fetch failures
> ----------------------
>
>                 Key: MAPREDUCE-1171
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1171
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.20.1
>            Reporter: Christian Kunz
>
> Since we upgraded to hadoop-0.20.1 from hadoop-0.18.3, we see a lot more map
> task failures because of 'Too many fetch-failures'.
> One of our jobs makes hardly any progress because 3000 reduces are not able
> to get the map output of 2 trailing maps (with about 80GB output each), which
> are repeatedly marked as failed because the reduces cannot fetch their map
> output.
> One difference from hadoop-0.18.3 seems to be that reduce tasks report a
> failed map output fetch after a single try when it was a read error
> (cr.getError().equals(CopyOutputErrorType.READ_ERROR)). I do not think this
> is a good idea, as trailing map tasks will be attacked by all reduces
> simultaneously.
> Here is a log output of a reduce task:
> {noformat}
> 2009-10-29 21:38:36,148 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200910281903_0028_r_000000_0 copy failed: attempt_200910281903_0028_m_002781_1 from some host
> 2009-10-29 21:38:36,148 WARN org.apache.hadoop.mapred.ReduceTask: java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1064)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1496)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1377)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1289)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1220)
> 2009-10-29 21:38:36,149 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_200910281903_0028_r_000000_0: Failed fetch #1 from attempt_200910281903_0028_m_002781_1
> 2009-10-29 21:38:36,149 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_200910281903_0028_m_002781_1 even after MAX_FETCH_RETRIES_PER_MAP retries... or it is a read error, reporting to the JobTracker.
> {noformat}
> I also saw a few log messages which look suspicious, as if successfully
> fetched map output is discarded because the map was marked as failed
> (because of too many fetch failures). This would make the situation even
> worse.
> {noformat}
> 2009-10-29 22:07:28,729 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_200910281903_0028_m_001076_0, compressed len: 21882555, decompressed len: 23967845
> 2009-10-29 22:07:28,729 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 23967845 bytes (21882555 raw bytes) into RAM from attempt_200910281903_0028_m_001076_0
> 2009-10-29 22:07:43,602 INFO org.apache.hadoop.mapred.ReduceTask: Read 23967845 bytes from map-output for attempt_200910281903_0028_m_001076_0
> 2009-10-29 22:07:43,602 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_200910281903_0028_m_001076_0 -> (20, 39772) from some host
> ...
> 2009-10-29 22:10:07,220 INFO org.apache.hadoop.mapred.ReduceTask: Ignoring obsolete output of FAILED map-task: 'attempt_200910281903_0028_m_001076_0'
> {noformat}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.