[ https://issues.apache.org/jira/browse/MAPREDUCE-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771769#action_12771769 ]
Qi Liu commented on MAPREDUCE-1171:
-----------------------------------

This is caused by a behavioral change in Hadoop 0.20.1. In 0.18.3, every map output fetch would retry N (by default 6) times. In 0.20.1 this is no longer the case: for a particular map output, only the first fetch attempt gets N retries. Once those first N retries fail, each subsequent fetch attempt, even one directed at a different node, gets only 2 retries before it fails. This greatly increases the chance of hitting "too many fetch failures". The relevant code is in src/mapred/org/apache/hadoop/mapred/ReduceTask.java, lines 2090 to 2104.

I would argue that if a subsequent fetch attempt goes to the same mapper node, 2 retries are enough. However, if the map output now comes from a different mapper node (basically a different map attempt), the fetch should again get the full N retries.
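For illustration, a minimal sketch of the retry policy proposed above, keyed on the map attempt ID. The class, method, and field names are invented for this example and do not come from ReduceTask.java; only MAX_FETCH_RETRIES_PER_MAP appears in the log quoted below, and the defaults of 6 and 2 retries are the values stated in the comment.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual ReduceTask source: choose a retry budget
// per fetch, granting the full budget whenever the map output comes from a new
// map attempt (e.g. after the map was re-executed on another node).
class FetchRetryPolicy {
    private static final int MAX_FETCH_RETRIES_PER_MAP = 6; // "N" above
    private static final int SUBSEQUENT_RETRIES = 2;        // 0.20.1's reduced budget

    // Last map attempt we tried to fetch each map's output from.
    private final Map<Integer, String> lastAttemptForMap =
        new HashMap<Integer, String>();

    int retriesFor(int mapId, String mapAttemptId) {
        String previous = lastAttemptForMap.put(mapId, mapAttemptId);
        if (previous == null || !previous.equals(mapAttemptId)) {
            // First fetch for this map, or the map was re-run as a new attempt:
            // allow the full N retries, as every fetch did in 0.18.3.
            return MAX_FETCH_RETRIES_PER_MAP;
        }
        // Re-fetching from the same attempt that already burned its N retries:
        // a short budget before reporting to the JobTracker is enough.
        return SUBSEQUENT_RETRIES;
    }
}
{code}

Under such a policy, a copier thread would ask retriesFor(...) for its budget before each copy, instead of the fixed behavior around lines 2090 to 2104.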
> Lots of fetch failures
> ----------------------
>
>                 Key: MAPREDUCE-1171
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1171
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.20.1
>            Reporter: Christian Kunz
>
> Since we upgraded to hadoop-0.20.1 from hadoop-0.18.3, we see a lot more map
> task failures because of 'Too many fetch-failures'.
> One of our jobs makes hardly any progress because 3000 reduces are not able
> to get the map output of 2 trailing maps (with about 80GB output each), which
> are repeatedly marked as failed because the reduces cannot fetch their map
> output.
> One difference from hadoop-0.18.3 seems to be that reduce tasks report a
> failed map output fetch after a single try when it was a read error
> (cr.getError().equals(CopyOutputErrorType.READ_ERROR)). I do not think this
> is a good idea, as trailing map tasks will be attacked by all reduces
> simultaneously.
> Here is a log output of a reduce task:
> {noformat}
> 2009-10-29 21:38:36,148 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200910281903_0028_r_000000_0 copy failed: attempt_200910281903_0028_m_002781_1 from some host
> 2009-10-29 21:38:36,148 WARN org.apache.hadoop.mapred.ReduceTask: java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>         at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>         at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
>         at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1064)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1496)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1377)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1289)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1220)
> 2009-10-29 21:38:36,149 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_200910281903_0028_r_000000_0: Failed fetch #1 from attempt_200910281903_0028_m_002781_1
> 2009-10-29 21:38:36,149 INFO org.apache.hadoop.mapred.ReduceTask: Failed to fetch map-output from attempt_200910281903_0028_m_002781_1 even after MAX_FETCH_RETRIES_PER_MAP retries... or it is a read error, reporting to the JobTracker.
> {noformat}
> I also saw a few log messages which look suspicious, as if successfully
> fetched map output is discarded because the map was marked as failed
> (because of too many fetch failures). This would make the situation even
> worse.
> {noformat}
> 2009-10-29 22:07:28,729 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_200910281903_0028_m_001076_0, compressed len: 21882555, decompressed len: 23967845
> 2009-10-29 22:07:28,729 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 23967845 bytes (21882555 raw bytes) into RAM from attempt_200910281903_0028_m_001076_0
> 2009-10-29 22:07:43,602 INFO org.apache.hadoop.mapred.ReduceTask: Read 23967845 bytes from map-output for attempt_200910281903_0028_m_001076_0
> 2009-10-29 22:07:43,602 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1 from attempt_200910281903_0028_m_001076_0 -> (20, 39772) from some host
> ...
> 2009-10-29 22:10:07,220 INFO org.apache.hadoop.mapred.ReduceTask: Ignoring obsolete output of FAILED map-task: 'attempt_200910281903_0028_m_001076_0'
> {noformat}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.