[
https://issues.apache.org/jira/browse/HADOOP-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674149#action_12674149
]
Sharad Agarwal commented on HADOOP-5241:
----------------------------------------
The high estimate arises because the blow-up ratio of each completed task is
folded into a running average. Maps that have 0 input create a lot of skew. In
the current code, consider this case:
1. first map completes; blowupOnThisTask (output/input) = 1000/200 = 5
mapBlowupRatio = 5
2. second map completes; blowupOnThisTask (output/input) = 50/1 = 50 (here the
input was 0, counted as 1, and some output was produced)
mapBlowupRatio becomes 50/2 + ((2 - 1) / 2) * 5 = 27.5
This is an unreasonable jump in the blow-up ratio from 5 to 27.5.
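The incremental average above can be sketched as follows. This is an illustrative reconstruction, not the actual ResourceEstimator code; the class and method names are hypothetical, and a zero input is assumed to be clamped to 1 before dividing:

```java
// Sketch of the buggy per-task averaging (illustrative names only).
public class IncrementalBlowup {
    double mapBlowupRatio = 0.0;
    int completedMaps = 0;

    // Fold each task's output/input ratio into a running average.
    void taskCompleted(long inputSize, long outputSize) {
        completedMaps++;
        // Assumption: a zero input is counted as 1 to avoid dividing by zero.
        double ratioOnThisTask = (double) outputSize / Math.max(inputSize, 1);
        mapBlowupRatio = ratioOnThisTask / completedMaps
            + ((completedMaps - 1.0) / completedMaps) * mapBlowupRatio;
    }

    public static void main(String[] args) {
        IncrementalBlowup est = new IncrementalBlowup();
        est.taskCompleted(200, 1000);  // ratio 5, average = 5.0
        est.taskCompleted(0, 50);      // ratio 50, average jumps to 27.5
        System.out.println(est.mapBlowupRatio);
    }
}
```

A single near-empty map carries the same weight as a large one, so one outlier ratio drags the whole average.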
The fix I propose is as follows:
Instead of averaging the per-task blow-up ratios, calculate the blow-up as
(cumulative completed map output size) / (cumulative completed map input
size). For the above example, it works out as follows:
1. first map completes; blowupOnThisTask (output/input) = 1000/200 = 5
mapBlowupRatio = 5
2. second map completes; blowupOnThisTask (output/input) = 50/1 = 50 (here the
input was 0, counted as 1, and some output was produced)
mapBlowupRatio becomes (1000 + 50)/(200 + 1) ~ 5.2
This is a reasonable change in the blow-up ratio from 5 to 5.2.
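The proposed fix can be sketched along the same lines. Again the names are illustrative rather than the actual patch, and a zero input is assumed to be counted as 1, matching the arithmetic above:

```java
// Sketch of the proposed fix (illustrative names only): accumulate totals
// and derive the ratio from them, so a single zero-input map cannot
// dominate the estimate.
public class CumulativeBlowup {
    long totalMapInput = 0;
    long totalMapOutput = 0;

    // Accumulate each completed map's sizes instead of its ratio.
    void taskCompleted(long inputSize, long outputSize) {
        // Assumption: a zero input is counted as 1, as in the example above.
        totalMapInput += Math.max(inputSize, 1);
        totalMapOutput += outputSize;
    }

    // Blow-up = total output / total input over all completed maps.
    double blowupRatio() {
        // Guard against division by zero before any map has completed.
        return totalMapInput == 0 ? 0.0
            : (double) totalMapOutput / totalMapInput;
    }

    public static void main(String[] args) {
        CumulativeBlowup est = new CumulativeBlowup();
        est.taskCompleted(200, 1000);  // ratio = 1000/200 = 5.0
        est.taskCompleted(0, 50);      // ratio = 1050/201 ~ 5.2
        System.out.println(est.blowupRatio());
    }
}
```

Because every byte is weighted equally, small or empty maps contribute in proportion to their size rather than one full share of the average.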
> Reduce tasks get stuck because of over-estimated task size (regression from
> 0.18)
> ---------------------------------------------------------------------------------
>
> Key: HADOOP-5241
> URL: https://issues.apache.org/jira/browse/HADOOP-5241
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.19.0
> Environment: Red Hat Enterprise Linux Server release 5.2
> JDK 1.6.0_11
> Hadoop 0.19.0
> Reporter: Andy Pavlo
> Assignee: Sharad Agarwal
> Priority: Blocker
> Fix For: 0.19.1
>
> Attachments: hadoop-jobtracker.log.gz
>
>
> I have a simple MR benchmark job that computes PageRank on about 600 GB of
> HTML files using a 100 node cluster. For some reason, my reduce tasks get
> caught in a pending state. The JobTracker's log gets filled with the
> following messages:
> 2009-02-12 15:47:29,839 WARN org.apache.hadoop.mapred.JobInProgress: No room
> for reduce task. Node tracker_d-59.cs.wisc.edu:localhost/127.0.0.1:33227 has
> 110125027328 bytes free; but we expect reduce input to take 399642198235
> 2009-02-12 15:47:29,852 WARN org.apache.hadoop.mapred.JobInProgress: No room
> for reduce task. Node tracker_d-67.cs.wisc.edu:localhost/127.0.0.1:48626 has
> 107537776640 bytes free; but we expect reduce input to take 399642198235
> 2009-02-12 15:47:29,885 WARN org.apache.hadoop.mapred.JobInProgress: No room
> for reduce task. Node tracker_d-73.cs.wisc.edu:localhost/127.0.0.1:58849 has
> 113631690752 bytes free; but we expect reduce input to take 399642198235
> <SNIP>
> The weird thing is that about 70 reduce tasks complete before it hangs. If I
> reduce the amount of input data on 100 nodes down to 200 GB, then it seems to
> work. As I scale the amount of input to the number of nodes, I can get it to
> work some of the time on 50 nodes, and without any problems on 25 nodes or
> fewer.
> Note that it worked without any problems on Hadoop 0.18 late last year
> without changing any of the input data or the actual MR code.