[ https://issues.apache.org/jira/browse/MAPREDUCE-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551474#comment-13551474 ]

Bikas Saha commented on MAPREDUCE-4892:
---------------------------------------

I did consider the alternate implementation. There are arguments against it.
In general, and especially with skewed distributions, we would end up running
the inner for loop (the one that tries to group blocks) multiple times with no
useful outcome. For, say, 10K machines that could noticeably slow down the
grouping calculation.
Secondly, for large jobs with multi-wave mappers, the effects of non-local
reads or delay scheduling are likely small compared to the overall job cost
(including multiple runs). So this change mainly boils down to its effect on
single-mapper-wave jobs or small jobs. In those cases, it is arguable that
having a few rack-local mappers (which significantly improves the chance that
mappers start on time) is more beneficial than having a mapper wait on a
machine just because it is node-local to it. The likelihood of waiting is
high, since such a machine already has many mappers assigned to it.
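
To make the inner-loop cost concrete, here is a toy sketch. It assumes the
alternate implementation cycles over all nodes, taking at most one split per
node per pass, and repeats passes until nothing more can be grouped; every
name in it is illustrative, not actual CombineFileInputFormat or patch code.
With a skewed distribution, almost every inner iteration scans a node that
cannot form a split:

{code:java}
import java.util.Arrays;

public class RoundRobinGroupingCost {
    public static void main(String[] args) {
        int numNodes = 10_000;          // "10K machines"
        int maxBlocksPerSplit = 3;      // hypothetical split size in blocks
        int[] blocksOnNode = new int[numNodes];

        // Skewed distribution: one node holds far more blocks than the rest.
        Arrays.fill(blocksOnNode, 1);   // too few to ever form a split
        blocksOnNode[0] = 300;          // the skewed node

        long innerIterations = 0;
        boolean madeProgress = true;
        while (madeProgress) {          // one outer round per pass
            madeProgress = false;
            for (int n = 0; n < numNodes; n++) {
                innerIterations++;
                if (blocksOnNode[n] >= maxBlocksPerSplit) {
                    blocksOnNode[n] -= maxBlocksPerSplit; // emit one split
                    madeProgress = true;
                }
            }
        }
        // ~101 passes x 10,000 nodes: over a million inner iterations,
        // nearly all of which group nothing.
        System.out.println("inner-loop iterations: " + innerIterations);
    }
}
{code}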

That being said, I did not fully follow your example. What is the
maxMapperSize? If it is 3 blocks, then how can 9 splits be created? If it is
not 3 blocks, then why would the average per node be 3?
                
> CombineFileInputFormat node input split can be skewed on small clusters
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4892
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-4892.1.patch
>
>
> The CombineFileInputFormat split generation logic tries to group blocks by 
> node in order to create splits. It iterates through the nodes and creates 
> splits on them until there aren't enough blocks left on a node to be grouped 
> into a valid split. If the first few nodes have a lot of blocks on them, 
> they can end up getting a disproportionately large share of the total number 
> of splits created. This can result in poor locality of maps. This problem is 
> likely to happen on small clusters, where it's easier to create a skew in 
> the distribution of blocks on nodes.
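
For concreteness, a toy sketch of the grouping behavior described above
(illustrative names only, not the actual CombineFileInputFormat code):
exhausting each node before moving on lets the first, heavily loaded node
take a disproportionate share of the splits.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class GreedyNodeGrouping {
    public static void main(String[] args) {
        // Hypothetical skewed layout: node0 holds most of the blocks.
        Map<String, Integer> blocksPerNode = new LinkedHashMap<>();
        blocksPerNode.put("node0", 9);
        blocksPerNode.put("node1", 2);
        blocksPerNode.put("node2", 1);

        int maxBlocksPerSplit = 3; // stand-in for the max split size in blocks
        Map<String, Integer> splitsPerNode = new LinkedHashMap<>();

        // Described behavior: carve full splits from each node until it no
        // longer has enough blocks, then move on to the next node.
        for (Map.Entry<String, Integer> e : blocksPerNode.entrySet()) {
            int blocks = e.getValue();
            int splits = 0;
            while (blocks >= maxBlocksPerSplit) {
                blocks -= maxBlocksPerSplit;
                splits++;
            }
            splitsPerNode.put(e.getKey(), splits);
        }

        // Prints {node0=3, node1=0, node2=0}: all node-local splits land on
        // node0. (The real code would later group leftovers at rack level;
        // omitted here.)
        System.out.println(splitsPerNode);
    }
}
{code}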
