[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643426#comment-13643426
 ] 

Sangjin Lee commented on MAPREDUCE-5186:
----------------------------------------

I understand this check was introduced early on with MAPREDUCE-1943. However, 
given how CombineFileInputFormat works, the default value (10) is too low to 
work well with it.

Currently we work around it by setting the property to a large value (greater 
than the number of data nodes). But it would be great if we could come up with 
a way to reconcile these two needs.
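
To illustrate why the workaround helps, here is a minimal sketch of the check 
as a plain comparison between a split's location count and the configured 
limit. It is a simplified model, not the actual JobSplitWriter code: the class 
SplitLocationCheck, the helpers exceedsLimit and hosts, and the cluster size 
of 40 data nodes are all assumptions made for illustration. Only the property 
name and its default of 10 come from the issue.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the split-location limit; not Hadoop code.
class SplitLocationCheck {
    // Property name and default value as stated in the issue.
    static final String MAX_SPLIT_LOCATIONS = "mapreduce.job.max.split.locations";
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10;

    // Hypothetical stand-in for the check: true if the split lists
    // more locations than the limit allows.
    static boolean exceedsLimit(List<String> locations, int maxLocations) {
        return locations.size() > maxLocations;
    }

    // Hypothetical helper: builds host names node0 .. node(n-1).
    static List<String> hosts(int n) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            result.add("node" + i);
        }
        return result;
    }

    public static void main(String[] args) {
        int dataNodes = 40; // assumed cluster size

        // Worst case for a "global" split: blocks from every data node.
        List<String> globalSplit = hosts(dataNodes);

        // With the default limit, the split is over the limit.
        System.out.println(MAX_SPLIT_LOCATIONS + "=" + DEFAULT_MAX_SPLIT_LOCATIONS
                + " exceeded: " + exceedsLimit(globalSplit, DEFAULT_MAX_SPLIT_LOCATIONS));

        // Workaround: a limit larger than the number of data nodes
        // can never be exceeded, since a split cannot list more hosts
        // than the cluster has.
        int raised = dataNodes + 1;
        System.out.println(MAX_SPLIT_LOCATIONS + "=" + raised
                + " exceeded: " + exceedsLimit(globalSplit, raised));
    }
}
```

In a real job the same effect would presumably come from setting the property 
on the job configuration, for example with 
conf.setInt("mapreduce.job.max.split.locations", n) on a Hadoop Configuration 
object, where n is larger than the number of data nodes.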
                
> mapreduce.job.max.split.locations causes some splits created by 
> CombineFileInputFormat to fail
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5186
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sangjin Lee
>
> CombineFileInputFormat can easily create splits whose blocks come from many 
> different locations (during the last pass of creating "global" splits). 
> However, we observe that this often runs afoul of the 
> mapreduce.job.max.split.locations check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and on any 
> decent-sized cluster, CombineFileInputFormat creates splits whose location 
> counts are well above this limit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
