[
https://issues.apache.org/jira/browse/MAPREDUCE-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643426#comment-13643426
]
Sangjin Lee commented on MAPREDUCE-5186:
----------------------------------------
I understand this check was introduced early on, with MAPREDUCE-1943. However,
CombineFileInputFormat routinely creates splits with more locations than the
default limit (10) allows.
Currently we work around it by setting mapreduce.job.max.split.locations to a
large value (greater than the number of data nodes). It would be great if we
could come up with a way to reconcile the check with CombineFileInputFormat.
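
To make the workaround concrete, here is a minimal sketch of the check and of
the effect of raising the limit. It is a simplified stand-in, not the actual
JobSplitWriter source: the class SplitLocationCheck, the helper hosts(), the
plain Map used in place of the job configuration, and the cluster size of 40
are all assumptions made for illustration. Only the property name and the
default of 10 come from this issue.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Simplified model of the split-location check; NOT the actual Hadoop code.
class SplitLocationCheck {
    static final String MAX_SPLIT_LOCATIONS = "mapreduce.job.max.split.locations";
    static final int DEFAULT_MAX_SPLIT_LOCATIONS = 10; // default cited in this issue

    // Reads the limit from a plain map standing in for the job configuration.
    static int maxLocations(Map<String, Integer> conf) {
        return conf.getOrDefault(MAX_SPLIT_LOCATIONS, DEFAULT_MAX_SPLIT_LOCATIONS);
    }

    // Rejects a split whose location list is longer than the configured limit.
    static void checkSplit(String[] locations, Map<String, Integer> conf)
            throws IOException {
        int max = maxLocations(conf);
        if (locations.length > max) {
            throw new IOException("Split has " + locations.length
                    + " locations, limit is " + max);
        }
    }

    // Hypothetical helper: host names for a combined split spanning n data nodes.
    static String[] hosts(int n) {
        String[] names = new String[n];
        for (int i = 0; i < n; i++) {
            names[i] = "datanode-" + i;
        }
        return names;
    }

    // Runs the check and reports whether the split was accepted.
    static boolean accepted(String[] locations, Map<String, Integer> conf) {
        try {
            checkSplit(locations, conf);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int dataNodes = 40; // assumed cluster size
        String[] combinedSplit = hosts(dataNodes);
        Map<String, Integer> conf = new HashMap<>();

        // With the default limit, a split touching every data node is rejected.
        System.out.println("default limit accepted: " + accepted(combinedSplit, conf));

        // Workaround: raise the limit above the number of data nodes.
        conf.put(MAX_SPLIT_LOCATIONS, dataNodes + 1);
        System.out.println("raised limit accepted: " + accepted(combinedSplit, conf));
    }
}
```

In a real job the same property would be set on the job configuration, for
example through Configuration.setInt or the job's configuration files; the map
above only stands in for that.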
> mapreduce.job.max.split.locations causes some splits created by
> CombineFileInputFormat to fail
> ----------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5186
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5186
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv1, mrv2
> Affects Versions: 2.0.4-alpha
> Reporter: Sangjin Lee
>
> CombineFileInputFormat can easily create splits that span many different
> locations (during the last pass of creating "global" splits). However, we
> observe that this often runs afoul of the mapreduce.job.max.split.locations
> check that's done by JobSplitWriter.
> The default value for mapreduce.job.max.split.locations is 10, and on any
> decent-sized cluster, CombineFileInputFormat creates splits whose location
> counts are well above this limit.