[
https://issues.apache.org/jira/browse/MAPREDUCE-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736464#action_12736464
]
eric baldeschwieler commented on MAPREDUCE-801:
-----------------------------------------------
Hi Doug,
I think we are making the perfect the enemy of the good here. A real bug
existed that cost us performance. Having 20 options on placement is not going
to improve scheduling noticeably. Having hundreds can bring down the
centralized resources of the system, and even 20 would cause lots of completely
unneeded work in the JT for little gain.
I'd like to see us discard anything beyond the first 5 options in the JT just
to keep bugs from DOSing the central server. I am not aware of any use case
where this would hinder performance. Having a warning and truncating this list
would have saved us a lot of resources and time.
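A minimal sketch of the truncate-and-warn idea, for illustration only. The
class name SplitLocationLimiter, the method limit(), and the use of
java.util.logging are all hypothetical and are not part of the Hadoop code
base; the cap of 5 is simply the figure suggested above.

```java
import java.util.Arrays;
import java.util.logging.Logger;

// Hypothetical helper, not an existing Hadoop class.
public class SplitLocationLimiter {
    private static final Logger LOG =
        Logger.getLogger(SplitLocationLimiter.class.getName());

    // Assumed cap; 5 is the number proposed in the comment above.
    public static final int DEFAULT_MAX_LOCATIONS = 5;

    // Returns the locations unchanged if they fit under the cap; otherwise
    // logs a warning and returns a copy holding only the first maxLocations.
    public static String[] limit(String[] locations, int maxLocations) {
        if (maxLocations < 0) {
            throw new IllegalArgumentException("maxLocations must be >= 0");
        }
        if (locations == null || locations.length <= maxLocations) {
            return locations;
        }
        LOG.warning("Split reports " + locations.length
            + " locations; keeping only the first " + maxLocations);
        return Arrays.copyOf(locations, maxLocations);
    }

    public static void main(String[] args) {
        String[] reported = {"h1", "h2", "h3", "h4", "h5", "h6", "h7"};
        String[] kept = limit(reported, DEFAULT_MAX_LOCATIONS);
        System.out.println(Arrays.toString(kept));
    }
}
```

Keeping the first entries rather than a random sample assumes the input
format lists its best locations first, which a buggy format may not do.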
The system is full of numbers. Sometimes it is simpler to harden the system
than to identify general principles. There are many places in the system where I think
this would be the wrong approach, but huge split lists are much more
likely to be the result of bugs or ignorance than need.
If we inject a warning and anyone hits the case, we can then do more work to
enhance this.
E14
> MAPREDUCE framework should issue warning with too many locations for a split
> ----------------------------------------------------------------------------
>
> Key: MAPREDUCE-801
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-801
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Reporter: Hong Tang
>
> A customized input format may be buggy and report misleading locations through
> its input splits, an example of which is PIG-878. When an input split returns too
> many locations, it not only artificially inflates the percentage of data-local
> or rack-local maps, but also forces the scheduler to use more memory and
> work harder to conduct task assignment.