[
https://issues.apache.org/jira/browse/MAPREDUCE-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800644#comment-13800644
]
Ben Podgursky commented on MAPREDUCE-199:
-----------------------------------------
Hey Harsh. The delay was because I've only worked with MR1 so far (Cloudera
Hadoop 4), and all of my source suggestions were in the context of MR1, so I
spent a bit of time checking what changed in the source between MR1 and MR2.
After looking around, your patch seems like a pretty nice way of enabling this
functionality without baking anything else into the API or complicating the
code (since it builds on locality logic which already exists).
The other alternative I was thinking about was making the logic pluggable via
the JobConf, similar to how partitioners are set, e.g.:
conf.setReduceTaskLocalizer(MyLocalityLogic.class);
where MyLocalityLogic would contain the logic for assigning tasks to hosts.
I'm not really sure how it would work, though, since (1) I'm not sure whether
user code is on the classpath at the time tasks are assigned to nodes, and
(2) the locality logic would need to be presented with the whole network
topology to be able to do anything intelligent, and I'm not sure where that
would come from...
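To make the pluggable idea a bit more concrete, here is a rough sketch of what
such a hook might look like. Everything in it is hypothetical: neither
ReduceTaskLocalizer nor setReduceTaskLocalizer exists in Hadoop, the method
signature is my own guess, and a flat list of host names stands in for the
network topology that concern (2) says would be needed. It has no Hadoop
dependencies, so it only illustrates the shape of the contract.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plug-in point; nothing like this exists in the Hadoop API.
interface ReduceTaskLocalizer {
    // Returns the preferred hosts for one reduce partition, most preferred
    // first. An empty list means "no preference".
    // clusterHosts is an assumed stand-in for a real network topology.
    List<String> getPreferredHosts(int partition, int numReduceTasks,
                                   List<String> clusterHosts);
}

// Example user logic: spread partitions round-robin over the known hosts.
final class RoundRobinLocalizer implements ReduceTaskLocalizer {
    @Override
    public List<String> getPreferredHosts(int partition, int numReduceTasks,
                                          List<String> clusterHosts) {
        if (clusterHosts.isEmpty()) {
            return Collections.emptyList();
        }
        String host = clusterHosts.get(partition % clusterHosts.size());
        return Collections.singletonList(host);
    }
}

final class ReduceLocalizerSketch {
    // Stand-in for whatever component assigns reducers: it asks the
    // localizer for a hint per partition and collects the answers.
    static Map<Integer, List<String>> assign(ReduceTaskLocalizer localizer,
                                             int numReduceTasks,
                                             List<String> clusterHosts) {
        Map<Integer, List<String>> hints = new LinkedHashMap<>();
        for (int p = 0; p < numReduceTasks; p++) {
            hints.put(p, localizer.getPreferredHosts(p, numReduceTasks, clusterHosts));
        }
        return hints;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("host-a", "host-b", "host-c");
        System.out.println(assign(new RoundRobinLocalizer(), 4, hosts));
    }
}
```

Even in this toy form, both open questions show up: the scheduler side has to
be able to load RoundRobinLocalizer (the classpath question), and something
has to supply clusterHosts (the topology question).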
> Locality hints for Reduce
> -------------------------
>
> Key: MAPREDUCE-199
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-199
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: applicationmaster, mrv2
> Reporter: Benjamin Reed
> Assignee: Harsh J
> Attachments: MAPREDUCE-199.patch, MAPREDUCE-199.patch
>
>
> It would be nice if we could add a method to OutputFormat that would allow a
> job to indicate where a reducer for a given partition should run. This
> is similar to the getSplits() method on InputFormat. In our application the
> reducer is using other data in addition to the map outputs during processing
> and data accesses could be made more efficient if the JobTracker scheduled
> the reducers to run on specific hosts.