[ 
https://issues.apache.org/jira/browse/MAPREDUCE-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758015#action_12758015
 ] 

eric baldeschwieler commented on MAPREDUCE-989:
-----------------------------------------------

Philip's idea seems interesting.

Rather than specifying map vs. reduce, would it be possible to simply load an 
item lazily when it is needed?  Asking users to configure more settings seems 
awkward.

In a previous system, we had a call to get the path to a cached object.  If the 
object was not in the cache, it was downloaded upon the first request.  This 
would allow objects to be used whenever needed without configuration.

Note: This would even allow one to shard cached objects into sets if, for 
example, different reducers need different data, which is often the case.
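
The lazy lookup described above could be sketched roughly as follows. This is 
only an illustration, not Hadoop's DistributedCache API: the class LazyCache, 
the Fetcher interface, and the method names are all hypothetical, and the 
fetcher stands in for whatever actually copies an object to local disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a lazily populated cache; not a Hadoop class. */
public class LazyCache {

    /** Assumed stand-in for the code that copies a remote object to local disk. */
    public interface Fetcher {
        void fetch(String name, Path target) throws IOException;
    }

    private final Path localDir;
    private final Fetcher fetcher;
    // Objects that have already been downloaded, keyed by name.
    private final ConcurrentMap<String, Path> localized = new ConcurrentHashMap<>();
    // Counts actual downloads, only so the behaviour is observable.
    private final AtomicInteger downloads = new AtomicInteger();

    public LazyCache(Path localDir, Fetcher fetcher) {
        this.localDir = localDir;
        this.fetcher = fetcher;
    }

    /**
     * Returns the local path of the named object, downloading it on the
     * first request. Later requests for the same name reuse the local copy.
     */
    public Path getLocalPath(String name) {
        return localized.computeIfAbsent(name, n -> {
            Path target = localDir.resolve(n);
            try {
                fetcher.fetch(n, target);
            } catch (IOException e) {
                // Nothing is recorded on failure, so a later call retries.
                throw new UncheckedIOException(e);
            }
            downloads.incrementAndGet();
            return target;
        });
    }

    /** Number of objects actually downloaded so far. */
    public int downloadCount() {
        return downloads.get();
    }
}
```

With something like this, a reducer handling partition 3 could ask for a 
shard by name, e.g. getLocalPath("lookup-part-3") (a made-up naming scheme), 
and only that shard would be downloaded to its node; objects that no task 
requests are never fetched.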

Downsides: 
- Perhaps an API change?
- The JT (JobTracker) has less information available for later optimizations

> Allow segregation of DistributedCache for maps and reduces
> ----------------------------------------------------------
>
>                 Key: MAPREDUCE-989
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-989
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>            Reporter: Arun C Murthy
>
> Applications might have differing needs for files in the DistributedCache wrt 
> maps and reduces. We should allow them to specify them separately.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
