[
https://issues.apache.org/jira/browse/MAPREDUCE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Azuryy(Chijiong) updated MAPREDUCE-3323:
----------------------------------------
Attachment: (was: dc.patch)
> Distributed Cache for Map or Reduce or Both
> -------------------------------------------
>
> Key: MAPREDUCE-3323
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3323
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: distributed-cache, tasktracker
> Affects Versions: 0.20.203.0
> Reporter: Azuryy(Chijiong)
> Attachments: DistributedCache.patch, TaskTracker.patch
>
>
> We put files into the Distributed Cache, but sometimes only the map tasks or
> only the reduce tasks use them, not both. The TaskTracker nevertheless always
> downloads every cached file from HDFS, which is time-consuming when the cache
> contains fairly large files.
> This patch therefore adds the following new APIs to DistributedCache.java:
> addArchiveToClassPathForMap
> addArchiveToClassPathForReduce
> addFileToClassPathForMap
> addFileToClassPathForReduce
> addCacheFileForMap
> addCacheFileForReduce
> addCacheArchiveForMap
> addCacheArchiveForReduce
> The new APIs do not affect the original interface; each one applies to map
> tasks only or to reduce tasks only, not both.
> If you need a cached file during both map and reduce, use the original
> interface.
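
The intended behaviour of the phase-specific methods can be sketched as a small standalone model. This is not the patch and does not use Hadoop: only the method names addCacheFile, addCacheFileForMap and addCacheFileForReduce come from the description above, while the class PhaseScopedCacheSketch, the Scope enum, the filesToLocalize helper and the example URIs are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of phase-scoped cache registration; not Hadoop code. */
public class PhaseScopedCacheSketch {

    /** Which task type a cached file is registered for (assumed naming). */
    enum Scope { MAP, REDUCE, BOTH }

    private final Map<Scope, List<URI>> files = new EnumMap<>(Scope.class);

    public PhaseScopedCacheSketch() {
        for (Scope s : Scope.values()) {
            files.put(s, new ArrayList<>());
        }
    }

    /** Stands in for the original interface: the file is needed by both phases. */
    public void addCacheFile(URI uri) {
        files.get(Scope.BOTH).add(uri);
    }

    /** Stands in for the proposed map-only registration. */
    public void addCacheFileForMap(URI uri) {
        files.get(Scope.MAP).add(uri);
    }

    /** Stands in for the proposed reduce-only registration. */
    public void addCacheFileForReduce(URI uri) {
        files.get(Scope.REDUCE).add(uri);
    }

    /**
     * Hypothetical helper: the files a task of the given type would have to
     * fetch, namely the shared entries plus the entries for its own phase.
     */
    public List<URI> filesToLocalize(boolean isMapTask) {
        List<URI> result = new ArrayList<>(files.get(Scope.BOTH));
        result.addAll(files.get(isMapTask ? Scope.MAP : Scope.REDUCE));
        return Collections.unmodifiableList(result);
    }

    public static void main(String[] args) {
        PhaseScopedCacheSketch cache = new PhaseScopedCacheSketch();
        cache.addCacheFile(URI.create("hdfs://namenode/cache/common.txt"));
        cache.addCacheFileForMap(URI.create("hdfs://namenode/cache/map-lookup.dat"));
        cache.addCacheFileForReduce(URI.create("hdfs://namenode/cache/reduce-model.bin"));

        // A map task skips the reduce-only file, and vice versa.
        System.out.println("map task fetches:    " + cache.filesToLocalize(true));
        System.out.println("reduce task fetches: " + cache.filesToLocalize(false));
    }
}
```

In this model a large reduce-only file is never returned for a map task, which is the download saving the description is after. How the real patch records the scope in the job configuration is not stated here.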