Dhruv Kumar created HADOOP-8705:
-----------------------------------

             Summary: Add JSR 107 Caching support 
                 Key: HADOOP-8705
                 URL: https://issues.apache.org/jira/browse/HADOOP-8705
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Dhruv Kumar


Having a cache on mappers and reducers could be very useful for some use cases, 
including but not limited to:

1. Iterative MapReduce programs: Some machine learning algorithms (see Mahout) 
need access to the same invariant data on every iteration of MapReduce until 
convergence. A cache on the task nodes could allow easy access to this hot set 
of data without going all the way to the distributed cache.

2. Storing intermediate map and reduce outputs in memory to reduce shuffle 
time. This optimization has been discussed at length in the HaLoop paper 
(http://www.ics.uci.edu/~yingyib/papers/HaLoop_camera_ready.pdf).

There are some other scenarios as well where having a cache could come in 
handy. 

Now that JSR 107 is a caching standard, it would be nice to have some sort of 
pluggable support for JSR 107 compliant caches.
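
As a rough illustration of the first use case, the sketch below shows a 
read-through cache that a task could consult before falling back to a slower 
source such as the distributed cache. Everything in it is hypothetical: the 
TaskCache interface, MapTaskCache, and ReadThrough are names made up for this 
example and are not part of Hadoop or of the JSR 107 API. The point is only 
that a small get/put contract leaves room to plug a JSR 107 compliant cache in 
behind it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class TaskCacheSketch {

    /** Hypothetical minimal cache contract; a JSR 107 cache could sit behind it. */
    interface TaskCache<K, V> {
        V get(K key);             // returns null on a miss
        void put(K key, V value);
    }

    /** Simplest possible backing store: an unbounded in-memory map. */
    static class MapTaskCache<K, V> implements TaskCache<K, V> {
        private final Map<K, V> store = new HashMap<>();

        public V get(K key) { return store.get(key); }

        public void put(K key, V value) { store.put(key, value); }
    }

    /** Read-through wrapper: consult the cache first, then the slow loader. */
    static class ReadThrough<K, V> {
        private final TaskCache<K, V> cache;
        private final Function<K, V> slowLoader;
        private int loads = 0;    // how often the slow path was taken

        ReadThrough(TaskCache<K, V> cache, Function<K, V> slowLoader) {
            this.cache = cache;
            this.slowLoader = slowLoader;
        }

        V lookup(K key) {
            V value = cache.get(key);
            if (value == null) {
                value = slowLoader.apply(key);
                loads++;
                cache.put(key, value);
            }
            return value;
        }

        int loads() { return loads; }
    }

    public static void main(String[] args) {
        // The lambda stands in for an expensive read of invariant data.
        ReadThrough<String, double[]> invariant = new ReadThrough<>(
                new MapTaskCache<>(),
                key -> new double[] {1.0, 2.0});

        // Simulate five iterations that all need the same invariant data.
        for (int iteration = 0; iteration < 5; iteration++) {
            invariant.lookup("centroids");
        }
        System.out.println("slow loads: " + invariant.loads());
    }
}
```

With the loader above, the five iterations trigger the slow path once and the 
remaining lookups are served from memory. A real implementation would also 
need eviction and a way to keep the cache alive across task attempts, which 
this sketch does not address.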

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
