[
https://issues.apache.org/jira/browse/MAPREDUCE-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Devaraj K resolved MAPREDUCE-2647.
----------------------------------
Resolution: Won't Fix
Closing this as Won't Fix because there is no active feature development
happening in MRv1.
> Memory sharing across all the Tasks in the Task Tracker to improve the job
> performance
> --------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-2647
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2647
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: tasktracker
> Reporter: Devaraj K
> Assignee: Devaraj K
>
> If all the tasks (maps/reduces) of a job work with the same additional
> data, each task currently has to load that data into memory individually
> before reading it. Every task repeats the same work. Instead of each task
> loading the data, the data can be loaded into main memory once and used by
> all the tasks.
> h5.Proposed Solution:
> 1. Provide a mechanism to load the data into shared memory and to read that
> data from main memory.
> 2. We can provide a Java API, which internally uses a native implementation
> to read the data from the memory. All the maps/reducers can use this API for
> reading the data from the main memory.
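As a rough illustration of what point 2 could look like from the task side, here is a minimal sketch. It is not the API proposed in this issue and not part of Hadoop: the class name SharedDataReader is hypothetical, and it swaps the native shared-memory implementation for a plain java.nio read-only memory-mapped file. The assumption is that when several task JVMs on one node map the same local file, the operating system can typically serve them from the same cached pages rather than each task holding its own heap copy.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not a Hadoop class.
final class SharedDataReader {
    private final MappedByteBuffer buffer;

    private SharedDataReader(MappedByteBuffer buffer) {
        this.buffer = buffer;
    }

    // Maps the whole file read-only. The mapping stays valid after the
    // channel is closed, so the channel can be released immediately.
    static SharedDataReader open(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return new SharedDataReader(mapped);
        }
    }

    // Number of bytes available through this mapping.
    int size() {
        return buffer.capacity();
    }

    // Copies `length` bytes starting at `offset` out of the mapping.
    // A duplicate view is used so concurrent callers do not disturb
    // each other's position.
    byte[] read(int offset, int length) {
        byte[] out = new byte[length];
        ByteBuffer view = buffer.duplicate();
        view.position(offset);
        view.get(out, 0, length);
        return out;
    }
}
```

A task would call SharedDataReader.open(...) once, for example during setup, and then call read(...) as needed. Whether this actually saves memory and load time depends on the platform's page cache behaviour, which is outside the control of this code.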
> h5.Example:
> Suppose in a map task, the IP address is the key and the task needs to get
> the location of that IP address from a local file. In this case each map
> task has to load the file into main memory, read from it and close it.
> Opening, reading and processing the file takes time in every task. Instead,
> we can load the file into the TaskTracker's memory once and each task can
> read from that memory directly.
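To make the IP address example concrete, the lookup could be sketched as below. Everything here is an assumption for illustration: the class name IpLocationTable, the 12-byte record layout (range start, range end, location id, all big-endian 4-byte integers, sorted by range start), and the use of -1 for "not found" do not come from the issue. The table reads straight from a ByteBuffer without building per-task objects, so it would work the same way over a buffer backed by shared or memory-mapped data.

```java
import java.nio.ByteBuffer;

// Hypothetical lookup table for illustration only; not a Hadoop class.
// Assumed record layout: [4-byte range start][4-byte range end][4-byte location id],
// big-endian, sorted by range start (compared as unsigned values).
final class IpLocationTable {
    private static final int RECORD_BYTES = 12;
    private final ByteBuffer data;
    private final int records;

    IpLocationTable(ByteBuffer source) {
        // slice() rebases index 0 to the source's current position;
        // the read-only view guards the shared bytes against writes.
        this.data = source.slice().asReadOnlyBuffer();
        this.records = this.data.remaining() / RECORD_BYTES;
    }

    // Converts dotted-quad text such as "10.0.0.17" to a 32-bit value.
    static int parseIpv4(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Not an IPv4 address: " + dotted);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // Binary search for the last range whose start is <= ip, then check
    // that ip also falls at or below that range's end.
    // Returns the location id, or -1 if no range contains the address.
    int lookup(int ip) {
        int lo = 0;
        int hi = records - 1;
        int found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int start = data.getInt(mid * RECORD_BYTES);
            if (Integer.compareUnsigned(start, ip) <= 0) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (found < 0) {
            return -1;
        }
        int end = data.getInt(found * RECORD_BYTES + 4);
        return Integer.compareUnsigned(ip, end) <= 0
            ? data.getInt(found * RECORD_BYTES + 8)
            : -1;
    }
}
```

In a map task, map() would call lookup(IpLocationTable.parseIpv4(key)) for each record. Only absolute reads are used, so one table instance can be shared by several threads without locking.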
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)