allow reducer to initialize lazily
----------------------------------
Key: MAPREDUCE-1956
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1956
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: tasktracker
Affects Versions: 0.20.2
Reporter: Ted Yu
From http://www.scribd.com/doc/23046928/Hadoop-Performance-Tuning:
"In M/R job Reducers are initialized with Mappers at the job initialization,
but the reduce method is called in reduce phase when all the maps had been
finished. So in large jobs where Reducer loads data (>100 MB for business
logic) in-memory on initialization, the performance can be increased by lazily
initializing Reducers i.e. loading data in reduce method controlled by an
initialize flag variable which assures that it is loaded only once. By lazily
initializing Reducers which require memory (for business logic) on
initialization, number of maps can be increased."
Introducing a configuration parameter that enables lazy reducer initialization
would let more users apply the above pattern.
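
For illustration, the flag-controlled pattern described in the quoted passage
might look roughly like the sketch below. It is plain Java and deliberately
does not use Hadoop's Reducer API; the class name LazyReducerSketch, the
lookup map, and the loadCount counter are all hypothetical stand-ins for a
reducer that needs a large in-memory data set for its business logic.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a reducer; this is NOT Hadoop's Reducer API.
class LazyReducerSketch {
    // The "initialize flag" from the quoted text: false until the first reduce call.
    private boolean initialized = false;

    // Hypothetical business-logic data; in the scenario above this would be >100 MB.
    private Map<String, Integer> lookup;

    // Counts how often the data was loaded, for demonstration only.
    int loadCount = 0;

    // Assumed expensive load; nothing here runs at construction time.
    private void loadLookup() {
        lookup = new HashMap<>();
        lookup.put("a", 10);
        lookup.put("b", 20);
        loadCount++;
    }

    // Sums the values for a key and scales the sum by the looked-up factor
    // (factor 1 when the key is not in the lookup data).
    int reduce(String key, List<Integer> values) {
        if (!initialized) {
            // The first call pays the loading cost; later calls skip it.
            loadLookup();
            initialized = true;
        }
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum * lookup.getOrDefault(key, 1);
    }
}
```

In this sketch no memory is taken by the lookup data while the object merely
exists, which is the effect the quoted passage attributes to lazy
initialization during the map phase. The parameter proposed in this issue
would presumably let the framework provide that deferral without each job
hand-coding the flag.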