Hi,

I read about lazy initialization of Reducers in the document at the URL
below.

http://www.scribd.com/doc/23046928/Hadoop-Performance-Tuning



It says: “In M/R job Reducers are initialized with Mappers at the job 
initialization, but the reduce method is called in reduce phase when all the 
maps had been finished. So in large jobs where Reducer loads data (>100 MB for 
business logic) in-memory on initialization, the performance can be increased 
by lazily initializing Reducers i.e. loading data in reduce method controlled 
by an initialize flag variable which assures that it is loaded only once. By 
lazily initializing Reducers which require memory (for business logic) on 
initialization, number of maps can be increased.”
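
Read literally, the quoted passage seems to describe a coding pattern rather than a configuration setting: the data is loaded on the first call to the reduce method, guarded by a flag. Below is a minimal sketch of that reading in plain Java, without any Hadoop dependency so that it stays self-contained. The class name, the loadLookupData() helper, and the sample data are all hypothetical; in a real job the same guard would sit at the top of the Reducer's reduce method.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Reducer; it only mimics the shape of reduce().
class LazyReducerSketch {
    // Placeholder for the large in-memory data needed by the business logic.
    private Map<String, Integer> lookup;
    // The "initialize flag variable" mentioned in the quoted text.
    private boolean initialized = false;
    // Counts loads so the "only once" behaviour can be checked.
    private int loadCount = 0;

    // Hypothetical loader; a real one would read a file or a cache.
    private Map<String, Integer> loadLookupData() {
        loadCount++;
        Map<String, Integer> data = new HashMap<>();
        data.put("a", 10);
        data.put("b", 20);
        return data;
    }

    // Nothing is loaded at construction time; the first call pays the cost.
    public int reduce(String key, List<Integer> values) {
        if (!initialized) {
            lookup = loadLookupData();
            initialized = true;
        }
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        // Toy business logic: scale the sum by the looked-up factor.
        return sum * lookup.getOrDefault(key, 1);
    }

    public int getLoadCount() {
        return loadCount;
    }
}
```

In this sketch, constructing the object loads nothing, and calling reduce several times triggers loadLookupData() only on the first call.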



But I did not find any other resource that discusses lazy initialization of
Reducers.

Does anyone have experience with this?

If so, how and where can I set this parameter to get it working?



Thanks for the support.


Regards
Syed Wasti


                                          
