Eric,

On Mon, Aug 20, 2007 at 12:31:23PM -0700, Eric Zhang wrote:
> Hi,
> I have a hadoop application where each run of the map could potentially
> generate a large amount of key/value pairs, so it caused an out-of-memory
> error. I am wondering if there is a way to inform hadoop to write the
> key/value pairs to disk periodically?
The standard OutputCollector already sorts and flushes key/value pairs to
disk periodically, but you could still see memory-related issues during the
sort itself. What is the observed size of your map outputs?

Try increasing the child-JVM memory limit via mapred.child.java.opts (the
default is 200M).

Arun

> thanks,
>
> Eric Zhang
> Vespa content @Yahoo!
> Work: 408-349-2466
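For reference, a sketch of what raising that limit might look like in your
site configuration; the 512M value is only illustrative, so pick a size that
fits your task nodes:

```
<!-- hadoop-site.xml: JVM options passed to each map/reduce child task.
     -Xmx512m is an example heap size, not a recommendation. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```

You can also set this per-job from code with
conf.set("mapred.child.java.opts", "-Xmx512m") on your JobConf, so only the
memory-hungry job pays for the larger heap.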
