Eric,

On Mon, Aug 20, 2007 at 12:31:23PM -0700, Eric Zhang wrote:
>Hi, 
>I have a Hadoop application where each run of the map can potentially
>generate a large amount of key/value pairs, which causes an out-of-memory
>error.  I am wondering if there is a way to tell Hadoop to write the
>key/value pairs to disk periodically?
> 

The standard OutputCollector already sorts and flushes key/value pairs to
disk periodically; that said, you could still see memory-related issues
during the sort itself.

What is the observed size of your map outputs? Try increasing the child-JVM
memory limit via mapred.child.java.opts (the default is 200M).
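For example, something like the following in hadoop-site.xml should work
(the property name is from the 0.x-era configuration; the 512m heap is just
an illustrative value, size it to your actual map outputs):

```xml
<!-- hadoop-site.xml: raise the map/reduce child JVM heap above the
     default -Xmx200m. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```

If your job's main class goes through ToolRunner/GenericOptionsParser, you
can also override it per-job on the command line with
-D mapred.child.java.opts=-Xmx512m rather than editing the site config.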

Arun

>thanks,  
> 
>Eric Zhang
>Vespa content @Yahoo!
>Work: 408-349-2466
> 
> 