Can you share your job details (or some sample reducer code) and the exact error you are seeing?
If your implementation holds the reducer-provided values/keys in memory, it can easily cause an OutOfMemoryError (OOME) if not handled properly. The reducer by itself reads the values off a sorted file on disk and doesn't cache the whole group in memory.

On Thu, May 10, 2012 at 12:20 AM, Yang <teddyyyy...@gmail.com> wrote:
> it seems that if I put too many records into the same mapper output
> key, all these records are grouped into one key on one reducer,
>
> then the reducer became out of memory.
>
>
> but the reducer interface is:
>
>     public void reduce(K key, Iterator<V> values,
>                        OutputCollector<K, V> output,
>                        Reporter reporter)
>
>
> so all the values belonging to the key can be iterated, so
> theoretically they can be iterated from disk, and does not have to be
> in memory at the same time,
> so why am I getting out of heap error? is there some param I could
> tune (apart from -Xmx since my box is ultimately bounded in memory
> capacity)
>
> thanks
> Yang

--
Harsh J
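
To illustrate the distinction made in the reply, here is a minimal sketch of the two styles of consuming a group's values. It is not taken from the job in question: the class and method names are hypothetical, and plain java.util types stand in for the Hadoop classes so that it runs on its own. The first method keeps only a running total, so its heap use should stay flat however many values share a key; the second copies every value into a list first, which is the kind of pattern that can exhaust the heap on a very large group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a reducer body. Plain java.util types replace
// the Hadoop classes (assumption made so the sketch is self-contained).
class ReducerMemorySketch {

    // Streaming style: each value is consumed and dropped, so only the
    // running total is retained regardless of the group size.
    static long sumStreaming(Iterator<Long> values) {
        long total = 0;
        while (values.hasNext()) {
            total += values.next();
        }
        return total;
    }

    // Buffering style: every value of the group is copied into a list
    // before any work is done, so heap use grows with the group size.
    static long sumBuffered(Iterator<Long> values) {
        List<Long> buffer = new ArrayList<>();
        while (values.hasNext()) {
            buffer.add(values.next());
        }
        long total = 0;
        for (long v : buffer) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> group = Arrays.asList(1L, 2L, 3L, 4L);
        System.out.println(sumStreaming(group.iterator())); // prints 10
        System.out.println(sumBuffered(group.iterator()));  // prints 10
    }
}
```

Both methods return the same result; they differ only in how much of the group they keep on the heap at once. If your reducer resembles the second one, that would be the first place to look.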