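You can set the amount of memory available to the reducer with the 
mapreduce.reduce.java.opts property. Set it in mapred-site.xml or override it 
in your job. For example, a value of -Xmx512M raises the maximum heap of the 
JVM spawned for the reducer task to 512 MB. On 0.20.x releases such as yours, 
the setting is, as far as I know, read from mapred.child.java.opts instead, 
which applies to both map and reduce task JVMs.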
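
As a rough sketch of the per-job override for a streaming job like the one quoted below, the snippet here assembles a `hadoop jar` command line that passes the heap setting through the generic `-D` option. The jar path, the input and output paths, and the mapper script name are placeholders and not values from this thread. The property name is a parameter because it depends on the Hadoop release; the default shown assumes a 0.20.x cluster.

```python
import shlex


def streaming_command(heap_opts="-Xmx512m",
                      opts_property="mapred.child.java.opts",  # assumed name for 0.20.x
                      streaming_jar="hadoop-streaming.jar",    # placeholder path
                      mapper="filter_mapper.py",               # hypothetical script name
                      input_path="/in",                        # placeholder
                      output_path="/out"):                     # placeholder
    """Build the argument list for a streaming job with a larger task heap."""
    return [
        "hadoop", "jar", streaming_jar,
        # Generic -D options have to come before the streaming-specific ones.
        "-D", f"{opts_property}={heap_opts}",
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", "org.apache.hadoop.mapred.lib.IdentityReducer",
        "-file", mapper,
    ]


if __name__ == "__main__":
    # Print a shell-safe version of the command instead of running it.
    print(" ".join(shlex.quote(arg) for arg in streaming_command()))
```

Printing the command first makes it easy to check the option order before submitting anything to the cluster.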

-----Original Message-----
From: Kelly Burkhart [mailto:[email protected]] 
Sent: Wednesday, February 16, 2011 9:12 AM
To: [email protected]
Subject: Re: Reduce java.lang.OutOfMemoryError

I have had it fail with a single reducer and with 100 reducers.
Ultimately it needs to be funneled to a single reducer though.

-K

On Wed, Feb 16, 2011 at 9:02 AM, real great..
<[email protected]> wrote:
> Hi,
> How many reducers are you using currently?
> Try increasing the number of reducers.
> Let me know if it helps.
>
> On Wed, Feb 16, 2011 at 8:30 PM, Kelly Burkhart 
> <[email protected]>wrote:
>
>> Hello, I'm seeing frequent failures in reduce jobs with errors similar 
>> to this:
>>
>>
>> 2011-02-15 15:21:10,163 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201102081823_0175_m_002153_0, compressed len: 172492, decompressed len: 172488
>> 2011-02-15 15:21:10,163 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201102081823_0175_r_000034_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>
>> 2011-02-15 15:21:10,163 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 172488 bytes (172492 raw bytes) into RAM from attempt_201102081823_0175_m_002153_0
>> 2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201102081823_0175_m_002118_0, compressed len: 161944, decompressed len: 161940
>> 2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201102081823_0175_m_001704_0, compressed len: 228365, decompressed len: 228361
>> 2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201102081823_0175_r_000034_0: Failed fetch #1 from attempt_201102081823_0175_m_002153_0
>> 2011-02-15 15:21:10,424 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201102081823_0175_r_000034_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>
>> Some also show this:
>>
>> Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
>>        at sun.net.www.http.ChunkedInputStream.<init>(ChunkedInputStream.java:63)
>>        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:811)
>>        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
>>        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1072)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1447)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1349)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>
>> The particular job I'm running is an attempt to merge multiple time 
>> series files into a single file.  The job tracker shows the following:
>>
>>
>> Kind    Num Tasks    Complete   Killed    Failed/Killed Task Attempts
>> map     15795        15795      0         0 / 29
>> reduce  100          30         70        17 / 29
>>
>> All of the files I'm reading have records with a timestamp key similar to:
>>
>> 2011-01-03 08:30:00.457000<tab><record>
>>
>> My map job is a simple python program that ignores rows with times <
>> 08:30:00 and > 15:00:00, determines the type of input row and writes 
>> it to stdout with very minor modification.  It maintains no state and 
>> should not use any significant memory.  My reducer is the 
>> IdentityReducer.  The input files are individually gzipped then put 
>> into hdfs.  The total uncompressed size of the output should be 
>> around 150G.  Our cluster is 32 nodes each of which has 16G RAM and 
>> most of which have two 2T drives.  We're running hadoop 0.20.2.
>>
>>
>> Can anyone provide some insight on how we can eliminate this issue?
>> I'm certain this email does not provide enough info, please let me 
>> know what further information is needed to troubleshoot.
>>
>> Thanks in advance,
>>
>> -Kelly
>>
>
>
>
> --
> Regards,
> R.V.
>
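
For reference, the time-window filter described in the original message could be sketched roughly as follows. This is not Kelly's actual mapper: it covers only the 08:30:00 to 15:00:00 filter on the tab-separated timestamp key, leaves out the row-type handling (which the thread does not specify), and assumes that malformed rows are simply dropped.

```python
import sys
from datetime import time

START = time(8, 30, 0)   # rows earlier than this are ignored
END = time(15, 0, 0)     # rows later than this are ignored


def keep(line):
    """Return True if the line's timestamp key lies within [START, END]."""
    key = line.rstrip("\n").split("\t", 1)[0]
    try:
        # The key looks like "2011-01-03 08:30:00.457000"; use the time part.
        clock = key.split(" ", 1)[1]
        stamp = time.fromisoformat(clock)
    except (IndexError, ValueError):
        return False  # assumption: rows without a parseable key are dropped
    return START <= stamp <= END


if __name__ == "__main__":
    # Stream stdin to stdout one line at a time, holding no state.
    for row in sys.stdin:
        if keep(row):
            sys.stdout.write(row)
```

Because the filter handles one line at a time, a mapper of this shape uses almost no memory, which is consistent with the failures appearing in the reduce-side shuffle instead.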

