Anyone?

Begin forwarded message:

> From: Olivier Varene - echo <var...@echo.fr>
> Subject: ReduceTask > ShuffleRamManager : Java Heap memory error
> Date: 4 December 2012 09:34:06 CET
> To: mapreduce-user@hadoop.apache.org
> Reply-To: mapreduce-user@hadoop.apache.org
> 
> 
> Hi all,
> first of all, many thanks for the quality of the work you are doing.
> 
> I am running into a memory-management problem at shuffle time; I regularly get:
> 
> Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>       at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1612)
> 
> 
> Reading the code in the org.apache.hadoop.mapred.ReduceTask.java file,
> it looks like the "ShuffleRamManager" caps the maximum RAM allocation at
> Integer.MAX_VALUE * maxInMemCopyUse:
> 
> maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
>            (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
>          * maxInMemCopyUse);
> 
> Why is it so?
> And why is it cast down to an int when the underlying type is long?
> 
> Does it mean that a reduce task cannot take advantage of more than
> 2 GB of memory?
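> 
> To check my understanding, here is a minimal standalone sketch (my own,
> not from the Hadoop source) of how that int cast behaves once the heap
> goes past Integer.MAX_VALUE bytes:
> 
>   // CastDemo.java -- standalone, no Hadoop dependency
>   public class CastDemo {
>     public static void main(String[] args) {
>       long maxMemory = 4L * 1024 * 1024 * 1024;      // pretend -Xmx4g
>       // Math.min() clamps the heap to Integer.MAX_VALUE (~2 GB) ...
>       long clamped = Math.min(maxMemory, Integer.MAX_VALUE);
>       // ... so after applying maxInMemCopyUse (0.70 here), the shuffle
>       // buffer can never exceed ~1.5 GB, whatever -Xmx says
>       int maxSize = (int) (clamped * 0.70f);
>       System.out.println(maxSize);                   // prints ~1.5e9
>     }
>   }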
> 
> To explain my use case a little: I am processing some 2700 maps (each
> working on a 128 MB block of data, so roughly 337 GB of map input in
> total), and when the reduce phase starts, it sometimes stumbles over
> Java heap memory issues.
> 
> The configuration is:
>   java 1.6.0-27
>   hadoop 0.20.2
>   -Xmx1400m
>   io.sort.mb 400
>   io.sort.factor 25
>   io.sort.spill.percent 0.80
>   mapred.job.shuffle.input.buffer.percent 0.70
>   ShuffleRamManager: MemoryLimit=913466944, MaxSingleShuffleLimit=228366736
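> 
> For what it is worth, those two logged values seem consistent with the
> code above (my own arithmetic, assuming the 0.25 single-shuffle segment
> fraction of 0.20.2): Runtime.maxMemory() reports a bit less than -Xmx
> (about 1 304 952 777 bytes here), 1304952777 * 0.70 ≈ 913466944 =
> MemoryLimit, and 913466944 * 0.25 = 228366736 = MaxSingleShuffleLimit.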
> 
> I will decrease mapred.job.shuffle.input.buffer.percent to limit the
> errors, but I am not fully confident about the scalability of the
> process.
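> 
> In case it is useful to someone, this is roughly how I plan to set it
> per job (a sketch against the old mapred API; the empty job class is
> made up, and the mapper/reducer/path setup is elided):
> 
>   import org.apache.hadoop.mapred.JobClient;
>   import org.apache.hadoop.mapred.JobConf;
> 
>   public class SubmitWithSmallerShuffleBuffer {
>     public static void main(String[] args) throws Exception {
>       JobConf conf = new JobConf(SubmitWithSmallerShuffleBuffer.class);
>       // drop the in-memory shuffle buffer from 0.70 to 0.50
>       conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.50f);
>       // ... mapper, reducer and input/output paths go here ...
>       JobClient.runJob(conf);
>     }
>   }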
> 
> Any help would be welcome.
> 
> once again, many thanks
> Olivier
> 
> 
> P.S.: sorry if I misunderstood the code; any explanation would be
> really welcome.
> 
