anyone?

Begin forwarded message:
> From: Olivier Varene - echo <var...@echo.fr>
> Subject: ReduceTask > ShuffleRamManager : Java Heap memory error
> Date: December 4, 2012 09:34:06 CET
> To: mapreduce-user@hadoop.apache.org
> Reply-To: mapreduce-user@hadoop.apache.org
>
> Hi to all,
> first, many thanks for the quality of the work you are doing: thanks a lot.
>
> I am facing a bug with memory management at shuffle time; I regularly get
>
>     Map output copy failure : java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1612)
>
> Reading the code in the org.apache.hadoop.mapred.ReduceTask.java file,
> the "ShuffleRamManager" limits the maximum RAM allocation to
> Integer.MAX_VALUE * maxInMemCopyUse:
>
>     maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
>         (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
>         * maxInMemCopyUse);
>
> Why is it so?
> And why is it cast down to an int when the underlying type is long?
>
> Does it mean that a Reduce Task cannot take advantage of more than
> 2 GB of memory?
>
> To explain my use case a little: I am processing some 2700 maps (each
> working on a 128 MB block of data), and when the reduce phase starts,
> it sometimes stumbles over Java heap memory issues.
>
> The configuration is:
>     java 1.6.0-27
>     hadoop 0.20.2
>     -Xmx1400m
>     io.sort.mb 400
>     io.sort.factor 25
>     io.sort.spill.percent 0.80
>     mapred.job.shuffle.input.buffer.percent 0.70
>     ShuffleRamManager: MemoryLimit=913466944, MaxSingleShuffleLimit=228366736
>
> I will decrease mapred.job.shuffle.input.buffer.percent to limit the
> errors, but I am not fully confident about the scalability of the
> process.
>
> Any help would be welcomed.
>
> Once again, many thanks,
> Olivier
>
> P.S.: sorry if I misunderstood the code; any explanation would be
> really welcomed.
>
> --
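
To make the quoted sizing logic concrete, here is a minimal standalone
sketch of the same clamp-then-scale arithmetic. It is plain Java, not the
Hadoop source itself; the 1400 MB heap and the 0.70 fraction are taken
from the configuration above:

    public class ShuffleLimitSketch {
        public static void main(String[] args) {
            long maxMemory = 1400L * 1024 * 1024;   // stand-in for Runtime.getRuntime().maxMemory()
            float maxInMemCopyUse = 0.70f;          // mapred.job.shuffle.input.buffer.percent

            // Clamp the heap size into int range, then take the fraction;
            // the clamp is what keeps the result below 2 GB.
            int maxSize = (int) (Math.min(maxMemory, Integer.MAX_VALUE)
                                 * maxInMemCopyUse);

            System.out.println("MemoryLimit = " + maxSize);  // ~1027604480 here
            // Even with a 64 GB heap, the clamp caps this at
            // (int) (Integer.MAX_VALUE * 0.70f), about 1.5e9 bytes.
        }
    }

Because Math.min(..., Integer.MAX_VALUE) bounds the base at 2^31 - 1 bytes
before the fraction is applied, the in-memory shuffle buffer indeed cannot
grow past roughly 2 GB regardless of -Xmx. The MemoryLimit=913466944
reported above is consistent with this: Runtime.maxMemory() typically
returns somewhat less than the raw -Xmx figure, and 0.70 of that reported
heap matches the logged value.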
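Since the message already proposes lowering
mapred.job.shuffle.input.buffer.percent, a hypothetical sketch of setting
it programmatically on a 0.20-era JobConf follows; the 0.60 value is an
illustrative guess, not a figure from the thread:

    import org.apache.hadoop.mapred.JobConf;

    public class TuneShuffleBuffer {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Reserve a smaller fraction of the reducer heap for
            // in-memory map outputs (0.60 is illustrative only).
            conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.60f);
        }
    }

The same property can also be set in mapred-site.xml or per job; the
trade-off is more shuffle spills to disk in exchange for headroom against
the OutOfMemoryError quoted above.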