I don't know exactly what your program is doing, but you should estimate the memory consumption of your reducer and then pick a proper -Xmx size for it (via mapred.child.java.opts). It looks like your reducer needs a lot of memory to cache the TrainingWeights.
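If all the reducer really needs is the average, it can also stream the values into one running sum instead of holding every TrainingWeights. Here is a minimal sketch, assuming TrainingWeights wraps a double[] behind a hypothetical getWeights() accessor with a matching constructor, and that the map output key is NullWritable (none of that is from your mail, so adjust to your actual classes):

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.Reducer;

    // Streams the values: only the running sum plus the value currently
    // being deserialized are live, instead of all ~50 matrices at once.
    public static class AveragingReducer
            extends Reducer<NullWritable, TrainingWeights, NullWritable, TrainingWeights> {

        @Override
        protected void reduce(NullWritable key, Iterable<TrainingWeights> values,
                Context context) throws IOException, InterruptedException {
            double[] sum = null;
            int total = 0;
            for (TrainingWeights weights : values) {
                double[] w = weights.getWeights(); // hypothetical accessor
                if (sum == null) {
                    // Copy rather than alias: Hadoop reuses the value instance
                    // across iterations, so keeping a reference to it is unsafe.
                    sum = w.clone();
                } else {
                    for (int i = 0; i < w.length; i++) {
                        sum[i] += w[i];
                    }
                }
                total++;
            }
            if (sum != null) {
                for (int i = 0; i < sum.length; i++) {
                    sum[i] /= total;
                }
                // hypothetical constructor taking the averaged array
                context.write(NullWritable.get(), new TrainingWeights(sum));
            }
        }
    }

With 10,000,000 doubles per matrix, the running sum itself is only about 80MB in memory, so this shape should fit comfortably in a 2g heap.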
On Thu, Apr 3, 2014 at 5:53 PM, Li Li <[email protected]> wrote:

> You can think of each TrainingWeights as a very large double[] whose
> length is about 10,000,000.
>
>     TrainingWeights result = null;
>     int total = 0;
>     for (TrainingWeights weights : values) {
>         if (result == null) {
>             result = weights;
>         } else {
>             addWeights(result, weights);
>         }
>         total++;
>     }
>     if (total > 1) {
>         divideWeights(result, total);
>     }
>     context.write(NullWritable.get(), result);
>
>
> On Thu, Apr 3, 2014 at 5:49 PM, Gordon Wang <[email protected]> wrote:
>
>> What is the work in the reducer?
>> Do you have any memory-intensive work in the reducer (e.g. caching a lot
>> of data in memory)? I guess the OOM error comes from your code in the
>> reducer.
>>
>>
>> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <[email protected]> wrote:
>>
>>> *mapred.child.java.opts=-Xmx2g*
>>>
>>>
>>> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <[email protected]> wrote:
>>>
>>>> 2g
>>>>
>>>>
>>>> On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <[email protected]> wrote:
>>>>
>>>>> This doesn't seem to be related to the data size.
>>>>>
>>>>> How much memory do you use for the reducer?
>>>>>
>>>>> Regards,
>>>>> *Stanley Shi,*
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <[email protected]> wrote:
>>>>>
>>>>>> I have a MapReduce program that does some matrix operations. In the
>>>>>> reducer, it averages many large matrices (each matrix takes up
>>>>>> 400+MB, according to Map output bytes). So if 50 matrices go to one
>>>>>> reducer, the total memory usage is 20GB, and the reduce task gets
>>>>>> this exception:
>>>>>>
>>>>>> FATAL org.apache.hadoop.mapred.Child: Error running child :
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
>>>>>> at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
>>>>>> at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
>>>>>> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
>>>>>> at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>>>>>> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
>>>>>> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
>>>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
>>>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>>
>>>>>> One method I can come up with is to use a Combiner to save partial
>>>>>> sums of some matrices and their counts, but even that can't fully
>>>>>> solve the problem, because the combiner is not fully controlled by me.
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Regards
>> Gordon Wang
>>
>
>

--
Regards
Gordon Wang
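P.S. On the combiner idea from the quoted mail: it can still work even though the framework decides when (and whether) the combiner runs, as long as the map output value already carries a (sum, count) pair, because merging such pairs is associative. A rough sketch, assuming a hypothetical WeightsSum writable holding the running double[] plus the number of matrices folded into it:

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.Reducer;

    // Safe to register as the combiner: the framework may run it zero, one,
    // or several times and the final result stays correct either way.
    public static class WeightsSumCombiner
            extends Reducer<NullWritable, WeightsSum, NullWritable, WeightsSum> {

        @Override
        protected void reduce(NullWritable key, Iterable<WeightsSum> values,
                Context context) throws IOException, InterruptedException {
            double[] sum = null;
            long count = 0;
            for (WeightsSum v : values) {
                double[] w = v.getSum();   // hypothetical accessor
                if (sum == null) {
                    sum = w.clone();       // copy: the value instance is reused
                } else {
                    for (int i = 0; i < w.length; i++) {
                        sum[i] += w[i];
                    }
                }
                count += v.getCount();     // hypothetical accessor
            }
            context.write(NullWritable.get(), new WeightsSum(sum, count));
        }
    }

The real reducer then merges the surviving WeightsSum values the same way and divides the sum by the total count only at the very end, so the average comes out right no matter how many partial merges happened on the map side.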
