You can think of each TrainingWeights as a very large double[] whose
length is about 10,000,000. The reducer code is:
TrainingWeights result = null;
int total = 0;
for (TrainingWeights weights : values) {
    if (result == null) {
        // NOTE: Hadoop reuses the value object across iterations of the
        // values iterable, so in practice this should be a deep copy of
        // weights rather than a reference to it.
        result = weights;
    } else {
        addWeights(result, weights); // element-wise sum into result
    }
    total++;
}
if (total > 1) {
    divideWeights(result, total); // element-wise divide to get the average
}
context.write(NullWritable.get(), result);

On Thu, Apr 3, 2014 at 5:49 PM, Gordon Wang <[email protected]> wrote:

> What is the work in the reducer?
> Do you have any memory-intensive work in the reducer (e.g. caching a lot
> of data in memory)? I guess the OOM error comes from your code in the
> reducer.
>
> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <[email protected]> wrote:
>
>> *mapred.child.java.opts=-Xmx2g*
>>
>> On Thu, Apr 3, 2014 at 5:10 PM, Li Li <[email protected]> wrote:
>>
>>> 2g
>>>
>>> On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <[email protected]> wrote:
>>>
>>>> This doesn't seem related to the data size.
>>>>
>>>> How much memory do you use for the reducer?
>>>>
>>>> Regards,
>>>> *Stanley Shi*
>>>>
>>>> On Thu, Apr 3, 2014 at 8:04 AM, Li Li <[email protected]> wrote:
>>>>
>>>>> I have a map-reduce program that does some matrix operations. In the
>>>>> reducer it averages many large matrices; each matrix takes up 400+ MB
>>>>> (according to "Map output bytes"), so if 50 matrices go to one
>>>>> reducer, the total memory usage is 20 GB. So the reduce task got this
>>>>> exception:
>>>>>
>>>>> FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
>>>>>     at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
>>>>>     at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
>>>>>     at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
>>>>>     at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
>>>>>     at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
>>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
>>>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
>>>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>
>>>>> One method I can come up with is to use a Combiner to emit partial
>>>>> sums of some matrices together with their counts, but that still may
>>>>> not solve the problem, because the combiner is not fully controlled
>>>>> by me.
>
> --
> Regards
> Gordon Wang

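For what it's worth, the addWeights/divideWeights helpers used above, together with the sum-plus-count idea behind the Combiner suggestion, could be sketched roughly like this. This is a sketch only: it uses plain double[] stand-ins instead of the actual TrainingWeights Writable, and the WeightsMath and PartialSum names are hypothetical, not from the original code.

```java
import java.util.Arrays;

// Hypothetical sketch of the element-wise helpers and a combiner-style
// partial aggregate. Not the poster's actual implementation.
public class WeightsMath {

    // result[i] += other[i] for every element (assumes equal lengths)
    static void addWeights(double[] result, double[] other) {
        for (int i = 0; i < result.length; i++) {
            result[i] += other[i];
        }
    }

    // result[i] /= n for every element
    static void divideWeights(double[] result, int n) {
        for (int i = 0; i < result.length; i++) {
            result[i] /= n;
        }
    }

    // Combiner-style partial aggregate: carry (sum, count) instead of an
    // average, so partial results from different combiner runs can still
    // be merged correctly in the reducer.
    static class PartialSum {
        final double[] sum;
        int count;

        PartialSum(int length) {
            sum = new double[length];
        }

        void add(double[] weights) {
            addWeights(sum, weights);
            count++;
        }

        double[] average() {
            double[] avg = sum.clone();
            divideWeights(avg, count);
            return avg;
        }
    }

    public static void main(String[] args) {
        PartialSum p = new PartialSum(3);
        p.add(new double[] {1, 2, 3});
        p.add(new double[] {3, 4, 5});
        System.out.println(Arrays.toString(p.average())); // [2.0, 3.0, 4.0]
    }
}
```

Because the partial output is (sum, count) rather than an average, it does not matter how many times (including zero) the framework chooses to run the combiner, which is exactly the "combiner is not fully controlled by me" concern: the reducer just sums the sums, sums the counts, and divides once at the end.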