Hi,
I am running a Hadoop MapReduce job on top of MongoDB, against a 252 GB
database. While the job runs, the number of open connections climbs above
8000, and we have already given the task JVMs 9 GB of RAM (see the driver
sketch at the end of this message), yet the job still crashes with an
OutOfMemoryError when only 8% of the map phase is done.
Are these numbers normal, or did we miss something in the configuration?
I am attaching my code, just in case the problem is there.
Mapper:

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import com.mongodb.BasicDBObject;
import com.mongodb.hadoop.io.BSONWritable;

public class AveragePriceMapper extends Mapper<Object, BasicDBObject, Text, BSONWritable> {

    // Defined elsewhere in my class (omitted here): the separator-joined
    // list of property names that form the composite key.
    private String currentId;

    @Override
    public void map(final Object key, final BasicDBObject val, final Context context)
            throws IOException, InterruptedException {
        // Build the composite output key from the configured property names.
        final StringBuilder id = new StringBuilder();
        for (final String propertyId : currentId.split(AveragePriceGlobal.SEPARATOR)) {
            id.append(val.get(propertyId)).append(AveragePriceGlobal.SEPARATOR);
        }
        // Ship the whole document to the reducer.
        context.write(new Text(id.toString()), new BSONWritable(val));
    }
}
Reducer:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import com.mongodb.hadoop.io.BSONWritable;

public class AveragePriceReducer extends Reducer<Text, BSONWritable, Text, Text> {

    @Override
    public void reduce(final Text pKey, final Iterable<BSONWritable> pValues,
            final Context pContext) throws IOException, InterruptedException {
        // continueLoop and currentId come from my class (omitted here);
        // in this sketch the reduce key doubles as currentId.
        boolean continueLoop = true;
        final String currentId = pKey.toString();
        // Fetch the iterator once instead of calling pValues.iterator() each pass.
        final Iterator<BSONWritable> it = pValues.iterator();
        while (it.hasNext() && continueLoop) {
            final BSONWritable next = it.next();
            // Make some calculations
        }
        pContext.write(new Text(currentId), new Text(
                new MyClass(currentId, AveragePriceGlobal.COMMENT, 0, 0).toString()));
    }
}
The job configuration includes a query that filters the objects to analyze,
so the full 252 GB is not scanned.
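In case it helps, the driver looks roughly like this. It is a simplified
sketch rather than my exact code: the URI, the query, and the output path
are illustrative, and the mongo-hadoop property names should be checked
against the connector and Hadoop versions in use.

Driver (sketch):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.io.BSONWritable;

public class AveragePriceJob {
    public static void main(final String[] args) throws Exception {
        final Configuration conf = new Configuration();
        // 9 GB of heap for each child task JVM (classic MapReduce property).
        conf.set("mapred.child.java.opts", "-Xmx9g");
        // Input collection; host, database and collection are illustrative.
        conf.set("mongo.input.uri", "mongodb://host:27017/mydb.prices");
        // The filter query, so that not all of the 252 GB is scanned.
        conf.set("mongo.input.query", "{\"category\": \"flat\"}");
        // Split size in MB; smaller splits mean less data held per map task.
        conf.set("mongo.input.split_size", "8");

        final Job job = new Job(conf, "average-price");
        job.setJarByClass(AveragePriceJob.class);
        job.setInputFormatClass(MongoInputFormat.class);
        job.setMapperClass(AveragePriceMapper.class);
        job.setReducerClass(AveragePriceReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(BSONWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}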
Many thanks. Best regards,
Blanca