Hi,
I am running a Hadoop MapReduce job on top of MongoDB, against a 252 GB
database. While the job runs, the number of open connections climbs above
8000, and we have already given the task JVMs 9 GB of RAM (see the driver
sketch at the end of this message), yet the job still crashes with an
OutOfMemoryError when only 8% of the map phase is done.
Are these numbers normal, or did we miss something in the configuration?
I am attaching my code, just in case the problem is there.
Mapper:

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import com.mongodb.BasicDBObject;
import com.mongodb.hadoop.io.BSONWritable;

public class AveragePriceMapper extends Mapper<Object, BasicDBObject, Text, BSONWritable> {

    // Defined elsewhere in my class (omitted here): the separator-joined
    // list of property names that form the composite key.
    private String currentId;

    @Override
    public void map(final Object key, final BasicDBObject val, final Context context)
            throws IOException, InterruptedException {
        // Build the composite output key from the configured property names.
        final StringBuilder id = new StringBuilder();
        for (final String propertyId : currentId.split(AveragePriceGlobal.SEPARATOR)) {
            id.append(val.get(propertyId)).append(AveragePriceGlobal.SEPARATOR);
        }
        // Ship the whole document to the reducer.
        context.write(new Text(id.toString()), new BSONWritable(val));
    }
}
Reducer:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import com.mongodb.hadoop.io.BSONWritable;

public class AveragePriceReducer extends Reducer<Text, BSONWritable, Text, Text> {

    @Override
    public void reduce(final Text pKey, final Iterable<BSONWritable> pValues,
            final Context pContext) throws IOException, InterruptedException {
        // continueLoop and currentId come from my class (omitted here);
        // in this sketch the reduce key doubles as currentId.
        boolean continueLoop = true;
        final String currentId = pKey.toString();
        // Fetch the iterator once instead of calling pValues.iterator() each pass.
        final Iterator<BSONWritable> it = pValues.iterator();
        while (it.hasNext() && continueLoop) {
            final BSONWritable next = it.next();
            // Make some calculations
        }
        pContext.write(new Text(currentId), new Text(
                new MyClass(currentId, AveragePriceGlobal.COMMENT, 0, 0).toString()));
    }
}
The job configuration includes a query that filters the objects to analyze,
so the full 252 GB is not scanned.
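In case it helps, the driver looks roughly like this. It is a simplified
sketch rather than my exact code: the URI, the query, and the output path
are illustrative, and the mongo-hadoop property names should be checked
against the connector and Hadoop versions in use.

Driver (sketch):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.io.BSONWritable;

public class AveragePriceJob {
    public static void main(final String[] args) throws Exception {
        final Configuration conf = new Configuration();
        // 9 GB of heap for each child task JVM (classic MapReduce property).
        conf.set("mapred.child.java.opts", "-Xmx9g");
        // Input collection; host, database and collection are illustrative.
        conf.set("mongo.input.uri", "mongodb://host:27017/mydb.prices");
        // The filter query, so that not all of the 252 GB is scanned.
        conf.set("mongo.input.query", "{\"category\": \"flat\"}");
        // Split size in MB; smaller splits mean less data held per map task.
        conf.set("mongo.input.split_size", "8");

        final Job job = new Job(conf, "average-price");
        job.setJarByClass(AveragePriceJob.class);
        job.setInputFormatClass(MongoInputFormat.class);
        job.setMapperClass(AveragePriceMapper.class);
        job.setReducerClass(AveragePriceReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(BSONWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}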
Many thanks. Best regards,
Blanca