While I agree we should bring the two numbers closer together, we should ideally raise the JVM heap value rather than lower the memory.mb resource request of MR tasks. Otherwise, with YARN, users will start seeing more containers per node than before.
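For concreteness, a minimal sketch of that direction as a mapred-site.xml override; the values here are illustrative, not proposed defaults:

    <!-- mapred-site.xml: illustrative values only -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <!-- keep the YARN container request as-is, so container counts
           per node do not change -->
      <value>1024</value>
    </property>
    <property>
      <name>mapred.child.java.opts</name>
      <!-- raise the task heap toward the container size, leaving some
           headroom for non-JVM memory -->
      <value>-Xmx900m</value>
    </property>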
Also, it would be good to raise the heap plus the sort buffer memory of mappers now, since the HDFS default block size has also doubled to 128m. I think we have a JIRA already open for this.

Hi,

While looking into MAPREDUCE-5207 (adding defaults for mapreduce.{map|reduce}.memory.mb), I was wondering how much headroom should be left on top of mapred.child.java.opts (or other similar JVM opts) for the container memory itself.

Currently, mapred.child.java.opts (per mapred-default.xml) defaults to a 200 MB heap (-Xmx200m). The default for mapreduce.{map|reduce}.memory.mb in the code is 1024, which is significantly higher than that 200 MB value. Do we need more than 100 MB of non-JVM memory per container? If so, does it make sense to make that headroom a config property in its own right, and have the code verify that all three values are consistent?

Thanks,
Karthik
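To make the headroom question concrete, here is one possible shape for such a property; the name mapreduce.container.memory.headroom.mb is hypothetical, not an existing Hadoop property:

    <!-- hypothetical property, for illustration only -->
    <property>
      <name>mapreduce.container.memory.headroom.mb</name>
      <!-- non-JVM memory (native allocations, thread stacks, code cache)
           to leave on top of the configured heap -->
      <value>100</value>
    </property>
    <!-- with this, the framework could sanity-check at job submission
         time that memory.mb >= Xmx + headroom, e.g. 1024 >= 200 + 100 -->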