[ https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220641#comment-14220641 ]
Sandy Ryza commented on MAPREDUCE-5785: --------------------------------------- Hi Karthik. Took a look at the patch. Had some comments - mostly stylistic. {code} - return adminClasspath + " " + userClasspath; + return jobConf.getTaskJavaOpts(isMapTask ? TaskType.MAP : TaskType.REDUCE); {code} Wrong indentation? {code} + public String getTaskJavaOpts(TaskType taskType) { + + String javaOpts = getConfiguredTaskJavaOpts(taskType); {code} Unnecessary blank line. {code} + LOG.info("Task java.opts does not specify max heap size, setting using" + + " mapreduce.*.memory.mb * " + + MRJobConfig.HEAP_MEMORY_MB_RATIO); {code} Can we condense this and the log further down into a single message? {code} + if (LOG.isWarnEnabled()) { {code} Why use isWarnEnabled when we don't use isInfoEnabled? {code} + final int taskContainerMb = getMemoryRequired(taskType); {code} Any reason this should be final? Convention is usually not to declare local variables final unless they need to be (like referenced by an anonymous class). {code} + int taskHeapSize =(int)Math.ceil(taskContainerMb * heapRatio); {code} Should have a space after the "=". {code} + public static int parseMaximumHeapSizeMB(String javaOpts) { {code} Can this (and others) be marked Private? {code} + int memory = 1024; {code} It looks like this value will be overwritten in all cases. {code} + (heapSize = parseMaximumHeapSizeMB( + getConfiguredTaskJavaOpts(taskType))) > 0) { {code} This is a little weird. Can we assign heapSize outside of the condition? {code} + memory = + getInt(MRJobConfig.REDUCE_MEMORY_MB, + MRJobConfig.DEFAULT_REDUCE_MEMORY_MB); {code} This can be on 2 lines. {code} + If -Xmx is not set, it is inferred from mapreduc.{map|reduce}.memory.mb and {code} Missing an "e" at the end of mapreduc. {code} + <description>The ratio container between heap-size and container-size + If no -Xmx specified, it's calculated from the container memory + requirement: mapreduce.*.memory.mb * mapreduce.heap.memory-mb.ratio. + If -Xmx is specified but not mapreduce.*.memory.mb, it's calculated as + heapSize / mapreduce.heap.memory-mb.ratio. {code} Need a period after container size. "*" meaning both multiplication and either map or reduce is a little confusing here. It might be better to spell out {map|reduce} inside the config properties, which would also be consistent with how they're references above. Also, other descriptions tend to use "it is" instead of "it's". {code} <description>The amount of memory to request from the scheduler for each - reduce task. + reduce task. If this is not specified, it is inferred from {code} Indentation here is inconsistent with other places. Any reason to have getTaskJavaOpts in JobConf instead of MapReduceChildJVM? > Derive task attempt JVM max heap size and io.sort.mb automatically from > mapreduce.*.memory.mb > --------------------------------------------------------------------------------------------- > > Key: MAPREDUCE-5785 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mr-am, task > Reporter: Gera Shegalov > Assignee: Gera Shegalov > Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch, > MAPREDUCE-5785.v03.patch, mr-5785-4.patch > > > Currently users have to set 2 memory-related configs per Job / per task type. > One first chooses some container size map reduce.\*.memory.mb and then a > corresponding maximum Java heap size Xmx < map reduce.\*.memory.mb. This > makes sure that the JVM's C-heap (native memory + Java heap) does not exceed > this mapreduce.*.memory.mb. If one forgets to tune Xmx, MR-AM might be > - allocating big containers whereas the JVM will only use the default > -Xmx200m. > - allocating small containers that will OOM because Xmx is too high. > With this JIRA, we propose to set Xmx automatically based on an empirical > ratio that can be adjusted. Xmx is not changed automatically if provided by > the user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)