[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220641#comment-14220641
 ] 

Sandy Ryza commented on MAPREDUCE-5785:
---------------------------------------

Hi Karthik.  Took a look at the patch.  Had some comments - mostly stylistic.

{code}
-    return adminClasspath + " " + userClasspath;
+      return jobConf.getTaskJavaOpts(isMapTask ? TaskType.MAP : 
TaskType.REDUCE);
{code}
Wrong indentation?

{code}
+  public String getTaskJavaOpts(TaskType taskType) {
+
+    String javaOpts = getConfiguredTaskJavaOpts(taskType);
{code}
Unnecessary blank line.

{code}
+      LOG.info("Task java.opts does not specify max heap size, setting using"
+          + " mapreduce.*.memory.mb * "
+          + MRJobConfig.HEAP_MEMORY_MB_RATIO);
{code}
Can we condense this and the log further down into a single message?

{code}
+        if (LOG.isWarnEnabled()) {
{code}
Why use isWarnEnabled when we don't use isInfoEnabled?

{code}
+      final int taskContainerMb = getMemoryRequired(taskType);
{code}
Any reason this should be final? Convention is usually not to declare local 
variables final unless they need to be (like referenced by an anonymous class).

{code}
+      int taskHeapSize =(int)Math.ceil(taskContainerMb * heapRatio);
{code}
Should have a space after the "=".

{code}
+  public static int parseMaximumHeapSizeMB(String javaOpts) {
{code}
Can this (and others) be marked Private?

{code}
+    int memory = 1024;
{code}
It looks like this value will be overwritten in all cases.

{code}
+          (heapSize = parseMaximumHeapSizeMB(
+              getConfiguredTaskJavaOpts(taskType))) > 0) {
{code}
This is a little weird.  Can we assign heapSize outside of the condition?

{code}
+        memory =
+            getInt(MRJobConfig.REDUCE_MEMORY_MB,
+                MRJobConfig.DEFAULT_REDUCE_MEMORY_MB);
{code}
This can be on 2 lines.

{code}
+  If -Xmx is not set, it is inferred from mapreduc.{map|reduce}.memory.mb and
{code}
Missing an "e" at the end of mapreduc.

{code}
+  <description>The ratio container between heap-size and container-size
+    If no -Xmx specified, it's calculated from the container memory
+    requirement: mapreduce.*.memory.mb * mapreduce.heap.memory-mb.ratio.
+    If -Xmx is specified but not mapreduce.*.memory.mb, it's calculated as
+    heapSize / mapreduce.heap.memory-mb.ratio.
{code}
Need a period after container size.  "*" meaning both multiplication and either 
map or reduce is a little confusing here. It might be better to spell out 
{map|reduce} inside the config properties, which would also be consistent with 
how they're references above.  Also, other descriptions tend to use "it is" 
instead of "it's".

{code}
   <description>The amount of memory to request from the scheduler for each
-  reduce task.
+    reduce task. If this is not specified, it is inferred from
{code}
Indentation here is inconsistent with other places.

Any reason to have getTaskJavaOpts in JobConf instead of MapReduceChildJVM?


> Derive task attempt JVM max heap size and io.sort.mb automatically from 
> mapreduce.*.memory.mb
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5785
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mr-am, task
>            Reporter: Gera Shegalov
>            Assignee: Gera Shegalov
>         Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch, 
> MAPREDUCE-5785.v03.patch, mr-5785-4.patch
>
>
> Currently users have to set 2 memory-related configs per Job / per task type. 
>  One first chooses some container size map reduce.\*.memory.mb and then a 
> corresponding maximum Java heap size Xmx < map reduce.\*.memory.mb. This 
> makes sure that the JVM's C-heap (native memory + Java heap) does not exceed 
> this mapreduce.*.memory.mb. If one forgets to tune Xmx, MR-AM might be 
> - allocating big containers whereas the JVM will only use the default 
> -Xmx200m.
> - allocating small containers that will OOM because Xmx is too high.
> With this JIRA, we propose to set Xmx automatically based on an empirical 
> ratio that can be adjusted. Xmx is not changed automatically if provided by 
> the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to