[
https://issues.apache.org/jira/browse/MAPREDUCE-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060919#comment-14060919
]
Rohini Palaniswamy commented on MAPREDUCE-5785:
-----------------------------------------------
I was taking a look at https://issues.apache.org/jira/browse/MAPREDUCE-5785 and
https://issues.apache.org/jira/browse/TEZ-699 .
MR:
{code}
public static final float DEFAULT_MEMORY_MB_HEAP_RATIO = 1.33f;

float heapRatio = conf.getFloat(MRJobConfig.MEMORY_MB_HEAP_RATIO,
    MRJobConfig.DEFAULT_MEMORY_MB_HEAP_RATIO);
int taskHeapSize = (int) Math.ceil(taskContainerMb / heapRatio);

public static final float DEFAULT_IO_SORT_MB_HEAP_RATIO = 0.5f;

ioSortMbPer = JobContext.DEFAULT_IO_SORT_MB_HEAP_RATIO;
sortmb = (int) (maxHeapMb * ioSortMbPer);
{code}
Tez:
{code}
public static final String TEZ_CONTAINER_MAX_JAVA_HEAP_FRACTION =
    TEZ_PREFIX + "container.max.java.heap.fraction";
public static final double TEZ_CONTAINER_MAX_JAVA_HEAP_FRACTION_DEFAULT = 0.8;

int maxMemory = (int) (resource.getMemory() * maxHeapFactor);
{code}
A few issues and inconsistencies that I see:
- The MR one is really confusing: for the heap it is a division, while for
io.sort.mb it is a multiplication. I think it would be easier to keep both the
same to avoid confusion. I had to apply more of my grey cells to do the
division, so I would prefer multiplication to determine the percentage, as it
is easier to compute in the head than division (see the sketch after this
list).
- io.sort.mb at 50% of the heap seems too high for a default value. Most Pig
jobs that have huge bags would start failing.
- Another issue is what the defaults give right now: for a 4G container,
Tez Xmx = 3.2G and MR Xmx = 3.0G; for an 8G container, Tez Xmx = 6.4G and
MR Xmx = 6G. Though the defaults work well for 1 or 2G of memory, at higher
memory sizes they actually waste a lot of memory, considering that usually no
more than 500M is needed for native memory even as Xmx keeps increasing. We
should account for that factor in the calculation instead of computing Xmx as
just a direct percentage of resource.mb.
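To make both points concrete, here is a minimal sketch of what I mean by a
single multiplication-based fraction combined with a native-memory headroom.
This is purely my illustration, not code from either patch, and the names and
numbers are made up:
{code}
// Illustration only (not code from either patch; names are hypothetical).
// One multiply-by fraction, plus a fixed native-memory headroom so that
// large containers are not left with excessive unused memory.
static final float HEAP_FRACTION_DEFAULT = 0.75f; // multiply-by, roughly 1/1.33
static final int NATIVE_HEADROOM_MB = 500;        // native memory rarely needs more

int deriveXmxMb(int containerMb, float heapFraction) {
  int byFraction = (int) (containerMb * heapFraction); // straight percentage
  int byHeadroom = containerMb - NATIVE_HEADROOM_MB;   // leave only ~500M for native
  return Math.max(byFraction, byHeadroom);
}
// 1024 MB container: max(768, 524)   -> Xmx  768 MB
// 4096 MB container: max(3072, 3596) -> Xmx 3596 MB
// 8192 MB container: max(6144, 7692) -> Xmx 7692 MB
{code}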
Tez settings are usually equivalents of MR settings, with an internal mapping
that picks up the MR setting if it is specified, so that it is easier for
users to switch between frameworks. This is one place where the two are
inconsistent in how the value is specified, and it would be good to reconcile
them so both have the same behavior.
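For the mapping, something along these lines would do (again only a sketch,
not actual Tez code; the fallback from the Tez fraction to the MR ratio is
hypothetical):
{code}
// Hypothetical fallback sketch -- not actual Tez code.
// If only the MR ratio is set, convert it to a fraction and use it for Tez too.
float mrHeapRatio = conf.getFloat(MRJobConfig.MEMORY_MB_HEAP_RATIO,
    MRJobConfig.DEFAULT_MEMORY_MB_HEAP_RATIO);             // 1.33f, divide-by style
double heapFraction = conf.getDouble(
    TezConfiguration.TEZ_CONTAINER_MAX_JAVA_HEAP_FRACTION, // tez.container.max.java.heap.fraction
    1.0 / mrHeapRatio);                                     // ~0.75, multiply-by style
int xmxMb = (int) (resource.getMemory() * heapFraction);
{code}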
> Derive task attempt JVM max heap size and io.sort.mb automatically from
> mapreduce.*.memory.mb
> ---------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5785
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5785
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: mr-am, task
> Reporter: Gera Shegalov
> Assignee: Gera Shegalov
> Attachments: MAPREDUCE-5785.v01.patch, MAPREDUCE-5785.v02.patch,
> MAPREDUCE-5785.v03.patch
>
>
> Currently users have to set 2 memory-related configs per Job / per task type.
> One first chooses some container size mapreduce.\*.memory.mb and then a
> corresponding maximum Java heap size Xmx < mapreduce.\*.memory.mb. This
> makes sure that the JVM's C-heap (native memory + Java heap) does not exceed
> this mapreduce.*.memory.mb. If one forgets to tune Xmx, MR-AM might be
> - allocating big containers whereas the JVM will only use the default
> -Xmx200m.
> - allocating small containers that will OOM because Xmx is too high.
> With this JIRA, we propose to set Xmx automatically based on an empirical
> ratio that can be adjusted. Xmx is not changed automatically if provided by
> the user.