[
https://issues.apache.org/jira/browse/MAPREDUCE-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Allen Wittenauer resolved MAPREDUCE-2872.
-----------------------------------------
Resolution: Won't Fix
Closing this as won't fix, given development on hadoop-1 has effectively
stopped.
> Optimize TaskTracker memory usage
> ---------------------------------
>
> Key: MAPREDUCE-2872
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2872
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tasktracker
> Affects Versions: 0.20.203.0
> Reporter: Binglin Chang
> Assignee: Binglin Chang
> Labels: memory, optimization
>
> We observe high memory usage by framework-level components on slave nodes,
> mainly TaskTracker & Child, especially on large clusters. To be clear up
> front: large jobs with 10000-100000 map and >10000 reduce tasks are very
> common in our offline cluster, and will very likely continue to grow. This is
> reasonable because the number of map & reduce slots is in the same range,
> and it's impractical for users to reduce their jobs' task counts without an
> execution time penalty.
> High memory consumption will:
> * Limit the memory available to upper-level applications;
> * Reduce page cache space, which plays an important role in spill, merge,
> shuffle and even HDFS performance;
> * Increase the probability of slave node OOM, which may affect the storage
> layer (HDFS) too.
> A stable TT with predictable memory behavior is desired; this also applies to
> the Child JVM.
> This issue focuses on TaskTracker memory optimization. On our cluster,
> TaskTracker uses 600M+ memory & 300%+ (3+ cores) CPU at peak, and 300M+ memory
> & much less CPU on average, so we need to set -Xmx to 1000M for TT to prevent
> OOM; the TT memory is then in the 200M-1200M range, and 800M on average.
> Here are some ideas:
> Jetty HTTP connections use a lot of memory when there are many requests in
> the queue. We need to limit the length of the queue, combine multiple requests
> into one request, or use Netty just like MR2.
> TaskCompletionEvents use a lot of memory too if a job has a large number of
> map tasks. This won't be a problem in MR2, but it can be optimized. A typical
> TaskCompletionEvent object uses 296 bytes of memory, so a job with 100000 maps
> will use about 30M of memory; problems appear if there are some big
> RunningJobs in a TaskTracker. There are more memory-efficient implementations
> for TaskCompletionEvent.
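As a rough check on the figures above, 100000 events at 296 bytes each come to 29.6 MB per job. The sketch below shows one possible memory-efficient layout, under the assumption that only a task index, a run time, a tracker HTTP address and a status byte need to be retained per event. `CompactEventStore` and all of its members are hypothetical and are not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not a Hadoop class: keeps completion events in parallel
// primitive arrays instead of one heap object per event.
final class CompactEventStore {
    // Per-object size quoted in the issue description.
    static final long BYTES_PER_EVENT_OBJECT = 296L;

    // Heap estimate for the object-per-event layout.
    static long estimateObjectBytes(long events) {
        return events * BYTES_PER_EVENT_OBJECT;
    }

    private int[] taskIndex = new int[16];
    private int[] runTimeMs = new int[16];
    private int[] trackerId = new int[16];
    private byte[] status = new byte[16];
    private int size = 0;

    // Each distinct tracker address is stored once and referenced by index.
    private final List<String> trackers = new ArrayList<String>();
    private final Map<String, Integer> trackerIds = new HashMap<String, Integer>();

    void add(int task, int runMs, String trackerHttp, byte st) {
        if (size == taskIndex.length) {
            grow();
        }
        Integer id = trackerIds.get(trackerHttp);
        if (id == null) {
            id = trackers.size();
            trackers.add(trackerHttp);
            trackerIds.put(trackerHttp, id);
        }
        taskIndex[size] = task;
        runTimeMs[size] = runMs;
        trackerId[size] = id;
        status[size] = st;
        size++;
    }

    // Doubles the capacity of all four arrays together.
    private void grow() {
        int n = taskIndex.length * 2;
        taskIndex = Arrays.copyOf(taskIndex, n);
        runTimeMs = Arrays.copyOf(runTimeMs, n);
        trackerId = Arrays.copyOf(trackerId, n);
        status = Arrays.copyOf(status, n);
    }

    int size() { return size; }
    int taskOf(int i) { return taskIndex[i]; }
    String trackerOf(int i) { return trackers.get(trackerId[i]); }
    int distinctTrackers() { return trackers.size(); }
}
```

In this layout each event costs 13 bytes of array payload (three ints and one byte), plus array growth slack and one shared string per distinct tracker.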
> IndexCache: memory used by the index cache varies directly with the number of
> reduces. On a large cluster 10MB of index cache is not enough, so we set it to
> 100MB. Again, using a primitive long[] instead of IndexRecord[] can save 50%
> of the memory.
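The long[] idea above could look roughly like the following. It assumes each index record consists of three long values per reduce partition (start offset, raw length, part length); `PackedSpillIndex` is a hypothetical name and not the IndexCache implementation. The saving comes from dropping the per-record object header and the array reference, and the exact ratio depends on the JVM.

```java
// Hypothetical sketch: all index records of one spill packed into a single
// long[] (three slots per partition) instead of an array of record objects.
final class PackedSpillIndex {
    private static final int FIELDS = 3; // startOffset, rawLength, partLength

    private final long[] data;

    PackedSpillIndex(int partitions) {
        data = new long[partitions * FIELDS];
    }

    void put(int partition, long startOffset, long rawLength, long partLength) {
        int base = partition * FIELDS;
        data[base] = startOffset;
        data[base + 1] = rawLength;
        data[base + 2] = partLength;
    }

    long startOffset(int partition) { return data[partition * FIELDS]; }
    long rawLength(int partition)   { return data[partition * FIELDS + 1]; }
    long partLength(int partition)  { return data[partition * FIELDS + 2]; }

    int partitions() { return data.length / FIELDS; }

    // Payload only: 8 bytes per long, excluding the single array header.
    long payloadBytes() { return 8L * data.length; }
}
```

For a job with 10000 reduces, one such index holds 240000 bytes of payload per map output, which is why the cache size has to scale with the reduce count.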
> Although some of the above won't be a problem in MR-v2, MR-v1 is still
> widely used, so I think these optimizations are needed.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)