[ https://issues.apache.org/jira/browse/YARN-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yingqi Lu updated YARN-4282:
----------------------------
    Attachment: flamegraph.png

> JVM reuse in Yarn
> -----------------
>
>                 Key: YARN-4282
>                 URL: https://issues.apache.org/jira/browse/YARN-4282
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Yingqi Lu
>              Labels: performance
>         Attachments: flamegraph.png
>
>
> Dear All,
> Recently, we identified an issue in Yarn when running MapReduce: a
> significant amount of time is spent in libjvm.so, and most of that time is
> compilation.
> Attached is a flame graph (visual call graph) of a query running for about 8
> minutes. Most of the yellow bars represent ‘libjvm.so’ functions, while the
> Java functions are colored red. The data show that more than 40% of overall
> execution time is spent in compilation itself, yet a look inside the JVMs
> shows that a lot of code still runs in interpreter mode. Ideally, we want
> everything to run as compiled code over and over again. In reality, however,
> mappers and reducers die long before the compilation benefits kick in. In
> other words, we take the performance hit from both compilation and the
> interpreter. The JVM reuse feature in MapReduce 1.0 addressed this issue, but
> it was removed in Yarn. We are currently working on a set of JVM parameters
> to minimize the performance impact, but we still think it would be good to
> open a discussion here to look for more permanent solutions, since the
> problem is tied to the nature of how Yarn works.
> We are wondering whether any of you have seen this issue before, or whether
> there is any ongoing project to address it.
> Data for this graph was collected across the entire system with multiple JVMs
> running. The workload we use is the BigBench workload
> (https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench).
> Thanks,
> Yingqi Lu
> 1. Software and workloads used in performance tests may have been optimized
> for performance only on Intel microprocessors. Performance tests, such as
> SYSmark and MobileMark, are measured using specific computer systems,
> components, software, operations and functions. Any change to any of those
> factors may cause the results to vary. You should consult other information
> and performance tests to assist you in fully evaluating your contemplated
> purchases, including the performance of that product when combined with other
> products.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)