[ https://issues.apache.org/jira/browse/HIVE-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933491#comment-15933491 ]
Hive QA commented on HIVE-16166:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12859609/HIVE-16166.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10480 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=141)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4250/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4250/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4250/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12859609 - PreCommit-HIVE-Build

> HS2 may still waste up to 15% of memory on duplicate strings
> ------------------------------------------------------------
>
>                 Key: HIVE-16166
>                 URL: https://issues.apache.org/jira/browse/HIVE-16166
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>         Attachments: ch_2_excerpt.txt, HIVE-16166.01.patch, HIVE-16166.02.patch
>
>
> A heap dump obtained from one of our users shows that 15% of memory is wasted
> on duplicate strings, despite the recent optimizations that I made. The
> problematic strings just come from different sources this time. See the
> excerpt from the jxray (www.jxray.com) analysis attached.
> Adding String.intern() calls in the appropriate places reduces the overhead
> of duplicate strings with this workload to ~6%. The remaining duplicates come
> mostly from JDK internal and MapReduce data structures, and thus are more
> difficult to fix.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
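
For readers unfamiliar with the technique named in the issue description, the following is a minimal sketch of how String.intern() collapses equal strings into one shared instance. The class name InternSketch and the helper dedupe are hypothetical and are not taken from the HIVE-16166 patch; the path value is made up for illustration.

```java
public class InternSketch {

    // Hypothetical helper (not from the Hive patch): returns the canonical
    // pooled instance for a string, passing null through unchanged.
    public static String dedupe(String s) {
        return (s == null) ? null : s.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents,
        // which is the kind of duplication a heap dump would report.
        String a = new String("/warehouse/example_table");
        String b = new String("/warehouse/example_table");

        System.out.println("equal contents:        " + a.equals(b));
        System.out.println("same object before:    " + (a == b));

        // After interning, both references point at the single pooled copy,
        // so the duplicates become eligible for garbage collection.
        String ia = dedupe(a);
        String ib = dedupe(b);
        System.out.println("same object after:     " + (ia == ib));
    }
}
```

Interning is usually applied at the point where a long-lived field is assigned, so that short-lived temporaries are not added to the pool unnecessarily.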