[
https://issues.apache.org/jira/browse/HIVE-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933583#comment-15933583
]
Misha Dmitriev commented on HIVE-16166:
---------------------------------------
I ran 'mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=vector_if_expr.q'
locally, and it passed.
I then checked the hive log at
http://104.198.109.242/logs/PreCommit-HIVE-Build-4250/failed/141-TestMiniLlapLocalCliDriver-skewjoinopt15.q-vector_coalesce.q-orc_ppd_decimal.q-and-27-more/logs/hive.log
It does contain a number of exception stack traces, but they don't appear to
be related to my changes. At least I don't see 'StringInternUtils' (my class,
where an NPE or similar is most likely to happen), and the many NPEs scattered
across this log are all of the same type and show no trace of the code that
I've modified. I also can't see where in this log the problematic test
(vector_if_expr) starts. Or do all the tests run in parallel?
> HS2 may still waste up to 15% of memory on duplicate strings
> ------------------------------------------------------------
>
> Key: HIVE-16166
> URL: https://issues.apache.org/jira/browse/HIVE-16166
> Project: Hive
> Issue Type: Improvement
> Reporter: Misha Dmitriev
> Assignee: Misha Dmitriev
> Attachments: ch_2_excerpt.txt, HIVE-16166.01.patch,
> HIVE-16166.02.patch
>
>
> A heap dump obtained from one of our users shows that 15% of memory is wasted
> on duplicate strings, despite the recent optimizations that I made. The
> problematic strings just come from different sources this time. See the
> excerpt from the jxray (www.jxray.com) analysis attached.
> Adding String.intern() calls in the appropriate places reduces the overhead
> of duplicate strings with this workload to ~6%. The remaining duplicates come
> mostly from JDK internal and MapReduce data structures, and thus are more
> difficult to fix.
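
As a rough illustration of the interning approach described above, here is a
minimal Java sketch. The class and method names (InternSketch,
internIfNotNull, internStringsInList) are hypothetical and are not taken from
the attached patches or from the actual StringInternUtils class. It only shows
how String.intern() collapses equal strings into one shared instance, with
null checks so that interning itself cannot throw an NPE.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not Hive's actual StringInternUtils. */
final class InternSketch {
    private InternSketch() {}

    /** Returns the canonical pooled copy of s, or null if s is null. */
    static String internIfNotNull(String s) {
        return s == null ? null : s.intern();
    }

    /**
     * Replaces each element of the list, in place, with its interned copy.
     * Assumes a mutable list; an unmodifiable one would throw on set().
     */
    static List<String> internStringsInList(List<String> list) {
        if (list == null) {
            return null;
        }
        for (int i = 0; i < list.size(); i++) {
            list.set(i, internIfNotNull(list.get(i)));
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> cols = new ArrayList<>();
        // new String(...) forces two distinct heap objects with equal contents.
        cols.add(new String("col_name"));
        cols.add(new String("col_name"));
        cols.add(null);

        System.out.println(cols.get(0) == cols.get(1)); // false: two copies
        internStringsInList(cols);
        System.out.println(cols.get(0) == cols.get(1)); // true: one shared copy
    }
}
```

Interning pays off only for strings that are long-lived and heavily
duplicated; for short-lived or mostly unique strings the string pool lookup
adds cost without saving memory.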