[ https://issues.apache.org/jira/browse/HIVE-14168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368644#comment-15368644 ]
Siddharth Seth commented on HIVE-14168:
---------------------------------------

Any thoughts on this?

> Avoid serializing all parameters from HiveConf.java into in-memory HiveConf instances
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-14168
>                 URL: https://issues.apache.org/jira/browse/HIVE-14168
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Priority: Critical
>
> All non-null parameters from HiveConf.java are explicitly set in each HiveConf instance.
> {code}
> // Overlay the ConfVars. Note that this ignores ConfVars with null values
> addResource(getConfVarInputStream());
> {code}
> This unnecessarily bloats each Configuration object: 400+ conf variables are set, instead of the probably <30 that would exist in hive-site.xml.
> Looking at an HS2 heap dump, HiveConf is almost always the largest component by a long way. Conf objects are also serialized very often, transmitting lots of unneeded variables (a serialized Hive conf is typically 1000+ variables, due to Hadoop injecting its configs into every config instance).
> As long as HiveConf.get() is the approach used to read from a config, this is avoidable, because the default can be resolved at read time. Hive code itself should be doing this.
> This would be a potentially incompatible change for UDFs and other plugins which have access to a Configuration object and read it directly.
> I'd suggest turning off the insert by default, and adding a flag to control this.
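
The trade-off described in the issue can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Hive code: `SketchVar` stands in for `HiveConf.ConfVars`, `SketchConf` stands in for a `Configuration`-like object, the `sketch.*` keys and the `overlayDefaults` flag are all hypothetical names. With the flag on, every non-null default is copied into the instance; with it off, only explicitly set values are stored and the default is resolved when the value is read.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for HiveConf.ConfVars: each variable knows its own default.
enum SketchVar {
    PARALLEL("sketch.exec.parallel", "false"),
    SCRATCH_DIR("sketch.exec.scratchdir", "/tmp/sketch"),
    NO_DEFAULT("sketch.no.default", null);

    final String key;
    final String defaultValue;

    SketchVar(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }
}

// Hypothetical stand-in for a Configuration-like object.
class SketchConf {
    private final Map<String, String> props = new HashMap<>();

    // overlayDefaults is the assumed flag: true copies every non-null default
    // into this instance (the behaviour the issue describes), false stores nothing up front.
    SketchConf(boolean overlayDefaults) {
        if (overlayDefaults) {
            for (SketchVar v : SketchVar.values()) {
                if (v.defaultValue != null) {
                    props.put(v.key, v.defaultValue);
                }
            }
        }
    }

    void set(SketchVar var, String value) {
        props.put(var.key, value);
    }

    // Typed read: falls back to the variable's default when nothing is stored.
    String get(SketchVar var) {
        String stored = props.get(var.key);
        return stored != null ? stored : var.defaultValue;
    }

    // Raw read by key, as a plugin holding only the underlying object might do.
    String getRaw(String key) {
        return props.get(key);
    }

    // Number of entries that would be held in memory and serialized.
    int storedEntries() {
        return props.size();
    }
}
```

Reads that go through the typed `get` return the same values in both modes, while `storedEntries()` differs. A caller that uses `getRaw` with the flag off sees `null` for an unset variable, which is the incompatibility for UDFs and plugins that the issue warns about.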