[ 
https://issues.apache.org/jira/browse/HIVE-14168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941904#comment-15941904
 ] 

Pengcheng Xiong commented on HIVE-14168:
----------------------------------------

I am deferring this to Hive 3.0, as we are going to cut the first RC and this 
issue is not marked as a blocker.

> Avoid serializing all parameters from HiveConf.java into in-memory HiveConf 
> instances
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-14168
>                 URL: https://issues.apache.org/jira/browse/HIVE-14168
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Priority: Critical
>
> All non-null parameters from HiveConf.java are explicitly set in each 
> HiveConf instance.
> {code}
> // Overlay the ConfVars. Note that this ignores ConfVars with null values
>     addResource(getConfVarInputStream());
> {code}
> This unnecessarily bloats each Configuration object - 400+ conf variables 
> are set instead of the probably <30 that would exist in hive-site.xml.
> Looking at an HS2 heap dump - HiveConf is almost always the largest component 
> by a long way. Conf objects are also serialized very often - transmitting 
> lots of unneeded variables (serialized Hive conf is typically 1000+ variables 
> - due to Hadoop injecting its configs into every config instance).
> As long as HiveConf.get() is the approach used to read from a config, this 
> is avoidable. Hive code itself should be doing this.
> This would be a potentially incompatible change for UDFs and other plugins 
> which have access to a Configuration object.
> I'd suggest turning off the insert by default, and adding a flag to control 
> this.
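The trade-off in the description above can be illustrated with a minimal, self-contained sketch. It does not use Hive or Hadoop classes: LazyConfSketch, Var, eager() and get() are hypothetical names, and the "sketch.*" keys and their defaults are made up for illustration. It only contrasts copying every non-null default into each instance with storing just the site overrides and resolving defaults at read time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not Hive's actual HiveConf implementation.
final class LazyConfSketch {

    // Stand-in for a ConfVars-style enum: each variable carries its own default.
    enum Var {
        EXEC_ENGINE("sketch.exec.engine", "mr"),
        SCRATCH_DIR("sketch.scratch.dir", "/tmp/hive"),
        NO_DEFAULT("sketch.no.default", null);

        final String key;
        final String defaultValue;

        Var(String key, String defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }
    }

    // Eager approach: every non-null default is written into the instance,
    // then the site overrides are applied on top.
    static Map<String, String> eager(Map<String, String> site) {
        Map<String, String> props = new HashMap<>();
        for (Var v : Var.values()) {
            if (v.defaultValue != null) {
                props.put(v.key, v.defaultValue);
            }
        }
        props.putAll(site);
        return props;
    }

    // Lazy approach: the instance holds only explicit overrides, and the
    // default is resolved from the variable definition at read time.
    static String get(Map<String, String> props, Var v) {
        String value = props.get(v.key);
        return value != null ? value : v.defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> site = new HashMap<>();
        site.put(Var.EXEC_ENGINE.key, "tez");

        Map<String, String> eagerProps = eager(site);
        Map<String, String> lazyProps = new HashMap<>(site);

        System.out.println("eager entries: " + eagerProps.size());
        System.out.println("lazy entries: " + lazyProps.size());
        System.out.println("typed read: " + get(lazyProps, Var.SCRATCH_DIR));
        // A raw lookup by key bypasses the default, which is the
        // compatibility risk for plugins noted in the description.
        System.out.println("raw read: " + lazyProps.get(Var.SCRATCH_DIR.key));
    }
}
```

In this sketch the lazy map stays as small as the set of overrides, and reads through get() still see the defaults. Code that looks up the raw key directly no longer sees them, which is why the description suggests putting the change behind a flag.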



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
