[ 
https://issues.apache.org/jira/browse/HIVE-23175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116304#comment-17116304
 ] 

Ashutosh Chauhan commented on HIVE-23175:
-----------------------------------------

[~mustafaiman] would you like to rebase your patch? The two new methods that 
TEZ-4137 introduced in TezUtils are static, so we can duplicate them in Hive 
temporarily while waiting for a new Tez release.

> Skip serializing hadoop and tez config on HS side
> -------------------------------------------------
>
>                 Key: HIVE-23175
>                 URL: https://issues.apache.org/jira/browse/HIVE-23175
>             Project: Hive
>          Issue Type: Improvement
>          Components: Tez
>            Reporter: Mustafa Iman
>            Assignee: Mustafa Iman
>            Priority: Major
>         Attachments: HIVE-23175.1.patch, HIVE-23175.2.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> HiveServer2 (HS2) spends a lot of time serializing configuration objects. We 
> can skip putting the Hadoop and Tez config XML files in the payload, assuming 
> that the configs are the same on both the HS2 and task sides. This depends on 
> Tez loading local XML configs when creating config objects 
> ([https://issues.apache.org/jira/browse/TEZ-4137]). 
> Ideally we should be able to skip hive-site.xml too. However, if we skip 
> hive-site.xml at that stage, we make wrong choices at the Tez DAG build 
> stage due to missing configs.
> In the ideal version of this, the DAG and vertex build phases would not both 
> read configs from and write new configs to the same config object. Instead, 
> we would read from HS2's HiveConf object and write to a brand new JobConf 
> for each vertex. That way no vertex's JobConf would contain unnecessary 
> items. However, the DAG and vertex build stages (TezTask#build) and many of 
> the components called from there treat a single config object as both the 
> source of HS2-side config and the target JobConf into which they put 
> vertex-level options. It is very hard to separate these concerns now.
> With this patch, we reduce the size of the JobConf (per vertex) by ~65%. 
> This should improve transmit latency. However, the most significant gains are 
> in CPU time spent compressing job configs, as the config objects are now much 
> smaller.
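
The idea in the description above can be sketched in plain Java. This is only an illustration, not the code in the patch: the class `ConfigPayloadSketch`, the `Prop` type, the `buildPayload` method, and the list of resource file names are all hypothetical, and a plain map stands in for the real Hadoop and Hive config objects. The sketch assumes each property remembers which XML resource it was loaded from, and it leaves out of the payload any property whose resource the task side is assumed to load locally.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from Hive or Tez.
final class ConfigPayloadSketch {

    /** A config value together with the resource file it was loaded from. */
    static final class Prop {
        final String value;
        final String source;

        Prop(String value, String source) {
            this.value = value;
            this.source = source;
        }
    }

    // Assumption: these files are identical on the HS2 and task sides, so the
    // task can reload them locally. The exact list is illustrative only.
    static final Set<String> LOCALLY_AVAILABLE = new HashSet<>(
            Arrays.asList("core-site.xml", "hdfs-site.xml", "tez-site.xml"));

    /**
     * Builds the payload for one vertex. It reads from the full config and
     * writes into a fresh map, so the source object is never mutated.
     */
    static Map<String, String> buildPayload(Map<String, Prop> fullConf) {
        Map<String, String> payload = new LinkedHashMap<>();
        for (Map.Entry<String, Prop> e : fullConf.entrySet()) {
            Prop p = e.getValue();
            // Keep only properties the task side cannot recover by itself,
            // for example those that came from hive-site.xml.
            if (!LOCALLY_AVAILABLE.contains(p.source)) {
                payload.put(e.getKey(), p.value);
            }
        }
        return payload;
    }
}
```

In this sketch, properties loaded from hive-site.xml stay in the payload, which mirrors the description's point that hive-site.xml cannot be skipped yet.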


