Mustafa Iman created TEZ-4137:
---------------------------------
Summary: Input/Output processors should merge payload to local conf
Key: TEZ-4137
URL: https://issues.apache.org/jira/browse/TEZ-4137
Project: Apache Tez
Issue Type: Improvement
Reporter: Mustafa Iman
Assignee: Mustafa Iman
This patch introduces config merging to various Input and Output processors. As
described in https://issues.apache.org/jira/browse/TEZ-4073 , we need to reduce
the size of the configuration objects transferred over the wire. We are planning
two improvements to that end:
# Skip sending default configs and configs that come from XML files in the
payload.
# Send DAG, vertex and session configurations in layers instead of sending the
combined DAG + vertex + session configs three times.
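
As a rough illustration of the first improvement, the sender could keep only the entries that differ from what the receiver already has locally. This is a minimal sketch using plain maps rather than Hadoop Configuration objects; the class name PayloadDelta, the method delta and the example keys are hypothetical and are not part of Tez or of this patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not a Tez API: computes the entries worth sending.
class PayloadDelta {
    /**
     * Returns the entries of 'effective' whose value is missing from, or
     * different in, 'defaults'. Everything else can be skipped because the
     * receiver is assumed to load the same defaults from its own XML files.
     */
    static Map<String, String> delta(Map<String, String> defaults,
                                     Map<String, String> effective) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : effective.entrySet()) {
            if (!Objects.equals(e.getValue(), defaults.get(e.getKey()))) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("example.sort.mb", "100");      // illustrative key
        defaults.put("example.compress", "false");   // illustrative key

        Map<String, String> effective = new HashMap<>(defaults);
        effective.put("example.compress", "true");   // overridden by the user

        // Only the overridden entry ends up in the payload.
        System.out.println(delta(defaults, effective));
    }
}
```

The same idea would apply per layer: each of the session, DAG and vertex layers would carry only what it adds on top of the layer beneath it.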
To achieve these:
* We need to expose the local config, loaded from configuration files, to
TaskContext.
* Input/Output processors must merge the config from the user payload into the
local config in their TaskContext.
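
The merge step described above can be sketched as follows. This is only an illustration with plain maps, assuming that payload entries take precedence over local ones; the class ConfMerge and its merge method are hypothetical stand-ins and do not reflect the actual TaskContext or Configuration APIs touched by the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a Tez API: layers a payload over the local config.
class ConfMerge {
    /**
     * Starts from a copy of the local config (what the container read from
     * its own configuration files) and applies the payload on top, so that
     * payload entries override local ones. The inputs are not modified.
     */
    static Map<String, String> merge(Map<String, String> localConf,
                                     Map<String, String> payload) {
        Map<String, String> merged = new HashMap<>(localConf);
        merged.putAll(payload);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("example.sort.mb", "100");       // illustrative key
        local.put("example.compress", "false");    // illustrative key

        Map<String, String> payload = new HashMap<>();
        payload.put("example.compress", "true");   // sent over the wire

        Map<String, String> merged = merge(local, payload);
        System.out.println(merged.get("example.sort.mb"));   // from local
        System.out.println(merged.get("example.compress"));  // from payload
    }
}
```

With a full-sized payload, as Hive sends today, the merged result is identical to the payload alone, which is why the merge is pure overhead until the sender starts trimming.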
This issue covers the configuration merging part. Once it is merged,
corresponding changes should be made on the Hive side to stop sending redundant
configs. Until the Hive side is updated, the changes here add only overhead,
because every config object is still identical and already carries the full set
of config options.