anishek created HIVE-17814:
------------------------------

             Summary: Reduce Memory footprint for large database bootstrap replication load
                 Key: HIVE-17814
                 URL: https://issues.apache.org/jira/browse/HIVE-17814
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 3.0.0
            Reporter: anishek
            Assignee: anishek
             Fix For: 3.0.0

As part of HIVE-16896, query tasks for bootstrap repl load are generated dynamically. This was done because a large database would otherwise produce a very large task graph with hundreds of thousands of objects, which puts additional memory pressure on Hive. The execution hooks, however, still keep a reference to the query plan, which is modified dynamically, so by the end of task execution Hive holds the whole DAG in memory, which is exactly what we have to prevent. In addition, for PostExecution Hive hooks we store the TaskRunner object for each task that is executed. We have to handle both issues to prevent excessive memory usage during replication, specifically bootstrap replication.
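
To illustrate the TaskRunner point, here is a minimal Java sketch of why keeping one object per executed task grows with the number of tasks, and how a bounded summary avoids that. All names in it (`RunnerRecord`, `RetainAllContext`, `BoundedContext`) are hypothetical stand-ins, not Hive classes, and this is not necessarily the approach taken for this issue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; none of these types exist in Hive.
public class HookRetentionSketch {

    /** Stand-in for the per-task object a post-execution hook might be handed. */
    static final class RunnerRecord {
        final String taskId;
        final int exitCode;

        RunnerRecord(String taskId, int exitCode) {
            this.taskId = taskId;
            this.exitCode = exitCode;
        }
    }

    /** Keeps every completed record, so retained memory grows with the task count. */
    static final class RetainAllContext {
        private final List<RunnerRecord> completed = new ArrayList<>();

        void onTaskCompleted(RunnerRecord r) {
            completed.add(r);
        }

        int retained() {
            return completed.size();
        }
    }

    /** Keeps counters plus at most 'limit' recent records, so retained memory is bounded. */
    static final class BoundedContext {
        private final Deque<RunnerRecord> recent = new ArrayDeque<>();
        private final int limit;
        private long total;
        private long failed;

        BoundedContext(int limit) {
            this.limit = limit;
        }

        void onTaskCompleted(RunnerRecord r) {
            total++;
            if (r.exitCode != 0) {
                failed++;
            }
            recent.addLast(r);
            if (recent.size() > limit) {
                recent.removeFirst(); // drop the oldest record so it can be garbage collected
            }
        }

        int retained() {
            return recent.size();
        }

        long total() {
            return total;
        }

        long failed() {
            return failed;
        }
    }

    public static void main(String[] args) {
        RetainAllContext all = new RetainAllContext();
        BoundedContext bounded = new BoundedContext(10);
        for (int i = 0; i < 100_000; i++) {
            RunnerRecord r = new RunnerRecord("task-" + i, 0);
            all.onTaskCompleted(r);
            bounded.onTaskCompleted(r);
        }
        System.out.println("retain-all keeps " + all.retained() + " records");
        System.out.println("bounded keeps " + bounded.retained()
                + " records for " + bounded.total() + " tasks");
    }
}
```

The same reasoning applies to the query plan reference: as long as a long-lived hook object points at the growing plan, none of the already-executed parts of the DAG can be reclaimed.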