[
https://issues.apache.org/jira/browse/HIVE-8208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xuefu Zhang updated HIVE-8208:
------------------------------
Fix Version/s: (was: spark-branch)
> Multi-table insertion optimization #1: don't always break operator tree.
> [Spark Branch]
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-8208
> URL: https://issues.apache.org/jira/browse/HIVE-8208
> Project: Hive
> Issue Type: Improvement
> Reporter: Chao Sun
>
> Currently, the multi-table insertion patch breaks the operator tree
> whenever a single TableScanOperator leads to multiple
> FileSinkOperators. It identifies the lowest common ancestor (LCA) of those
> FileSinkOperators and breaks the tree there, creating as many child
> SparkTasks as there are FileSinkOperators.
> However, in the following situation it's better not to break the operator
> tree:
> Of all the paths from these FileSinkOperators to the LCA, at most one
> contains a ReduceSinkOperator.
> In this case, the work can be done in a single Spark job, and there is no
> need to break the operator tree.
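
The condition described above can be sketched as a small check over a toy operator tree. This is only an illustration, not Hive code: the `Op` class and the functions `ancestors`, `lowest_common_ancestor` and `can_stay_in_one_job` are hypothetical names, each operator is assumed to have a single parent, and the LCA itself is assumed not to count as part of a path.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Op:
    """Toy operator node (hypothetical); 'kind' mirrors a Hive operator name."""
    kind: str
    parent: Optional["Op"] = None  # assumption: one parent per operator


def ancestors(op: Optional[Op]) -> List[Op]:
    """Return the chain [op, parent, grandparent, ..., root]."""
    chain = []
    while op is not None:
        chain.append(op)
        op = op.parent
    return chain


def lowest_common_ancestor(sinks: List[Op]) -> Op:
    """Return the deepest operator that lies above every sink."""
    chains = [ancestors(s) for s in sinks]
    # Walk upward from the first sink's parent; the first hit is the lowest.
    for candidate in chains[0][1:]:
        if all(candidate in chain for chain in chains):
            return candidate
    raise ValueError("sinks do not share an ancestor")


def can_stay_in_one_job(sinks: List[Op]) -> bool:
    """True if at most one sink-to-LCA path contains a ReduceSinkOperator."""
    lca = lowest_common_ancestor(sinks)
    paths_with_rs = 0
    for sink in sinks:
        op = sink.parent
        has_rs = False
        while op is not lca:  # the LCA itself is excluded (assumption)
            if op.kind == "ReduceSinkOperator":
                has_rs = True
            op = op.parent
        paths_with_rs += has_rs
    return paths_with_rs <= 1


# Example tree: TS -> SEL -> FS, and TS -> SEL -> RS -> GBY -> FS
ts = Op("TableScanOperator")
sel = Op("SelectOperator", ts)
fs_plain = Op("FileSinkOperator", sel)
fs_grouped = Op("FileSinkOperator",
                Op("GroupByOperator", Op("ReduceSinkOperator", sel)))
print(can_stay_in_one_job([fs_plain, fs_grouped]))
```

In this example only one of the two paths passes through a ReduceSinkOperator, so the check reports that the tree does not need to be broken. A second branch with its own ReduceSinkOperator below the same LCA would make it report the opposite.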
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)