Chao created HIVE-8208:
--------------------------

             Summary: Multi-table insertion optimization #1: don't always break 
operator tree. [Spark Branch]
                 Key: HIVE-8208
                 URL: https://issues.apache.org/jira/browse/HIVE-8208
             Project: Hive
          Issue Type: Improvement
            Reporter: Chao


With the current multi-table insertion patch, the operator tree is broken 
whenever one TableScanOperator can lead to multiple FileSinkOperators. The 
patch identifies the lowest common ancestor (LCA) of those FileSinkOperators 
and breaks the tree there, creating as many child SparkTasks as there are 
FileSinkOperators.

However, in the following situation it's better not to break the operator tree:

Among all the paths from these FileSinkOperators to the LCA, at most one path 
(that is, 0 or 1 of them) contains a ReduceSinkOperator.

In this case, the work can be done in a single Spark job, and there is no need 
to break the operator tree.
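
The condition above can be sketched as a small check. This is only an 
illustration in Python, not Hive code: the Op class, the short kind labels 
("RS" for ReduceSinkOperator, "FS" for FileSinkOperator, and so on) and both 
function names are hypothetical, and the operator graph below the LCA is 
assumed to be a tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Hypothetical stand-in for a Hive operator; not the real Operator class."""
    kind: str  # illustrative labels: "FIL", "SEL", "GBY", "RS", "FS"
    children: List["Op"] = field(default_factory=list)


def count_paths_with_reduce_sink(op: Op, seen_rs: bool = False) -> int:
    """Count paths from `op` down to a FileSink ("FS") that pass through
    at least one ReduceSink ("RS")."""
    seen_rs = seen_rs or op.kind == "RS"
    if op.kind == "FS":
        return 1 if seen_rs else 0
    return sum(count_paths_with_reduce_sink(c, seen_rs) for c in op.children)


def can_stay_in_one_job(lca: Op) -> bool:
    """True when 0 or 1 LCA-to-FileSink paths contain a ReduceSink,
    i.e. the case where the tree need not be broken."""
    return count_paths_with_reduce_sink(lca) <= 1


# No ReduceSink on either path: keep the tree intact.
map_only = Op("FIL", [Op("SEL", [Op("FS")]), Op("SEL", [Op("FS")])])

# A ReduceSink on exactly one path: still one job.
one_rs = Op("FIL", [Op("SEL", [Op("FS")]),
                    Op("RS", [Op("GBY", [Op("FS")])])])

# ReduceSinks on two paths: fall back to breaking the tree at the LCA.
two_rs = Op("FIL", [Op("RS", [Op("FS")]), Op("RS", [Op("FS")])])

print(can_stay_in_one_job(map_only),
      can_stay_in_one_job(one_rs),
      can_stay_in_one_job(two_rs))
```

In this sketch only the third tree would still be broken at the LCA; the 
first two would stay in a single job.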



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
