[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127504#comment-14127504 ]

Chao commented on HIVE-8024:

There's a problem again. Suppose the operator tree is the following:

{code}
TS_0   TS_2
   \   /
  UNION_3
     |
   SEL_4
{code}

After removing the UNION operator, it will look like this:

{code}
TS_0   TS_2
   \   /
   SEL_4
{code}

(Again, I ignored some operators, but you get the idea.)

Then we could have MapWork 1 starting from {{TS_0}} and MapWork 2 starting from {{TS_2}}. Now, when MapWork 1 initializes itself, it initializes the operator tree starting from {{TS_0}} and goes down the tree. When it gets to {{SEL_4}}, it will not be able to initialize it, because not all of {{SEL_4}}'s parents are initialized at that point. Hence, the execution will fail.

Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]

Key: HIVE-8024
URL: https://issues.apache.org/jira/browse/HIVE-8024
Project: Hive
Issue Type: Task
Components: Spark
Reporter: Chao
Assignee: Chao

Currently, after the operator tree is processed, the generated works with union operators go through {{GenSparkUtils::removeUnionOperators}}, which clones the original operator plan associated with the work and removes the union operators in it. This caused some issues, as seen, for example, in HIVE-7870. This JIRA is created to find out whether it's possible to just remove the union operators in the original plan.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
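The initialization failure described in this comment can be reproduced in a toy model. The sketch below is not Hive code: `SharedChildDemo`, `Op`, `connect`, `allParentsInitialized`, and `initialize` are hypothetical names, and the rule that an operator initializes only after all of its parents have is assumed from the comment's description rather than taken from Hive's sources.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedChildDemo {
    /** Toy operator node; a stand-in for illustration, not Hive's Operator class. */
    static final class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();
        boolean initialized = false;

        Op(String name) { this.name = name; }

        /** Vacuously true for roots such as table scans, which have no parents. */
        boolean allParentsInitialized() {
            for (Op p : parents) {
                if (!p.initialized) return false;
            }
            return true;
        }

        /** Initializes this node only once every parent is ready, then walks down. */
        void initialize() {
            if (initialized || !allParentsInitialized()) return;
            initialized = true;
            for (Op c : children) c.initialize();
        }
    }

    /** Links parent and child in both directions. */
    static void connect(Op parent, Op child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    /** Builds TS_0 and TS_2 feeding SEL_4 directly, then initializes only TS_0. */
    static boolean sel4InitializedFromSingleRoot() {
        Op ts0 = new Op("TS_0");
        Op ts2 = new Op("TS_2");
        Op sel4 = new Op("SEL_4");
        connect(ts0, sel4);
        connect(ts2, sel4);
        ts0.initialize();          // assumed: MapWork 1 starts from TS_0 only
        return sel4.initialized;   // TS_2 was never initialized in this work
    }

    public static void main(String[] args) {
        System.out.println("SEL_4 initialized: " + sel4InitializedFromSingleRoot());
    }
}
```

Running `main` prints `SEL_4 initialized: false`: the walk from `TS_0` stops at `SEL_4` because its other parent, `TS_2`, belongs to a different work and is never initialized there.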
[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127509#comment-14127509 ]

Chao commented on HIVE-8024:

Closing this JIRA, as this approach does not seem to work, or at least is hard to implement. [~xuefuz] is writing a design doc and we'll continue the discussion there.
[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126398#comment-14126398 ]

Chao commented on HIVE-8024:

I think there's a problem with removing the union op in place. Suppose a plan for a multi-insert looks like this:

{code}
  TS_0   TS_2
     \   /
    UNION_3
     /   \
 SEL_1   SEL_4
{code}

(I ignored some operators.)

Currently, {{TS_0}} and {{TS_2}} will be in two MapWorks, which have separate plans, like the following:

{code}
    TS_0            TS_2
    /  \            /  \
SEL_1  SEL_4    SEL_1  SEL_4
{code}

If we remove the union operator from the original tree, the result may not be correct.
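The clone-then-remove behavior that produces the two separate plans can be sketched on a toy graph. This is an illustration under stated assumptions, not the actual `GenSparkUtils::removeUnionOperators` implementation: `UnionRemovalSketch`, `Op`, `cloneFrom`, `removeUnions`, `describe`, `planFor`, and `buildRoots` are all hypothetical names, and operators are identified by name only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UnionRemovalSketch {
    /** Toy operator node; a stand-in for illustration, not Hive's Operator class. */
    static final class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();
        Op(String name) { this.name = name; }
    }

    static void connect(Op parent, Op child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    /** Copies only the operators reachable downward from root, one copy per name. */
    static Op cloneFrom(Op root, Map<String, Op> copies) {
        Op copy = copies.get(root.name);
        if (copy != null) return copy;
        copy = new Op(root.name);
        copies.put(root.name, copy);
        for (Op child : root.children) connect(copy, cloneFrom(child, copies));
        return copy;
    }

    /** Splices out union nodes below op; assumes each has a single parent in the clone. */
    static void removeUnions(Op op) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Op child : new ArrayList<>(op.children)) {
                if (!child.name.startsWith("UNION")) continue;
                int at = op.children.indexOf(child);
                op.children.remove(at);
                op.children.addAll(at, child.children);
                for (Op grandChild : child.children) {
                    grandChild.parents.remove(child);
                    grandChild.parents.add(op);
                }
                changed = true;
            }
        }
        for (Op child : op.children) removeUnions(child);
    }

    /** Renders a plan as ROOT(child, child, ...). */
    static String describe(Op op) {
        if (op.children.isEmpty()) return op.name;
        List<String> parts = new ArrayList<>();
        for (Op c : op.children) parts.add(describe(c));
        return op.name + "(" + String.join(", ", parts) + ")";
    }

    /** Clones the plan under one root and removes unions from the clone only. */
    static String planFor(Op root) {
        Op copy = cloneFrom(root, new HashMap<>());
        removeUnions(copy);
        return describe(copy);
    }

    /** Builds the multi-insert plan from the comment and returns {TS_0, TS_2}. */
    static Op[] buildRoots() {
        Op ts0 = new Op("TS_0"), ts2 = new Op("TS_2"), union3 = new Op("UNION_3");
        connect(ts0, union3);
        connect(ts2, union3);
        connect(union3, new Op("SEL_1"));
        connect(union3, new Op("SEL_4"));
        return new Op[] { ts0, ts2 };
    }

    public static void main(String[] args) {
        Op[] roots = buildRoots();
        System.out.println(planFor(roots[0]));
        System.out.println(planFor(roots[1]));
    }
}
```

Running `main` prints `TS_0(SEL_1, SEL_4)` and `TS_2(SEL_1, SEL_4)`, matching the two per-MapWork plans in the diagram. Because each clone contains a single root, the union in the clone has exactly one parent and can be spliced out safely, while the shared original tree still contains `UNION_3`.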
[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126400#comment-14126400 ]

Chao commented on HIVE-8024:

cc [~xuefuz], [~nyang]
[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126565#comment-14126565 ]

Xuefu Zhang commented on HIVE-8024:

[~csun], thanks for the quick update. This is interesting. However, if union3 has two children, then cloning the plan and then removing the union doesn't help either, as the cloned union also has two children. I don't know how this problem can be resolved by cloning the plan.

I guess your case came from a multi-insert query, for which I think the multi-insert logic should kick in first. In this case, we would use an FS - TS to separate the plan into multiple ones. Thus, this might be unrelated. Could you continue the research, excluding multi-insert cases?