[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]

2014-09-09 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127504#comment-14127504
 ] 

Chao commented on HIVE-8024:


There's a problem again:

Suppose the operator tree is the following:

{code}
 TS_0   TS_2
    \   /
   UNION_3
      |
    SEL_4
{code}

After removing UNION operator, it will look like this:

{code}
 TS_0   TS_2
    \   /
    SEL_4
{code}

(Again, I ignored some operators, but you get the idea.)

Then, we could have MapWork 1 starting from {{TS_0}}, and MapWork 2 starting 
from {{TS_2}}.
Now, when MapWork 1 initializes itself, it will initialize the operator tree, 
starting from {{TS_0}}, and go down the tree.
When it gets to {{SEL_4}}, it will not be able to initialize it, because not 
all of {{SEL_4}}'s parents are initialized at that point.
Hence, the execution will fail.
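
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch: {{InitDemo}}, {{Op}}, {{connect}} and {{buildAndInitFromFirstRoot}} are hypothetical names made up for illustration and are not Hive's operator classes. It assumes the rule stated in the comment, namely that an operator is initialized only once all of its parents are initialized.

```java
import java.util.ArrayList;
import java.util.List;

class InitDemo {
    /** Simplified stand-in for an operator node (hypothetical, not Hive's Operator). */
    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();
        boolean initialized = false;

        Op(String name) { this.name = name; }

        boolean allParentsInitialized() {
            for (Op p : parents) {
                if (!p.initialized) return false;
            }
            return true;
        }

        /** Initialize this node and descend; a node waits until every parent is ready. */
        void initialize() {
            if (initialized || !allParentsInitialized()) {
                return;
            }
            initialized = true;
            for (Op c : children) {
                c.initialize();
            }
        }
    }

    /** Wire parent -> child in both directions. */
    static void connect(Op parent, Op child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    /** Builds TS_0 and TS_2 over a shared SEL_4, then initializes from TS_0 only. */
    static Op[] buildAndInitFromFirstRoot() {
        Op ts0 = new Op("TS_0");
        Op ts2 = new Op("TS_2");
        Op sel4 = new Op("SEL_4");
        connect(ts0, sel4);
        connect(ts2, sel4);
        ts0.initialize(); // models MapWork 1, which only starts from TS_0
        return new Op[] { ts0, ts2, sel4 };
    }

    public static void main(String[] args) {
        for (Op op : buildAndInitFromFirstRoot()) {
            System.out.println(op.name + " initialized: " + op.initialized);
        }
    }
}
```

In this model {{SEL_4}} stays uninitialized after the walk from {{TS_0}}, because {{TS_2}} belongs to the other work and is never initialized here.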

 Find out whether it's possible to remove UnionOperator from original operator 
 tree [Spark Branch]
 -

 Key: HIVE-8024
 URL: https://issues.apache.org/jira/browse/HIVE-8024
 Project: Hive
  Issue Type: Task
  Components: Spark
Reporter: Chao
Assignee: Chao

 Currently, after operator tree is processed, the generated works with union 
 operators will go through {{GenSparkUtils::removeUnionOperators}}, which will 
 clone the original operator plan associated with the work, and remove union 
 operators in it. This caused some issues as seen, for example, in HIVE-7870. 
 This JIRA is created to find out whether it's possible to just remove the 
 union operators in the original plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]

2014-09-09 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127509#comment-14127509
 ] 

Chao commented on HIVE-8024:


Closing this JIRA, as this approach does not seem to work, or at least is hard 
to implement.
[~xuefuz] is writing a design doc and we'll continue the discussion there.



[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]

2014-09-08 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126398#comment-14126398
 ] 

Chao commented on HIVE-8024:


I think there's a problem with removing the union op in place. Suppose a plan 
for a multi-insert query looks like this:

{code}
  TS_0   TS_2
     \   /
    UNION_3
     /   \
  SEL_1  SEL_4
{code}

(I ignored some operators.)
Currently, {{TS_0}} and {{TS_2}} will be in two MapWorks, which have separate 
plans, like the following:

{code}
    TS_0          TS_2
    /  \          /  \
SEL_1  SEL_4  SEL_1  SEL_4
{code}

If we remove the union operator from the original tree, the result may not be 
correct.
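
To make the shape of the in-place result concrete, here is a minimal sketch of splicing a union node out of a shared tree. All names ({{UnionRemovalDemo}}, {{Node}}, {{removeInPlace}}, {{buildMultiInsertPlan}}) are hypothetical and are not taken from {{GenSparkUtils}}; the splice rule (every parent of the union is wired to every child of the union) is an assumption about what "removing in place" would mean.

```java
import java.util.ArrayList;
import java.util.List;

class UnionRemovalDemo {
    /** Simplified stand-in for an operator node (hypothetical, not Hive's Operator). */
    static class Node {
        final String name;
        final List<Node> parents = new ArrayList<>();
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }
    }

    /** Wire parent -> child in both directions. */
    static void connect(Node parent, Node child) {
        parent.children.add(child);
        child.parents.add(parent);
    }

    /** Splice a node out in place: every parent is wired to every child. */
    static void removeInPlace(Node union) {
        for (Node p : union.parents) {
            p.children.remove(union);
            p.children.addAll(union.children);
        }
        for (Node c : union.children) {
            c.parents.remove(union);
            c.parents.addAll(union.parents);
        }
        union.parents.clear();
        union.children.clear();
    }

    /** Builds the multi-insert plan: TS_0 and TS_2 -> UNION_3 -> SEL_1 and SEL_4. */
    static Node[] buildMultiInsertPlan() {
        Node ts0 = new Node("TS_0");
        Node ts2 = new Node("TS_2");
        Node union3 = new Node("UNION_3");
        Node sel1 = new Node("SEL_1");
        Node sel4 = new Node("SEL_4");
        connect(ts0, union3);
        connect(ts2, union3);
        connect(union3, sel1);
        connect(union3, sel4);
        return new Node[] { ts0, ts2, union3, sel1, sel4 };
    }

    public static void main(String[] args) {
        Node[] plan = buildMultiInsertPlan();
        removeInPlace(plan[2]);
        // After the splice, SEL_1 and SEL_4 are each shared by TS_0 and TS_2.
        System.out.println("SEL_1 parents: " + plan[3].parents.size());
        System.out.println("SEL_4 parents: " + plan[4].parents.size());
    }
}
```

Under these assumptions, each {{SEL}} node ends up with two parents in one shared tree, whereas in the separate per-MapWork plans shown above each copy of {{SEL_1}} and {{SEL_4}} has a single parent.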



[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]

2014-09-08 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126400#comment-14126400
 ] 

Chao commented on HIVE-8024:


cc [~xuefuz], [~nyang]



[jira] [Commented] (HIVE-8024) Find out whether it's possible to remove UnionOperator from original operator tree [Spark Branch]

2014-09-08 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126565#comment-14126565
 ] 

Xuefu Zhang commented on HIVE-8024:
---

[~csun], thanks for the quick update. This is interesting. However, if 
{{UNION_3}} has two children, then cloning the plan and then removing the union 
doesn't help either, as the cloned union also has two children. I don't know 
how this problem can be resolved by cloning the plan.

I guess your case came from a multi-insert query, for which I think the 
multi-insert logic should kick in first. In this case, we would use an FS - TS 
pair to split the plan into multiple plans.

Thus, this might be unrelated. Could you continue the research, excluding 
multi-insert cases?
