[ 
https://issues.apache.org/jira/browse/HIVE-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243576#comment-14243576
 ] 

Chao commented on HIVE-9041:
----------------------------

Yeah, previously we only considered the case of split MapWorks for a single 
table, but as this example shows, there can be split MapWorks for multiple 
tables, and they can be intermingled.

The immediate issue, from this example, is *how to make MW 3 know that it needs 
to get the saved IOContext from MW 1, but not MW 2*. Intuitively, we may need 
some kind of mapping to do this.
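
To make the mapping idea concrete, here is a rough sketch of what such a lookup 
could look like. Everything in it is hypothetical: IOContextLookup, SavedContext, 
save, link and lookup are illustrative names, not existing Hive classes or 
methods, and the assignment of table0 to MW 1 and table1 to MW 2 is only assumed 
for the example. The point is just that each split MapWork records which MapWork 
it was split from, so it can fetch that one's saved context and no other.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: none of these names exist in Hive.
class IOContextLookup {
    // Stand-in for the real IOContext; only the input path is modeled here.
    static final class SavedContext {
        final String inputPath;
        SavedContext(String inputPath) { this.inputPath = inputPath; }
    }

    // Saved contexts, keyed by the name of the MapWork that read the table.
    private final Map<String, SavedContext> savedBySource = new HashMap<>();
    // For each split MapWork, the name of the MapWork whose context it reuses.
    private final Map<String, String> sourceOf = new HashMap<>();

    // Called by the MapWork that actually scans the table.
    void save(String sourceWork, SavedContext ctx) {
        savedBySource.put(sourceWork, ctx);
    }

    // Called at plan time, when a MapWork is split off from sourceWork.
    void link(String splitWork, String sourceWork) {
        sourceOf.put(splitWork, sourceWork);
    }

    // Returns the context saved by the linked source, or null if there is none.
    SavedContext lookup(String splitWork) {
        String source = sourceOf.get(splitWork);
        return source == null ? null : savedBySource.get(source);
    }

    public static void main(String[] args) {
        IOContextLookup registry = new IOContextLookup();
        // Assumed for illustration: MW 1 reads table0, MW 2 reads table1.
        registry.save("MW 1", new SavedContext("table0"));
        registry.save("MW 2", new SavedContext("table1"));
        // MW 3 was split from MW 1, so it must reuse MW 1's context.
        registry.link("MW 3", "MW 1");
        System.out.println(registry.lookup("MW 3").inputPath); // prints "table0"
    }
}
```

With the link recorded explicitly, MW 3 never has to guess between MW 1 and MW 2, 
even when the split MapWorks for the two tables are intermingled. How the link 
would actually be populated while the plan is generated is left open here.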


> Generate better plan for queries containing both union and multi-insert 
> [Spark Branch]
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-9041
>                 URL: https://issues.apache.org/jira/browse/HIVE-9041
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Chao
>            Assignee: Chao
>
> This is a follow-up for HIVE-8920. For queries like:
> {code}
> from (select * from table0 union all select * from table1) s
> insert overwrite table table3 select s.x, count(1) group by s.x
> insert overwrite table table4 select s.y, count(1) group by s.y;
> {code}
> Currently we generate the following plan:
> {noformat}
>     M1    M2
>       \  / \
>        U3   R5
>        |
>        R4
> {noformat}
> It's better, however, to have the following plan:
> {noformat}
>    M1  M2
>    |\  /|
>    | \/ |
>    | /\ |
>    R4  R5
> {noformat}
> Also, we can do some research in this JIRA to see if it's possible
> to remove UnionWork once and for all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
