[ https://issues.apache.org/jira/browse/HIVE-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447496#comment-13447496 ]
Namit Jain commented on HIVE-3426:
----------------------------------

Consider a query like:

insert overwrite table tst_output partition (ds='1')
select key, keyVal, agg from (
  select key, '238' as keyVal, count(1) as agg from srcpart
  where ds = '2008-04-08' and hr = '11' and key = 238 group by key
  union all
  select key, '165' as keyVal, count(1) as agg from srcpart
  where ds = '2008-04-08' and hr = '11' and key = 165 group by key
  union all
  select key, '409' as keyVal, count(1) as agg from srcpart
  where ds = '2008-04-08' and hr = '12' and key = 409 group by key
  union all
  select key, '484' as keyVal, count(1) as agg from srcpart
  where ds = '2008-04-08' and hr = '12' and key = 484 group by key
) subq;

It requires a separate map-reduce job for each sub-query. Since the same base table is being queried, ideally it should be a single map-reduce job. At the very least, a single scan should be performed on each of ds=2008-04-08/hr=11 and ds=2008-04-08/hr=12. The query plan should have one table scan that feeds a different branch for each sub-query.

> union with same source should be optimized
> ------------------------------------------
>
>                 Key: HIVE-3426
>                 URL: https://issues.apache.org/jira/browse/HIVE-3426
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>
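
The plan requested in the comment, one table scan feeding a separate branch per sub-query, can be illustrated with a small simulation. This is only a sketch in plain Python, not Hive code: the sample rows, the `BRANCHES` table, and the functions `single_scan` and `per_branch_scans` are all hypothetical names made up for illustration, and the real optimizer would work on operator trees rather than lists of tuples.

```python
from collections import Counter

# Hypothetical stand-in rows for srcpart: (key, ds, hr).
ROWS = [
    ("238", "2008-04-08", "11"),
    ("238", "2008-04-08", "11"),
    ("165", "2008-04-08", "11"),
    ("409", "2008-04-08", "12"),
    ("484", "2008-04-08", "12"),
    ("484", "2008-04-08", "12"),
    ("238", "2008-04-08", "12"),  # wrong hr for key 238, matches no branch
    ("100", "2008-04-08", "11"),  # key not requested by any branch
]

# One entry per union branch: (keyVal literal, hr, key), all for ds='2008-04-08'.
BRANCHES = [
    ("238", "11", "238"),
    ("165", "11", "165"),
    ("409", "12", "409"),
    ("484", "12", "484"),
]


def single_scan(rows, branches, ds="2008-04-08"):
    """Read the rows once, feeding each row to every branch whose filter it passes."""
    counts = Counter()
    for key, row_ds, hr in rows:  # the only pass over the data
        for idx, (_, b_hr, b_key) in enumerate(branches):
            if row_ds == ds and hr == b_hr and key == b_key:
                counts[idx] += 1
    # Emit (key, keyVal, agg) per branch; a group by over no rows emits nothing.
    return [
        (branches[i][2], branches[i][0], counts[i])
        for i in range(len(branches))
        if counts[i] > 0
    ]


def per_branch_scans(rows, branches, ds="2008-04-08"):
    """Baseline: one full pass over the rows for every branch."""
    out = []
    for key_val, b_hr, b_key in branches:
        n = sum(
            1
            for key, row_ds, hr in rows
            if row_ds == ds and hr == b_hr and key == b_key
        )
        if n > 0:
            out.append((b_key, key_val, n))
    return out
```

Both functions return the same (key, keyVal, agg) rows; the difference is that `single_scan` iterates over the input once, while `per_branch_scans` iterates once per branch, which mirrors the separate jobs the issue wants to avoid.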