Kevin Wilfong created HIVE-3496:
-----------------------------------

             Summary: Query plan for multi-join where the third table joined is 
a subquery containing a map-only union with hive.auto.convert.join=true is wrong
                 Key: HIVE-3496
                 URL: https://issues.apache.org/jira/browse/HIVE-3496
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.10.0
            Reporter: Kevin Wilfong
            Assignee: Kevin Wilfong


Take the following query as an example:

EXPLAIN SELECT * FROM 
src11 a JOIN
src12 b ON (a.key = b.key) JOIN
(SELECT * FROM (SELECT * FROM src13 UNION ALL SELECT * FROM src14)a )c ON 
c.value = b.value;

When hive.auto.convert.join=true, the two joins are implemented separately as 
conditional tasks with two mapjoins and a backup common join.  For the second 
join, the backup common join task ends up both as a backup task contained in 
the ConditionalTask and as a root task.  This is clearly wrong, and leads to 
query failures.

I've traced this to the joinUnionPlan method of GenMapRedUtils.  If the union 
operator was processed in its own map reduce task, and that task could be a 
root task, then when the union is added to the mapper of the existing task 
(the one that performs the join in its reducer), the existing task is made a 
root task without first checking whether it has any dependencies.
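
To illustrate the missing check, here is a minimal, self-contained sketch.  It 
is a toy model only: RootTaskSketch, Task, mergeUnchecked and mergeChecked are 
hypothetical names made up for this example and are not Hive's classes or the 
actual joinUnionPlan code.  It assumes a task may be a root task only if it 
has no parent tasks, and compares promoting the existing task unconditionally 
with promoting it only when it has no dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Hive's Task or GenMapRedUtils API.
final class RootTaskSketch {

    /** Minimal stand-in for a plan task: a name plus the tasks it depends on. */
    static final class Task {
        final String name;
        final List<Task> parents = new ArrayList<>();

        Task(String name) {
            this.name = name;
        }

        void addParent(Task parent) {
            parents.add(parent);
        }
    }

    /** Behaviour described in the report: promote whenever the union task could be a root. */
    static void mergeUnchecked(boolean unionCouldBeRoot, Task existing, List<Task> rootTasks) {
        if (unionCouldBeRoot && !rootTasks.contains(existing)) {
            rootTasks.add(existing);
        }
    }

    /** Guarded variant: additionally require that the existing task has no dependencies. */
    static void mergeChecked(boolean unionCouldBeRoot, Task existing, List<Task> rootTasks) {
        if (unionCouldBeRoot && existing.parents.isEmpty() && !rootTasks.contains(existing)) {
            rootTasks.add(existing);
        }
    }

    public static void main(String[] args) {
        Task firstJoin = new Task("first join");
        Task secondJoin = new Task("second join (backup common join)");
        secondJoin.addParent(firstJoin); // the second join depends on the first

        List<Task> uncheckedRoots = new ArrayList<>();
        uncheckedRoots.add(firstJoin);
        mergeUnchecked(true, secondJoin, uncheckedRoots);
        // The dependent task is wrongly listed as a root here.
        System.out.println("unchecked root count: " + uncheckedRoots.size());

        List<Task> checkedRoots = new ArrayList<>();
        checkedRoots.add(firstJoin);
        mergeChecked(true, secondJoin, checkedRoots);
        // The dependent task is left out of the roots here.
        System.out.println("checked root count: " + checkedRoots.size());
    }
}
```

In this toy model the unchecked merge leaves the dependent task in the root 
list alongside the task it depends on, which mirrors the symptom above.  A 
real fix in GenMapRedUtils may look different.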

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
