Harish Butani created HIVE-5710:
-----------------------------------

             Summary: Merging of Join Trees assumes all filters are on the merging table
                 Key: HIVE-5710
                 URL: https://issues.apache.org/jira/browse/HIVE-5710
             Project: Hive
          Issue Type: Bug
            Reporter: Harish Butani
            Assignee: Harish Butani


The following query fails with a SemanticException:
{noformat}
select p1.name, p2.name, p3.name
from part p1 join part p2 on p1.name = p2.name
join part p3 on p1.name = p3.name and p2.key > 10
{noformat}

The Merge Join logic associates the p2.key > 10 filter with the merging table,
i.e. 'p1'. When constructing the Join Plan, an attempt is made to resolve this
predicate against p1's RowResolver, which causes the SemanticException.
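
To illustrate the failure mode, here is a toy sketch in Java. ToyRowResolver is
a hypothetical stand-in that only maps an alias to its visible columns; it is
not Hive's RowResolver and does not reproduce Hive's error message. It only
shows why a lookup of p2.key against a resolver that knows p1 alone cannot
succeed.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a row resolver; NOT Hive's actual class. */
final class ToyRowResolver {
    // Table alias -> column names visible through this resolver.
    private final Map<String, Set<String>> columnsByAlias;

    ToyRowResolver(Map<String, Set<String>> columnsByAlias) {
        this.columnsByAlias = columnsByAlias;
    }

    /** Returns "alias.column" if the column is visible, otherwise throws. */
    String resolve(String alias, String column) {
        Set<String> columns = columnsByAlias.get(alias);
        if (columns == null || !columns.contains(column)) {
            throw new IllegalArgumentException(
                "cannot resolve " + alias + "." + column);
        }
        return alias + "." + column;
    }

    public static void main(String[] args) {
        // A resolver that only knows p1, as for the merging table above.
        ToyRowResolver p1Only =
            new ToyRowResolver(Map.of("p1", Set.of("name", "key")));
        System.out.println(p1Only.resolve("p1", "name"));
        try {
            p1Only.resolve("p2", "key"); // the misplaced filter's column
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```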

The underlying issue is that, at runtime, filters are applied to the input
rows of the Join Operator. There is no way to apply a filter to intermediate
data. In the above query, the p2.key > 10 predicate should not be applied
directly to p2, but to the output of p1 join p2.

The following is also a valid query; here the predicate refers to multiple left
tables:
{noformat}
select p1.name, p2.name, p3.name
from part p1 join part p2 on p1.name = p2.name
join part p3 on p1.name = p3.name and p2.key > p1.key
{noformat}

As a start, I propose to prevent merging when there is a Filter that refers to a
non-merging table.
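
A minimal sketch of such a guard, in Java. The class JoinMergeGuard and the
method canMerge are hypothetical names used for illustration only; this is not
the actual patch and not Hive's join tree API. Each filter is reduced to the
set of table aliases its predicate mentions, which is an assumption about how
the check could be fed.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical merge guard; names are illustrative, not Hive's API. */
final class JoinMergeGuard {

    /**
     * Returns true only if every filter refers solely to the merging table.
     *
     * @param mergingAlias  alias of the table the join trees would merge on
     * @param filterAliases for each filter, the table aliases it references
     */
    static boolean canMerge(String mergingAlias, List<Set<String>> filterAliases) {
        for (Set<String> aliases : filterAliases) {
            for (String alias : aliases) {
                if (!alias.equals(mergingAlias)) {
                    return false; // filter touches a non-merging table
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // p2.key > 10 references only p2, a non-merging table.
        System.out.println(canMerge("p1", List.of(Set.of("p2"))));       // false
        // p2.key > p1.key references both p1 and p2.
        System.out.println(canMerge("p1", List.of(Set.of("p1", "p2")))); // false
        // A filter on p1 alone would still allow merging.
        System.out.println(canMerge("p1", List.of(Set.of("p1"))));       // true
    }
}
```

With this check, both queries above would be left unmerged, so the filter is
never resolved against p1's RowResolver.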



--
This message was sent by Atlassian JIRA
(v6.1#6144)
