Xia Zhao created HIVE-21545:
-------------------------------

             Summary: Left Semi Join: right table can be referenced in select?
                 Key: HIVE-21545
                 URL: https://issues.apache.org/jira/browse/HIVE-21545
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.1.0
            Reporter: Xia Zhao


In the below left semi join queries:

create table movies (id int, title string);
insert into movies values (1, "Toy Story"), (2, "Star Wars"), (3, "Harry 
Potter");

create table ratings (user_id int, movie_id int, rating decimal(2,1));
insert into ratings values (1, 1, 5.0), (2, 1, 4.5), (1, 2, 5.0), (3, 3, 5.0);

Q1: without reference to table alias, the right hand table column user_id is 
referenced in SELECT without prompting errors and results available. This is 
NOT expected left semi join behaviour. "The restrictions of using LEFT SEMI 
JOIN are that the right-hand-side table should only be referenced in the join 
condition (ON-clause), but not in WHERE- or SELECT-clauses etc."

SELECT user_id, count(id) from movies
left SEMI JOIN ratings
on id=movie_id and rating = 5 
group by user_id;

 

Q2: with reference to table alias, errors prompts when referencing right hand 
table column user_id in SELECT. Expected left semi join behaviour. 
SELECT r.user_id, count(m.id) from movies m
left SEMI JOIN ratings r
on m.id=r.movie_id and r.rating = 5 
group by r.user_id;

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to