[ 
https://issues.apache.org/jira/browse/TAJO-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395449#comment-14395449
 ] 

Hyunsik Choi edited comment on TAJO-1517 at 4/4/15 1:27 AM:
------------------------------------------------------------

I think that INTERSECT can be easily implemented by query rewriting using 
SEMI JOIN + SELECT DISTINCT. Of course, we can design a new physical operator for 
more efficient processing later.
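
To make the rewriting idea concrete, here is a minimal sketch of the semantics in plain Python rather than Tajo code. The helper names ({{semi_join}}, {{distinct}}, {{intersect}}) are hypothetical and only illustrate the idea: rows are modeled as tuples, the semi join keeps every left row that has an equal row on the right, and the distinct step removes the duplicates that survive. It assumes rows are compared as whole tuples; how NULLs should compare in the join condition is not addressed here.

```python
def semi_join(left, right):
    """Keep each left row that has at least one equal row in right.

    Duplicates on the left are preserved, as in a left semi join.
    """
    right_keys = set(right)
    return [row for row in left if row in right_keys]


def distinct(rows):
    """Drop duplicate rows, keeping the first occurrence of each."""
    seen = set()
    out = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


def intersect(query1_rows, query2_rows):
    """INTERSECT modeled as SELECT DISTINCT over a semi join (hypothetical helper)."""
    return distinct(semi_join(query1_rows, query2_rows))


# Illustrative data only.
q1 = [(1, "a"), (1, "a"), (2, "b"), (3, "c")]
q2 = [(1, "a"), (3, "c"), (3, "c"), (4, "d")]
result = intersect(q1, q2)
```

With this sample data the semi join alone still returns {{(1, "a")}} twice, which is why the DISTINCT step is needed on top of it.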

BTW, {{INTERSECT ALL}} may be hard to implement with a semi join for the reason you 
mentioned. So, I recommend implementing a new physical operator for it. Probably, 
a sort-based approach that compares the sorted rows of both tables would 
work well.
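
A rough sketch of what such a sort-based operator could do, again in plain Python and not Tajo code. The function name {{intersect_all_sorted}} is hypothetical. It assumes rows are tuples whose values are mutually comparable (no NULLs), sorts both inputs, and walks them with two cursors. Each match consumes one row from each side, so a row that appears m times on one side and n times on the other is emitted min(m, n) times. A semi join cannot express that on its own, because it keeps every matching left row no matter how many copies the right side has.

```python
def intersect_all_sorted(left, right):
    """INTERSECT ALL via a merge over two sorted inputs (hypothetical sketch).

    Assumes rows are tuples of mutually comparable, non-NULL values.
    """
    left, right = sorted(left), sorted(right)
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1          # left row has no remaining partner on the right
        elif left[i] > right[j]:
            j += 1          # right row has no remaining partner on the left
        else:
            out.append(left[i])
            i += 1          # consume one copy from each side so that
            j += 1          # duplicates are paired off one to one
    return out


# Illustrative data only.
q1 = [(1, "a"), (1, "a"), (1, "a"), (2, "b"), (3, "c")]
q2 = [(1, "a"), (1, "a"), (3, "c"), (3, "c"), (4, "d")]
result_all = intersect_all_sorted(q1, q2)
```

Here {{(1, "a")}} appears three times in the first input and twice in the second, so it is emitted twice.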


was (Author: hyunsik):
I think that INTERSECT can be easily implemented with SEMI JOIN + SELECT 
DISTINCT. Of course, we can design a new physical operator for more efficient 
processing later.

BTW, {{INTERSECT ALL}} may be hard to implement with a semi join for the reason you 
mentioned. So, I recommend implementing a new physical operator for it. Probably, 
a sort-based approach that compares the sorted rows of both tables would 
work well.

> Implement INTERSECT
> -------------------
>
>                 Key: TAJO-1517
>                 URL: https://issues.apache.org/jira/browse/TAJO-1517
>             Project: Tajo
>          Issue Type: Sub-task
>            Reporter: Keuntae Park
>
> {code}
> query1 INTERSECT [ALL] query2
> {code}
> INTERSECT returns all rows that are both in the result of query1 and in the 
> result of query2. 
> Duplicate rows are eliminated unless INTERSECT ALL is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
