[ 
https://issues.apache.org/jira/browse/TAJO-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394177#comment-14394177
 ] 

Keuntae Park commented on TAJO-1517:
------------------------------------

For the INTERSECT case, I think it can be implemented as a semi join followed
by DISTINCT.
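
As a rough illustration of the semi join + distinct idea, here is a minimal
sketch over in-memory lists. The class and method names are hypothetical and
are not part of Tajo; rows are modeled as plain Integers for brevity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not a Tajo API: INTERSECT as semi join + distinct.
final class IntersectDistinctSketch {
    static List<Integer> intersect(List<Integer> left, List<Integer> right) {
        // Build side of the semi join: the set of values present in query2.
        Set<Integer> rightKeys = new HashSet<>(right);
        // LinkedHashSet removes duplicates while keeping first-seen order.
        Set<Integer> result = new LinkedHashSet<>();
        for (Integer row : left) {
            // Semi join: keep a left row only if a matching right row exists.
            if (rightKeys.contains(row)) {
                result.add(row);
            }
        }
        return new ArrayList<>(result);
    }
}
```

Calling intersect on {2, 2, 2} and {2, 2} yields a single 2, because the
distinct step collapses the duplicate matches.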

However, INTERSECT ALL is more difficult, because it has to count how many
times each row value occurs on both sides: a value that appears m times in
query1 and n times in query2 appears min(m, n) times in the result.

For example, when query1 returns {2, 2, 2} and query2 returns {2, 2},
query1 INTERSECT query2 is {2}, while query1 INTERSECT ALL query2 is {2, 2}.

So, I'm wondering about implementing a new INTERSECT ALL Exec,
which is very similar to semi join,
except that it also keeps track of the number of intersecting rows for each
row value.
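
To make the counting idea concrete, here is a minimal sketch of what such an
operator might do, again over in-memory lists. The names are hypothetical and
this is not Tajo's executor interface; it only shows the per-value counting
that a plain semi join lacks.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not a Tajo API: INTERSECT ALL via per-value counters.
final class IntersectAllSketch {
    static List<Integer> intersectAll(List<Integer> left, List<Integer> right) {
        // Build phase: count how many times each value occurs in query2.
        Map<Integer, Integer> remaining = new HashMap<>();
        for (Integer row : right) {
            remaining.merge(row, 1, Integer::sum);
        }
        // Probe phase: emit a left row only while unmatched right rows remain,
        // and consume one right occurrence per emitted row.
        List<Integer> result = new ArrayList<>();
        for (Integer row : left) {
            Integer count = remaining.get(row);
            if (count != null && count > 0) {
                result.add(row);
                remaining.put(row, count - 1);
            }
        }
        return result;
    }
}
```

With {2, 2, 2} and {2, 2} as inputs this emits 2 twice: the third 2 from
query1 finds its counter already at zero and is dropped. Without the counter
decrement, the loop is just a semi join.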

> Implement INTERSECT
> -------------------
>
>                 Key: TAJO-1517
>                 URL: https://issues.apache.org/jira/browse/TAJO-1517
>             Project: Tajo
>          Issue Type: Sub-task
>            Reporter: Keuntae Park
>
> {code}
> query1 INTERSECT [ALL] query2
> {code}
> INTERSECT returns all rows that are both in the result of query1 and in the 
> result of query2. 
> Duplicate rows are eliminated unless INTERSECT ALL is used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
