[
https://issues.apache.org/jira/browse/TAJO-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070334#comment-14070334
]
ASF GitHub Bot commented on TAJO-972:
-------------------------------------
GitHub user babokim opened a pull request:
https://github.com/apache/tajo/pull/89
TAJO-972: Broadcast join with left outer join returns duplicated rows.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/babokim/tajo TAJO-972
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tajo/pull/89.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #89
----
commit e391b87215d139be0309e1b120dda412b70d9e9c
Author: 김형준 <[email protected]>
Date: 2014-07-22T11:23:55Z
TAJO-972: Broadcast join with left outer join returns duplicated rows.
----
> Broadcast join with left outer join returns duplicated rows.
> ------------------------------------------------------------
>
> Key: TAJO-972
> URL: https://issues.apache.org/jira/browse/TAJO-972
> Project: Tajo
> Issue Type: Bug
> Reporter: Hyoungjun Kim
> Assignee: Hyoungjun Kim
> Priority: Minor
>
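> If a LEFT OUTER JOIN has a broadcast table and the broadcast table is on the
> left side, every task runs the join operation with all rows of the broadcast
> table. So a given row matches in some tasks and does not match in others, and
> each task where it does not match emits it again padded with nulls.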
> For example:
> {noformat}
> default>select * from small
> id
> -----------------
> 1
> 2
> 3
> default>select * from large
> 1
> 4 <-- Block1 in HDFS
> 5
> ...
> 2 <-- Block2 in HDFS
> 6
> default> select a.id, b.id from small a left outer join large b on a.id = b.id
> a.id b.id
> ---------------------------
> 1 1
> 2 null
> 3 null
> 1 null
> 2 2
> 3 null
> {noformat}
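
The duplication in the example above can be reproduced outside Tajo with a minimal Python sketch. This is only an illustration of the reported behaviour, not Tajo's code: the function `left_outer_join`, the variable names, and the split of `large` into exactly two blocks (ignoring the rows elided by "...") are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, Optional[int]]


def left_outer_join(left: List[int], right: List[int]) -> List[Row]:
    """Plain in-memory LEFT OUTER JOIN on equality of a single id column."""
    out: List[Row] = []
    for l in left:
        matches = [r for r in right if r == l]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            # No match in this input: keep the left row, pad the right side with None.
            out.append((l, None))
    return out


small = [1, 2, 3]
# Assumed split of `large` into two HDFS blocks, one task per block.
large_blocks = [[1, 4, 5], [2, 6]]

# Reported behaviour: `small` (the preserved left side) is broadcast to every
# task, and each task runs the outer join against its own block only.
buggy = [row for block in large_blocks for row in left_outer_join(small, block)]

# Reference result: the same join evaluated against all of `large` at once.
expected = left_outer_join(small, [r for block in large_blocks for r in block])

print("per-task join:", buggy)
print("single join:  ", expected)
```

Under these assumptions the per-task result contains six rows, in the same order as the output in the issue, while the single join contains one row per id in `small`. Each task pads unmatched left rows on its own, without knowing whether another task matched them.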