[ 
https://issues.apache.org/jira/browse/TAJO-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313661#comment-14313661
 ] 

ASF GitHub Bot commented on TAJO-1277:
--------------------------------------

GitHub user sirpkt opened a pull request:

    https://github.com/apache/tajo/pull/379

    TAJO-1277: GreedyHeuristicJoinOrderAlgorithm sometimes wrongly assumes 
associativity of joins

    Basically, it limits the range of join reordering so that it never crosses 
an outer join operation.
    For example, in the case of (((((a inner join b) inner join c) outer join 
d) inner join e) inner join f),
    join ordering is partitioned into three parts:
    1) (a inner join b) inner join c
    2) (result of 1) outer join d
    3) ((result of 2) inner join e) inner join f
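
The partitioning described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in the patch: the names JoinPartitionSketch, JoinStep, partition() and exampleChain() are hypothetical and do not come from the Tajo code base, and the sketch assumes a left-deep chain in which every outer join acts as a barrier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Tajo code base.
class JoinPartitionSketch {
    enum JoinType { INNER, OUTER }

    // One step of a left-deep chain: "<result so far> <type> join <relation>".
    static final class JoinStep {
        final JoinType type;
        final String relation;

        JoinStep(JoinType type, String relation) {
            this.type = type;
            this.relation = relation;
        }

        @Override
        public String toString() {
            return type + " " + relation;
        }
    }

    // Splits the chain so that each outer join forms a group of its own and
    // consecutive inner joins share a group. Only the steps inside one group
    // would be candidates for reordering.
    static List<List<JoinStep>> partition(List<JoinStep> steps) {
        List<List<JoinStep>> groups = new ArrayList<>();
        List<JoinStep> current = new ArrayList<>();
        for (JoinStep step : steps) {
            if (step.type == JoinType.OUTER) {
                if (!current.isEmpty()) {
                    groups.add(current);
                    current = new ArrayList<>();
                }
                List<JoinStep> barrier = new ArrayList<>();
                barrier.add(step);
                groups.add(barrier);
            } else {
                current.add(step);
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    // The chain from the example above, starting from base relation a.
    static List<JoinStep> exampleChain() {
        List<JoinStep> steps = new ArrayList<>();
        steps.add(new JoinStep(JoinType.INNER, "b"));
        steps.add(new JoinStep(JoinType.INNER, "c"));
        steps.add(new JoinStep(JoinType.OUTER, "d"));
        steps.add(new JoinStep(JoinType.INNER, "e"));
        steps.add(new JoinStep(JoinType.INNER, "f"));
        return steps;
    }

    public static void main(String[] args) {
        // Prints one line per group, in execution order.
        for (List<JoinStep> group : partition(exampleChain())) {
            System.out.println(group);
        }
    }
}
```

For the example chain this yields three groups (the inner joins with b and c, the outer join with d, and the inner joins with e and f), which corresponds to parts 1) to 3) above.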
    
    The following modifications are included:
    - findBestOrder() is changed to partition the join ordering
    - getBestPair() and findJoin() are changed to return the JoinEdges of the 
selected join, because those JoinEdges should be removed before the next round 
of join ordering
    
    It passes 'mvn clean install' and several join cases I tested;
    however, I'm not sure this is a good approach.
    Please leave comments on the patch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sirpkt/tajo TAJO-1277

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/379.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #379
    
----
commit 74802e228dd3af8e885cc724c32d5746939c2c23
Author: Keuntae Park <[email protected]>
Date:   2015-02-09T09:21:04Z

    join optimizer is enhanced to distinguish non-associative join cases

----


> GreedyHeuristicJoinOrderAlgorithm sometimes wrongly assumes associativity of 
> joins
> ----------------------------------------------------------------------------------
>
>                 Key: TAJO-1277
>                 URL: https://issues.apache.org/jira/browse/TAJO-1277
>             Project: Tajo
>          Issue Type: Bug
>            Reporter: Keuntae Park
>
> It looks like GreedyHeuristicJoinOrderAlgorithm always assumes that all joins 
> are associative.
> The following query returns an inaccurate result:
> {code}
> select * FROM
> customer c 
> right outer join nation n on c.c_custkey = n.n_nationkey
> join region r on c.c_custkey = r.r_regionkey;
> {code}
> because GreedyHeuristicJoinOrderAlgorithm changes the join order to
> {code}
> select * FROM
> customer c 
> join region r on c.c_custkey = r.r_regionkey
> right outer join nation n on c.c_custkey = n.n_nationkey;
> {code}
> I think getBestPair() should be fixed to avoid wrong join ordering. 
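
To see why the two orders in the issue description are not equivalent, here is a small self-contained sketch that evaluates both on toy in-memory data. Everything in it is an assumption made for illustration: the class JoinOrderDemo, its helper methods, and the key values (customers 1 and 2, nations 1 to 3, region 1) are invented and have nothing to do with Tajo's executor or real TPC-H data.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory joins; table contents are invented for illustration.
class JoinOrderDemo {
    // Column positions in a row: {c_custkey, n_nationkey, r_regionkey}.
    static final int C = 0, N = 1, R = 2;

    // Builds a single-column table; the other columns stay null.
    static List<Integer[]> table(int col, int... keys) {
        List<Integer[]> rows = new ArrayList<>();
        for (int k : keys) {
            Integer[] row = new Integer[3];
            row[col] = k;
            rows.add(row);
        }
        return rows;
    }

    // Combines two partial rows, taking non-null columns from either side.
    static Integer[] merge(Integer[] a, Integer[] b) {
        Integer[] out = a.clone();
        for (int i = 0; i < out.length; i++) {
            if (b[i] != null) out[i] = b[i];
        }
        return out;
    }

    // SQL-style equality: a NULL key never matches.
    static boolean matches(Integer[] l, int lCol, Integer[] r, int rCol) {
        return l[lCol] != null && l[lCol].equals(r[rCol]);
    }

    static List<Integer[]> innerJoin(List<Integer[]> left, int lCol,
                                     List<Integer[]> right, int rCol) {
        List<Integer[]> out = new ArrayList<>();
        for (Integer[] l : left) {
            for (Integer[] r : right) {
                if (matches(l, lCol, r, rCol)) out.add(merge(l, r));
            }
        }
        return out;
    }

    // Keeps every right row; unmatched right rows keep NULL left columns.
    static List<Integer[]> rightOuterJoin(List<Integer[]> left, int lCol,
                                          List<Integer[]> right, int rCol) {
        List<Integer[]> out = new ArrayList<>();
        for (Integer[] r : right) {
            boolean matched = false;
            for (Integer[] l : left) {
                if (matches(l, lCol, r, rCol)) {
                    out.add(merge(l, r));
                    matched = true;
                }
            }
            if (!matched) out.add(r.clone());
        }
        return out;
    }

    // (customer right outer join nation) join region, as written in the query.
    static List<Integer[]> original() {
        List<Integer[]> cn = rightOuterJoin(table(C, 1, 2), C, table(N, 1, 2, 3), N);
        return innerJoin(cn, C, table(R, 1), R);
    }

    // (customer join region) right outer join nation, as reordered.
    static List<Integer[]> reordered() {
        List<Integer[]> cr = innerJoin(table(C, 1, 2), C, table(R, 1), R);
        return rightOuterJoin(cr, C, table(N, 1, 2, 3), N);
    }

    public static void main(String[] args) {
        System.out.println("original rows:  " + original().size());
        System.out.println("reordered rows: " + reordered().size());
    }
}
```

On this toy data the original order produces one row, because the final inner join discards every row whose c_custkey is NULL, while the reordered plan produces three rows, because the right outer join is applied last and keeps every nation.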



