[ 
https://issues.apache.org/jira/browse/TAJO-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692846#comment-14692846
 ] 

ASF GitHub Bot commented on TAJO-1561:
--------------------------------------

GitHub user jihoonson opened a pull request:

    https://github.com/apache/tajo/pull/685

    TAJO-1561: Query which contains join condition in "OR" clause does not 
finish.

    I added a new query rewriting rule. I've also tested the TPC-DS query 
reported at Jira.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jihoonson/tajo-2 TAJO-1561

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/685.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #685
    
----
commit 64814c0cfd5f13a3ba6802ff6a715fb1e4ee8781
Author: Jihoon Son <[email protected]>
Date:   2015-08-11T09:21:49Z

    TAJO-1561

commit 3828616dcad249fee5c03badaffdbe1522452c34
Author: Jihoon Son <[email protected]>
Date:   2015-08-12T03:23:05Z

    TAJO-1561

----


> Query which contains join condition in "OR" clause does not finish.
> -------------------------------------------------------------------
>
>                 Key: TAJO-1561
>                 URL: https://issues.apache.org/jira/browse/TAJO-1561
>             Project: Tajo
>          Issue Type: Bug
>            Reporter: Hyoungjun Kim
>            Assignee: Jihoon Son
>
> {code:sql}
> select sum (ss_quantity)
>  from store_sales, store, customer_demographics, customer_address, date_dim
>  where s_store_sk = ss_store_sk
>  and  ss_sold_date_sk = d_date_sk and d_year = 1998
>  and  
>  (
>   (
>    cd_demo_sk = ss_cdemo_sk
>    and 
>    cd_marital_status = 'M'
>    and 
>    cd_education_status = '4 yr Degree'
>    and 
>    ss_sales_price between 100.00 and 150.00  
>    )
>  or
>   (
>   cd_demo_sk = ss_cdemo_sk
>    and 
>    cd_marital_status = 'M'
>    and 
>    cd_education_status = '4 yr Degree'
>    and 
>    ss_sales_price between 50.00 and 100.00   
>   )
>  or 
>  (
>   cd_demo_sk = ss_cdemo_sk
>   and 
>    cd_marital_status = 'M'
>    and 
>    cd_education_status = '4 yr Degree'
>    and 
>    ss_sales_price between 150.00 and 200.00  
>  )
>  )
>  and
>  (
>   (
>   ss_addr_sk = ca_address_sk
>   and
>   ca_country = 'United States'
>   and
>   ca_state in ('KY', 'GA', 'NM')
>   and ss_net_profit between 0 and 2000  
>   )
>  or
>   (ss_addr_sk = ca_address_sk
>   and
>   ca_country = 'United States'
>   and
>   ca_state in ('MT', 'OR', 'IN')
>   and ss_net_profit between 150 and 3000 
>   )
>  or
>   (ss_addr_sk = ca_address_sk
>   and
>   ca_country = 'United States'
>   and
>   ca_state in ('WI', 'MO', 'WV')
>   and ss_net_profit between 50 and 25000 
>   )
>  )
> {code}
> See the following query(TPC-DS Query48). The join condition of this query is 
> in the repeated OR clause as following:
> {noformat}
>    cd_demo_sk = ss_cdemo_sk
>    and 
>    cd_marital_status = 'M'
>    and 
>    cd_education_status = '4 yr Degree'
> {noformat}
> Tajo planner makes the logical for this query with CROSS JOIN because the 
> planner can't find JOIN condition. So this query does not finish. This query 
> can be changed to the following.
> {code:sql}
> select sum (ss_quantity)
>  from store_sales, store, customer_demographics, customer_address, date_dim
>  where s_store_sk = ss_store_sk
>  and  ss_sold_date_sk = d_date_sk and d_year = 1998
>  and
> (cd_demo_sk = ss_cdemo_sk
> and
> cd_marital_status = 'M'
> and
> cd_education_status = '4 yr Degree'
> and (
>   (ss_sales_price between 50.00 and 100.00) or
>   (ss_sales_price between 100.00 and 150.00) or
>   (ss_sales_price between 150.00 and 200.00)
> ))
> and
> (
> ss_addr_sk = ca_address_sk
>   and
>   ca_country = 'United States'
>   and (
>    (ca_state in ('KY', 'GA', 'NM') and ss_net_profit between 0 and 2000)
>    or
>    (ca_state in ('MT', 'OR', 'IN') and ss_net_profit between 150 and 3000)
>    or
>    (ca_state in ('WI', 'MO', 'WV') and ss_net_profit between 50 and 25000)
>   )
> )
> {code}
> Other solution also have same problem. See the following issues.
> - https://issues.cloudera.org/browse/IMPALA-1707 
> - https://issues.apache.org/jira/browse/HIVE-7914
> This issue is related with TPC-DS query 13, 48, 85.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to