Jaehwa Jung created TAJO-750:
--------------------------------

             Summary: Join orders affects abnormal to the result data.
                 Key: TAJO-750
                 URL: https://issues.apache.org/jira/browse/TAJO-750
             Project: Tajo
          Issue Type: Sub-task
            Reporter: Jaehwa Jung


I found that join orders  affects abnormal to the result data  as follows:

*Environment*
* DataSet: TPC-DS

*Case: 1*
{code:xml}
SELECT COUNT(*)
FROM (SELECT cs.cs_item_sk as cs_item_sk,
                             cs.cs_ext_discount_amt as cs_ext_discount_amt
             FROM catalog_sales cs
             JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
             WHERE d.d_date between '2000-01-27' and '2000-04-27') cs1
JOIN item i ON (i.i_item_sk = cs1.cs_item_sk)
JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
                          1.3 * avg(cs_ext_discount_amt) as 
avg_cs_ext_discount_amt
           FROM (SELECT cs.cs_item_sk as cs_item_sk,
                                        cs.cs_ext_discount_amt as 
cs_ext_discount_amt
                        FROM catalog_sales cs
                        JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
                        WHERE d.d_date between '2000-01-27' and '2000-04-27') 
cs2
                        GROUP BY cs2.cs_item_sk) tmp1
ON (i.i_item_sk = tmp1.cs_item_sk)
{code}

- expected result: 71147
- activated result: 4163848

* Case 2
{code:xml}

SELECT COUNT(*)
FROM item i
JOIN (SELECT cs.cs_item_sk as cs_item_sk,
                             cs.cs_ext_discount_amt as cs_ext_discount_amt
             FROM catalog_sales cs
             JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
             WHERE d.d_date between '2000-01-27' and '2000-04-27') cs1 ON 
(i.i_item_sk = cs1.cs_item_sk)
JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
                          1.3 * avg(cs_ext_discount_amt) as 
avg_cs_ext_discount_amt
           FROM (SELECT cs.cs_item_sk as cs_item_sk,
                                        cs.cs_ext_discount_amt as 
cs_ext_discount_amt
                        FROM catalog_sales cs
                        JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
                        WHERE d.d_date between '2000-01-27' and '2000-04-27') 
cs2
                        GROUP BY cs2.cs_item_sk) tmp1
ON (i.i_item_sk = tmp1.cs_item_sk);
{code}

- expected result: 23890
- activated result: 4163848

As you knew, two queries doesn't show the expected result. Furthermore, each 
result have differences. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to