Steve Carlin created IMPALA-14775:
-------------------------------------

             Summary: Calcite Planner: Wrong results with decorrelation query
                 Key: IMPALA-14775
                 URL: https://issues.apache.org/jira/browse/IMPALA-14775
             Project: IMPALA
          Issue Type: Bug
            Reporter: Steve Carlin


Something in Calcite 1.42 seems to make the Calcite planner return different 
results from the original planner.

The following query, found in 
[query_test/test_aggregation.py::TestAggregationQueries::test_grouping_sets|https://jenkins.impala.io/view/all/job/calcite-report-prototype/420/artifact/Impala/calcite_report/html/query_test_test_aggregation.py/TestAggregationQueries_test_grouping_sets/index.html],
works fine with Calcite 1.41:

{code:java}
select id
from functional.alltypesagg a
where exists
  (select id
   from functional.alltypestiny b
   where a.tinyint_col = b.tinyint_col and a.string_col = b.string_col
   group by rollup(id, int_col, bool_col))
  and tinyint_col < 10;
{code}
It returns 10 rows in the original planner and with Calcite 1.41, but with 
Calcite 1.42, it returns 9000 rows.
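
For reference, here is a rough, planner-independent sketch of the semantics the decorrelated plan has to preserve: a row of {{a}} should qualify only when at least one row of {{b}} matches both correlation predicates. The rows below are made up for illustration (they are not the contents of the functional tables), the helper names are hypothetical, and the sketch assumes the rollup emits no rows when no inner row matches. It is not a diagnosis of the 1.42 behavior.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    id: int
    tinyint_col: Optional[int]
    string_col: Optional[str]


def exists_match(a: Row, inner: List[Row]) -> bool:
    # SQL equality is never true when either side is NULL (None here).
    # Assumption: the rollup yields no rows if nothing matches, so EXISTS
    # reduces to "is there at least one matching inner row".
    return any(
        a.tinyint_col is not None
        and a.tinyint_col == b.tinyint_col
        and a.string_col is not None
        and a.string_col == b.string_col
        for b in inner
    )


def reference_ids(outer: List[Row], inner: List[Row]) -> List[int]:
    # Outer filter (tinyint_col < 10) combined with the correlated EXISTS.
    return [
        a.id
        for a in outer
        if a.tinyint_col is not None
        and a.tinyint_col < 10
        and exists_match(a, inner)
    ]


# Illustrative rows only.
outer = [Row(1, 1, "1"), Row(2, 2, "2"), Row(3, 1, "3"), Row(4, None, "1")]
inner = [Row(10, 1, "1"), Row(11, 0, "0")]
result = reference_ids(outer, inner)
print(result)
```

A plan that lets an outer row through without a matching inner row would return more ids than this model does for the same input.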

[~zwhtx], I see you've been working on Impala and Calcite, and I think you 
made a bunch of changes in the RelDecorrelator.java file for 1.42?  I'm going 
to start looking at this, but if you could help out, that would be fantastic, 
thanks!

Another query with problems in subquery.test is:
{code:java}
select count(*)
FROM alltypesagg t1
WHERE day IS NOT NULL
  AND t1.int_col NOT IN
    (SELECT tt1.month AS tinyint_col_1
     FROM alltypesagg tt1
     LEFT JOIN alltypestiny tt2
       ON tt2.year = tt1.id AND t1.bigint_col = tt2.smallint_col)
{code}
This query doesn't work at all with 1.41.  With 1.42 it compiles but returns 
9980, whereas the original planner returns 10000.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
