Steve Carlin created IMPALA-14775:
-------------------------------------
Summary: Calcite Planner: Wrong results with decorrelation query
Key: IMPALA-14775
URL: https://issues.apache.org/jira/browse/IMPALA-14775
Project: IMPALA
Issue Type: Bug
Reporter: Steve Carlin
There seems to be an issue in Calcite 1.42 that causes the Calcite planner to
return different results from the original planner.
The following query found in
[query_test/test_aggregation.py::TestAggregationQueries::test_grouping_sets|https://jenkins.impala.io/view/all/job/calcite-report-prototype/420/artifact/Impala/calcite_report/html/query_test_test_aggregation.py/TestAggregationQueries_test_grouping_sets/index.html]
works fine in 1.41:
{code:sql}
select id
from functional.alltypesagg a
where exists
(select id
from functional.alltypestiny b
where a.tinyint_col = b.tinyint_col and a.string_col = b.string_col
group by rollup(id, int_col, bool_col))
and tinyint_col < 10;
{code}
It returns 10 rows in the original planner and with Calcite 1.41, but with
Calcite 1.42, it returns 9000 rows.
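For anyone trying to reduce the repro, here is a minimal sketch of the equivalence a decorrelation rewrite is expected to preserve: a correlated EXISTS with equality predicates should return the same rows as a semi-join on the correlation keys. This uses SQLite through Python rather than Impala, the tables {{a}} and {{b}} and their rows are made-up toy data, and the {{rollup}} is replaced by a plain {{group by id}} because SQLite has no ROLLUP. It only illustrates the expected semantics; it does not reproduce the bug.

```python
import sqlite3

# Toy stand-ins for alltypesagg (a) and alltypestiny (b); the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, tinyint_col INTEGER, string_col TEXT);
    CREATE TABLE b (id INTEGER, tinyint_col INTEGER, string_col TEXT);
    INSERT INTO a VALUES (1, 1, 'x'), (2, 2, 'y'), (3, 3, 'z'), (4, NULL, 'x');
    INSERT INTO b VALUES (10, 1, 'x'), (11, 1, 'x'), (12, 3, 'q');
""")

# Correlated form, same shape as the failing query minus the rollup.
correlated = conn.execute("""
    SELECT id FROM a
    WHERE EXISTS (SELECT id FROM b
                  WHERE a.tinyint_col = b.tinyint_col
                    AND a.string_col = b.string_col
                  GROUP BY id)
      AND tinyint_col < 10
    ORDER BY id
""").fetchall()

# Hand-decorrelated form: join against the distinct correlation keys.
semi_join = conn.execute("""
    SELECT a.id FROM a
    JOIN (SELECT DISTINCT tinyint_col, string_col FROM b) k
      ON a.tinyint_col = k.tinyint_col AND a.string_col = k.string_col
    WHERE a.tinyint_col < 10
    ORDER BY a.id
""").fetchall()

print(correlated, semi_join)
```

Only the row of {{a}} whose key pair also appears in {{b}} should survive in both forms. A planner that returns every row passing {{tinyint_col < 10}} is behaving as if the EXISTS were always true.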
[~zwhtx], I see you've been working on Impala and Calcite, and I think you
made a number of changes to RelDecorrelator.java for 1.42? I'm going to start
looking into this, but if you could help out, that would be fantastic.
Thanks!
Another problematic query, from subquery.test, is:
{code:sql}
select count(*)
FROM alltypesagg t1
WHERE day IS NOT NULL
  AND t1.int_col NOT IN
    (SELECT tt1.month AS tinyint_col_1
     FROM alltypesagg tt1
     LEFT JOIN alltypestiny tt2
       ON tt2.year = tt1.id AND t1.bigint_col = tt2.smallint_col)
{code}
This query doesn't work at all with 1.41. It compiles with 1.42 but returns a
count of 9980, which does not match the original planner's result of 10000.
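When comparing counts for NOT IN queries, it may help to keep SQL's three-valued logic in mind, since a rewrite that handles NULLs differently changes how many rows pass the filter. The sketch below is not a diagnosis of the 9980 vs. 10000 gap. It uses SQLite through Python with invented toy tables that reuse the names {{t1}} and {{tt1}}, and it only shows how a NULL on either side of NOT IN affects the count.

```python
import sqlite3

# Toy tables; the names mirror the query above but the data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (int_col INTEGER);
    CREATE TABLE tt1 (month INTEGER);
    INSERT INTO t1 VALUES (1), (2), (NULL);
    INSERT INTO tt1 VALUES (1);
""")

def count_not_in():
    """Count t1 rows whose int_col is NOT IN the set of tt1.month values."""
    return conn.execute(
        "SELECT count(*) FROM t1 WHERE int_col NOT IN (SELECT month FROM tt1)"
    ).fetchone()[0]

# Subquery yields {1}: int_col = 2 passes, int_col = 1 fails,
# and NULL NOT IN (1) evaluates to unknown, so that row is filtered out.
without_null = count_not_in()

# Subquery yields {1, NULL}: 2 NOT IN (1, NULL) is now unknown as well,
# so no row passes.
conn.execute("INSERT INTO tt1 VALUES (NULL)")
with_null = count_not_in()

print(without_null, with_null)
```

A plain anti-join on equality would treat these NULL cases differently, which is why NOT IN rewrites usually need null-aware handling.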