[ 
https://issues.apache.org/jira/browse/CALCITE-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated CALCITE-5953:
--------------------------------------
    Description: 
consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
notice that this expression may only be null if there are no rows in the table

{{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(2) filter 
(where x=1)}} which broadens when it could be `null` to when there are no 
matches to the filter

* *A* is *0* correctly in this case; but I think it will be still *0* in case 
there are 0 input rows
* The result for *B* supposed to be *0* but since there are no matches by the 
filter it becomes *null*
* *C* is not touched

{code}
# Convert CASE to FILTER without matches
select  sum(case when x = 1 then 1 else 0 end) as a,
        sum(case when x = 1 then 2 else 0 end) as b,
        sum(case when x = 1 then 3 else -1 end) as c
from (values 0, null, 0, 2) as t(x);
+---+---+----+
| A | B | C  |
+---+---+----+
| 0 |   | -4 | # incorrect result for B
| 0 | 0 | -4 | # correct
+---+---+----+
(1 row)

!ok
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
B=[$t1], C=[$t2])
  EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER $3], 
C=[SUM($0)])
    EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
      EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
!plan

# Convert CASE to FILTER with no input rows
select  sum(case when x = 1 then 1 else 0 end) as a,
        sum(case when x = 1 then 2 else 0 end) as b,
        sum(case when x = 1 then 3 else -1 end) as c
from (values 0, null, 0, 2) as t(x) where x*x=1;
+---+---+---+
| A | B | C |
+---+---+---+
| 0 |   |   | # incorrect results - for 0 input rows sum supposed to be null
|   |   |   | # correct result
+---+---+---+
(1 row)

!ok

{code}

  was:
consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
notice that this expression may only be null if there are no rows in the table

{{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(2) filter 
(where x=1)}} which broadens when it could be `null` to when there are no 
matches to the filter

* *A* is *0* correctly in this case; but I think it will be still *0* in case 
there are 0 input rows
* The result for *B* supposed to be *0* but since there are no matches by the 
filter it becomes *null*
* *C* is not touched

{code}
# Convert CASE to FILTER without matches
select  sum(case when x = 1 then 1 else 0 end) as a,
        sum(case when x = 1 then 2 else 0 end) as b,
        sum(case when x = 1 then 3 else -1 end) as c
from (values 0, null, 0, 2) as t(x);
+---+---+----+
| A | B | C  |
+---+---+----+
| 0 | 0 | -4 |
+---+---+----+
(1 row)

!ok
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
B=[$t1], C=[$t2])
  EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER $3], 
C=[SUM($0)])
    EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
      EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
!plan
{code}


> AggregateCaseToFilterRule may make inaccurate SUM transformation
> ----------------------------------------------------------------
>
>                 Key: CALCITE-5953
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5953
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.36.0
>
>
> consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
> notice that this expression may only be null if there are no rows in the table
> {{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(2) 
> filter (where x=1)}} which broadens when it could be `null` to when there are 
> no matches to the filter
> * *A* is *0* correctly in this case; but I think it will be still *0* in case 
> there are 0 input rows
> * The result for *B* supposed to be *0* but since there are no matches by the 
> filter it becomes *null*
> * *C* is not touched
> {code}
> # Convert CASE to FILTER without matches
> select  sum(case when x = 1 then 1 else 0 end) as a,
>         sum(case when x = 1 then 2 else 0 end) as b,
>         sum(case when x = 1 then 3 else -1 end) as c
> from (values 0, null, 0, 2) as t(x);
> +---+---+----+
> | A | B | C  |
> +---+---+----+
> | 0 |   | -4 | # incorrect result for B
> | 0 | 0 | -4 | # correct
> +---+---+----+
> (1 row)
> !ok
> EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
> B=[$t1], C=[$t2])
>   EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER 
> $3], C=[SUM($0)])
>     EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
> expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
> expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
>       EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
> !plan
> # Convert CASE to FILTER with no input rows
> select  sum(case when x = 1 then 1 else 0 end) as a,
>         sum(case when x = 1 then 2 else 0 end) as b,
>         sum(case when x = 1 then 3 else -1 end) as c
> from (values 0, null, 0, 2) as t(x) where x*x=1;
> +---+---+---+
> | A | B | C |
> +---+---+---+
> | 0 |   |   | # incorrect results - for 0 input rows sum supposed to be null
> |   |   |   | # correct result
> +---+---+---+
> (1 row)
> !ok
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to