Hello, I’m seeing some curious behavior of decorrelation which I don’t think is 
necessarily a bug, but seems a little over-conservative, and was wondering if 
maybe I was doing something wrong.

I have this query:
SELECT
    (SELECT count(*) AS P0
        FROM T1704 AS A7
        WHERE A7.C5633_2187 = CASE
            WHEN A0.C5633_157 THEN A0.C5633_155
            ELSE 123
        END)
FROM T872 AS A0

Which gets decorrelated to:
Project(EXPR$0=[CASE
│    WHEN IS NULL($215:P0) THEN 0:BIGINT
│    ELSE CAST($215:P0)
│END])
└── Join(condition=[=($213:LHS.$f213, $214:RHS.C5633_2187)], joinType=[left])
    ├── Project(*, $f213=[CASE
    │   │    WHEN $12:T872.C5633_157 THEN $10:T872.C5633_155
    │   │    ELSE 123:BIGINT
    │   │END])
    │   └── TableScan(table=[T872])
    └── Aggregate(group=[{$0:C5633_2187}], P0=[COUNT()])
        └── Project(C5633_2187=[$4:C5633_2187])
            └── Filter(condition=[=($4:T1704.C5633_2187, $4:T1704.C5633_2187)])
                └── TableScan(table=[T1704])

which only has one level of join between the case expression and the outer 
correlation field.

But if I change the else branch `123` into `123 + 456`, a second join is 
introduced.
Project(EXPR$0=[CASE
│    WHEN IS NULL($215:P0) THEN 0:BIGINT
│    ELSE CAST($215:P0)
│END])
└── Join(condition=[AND(=($10:T872.C5633_155, $213:RHS.C5633_155), 
=($12:T872.C5633_157, $214:RHS.C5633_157))], joinType=[left])
    ├── TableScan(table=[T872])
    └── Aggregate(group=[{$0:C5633_155, $1:C5633_157}], P0=[COUNT()])
        └── Project(C5633_155=[$13:C5633_155], C5633_157=[$14:C5633_157])
            └── Join(condition=[=($4:T1704.C5633_2187, CASE
                │    WHEN $14:RHS.C5633_157 THEN $13:RHS.C5633_155
                │    ELSE CAST(+(123, 456))
                │END)], joinType=[inner])
                ├── TableScan(table=[T1704])
                └── Aggregate(group=[{$0:C5633_155, $1:C5633_157}])
                    └── Project(C5633_155=[$10:T872.C5633_155], 
C5633_157=[$12:T872.C5633_157])
                        └── TableScan(table=[T872])

Now I also have two joins, an inner one on involving the case expression (which 
is not pushed down) and an outer which is self-joining on the correlated fields.

Is there a reason for this? It’s surprising to me that the introduction of the 
extra addition inside the case expression triggers such a radical change.

Thanks!
-Ian J. Bertolacci



Reply via email to