[
https://issues.apache.org/jira/browse/CALCITE-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033290#comment-18033290
]
Dmitry Sysolyatin edited comment on CALCITE-7250 at 10/27/25 5:09 PM:
----------------------------------------------------------------------
I think this bug has existed since 2016. For now, I recommend not applying
SubQueryRemoveRule or RelDecorrelate to plans generated for UPDATE or DELETE
operations, because these plans depend on the WHERE clause from the source
SELECT.
In your case, the resulting plan probably looks like this (if check file which
you sent)
{code:java}
JdbcToEnumerableConverter
JdbcTableModify(table=[[public, tab1]], operation=[DELETE], flattened=[false])
JdbcProject(c1=[$0], c2=[$1])
JdbcJoin(condition=[=($0, $2)], joinType=[inner])
JdbcTableScan(table=[[public, tab1]])
JdbcAggregate(group=[{0}])
JdbcFilter(condition=[AND(=($1, 2), IS NOT NULL($0))])
JdbcTableScan(table=[[public, tab2]]){code}
The source SELECT looks something like:
{code:java}
SELECT t1.col0 AS c1, t1.col1 AS c2
FROM tab1 t1
JOIN (
SELECT DISTINCT col0
FROM tab2
WHERE col1 = 2
AND col0 IS NOT NULL
) t2
ON t1.col0 = t2.col0; {code}
The delete will looks like:
{code:java}
DELETE FROM tab1 b <sourceSelect.getWhere()> <-- there is no where in top node
of source select
DELETE FROM tab1 b
{code}
We should find a more reliable way to generate the WHERE clause for DELETE
statements instead of using sourceSelect.getWhere() [1]
[1]
[https://github.com/apache/calcite/blob/4f79aa875c4ccf6e50bbd6475b716ce9cd529251/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L1164]
was (Author: dmsysolyatin):
I think this bug has existed since 2016. For now, I recommend not applying
SubQueryRemoveRule or RelDecorrelate to plans generated for UPDATE or DELETE
operations, because these plans depend on the WHERE clause from the source
SELECT.
In your case, the resulting plan probably looks like this (if check file which
you sent)
{code:java}
JdbcToEnumerableConverter: rowcount = 20.25, cumulative cost = {255.35 rows,
336.42499999999995 cpu, 0.0 io}, id = 282
JdbcTableModify(table=[[public, tab1]], operation=[DELETE],
flattened=[false]): rowcount = 20.25, cumulative cost = {253.325 rows, 334.4
cpu, 0.0 io}, id = 281
JdbcProject(c1=[$0], c2=[$1]): rowcount = 20.25, cumulative cost =
{251.29999999999998 rows, 334.4 cpu, 0.0 io}, id = 280
JdbcJoin(condition=[=($0, $2)], joinType=[inner]): rowcount = 20.25,
cumulative cost = {235.1 rows, 302.0 cpu, 0.0 io}, id = 279
JdbcTableScan(table=[[public, tab1]]): rowcount = 100.0, cumulative
cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 68
JdbcAggregate(group=[{0}]): rowcount = 1.35, cumulative cost = {114.85
rows, 201.0 cpu, 0.0 io}, id = 278
JdbcFilter(condition=[AND(=($1, 2), IS NOT NULL($0))]): rowcount =
13.5, cumulative cost = {113.5 rows, 201.0 cpu, 0.0 io}, id = 277
JdbcTableScan(table=[[public, tab2]]): rowcount = 100.0, cumulative
cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 69 {code}
The source SELECT looks something like:
{code:java}
SELECT t1.col0 AS c1, t1.col1 AS c2
FROM tab1 t1
JOIN (
SELECT DISTINCT col0
FROM tab2
WHERE col1 = 2
AND col0 IS NOT NULL
) t2
ON t1.col0 = t2.col0; {code}
The delete will looks like:
{code}
DELETE FROM tab1 b <sourceSelect.getWhere()> <-- there is no where in top node
of source select
DELETE FROM tab1 b
{code}
We should find a more reliable way to generate the WHERE clause for DELETE
statements instead of using sourceSelect.getWhere() [1]
[1]
https://github.com/apache/calcite/blob/4f79aa875c4ccf6e50bbd6475b716ce9cd529251/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L1164
> Unexpected deletion of all table records for DELETE ... WHERE EXISTS
> <subquery>
> -------------------------------------------------------------------------------
>
> Key: CALCITE-7250
> URL: https://issues.apache.org/jira/browse/CALCITE-7250
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.35.0, 1.40.0
> Reporter: Jordin Catanzaro
> Priority: Critical
> Attachments: planner-2025-10-22-1.log
>
>
> We discovered a delete query with subquery containing conditionals that
> deletes _all records from the target table_ (tested in Apache Calcite,
> version: 1.40.0). The same query with postgres driver removes the single
> matching record as expected. Here is the offending query:
> {code:java}
> delete from tab1 b where exists (select 1 from tab2 a where a.c2=2 and
> a.c1=b.c1);{code}
> Here is how to reproduce the issue:
> {code:java}
> -- Setup tables and data in postgres
> create table tab1 (c1 int, c2 int);
> create table tab2 (c1 int,c2 int);
> insert into tab1 values (1,1);
> insert into tab1 values (2,2);
> insert into tab2 values (2,2);
>
> -- Delete query
> delete from tab1 b where exists (
> select 1 from tab2 a where a.c2=2 and a.c1=b.c1
> );
>
> -- Check results
> select * from tab1;
>
> -- Expected (postgres): (1,1)
> -- Actual (calcite): (empty set)
> -- Calcite generated query: DELETE FROM "tab1"
> {code}
> I have attached the query plan for reference. Perhaps there is an issue when
> applying both `a.c2=2 and a.c1=b.c1`? Removing `a.c=2` allows `a.c1=b.c1` to
> resolve correctly.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)