Incorrect explain output for updates/delete operations with returning-list on partitioned tables

SAIKIRAN AVULA Mon, 06 May 2024 14:06:55 -0700

Hi PostgreSQL Community,

I have been working on partitioned tables recently, and I have noticed
something that doesn't seem correct with the EXPLAIN output of an
update/delete query with a returning list.


For example, consider two partitioned tables, "t1" and "t2," with
partitions "t11," "t12," and "t21," "t22," respectively. The table
definitions are as follows:

```sql
postgres=# \d+ t1
                                     Partitioned table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression
| Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |
|              |
 b      | integer |           |          |         | plain   |
|              |
 c      | integer |           |          |         | plain   |
|              |
Partition key: RANGE (a)
Partitions: t11 FOR VALUES FROM (0) TO (1000),
            t12 FOR VALUES FROM (1000) TO (10000)

postgres=# \d+ t2
                                     Partitioned table "public.t2"
 Column |  Type   | Collation | Nullable | Default | Storage | Compression
| Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 a      | integer |           |          |         | plain   |
|              |
 b      | integer |           |          |         | plain   |
|              |
 c      | integer |           |          |         | plain   |
|              |
Partition key: RANGE (a)
Partitions: t21 FOR VALUES FROM (0) TO (1000),
            t22 FOR VALUES FROM (1000) TO (10000)
```

The EXPLAIN output for an update query with a returning list doesn't seem
correct to me. Here are the examples (the part that doesn't seem right is
highlighted in bold):

*Query1:*
```
postgres=# explain verbose update t1 set b = 10 from t2 where t1.a = t2.a
 returning t1.c;
                                        QUERY PLAN

-------------------------------------------------------------------------------------------
 Update on public.t1  (cost=0.00..125187.88 rows=41616 width=14)
   *Output: t1_1.c     -----> something not right??*
   Update on public.t11 t1_1
   Update on public.t12 t1_2
   ->  Append  (cost=0.00..125187.88 rows=41616 width=14)
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=14)
               Output: 10, t1_1.tableoid, t1_1.ctid
               Join Filter: (t1_1.a = t2_1.a)
               ->  Seq Scan on public.t11 t1_1  (cost=0.00..30.40 rows=2040
width=14)
                     Output: t1_1.a, t1_1.tableoid, t1_1.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=4)
                     Output: t2_1.a
                     ->  Seq Scan on public.t21 t2_1  (cost=0.00..30.40
rows=2040 width=4)
                           Output: t2_1.a
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=14)
               Output: 10, t1_2.tableoid, t1_2.ctid
               Join Filter: (t1_2.a = t2_2.a)
               ->  Seq Scan on public.t12 t1_2  (cost=0.00..30.40 rows=2040
width=14)
                     Output: t1_2.a, t1_2.tableoid, t1_2.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=4)
                     Output: t2_2.a
                     ->  Seq Scan on public.t22 t2_2  (cost=0.00..30.40
rows=2040 width=4)
                           Output: t2_2.a
(23 rows)
```

*Query2:*

*```*postgres=# explain verbose update t1 set b = 10 from t2 where t1.a =
t2.a  returning t2.c;
                                        QUERY PLAN

-------------------------------------------------------------------------------------------
 Update on public.t1  (cost=0.00..125187.88 rows=41616 width=18)
   *Output: t2.c*
   Update on public.t11 t1_1
   Update on public.t12 t1_2
   ->  Append  (cost=0.00..125187.88 rows=41616 width=18)
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=18)
               Output: 10, t2_1.c, t1_1.tableoid, t1_1.ctid
               Join Filter: (t1_1.a = t2_1.a)
               ->  Seq Scan on public.t11 t1_1  (cost=0.00..30.40 rows=2040
width=14)
                     Output: t1_1.a, t1_1.tableoid, t1_1.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=8)
                     Output: t2_1.c, t2_1.a
                     ->  Seq Scan on public.t21 t2_1  (cost=0.00..30.40
rows=2040 width=8)
                           Output: t2_1.c, t2_1.a
         ->  Nested Loop  (cost=0.00..62489.90 rows=20808 width=18)
               Output: 10, t2_2.c, t1_2.tableoid, t1_2.ctid
               Join Filter: (t1_2.a = t2_2.a)
               ->  Seq Scan on public.t12 t1_2  (cost=0.00..30.40 rows=2040
width=14)
                     Output: t1_2.a, t1_2.tableoid, t1_2.ctid
               ->  Materialize  (cost=0.00..40.60 rows=2040 width=8)
                     Output: t2_2.c, t2_2.a
                     ->  Seq Scan on public.t22 t2_2  (cost=0.00..30.40
rows=2040 width=8)
                           Output: t2_2.c, t2_2.a
(23 rows)
```

After further investigation into the code, I noticed following:

1. In the 'grouping_planner()' function, while generating paths for the
final relation (
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L1857),
we only take care of adjusting the append_rel_attributes in returningList
for resultRelation. Shouldn't we do that for other relations as well in
query? Example for *Query2* above, *adjust_appendrel_attrs_multilevel* is a
no-op.
2. After plan creation (
https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/createplan.c#L351),
shouldn't we perform tlist labeling for the `returningList` as well? I
suspect this is resulting in incorrect output in *Query1*.

I suspect that similar issues might also be present for `withCheckOptions`,
`mergeActionList`, and `mergeJoinCondition`.

I would appreciate it if the community could provide insights or
clarifications regarding this observation.

Thank you for your time and consideration.


Regards
Saikiran Avula,
SDE, Amazon Web Services.

Incorrect explain output for updates/delete operations with returning-list on partitioned tables

Reply via email to