On 23.02.26 09:41, David Rowley wrote:
On Mon, 23 Feb 2026 at 21:21, Peter Eisentraut <[email protected]> wrote:
I read in the SQL standard that SELECT * in EXISTS is not supposed to be
expanded to all columns, but only to an arbitrary literal. This
corresponds to the recommendation in the PostgreSQL documentation to
write EXISTS (SELECT 1 ...) instead. But not even our own tests and
example code use that latter convention consistently, so I think many
users don't know it or observe it either. So implementing that little
optimization for SELECT * seems reasonable.
Thoughts?
Don't we already do this in simplify_EXISTS_query()?
It appears so, but then I think we should update the documentation along
the lines I showed, because I found it misleading about this. New
docs-only patch attached.
From bd4504ba048b5f0fbb222d6873eef63dc1621ce0 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <[email protected]>
Date: Tue, 24 Feb 2026 21:00:36 +0100
Subject: [PATCH v2] doc: Document SELECT in EXISTS optimizations
PostgreSQL optimizes away the target list of a SELECT immediately
inside an EXISTS. But the documentation was not mentioning this and
was suggesting a coding convention that suggested that picking an
efficiently computable target list was of relevance. Mention the
optimization explicitly and suggest other coding conventions.
Also add some more code comments about this.
---
doc/src/sgml/func/func-subquery.sgml | 9 +++++++--
src/backend/optimizer/plan/subselect.c | 8 +++++++-
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/func/func-subquery.sgml
b/doc/src/sgml/func/func-subquery.sgml
index a9f2b12e48c..73ff37f7ca9 100644
--- a/doc/src/sgml/func/func-subquery.sgml
+++ b/doc/src/sgml/func/func-subquery.sgml
@@ -70,8 +70,13 @@ <title><literal>EXISTS</literal></title>
and not on the contents of those rows, the output list of the
subquery is normally unimportant. A common coding convention is
to write all <literal>EXISTS</literal> tests in the form
- <literal>EXISTS(SELECT 1 WHERE ...)</literal>. There are exceptions to
- this rule however, such as subqueries that use <token>INTERSECT</token>.
+ <literal>EXISTS(SELECT * FROM ... WHERE ...)</literal>, another common
+ convention is to write <literal>EXISTS(SELECT 1 FROM ... WHERE
+ ...)</literal> or some other dummy constant. The output list is optimized
+ away in such queries, and no columns are actually fetched and no
+ expressions are actually computed. These conventions are not always
+ applicable, such as in subqueries that use <token>INTERSECT</token>, where
+ the output list of the subquery affects the result of the subquery itself.
</para>
<para>
diff --git a/src/backend/optimizer/plan/subselect.c
b/src/backend/optimizer/plan/subselect.c
index e9dc9d31f05..1f18934b2a4 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -1643,7 +1643,13 @@ convert_EXISTS_sublink_to_join(PlannerInfo *root,
SubLink *sublink,
* Note: by suppressing the targetlist we could cause an observable behavioral
* change, namely that any errors that might occur in evaluating the tlist
* won't occur, nor will other side-effects of volatile functions. This seems
- * unlikely to bother anyone in practice.
+ * unlikely to bother anyone in practice. Note that any column privileges are
+ * still checked even if the reference is removed here.
+ *
+ * The SQL standard specifies that a SELECT * immediately inside EXISTS
+ * expands to not all columns but an arbitrary literal. That is kind of the
+ * same idea, but our optimization goes further in that it throws away the
+ * entire targetlist, and not only if it was written as *.
*
* Returns true if was able to discard the targetlist, else false.
*/
--
2.53.0