Hi all,

I've never submitted a patch before, sorry for any missteps. This
patch replaces ctid with an id column in the UPDATE and DELETE
self-join examples. There's an important difference I recently learned
between using ctid and a primary key column for a self-join in an
UPDATE. With a primary key, two concurrent updates will both write.
With a ctid, only one of the updates will write. Ctid is the
equivalent of using a mutable column for the self-join. Tom explains
the details much better than I can [1].

In the example in the UPDATE doc this doesn't affect the final
outcome, because it's being used to break an update into batches and
all the updates will eventually write. But imagine a system that batch
processes incoming records and uses a self-join to an ordered CTE to
prevent UPDATE deadlocks. With a primary key self-join, this will work
correctly, but with a ctid self-join, concurrent updates will very
quietly not happen. I think the example in the UPDATE doc might lead
someone to mistakenly believe that ctid can be used in a self-join to
prevent UPDATE deadlocks in the same way a primary key can be used,
and that's why I'm proposing changing it.

For reference here's why the examples were added [2]. My changes keep
all the benefits except telling users about ctid. This was the initial
focus of [2] but there's plenty of other important benefits discussed.
And perhaps ctid should not be learned about too easily.

Thanks,
Bernice

[1] https://www.postgresql.org/message-id/1645231.1763582658%40sss.pgh.pa.us
[2] 
https://www.postgresql.org/message-id/flat/CADkLM%3DcaNEQsUwPWnfi2jR4ix99E0EJM_3jtcE-YjnEQC7Rssw%40mail.gmail.com
From ac18b371e005e85c247871d22aa416f70f378017 Mon Sep 17 00:00:00 2001
From: Bernice Southey <[email protected]>
Date: Thu, 20 Nov 2025 11:53:57 +0000
Subject: [PATCH] Remove ctid from update and delete examples

---
 doc/src/sgml/ref/delete.sgml | 10 +++++-----
 doc/src/sgml/ref/update.sgml |  9 +++------
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/ref/delete.sgml b/doc/src/sgml/ref/delete.sgml
index 5b52f77e28f..48eb1b6755a 100644
--- a/doc/src/sgml/ref/delete.sgml
+++ b/doc/src/sgml/ref/delete.sgml
@@ -313,15 +313,15 @@ DELETE FROM tasks WHERE CURRENT OF c_tasks;
    documentation of <command>UPDATE</command></link>:
 <programlisting>
 WITH delete_batch AS (
-  SELECT l.ctid FROM user_logs AS l
-    WHERE l.status = 'archived'
-    ORDER BY l.creation_date
+  SELECT a.id FROM accounts AS a
+    WHERE a.status = 'archived'
+    ORDER BY a.id
     FOR UPDATE
     LIMIT 10000
 )
-DELETE FROM user_logs AS dl
+DELETE FROM accounts AS dl
   USING delete_batch AS del
-  WHERE dl.ctid = del.ctid;
+  WHERE dl.id = del.id;
 </programlisting>
   </para>
  </refsect1>
diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 40cca063946..7374b884080 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -486,13 +486,10 @@ UPDATE films SET kind = 'Dramatic' WHERE CURRENT OF c_films;
    no <literal>LIMIT</literal> clause for <command>UPDATE</command>, it is
    possible to get a similar effect through the use of
    a <link linkend="queries-with">Common Table Expression</link> and a
-   self-join.  With the standard <productname>PostgreSQL</productname>
-   table access method, a self-join on the system
-   column <link linkend="ddl-system-columns-ctid">ctid</link> is very
-   efficient:
+   self-join:
 <programlisting>
 WITH exceeded_max_retries AS (
-  SELECT w.ctid FROM work_item AS w
+  SELECT w.id FROM work_item AS w
     WHERE w.status = 'active' AND w.num_retries &gt; 10
     ORDER BY w.retry_timestamp
     FOR UPDATE
@@ -500,7 +497,7 @@ WITH exceeded_max_retries AS (
 )
 UPDATE work_item SET status = 'failed'
   FROM exceeded_max_retries AS emr
-  WHERE work_item.ctid = emr.ctid;
+  WHERE work_item.id = emr.id;
 </programlisting>
    This command will need to be repeated until no rows remain to be updated.
    Use of an <literal>ORDER BY</literal> clause allows the command to
-- 
2.43.0

Reply via email to