On Sat, Nov 29, 2025 at 10:45:45PM +0000, Bernice Southey wrote:
> Bruce Momjian <[email protected]> wrote:
> This is why I think the docs should say something different to what
> they currently do. The UPDATE example is fine only because it's run
> repeatedly until there are no more rows to find.
>
> I learned the primary key self-join pattern for order by/limit in
> delete/update soon after I began with postgres. I'm pretty sure ctid
> would have confused me. So I think the doc examples with an id column
> will be very helpful, and safer.
>
> If users discover ctid on internet forums or LLMs and then check the
> docs, I think they should get some notice of the concurrency risk.
>
> Here's a patch with an attempt to do the above.
I was traveling so I apologize for the delayed reply.
Sorry, I didn't like your proposed patch because it removes the ctid
example, and I didn't write that example so I don't feel enabled to
remove it. What I did do was to explain why ctid was safe in this
use-case. I also strengthened the documentation that ctid is not valid
for row references.
--
Bruce Momjian <[email protected]> https://momjian.us
EDB https://enterprisedb.com
Do not let urgent matters crowd out time for investment in the future.
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index cea28c00f8a..9070aaa5a7c 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1558,7 +1558,7 @@ CREATE TABLE circles (
locate the row version very quickly, a row's
<structfield>ctid</structfield> will change if it is
updated or moved by <command>VACUUM FULL</command>. Therefore
- <structfield>ctid</structfield> is useless as a long-term row
+ <structfield>ctid</structfield> should not be used as a row
identifier. A primary key should be used to identify logical rows.
</para>
</listitem>
diff --git a/doc/src/sgml/ref/delete.sgml b/doc/src/sgml/ref/delete.sgml
index 5b52f77e28f..c8dce945b06 100644
--- a/doc/src/sgml/ref/delete.sgml
+++ b/doc/src/sgml/ref/delete.sgml
@@ -323,6 +323,9 @@ DELETE FROM user_logs AS dl
USING delete_batch AS del
WHERE dl.ctid = del.ctid;
</programlisting>
+ This use of <structfield>ctid</structfield> is only safe because
+ the query is repeately run, avoiding the problem of changed
+ <structfield>ctid</structfield>s.
</para>
</refsect1>
diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 40cca063946..289f48b2549 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -503,6 +503,9 @@ UPDATE work_item SET status = 'failed'
WHERE work_item.ctid = emr.ctid;
</programlisting>
This command will need to be repeated until no rows remain to be updated.
+ (This use of <structfield>ctid</structfield> is only safe because
+ the query is repeately run, avoiding the problem of changed
+ <structfield>ctid</structfield>s.)
Use of an <literal>ORDER BY</literal> clause allows the command to
prioritize which rows will be updated; it can also prevent deadlock
with other update operations if they use the same ordering.