Re: More guidance on ctid

Bruce Momjian Mon, 22 Dec 2025 14:09:20 -0800

On Sat, Nov 29, 2025 at 10:45:45PM +0000, Bernice Southey wrote:
> Bruce Momjian <[email protected]> wrote:
> This is why I think the docs should say something different to what
> they currently do. The UPDATE example is fine only because it's run
> repeatedly until there are no more rows to find.
> 
> I learned the primary key self-join pattern for order by/limit in
> delete/update soon after I began with postgres. I'm pretty sure ctid
> would have confused me. So I think the doc examples with an id column
> will be very helpful, and safer.
> 
> If users discover ctid on internet forums or LLMs and then check the
> docs, I think they should get some notice of the concurrency risk.
> 
> Here's a patch with an attempt to do the above.


I was traveling so I apologize for the delayed reply.

Sorry, I didn't like your proposed patch because it removes the ctid
example, and I didn't write that example so I don't feel enabled to
remove it.  What I did do was to explain why ctid was safe in this
use-case.  I also strengthened the documentation that ctid is not valid
for row references.

-- 
  Bruce Momjian  <[email protected]>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index cea28c00f8a..9070aaa5a7c 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1558,7 +1558,7 @@ CREATE TABLE circles (
       locate the row version very quickly, a row's
       <structfield>ctid</structfield> will change if it is
       updated or moved by <command>VACUUM FULL</command>.  Therefore
-      <structfield>ctid</structfield> is useless as a long-term row
+      <structfield>ctid</structfield> should not be used as a row
       identifier.  A primary key should be used to identify logical rows.
      </para>
     </listitem>
diff --git a/doc/src/sgml/ref/delete.sgml b/doc/src/sgml/ref/delete.sgml
index 5b52f77e28f..c8dce945b06 100644
--- a/doc/src/sgml/ref/delete.sgml
+++ b/doc/src/sgml/ref/delete.sgml
@@ -323,6 +323,9 @@ DELETE FROM user_logs AS dl
   USING delete_batch AS del
   WHERE dl.ctid = del.ctid;
 </programlisting>
+   This use of <structfield>ctid</structfield> is only safe because
+   the query is repeately run, avoiding the problem of changed
+   <structfield>ctid</structfield>s.
   </para>
  </refsect1>
 
diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 40cca063946..289f48b2549 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -503,6 +503,9 @@ UPDATE work_item SET status = 'failed'
   WHERE work_item.ctid = emr.ctid;
 </programlisting>
    This command will need to be repeated until no rows remain to be updated.
+   (This use of <structfield>ctid</structfield> is only safe because
+   the query is repeately run, avoiding the problem of changed
+   <structfield>ctid</structfield>s.)
    Use of an <literal>ORDER BY</literal> clause allows the command to
    prioritize which rows will be updated; it can also prevent deadlock
    with other update operations if they use the same ordering.

Re: More guidance on ctid

Reply via email to