For the record, I was wrong, the patch in [1] doesn't affect lost
concurrent updates with ctid. It was applied in 17.7 and after looking
much harder at the thread and code, I can see it's unrelated.

I've worked out what's happening, and it has nothing to do with ctid.

Here's a much simplified example:
create table t(i int);
insert into t(i) values (1);
--s1
begin;
update t set i = 2 from (select i from t) x where t.i = x.i;
---------------
UPDATE 1
--s2
update t set i = 2 from (select i from t) x where t.i = x.i;
----------------
UPDATE 0  (after commit s1)

The same thing happens with
update t set  i = 2 from (select i from t for update) x where t.i = x.i;

x.i is not updated when s1 releases the lock and so s2 is still
looking for x.i = 1. Based on [2], I'm guessing that because the where
clause is indirect, it doesn't qualify for re-evalution after the lock
is released. So it continues to use the version of the record from the
start of the transaction. But I don't know nearly enough about the
internals to give a proper explanation.

Here's the ctid version that helped me figure this out.

with x as (select ctid from t for update),
y as (update t set i = 2 from x where t.ctid = x.ctid returning t.ctid)
select 'x', ctid from x union select 'y', ctid from y;
--s1
x (0, 1)
y (0, 2)
--s2
x (0, 2)

Even though x is updated in s2, the updated version isn't what's used
by y. I suspect the x version is only updated here because of the
final select.

So don't use this pattern to avoid deadlocks if this is a one shot update.
with x as (select ctid from t where ... order by id for update)
update t set ... where t.ctid =  x.ctid;
Use an immutable unique column, or retry deadlocks.

Thanks,
Bernice

[1] 
https://www.postgresql.org/message-id/flat/4a6268ff-3340-453a-9bf5-c98d51a6f729%40app.fastmail.com
[2] 
https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED


Reply via email to