We recently experienced an issue where a transaction that was finished and no 
longer existed kept slowly accumulating SIReadLocks over the period of a week.  
The only way we could remove the locks was by restarting postgresql.

The entries in pg_locks resembled:

mode: SIReadLock

locktype: page

relation::regclass::text: <table>_pkey

virtualtransaction: 36/296299968

granted:t

pid: 2263461

count(1): 5559 (when grouped)

Note that this pid did not exist in pg_stat_activity.  I understand that it is 
normal for SSI locks to persist after a transaction is finished.  There are 
however two aspects to this that I don't understand:

* It's my understanding that these locks should be cleared when there are no 
conflicting transactions.  These locks had existed for > 1 week and we have no 
transactions that last more than a few seconds (the oldest transaction in 
pg_stat_activity is always < 1minute old).

* Why would a transaction that is finished continue accumulating locks over 
time?

If it helps, here is some more information about the state of the system:

* There were a total of six pids in pg_locks that didn't exist in 
pg_stat_activity.  They held a variety of SIReadLocks, but they weren't 
increasing in number over time.  I'm not sure how long they were present; I 
only know that the problematic pid existed for a week due to its continual 
growth reflecting in our internal lock monitoring system.

* I tried finding overlapping SIReadLocks (see query below), but none were 
returned (I realize that the SSI conflict resolution algo is much more involved 
than this simple query)

* PG version: 9.6.17

I would appreciate any hints of what I could've done to investigate this 
further or how I could've resolved the issue without restarting the db (and 
thus experiencing downtime).

thank you in advance,
-Mike

SELECT

waiting.locktype AS w_locktype,

LEFT(waiting.relation::regclass::text,25) AS waiting_table,

COALESCE(waiting_stm.query,'?') AS w_query,

waiting.page ( http://waiting.page/ ) AS w_page,

waiting.tuple AS w_tuple,

waiting.pid ( http://waiting.pid/ ) AS w_pid,

other.locktype AS o_locktype,

LEFT(other.relation::regclass::text,15) AS other_table,

LEFT(COALESCE(other_stm.query, '?'), 50) AS other_query,

other.page ( http://other.page/ ) AS o_page,

other.tuple AS o_tuple,

other.pid ( http://other.pid/ ) AS other_pid,

other.GRANTED AS o_granted

FROM

pg_catalog.pg_locks AS waiting

LEFT JOIN

pg_catalog.pg_stat_activity AS waiting_stm

ON waiting_stm.pid ( http://waiting_stm.pid/ ) = waiting.pid ( 
http://waiting.pid/ )

JOIN

pg_catalog.pg_locks AS other

ON (

(

waiting."database" = other."database"

AND waiting.relation = other.relation

and waiting.locktype = other.locktype

AND ( CASE WHEN other.locktype = 'page' THEN waiting.page ( 
http://waiting.page/ ) IS NOT DISTINCT FROM other.page ( http://other.page/ )

WHEN other.locktype = 'tuple' THEN waiting.page ( http://waiting.page/ ) IS NOT 
DISTINCT FROM other.page ( http://other.page/ ) and waiting.tuple IS NOT 
DISTINCT FROM other.tuple

ELSE true END

)

)

OR waiting.transactionid = other.transactionid

) AND waiting.pid ( http://waiting.pid/ ) <> other.pid ( http://other.pid/ )

LEFT JOIN

pg_catalog.pg_stat_activity AS other_stm

ON other_stm.pid ( http://other_stm.pid/ ) = other.pid ( http://other.pid/ )

WHERE waiting.pid ( http://waiting.pid/ ) IN (2263461, 2263276, 2263283, 
2263284, 2263459, 2263527 )

Reply via email to