Hi, I am just jumping in here; please ignore this if it is not relevant.

When you said *Eventually this results in an "out of shared memory" error*, can you rule out the two scenarios below: /dev/shm being too small in Docker, or a query requesting too many locks because of parallelism or partitioning? I have read about multiple earlier cases of "out of shared memory" caused by one of these.

PostgreSQL: You might need to increase max_locks_per_transaction (cybertec-postgresql.com)
<https://www.cybertec-postgresql.com/en/postgresql-you-might-need-to-increase-max_locks_per_transaction/>
PostgreSQL at low level: stay curious! · Erthalion's blog
<https://erthalion.info/2019/12/06/postgresql-stay-curious/#2-shared-memory>

Also, is this repeatable (given that you mention it happens and eventually leads to "out of shared memory")?

I may be missing something, but I do not see a PID even though a lock is granted on a page. Was the process terminated, explicitly or implicitly, leaving an orphan lingering?

    ps auwwxx | grep postgres

I took the below from "src/test/regress/sql/tidscan.sql" to try to simulate a SIReadLock held by an orphaned process (by killing the process), but it gets reaped fine for me :(

    postgres=# \d tidscan
                  Table "public.tidscan"
     Column |  Type   | Collation | Nullable | Default
    --------+---------+-----------+----------+---------
     id     | integer |           |          |

    postgres=# INSERT INTO tidscan VALUES (1), (2), (3);
    postgres=# BEGIN ISOLATION LEVEL SERIALIZABLE;
    BEGIN
    postgres=*# SELECT * FROM tidscan WHERE ctid = '(0,1)';
     id
    ----
      1
    (1 row)

    postgres=*# -- locktype should be 'tuple'
    SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode = 'SIReadLock';
     locktype |    mode
    ----------+------------
     tuple    | SIReadLock
    (1 row)

    postgres=*# -- locktype should be 'tuple'
    SELECT pid, locktype, mode FROM pg_locks WHERE mode = 'SIReadLock';
     pid  | locktype |    mode
    ------+----------+------------
     2831 | tuple    | SIReadLock
    (1 row)

I thought one could attach gdb or strace to the PID to figure out what it did before crashing.

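For the "too many locks" scenario, here is a rough sketch of the arithmetic as I understand it from the PostgreSQL documentation: the shared lock table is sized for about max_locks_per_transaction * (max_connections + max_prepared_transactions) objects, and the predicate lock table (where SIReadLock entries live) uses the same formula with max_pred_locks_per_transaction. The function name is hypothetical, and the defaults are the stock values (64, 100, 0), which may not match the server in question.

```python
# Hypothetical helper, for illustration only; not part of PostgreSQL.
# Defaults are the stock settings, assumed here rather than read from the server.
def lock_table_slots(per_transaction=64, max_connections=100,
                     max_prepared_transactions=0):
    """Approximate number of objects a shared lock table is sized for."""
    return per_transaction * (max_connections + max_prepared_transactions)

# max_locks_per_transaction governs the regular lock table.
regular_slots = lock_table_slots(per_transaction=64)
# max_pred_locks_per_transaction governs the predicate (SIReadLock) table.
predicate_slots = lock_table_slots(per_transaction=64)

print("regular lock slots:", regular_slots)
print("predicate lock slots:", predicate_slots)
```

Comparing these numbers against a row count from pg_locks (filtered on mode = 'SIReadLock' for the predicate case) might show how close the server is to the limit, though the real tables have some slack, so treat this as an estimate only.
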
As always, I have little knowledge of PostgreSQL, so feel free to ignore this if nothing here is relevant.

Thanks,
Vijay

On Tue, 27 Apr 2021 at 19:55, Mike Beachy <mbea...@gmail.com> wrote:

> Hi Laurenz -
>
> On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe <laurenz.a...@cybertec.at>
> wrote:
>
>> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>>
>
> No, the -1 in the virtualtransaction (
> https://www.postgresql.org/docs/11/view-pg-locks.html) for
> pg_prepared_xacts was another clue I saw! But, it seems more or less a dead
> end as I have nothing in pg_prepared_xacts.
>
> Thanks for the idea, though.
>
> I still need to put more effort into Tom's idea about SIReadLock hanging
> out after the transaction, but some evidence pointing in this direction is
> that I've reduced the number of db connections and found that the '-1/0'
> locks will eventually go away! I interpret this as the db needing to find
> time when no overlapping read/write transactions are present. This doesn't
> seem completely correct, as I don't have any long lived transactions
> running while these locks are hanging out. Confusion still remains, for
> sure.
>
> Mike