Hi,

I am just trying to jump in, but ignore if not relevant.

when you said    *Eventually this results in an "out of shared memory"
error  *

Can you rule out the below two scenarios (wrt /dev/shm too low in docker or
query requesting for too many locks either due to parallellism/partition
involved)
There have been multiple cases of out of shared memory i have read earlier
for due to above.

PostgreSQL: You might need to increase max_locks_per_transaction
(cybertec-postgresql.com)
<https://www.cybertec-postgresql.com/en/postgresql-you-might-need-to-increase-max_locks_per_transaction/>
PostgreSQL at low level: stay curious! ยท Erthalion's blog
<https://erthalion.info/2019/12/06/postgresql-stay-curious/#2-shared-memory>

also, is this repeatable (given you mention it happens and eventually lead
to "out of shared memory")

I may be missing something, but i do not see a PID even though it has a
lock granted on a page, was the process terminated explicitly or
implicitly. ( and an orphan lingering ? )
ps auwwxx | grep postgres

I took the below from "src/test/regress/sql/tidscan.sql"  to simulate
SIReadLock with an orphan process (by killing the process), but it gets
reaped fine for me :(

postgres=# \d tidscan
              Table "public.tidscan"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |

postgres=# INSERT INTO tidscan VALUES (1), (2), (3);

postgres=# BEGIN ISOLATION LEVEL SERIALIZABLE;
BEGIN
postgres=*# SELECT * FROM tidscan WHERE ctid = '(0,1)';
 id
----
  1
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode =
'SIReadLock';
 locktype |    mode
----------+------------
 tuple    | SIReadLock
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT pid, locktype, mode FROM pg_locks WHERE mode = 'SIReadLock';
 pid  | locktype |    mode
------+----------+------------
 2831 | tuple    | SIReadLock
(1 row)

i thought one could attach a gdb or strace to the pid to figure out what it
did before crashing.

As always, I have little knowledge on postgresql, feel free to ignore if
nothing relevant.

Thanks,
Vijay



On Tue, 27 Apr 2021 at 19:55, Mike Beachy <mbea...@gmail.com> wrote:

> Hi Laurenz -
>
> On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe <laurenz.a...@cybertec.at>
> wrote:
>
>> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>>
>
> No, the -1 in the virtualtransaction (
> https://www.postgresql.org/docs/11/view-pg-locks.html) for
> pg_prepared_xacts was another clue I saw! But, it seems more or less a dead
> end as I have nothing in pg_prepared_xacts.
>
> Thanks for the idea, though.
>
> I still need to put more effort into Tom's idea about SIReadLock hanging
> out after the transaction, but some evidence pointing in this direction is
> that I've reduced the number of db connections and found that the '-1/0'
> locks will eventually go away! I interpret this as the db needing to find
> time when no overlapping read/write transactions are present. This doesn't
> seem completely correct, as I don't have any long lived transactions
> running while these locks are hanging out. Confusion still remains, for
> sure.
>
> Mike

Reply via email to