Re: -1/0 virtualtransaction

2021-05-01 Thread Mike Beachy
In case this helps anyone else, I found a simple way to get a rough
idea of what's going on, which is to run:

select (select count(distinct virtualtransaction) from pg_locks) as
tx_with_locks, (select count(*) from pg_stat_activity where state =
'active') as active_tx, (select count(*) from pg_locks where
virtualtransaction = '-1/0') as summarized_locks;

I disabled the part of my application that seemed to be causing
problems with too many writes (a background cleanup task) and then
triggered it from a separate process. I can see the number of
transactions with locks climbing when it hits a problematic item,
while the number of active transactions (of course) stays low.

Mike

On Fri, Apr 30, 2021 at 4:53 PM Mike Beachy  wrote:
>
> On Fri, Apr 30, 2021 at 7:12 AM Thomas Munro  wrote:
> > But do you have lots of short overlapping transactions so that there
> > is never a moment where there are zero transactions running?
>
> Yeah, that almost certainly explains it.
>
> Thanks very much for the explanation about the summarized locks.
>
> > The number of SERIALIZABLEXACT objects is (max_connections +
> > max_prepared_transactions) * 10.  So, you could try increasing
> > max_connections (without increasing the actual number of connections)
> > to see if you can get to a point where you don't see these invalid
> > virtual xids, and then maybe it'll be able to clean up locks more
> > aggressively.
>
> Aha! I hadn't considered that some parameter besides
> max_pred_locks_per_transaction would come into play.  I'll give this a
> shot.
>
> Thanks,
> Mike




Re: -1/0 virtualtransaction

2021-04-30 Thread Mike Beachy
On Fri, Apr 30, 2021 at 7:12 AM Thomas Munro  wrote:
> But do you have lots of short overlapping transactions so that there
> is never a moment where there are zero transactions running?

Yeah, that almost certainly explains it.

Thanks very much for the explanation about the summarized locks.

> The number of SERIALIZABLEXACT objects is (max_connections +
> max_prepared_transactions) * 10.  So, you could try increasing
> max_connections (without increasing the actual number of connections)
> to see if you can get to a point where you don't see these invalid
> virtual xids, and then maybe it'll be able to clean up locks more
> aggressively.

Aha! I hadn't considered that some parameter besides
max_pred_locks_per_transaction would come into play.  I'll give this a
shot.

Thanks,
Mike




Re: -1/0 virtualtransaction

2021-04-30 Thread Thomas Munro
On Wed, Apr 28, 2021 at 2:25 AM Mike Beachy  wrote:
> On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe  wrote:
>>
>> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>
> No, the -1 in the virtualtransaction 
> (https://www.postgresql.org/docs/11/view-pg-locks.html) for pg_prepared_xacts 
> was another clue I saw! But, it seems more or less a dead end as I have 
> nothing in pg_prepared_xacts.
>
> Thanks for the idea, though.

There is another way to get a pg_lock that shows -1/0 there: if we run
out of SERIALIZABLEXACT objects, we transfer the locks of the oldest
committed transaction to a single SERIALIZABLEXACT object called
OldCommittedSxact, and it has an invalid virtual xid.  You can see
this if you recompile with TEST_SUMMARIZE_SERIAL defined to force that
path, and then run three overlapping transactions like this:

tx1: BEGIN;
tx1: SELECT * FROM t WHERE id = 42;
tx2:   BEGIN;
tx2:   SELECT * FROM t WHERE id = 999;
tx1: COMMIT;

Even though it comitted, at this point we still see tx1's locks, along
with tx2's.

tx3: BEGIN;
tx3: SELECT * FROM t WHERE id = 1234;

At this point we see tx1's locks still but their vxid has changed to -1/0.

> I still need to put more effort into Tom's idea about SIReadLock hanging out 
> after the transaction, but some evidence pointing in this direction is that 
> I've reduced the number of db connections and found that the '-1/0' locks 
> will eventually go away! I interpret this as the db needing to find time when 
> no overlapping read/write transactions are present. This doesn't seem 
> completely correct, as I don't have any long lived transactions running while 
> these locks are hanging out. Confusion still remains, for sure.

But do you have lots of short overlapping transactions so that there
is never a moment where there are zero transactions running?

As mentioned, locks (and transactions, and conflicts) hang around
after you commit.  That's because things that your finished
transaction saw can cause transactions that are still running to
become uncommittable, by forming part of a cycle.   The rules for when
the locks can be freed change once you reach the degraded "summarised"
mode, though, due to lack of bookkeeping space.  Not sure of the exact
details without rereading the source code 42 times with much coffee,
but it's something like: summarised locks can only be freed at times
when there are zero active read/write serializable transactions,
because that's when "CanPartialClearThrough" advances, while in the
normal case they can be freed as soon as there are no SSI snapshots
that were taken before it committed, because that's when
"SxactGlobalXmin" advances.

The number of SERIALIZABLEXACT objects is (max_connections +
max_prepared_transactions) * 10.  So, you could try increasing
max_connections (without increasing the actual number of connections)
to see if you can get to a point where you don't see these invalid
virtual xids, and then maybe it'll be able to clean up locks more
aggressively.




Re: -1/0 virtualtransaction

2021-04-27 Thread Vijaykumar Jain
Hi,

I am just trying to jump in, but ignore if not relevant.

when you said*Eventually this results in an "out of shared memory"
error  *

Can you rule out the below two scenarios (wrt /dev/shm too low in docker or
query requesting for too many locks either due to parallellism/partition
involved)
There have been multiple cases of out of shared memory i have read earlier
for due to above.

PostgreSQL: You might need to increase max_locks_per_transaction
(cybertec-postgresql.com)

PostgreSQL at low level: stay curious! ยท Erthalion's blog


also, is this repeatable (given you mention it happens and eventually lead
to "out of shared memory")

I may be missing something, but i do not see a PID even though it has a
lock granted on a page, was the process terminated explicitly or
implicitly. ( and an orphan lingering ? )
ps auwwxx | grep postgres

I took the below from "src/test/regress/sql/tidscan.sql"  to simulate
SIReadLock with an orphan process (by killing the process), but it gets
reaped fine for me :(

postgres=# \d tidscan
  Table "public.tidscan"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 id | integer |   |  |

postgres=# INSERT INTO tidscan VALUES (1), (2), (3);

postgres=# BEGIN ISOLATION LEVEL SERIALIZABLE;
BEGIN
postgres=*# SELECT * FROM tidscan WHERE ctid = '(0,1)';
 id

  1
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT locktype, mode FROM pg_locks WHERE pid = pg_backend_pid() AND mode =
'SIReadLock';
 locktype |mode
--+
 tuple| SIReadLock
(1 row)

postgres=*# -- locktype should be 'tuple'
SELECT pid, locktype, mode FROM pg_locks WHERE mode = 'SIReadLock';
 pid  | locktype |mode
--+--+
 2831 | tuple| SIReadLock
(1 row)

i thought one could attach a gdb or strace to the pid to figure out what it
did before crashing.

As always, I have little knowledge on postgresql, feel free to ignore if
nothing relevant.

Thanks,
Vijay



On Tue, 27 Apr 2021 at 19:55, Mike Beachy  wrote:

> Hi Laurenz -
>
> On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe 
> wrote:
>
>> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>>
>
> No, the -1 in the virtualtransaction (
> https://www.postgresql.org/docs/11/view-pg-locks.html) for
> pg_prepared_xacts was another clue I saw! But, it seems more or less a dead
> end as I have nothing in pg_prepared_xacts.
>
> Thanks for the idea, though.
>
> I still need to put more effort into Tom's idea about SIReadLock hanging
> out after the transaction, but some evidence pointing in this direction is
> that I've reduced the number of db connections and found that the '-1/0'
> locks will eventually go away! I interpret this as the db needing to find
> time when no overlapping read/write transactions are present. This doesn't
> seem completely correct, as I don't have any long lived transactions
> running while these locks are hanging out. Confusion still remains, for
> sure.
>
> Mike


Re: -1/0 virtualtransaction

2021-04-27 Thread Mike Beachy
Hi Laurenz -

On Tue, Apr 27, 2021 at 2:56 AM Laurenz Albe 
wrote:

> Not sure, but do you see prepared transactions in "pg_prepared_xacts"?
>

No, the -1 in the virtualtransaction (
https://www.postgresql.org/docs/11/view-pg-locks.html) for
pg_prepared_xacts was another clue I saw! But, it seems more or less a dead
end as I have nothing in pg_prepared_xacts.

Thanks for the idea, though.

I still need to put more effort into Tom's idea about SIReadLock hanging
out after the transaction, but some evidence pointing in this direction is
that I've reduced the number of db connections and found that the '-1/0'
locks will eventually go away! I interpret this as the db needing to find
time when no overlapping read/write transactions are present. This doesn't
seem completely correct, as I don't have any long lived transactions
running while these locks are hanging out. Confusion still remains, for
sure.

Mike


Re: -1/0 virtualtransaction

2021-04-27 Thread Laurenz Albe
On Mon, 2021-04-26 at 17:45 -0400, Mike Beachy wrote:
> Does anyone have any pointers on what a virtualtransaction of '-1/0' means?
> 
> I'm using SSI and an example is 
> 
>  locktype | database | relation |  page   | tuple | virtualxid | 
> transactionid | classid | objid | objsubid | virtualtransaction | pid |
> mode| granted | fastpath
> --+--+--+-+---++---+-+---+--++-++-+--
>  page |16384 |   468238 |   19216 |   ||  
>  | |   |  | -1/0   | | SIReadLock | t 
>   | f
> 
>  This is incredibly hard to search for. 
> 
> I see these for page, tuple and (perhaps a clue?) one relation, and I'm 
> seeing the page and tuple locks accumulate over time.
>  Eventually this results in an "out of shared memory" error.

Not sure, but do you see prepared transactions in "pg_prepared_xacts"?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: -1/0 virtualtransaction

2021-04-26 Thread Mike Beachy
On Mon, Apr 26, 2021 at 6:16 PM Tom Lane  wrote:

> Hmm, that's an invalid VXID, which would ordinarily mean that nothing
> is holding the lock.  There is a passing mention in mvcc.sgml that
> SIRead locks sometimes need to be held past transaction end, so maybe
> what you're looking at is such a lock that is no longer associated
> with a specific transaction.  I have to disclaim knowing much of
> anything about the SSI code, though.
>

Hmm, yeah. I had seen discussion of this "held past transaction end" but
didn't know what to make of it.

The "Serializable Snapshot Isolation in PostgreSQL" paper (
https://drkp.net/papers/ssi-vldb12.pdf) makes me think this is a reasonable
line of inquiry.

Thanks,
Mike


Re: -1/0 virtualtransaction

2021-04-26 Thread Tom Lane
Mike Beachy  writes:
> Does anyone have any pointers on what a virtualtransaction of '-1/0' means?
> I'm using SSI and an example is

>  locktype | database | relation |  page   | tuple | virtualxid |
> transactionid | classid | objid | objsubid | virtualtransaction | pid |
>  mode| granted | fastpath
> --+--+--+-+---++---+-+---+--++-++-+--
>  page |16384 |   468238 |   19216 |   ||
> | |   |  | -1/0   | | SIReadLock |
> t   | f

Hmm, that's an invalid VXID, which would ordinarily mean that nothing
is holding the lock.  There is a passing mention in mvcc.sgml that
SIRead locks sometimes need to be held past transaction end, so maybe
what you're looking at is such a lock that is no longer associated
with a specific transaction.  I have to disclaim knowing much of
anything about the SSI code, though.

regards, tom lane




-1/0 virtualtransaction

2021-04-26 Thread Mike Beachy
Hi -

Does anyone have any pointers on what a virtualtransaction of '-1/0' means?

I'm using SSI and an example is

 locktype | database | relation |  page   | tuple | virtualxid |
transactionid | classid | objid | objsubid | virtualtransaction | pid |
 mode| granted | fastpath
--+--+--+-+---++---+-+---+--++-++-+--
 page |16384 |   468238 |   19216 |   ||
| |   |  | -1/0   | | SIReadLock |
t   | f

 This is incredibly hard to search for.

I see these for page, tuple and (perhaps a clue?) one relation, and I'm
seeing the page and tuple locks accumulate over time. Eventually this
results in an "out of shared memory" error.

Any help is appreciated.

Thanks,
Mike