Re: Help understanding SIReadLock growing without bound on completed transaction

2020-05-26 Thread Mike Klaas
On second look, it does seem the xid crossed the 2^32 mark recently, since 
most tables have a relfrozenxid close to 4b and the current xid is ~50m:

SELECT relname, age(relfrozenxid), relfrozenxid FROM pg_class WHERE relkind = 
'r' and relname not like 'pg%' order by relname;

 relname |    age    | relfrozenxid
---------+-----------+--------------
         | 107232506 |   4237961815
         |  93692362 |   4251501959
         | 183484103 |   4161710218
         |  50760536 |   4294433785
         |  58821410 |   4286372911
         | 117427283 |   4227767038
         |      9454 |   4250653210
…

select max(backend_xid::text), min(backend_xmin::text) from pg_stat_activity 
where state='active';

   max    |   min
----------+----------
 50350294 | 50350065
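
As a rough sanity check on the wraparound reading, the age and relfrozenxid figures above can be reproduced with modulo-2^32 arithmetic. This is only a sketch: it assumes age() is a plain 32-bit circular subtraction, and the "current xid" it uses is back-solved from the first row rather than taken from the server. The helper name xid_age is made up for illustration.

```python
XID_MODULUS = 2**32  # transaction IDs are 32-bit counters that wrap around


def xid_age(current_xid: int, frozen_xid: int) -> int:
    """Distance from frozen_xid forward to current_xid on the 32-bit circle.

    Assumption: this mirrors age() for ordinary xids whose age is below 2^31.
    """
    return (current_xid - frozen_xid) % XID_MODULUS


# (age, relfrozenxid) pairs copied from the first six rows of the output above.
rows = [
    (107232506, 4237961815),
    (93692362, 4251501959),
    (183484103, 4161710218),
    (50760536, 4294433785),
    (58821410, 4286372911),
    (117427283, 4227767038),
]

# Back-solve the xid that age() must have used, from the first row only ...
first_age, first_frozen = rows[0]
current_xid = (first_frozen + first_age) % XID_MODULUS

# ... then check that the same value reproduces every other row.
for age, frozen in rows:
    assert xid_age(current_xid, frozen) == age

print(current_xid)  # 50227025
```

All six rows agree on a current xid of about 50.2m, which only works if the counter has wrapped past 2^32 since those relfrozenxid values were assigned. It is a little lower than the pg_stat_activity figures, as expected if the two queries were run at different times.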

-Mike

On Tue, May 26, 2020 at 8:42 AM, Mike Klaas < m...@superhuman.com > wrote:

> 
> On Fri, May 22, 2020 at 3:15 PM, Thomas Munro < thomas.mu...@gmail.com > wrote:
> 
>> 
>> 
>> Predicate locks are released by ClearOldPredicateLocks(), which releases
>> SERIALIZABLEXACTs once they are no longer interesting. It has a
>> conservative idea of what is no longer interesting: it waits until the
>> lowest xmin across active serializable snapshots is >= the transaction's
>> finishedBefore xid, which was the system's next xid (an xid that hasn't
>> been used yet*) at the time the SERIALIZABLEXACT committed. One
>> implication of this scheme is that SERIALIZABLEXACTs are cleaned up in
>> commit order. If you somehow got into a state where a few of them were
>> being kept around for a long time, but others committed later were being
>> cleaned up (which I suppose must be the case or your system would be
>> complaining about running out of SERIALIZABLEXACTs), that might imply that
>> there is a rare leak somewhere in this scheme. In the past I have wondered
>> if there might be a problem with wraparound in the xid tracking for
>> finished transactions, but I haven't worked out the details (transaction
>> ID wraparound is both figuratively and literally the Groundhog Day of
>> PostgreSQL bug surfaces).
>> 
>> 
>> 
>> 
> 
> 
> 
> Thanks for the detailed reply, Thomas.  Is SERIALIZABLEXACT transaction ID
> wraparound the same as global xid wraparound?  The max transaction age in
> the db is ~197M [1] so I don't think we've gotten close to global
> wraparound lately.
> 
> 
> 
> Would it be helpful to cross-post this thread to pgsql-bugs, or to
> investigate further on my end?
> 
> 
> 
> -Mike
> 
> 
> 
> [1] superhuman@production => select datname, datfrozenxid,
> age(datfrozenxid) from pg_catalog.pg_database;
> 
>     datname    | datfrozenxid |    age
> ---------------+--------------+-----------
>  cloudsqladmin |   4173950091 | 169089900
>  template0     |   4266855294 |  76184697
>  postgres      |   4173951306 | 169088685
>  template1     |   4266855860 |  76184131
>  superhuman    |   4145766807 | 197273184
>

Re: Help understanding SIReadLock growing without bound on completed transaction

2020-05-26 Thread Mike Klaas
On Fri, May 22, 2020 at 3:15 PM, Thomas Munro < thomas.mu...@gmail.com > wrote:

> 
> 
> 
> Predicate locks are released by ClearOldPredicateLocks(), which releases
> SERIALIZABLEXACTs once they are no longer interesting. It has a
> conservative idea of what is no longer interesting: it waits until the
> lowest xmin across active serializable snapshots is >= the transaction's
> finishedBefore xid, which was the system's next xid (an xid that hasn't
> been used yet*) at the time the SERIALIZABLEXACT committed. One
> implication of this scheme is that SERIALIZABLEXACTs are cleaned up in
> commit order. If you somehow got into a state where a few of them were
> being kept around for a long time, but others committed later were being
> cleaned up (which I suppose must be the case or your system would be
> complaining about running out of SERIALIZABLEXACTs), that might imply that
> there is a rare leak somewhere in this scheme. In the past I have wondered
> if there might be a problem with wraparound in the xid tracking for
> finished transactions, but I haven't worked out the details (transaction
> ID wraparound is both figuratively and literally the Groundhog Day of
> PostgreSQL bug surfaces).
> 
> 
> 
> 

Thanks for the detailed reply, Thomas.  Is SERIALIZABLEXACT transaction ID 
wraparound the same as global xid wraparound?  The max transaction age in the 
db is ~197M [1] so I don't think we've gotten close to global wraparound lately.

Would it be helpful to cross-post this thread to pgsql-bugs, or to 
investigate further on my end?

-Mike

[1] superhuman@production => select datname, datfrozenxid, age(datfrozenxid) 
from pg_catalog.pg_database;

    datname    | datfrozenxid |    age
---------------+--------------+-----------
 cloudsqladmin |   4173950091 | 169089900
 template0     |   4266855294 |  76184697
 postgres      |   4173951306 | 169088685
 template1     |   4266855860 |  76184131
 superhuman    |   4145766807 | 197273184

Re: Help understanding SIReadLock growing without bound on completed transaction

2020-05-22 Thread Thomas Munro
On Fri, May 22, 2020 at 7:48 AM Mike Klaas wrote:
> It's my understanding that these locks should be cleared when there are no 
> conflicting transactions.  These locks had existed for > 1 week and we have 
> no transactions that last more than a few seconds (the oldest transaction in 
> pg_stat_activity is always < 1 minute old).
> Why would a transaction that is finished continue accumulating locks over 
> time?

Predicate locks are released by ClearOldPredicateLocks(), which
releases SERIALIZABLEXACTs once they are no longer interesting.  It
has a conservative idea of what is no longer interesting: it waits
until the lowest xmin across active serializable snapshots is >= the
transaction's finishedBefore xid, which was the system's next xid (an
xid that hasn't been used yet*) at the time the SERIALIZABLEXACT
committed.  One implication of this scheme is that SERIALIZABLEXACTs
are cleaned up in commit order.  If you somehow got into a state where
a few of them were being kept around for a long time, but others
committed later were being cleaned up (which I suppose must be the
case or your system would be complaining about running out of
SERIALIZABLEXACTs), that might imply that there is a rare leak
somewhere in this scheme.  In the past I have wondered if there might
be a problem with wraparound in the xid tracking for finished
transactions, but I haven't worked out the details (transaction ID
wraparound is both figuratively and literally the Groundhog Day of
PostgreSQL bug surfaces).

*Interestingly, it takes an unlocked view of that value, but that
doesn't seem relevant here; it could see a value that's too low, not
too high.
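
To make the release rule above concrete, here is a minimal sketch of why the xmin >= finishedBefore test has to be a circular (wraparound-aware) comparison rather than a plain integer one. It is a simplified model and not the PostgreSQL source: the function names are made up for illustration, special xid values are ignored, and the sample numbers are merely borrowed from figures posted elsewhere in this thread to show the shape of the problem. It does not demonstrate that the server actually has such a bug.

```python
def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def xid_precedes_or_equals(a: int, b: int) -> bool:
    """Circular comparison: a is at or before b if the signed 32-bit
    distance a - b is <= 0. Special (non-normal) xids are ignored here."""
    return to_int32(a - b) <= 0


def can_release(finished_before: int, lowest_active_xmin: int) -> bool:
    """Simplified model of the rule described above: a finished
    SERIALIZABLEXACT may go once the lowest xmin across active
    serializable snapshots has reached its finishedBefore xid."""
    return xid_precedes_or_equals(finished_before, lowest_active_xmin)


# Illustrative values only: a transaction that finished shortly before the
# counter wrapped, and a lowest xmin observed after the wrap.
finished_before = 4294433785
lowest_xmin = 50350065

print(can_release(finished_before, lowest_xmin))  # True: circular compare
print(lowest_xmin >= finished_before)             # False: naive compare

# A snapshot that is still older than the commit keeps the locks alive.
print(can_release(50350294, 50350065))            # False
```

If any step in the tracking of finished transactions compared xids as plain integers, an entry that finished just before the wrap would look "newer" than every post-wrap xmin for a very long time, which is the kind of leak speculated about above.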




Re: Help understanding SIReadLock growing without bound on completed transaction

2020-05-21 Thread Mike Klaas
On Thu, May 21, 2020 at 5:19 PM, Thomas Munro < thomas.mu...@gmail.com > wrote:

> 
> 
> 
> On Fri, May 22, 2020 at 7:48 AM Mike Klaas < m...@superhuman.com > wrote:
> 
> 
> 
>> 
>> 
>> pid:2263461
>> 
>> 
>> 
> 
> 
> 
> That's an unusually high looking pid. Is that expected, for example did
> you crank Linux's pid_max right up, or is this AIX, or something?
> 
> 
> 
> 

Unfortunately, I'm not sure exactly what it's running on, as it's a 
cloud-provided database instance running on Google Cloud:

=> select version();

PostgreSQL 9.6.16 on x86_64-pc-linux-gnu, compiled by clang version 
7.0.0-3~ubuntu0.18.04.1 (tags/RELEASE_700/final), 64-bit

-Mike

Re: Help understanding SIReadLock growing without bound on completed transaction

2020-05-21 Thread Thomas Munro
On Fri, May 22, 2020 at 7:48 AM Mike Klaas wrote:
> locktype: page
> relation::regclass::text: _pkey
> virtualtransaction: 36/296299968
> granted:t
> pid:2263461

That's an unusually high looking pid.  Is that expected, for example
did you crank Linux's pid_max right up, or is this AIX, or something?