After a second manual run it properly reset the relfrozenid on the affected relation.
Only difference this one was a vacuum analyze but I suspect that was not the cause. On Monday, August 5, 2013, Armand du Plessis wrote: > Hi there, > > We're running into a scenario where despite doing a manual vacuum as a > superuser the relfrozenxid for one relation now dangerously close to > wraparound is not getting reset. > > It's a Postgres 9.2.3 cluster. We shutdown other access to the machine > while running the VACUUM to ensure it could complete quick enough with an > aggressive vacuum (vacuum_cost_limit 10000 and no delay). The previous > autovacuum was running for days without completing. > > There's also no old transactions in either pg_prepared_xacts or > pg_stat_activity. > > production=# SELECT relname, age(relfrozenxid) as xid_age FROM pg_class > WHERE relkind = 'r' ORDER BY age(relfrozenxid) DESC LIMIT 10; > relname | xid_age > -------------------+------------ > messages | 2050996318 > > (This is a similar scenario to one I've posted about a while back ( > http://www.postgresql.org/message-id/3238.1369055...@sss.pgh.pa.us) - In > that case I advanced the autovacuum_freeze_max_age with the intention to > replace the problematic table. That hasn't happened in time and now sitting > with a real problem) > > Below is the vacuum settings at the time of the manual vacuum: > > name | setting > ---------------------------------+------------ > autovacuum | on > autovacuum_analyze_scale_factor | 0.01 > autovacuum_analyze_threshold | 50 > autovacuum_freeze_max_age | 1750000000 > autovacuum_max_workers | 2 > autovacuum_naptime | 60 > autovacuum_vacuum_cost_delay | 50 > autovacuum_vacuum_cost_limit | 200 > autovacuum_vacuum_scale_factor | 2 > autovacuum_vacuum_threshold | 50 > log_autovacuum_min_duration | 0 > vacuum_cost_delay | 0 > vacuum_cost_limit | 10000 > vacuum_cost_page_dirty | 20 > vacuum_cost_page_hit | 1 > vacuum_cost_page_miss | 10 > vacuum_defer_cleanup_age | 0 > vacuum_freeze_min_age | 25000000 > vacuum_freeze_table_age | 150000000 > > The tail-end of the vacuum log: > > INFO: index "message_sender_client_ref_confirmation_index" now contains > 529204260 row versions in 10752961 pages > DETAIL: 261675128 index row versions were removed. > 722622 index pages have been deleted, 367793 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.24 sec. > INFO: index "messages_v2_recipient_sender" now contains 529194622 row > versions in 2632966 pages > DETAIL: 109563299 index row versions were removed. > 216413 index pages have been deleted, 132791 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.01 sec. > INFO: index "messages_v6_recipient_created_at_type" now contains 91840693 > row versions in 2312775 pages > DETAIL: 177088781 index row versions were removed. > 209178 index pages have been deleted, 3045 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.03 sec. > INFO: index "messages_v6_sender_created_at_type" now contains 91840716 > row versions in 2272370 pages > DETAIL: 175741424 index row versions were removed. > 201936 index pages have been deleted, 2958 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.04 sec. > INFO: index "messages_v6_unread_messages_count" now contains 4797806 row > versions in 29930 pages > DETAIL: 1808779 index row versions were removed. > 77 index pages have been deleted, 5 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: "messages": found 51093697 removable, 529671512 nonremovable row > versions in 22666431 out of 22666435 pages > DETAIL: 701995 dead row versions cannot be removed yet. > There were 716893565 unused item pointers. > 0 pages are entirely empty. > CPU 1229.92s/4618.87u sec elapsed 13951.33 sec. > INFO: vacuuming "pg_toast.pg_toast_18369" > INFO: index "pg_toast_18369_index" now contains 0 row versions in 1 pages > DETAIL: 0 index row versions were removed. > 0 index pages have been deleted, 0 are currently reusable. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: "pg_toast_18369": found 0 removable, 0 nonremovable row versions in > 0 out of 0 pages > DETAIL: 0 dead row versions cannot be removed yet. > There were 0 unused item pointers. > 0 pages are entirely empty. > CPU 0.00s/0.00u sec elapsed 0.00 sec. > VACUUM > > If I look at the pg_stat_user_tables log I can see it's cleaned up the > dead tuples successfully: > > production=# select relname, n_live_tup, n_dead_tup, last_vacuum, > last_autovacuum from pg_stat_user_tables; > relname | n_live_tup | n_dead_tup | > last_vacuum | last_autovacuum > > -------------------------------+------------+------------+-------------------------------+------------------------------- > messages | 529790266 | 2046126 | 2013-08-05 > 13:56:07.794664+00 | > > > I've got a ton of entries like the below in the log since we're > approaching the end of the world: > > 013-08-05 15:15:07.588 UTC,,,30352,,51ffc17b.7690,2,,2013-08-05 15:15:07 > UTC,9/358152441,3787657047,DEBUG,00000,"transaction ID wrap limit is > 3884757767, limited by database with OID 17671",,,,,,,,,"" > > Doing a VACUUM on any other relation resets the XID correctly. > > Any suggestions on how we can resolve this before we get shutdown? I > suspect even in shutdown mode we'll have this issue unless we can track > down the cause. > > Kind regards, > > Armand > -- Sent from Gmail Mobile