After a second manual run it properly reset the relfrozenid on the affected
relation.

Only difference this one was a vacuum analyze but I suspect that was not
the cause.

On Monday, August 5, 2013, Armand du Plessis wrote:

> Hi there,
>
> We're running into a scenario where despite doing a manual vacuum as a
> superuser the relfrozenxid for one relation now dangerously close to
> wraparound is not getting reset.
>
> It's a Postgres 9.2.3 cluster. We shutdown other access to the machine
> while running the VACUUM to ensure it could complete quick enough with an
> aggressive vacuum (vacuum_cost_limit 10000 and no delay). The previous
> autovacuum was running for days without completing.
>
> There's also no old transactions in either pg_prepared_xacts or
> pg_stat_activity.
>
> production=# SELECT relname, age(relfrozenxid) as xid_age FROM pg_class
> WHERE relkind = 'r' ORDER BY age(relfrozenxid) DESC LIMIT 10;
>       relname      |  xid_age
> -------------------+------------
>  messages          | 2050996318
>
> (This is a similar scenario to one I've posted about a while back (
> http://www.postgresql.org/message-id/3238.1369055...@sss.pgh.pa.us) - In
> that case I advanced the autovacuum_freeze_max_age with the intention to
> replace the problematic table. That hasn't happened in time and now sitting
> with a real problem)
>
> Below is the vacuum settings at the time of the manual vacuum:
>
>               name               |  setting
> ---------------------------------+------------
>  autovacuum                      | on
>  autovacuum_analyze_scale_factor | 0.01
>  autovacuum_analyze_threshold    | 50
>  autovacuum_freeze_max_age       | 1750000000
>  autovacuum_max_workers          | 2
>  autovacuum_naptime              | 60
>  autovacuum_vacuum_cost_delay    | 50
>  autovacuum_vacuum_cost_limit    | 200
>  autovacuum_vacuum_scale_factor  | 2
>  autovacuum_vacuum_threshold     | 50
>  log_autovacuum_min_duration     | 0
>  vacuum_cost_delay               | 0
>  vacuum_cost_limit               | 10000
>  vacuum_cost_page_dirty          | 20
>  vacuum_cost_page_hit            | 1
>  vacuum_cost_page_miss           | 10
>  vacuum_defer_cleanup_age        | 0
>  vacuum_freeze_min_age           | 25000000
>  vacuum_freeze_table_age         | 150000000
>
> The tail-end of the vacuum log:
>
> INFO:  index "message_sender_client_ref_confirmation_index" now contains
> 529204260 row versions in 10752961 pages
> DETAIL:  261675128 index row versions were removed.
> 722622 index pages have been deleted, 367793 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.24 sec.
> INFO:  index "messages_v2_recipient_sender" now contains 529194622 row
> versions in 2632966 pages
> DETAIL:  109563299 index row versions were removed.
> 216413 index pages have been deleted, 132791 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.01 sec.
> INFO:  index "messages_v6_recipient_created_at_type" now contains 91840693
> row versions in 2312775 pages
> DETAIL:  177088781 index row versions were removed.
> 209178 index pages have been deleted, 3045 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.03 sec.
> INFO:  index "messages_v6_sender_created_at_type" now contains 91840716
> row versions in 2272370 pages
> DETAIL:  175741424 index row versions were removed.
> 201936 index pages have been deleted, 2958 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.04 sec.
> INFO:  index "messages_v6_unread_messages_count" now contains 4797806 row
> versions in 29930 pages
> DETAIL:  1808779 index row versions were removed.
> 77 index pages have been deleted, 5 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> INFO:  "messages": found 51093697 removable, 529671512 nonremovable row
> versions in 22666431 out of 22666435 pages
> DETAIL:  701995 dead row versions cannot be removed yet.
> There were 716893565 unused item pointers.
> 0 pages are entirely empty.
> CPU 1229.92s/4618.87u sec elapsed 13951.33 sec.
> INFO:  vacuuming "pg_toast.pg_toast_18369"
> INFO:  index "pg_toast_18369_index" now contains 0 row versions in 1 pages
> DETAIL:  0 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> INFO:  "pg_toast_18369": found 0 removable, 0 nonremovable row versions in
> 0 out of 0 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> There were 0 unused item pointers.
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.00 sec.
> VACUUM
>
> If I look at the pg_stat_user_tables log I can see it's cleaned up the
> dead tuples successfully:
>
> production=# select relname, n_live_tup, n_dead_tup, last_vacuum,
> last_autovacuum from pg_stat_user_tables;
>             relname            | n_live_tup | n_dead_tup |
>  last_vacuum          |        last_autovacuum
>
> -------------------------------+------------+------------+-------------------------------+-------------------------------
>  messages                      |  529790266 |    2046126 | 2013-08-05
> 13:56:07.794664+00 |
>
>
> I've got a ton of entries like the below in the log since we're
> approaching the end of the world:
>
> 013-08-05 15:15:07.588 UTC,,,30352,,51ffc17b.7690,2,,2013-08-05 15:15:07
> UTC,9/358152441,3787657047,DEBUG,00000,"transaction ID wrap limit is
> 3884757767, limited by database with OID 17671",,,,,,,,,""
>
> Doing a VACUUM on any other relation resets the XID correctly.
>
> Any suggestions on how we can resolve this before we get shutdown? I
> suspect even in shutdown mode we'll have this issue unless we can track
> down the cause.
>
> Kind regards,
>
> Armand
>


-- 
Sent from Gmail Mobile

Reply via email to