Hi,

Thank you, yes, those are the 'urgent' tables; I'll talk to the developers
regarding the locks. I too think there's something 'fishy' going on.

Anyway, could it be that autovacuum blocks a manual vacuum? I ran
VACUUM (FULL, VERBOSE) and some tables finished quite fast, with a huge
amount of I/O recorded in the monitoring, but some of them seem to be
stuck.
Which brings me to the second question, how can I cancel autovacuum?
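(The only approach I've found so far is looking the workers up in
pg_stat_activity and cancelling them by PID; a rough sketch, assuming
sufficient privileges to signal other backends, and with the
understanding that autovacuum will simply relaunch them later:)

```sql
-- Autovacuum workers show up in pg_stat_activity with a query text
-- beginning with 'autovacuum:'
SELECT pid, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';

-- Cancel one of them by PID (substitute a real pid from the query above)
SELECT pg_cancel_backend(12345);
```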

And the last thing I don't understand:

One particular table before vacuum full:

           relname            | relminmxid | table_size
------------------------------+------------+------------
 delayed_jobs                 | 1554151198 | 21 GB

And after vacuum full:

   relname    | relminmxid | table_size
--------------+------------+------------
 delayed_jobs | 1554155465 | 6899 MB

Shouldn't the relminmxid have changed after the vacuum full, or am I
misunderstanding something?
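(For reference, the numbers above can be pulled with something like the
following; a sketch only, and note pg_total_relation_size counts the
toast table and indexes along with the heap:)

```sql
SELECT relname,
       relminmxid,
       mxid_age(relminmxid) AS mxid_age,
       pg_size_pretty(pg_total_relation_size(oid)) AS table_size
FROM pg_class
WHERE relname = 'delayed_jobs';
```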

Thanks a lot


On Wed, Aug 9, 2017 at 7:57 PM Andres Freund <and...@anarazel.de> wrote:

> Hi,
>
> On 2017-08-09 10:06:48 +0000, Peter Hunčár wrote:
> > We started feeding it several weeks ago and everything went smoothly
> > until we hit this issue:
> >
> > 2017-08-09 05:21:50.946 WIB >DETAIL:  This command would create a
> > multixact with 2 members, but the remaining space is only enough for
> > 0 members.
> > 2017-08-09 05:21:50.946 WIB >HINT:  Execute a database-wide VACUUM in
> > database with OID 20101 with reduced vacuum_multixact_freeze_min_age
> > and vacuum_multixact_freeze_table_age settings.
>
> Ugh, that's not good.
>
>
> > I did what the hint proposes, but on a such large database a vacuum takes
> > several days.
> > We are currently in data loading phase and we are sending only INSERT
> > statements, there should be very little UPDATEs or DELETEs.
> > Yesterday, the message disappeared shortly, but today it's back
> > (vacuum is still running)
>
> I suggest manually vacuuming the tables with the oldest xids first, that
> ought to move the limits ahead. A query like
> SELECT oid::regclass, relminmxid, mxid_age(relminmxid) FROM pg_class WHERE
> relminmxid <> '0' ORDER BY mxid_age(relminmxid) DESC;
> might help you make that determination.
>
>
> > *Is there a way how to prevent/fix this so we can finish the loading (97%
> > done), because the performance went down from 100 ops/sec to 15ops/min.*
> >
> > Most tables have around 150 M rows with toast data.
> > There are several huge tables with toast data, currently autovacuumed, I
> > guess this is the reason for the performance drop:
> >
> >  | usename  | application_name | state  | backend_xmin | query
> > -+----------+------------------+--------+--------------+------------------------------------------------------------------------
> >  | postgres |                  | active |   1683428686 | autovacuum: VACUUM pg_toast.pg_toast_14548803 (to prevent wraparound)
> >  | postgres |                  | active |   1683428693 | autovacuum: VACUUM pg_toast.pg_toast_14548821 (to prevent wraparound)
> >  | postgres |                  | active |   1683428705 | autovacuum: VACUUM pg_toast.pg_toast_14548828 (to prevent wraparound)
> >  | postgres |                  | active |   1683428719 | autovacuum: VACUUM pg_toast.pg_toast_14548835 (to prevent wraparound)
> >  | postgres |                  | active |   1683428732 | autovacuum: VACUUM pg_toast.pg_toast_14549150 (to prevent wraparound)
>
> If these are the most "urgent" tables from the query above, I'd let
> autovac finish, otherwise I'd cancel them and manually choose the most
> urgent ones.
>
>
> > After changing the vacuum_cost_limit to 10000 for one night, I saw
> > 200MB/s of writes the whole night, but I had to change it back to
> > 2000, because the insert perf went to 0.
>
> You have a fair amount of autovacuum workers, perhaps that's also a bit
> too much?  I'd consider making autovacuum's cleanup duties *less*
> aggressive, but trying to keep the rate of cleanup high.  E.g. by
> increasing vacuum_freeze_min_age and vacuum_multixact_freeze_min_age, so
> only urgently old stuff gets cleaned up.
>
>
> Do you know how you come to have a lot of large multixacts?  That's
> often indicative of way too many sessions trying to lock the same rows
> and such, might be worthwhile to work on that too, to reduce the
> frequency at which new ones are needed.
>
>
> Greetings,
>
> Andres Freund
>
