On Mon, Jun 19, 2017 at 10:33 AM, Dmitry O Litvintsev <litvi...@fnal.gov>
wrote:

> Hi
>
> Since I have posted this nothing really changed. I am starting to panic
> (mildly).
>
> The source (production) runs :
>
>           relname           |           mode           | granted |
>                         substr                                |
> query_start          |          age
> ----------------------------+--------------------------+----
> -----+------------------------------------------------------
> ----------------+-------------------------------+------------------------
>  t_inodes_iio_idx           | RowExclusiveLock         | t       |
> autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound)   |
> 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
>


This is close to unreadable.  You can use \x to get output from psql
which survives email more readably.
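
As a rough sketch of what that looks like: I don't know the exact query you
ran, so the one below is only my guess at something that would produce the
same columns (relname, mode, granted, substr, query_start, age).  The join
and the substr() length are assumptions, not your actual query.

```sql
-- psql meta-command: print each row as a vertical list of "column | value" lines
\x on

-- Hypothetical reconstruction of the lock report, limited to autovacuum workers
SELECT c.relname,
       l.mode,
       l.granted,
       substr(a.query, 1, 70),
       a.query_start,
       age(now(), a.query_start)
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE a.query LIKE 'autovacuum:%';
```

In expanded mode each column lands on its own short line, so mail clients
have nothing to wrap.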

Your first report was 6 days ago.  Why is the job only 4 days old?  Are you
frequently restarting your production server, so that the vacuum job never
gets a chance to finish?  If so, that would explain your predicament.
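
If you are not sure about restarts, the server can tell you when it last
started.  A minimal check, assuming you can connect to the production
instance with psql:

```sql
-- When the postmaster last started
SELECT pg_postmaster_start_time();

-- The same thing expressed as uptime
SELECT now() - pg_postmaster_start_time() AS uptime;
```

An uptime of around 4 days would line up with the age of the current
autovacuum job.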

And how big is this table, that it takes at least 4 days to VACUUM?
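
Something along these lines should answer the size question.  I am taking
the table name public.t_inodes from the autovacuum line in your output:

```sql
-- Heap only
SELECT pg_size_pretty(pg_relation_size('public.t_inodes')) AS heap_size;

-- Heap plus indexes and TOAST
SELECT pg_size_pretty(pg_total_relation_size('public.t_inodes')) AS total_size;
```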

> vacuum_cost_delay = 50ms
>

That is a lot.  The default value for this is 0.  The default value
for autovacuum_vacuum_cost_delay is 20ms, which is usually too high for giant
databases.

I think you are changing this in the wrong direction.  Rather than increase
vacuum_cost_delay, you need to decrease autovacuum_vacuum_cost_delay, so
that you won't keep having problems in the future.
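
A sketch of how that change could be made, with caveats: ALTER SYSTEM only
exists in 9.4 and later (on older versions edit postgresql.conf and reload
instead), and the 5ms below is just an illustrative value I picked, not a
recommendation tuned to your hardware.

```sql
-- See what is currently in effect
SHOW vacuum_cost_delay;
SHOW autovacuum_vacuum_cost_delay;

-- Lower the autovacuum throttle (example value only)
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '5ms';

-- Make the server re-read its configuration; no restart needed for this setting
SELECT pg_reload_conf();
```

An autovacuum worker that is already in the middle of a table may not pick
up the new value until it moves on, so don't expect this alone to speed up
the job that is running now.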


On your test server, change vacuum_cost_delay to zero and then initiate a
manual vacuum of the table.  It will block on the autovacuum's lock, so
then kill the autovacuum (best to have the manual vacuum queued up first,
otherwise it will be a race between when you start the manual vacuum, and
when the autovacuum automatically restarts, to see who gets the lock). See
how long it takes this unthrottled vacuum to run, and how much effect the
IO it causes has on the performance of other tasks.  If acceptable, repeat
this on production (although really, I don't think that you have much of a
choice on whether the effect is acceptable or not--it needs to be done.)
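
Roughly, the sequence would look like the following.  This is a sketch, not
something I have run against your system: it needs two psql sessions, and I
am assuming pg_cancel_backend() is how you would kill the autovacuum worker
and that the worker's query text starts with "autovacuum:" as in your output.

```sql
-- Session 1: turn off throttling for this session, then queue the manual vacuum.
-- The VACUUM will sit waiting on the lock held by the autovacuum worker.
SET vacuum_cost_delay = 0;
VACUUM VERBOSE public.t_inodes;

-- Session 2: once session 1 is waiting, find the autovacuum worker ...
SELECT pid, query_start, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%public.t_inodes%';

-- ... and cancel it, so the queued manual vacuum gets the lock next.
SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%public.t_inodes%';
```

Running \timing in session 1 before the VACUUM will tell you how long the
unthrottled run takes.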

Cheers,

Jeff
