Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-19 Thread Scott Marlowe
On Mon, Jun 19, 2017 at 1:53 PM, Dmitry O Litvintsev  wrote:
> Yes, we had to restart the database 4 days ago (and the vacuum resumed on
> start). I checked the log files and discovered that autovacuum on this
> table takes:
>
> pages: 0 removed, 14072307 remain
> tuples: 43524292 removed, 395006545 remain
> buffer usage: -1493114028 hits, 107664973 misses, 30263658 dirtied
> avg read rate: 1.604 MB/s, avg write rate: 0.451 MB/s
> system usage: CPU 2055.81s/17710.94u sec elapsed 524356.57 sec
>
> 6 days. So it is perpetually being autovacuumed (which I assumed to be a
> good thing).
>
> The table has 400M entries and is 115 GB.
>
> I will try your suggestions in the test environment.
>
> Thank you,
> Dmitry

Once you get this sorted, look into using the check_postgres.pl
script and a monitoring solution like Zabbix or Nagios to monitor
things like transactions remaining until wraparound, so you don't wind
up back here again. Best of luck. Note that if you drop the vacuum
delay to 0ms the vacuum will probably complete in a few hours at most.
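
For example, a quick way to keep an eye on wraparound headroom (a sketch;
alert well below autovacuum_freeze_max_age, which defaults to 200 million):

-- age of the oldest unfrozen transaction ID per database; autovacuum
-- forces an anti-wraparound vacuum once this age crosses the threshold
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;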




Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-19 Thread Dmitry O Litvintsev
Yes, we had to restart the database 4 days ago (and the vacuum resumed on
start). I checked the log files and discovered that autovacuum on this
table takes:

pages: 0 removed, 14072307 remain
tuples: 43524292 removed, 395006545 remain
buffer usage: -1493114028 hits, 107664973 misses, 30263658 dirtied
avg read rate: 1.604 MB/s, avg write rate: 0.451 MB/s
system usage: CPU 2055.81s/17710.94u sec elapsed 524356.57 sec

6 days. So it is perpetually being autovacuumed (which I assumed to be a
good thing).

The table has 400M entries and is 115 GB.

I will try your suggestions in the test environment. 

Thank you,
Dmitry 

From: Jeff Janes 
Sent: Monday, June 19, 2017 1:16 PM
To: Dmitry O Litvintsev
Cc: Andreas Kretschmer; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum holds exclusive lock on table preventing it
from being updated

On Mon, Jun 19, 2017 at 10:33 AM, Dmitry O Litvintsev wrote:
Hi

Since I posted this, nothing has really changed. I am starting to panic
(mildly).

The source (production) runs:

      relname       |       mode       | granted |                               substr                                |          query_start          |          age
--------------------+------------------+---------+---------------------------------------------------------------------+-------------------------------+------------------------
 t_inodes_iio_idx   | RowExclusiveLock | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559


This is close to unreadable.  You can use \x to get output from psql which
survives email more readably.

Your first report was 6 days ago.  Why is the job only 4 days old?  Are you 
frequently restarting your production server, so that the vacuum job never gets 
a chance to finish?  If so, that would explain your predicament.

And how big is this table, that it takes at least 4 days to VACUUM?

vacuum_cost_delay = 50ms

That is a lot.  The default value for this is 0.  The default value for 
autovacuum_vacuum_cost_delay is 20, which is usually too high for giant 
databases.

I think you are changing this in the wrong direction.  Rather than increase 
vacuum_cost_delay, you need to decrease autovacuum_vacuum_cost_delay, so that 
you won't keep having problems in the future.


On your test server, change vacuum_cost_delay to zero and then initiate a
manual vacuum of the table.  It will block on the autovacuum's lock, so then
kill the autovacuum (best to have the manual vacuum queued up first, otherwise
it will be a race between when you start the manual vacuum, and when the
autovacuum automatically restarts, to see who gets the lock). See how long it
takes this unthrottled vacuum to run, and how much effect the IO it causes has
on the performance of other tasks.  If acceptable, repeat this on production
(although really, I don't think you have much of a choice on whether the
effect is acceptable or not--it needs to be done.)

Cheers,

Jeff




Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-19 Thread Bill Moran
On Mon, 19 Jun 2017 17:33:23 +0000
Dmitry O Litvintsev  wrote:
> 
> The test stand where I was to test the schema upgrade is stuck because
> vacuum is blocking.

If you're in "panic mode" I would recommend cancelling the existing vacuum,
running your upgrades, then immediately running VACUUM FREEZE ANALYZE on that
table to make up for cancelling the autovacuum. Note that the manual VACUUM
may take quite a while, so run it in a screen session or something.
Additionally, autovacuum is going to try to relaunch that vacuum pretty
aggressively, so you might have to cancel it a few times (using
pg_terminate_backend()) before your other processes are able to sneak in
ahead of it.
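
A sketch of that sequence (the LIKE pattern and table name here are
illustrative, adjust them to your schema):

-- terminate the autovacuum worker that holds the lock
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE query LIKE 'autovacuum: VACUUM%t_inodes%';

-- ... run the schema upgrade here ...

-- then make up for the cancelled anti-wraparound autovacuum
VACUUM FREEZE ANALYZE public.t_inodes;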

Once you're out of panic mode you can take some time to breathe and consider
your options for reducing the issue in the future.

I have to 2nd Alvaro's comments about the cost delay. Personally, I'd
recommend setting vacuum_cost_delay to zero, unless your hardware is crap.
In my recent experience, allowing vacuum to run full-bore is less intrusive
on a busy database with good hardware than forcing it to take its time.
Unfortunately, changing it now isn't going to speed the current vacuum up any.

Another comment: schema changes almost always need exclusive locks on tables
that they're modifying. As a result, you really need to plan them out a bit.
Anything could block a schema update, even a simple SELECT statement; so it's
important to check the health of things before starting. While it's not
_generally_ a good practice to interrupt autovacuum, it _can_ be done if the
schema upgrade is necessary. Keep in mind that it's just going to start back
up again, but hopefully your schema update will be done by then and it can do
its work without interfering with things. Another thing you can do is to
monitor the transaction ID values (the Nagios check_postgres script has a
nice mode for monitoring this) and manually launch a VACUUM FREEZE ahead of
autovacuum, so that _you_ can pick the time for it to run and not have it
happen to crop up at the worst possible time ;)
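
For example (a sketch; check_postgres does roughly this for you):

-- tables with the oldest relfrozenxid, i.e. closest to forcing an
-- anti-wraparound autovacuum
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 10;

-- freeze the worst offender at a time of your choosing
VACUUM FREEZE public.t_inodes;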

You might also find that things are easier to deal with if you tweak the
autovacuum settings on this table to cause it to be vacuumed more frequently.
In my experience, more frequent vacuums that do less work each time often
lead to happier databases. See ALTER TABLE and the available settings to
tweak autovacuum behavior.
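
Something along these lines (the numbers are illustrative starting points,
not tuned recommendations):

ALTER TABLE public.t_inodes SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum after ~1% of rows are dead
    autovacuum_vacuum_cost_delay   = 0      -- run this table's autovacuums unthrottled
);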

-- 
Bill Moran 




Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-19 Thread Jeff Janes
On Mon, Jun 19, 2017 at 10:33 AM, Dmitry O Litvintsev wrote:

> Hi
>
> Since I posted this, nothing has really changed. I am starting to panic
> (mildly).
>
> The source (production) runs:
>
>       relname       |       mode       | granted |                               substr                                |          query_start          |          age
> --------------------+------------------+---------+---------------------------------------------------------------------+-------------------------------+------------------------
>  t_inodes_iio_idx   | RowExclusiveLock | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
>


This is close to unreadable.  You can use \x to get output from psql
which survives email more readably.

Your first report was 6 days ago.  Why is the job only 4 days old?  Are you
frequently restarting your production server, so that the vacuum job never
gets a chance to finish?  If so, that would explain your predicament.

And how big is this table, that it takes at least 4 days to VACUUM?

> vacuum_cost_delay = 50ms

That is a lot.  The default value for this is 0.  The default value
for autovacuum_vacuum_cost_delay is 20, which is usually too high for giant
databases.

I think you are changing this in the wrong direction.  Rather than increase
vacuum_cost_delay, you need to decrease autovacuum_vacuum_cost_delay, so
that you won't keep having problems in the future.


On your test server, change vacuum_cost_delay to zero and then initiate a
manual vacuum of the table.  It will block on the autovacuum's lock, so
then kill the autovacuum (best to have the manual vacuum queued up first,
otherwise it will be a race between when you start the manual vacuum, and
when the autovacuum automatically restarts, to see who gets the lock). See
how long it takes this unthrottled vacuum to run, and how much effect the
IO it causes has on the performance of other tasks.  If acceptable, repeat
this on production (although really, I don't think you have much of a choice
on whether the effect is acceptable or not--it needs to be done.)
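
Roughly, as a sketch (two sessions; the LIKE pattern is illustrative):

-- session 1: unthrottle, then queue the manual vacuum; it will block
-- on the autovacuum's lock
SET vacuum_cost_delay = 0;
VACUUM VERBOSE public.t_inodes;

-- session 2: kill the autovacuum so the queued vacuum acquires the lock
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE query LIKE 'autovacuum: VACUUM%t_inodes%';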

Cheers,

Jeff


Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-19 Thread Alvaro Herrera
Dmitry O Litvintsev wrote:
> Hi
> 
> Since I posted this, nothing has really changed. I am starting to panic
> (mildly).

...

> vacuum_cost_delay = 50ms

Most likely, this value is far too high.  You're causing autovacuum to
sleep for a very long time with this setting.  Hard to say for certain
without seeing the cost_limit value and the other related parameters,
but it's most certainly what's causing you pain.  The default of 20ms is
already too high for most users.
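
To see the relevant throttling parameters in one place:

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('vacuum_cost_delay', 'vacuum_cost_limit',
               'autovacuum_vacuum_cost_delay', 'autovacuum_vacuum_cost_limit');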

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-19 Thread Dmitry O Litvintsev
Hi

Since I posted this, nothing has really changed. I am starting to panic
(mildly).

The source (production) runs:

       relname        |           mode           | granted |                               substr                                |          query_start          |          age
----------------------+--------------------------+---------+---------------------------------------------------------------------+-------------------------------+------------------------
 t_inodes_iio_idx     | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
 t_inodes_pkey        | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
                      | ExclusiveLock            | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
 t_inodes             | ShareUpdateExclusiveLock | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
 t_inodes_itype_idx   | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559
 t_inodes_imtime_idx  | RowExclusiveLock         | t       | autovacuum: VACUUM ANALYZE public.t_inodes (to prevent wraparound) | 2017-06-15 10:26:18.643209-05 | 4 days 01:58:56.697559

The above does not impact production activity a lot.

On the test stand (where I pg_basebackupped from production and also upgraded 
to 9.6) I see:

       relname        |           mode           | granted |                            substr                           |          query_start          |          age
----------------------+--------------------------+---------+-------------------------------------------------------------+-------------------------------+------------------------
 t_inodes             | ShareUpdateExclusiveLock | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
 t_inodes_itype_idx   | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
 t_inodes_imtime_idx  | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
 t_inodes_iio_idx     | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
 t_inodes_pkey        | RowExclusiveLock         | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
                      | ExclusiveLock            | t       | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 15:27:54.872154-05 | 5 days 20:59:22.769404
 t_inodes             | ShareUpdateExclusiveLock | f       | ANALYZE;                                                    | 2017-06-13 15:27:59.781285-05 | 5 days 20:59:17.860273
                      | ExclusiveLock            | t       | ANALYZE;                                                    | 2017-06-13 15:27:59.781285-05 | 5 days 20:59:17.860273


The test stand where I was to test the schema upgrade is stuck because vacuum
is blocking.

Production settings follow:

version 9.3.9

max_connections = 512
shared_buffers = 8192MB
temp_buffers = 1024MB
work_mem = 512MB
#maintenance_work_mem = 2048MB 
maintenance_work_mem = 4096MB #increased after 3 days of vacuum analyze running 
max_stack_depth = 2MB
vacuum_cost_delay = 50ms
synchronous_commit = off
wal_buffers = 245MB
wal_writer_delay = 10s
checkpoint_segments = 64
checkpoint_completion_target = 0.9
random_page_cost = 2.0
effective_cache_size = 94GB
wal_level = hot_standby
hot_standby = on
archive_mode = on
archive_command = '/usr/loca/bin/wal_backup.sh %p %f'
max_wal_senders = 4
wal_keep_segments = 1024
max_standby_streaming_delay = 7200s

So, the problem: I cannot do the schema change until the vacuum has finished,
and there seems to be no end in sight for the vacuum to finish, which is
throwing off our software upgrade plans.

Can anything be done here?

Thanks,
Dmitry


From: Andreas Kretschmer 
Sent: Tuesday, June 13, 2017 1:54 PM
To: pgsql-general@postgresql.org; Dmitry O Litvintsev; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum holds exclusive lock on table preventing it
from being updated
Re: [GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-13 Thread Andreas Kretschmer


On 13 June 2017 20:04:04 CEST, Dmitry O Litvintsev wrote:
> [...]
>  chimera | t_inodes |  | ShareUpdateExclusiveLock | t | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound) | 2017-06-13 12:31:04.870064-05 | 00:28:50.276437 | 40672
> [...]

It is an autovacuum to prevent wraparound; you can't stop or avoid that.

Regards, Andreas
-- 
This message was sent from my Android mobile phone with K-9 Mail.




[GENERAL] autovacuum holds exclusive lock on table preventing it from being updated

2017-06-13 Thread Dmitry O Litvintsev
Hi, 

I run PostgreSQL 9.3.17. I am preparing for a major database schema upgrade.

I copied the production database to a test system using pg_basebackup.

Having started the database and waited for all WALs to be applied, I
proceeded to run schema modifications.

Immediately I ran into an issue: updates on a table get stuck because
autovacuum is running on that table and holds an exclusive lock:

 datname |       relname       | transactionid |           mode           | granted | usename |                           substr                            |          query_start          |       age       |  pid
---------+---------------------+---------------+--------------------------+---------+---------+-------------------------------------------------------------+-------------------------------+-----------------+-------
 chimera | t_inodes_itype_idx  |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:25:22.285415 | 40672
 chimera | t_inodes_imtime_idx |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:25:22.285415 | 40672
 chimera | t_inodes_iio_idx    |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:25:22.285415 | 40672
 chimera | t_inodes_pkey       |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:25:22.285415 | 40672
 chimera |                     |               | ExclusiveLock            | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:25:22.285415 | 40672
 chimera | t_inodes            |               | ShareUpdateExclusiveLock | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)
 
If I kill autovacuum (by running SELECT pg_cancel_backend(PID)), I can get an
update going, but then another update gets stuck when autovacuum launches
again.

I tried to set autovacuum to off (together with track_counts) in the conf
file. After restart, autovacuum still runs!
chimera=# show autovacuum;
 autovacuum 
------------
 off
(1 row)

Checking activity:

chimera=# select pg_stat_activity.datname, pg_class.relname, pg_locks.transactionid,
                 pg_locks.mode, pg_locks.granted, pg_stat_activity.usename,
                 substr(pg_stat_activity.query, 1, 256),
                 pg_stat_activity.query_start,
                 age(now(), pg_stat_activity.query_start) as "age",
                 pg_stat_activity.pid
          from pg_stat_activity, pg_locks
          left outer join pg_class on (pg_locks.relation = pg_class.oid)
          where pg_locks.pid = pg_stat_activity.pid
          order by query_start;

shows autovacuum. It seems that setting it to off does not take effect.

 datname |       relname       | transactionid |           mode           | granted | usename |                           substr                            |          query_start          |       age       |  pid
---------+---------------------+---------------+--------------------------+---------+---------+-------------------------------------------------------------+-------------------------------+-----------------+-------
 chimera | t_inodes_itype_idx  |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:28:50.276437 | 40672
 chimera | t_inodes_imtime_idx |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:28:50.276437 | 40672
 chimera | t_inodes_iio_idx    |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:28:50.276437 | 40672
 chimera | t_inodes_pkey       |               | RowExclusiveLock         | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:28:50.276437 | 40672
 chimera |                     |               | ExclusiveLock            | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent wraparound)  | 2017-06-13 12:31:04.870064-05 | 00:28:50.276437 | 40672
 chimera | t_inodes            |               | ShareUpdateExclusiveLock | t       | enstore | autovacuum: VACUUM public.t_inodes (to prevent