Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-16 Thread Josh Berkus

Tom,

Neither database has and per-table autovacuum settings.

However, since this is a production database, I had to try something, 
and set vacuum_cost_limit up to 1000.  The issue with vacuuming one page 
at a time went away, or at least I have not seen it repeat in the last 
16 hours.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-16 Thread Alvaro Herrera
Josh Berkus wrote:
 Tom,
 
 Neither database has and per-table autovacuum settings.
 
 However, since this is a production database, I had to try
 something, and set vacuum_cost_limit up to 1000.  The issue with
 vacuuming one page at a time went away, or at least I have not seen
 it repeat in the last 16 hours.

How many autovac workers are there?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-16 Thread Josh Berkus



How many autovac workers are there?


Max_workers is set to 3.  However, I've never seen more than one active 
at a time.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-15 Thread Alvaro Herrera
Josh Berkus wrote:

 Basically, vacuuming of a table which normally takes about 20 minutes
 interactively with vacuum_cost_delay set to 20 had not completed after
 14 hours.  When I trussed it, I saw activity which indicated to me that
 autovacuum was doing a pollsys, presumably for cost_limit, every data page.
 
 Autovacuum was running with vacuum_cost_limit = 200 and
 autovacuum_vacuum_cost_delay = 20, which I believe is the default for 8.3.
 
 Truss output:
 
 pollsys(0xFD7FFFDF83E0, 0, 0xFD7FFFDF8470, 0x) = 0

So what is it polling?  Please try truss -v pollsys; is there a way in
Solaris to report what each file descriptor is pointing to?  (In linux
I'd look at /proc/pid/fd)

We don't call pollsys anywhere.  Something in Solaris must be doing it
under the hood.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-15 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 We don't call pollsys anywhere.  Something in Solaris must be doing it
 under the hood.

pg_usleep calls select(), and some googling indicates that select() is
implemented as pollsys() on recent Solaris versions.  So Josh's
assumption that those are delay calls seems plausible.  But it shouldn't
be sleeping after each page with normal cost_delay parameters, should it?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-15 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  We don't call pollsys anywhere.  Something in Solaris must be doing it
  under the hood.
 
 pg_usleep calls select(), and some googling indicates that select() is
 implemented as pollsys() on recent Solaris versions.  So Josh's
 assumption that those are delay calls seems plausible.  But it shouldn't
 be sleeping after each page with normal cost_delay parameters, should it?

Certainly not ... The only explanation would be that the cost balance
gets over the limit very frequently.  So one of the params would have to
be abnormally high (vacuum_cost_page_hit, vacuum_cost_page_miss,
vacuum_cost_page_dirty).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-15 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 But it shouldn't
 be sleeping after each page with normal cost_delay parameters, should it?

 Right, that's why I find this puzzling.  If the problem was easier to
 reproduce it would be easier to analyze.

The behavior would be explained if VacuumCostLimit were getting set to
zero (or some unreasonably small value) in the autovac worker process.
I looked at the autovac code that manages that, and it seems complicated
enough that a bug wouldn't surprise me in the least.

I especially note that wi_cost_limit is explicitly initialized to zero,
rather than something sane; and that table_recheck_autovac falls back to
setting vac_cost_limit from the previous value of VacuumCostLimit
... which is NOT constant but in general is left over from the
previously processed table.  One should also keep in mind that SIGHUP
processing might reload VacuumCostLimit from GUC values.  So I think
that area needs a closer look.

Josh, are you sure that both servers are identical in terms of both
GUC-related and per-table autovacuum settings?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Autovaccum with cost_delay does not complete on one solaris 5.10 machine

2010-04-15 Thread Josh Berkus

 Josh, are you sure that both servers are identical in terms of both
 GUC-related and per-table autovacuum settings?

I should check per-table.  GUC, yes, because the company has source
management for config files.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance