Re: [HACKERS] Why do we let autovacuum give up?

2014-01-24 Thread Robert Haas
On Thu, Jan 23, 2014 at 7:45 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2014-01-23 19:29:23 -0500, Tom Lane wrote:
>>> I concur with the other reports that the main problem in this test case is
>>> just that the default cost delay settings throttle autovacuum so hard that
>>> it has no chance of keeping up.  If I reduce autovacuum_vacuum_cost_delay
>>> from the default 20ms to 2ms, it seems to keep up quite nicely, on my
>>> machine anyway.  Probably other combinations of changes would do it too.
>
>>> Perhaps we need to back off the default cost delay settings a bit?
>>> We've certainly heard more than enough reports of table bloat in
>>> heavily-updated tables.  A system that wasn't hitting the updates as hard
>>> as it could might not need this, but on the other hand it probably
>>> wouldn't miss the I/O cycles from a more aggressive autovacuum, either.
>
>> Yes, I think adjusting the default makes sense, most setups that have
>> enough activity that costing plays a role have to greatly increase the
>> values. I'd rather increase the cost limit than reduce cost delay so
>> drastically though, but that's admittedly just gut feeling.
>
> Well, I didn't experiment with intermediate values, I was just trying
> to test the theory that autovac could keep up given less-extreme
> throttling.  I'm not taking any position on just where we need to set
> the values, only that what we've got is probably too extreme.

So, Greg Smith proposed what I think is a very useful methodology for
assessing settings in this area: figure out what it works out to in
MB/s.  If we assume we're going to read and dirty every page we
vacuum, and that this will take negligible time of itself so that the
work is dominated by the sleeps, the default settings work out to
200/(10 + 20) pages every 20ms, or 2.67MB/s.  Obviously, the rate will
be 3x higher if the pages don't need to be dirtied, and higher still
if they're all in cache, but considering the way the visibility map
works, it seems like a good bet that we WILL need to dirty most of the
pages that we look at - either they've got dead tuples and need
clean-up, or they don't and need to be marked all-visible.
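
As a rough check of that arithmetic, here is a minimal Python sketch. It assumes the stock settings quoted above (cost limit 200, page-miss cost 10, page-dirty cost 20, 20ms delay) and 8kB pages. The function name and parameters are made up for illustration, and the model ignores the time spent on the I/O itself.

```python
PAGE_SIZE_KB = 8  # assumed default PostgreSQL block size


def vacuum_rate_mb_per_s(cost_limit=200, cost_delay_ms=20,
                         page_miss=10, page_dirty=20, dirty=True):
    """Upper bound on vacuum throughput when the sleeps dominate."""
    # Cost charged for one page: a read miss, plus dirtying it if needed.
    cost_per_page = page_miss + (page_dirty if dirty else 0)
    # Pages processed before the cost limit forces a sleep.
    pages_per_round = cost_limit / cost_per_page
    # Number of sleep/work rounds per second.
    rounds_per_s = 1000 / cost_delay_ms
    return pages_per_round * rounds_per_s * PAGE_SIZE_KB / 1000


print(round(vacuum_rate_mb_per_s(), 2))             # read and dirty every page
print(round(vacuum_rate_mb_per_s(dirty=False), 2))  # read only, nothing dirtied
```

With these assumptions the first figure comes out at 2.67 and the second at 8.0, which is the 3x difference mentioned above.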

A corollary of this is that if you're dirtying heap pages faster than
a few megabytes per second, autovacuum, at least with default
settings, is not going to keep up.  And if you assume that each write
transaction dirties at least one heap page, any volume of write
transactions in excess of a few hundred per second will meet that
criterion.  Which is really not that much; a single core can do over
1000 tps with synchronous_commit=off, or if there's a BBWC that can
absorb it.
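
The "few hundred per second" figure can be sketched the same way. This is only a back-of-the-envelope model: it assumes the default cost settings, that every vacuumed page is both read and dirtied, and that each write transaction dirties a fixed number of distinct heap pages. The function name and the `pages_dirtied_per_txn` parameter are hypothetical.

```python
def max_sustainable_tps(cost_limit=200, cost_delay_ms=20,
                        page_miss=10, page_dirty=20,
                        pages_dirtied_per_txn=1):
    """Write rate above which heap pages are dirtied faster than a
    throttled vacuum can visit them (very rough model)."""
    # Pages per second the throttled vacuum can read and dirty.
    pages_per_s = cost_limit / (page_miss + page_dirty) * (1000 / cost_delay_ms)
    return pages_per_s / pages_dirtied_per_txn


print(round(max_sustainable_tps()))                          # one page per transaction
print(round(max_sustainable_tps(pages_dirtied_per_txn=2)))   # two pages per transaction
```

Under these assumptions the break-even point is roughly 333 transactions per second, and it halves if each transaction touches two pages.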

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why do we let autovacuum give up?

2014-01-24 Thread Claudio Freire
On Fri, Jan 24, 2014 at 2:44 PM, Josh Berkus  wrote:
> On 01/23/2014 07:22 PM, Alvaro Herrera wrote:
>>> If you ask me, I'd like autovac to know when not to run (or rather
>>> wait a bit, not forever), perhaps by checking load factors or some
>>> other tell-tale of an already-saturated I/O system.
>> We had a proposed design to tell autovac when not to run (or rather,
>> when to switch settings very high so that in practice it'd never run).
>> At some point somebody said "but we can just change autovacuum=off in
>> postgresql.conf via crontab when the high load period starts, and turn
>> it back on afterwards" --- and that was the end of it.
>
> Anything which depends on a timing-based feedback loop is going to be
> hopeless.  Saying "autovac shouldn't run if load is high" sounds like a
> simple statement, until you actually try to implement it.

Exactly.

But people tuning autovac down are doing exactly that: trying to tune
autovac to background-only work. They *must* then launch foreground
vacuums, at times they deem sensible, when doing that.

So the problem is not people tuning down autovacuum, but their
forgetting to vacuum explicitly after doing so.




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-24 Thread Josh Berkus
On 01/23/2014 07:22 PM, Alvaro Herrera wrote:
>> If you ask me, I'd like autovac to know when not to run (or rather
>> wait a bit, not forever), perhaps by checking load factors or some
>> other tell-tale of an already-saturated I/O system.
> We had a proposed design to tell autovac when not to run (or rather,
> when to switch settings very high so that in practice it'd never run).
> At some point somebody said "but we can just change autovacuum=off in
> postgresql.conf via crontab when the high load period starts, and turn
> it back on afterwards" --- and that was the end of it.

Anything which depends on a timing-based feedback loop is going to be
hopeless.  Saying "autovac shouldn't run if load is high" sounds like a
simple statement, until you actually try to implement it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Claudio Freire
On Fri, Jan 24, 2014 at 12:33 AM, Craig Ringer  wrote:
> On 01/24/2014 11:32 AM, Tom Lane wrote:
>> Alvaro Herrera  writes:
>>> Claudio Freire escribió:
>>>> If you ask me, I'd like autovac to know when not to run (or rather
>>>> wait a bit, not forever), perhaps by checking load factors or some
>>>> other tell-tale of an already-saturated I/O system.
>>
>>> We had a proposed design to tell autovac when not to run (or rather,
>>> when to switch settings very high so that in practice it'd never run).
>>> At some point somebody said "but we can just change autovacuum=off in
>>> postgresql.conf via crontab when the high load period starts, and turn
>>> it back on afterwards" --- and that was the end of it.
>>
>> The hard part of this is that shutting down autovacuum during heavy
>> load may be exactly the wrong thing to do.
>
> Yep. In fact, it may be appropriate to limit or stop autovacuum's work
> on some big tables, while pushing its activity even higher for small,
> high churn tables.
>
> If you stop autovacuum on a message-queue system when load gets high,
> you'll get a giant messy bloat explosion.

A message queue has a steady state and needs way more than autovacuum.
A table used as a message queue would need a wholly dedicated
autovacuum worker to be constantly vacuuming. It's certainly an
extreme example.

But normal tables are much bigger than their active set, so vacuuming,
which walks all those cold gigabytes, tends to wreak havoc with I/O
performance. Doing it in peak hours, which is autovacuum's preferred
time, is terrible. Delaying autovacuum for a while doesn't sound like
such a disastrous thing.

In essence, I'm talking about two thresholds. A "vacuum in the
background" threshold, and an "omfg this table is a mess vacuum now now
now" threshold. The background part is not quite straightforward
though. As in, what is background?
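
A minimal sketch of the two-threshold idea, purely as illustration: the function, the threshold values, and the `io_busy` flag are all hypothetical, and nothing like this exists in autovacuum. Deciding what `io_busy` should actually measure is the open "what is background?" question.

```python
def vacuum_decision(dead_fraction, io_busy,
                    background_threshold=0.2, urgent_threshold=0.5):
    """Pick an action for one table from its dead-tuple fraction.

    dead_fraction: dead tuples / total tuples (0.0 to 1.0)
    io_busy: True if the system is judged to be under heavy I/O load
    """
    # Past the urgent threshold the table is vacuumed regardless of load.
    if dead_fraction >= urgent_threshold:
        return "vacuum now"
    # Past the background threshold, vacuum only when the system is quiet.
    if dead_fraction >= background_threshold:
        return "defer" if io_busy else "vacuum in background"
    return "skip"


print(vacuum_decision(0.3, io_busy=True))   # moderately bloated, busy system
print(vacuum_decision(0.6, io_busy=True))   # badly bloated, busy system
```

The deferral is bounded by the urgent threshold, so a table is never postponed forever.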




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Alvaro Herrera
Craig Ringer escribió:
> On 01/24/2014 11:32 AM, Tom Lane wrote:

> > The hard part of this is that shutting down autovacuum during heavy
> > load may be exactly the wrong thing to do.
> 
> Yep. In fact, it may be appropriate to limit or stop autovacuum's work
> on some big tables, while pushing its activity even higher for small,
> high churn tables.
> 
> If you stop autovacuum on a message-queue system when load gets high,
> you'll get a giant messy bloat explosion.

The design we had was to have table groups, each with their own set of
custom parameters, and they would change depending on schedule.  You
could keep the queue tables in one group which would not change
parameters, and only change the rest.

But as I said, it was never fully implemented.  (We had a partial patch
from a GSoC project, IIRC.)  I don't have the cycles to implement it
now, anyway.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Craig Ringer
On 01/24/2014 11:32 AM, Tom Lane wrote:
> Alvaro Herrera  writes:
>> Claudio Freire escribió:
>>> If you ask me, I'd like autovac to know when not to run (or rather
>>> wait a bit, not forever), perhaps by checking load factors or some
>>> other tell-tale of an already-saturated I/O system.
> 
>> We had a proposed design to tell autovac when not to run (or rather,
>> when to switch settings very high so that in practice it'd never run).
>> At some point somebody said "but we can just change autovacuum=off in
>> postgresql.conf via crontab when the high load period starts, and turn
>> it back on afterwards" --- and that was the end of it.
> 
> The hard part of this is that shutting down autovacuum during heavy
> load may be exactly the wrong thing to do.

Yep. In fact, it may be appropriate to limit or stop autovacuum's work
on some big tables, while pushing its activity even higher for small,
high churn tables.

If you stop autovacuum on a message-queue system when load gets high,
you'll get a giant messy bloat explosion.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Tom Lane
Alvaro Herrera  writes:
> Claudio Freire escribió:
>> If you ask me, I'd like autovac to know when not to run (or rather
>> wait a bit, not forever), perhaps by checking load factors or some
>> other tell-tale of an already-saturated I/O system.

> We had a proposed design to tell autovac when not to run (or rather,
> when to switch settings very high so that in practice it'd never run).
> At some point somebody said "but we can just change autovacuum=off in
> postgresql.conf via crontab when the high load period starts, and turn
> it back on afterwards" --- and that was the end of it.

The hard part of this is that shutting down autovacuum during heavy
load may be exactly the wrong thing to do.

regards, tom lane




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Alvaro Herrera
Claudio Freire escribió:

> If you ask me, I'd like autovac to know when not to run (or rather
> wait a bit, not forever), perhaps by checking load factors or some
> other tell-tale of an already-saturated I/O system.

We had a proposed design to tell autovac when not to run (or rather,
when to switch settings very high so that in practice it'd never run).
At some point somebody said "but we can just change autovacuum=off in
postgresql.conf via crontab when the high load period starts, and turn
it back on afterwards" --- and that was the end of it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Claudio Freire
On Thu, Jan 23, 2014 at 10:38 PM, Craig Ringer  wrote:
>>>
>>> Stops excessive bloat - clearly autovacuum *is* able to vacuum pg_attribute
>>> in this case. Back to drawing board for a test case.
>>
>> Well, I think quite many people don't realize it might be necessary to
>> tune autovac on busy workloads. As it very well might be the case in
>> Josh's case.
>
> Oh, lots of people realise it's a good idea to tune autovac on busy
> workloads.
>
> They just do it in the wrong direction, making it run less often and
> less aggressively, causing more bloat, and making their problem worse.
>
> I've seen this enough times that I'm starting to think the autovacuum
> tuning knobs need a child safety lock ;-)
>
> More seriously, people don't understand autovacuum, how it works, or why
> they need it. They notice it when things are already bad, see that it's
> doing lots of work and doing lots of I/O that competes with queries, and
> turn it off to "solve" the problem.


AFAIK, tuning down autovacuum is common advice **when compounded with
manually scheduled vacuuming**.

The problem of autovacuum is that it always picks the wrong time to
work. That is, when the DB is the most active. Because statistically
that's when the thresholds are passed.
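
For reference, the per-table trigger being talked about is autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples, with defaults of 50 and 0.2. A minimal sketch of that check follows; the function names are made up for illustration, and the real worker has more inputs than this.

```python
def autovacuum_trigger_point(reltuples, base_threshold=50, scale_factor=0.2):
    """Number of dead tuples beyond which autovacuum considers the table."""
    return base_threshold + scale_factor * reltuples


def needs_vacuum(dead_tuples, reltuples, **settings):
    # The table qualifies once dead tuples exceed the trigger point.
    return dead_tuples > autovacuum_trigger_point(reltuples, **settings)


# A table of one million rows with the defaults:
print(autovacuum_trigger_point(1_000_000))
# The same table with a scale factor of 0.1:
print(autovacuum_trigger_point(1_000_000, scale_factor=0.1))
```

Since dead tuples pile up fastest while the table is being updated hardest, the trigger point tends to be crossed during the busy period.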

If you ask me, I'd like autovac to know when not to run (or rather
wait a bit, not forever), perhaps by checking load factors or some
other tell-tale of an already-saturated I/O system.




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Craig Ringer
On 01/24/2014 07:52 AM, Andres Freund wrote:
> On 2014-01-24 12:49:57 +1300, Mark Kirkwood wrote:
>> autovacuum_max_workers = 4
>> autovacuum_naptime = 10s
>> autovacuum_vacuum_scale_factor = 0.1
>> autovacuum_analyze_scale_factor = 0.1
>> autovacuum_vacuum_cost_delay = 0ms
>>
>> Stops excessive bloat - clearly autovacuum *is* able to vacuum pg_attribute
>> in this case. Back to drawing board for a test case.
> 
> Well, I think quite many people don't realize it might be necessary to
> tune autovac on busy workloads. As it very well might be the case in
> Josh's case.

Oh, lots of people realise it's a good idea to tune autovac on busy
workloads.

They just do it in the wrong direction, making it run less often and
less aggressively, causing more bloat, and making their problem worse.

I've seen this enough times that I'm starting to think the autovacuum
tuning knobs need a child safety lock ;-)

More seriously, people don't understand autovacuum, how it works, or why
they need it. They notice it when things are already bad, see that it's
doing lots of work and doing lots of I/O that competes with queries, and
turn it off to "solve" the problem.

I'm not sure how to tackle that.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Tom Lane
Andres Freund  writes:
> On 2014-01-23 19:29:23 -0500, Tom Lane wrote:
>> I concur with the other reports that the main problem in this test case is
>> just that the default cost delay settings throttle autovacuum so hard that
>> it has no chance of keeping up.  If I reduce autovacuum_vacuum_cost_delay
>> from the default 20ms to 2ms, it seems to keep up quite nicely, on my
>> machine anyway.  Probably other combinations of changes would do it too.

>> Perhaps we need to back off the default cost delay settings a bit?
>> We've certainly heard more than enough reports of table bloat in
>> heavily-updated tables.  A system that wasn't hitting the updates as hard
>> as it could might not need this, but on the other hand it probably
>> wouldn't miss the I/O cycles from a more aggressive autovacuum, either.

> Yes, I think adjusting the default makes sense, most setups that have
> enough activity that costing plays a role have to greatly increase the
> values. I'd rather increase the cost limit than reduce cost delay so
> drastically though, but that's admittedly just gut feeling.

Well, I didn't experiment with intermediate values, I was just trying
to test the theory that autovac could keep up given less-extreme
throttling.  I'm not taking any position on just where we need to set
the values, only that what we've got is probably too extreme.

regards, tom lane




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Andres Freund
On 2014-01-23 19:29:23 -0500, Tom Lane wrote:
> I saw at most two pages skipped per vacuum, and
> usually none; so there's no way that a whole lot of tuples are going
> unvacuumed because of this.  (I wonder though if we ought to add such
> counting as a permanent feature ...)

I generally think we need to work a bit on the data reported back by
vacuum. Adding data and also making the data output when using
autovacuum more consistent with what VACUUM VERBOSE reports. The latter
curiously often has less detail than autovacuum.
I had hoped to get to that for 9.4, but it doesn't look like it.


> I concur with the other reports that the main problem in this test case is
> just that the default cost delay settings throttle autovacuum so hard that
> it has no chance of keeping up.  If I reduce autovacuum_vacuum_cost_delay
> from the default 20ms to 2ms, it seems to keep up quite nicely, on my
> machine anyway.  Probably other combinations of changes would do it too.

> Perhaps we need to back off the default cost delay settings a bit?
> We've certainly heard more than enough reports of table bloat in
> heavily-updated tables.  A system that wasn't hitting the updates as hard
> as it could might not need this, but on the other hand it probably
> wouldn't miss the I/O cycles from a more aggressive autovacuum, either.

Yes, I think adjusting the default makes sense, most setups that have
enough activity that costing plays a role have to greatly increase the
values. I'd rather increase the cost limit than reduce cost delay so
drastically though, but that's admittedly just gut feeling.
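
To put the two options side by side, a small sketch under stated assumptions: every page costs 30 (a miss plus a dirty), and a cost limit of 2000 stands in for "increase the cost limit"; that value is only an illustration, not a proposal from this thread. The function name is hypothetical.

```python
def pages_per_second(cost_limit, cost_delay_ms, cost_per_page=30):
    """Average page rate of a throttled vacuum, ignoring I/O time."""
    return cost_limit / cost_per_page * (1000 / cost_delay_ms)


default = pages_per_second(cost_limit=200, cost_delay_ms=20)
shorter_delay = pages_per_second(cost_limit=200, cost_delay_ms=2)
higher_limit = pages_per_second(cost_limit=2000, cost_delay_ms=20)

print(f"default:       {default:.0f} pages/s")
print(f"delay 2ms:     {shorter_delay:.0f} pages/s")
print(f"limit 2000:    {higher_limit:.0f} pages/s")
# Pages processed between sleeps differ by 10x between the two options.
print(f"burst size:    {200 / 30:.1f} vs {2000 / 30:.1f} pages")
```

Both changes give the same tenfold average rate in this model. What differs is the shape of the work: a short delay means many small bursts, a high limit means fewer, larger bursts between 20ms sleeps.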

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Tom Lane
Andres Freund  writes:
> On 2014-01-23 16:15:50 -0500, Tom Lane wrote:
>> [ thinks... ]  It's possible that what you saw is not the
>> kick-out-autovacuum-entirely behavior, but the behavior added in commit
>> bbb6e559c, whereby vacuum (auto or regular) will skip over pages that it
>> can't immediately get an exclusive buffer lock on.  On a heavily used
>> table, we might skip the same page repeatedly, so that dead tuples don't
>> get cleaned for a long time.

> I don't think it's too likely as an explanation here. Such workloads are
> likely to fill a page with only dead tuples, right? Once all tuples are
> safely dead they will be killed from the btree which should cause the
> page not to be visited anymore and thus safely vacuumable.

I added some instrumentation to vacuumlazy.c to count the number of pages
skipped in this way.  You're right, it seems to be negligible, at least
with Mark's test case.  I saw at most two pages skipped per vacuum, and
usually none; so there's no way that a whole lot of tuples are going
unvacuumed because of this.  (I wonder though if we ought to add such
counting as a permanent feature ...)

I concur with the other reports that the main problem in this test case is
just that the default cost delay settings throttle autovacuum so hard that
it has no chance of keeping up.  If I reduce autovacuum_vacuum_cost_delay
from the default 20ms to 2ms, it seems to keep up quite nicely, on my
machine anyway.  Probably other combinations of changes would do it too.

Perhaps we need to back off the default cost delay settings a bit?
We've certainly heard more than enough reports of table bloat in
heavily-updated tables.  A system that wasn't hitting the updates as hard
as it could might not need this, but on the other hand it probably
wouldn't miss the I/O cycles from a more aggressive autovacuum, either.

regards, tom lane




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Andres Freund
On 2014-01-23 16:15:50 -0500, Tom Lane wrote:
> [ thinks... ]  It's possible that what you saw is not the
> kick-out-autovacuum-entirely behavior, but the behavior added in commit
> bbb6e559c, whereby vacuum (auto or regular) will skip over pages that it
> can't immediately get an exclusive buffer lock on.  On a heavily used
> table, we might skip the same page repeatedly, so that dead tuples don't
> get cleaned for a long time.

I don't think it's too likely as an explanation here. Such workloads are
likely to fill a page with only dead tuples, right? Once all tuples are
safely dead they will be killed from the btree which should cause the
page not to be visited anymore and thus safely vacuumable.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Andres Freund
On 2014-01-24 12:49:57 +1300, Mark Kirkwood wrote:
> autovacuum_max_workers = 4
> autovacuum_naptime = 10s
> autovacuum_vacuum_scale_factor = 0.1
> autovacuum_analyze_scale_factor = 0.1
> autovacuum_vacuum_cost_delay = 0ms
> 
> Stops excessive bloat - clearly autovacuum *is* able to vacuum pg_attribute
> in this case. Back to drawing board for a test case.

Well, I think quite many people don't realize it might be necessary to
tune autovac on busy workloads. As it very well might be the case in
Josh's case.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Mark Kirkwood

On 24/01/14 12:28, Mark Kirkwood wrote:
> On 24/01/14 12:13, Jeff Janes wrote:
>> On Thu, Jan 23, 2014 at 1:41 PM, Mark Kirkwood <
>> mark.kirkw...@catalyst.net.nz> wrote:
>>> On 24/01/14 10:16, Mark Kirkwood wrote:
>>>> On 24/01/14 10:09, Robert Haas wrote:
>>>>> On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote:
>>>>>> On 24/01/14 09:49, Tom Lane wrote:
>>>>>>> 2. What have you got that is requesting exclusive lock on
>>>>>>> pg_attribute?  That seems like a pretty unfriendly behavior
>>>>>>> in itself. regards, tom lane
>>>>>>
>>>>>> I've seen this sort of problem where every db session was busily
>>>>>> creating temporary tables. I never got to find out *why* they
>>>>>> needed to make so many, but it seemed like a bad idea.
>>>>>
>>>>> But... how does that result in a vacuum-incompatible lock request
>>>>> against pg_attribute?
>>>>>
>>>>> I see that it'll insert lots of rows into pg_attribute, and maybe
>>>>> later delete them, but none of that blocks vacuum.
>>>>
>>>> That was my thought too - if I see it happening again here (was a
>>>> year or so ago that I saw some serious pg_attribute bloat) I'll dig
>>>> deeper.
>>>
>>> Actually not much digging required. Running the attached script via
>>> pgbench (8 sessions) against a default configured postgres 8.4 sees
>>> pg_attribute get to 1G after about 15 minutes.
>>
>> At that rate, with default throttling, it will be a close race whether
>> autovac can vacuum pages as fast as they are being added.  Even if it
>> never gets cancelled, it might not ever finish.
>
> Yes - I should have set the cost delay to 0 first (checking that now).

Doing that (and a few other autovac tweaks):

autovacuum_max_workers = 4
autovacuum_naptime = 10s
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 0ms

Stops excessive bloat - clearly autovacuum *is* able to vacuum 
pg_attribute in this case. Back to drawing board for a test case.


Regards

Mark






Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Mark Kirkwood

On 24/01/14 12:13, Jeff Janes wrote:
> On Thu, Jan 23, 2014 at 1:41 PM, Mark Kirkwood <
> mark.kirkw...@catalyst.net.nz> wrote:
>> On 24/01/14 10:16, Mark Kirkwood wrote:
>>> On 24/01/14 10:09, Robert Haas wrote:
>>>> On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote:
>>>>> On 24/01/14 09:49, Tom Lane wrote:
>>>>>> 2. What have you got that is requesting exclusive lock on
>>>>>> pg_attribute?  That seems like a pretty unfriendly behavior
>>>>>> in itself. regards, tom lane
>>>>>
>>>>> I've seen this sort of problem where every db session was busily
>>>>> creating temporary tables. I never got to find out *why* they
>>>>> needed to make so many, but it seemed like a bad idea.
>>>>
>>>> But... how does that result in a vacuum-incompatible lock request
>>>> against pg_attribute?
>>>>
>>>> I see that it'll insert lots of rows into pg_attribute, and maybe
>>>> later delete them, but none of that blocks vacuum.
>>>
>>> That was my thought too - if I see it happening again here (was a
>>> year or so ago that I saw some serious pg_attribute bloat) I'll dig
>>> deeper.
>>
>> Actually not much digging required. Running the attached script via
>> pgbench (8 sessions) against a default configured postgres 8.4 sees
>> pg_attribute get to 1G after about 15 minutes.
>
> At that rate, with default throttling, it will be a close race whether
> autovac can vacuum pages as fast as they are being added.  Even if it
> never gets cancelled, it might not ever finish.

Yes - I should have set the cost delay to 0 first (checking that now).





Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Jeff Janes
On Thu, Jan 23, 2014 at 1:41 PM, Mark Kirkwood <
mark.kirkw...@catalyst.net.nz> wrote:

> On 24/01/14 10:16, Mark Kirkwood wrote:
>
>> On 24/01/14 10:09, Robert Haas wrote:
>>
>>> On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood
>>>  wrote:
>>>
>>>> On 24/01/14 09:49, Tom Lane wrote:
>>>>
>>>>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>>>>> That seems like a pretty unfriendly behavior in itself. regards, tom
>>>>> lane
>>>>>
>>>> I've seen this sort of problem where every db session was busily
>>>> creating temporary tables. I never got to find out *why* they needed
>>>> to make so many, but it seemed like a bad idea.
>>>>
>>> But... how does that result in a vacuum-incompatible lock request
>>> against pg_attribute?
>>>
>>> I see that it'll insert lots of rows into pg_attribute, and maybe
>>> later delete them, but none of that blocks vacuum.
>>>
>>>
>> That was my thought too - if I see it happening again here (was a year or
>> so ago that I saw some serious pg_attribute bloat) I'll dig deeper.
>>
>>
>>
> Actually not much digging required. Running the attached script via
> pgbench (8 sessions) against a default configured postgres 8.4 sees
> pg_attribute get to 1G after about 15 minutes.
>

At that rate, with default throttling, it will be a close race whether
autovac can vacuum pages as fast as they are being added.  Even if it never
gets cancelled, it might not ever finish.
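
A rough sketch of why this is a close race, under loud assumptions: "1G" is taken as 1024 MB, growth is treated as steady over the 15 minutes, and the vacuum ceiling uses the default cost settings with every page read and dirtied. The function names are hypothetical.

```python
def growth_rate_mb_per_s(size_mb, minutes):
    """Average table growth rate over the observed period."""
    return size_mb / (minutes * 60)


def vacuum_ceiling_mb_per_s(cost_limit=200, cost_delay_ms=20,
                            cost_per_page=30, page_kb=8):
    """Throttled vacuum throughput when sleeps dominate (8kB pages assumed)."""
    return cost_limit / cost_per_page * (1000 / cost_delay_ms) * page_kb / 1000


growth = growth_rate_mb_per_s(1024, 15)
ceiling = vacuum_ceiling_mb_per_s()
print(f"growth {growth:.2f} MB/s, ceiling {ceiling:.2f} MB/s, "
      f"headroom {ceiling / growth:.1f}x")
```

The headroom works out to only a little over 2x, and a vacuum pass also has to scan the table's indexes out of the same cost budget, so the margin in practice is thinner than that.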

Cheers,

Jeff


Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Josh Berkus
On 01/23/2014 02:55 PM, Josh Berkus wrote:
> On 01/23/2014 02:17 PM, Magnus Hagander wrote:
>> FWIW, I have a patch around somewhere that I never cleaned up properly for
>> submissions that simply added a counter to pg_stat_user_tables indicating
>> how many times vacuum had aborted on that specific table. If that's enough
>> info  (it was for my case) to cover this case, I can try to dig it out
>> again and clean it up...
> 
> It would be 100% more information than we currently have.  How much more
> difficult would it be to count completed autovacuums as well?  It's
> really the ratio of the two which matters ...

Actually, now that I think about it, the ratio of the two doesn't matter
as much as whether the most recent autovacuum aborted or not.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Josh Berkus
On 01/23/2014 02:17 PM, Magnus Hagander wrote:
> FWIW, I have a patch around somewhere that I never cleaned up properly for
> submissions that simply added a counter to pg_stat_user_tables indicating
> how many times vacuum had aborted on that specific table. If that's enough
> info  (it was for my case) to cover this case, I can try to dig it out
> again and clean it up...

It would be 100% more information than we currently have.  How much more
difficult would it be to count completed autovacuums as well?  It's
really the ratio of the two which matters ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Magnus Hagander
On Thu, Jan 23, 2014 at 10:00 PM, Harold Giménez  wrote:

> On Thu, Jan 23, 2014 at 12:53 PM, Josh Berkus  wrote:
> > On 01/23/2014 12:34 PM, Joshua D. Drake wrote:
> >>
> >> Hello,
> >>
> >> I have run into yet again another situation where there was an
> >> assumption that autovacuum was keeping up and it wasn't. It was caused
> >> by autovacuum quitting because another process requested a lock.
> >>
> >> In turn we received a ton of bloat on pg_attribute which caused all
> >> kinds of other issues (as can be expected).
> >>
> >> The more I run into it, the more it seems like autovacuum should behave
> >> like vacuum, in that it gets precedence when it is running. First come,
> >> first serve as they say.
> >>
> >> Thoughts?
> >
> > If we let autovacuum block user activity, a lot more people would turn
> > it off.
> >
> > Now, if you were to argue that we should have some way to monitor the
> > tables which autovac can never touch because of conflicts, I would agree
> > with you.
>
> Agree completely. Easy ways to monitor this would be great. Once you
> know there's a problem, tweaking autovacuum settings is very hard and
> misunderstood, and explaining how to be effective at it is a dark art
> too.
>

FWIW, I have a patch around somewhere that I never cleaned up properly for
submissions that simply added a counter to pg_stat_user_tables indicating
how many times vacuum had aborted on that specific table. If that's enough
info  (it was for my case) to cover this case, I can try to dig it out
again and clean it up...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Mark Kirkwood

On 24/01/14 10:16, Mark Kirkwood wrote:
> On 24/01/14 10:09, Robert Haas wrote:
>> On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote:
>>> On 24/01/14 09:49, Tom Lane wrote:
>>>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>>>> That seems like a pretty unfriendly behavior in itself.
>>>> regards, tom lane
>>>
>>> I've seen this sort of problem where every db session was busily creating
>>> temporary tables. I never got to find *why* they needed to make so many,
>>> but it seemed like a bad idea.
>>
>> But... how does that result in a vacuum-incompatible lock request
>> against pg_attribute?
>>
>> I see that it'll insert lots of rows into pg_attribute, and maybe
>> later delete them, but none of that blocks vacuum.
>
> That was my thought too - if I see it happening again here (was a year
> or so ago that I saw some serious pg_attribute bloat) I'll dig deeper.

Actually not much digging required. Running the attached script via 
pgbench (8 sessions) against a default configured postgres 8.4 sees 
pg_attribute get to 1G after about 15 minutes.


BEGIN;
DROP TABLE IF EXISTS tab0;
CREATE TEMP TABLE tab0 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab0  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab1;
CREATE TEMP TABLE tab1 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab1  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab2;
CREATE TEMP TABLE tab2 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab2  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab3;
CREATE TEMP TABLE tab3 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab3  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab4;
CREATE TEMP TABLE tab4 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab4  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab5;
CREATE TEMP TABLE tab5 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab5  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab6;
CREATE TEMP TABLE tab6 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab6  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab7;
CREATE TEMP TABLE tab7 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab7  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab8;
CREATE TEMP TABLE tab8 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab8  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab9;
CREATE TEMP TABLE tab9 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab9  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab10;
CREATE TEMP TABLE tab10 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab10  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab11;
CREATE TEMP TABLE tab11 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab11  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab12;
CREATE TEMP TABLE tab12 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab12  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab13;
CREATE TEMP TABLE tab13 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab13  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab14;
CREATE TEMP TABLE tab14 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab14  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab15;
CREATE TEMP TABLE tab15 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab15  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab16;
CREATE TEMP TABLE tab16 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab16  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab17;
CREATE TEMP TABLE tab17 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab17  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab18;
CREATE TEMP TABLE tab18 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab18  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab19;
CREATE TEMP TABLE tab19 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab19  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab20;
CREATE TEMP TABLE tab20 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab20  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab21;
CREATE TEMP TABLE tab21 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab21  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab22;
CREATE TEMP TABLE tab22 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab22  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab23;
CREATE TEMP TABLE tab23 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab23  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab24;
CREATE TEMP TABLE tab24 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab24  SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab25;
CREATE TEMP TABLE tab25 ( id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO tab25  SELECT generate_series(1,1000),'xxx
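
To watch the growth while the script runs, the catalog's size can be polled
from another session. A minimal sketch using the built-in size functions; how
often to poll is left open, and the column aliases are only illustrative.

```sql
-- Heap size alone, and heap plus indexes and TOAST, for pg_attribute
-- in the current database.
SELECT pg_size_pretty(pg_relation_size('pg_catalog.pg_attribute'::regclass))
           AS heap_size,
       pg_size_pretty(pg_total_relation_size('pg_catalog.pg_attribute'::regclass))
           AS total_size;
```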

Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Mark Kirkwood

On 24/01/14 10:09, Robert Haas wrote:
> On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote:
>> On 24/01/14 09:49, Tom Lane wrote:
>>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>>> That seems like a pretty unfriendly behavior in itself.
>>> regards, tom lane
>>
>> I've seen this sort of problem where every db session was busily creating
>> temporary tables. I never got to find *why* they needed to make so many,
>> but it seemed like a bad idea.
>
> But... how does that result in a vacuum-incompatible lock request
> against pg_attribute?
>
> I see that it'll insert lots of rows into pg_attribute, and maybe
> later delete them, but none of that blocks vacuum.

That was my thought too - if I see it happening again here (was a year
or so ago that I saw some serious pg_attribute bloat) I'll dig deeper.


regards

Mark




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Tom Lane
Mark Kirkwood  writes:
> On 24/01/14 09:49, Tom Lane wrote:
>> 2. What have you got that is requesting exclusive lock on 
>> pg_attribute? That seems like a pretty unfriendly behavior in itself. 

> I've seen this sort of problem where every db session was busily
> creating temporary tables. I never got to find *why* they needed to
> make so many, but it seemed like a bad idea.

That shouldn't result in any table-level exclusive locks on system
catalogs, though.

[ thinks... ]  It's possible that what you saw is not the
kick-out-autovacuum-entirely behavior, but the behavior added in commit
bbb6e559c, whereby vacuum (auto or regular) will skip over pages that it
can't immediately get an exclusive buffer lock on.  On a heavily used
table, we might skip the same page repeatedly, so that dead tuples don't
get cleaned for a long time.

To add insult to injury, despite having done that, vacuum would reset the
pgstats dead-tuple count to zero, thus postponing the next autovacuum.
I think commit 115f41412 may have improved the situation, but I'd want
to see some testing of this theory before I'd propose back-patching it.
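
One way to test the theory, sketched under the assumption that the contrib
module pgstattuple is installed in the database under test: compare the
dead-tuple count pgstats reports with what an actual scan of the table finds.
A large gap right after a vacuum would be consistent with pages having been
skipped while the counter was reset.

```sql
-- Assumes the pgstattuple contrib module is available in this database.
-- pgstattuple() reads the whole table, so expect it to be slow on a
-- badly bloated catalog.
SELECT s.n_dead_tup       AS stats_dead_tuples,   -- what pgstats believes
       t.dead_tuple_count AS actual_dead_tuples,  -- what a full scan finds
       s.last_autovacuum
FROM pg_stat_all_tables AS s,
     pgstattuple('pg_catalog.pg_attribute') AS t
WHERE s.schemaname = 'pg_catalog'
  AND s.relname = 'pg_attribute';
```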

regards, tom lane




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Pavel Stehule
On 23.1.2014 22:04, "Mark Kirkwood" wrote:
>
> On 24/01/14 09:49, Tom Lane wrote:
>>
>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>> That seems like a pretty unfriendly behavior in itself. regards, tom lane
>
>
> I've seen this sort of problem where every db session was busily creating
> temporary tables. I never got to find *why* they needed to make so
> many, but it seemed like a bad idea.
>

Our customer had the same problem with temp tables used intensively in
PL/pgSQL functions. Under higher load, temp tables are a performance and
stability killer. Vacuuming pg_attribute has very ugly impacts :(

After a redesign - without temp tables - his applications work well.

We need global temp tables.

Regards

Pavel

> Regards
>
> Mark


Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Joshua D. Drake


On 01/23/2014 01:03 PM, Mark Kirkwood wrote:
> On 24/01/14 09:49, Tom Lane wrote:
>> 2. What have you got that is requesting exclusive lock on
>> pg_attribute? That seems like a pretty unfriendly behavior in itself.
>> regards, tom lane
>
> I've seen this sort of problem where every db session was busily
> creating temporary tables. I never got to find *why* they needed to
> make so many, but it seemed like a bad idea.

Yep... that's the one. They are creating lots and lots of temp tables.

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Robert Haas
On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood
 wrote:
> On 24/01/14 09:49, Tom Lane wrote:
>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>> That seems like a pretty unfriendly behavior in itself. regards, tom lane
>
> I've seen this sort of problem where every db session was busily creating
> temporary tables. I never got to find *why* they needed to make so many,
> but it seemed like a bad idea.

But... how does that result in a vacuum-incompatible lock request
against pg_attribute?

I see that it'll insert lots of rows into pg_attribute, and maybe
later delete them, but none of that blocks vacuum.
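
For anyone who wants to check this on an affected system, a minimal sketch of
a query against pg_locks that lists who holds or is waiting for a table-level
lock on pg_attribute in the current database. Plain inserts and deletes take
RowExclusiveLock, which does not conflict with the ShareUpdateExclusiveLock
that vacuum takes; a row showing ShareLock, ShareRowExclusiveLock,
ExclusiveLock or AccessExclusiveLock would be the thing to chase.

```sql
-- Table-level locks on pg_attribute in the current database.
-- Waiters (granted = false) sort first.
SELECT l.pid,
       l.mode,
       l.granted
FROM pg_locks AS l
WHERE l.locktype = 'relation'
  AND l.relation = 'pg_catalog.pg_attribute'::regclass
  AND l.database = (SELECT oid
                    FROM pg_database
                    WHERE datname = current_database())
ORDER BY l.granted, l.pid;
```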

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Mark Kirkwood

On 24/01/14 09:49, Tom Lane wrote:
> 2. What have you got that is requesting exclusive lock on
> pg_attribute? That seems like a pretty unfriendly behavior in itself.
> regards, tom lane

I've seen this sort of problem where every db session was busily
creating temporary tables. I never got to find *why* they needed to
make so many, but it seemed like a bad idea.


Regards

Mark





Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Harold Giménez
On Thu, Jan 23, 2014 at 12:53 PM, Josh Berkus  wrote:
> On 01/23/2014 12:34 PM, Joshua D. Drake wrote:
>>
>> Hello,
>>
>> I have run into yet again another situation where there was an
>> assumption that autovacuum was keeping up and it wasn't. It was caused
>> by autovacuum quitting because another process requested a lock.
>>
>> In turn we received a ton of bloat on pg_attribute which caused all
>> kinds of other issues (as can be expected).
>>
>> The more I run into it, the more it seems like autovacuum should behave
>> like vacuum, in that it gets precedence when it is running. First come,
>> first serve as they say.
>>
>> Thoughts?
>
> If we let autovacuum block user activity, a lot more people would turn
> it off.
>
> Now, if you were to argue that we should have some way to monitor the
> tables which autovac can never touch because of conflicts, I would agree
> with you.

Agree completely. Easy ways to monitor this would be great. Once you
know there's a problem, tweaking autovacuum settings is very hard and
misunderstood, and explaining how to be effective at it is a dark art
too.

-Harold




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Josh Berkus
On 01/23/2014 12:34 PM, Joshua D. Drake wrote:
> 
> Hello,
> 
> I have run into yet again another situation where there was an
> assumption that autovacuum was keeping up and it wasn't. It was caused
> by autovacuum quitting because another process requested a lock.
> 
> In turn we received a ton of bloat on pg_attribute which caused all
> kinds of other issues (as can be expected).
> 
> The more I run into it, the more it seems like autovacuum should behave
> like vacuum, in that it gets precedence when it is running. First come,
> first serve as they say.
> 
> Thoughts?

If we let autovacuum block user activity, a lot more people would turn
it off.

Now, if you were to argue that we should have some way to monitor the
tables which autovac can never touch because of conflicts, I would agree
with you.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Why do we let autovacuum give up?

2014-01-23 Thread Tom Lane
"Joshua D. Drake"  writes:
> I have run into yet again another situation where there was an 
> assumption that autovacuum was keeping up and it wasn't. It was caused 
> by autovacuum quitting because another process requested a lock.

> In turn we received a ton of bloat on pg_attribute which caused all 
> kinds of other issues (as can be expected).

> The more I run into it, the more it seems like autovacuum should behave 
> like vacuum, in that it gets precedence when it is running. First come, 
> first serve as they say.

1. Back when it worked like that, things were worse.

2. What have you got that is requesting exclusive lock on pg_attribute?
That seems like a pretty unfriendly behavior in itself.

regards, tom lane

