Re: [HACKERS] Why do we let autovacuum give up?
On Thu, Jan 23, 2014 at 7:45 PM, Tom Lane wrote: > Andres Freund writes: >> On 2014-01-23 19:29:23 -0500, Tom Lane wrote: >>> I concur with the other reports that the main problem in this test case is >>> just that the default cost delay settings throttle autovacuum so hard that >>> it has no chance of keeping up. If I reduce autovacuum_vacuum_cost_delay >>> from the default 20ms to 2ms, it seems to keep up quite nicely, on my >>> machine anyway. Probably other combinations of changes would do it too. > >>> Perhaps we need to back off the default cost delay settings a bit? >>> We've certainly heard more than enough reports of table bloat in >>> heavily-updated tables. A system that wasn't hitting the updates as hard >>> as it could might not need this, but on the other hand it probably >>> wouldn't miss the I/O cycles from a more aggressive autovacuum, either. > >> Yes, I think adjusting the default makes sense, most setups that have >> enough activity that costing plays a role have to greatly increase the >> values. I'd rather increase the cost limit than reduce cost delay so >> drastically though, but that's admittedly just gut feeling. > > Well, I didn't experiment with intermediate values, I was just trying > to test the theory that autovac could keep up given less-extreme > throttling. I'm not taking any position on just where we need to set > the values, only that what we've got is probably too extreme. So, Greg Smith proposed what I think is a very useful methodology for assessing settings in this area: figure out what it works out to in MB/s. If we assume we're going to read and dirty every page we vacuum, and that this will take negligible time of itself so that the work is dominated by the sleeps, the default settings work out to 200/(10 + 20) pages every 20ms, or 2.67MB/s. 
Obviously, the rate will be 3x higher if the pages don't need to be dirtied, and higher still if they're all in cache, but considering the way the visibility map works, it seems like a good bet that we WILL need to dirty most of the pages that we look at - either they've got dead tuples and need clean-up, or they don't and need to be marked all-visible. A corollary of this is that if you're dirtying heap pages faster than a few megabytes per second, autovacuum, at least with default settings, is not going to keep up. And if you assume that each write transaction dirties at least one heap page, any volume of write transactions in excess of a few hundred per second will meet that criterion. Which is really not that much; a single core can do over 1000 tps with synchronous_commit=off, or if there's a BBWC that can absorb it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
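Robert's back-of-the-envelope throughput figure can be sketched in a few lines. This is an illustrative calculation only, assuming the stock cost parameters (vacuum_cost_limit = 200, vacuum_cost_page_miss = 10, vacuum_cost_page_dirty = 20, autovacuum_vacuum_cost_delay = 20 ms), the standard 8 kB block size, and that the sleeps dominate the actual page-processing time; the result lands near the quoted 2.67 MB/s, modulo MB-vs-MiB rounding:

```python
# Back-of-the-envelope autovacuum throughput under default cost-based throttling.
# Assumes every vacuumed page is both read (a "miss") and dirtied, and that
# page processing time is negligible next to the cost-delay sleeps.

COST_LIMIT = 200      # vacuum_cost_limit (default)
PAGE_MISS = 10        # vacuum_cost_page_miss (default)
PAGE_DIRTY = 20       # vacuum_cost_page_dirty (default)
COST_DELAY_S = 0.020  # autovacuum_vacuum_cost_delay (default, 20 ms)
PAGE_BYTES = 8192     # standard PostgreSQL block size

pages_per_round = COST_LIMIT / (PAGE_MISS + PAGE_DIRTY)  # pages between sleeps
pages_per_sec = pages_per_round / COST_DELAY_S           # ~333 pages/s
mb_per_sec = pages_per_sec * PAGE_BYTES / (1024 * 1024)  # ~2.6 MB/s

print(f"{pages_per_sec:.0f} pages/s ~= {mb_per_sec:.2f} MB/s")
```

Dropping the delay from 20 ms to 2 ms, as Tom tried, multiplies this ceiling by ten; raising the cost limit instead, as Andres prefers, scales it the same way.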
Re: [HACKERS] Why do we let autovacuum give up?
On Fri, Jan 24, 2014 at 2:44 PM, Josh Berkus wrote: > On 01/23/2014 07:22 PM, Alvaro Herrera wrote: >>> If you ask me, I'd like autovac to know when not to run (or rather >>> wait a bit, not forever), perhaps by checking load factors or some >>> other tell-tale of an already-saturated I/O system. >> We had a proposed design to tell autovac when not to run (or rather, >> when to switch settings very high so that in practice it'd never run). >> At some point somebody said "but we can just change autovacuum=off in >> postgresql.conf via crontab when the high load period starts, and turn >> it back on afterwards" --- and that was the end of it. > > Anything which depends on a timing-based feedback loop is going to be > hopeless. Saying "autovac shouldn't run if load is high" sounds like a > simple statement, until you actually try to implement it. Exactly. But people tuning autovac down are doing exactly that: trying to tune autovac to background-only work. They *must* then launch foreground vacuums, at times they deem sensible, when doing that. So the problem is not people tuning down autovacuum, but them forgetting to vacuum explicitly after doing so.
Re: [HACKERS] Why do we let autovacuum give up?
On 01/23/2014 07:22 PM, Alvaro Herrera wrote: >> If you ask me, I'd like autovac to know when not to run (or rather >> > wait a bit, not forever), perhaps by checking load factors or some >> > other tell-tale of an already-saturated I/O system. > We had a proposed design to tell autovac when not to run (or rather, > when to switch settings very high so that in practice it'd never run). > At some point somebody said "but we can just change autovacuum=off in > postgresql.conf via crontab when the high load period starts, and turn > it back on afterwards" --- and that was the end of it. Anything which depends on a timing-based feedback loop is going to be hopeless. Saying "autovac shouldn't run if load is high" sounds like a simple statement, until you actually try to implement it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On Fri, Jan 24, 2014 at 12:33 AM, Craig Ringer wrote: > On 01/24/2014 11:32 AM, Tom Lane wrote: >> Alvaro Herrera writes: >>> Claudio Freire escribió: If you ask me, I'd like autovac to know when not to run (or rather wait a bit, not forever), perhaps by checking load factors or some other tell-tale of an already-saturated I/O system. >> >>> We had a proposed design to tell autovac when not to run (or rather, >>> when to switch settings very high so that in practice it'd never run). >>> At some point somebody said "but we can just change autovacuum=off in >>> postgresql.conf via crontab when the high load period starts, and turn >>> it back on afterwards" --- and that was the end of it. >> >> The hard part of this is that shutting down autovacuum during heavy >> load may be exactly the wrong thing to do. > > Yep. In fact, it may be appropriate to limit or stop autovacuum's work > on some big tables, while pushing its activity even higher for small, > high churn tables. > > If you stop autovacuum on a message-queue system when load gets high, > you'll get a giant messy bloat explosion. A message queue has a steady state and needs way more than autovacuum. A table used as a message queue would need a wholly dedicated autovacuum worker to be constantly vacuuming. It's certainly an extreme example. But normal tables are much bigger than their active set, so vacuuming, which walks all those cold gigabytes, tends to wreak havoc with I/O performance. Doing it in peak hours, which is autovacuum's preferred time, is terrible. Delaying autovacuum for a while doesn't sound like such a disastrous thing. In essence, I'm talking about two thresholds. A "vacuum in the background" threshold, and an "omfg this table is a mess vacuum now now now" threshold. The background part is not quite straightforward though. As in, what is background?
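Claudio's two-threshold idea can be illustrated with a toy policy function. This is purely hypothetical: the function name, the threshold values, and the notion of an "I/O busy" fraction are all invented for illustration and correspond to nothing in PostgreSQL:

```python
# Hypothetical two-threshold vacuum trigger: below the low-water mark do
# nothing; between the marks vacuum only when I/O looks idle (and otherwise
# defer, not cancel); above the high-water mark vacuum regardless of load.

def vacuum_decision(dead_frac, io_busy, low=0.10, high=0.40, busy_cutoff=0.8):
    """dead_frac: fraction of dead tuples in the table (0..1).
    io_busy: current I/O utilization estimate (0..1)."""
    if dead_frac >= high:
        return "vacuum-now"        # table is a mess: ignore system load
    if dead_frac >= low:
        # background territory: yield to a saturated I/O system, but only
        # for a while -- a real policy would cap how long it defers
        return "vacuum-background" if io_busy < busy_cutoff else "defer"
    return "skip"

print(vacuum_decision(0.05, 0.2))  # skip
print(vacuum_decision(0.20, 0.9))  # defer
print(vacuum_decision(0.50, 0.9))  # vacuum-now
```

The hard part, as the thread notes, is defining "background" and "busy" in a way that doesn't turn into the hopeless timing-based feedback loop Josh warns about.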
Re: [HACKERS] Why do we let autovacuum give up?
Craig Ringer escribió: > On 01/24/2014 11:32 AM, Tom Lane wrote: > > The hard part of this is that shutting down autovacuum during heavy > > load may be exactly the wrong thing to do. > > Yep. In fact, it may be appropriate to limit or stop autovacuum's work > on some big tables, while pushing its activity even higher for small, > high churn tables. > > If you stop autovacuum on a message-queue system when load gets high, > you'll get a giant messy bloat explosion. The design we had was to have table groups, each with their own set of custom parameters, and they would change depending on schedule. You could keep the queue tables in one group which would not change parameters, and only change the rest. But as I said, it was never fully implemented. (We had a partial patch from a GSoC project, IIRC.) I don't have the cycles to implement it now, anyway. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Why do we let autovacuum give up?
On 01/24/2014 11:32 AM, Tom Lane wrote: > Alvaro Herrera writes: >> Claudio Freire escribió: >>> If you ask me, I'd like autovac to know when not to run (or rather >>> wait a bit, not forever), perhaps by checking load factors or some >>> other tell-tale of an already-saturated I/O system. > >> We had a proposed design to tell autovac when not to run (or rather, >> when to switch settings very high so that in practice it'd never run). >> At some point somebody said "but we can just change autovacuum=off in >> postgresql.conf via crontab when the high load period starts, and turn >> it back on afterwards" --- and that was the end of it. > > The hard part of this is that shutting down autovacuum during heavy > load may be exactly the wrong thing to do. Yep. In fact, it may be appropriate to limit or stop autovacuum's work on some big tables, while pushing its activity even higher for small, high churn tables. If you stop autovacuum on a message-queue system when load gets high, you'll get a giant messy bloat explosion. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
Alvaro Herrera writes: > Claudio Freire escribió: >> If you ask me, I'd like autovac to know when not to run (or rather >> wait a bit, not forever), perhaps by checking load factors or some >> other tell-tale of an already-saturated I/O system. > We had a proposed design to tell autovac when not to run (or rather, > when to switch settings very high so that in practice it'd never run). > At some point somebody said "but we can just change autovacuum=off in > postgresql.conf via crontab when the high load period starts, and turn > it back on afterwards" --- and that was the end of it. The hard part of this is that shutting down autovacuum during heavy load may be exactly the wrong thing to do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
Claudio Freire escribió: > If you ask me, I'd like autovac to know when not to run (or rather > wait a bit, not forever), perhaps by checking load factors or some > other tell-tale of an already-saturated I/O system. We had a proposed design to tell autovac when not to run (or rather, when to switch settings very high so that in practice it'd never run). At some point somebody said "but we can just change autovacuum=off in postgresql.conf via crontab when the high load period starts, and turn it back on afterwards" --- and that was the end of it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Why do we let autovacuum give up?
On Thu, Jan 23, 2014 at 10:38 PM, Craig Ringer wrote: >>> >>> Stops excessive bloat - clearly autovacuum *is* able to vacuum pg_attribute >>> in this case. Back to drawing board for a test case. >> >> Well, I think quite many people don't realize it might be necessary to >> tune autovac on busy workloads. As it very well might be the case in >> Josh's case. > > Oh, lots of people realise it's a good idea to tune autovac on busy > workloads. > > They just do it in the wrong direction, making it run less often and > less aggressively, causing more bloat, and making their problem worse. > > I've seen this enough times that I'm starting to think the autovauum > tuning knobs need a child safety lock ;-) > > More seriously, people don't understand autovacuum, how it works, or why > they need it. They notice it when things are already bad, see that it's > doing lots of work and doing lots of I/O that competes with queries, and > turn it off to "solve" the problem. AFAIK, tuning down autovacuum is common advice **when compounded with manually scheduled vacuuming**. The problem of autovacuum is that it always picks the wrong time to work. That is, when the DB is the most active. Because statistically that's when the thresholds are passed. If you ask me, I'd like autovac to know when not to run (or rather wait a bit, not forever), perhaps by checking load factors or some other tell-tale of an already-saturated I/O system. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On 01/24/2014 07:52 AM, Andres Freund wrote: > On 2014-01-24 12:49:57 +1300, Mark Kirkwood wrote: >> autovacuum_max_workers = 4 >> autovacuum_naptime = 10s >> autovacuum_vacuum_scale_factor = 0.1 >> autovacuum_analyze_scale_factor = 0.1 >> autovacuum_vacuum_cost_delay = 0ms >> >> Stops excessive bloat - clearly autovacuum *is* able to vacuum pg_attribute >> in this case. Back to drawing board for a test case. > > Well, I think quite many people don't realize it might be necessary to > tune autovac on busy workloads. As it very well might be the case in > Josh's case. Oh, lots of people realise it's a good idea to tune autovac on busy workloads. They just do it in the wrong direction, making it run less often and less aggressively, causing more bloat, and making their problem worse. I've seen this enough times that I'm starting to think the autovacuum tuning knobs need a child safety lock ;-) More seriously, people don't understand autovacuum, how it works, or why they need it. They notice it when things are already bad, see that it's doing lots of work and doing lots of I/O that competes with queries, and turn it off to "solve" the problem. I'm not sure how to tackle that. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Why do we let autovacuum give up?
Andres Freund writes: > On 2014-01-23 19:29:23 -0500, Tom Lane wrote: >> I concur with the other reports that the main problem in this test case is >> just that the default cost delay settings throttle autovacuum so hard that >> it has no chance of keeping up. If I reduce autovacuum_vacuum_cost_delay >> from the default 20ms to 2ms, it seems to keep up quite nicely, on my >> machine anyway. Probably other combinations of changes would do it too. >> Perhaps we need to back off the default cost delay settings a bit? >> We've certainly heard more than enough reports of table bloat in >> heavily-updated tables. A system that wasn't hitting the updates as hard >> as it could might not need this, but on the other hand it probably >> wouldn't miss the I/O cycles from a more aggressive autovacuum, either. > Yes, I think adjusting the default makes sense, most setups that have > enough activity that costing plays a role have to greatly increase the > values. I'd rather increase the cost limit than reduce cost delay so > drastically though, but that's admittedly just gut feeling. Well, I didn't experiment with intermediate values, I was just trying to test the theory that autovac could keep up given less-extreme throttling. I'm not taking any position on just where we need to set the values, only that what we've got is probably too extreme. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On 2014-01-23 19:29:23 -0500, Tom Lane wrote: > I saw at most two pages skipped per vacuum, and > usually none; so there's no way that a whole lot of tuples are going > unvacuumed because of this. (I wonder though if we ought to add such > counting as a permanent feature ...) I generally think we need to work a bit on the data reported back by vacuum. Adding data and also making the data output when using autovacuum more consistent with what VACUUM VERBOSE reports. The latter curiously often has less detail than autovacuum. I had hoped to get to that for 9.4, but it doesn't look like it. > I concur with the other reports that the main problem in this test case is > just that the default cost delay settings throttle autovacuum so hard that > it has no chance of keeping up. If I reduce autovacuum_vacuum_cost_delay > from the default 20ms to 2ms, it seems to keep up quite nicely, on my > machine anyway. Probably other combinations of changes would do it too. > Perhaps we need to back off the default cost delay settings a bit? > We've certainly heard more than enough reports of table bloat in > heavily-updated tables. A system that wasn't hitting the updates as hard > as it could might not need this, but on the other hand it probably > wouldn't miss the I/O cycles from a more aggressive autovacuum, either. Yes, I think adjusting the default makes sense, most setups that have enough activity that costing plays a role have to greatly increase the values. I'd rather increase the cost limit than reduce cost delay so drastically though, but that's admittedly just gut feeling. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
Andres Freund writes: > On 2014-01-23 16:15:50 -0500, Tom Lane wrote: >> [ thinks... ] It's possible that what you saw is not the >> kick-out-autovacuum-entirely behavior, but the behavior added in commit >> bbb6e559c, whereby vacuum (auto or regular) will skip over pages that it >> can't immediately get an exclusive buffer lock on. On a heavily used >> table, we might skip the same page repeatedly, so that dead tuples don't >> get cleaned for a long time. > I don't think it's too likely as an explanation here. Such workloads are > likely to fill a page with only dead tuples, right? Once all tuples are > safely dead they will be killed from the btree which should cause the > page not to be visited anymore and thus safely vacuumable. I added some instrumentation to vacuumlazy.c to count the number of pages skipped in this way. You're right, it seems to be negligible, at least with Mark's test case. I saw at most two pages skipped per vacuum, and usually none; so there's no way that a whole lot of tuples are going unvacuumed because of this. (I wonder though if we ought to add such counting as a permanent feature ...) I concur with the other reports that the main problem in this test case is just that the default cost delay settings throttle autovacuum so hard that it has no chance of keeping up. If I reduce autovacuum_vacuum_cost_delay from the default 20ms to 2ms, it seems to keep up quite nicely, on my machine anyway. Probably other combinations of changes would do it too. Perhaps we need to back off the default cost delay settings a bit? We've certainly heard more than enough reports of table bloat in heavily-updated tables. A system that wasn't hitting the updates as hard as it could might not need this, but on the other hand it probably wouldn't miss the I/O cycles from a more aggressive autovacuum, either. 
regards, tom lane
Re: [HACKERS] Why do we let autovacuum give up?
On 2014-01-23 16:15:50 -0500, Tom Lane wrote: > [ thinks... ] It's possible that what you saw is not the > kick-out-autovacuum-entirely behavior, but the behavior added in commit > bbb6e559c, whereby vacuum (auto or regular) will skip over pages that it > can't immediately get an exclusive buffer lock on. On a heavily used > table, we might skip the same page repeatedly, so that dead tuples don't > get cleaned for a long time. I don't think it's too likely as an explanation here. Such workloads are likely to fill a page with only dead tuples, right? Once all tuples are safely dead they will be killed from the btree which should cause the page not to be visited anymore and thus safely vacuumable. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On 2014-01-24 12:49:57 +1300, Mark Kirkwood wrote: > autovacuum_max_workers = 4 > autovacuum_naptime = 10s > autovacuum_vacuum_scale_factor = 0.1 > autovacuum_analyze_scale_factor = 0.1 > autovacuum_vacuum_cost_delay = 0ms > > Stops excessive bloat - clearly autovacuum *is* able to vacuum pg_attribute > in this case. Back to drawing board for a test case. Well, I think quite many people don't realize it might be necessary to tune autovac on busy workloads. As it very well might be the case in Josh's case. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On 24/01/14 12:28, Mark Kirkwood wrote: On 24/01/14 12:13, Jeff Janes wrote: On Thu, Jan 23, 2014 at 1:41 PM, Mark Kirkwood < mark.kirkw...@catalyst.net.nz> wrote: On 24/01/14 10:16, Mark Kirkwood wrote: On 24/01/14 10:09, Robert Haas wrote: On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote: On 24/01/14 09:49, Tom Lane wrote: 2. What have you got that is requesting exclusive lock on pg_attribute? That seems like a pretty unfriendly behavior in itself. regards, tom lane I've seen this sort of problem where every db session was busily creating temporary tables. I never got to the find *why* they needed to make so many, but it seemed like a bad idea. But... how does that result on a vacuum-incompatible lock request against pg_attribute? I see that it'll insert lots of rows into pg_attribute, and maybe later delete them, but none of that blocks vacuum. That was my thought too - if I see it happening again here (was a year or so ago that I saw some serious pg_attribute bloat) I'll dig deeper. Actually not much digging required. Running the attached script via pgbench (8 sessions) against a default configured postgres 8.4 sees pg_attribute get to 1G after about 15 minutes. At that rate, with default throttling, it will be a close race whether autovac can vacuum pages as fast as they are being added. Even if it never gets cancelled, it might not ever finish. Yes - I should have set the cost delay to 0 first (checking that now). Doing that (and a few other autovac tweaks): autovacuum_max_workers = 4 autovacuum_naptime = 10s autovacuum_vacuum_scale_factor = 0.1 autovacuum_analyze_scale_factor = 0.1 autovacuum_vacuum_cost_delay = 0ms Stops excessive bloat - clearly autovacuum *is* able to vacuum pg_attribute in this case. Back to drawing board for a test case. Regards Mark -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On 24/01/14 12:13, Jeff Janes wrote: On Thu, Jan 23, 2014 at 1:41 PM, Mark Kirkwood < mark.kirkw...@catalyst.net.nz> wrote: On 24/01/14 10:16, Mark Kirkwood wrote: On 24/01/14 10:09, Robert Haas wrote: On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote: On 24/01/14 09:49, Tom Lane wrote: 2. What have you got that is requesting exclusive lock on pg_attribute? That seems like a pretty unfriendly behavior in itself. regards, tom lane I've seen this sort of problem where every db session was busily creating temporary tables. I never got to the find *why* they needed to make so many, but it seemed like a bad idea. But... how does that result on a vacuum-incompatible lock request against pg_attribute? I see that it'll insert lots of rows into pg_attribute, and maybe later delete them, but none of that blocks vacuum. That was my thought too - if I see it happening again here (was a year or so ago that I saw some serious pg_attribute bloat) I'll dig deeper. Actually not much digging required. Running the attached script via pgbench (8 sessions) against a default configured postgres 8.4 sees pg_attribute get to 1G after about 15 minutes. At that rate, with default throttling, it will be a close race whether autovac can vacuum pages as fast as they are being added. Even if it never gets cancelled, it might not ever finish. Yes - I should have set the cost delay to 0 first (checking that now). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On Thu, Jan 23, 2014 at 1:41 PM, Mark Kirkwood < mark.kirkw...@catalyst.net.nz> wrote: > On 24/01/14 10:16, Mark Kirkwood wrote: > >> On 24/01/14 10:09, Robert Haas wrote: >> >>> On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood >>> wrote: >>> On 24/01/14 09:49, Tom Lane wrote: > 2. What have you got that is requesting exclusive lock on pg_attribute? > That seems like a pretty unfriendly behavior in itself. regards, tom > lane > I've seen this sort of problem where every db session was busily creating temporary tables. I never got to the find *why* they needed to make so many, but it seemed like a bad idea. >>> But... how does that result on a vacuum-incompatible lock request >>> against pg_attribute? >>> >>> I see that it'll insert lots of rows into pg_attribute, and maybe >>> later delete them, but none of that blocks vacuum. >>> >>> >> That was my thought too - if I see it happening again here (was a year or >> so ago that I saw some serious pg_attribute bloat) I'll dig deeper. >> >> >> > Actually not much digging required. Running the attached script via > pgbench (8 sessions) against a default configured postgres 8.4 sees > pg_attribute get to 1G after about 15 minutes. > At that rate, with default throttling, it will be a close race whether autovac can vacuum pages as fast as they are being added. Even if it never gets cancelled, it might not ever finish. Cheers, Jeff
Re: [HACKERS] Why do we let autovacuum give up?
On 01/23/2014 02:55 PM, Josh Berkus wrote: > On 01/23/2014 02:17 PM, Magnus Hagander wrote: >> FWIW, I have a patch around somewhere that I never cleaned up properly for >> submissions that simply added a counter to pg_stat_user_tables indicating >> how many times vacuum had aborted on that specific table. If that's enough >> info (it was for my case) to cover this case, I can try to dig it out >> again and clean it up... > > It would be 100% more information than we currently have. How much more > difficult would it be to count completed autovacuums as well? It's > really the ratio of the two which matters ... Actually, now that I think about it, the ratio of the two doesn't matter as much as whether the most recent autovacuum aborted or not. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On 01/23/2014 02:17 PM, Magnus Hagander wrote: > FWIW, I have a patch around somewhere that I never cleaned up properly for > submissions that simply added a counter to pg_stat_user_tables indicating > how many times vacuum had aborted on that specific table. If that's enough > info (it was for my case) to cover this case, I can try to dig it out > again and clean it up... It would be 100% more information than we currently have. How much more difficult would it be to count completed autovacuums as well? It's really the ratio of the two which matters ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we let autovacuum give up?
On Thu, Jan 23, 2014 at 10:00 PM, Harold Giménez wrote: > On Thu, Jan 23, 2014 at 12:53 PM, Josh Berkus wrote: > > On 01/23/2014 12:34 PM, Joshua D. Drake wrote: > >> > >> Hello, > >> > >> I have run into yet again another situation where there was an > >> assumption that autovacuum was keeping up and it wasn't. It was caused > >> by autovacuum quitting because another process requested a lock. > >> > >> In turn we received a ton of bloat on pg_attribute which caused all > >> kinds of other issues (as can be expected). > >> > >> The more I run into it, the more it seems like autovacuum should behave > >> like vacuum, in that it gets precedence when it is running. First come, > >> first serve as they say. > >> > >> Thoughts? > > > > If we let autovacuum block user activity, a lot more people would turn > > it off. > > > > Now, if you were to argue that we should have some way to monitor the > > tables which autovac can never touch because of conflicts, I would agree > > with you. > > Agree completely. Easy ways to monitor this would be great. Once you > know there's a problem, tweaking autovacuum settings is very hard and > misunderstood, and explaining how to be effective at it is a dark art > too. > FWIW, I have a patch around somewhere that I never cleaned up properly for submissions that simply added a counter to pg_stat_user_tables indicating how many times vacuum had aborted on that specific table. If that's enough info (it was for my case) to cover this case, I can try to dig it out again and clean it up... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Why do we let autovacuum give up?
On 24/01/14 10:16, Mark Kirkwood wrote: On 24/01/14 10:09, Robert Haas wrote: On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote: On 24/01/14 09:49, Tom Lane wrote: 2. What have you got that is requesting exclusive lock on pg_attribute? That seems like a pretty unfriendly behavior in itself. regards, tom lane I've seen this sort of problem where every db session was busily creating temporary tables. I never got to the find *why* they needed to make so many, but it seemed like a bad idea. But... how does that result on a vacuum-incompatible lock request against pg_attribute? I see that it'll insert lots of rows into pg_attribute, and maybe later delete them, but none of that blocks vacuum. That was my thought too - if I see it happening again here (was a year or so ago that I saw some serious pg_attribute bloat) I'll dig deeper. Actually not much digging required. Running the attached script via pgbench (8 sessions) against a default configured postgres 8.4 sees pg_attribute get to 1G after about 15 minutes. 
BEGIN;
DROP TABLE IF EXISTS tab0; CREATE TEMP TABLE tab0 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab0 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab1; CREATE TEMP TABLE tab1 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab1 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab2; CREATE TEMP TABLE tab2 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab2 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab3; CREATE TEMP TABLE tab3 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab3 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab4; CREATE TEMP TABLE tab4 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab4 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab5; CREATE TEMP TABLE tab5 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab5 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab6; CREATE TEMP TABLE tab6 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab6 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab7; CREATE TEMP TABLE tab7 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab7 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab8; CREATE TEMP TABLE tab8 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab8 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab9; CREATE TEMP TABLE tab9 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab9 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab10; CREATE TEMP TABLE tab10 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab10 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab11; CREATE TEMP TABLE tab11 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab11 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab12; CREATE TEMP TABLE tab12 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab12 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab13; CREATE TEMP TABLE tab13 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab13 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab14; CREATE TEMP TABLE tab14 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab14 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab15; CREATE TEMP TABLE tab15 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab15 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab16; CREATE TEMP TABLE tab16 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab16 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab17; CREATE TEMP TABLE tab17 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab17 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab18; CREATE TEMP TABLE tab18 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab18 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab19; CREATE TEMP TABLE tab19 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab19 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab20; CREATE TEMP TABLE tab20 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab20 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab21; CREATE TEMP TABLE tab21 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab21 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab22; CREATE TEMP TABLE tab22 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab22 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab23; CREATE TEMP TABLE tab23 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab23 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab24; CREATE TEMP TABLE tab24 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab24 SELECT generate_series(1,1000),'xx';
DROP TABLE IF EXISTS tab25; CREATE TEMP TABLE tab25 ( id INTEGER PRIMARY KEY, val TEXT); INSERT INTO tab25 SELECT generate_series(1,1000),'xx';
COMMIT;
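While the script runs, the resulting catalog churn can be watched with a couple of stock queries (a sketch against a live server; `pg_stat_sys_tables` and `pg_total_relation_size` are available in 8.4 and later):

```sql
-- Size of pg_attribute including its indexes and TOAST table
SELECT pg_size_pretty(pg_total_relation_size('pg_catalog.pg_attribute'));

-- Insert/delete churn and dead-tuple count for pg_attribute
SELECT n_tup_ins, n_tup_del, n_dead_tup, last_autovacuum
FROM pg_stat_sys_tables
WHERE relname = 'pg_attribute';
```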
Re: [HACKERS] Why do we let autovacuum give up?
On 24/01/14 10:09, Robert Haas wrote:
> On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote:
>> On 24/01/14 09:49, Tom Lane wrote:
>>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>>> That seems like a pretty unfriendly behavior in itself.
>>>
>>> regards, tom lane
>>
>> I've seen this sort of problem where every db session was busily creating
>> temporary tables. I never got to find out *why* they needed to make so
>> many, but it seemed like a bad idea.
>
> But... how does that result in a vacuum-incompatible lock request against
> pg_attribute? I see that it'll insert lots of rows into pg_attribute, and
> maybe later delete them, but none of that blocks vacuum.

That was my thought too - if I see it happening again here (was a year or so ago that I saw some serious pg_attribute bloat) I'll dig deeper.

regards

Mark
Re: [HACKERS] Why do we let autovacuum give up?
Mark Kirkwood writes:
> On 24/01/14 09:49, Tom Lane wrote:
>> 2. What have you got that is requesting exclusive lock on
>> pg_attribute? That seems like a pretty unfriendly behavior in itself.

> I've seen this sort of problem where every db session was busily
> creating temporary tables. I never got to find out *why* they needed to
> make so many, but it seemed like a bad idea.

That shouldn't result in any table-level exclusive locks on system catalogs, though.

[ thinks... ] It's possible that what you saw is not the kick-out-autovacuum-entirely behavior, but the behavior added in commit bbb6e559c, whereby vacuum (auto or regular) will skip over pages that it can't immediately get an exclusive buffer lock on. On a heavily used table, we might skip the same page repeatedly, so that dead tuples don't get cleaned for a long time. To add insult to injury, despite having done that, vacuum would reset the pgstats dead-tuple count to zero, thus postponing the next autovacuum.

I think commit 115f41412 may have improved the situation, but I'd want to see some testing of this theory before I'd propose back-patching it.

regards, tom lane
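If that theory is right, the visible symptom would be a recent autovacuum on a table together with a dead-tuple count that never drops to zero. A rough check (a sketch only; the 10000-row threshold is an arbitrary illustration, and the collector's counts can lag):

```sql
-- Tables that were autovacuumed recently but still report many dead tuples
SELECT schemaname, relname, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
WHERE last_autovacuum > now() - interval '1 hour'
  AND n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
```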
Re: [HACKERS] Why do we let autovacuum give up?
On 23.1.2014 22:04, "Mark Kirkwood" wrote:
> On 24/01/14 09:49, Tom Lane wrote:
>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>> That seems like a pretty unfriendly behavior in itself.
>>
>> regards, tom lane
>
> I've seen this sort of problem where every db session was busily creating
> temporary tables. I never got to find out *why* they needed to make so
> many, but it seemed like a bad idea.

Our customer had the same problem with temp tables created by intensively used plpgsql functions. Under higher load, temp tables are a performance and stability killer; vacuuming pg_attribute has very ugly impacts :( After a redesign without temp tables, his applications work well. We need global temp tables.

Regards

Pavel
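One common shape of such a redesign (a sketch, not the customer's actual code) is to replace the per-session temp table with a permanent table keyed by session, so that no catalog rows are churned at all:

```sql
-- Created once at deployment time, not per session: no pg_attribute churn
CREATE TABLE scratch (
    session_id text NOT NULL,   -- e.g. pg_backend_pid()::text
    id integer NOT NULL,
    val text,
    PRIMARY KEY (session_id, id)
);

-- Each session works only on its own rows...
INSERT INTO scratch
SELECT pg_backend_pid()::text, g, 'xx'
FROM generate_series(1,1000) g;

-- ...and deletes them when done, instead of dropping a table
DELETE FROM scratch WHERE session_id = pg_backend_pid()::text;
```

This trades pg_attribute bloat for ordinary heap bloat on `scratch`, which regular autovacuum handles far more gracefully.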
Re: [HACKERS] Why do we let autovacuum give up?
On 01/23/2014 01:03 PM, Mark Kirkwood wrote:
> On 24/01/14 09:49, Tom Lane wrote:
>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>> That seems like a pretty unfriendly behavior in itself.
>>
>> regards, tom lane
>
> I've seen this sort of problem where every db session was busily creating
> temporary tables. I never got to find out *why* they needed to make so
> many, but it seemed like a bad idea.

Yep... that's the one. They are creating lots and lots of temp tables.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
Re: [HACKERS] Why do we let autovacuum give up?
On Thu, Jan 23, 2014 at 4:03 PM, Mark Kirkwood wrote:
> On 24/01/14 09:49, Tom Lane wrote:
>> 2. What have you got that is requesting exclusive lock on pg_attribute?
>> That seems like a pretty unfriendly behavior in itself. regards, tom lane
>
> I've seen this sort of problem where every db session was busily creating
> temporary tables. I never got to find out *why* they needed to make so many,
> but it seemed like a bad idea.

But... how does that result in a vacuum-incompatible lock request against pg_attribute? I see that it'll insert lots of rows into pg_attribute, and maybe later delete them, but none of that blocks vacuum.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Why do we let autovacuum give up?
On 24/01/14 09:49, Tom Lane wrote:
> 2. What have you got that is requesting exclusive lock on pg_attribute?
> That seems like a pretty unfriendly behavior in itself.
>
> regards, tom lane

I've seen this sort of problem where every db session was busily creating temporary tables. I never got to find out *why* they needed to make so many, but it seemed like a bad idea.

Regards

Mark
Re: [HACKERS] Why do we let autovacuum give up?
On Thu, Jan 23, 2014 at 12:53 PM, Josh Berkus wrote:
> On 01/23/2014 12:34 PM, Joshua D. Drake wrote:
>>
>> Hello,
>>
>> I have run into yet again another situation where there was an
>> assumption that autovacuum was keeping up and it wasn't. It was caused
>> by autovacuum quitting because another process requested a lock.
>>
>> In turn we received a ton of bloat on pg_attribute which caused all
>> kinds of other issues (as can be expected).
>>
>> The more I run into it, the more it seems like autovacuum should behave
>> like vacuum, in that it gets precedence when it is running. First come,
>> first serve as they say.
>>
>> Thoughts?
>
> If we let autovacuum block user activity, a lot more people would turn
> it off.
>
> Now, if you were to argue that we should have some way to monitor the
> tables which autovac can never touch because of conflicts, I would agree
> with you.

Agree completely. Easy ways to monitor this would be great. Even once you know there's a problem, tweaking autovacuum settings is hard and widely misunderstood, and explaining how to do it effectively is a dark art too.

-Harold
Re: [HACKERS] Why do we let autovacuum give up?
On 01/23/2014 12:34 PM, Joshua D. Drake wrote:
>
> Hello,
>
> I have run into yet again another situation where there was an
> assumption that autovacuum was keeping up and it wasn't. It was caused
> by autovacuum quitting because another process requested a lock.
>
> In turn we received a ton of bloat on pg_attribute which caused all
> kinds of other issues (as can be expected).
>
> The more I run into it, the more it seems like autovacuum should behave
> like vacuum, in that it gets precedence when it is running. First come,
> first serve as they say.
>
> Thoughts?

If we let autovacuum block user activity, a lot more people would turn it off.

Now, if you were to argue that we should have some way to monitor the tables which autovac can never touch because of conflicts, I would agree with you.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
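Absent a dedicated view for "tables autovacuum can never touch", a rough version of that monitoring can be assembled from the stats collector today (a sketch; the 10% dead-tuple fraction and one-day window are arbitrary illustrations, not recommendations):

```sql
-- Tables carrying a large dead-tuple fraction that autovacuum
-- has not managed to process recently (or ever)
SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
WHERE n_dead_tup > 0.1 * (n_live_tup + n_dead_tup)
  AND (last_autovacuum IS NULL
       OR last_autovacuum < now() - interval '1 day')
ORDER BY n_dead_tup DESC;
```

A table that keeps showing up here despite default autovacuum thresholds being exceeded is a candidate for the lock-conflict starvation discussed in this thread.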
Re: [HACKERS] Why do we let autovacuum give up?
"Joshua D. Drake" writes:
> I have run into yet again another situation where there was an
> assumption that autovacuum was keeping up and it wasn't. It was caused
> by autovacuum quitting because another process requested a lock.
> In turn we received a ton of bloat on pg_attribute which caused all
> kinds of other issues (as can be expected).
> The more I run into it, the more it seems like autovacuum should behave
> like vacuum, in that it gets precedence when it is running. First come,
> first serve as they say.

1. Back when it worked like that, things were worse.

2. What have you got that is requesting exclusive lock on pg_attribute? That seems like a pretty unfriendly behavior in itself.

regards, tom lane