Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 12/2/14, 2:22 PM, Jeff Janes wrote:
> Or maybe I overestimate how hard it would be to make vacuum restartable.
> You would have to save a massive amount of state (up to maintenance_work_mem
> tid list, the block you left off on both the table and all of the indexes in
> that table), and you would somehow have to validate that saved state against
> any changes that might have occurred to the table or the indexes while it was
> saved and you were not holding the lock, which seems like it would be almost
> as full of corner cases as weakening the lock in the first place. Aren't they
> logically the same thing? If we could drop the lock and take it up again
> later, maybe the answer is not to save the state, but just to pause the
> vacuum until the lock becomes free again, in effect saving the state in situ.
> That would allow the autovac worker to be held hostage to anyone taking a
> lock, though.

Yeah, rather than messing with any of that, I think it would make a lot more sense to split vacuum into smaller operations that don't require such a huge chunk of time.

> The only easy way to do it that I see is to have it only stop at the end of
> an index-cleaning cycle, which probably takes too long to block for. Or
> record a restart point at the end of each index-cleaning cycle, and then when
> it yields the lock it abandons all work since the last cycle end, rather than
> since the beginning. That would be better than what we have, but seems like a
> far cry from actual restarting from any point.

Now that's not a bad idea. This would basically mean just saving a block number in pg_class after every intermediate index clean and then setting that back to zero when we're done with that relation, right?

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
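A rough Python sketch of the restart-point idea discussed in this message, for illustration only. It assumes a hypothetical `restart_block` value stored per relation (standing in for the block number that would be saved in pg_class) and it models an index-cleaning cycle as a fixed number of heap blocks, which is a simplification: in the real vacuum a cycle ends when the TID list fills up. None of these names exist in PostgreSQL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RelationState:
    nblocks: int
    # Hypothetical per-relation restart point; 0 means "start from the beginning".
    restart_block: int = 0


def run_vacuum(rel: RelationState, blocks_per_cycle: int,
               cancel_at_block: Optional[int] = None) -> bool:
    """Simulate one vacuum attempt; return True if it ran to completion.

    If the attempt is cancelled while scanning `cancel_at_block`, all work
    since the last completed index-cleaning cycle is abandoned, but the
    restart point recorded at the end of that cycle is kept.
    """
    block = rel.restart_block
    while block < rel.nblocks:
        cycle_end = min(block + blocks_per_cycle, rel.nblocks)
        for b in range(block, cycle_end):
            if cancel_at_block is not None and b == cancel_at_block:
                return False  # yield the lock; rel.restart_block is unchanged
        # An index-cleaning cycle has finished: record the restart point.
        block = cycle_end
        rel.restart_block = block
    # Done with this relation: set the restart point back to zero.
    rel.restart_block = 0
    return True


rel = RelationState(nblocks=100)
first = run_vacuum(rel, blocks_per_cycle=30, cancel_at_block=70)
saved = rel.restart_block
second = run_vacuum(rel, blocks_per_cycle=30)
print(first, saved, second, rel.restart_block)
```

In this toy run the first attempt is cancelled during its third cycle, so the second attempt resumes from the end of the second cycle rather than from block 0. The sketch ignores the harder question raised in the thread, namely validating the saved state against changes made while the lock was not held.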
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 2014-12-02 12:22:42 -0800, Jeff Janes wrote:
> Or maybe I overestimate how hard it would be to make vacuum
> restartable.

That's a massive project. Which is why I'm explicitly *not* suggesting that.

What I instead suggest is a separate threshold after which vacuum isn't going to abort automatically after a lock conflict. So after that threshold, just behave like anti-wraparound vacuum already does.

Maybe autovacuum_vacuum/analyze_force_threshold or similar. If set to zero, the default, that behaviour is disabled. If set to a positive value it's an absolute one; if negative, it's a factor of the normal autovacuum_vacuum/analyze_threshold.

Greetings,

Andres Freund
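The zero/positive/negative semantics proposed above could be sketched roughly as follows. This is a Python illustration, not PostgreSQL code: the function names are made up, and it assumes that "a factor of the normal threshold" means a multiple of the computed trigger point `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`, which is only one possible reading of the proposal.

```python
from typing import Optional


def normal_vacuum_threshold(reltuples: float,
                            base_threshold: int = 50,
                            scale_factor: float = 0.2) -> float:
    """Usual autovacuum trigger point: base threshold + scale factor * reltuples."""
    return base_threshold + scale_factor * reltuples


def force_threshold(setting: float, normal_threshold: float) -> Optional[float]:
    """Interpret the proposed (hypothetical) *_force_threshold setting.

    0   -> feature disabled (None)
    > 0 -> absolute number of dead tuples
    < 0 -> multiple of the normal threshold (assumed interpretation)
    """
    if setting == 0:
        return None
    if setting > 0:
        return float(setting)
    return -setting * normal_threshold


def may_cancel_on_lock_conflict(dead_tuples: float, setting: float,
                                normal_threshold: float) -> bool:
    """True if autovacuum should still give way to a conflicting locker."""
    limit = force_threshold(setting, normal_threshold)
    return limit is None or dead_tuples < limit


normal = normal_vacuum_threshold(reltuples=1_000_000)
for setting in (0, 500_000, -2):
    print(setting, force_threshold(setting, normal),
          may_cancel_on_lock_conflict(450_000, setting, normal))
```

With the default of zero the worker always yields, as it does today; only an explicit non-zero setting makes it hold on once enough dead tuples have accumulated.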
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On Tue, Dec 2, 2014 at 11:41 AM, Andres Freund wrote:
> On 2014-12-02 11:23:31 -0800, Jeff Janes wrote:
> > I think it would be more promising to work on downgrading lock strengths so
> > that fewer things conflict, and it would be not much more work than what
> > you propose.
>
> I think you *massively* underestimate the effort required to lower
> lock levels. There's some very ugly corners you have to think about to
> do so. Just look at how long it took to implement the lock level
> reductions for ALTER TABLE - and those were the simpler cases.

Or maybe I overestimate how hard it would be to make vacuum restartable. You would have to save a massive amount of state (up to maintenance_work_mem tid list, the block you left off on both the table and all of the indexes in that table), and you would somehow have to validate that saved state against any changes that might have occurred to the table or the indexes while it was saved and you were not holding the lock, which seems like it would be almost as full of corner cases as weakening the lock in the first place. Aren't they logically the same thing? If we could drop the lock and take it up again later, maybe the answer is not to save the state, but just to pause the vacuum until the lock becomes free again, in effect saving the state in situ. That would allow the autovac worker to be held hostage to anyone taking a lock, though.

The only easy way to do it that I see is to have it only stop at the end of an index-cleaning cycle, which probably takes too long to block for. Or record a restart point at the end of each index-cleaning cycle, and then when it yields the lock it abandons all work since the last cycle end, rather than since the beginning. That would be better than what we have, but seems like a far cry from actual restarting from any point.

> > What operations are people doing on a regular basis that take locks
> > which cancel vacuum? create index?
>
> Locking tables against modifications in this case.

So in "share mode", then? I don't think there is any reason that there can't be a lock mode that conflicts with "ROW EXCLUSIVE" but not "SHARE UPDATE EXCLUSIVE". Basically something that conflicts with logical changes, but not with physical changes.

Cheers,

Jeff
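To make the lock-mode point concrete, here is a small Python sketch of the relevant part of PostgreSQL's table-level lock conflict rules, plus the kind of mode suggested above. The three real rows follow the documented conflict table (plain VACUUM takes SHARE UPDATE EXCLUSIVE, ordinary INSERT/UPDATE/DELETE take ROW EXCLUSIVE). The `LOGICAL SHARE` mode is purely hypothetical: it does not exist in PostgreSQL, and only the one conflict named in the message is modelled for it.

```python
# Subset of the documented table-level lock conflicts (requested mode -> modes it conflicts with).
CONFLICTS = {
    "ROW EXCLUSIVE": {"SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE UPDATE EXCLUSIVE": {"SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE",
                               "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE ROW EXCLUSIVE",
              "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    # Hypothetical mode from the message above: blocks logical changes
    # (ROW EXCLUSIVE) but not physical maintenance (SHARE UPDATE EXCLUSIVE).
    # Its conflicts with the stronger modes are left unspecified here.
    "LOGICAL SHARE": {"ROW EXCLUSIVE"},
}


def conflicts(a: str, b: str) -> bool:
    """Lock conflicts are symmetric, so check both directions."""
    return b in CONFLICTS.get(a, set()) or a in CONFLICTS.get(b, set())


VACUUM_LOCK = "SHARE UPDATE EXCLUSIVE"
for mode in ("ROW EXCLUSIVE", "SHARE", "LOGICAL SHARE"):
    print(f"{mode:14} conflicts with vacuum: {conflicts(mode, VACUUM_LOCK)}")
```

The sketch shows why plain updates never cancel autovacuum while `LOCK TABLE ... IN SHARE MODE` does, and what a mode that excludes writers but coexists with vacuum would have to look like in the conflict table.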
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 2014-12-02 11:23:31 -0800, Jeff Janes wrote:
> I think it would be more promising to work on downgrading lock strengths so
> that fewer things conflict, and it would be not much more work than what
> you propose.

I think you *massively* underestimate the effort required to lower lock levels. There's some very ugly corners you have to think about to do so. Just look at how long it took to implement the lock level reductions for ALTER TABLE - and those were the simpler cases.

> What operations are people doing on a regular basis that take locks
> which cancel vacuum? create index?

Locking tables against modifications in this case.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On Tue, Dec 2, 2014 at 11:12 AM, Josh Berkus wrote:
> On 12/02/2014 11:08 AM, Andres Freund wrote:
> > On 2014-12-02 11:02:07 -0800, Josh Berkus wrote:
> >> On 12/02/2014 10:35 AM, Alvaro Herrera wrote:
> >>> If the table is large, the time window for this to happen is large also;
> >>> there might never be a time window large enough between two lock
> >>> acquisitions for one autovacuum run to complete in a table. This
> >>> starves the table from vacuuming completely, until things are bad enough
> >>> that an emergency vacuum is forced. By then, the bloat is disastrous.
> >>>
> >>> I think it's that suicide that Andres wants to disable.
> >
> > Correct.
> >
> >> A much better solution for this ... and one which would solve a *lot* of
> >> other issues with vacuum and autovacuum ... would be to give vacuum a
> >> way to track which blocks an incomplete vacuum had already visited.
> >> This would be even more valuable for freeze.
> >
> > That's pretty much a different problem. Yes, some more persistent would
> > be helpful - although it'd need to be *much* more than which pages it
> > has visited - but you'd still be vulnerable to the same issue.
>
> If we're trying to solve the problem that vacuums of large, high-update
> tables never complete, it's solving the same problem. And in a much
> better way.
>
> And yeah, doing a vacuum placeholder wouldn't be simple, but it's the
> only solution I can think of that's worthwhile. Just disabling the
> vacuum releases sharelock behavior puts the user in the situation of
> deciding between maintenance and uptime.

I think it would be more promising to work on downgrading lock strengths so that fewer things conflict, and it would be not much more work than what you propose.

What operations are people doing on a regular basis that take locks which cancel vacuum? create index?

Cheers,

Jeff
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 2014-12-02 11:12:40 -0800, Josh Berkus wrote:
> On 12/02/2014 11:08 AM, Andres Freund wrote:
> > On 2014-12-02 11:02:07 -0800, Josh Berkus wrote:
> >> On 12/02/2014 10:35 AM, Alvaro Herrera wrote:
> >>> If the table is large, the time window for this to happen is large also;
> >>> there might never be a time window large enough between two lock
> >>> acquisitions for one autovacuum run to complete in a table. This
> >>> starves the table from vacuuming completely, until things are bad enough
> >>> that an emergency vacuum is forced. By then, the bloat is disastrous.
> >>>
> >>> I think it's that suicide that Andres wants to disable.
> >
> > Correct.
> >
> >> A much better solution for this ... and one which would solve a *lot* of
> >> other issues with vacuum and autovacuum ... would be to give vacuum a
> >> way to track which blocks an incomplete vacuum had already visited.
> >> This would be even more valuable for freeze.
> >
> > That's pretty much a different problem. Yes, some more persistent would
> > be helpful - although it'd need to be *much* more than which pages it
> > has visited - but you'd still be vulnerable to the same issue.
>
> If we're trying to solve the problem that vacuums of large, high-update
> tables never complete, it's solving the same problem.

Which isn't what I'm talking about. The problem is that vacuum is cancelled if a conflicting lock request is acquired. Plain updates don't do that. But there's workloads where you need more heavyweight updates, and then it can easily happen.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 12/02/2014 11:08 AM, Andres Freund wrote:
> On 2014-12-02 11:02:07 -0800, Josh Berkus wrote:
>> On 12/02/2014 10:35 AM, Alvaro Herrera wrote:
>>> If the table is large, the time window for this to happen is large also;
>>> there might never be a time window large enough between two lock
>>> acquisitions for one autovacuum run to complete in a table. This
>>> starves the table from vacuuming completely, until things are bad enough
>>> that an emergency vacuum is forced. By then, the bloat is disastrous.
>>>
>>> I think it's that suicide that Andres wants to disable.
>
> Correct.
>
>> A much better solution for this ... and one which would solve a *lot* of
>> other issues with vacuum and autovacuum ... would be to give vacuum a
>> way to track which blocks an incomplete vacuum had already visited.
>> This would be even more valuable for freeze.
>
> That's pretty much a different problem. Yes, some more persistent would
> be helpful - although it'd need to be *much* more than which pages it
> has visited - but you'd still be vulnerable to the same issue.

If we're trying to solve the problem that vacuums of large, high-update tables never complete, it's solving the same problem. And in a much better way.

And yeah, doing a vacuum placeholder wouldn't be simple, but it's the only solution I can think of that's worthwhile. Just disabling the vacuum releases sharelock behavior puts the user in the situation of deciding between maintenance and uptime.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 2014-12-02 11:02:07 -0800, Josh Berkus wrote:
> On 12/02/2014 10:35 AM, Alvaro Herrera wrote:
> > If the table is large, the time window for this to happen is large also;
> > there might never be a time window large enough between two lock
> > acquisitions for one autovacuum run to complete in a table. This
> > starves the table from vacuuming completely, until things are bad enough
> > that an emergency vacuum is forced. By then, the bloat is disastrous.
> >
> > I think it's that suicide that Andres wants to disable.

Correct.

> A much better solution for this ... and one which would solve a *lot* of
> other issues with vacuum and autovacuum ... would be to give vacuum a
> way to track which blocks an incomplete vacuum had already visited.
> This would be even more valuable for freeze.

That's pretty much a different problem. Yes, some more persistent would be helpful - although it'd need to be *much* more than which pages it has visited - but you'd still be vulnerable to the same issue.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 12/02/2014 10:35 AM, Alvaro Herrera wrote:
> If the table is large, the time window for this to happen is large also;
> there might never be a time window large enough between two lock
> acquisitions for one autovacuum run to complete in a table. This
> starves the table from vacuuming completely, until things are bad enough
> that an emergency vacuum is forced. By then, the bloat is disastrous.
>
> I think it's that suicide that Andres wants to disable.

A much better solution for this ... and one which would solve a *lot* of other issues with vacuum and autovacuum ... would be to give vacuum a way to track which blocks an incomplete vacuum had already visited. This would be even more valuable for freeze.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
Robert Haas wrote:
> On Sat, Nov 29, 2014 at 11:46 PM, Jim Nasby wrote:
> > What do you mean by "never succeed"? Is it skipping a large number of pages?
> > Might re-trying the locks within the same vacuum help, or are the user locks
> > too persistent?
>
> You are confused. He's talking about the relation-level lock that
> vacuum attempts to take before doing any work at all on a given table,
> not the per-page cleanup locks that it takes while processing each
> page. If the relation-level lock can't be acquired, the whole table
> is skipped.

Almost there. Autovacuum takes the relation-level lock, starts processing. Some time later, another process wants a lock that conflicts with the one autovacuum has. This is flagged by the deadlock detector, and a signal is sent to autovacuum, which commits suicide.

If the table is large, the time window for this to happen is large also; there might never be a time window large enough between two lock acquisitions for one autovacuum run to complete in a table. This starves the table from vacuuming completely, until things are bad enough that an emergency vacuum is forced. By then, the bloat is disastrous.

I think it's that suicide that Andres wants to disable.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On Sat, Nov 29, 2014 at 11:46 PM, Jim Nasby wrote:
> What do you mean by "never succeed"? Is it skipping a large number of pages?
> Might re-trying the locks within the same vacuum help, or are the user locks
> too persistent?

You are confused. He's talking about the relation-level lock that vacuum attempts to take before doing any work at all on a given table, not the per-page cleanup locks that it takes while processing each page. If the relation-level lock can't be acquired, the whole table is skipped.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On 11/29/14, 2:22 AM, Andres Freund wrote:
> Hi,
>
> I've more than once seen that autovacuums on certain tables never
> succeed because regular exclusive (or similar) lockers cause it to give
> way/up before finishing. Usually if some part of the application uses
> explicit exclusive locks.
>
> In general I think it's quite important that autovacuum behaves that
> way. But I think it might be worthwhile to offer an option to disable
> that behaviour. If some piece of application logic requires exclusive
> locks and that leads to complete starvation of autovacuum, there's
> really nothing that can be done but to manually schedule vacuums right
> now.
>
> I can see two possible solutions:
>
> 1) Add a reloption that allows to configure whether autovacuum gives way
>    to locks acquired by user backends.
> 2) Add a second set of autovacuum_*_scale_factor variables that governs
>    a threshold after which autovacuum doesn't get cancelled anymore.
>
> Opinions?

What do you mean by "never succeed"? Is it skipping a large number of pages? Might re-trying the locks within the same vacuum help, or are the user locks too persistent?

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
On Nov 29, 2014 9:23 AM, "Andres Freund" wrote:
>
> Hi,
>
> I've more than once seen that autovacuums on certain tables never
> succeed because regular exclusive (or similar) lockers cause it to give
> way/up before finishing. Usually if some part of the application uses
> explicit exclusive locks.
>
> In general I think it's quite important that autovacuum behaves that
> way. But I think it might be worthwhile to offer an option to disable
> that behaviour. If some piece of application logic requires exclusive
> locks and that leads to complete starvation of autovacuum, there's
> really nothing that can be done but to manually schedule vacuums right
> now.
>
> I can see two possible solutions:
>
> 1) Add a reloption that allows to configure whether autovacuum gives way
>    to locks acquired by user backends.
> 2) Add a second set of autovacuum_*_scale_factor variables that governs
>    a threshold after which autovacuum doesn't get cancelled anymore.
>
> Opinions?

I definitely think having such a tunable would be very useful in edge cases (so, as you say, the default should definitely be what it is today).

Another "trigger option" could be to say "you may terminate autovacuum this many times before forcing one through", rather than triggering on tuple count. But tuples is probably a better choice, as it gives more dynamics - unless we want to do both.

/Magnus
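The count-based trigger mentioned in this message could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the `max_cancels` setting and the per-table counter are hypothetical names, and where such a counter would actually be stored is not addressed in the thread.

```python
from dataclasses import dataclass


@dataclass
class TableVacuumStats:
    # Hypothetical counter: cancellations since the last completed autovacuum.
    cancels_since_last_success: int = 0


def should_yield_to_locker(stats: TableVacuumStats, max_cancels: int) -> bool:
    """Decide whether autovacuum cancels itself on a lock conflict.

    max_cancels == 0 keeps today's behaviour (always yield). Otherwise the
    worker yields at most `max_cancels` times and then forces the run through.
    """
    if max_cancels > 0 and stats.cancels_since_last_success >= max_cancels:
        return False
    stats.cancels_since_last_success += 1
    return True


def vacuum_completed(stats: TableVacuumStats) -> None:
    """Reset the counter once a vacuum of the table finishes."""
    stats.cancels_since_last_success = 0


stats = TableVacuumStats()
print([should_yield_to_locker(stats, max_cancels=2) for _ in range(3)])
```

A tuple-based threshold and this counter are independent checks, so "doing both" would amount to refusing to yield as soon as either limit is reached.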
[HACKERS] How about a option to disable autovacuum cancellation on lock conflict?
Hi,

I've more than once seen that autovacuums on certain tables never succeed because regular exclusive (or similar) lockers cause it to give way/up before finishing. Usually if some part of the application uses explicit exclusive locks.

In general I think it's quite important that autovacuum behaves that way. But I think it might be worthwhile to offer an option to disable that behaviour. If some piece of application logic requires exclusive locks and that leads to complete starvation of autovacuum, there's really nothing that can be done but to manually schedule vacuums right now.

I can see two possible solutions:

1) Add a reloption that allows to configure whether autovacuum gives way to locks acquired by user backends.
2) Add a second set of autovacuum_*_scale_factor variables that governs a threshold after which autovacuum doesn't get cancelled anymore.

Opinions?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services