Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On Mon, Oct 27, 2014 at 5:51 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Jeff Janes wrote: It is only a page read if you have to read the page. It would seem optimal to have bgwriter adventitiously set hint bits and vm bits, because that is the last point at which the page can be changed without risking that it be written out twice. At that point, it has been given the maximum amount of time it can be given for the interested transactions to have committed and to have aged past the xmin horizon. I seem to recall that the main problem with that, though, is that you must be attached to a database in order to determine visibility, and bgwriter is not attached to a database. Regarding tuple hint bits, I couldn't find any such limitation in SetHintBits, other than in MarkBufferDirtyHint there being some code that would cause trouble: it accesses MyPgXact, which bgwriter would obviously not have. Maybe worth some experimentation ... I'm not sure about vm bits, though. That's a whole different topic. From a theoretical point of view, hint bits and VM bits present mostly-similar issues; they are basically dependent on the state of the transaction machinery, which for the most part is cluster-wide, so it should be fine for a process to do that sort of work without being bound to a specific database. I think the biggest problem for either is that the bgwriter can't look at the system catalogs to find out whether a particular buffer contains a heap page or an index page (or something else) and it needs to know that, which probably requires another buffer flag. From a practical point of view, there's a major performance difference between the two operations: setting a VM bit requires emitting a WAL record, whereas setting hint bits does not. That might make it too expensive to be worth doing, at least in some situations. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Jeff Janes wrote: It is only a page read if you have to read the page. It would seem optimal to have bgwriter adventitiously set hint bits and vm bits, because that is the last point at which the page can be changed without risking that it be written out twice. At that point, it has been given the maximum amount of time it can be given for the interested transactions to have committed and to have aged past the xmin horizon. I seem to recall that the main problem with that, though, is that you must be attached to a database in order to determine visibility, and bgwriter is not attached to a database. Regarding tuple hint bits, I couldn't find any such limitation in SetHintBits, other than in MarkBufferDirtyHint there being some code that would cause trouble: it accesses MyPgXact, which bgwriter would obviously not have. Maybe worth some experimentation ... I'm not sure about vm bits, though. That's a whole different topic. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On Mon, Oct 20, 2014 at 5:46 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-10-20 17:43:26 -0700, Josh Berkus wrote: On 10/20/2014 05:39 PM, Jim Nasby wrote: Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. +1 A page read is a page read. What's the point of heaving another process do it? It is only a page read if you have to read the page. It would seem optimal to have bgwriter adventitiously set hint bits and vm bits, because that is the last point at which the page can be changed without risking that it be written out twice. At that point, it has been given the maximum amount of time it can be given for the interested transactions to have committed and to have aged past the xmin horizon. I seem to recall that the main problem with that, though, is that you must be attached to a database in order to determine visibility, and bgwriter is not attached to a database. Cheers, Jeff
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/21/14, 4:36 PM, Jeff Janes wrote: On Mon, Oct 20, 2014 at 5:46 PM, Andres Freund and...@2ndquadrant.com mailto:and...@2ndquadrant.com wrote: On 2014-10-20 17:43:26 -0700, Josh Berkus wrote: On 10/20/2014 05:39 PM, Jim Nasby wrote: Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. +1 A page read is a page read. What's the point of heaving another process do it? It is only a page read if you have to read the page. It would seem optimal to have bgwriter adventitiously set hint bits and vm bits, because that is the last point at which the page can be changed without risking that it be written out twice. At that point, it has been given the maximum amount of time it can be given for the interested transactions to have committed and to have aged past the xmin horizon. I seem to recall that the main problem with that, though, is that you must be attached to a database in order to determine visibility, and bgwriter is not attached to a database. It's also a bit more complex than a simple question of is the page still in shared buffers. Our *real* last chance is when the page is about to be evicted from the filesystem cache; after that reading it back it will be extremely expensive (relatively speaking). I think it's worth considering this, because if you have any moderate length transactions on a busy database bgwriter won't be able to help much; you'll be burning through shared buffers too quickly. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-19 20:43:29 -0500, Jim Nasby wrote: On 10/19/14, 11:41 AM, Andres Freund wrote: On 2014-10-18 21:36:48 -0500, Jim Nasby wrote: The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I don't think that's particularly wierd. Otherwise vacuum can get stuck behind a single very hot page - leading to much, much more bloat. I have to believe that could seriously screw up autovacuum scheduling. Why? I'm worried there could be some pathological cases where we'd skip a large number of pages, perhaps if a vacuum scan and a seqscan ended up running alongside each other. I've seen little evidence of that. The reverse, a stuck autovacuum, is imo much more likely. For this to be an actual problem you'd need to encounter many pages that are not locked, but are pinned. That state doesn't exist for very long. Perhaps this is just paranoia, but we have no idea how bad things might be, because we don't have any logging for how many pages we skipped because we couldn't lock them. But so what? If we skip individual pages it won't be too bad - and very likely waiting very long is going to be more painful. The page won't be marked 'all visible' so the next vacuum will come around to it again. And it'll also get cleaned up by opportunistic hot pruning. Also, if this really is that big a deal for heap pages, how come we don't get screwed by it on Btree index pages, where we mandate that we acquire a cleanup lock? Because we never hold pins for btree pages for very long. Whereas we do that for heap pages. If you e.g. run a cursor forward you can hold a pin for essentially unbounded time. Now that we have forks, I'm wondering if it would be best to come up with a per-page system that could be used to determine when a table needs background work to be done. The visibility map could serve a lot of this purpose, but I'm not sure if it would work for getting hint bits set in the background. It would. Per definition, all tuples that are 'all visible' need to be fully hint bitted. I think it would also be a win if we had a way to advance relfrozenxid and relminmxid. Perhaps something that simply remembered the last XID that touched each page... Not sure what you're getting at here? That ultimately, our current method for determining when and what to vacuum is rather crude, and likely results in wasted effort during scans as well as not firing autovac often enough. Keep in mind that autovac started as a user-space utility and the best it could possibly do was to keep a table of stats counters. I agree that we should trigger autovacuum more often. It's *intentionally* not triggered *at all* for insert only workloads (if you discount anti wraparound vacuums). I think it's time to change that. For that we'd need to make vacuums that don't delete any tuples cheaper. We already rescan only the changed parts of the heaps - but we always scan indexes fully... The visibility map obviously helps cut down on extra work during a scan, but it only goes so far in that regard. Aha. Instead of relying on the crude methods, if we reliably tracked certain txids on a per-block basis in a fork, we could cheaply scan the fork and make an extremely informed decision on how much a vacuum would gain us, and exactly what blocks it should hit. Let me use freezing as an example. If we had a reliable list of the lowest txid for each block of a relation that would allow us to do a freeze scan by hitting only blocks with minimum txid within our freeze range. The same could be done for multixacts. It'd also become a prime contention point because you'd need to constantly update it. In contrast to a simple 'is frozen' bit (akin to is_visible) which only changes infrequently, and only in one direction. If we stored 3 txids for each block in a fork, we could fit information for ~680 heap blocks in each fork block. So in a database with 680G of heap data, we could fully determine every *block* (not table) we needed to vacuum by scanning just 1GB of data. That would allow for far better autovacuum scheduling than what we do today. It's not that simple. Wraparounds and locking complicate it significantly. I think the big missing piece lest something like Heikki's xid lsn ranges thing gets finished is a freeze map. The problem with a simple freeze map is when do you actually set the bit? There's precisely one place where you can set it for normal operation. During vacuum's scan. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/20/14, 3:11 PM, Andres Freund wrote: On 2014-10-19 20:43:29 -0500, Jim Nasby wrote: On 10/19/14, 11:41 AM, Andres Freund wrote: On 2014-10-18 21:36:48 -0500, Jim Nasby wrote: The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I don't think that's particularly wierd. Otherwise vacuum can get stuck behind a single very hot page - leading to much, much more bloat. I have to believe that could seriously screw up autovacuum scheduling. Why? I'm worried there could be some pathological cases where we'd skip a large number of pages, perhaps if a vacuum scan and a seqscan ended up running alongside each other. I've seen little evidence of that. The reverse, a stuck autovacuum, is imo much more likely. For this to be an actual problem you'd need to encounter many pages that are not locked, but are pinned. That state doesn't exist for very long. How would you actually get evidence of this... we don't log it. :) (See my proposal at http://www.postgresql.org/message-id/54446c10.2080...@bluetreble.com) Perhaps this is just paranoia, but we have no idea how bad things might be, because we don't have any logging for how many pages we skipped because we couldn't lock them. But so what? If we skip individual pages it won't be too bad - and very likely waiting very long is going to be more painful. The page won't be marked 'all visible' so the next vacuum will come around to it again. And it'll also get cleaned up by opportunistic hot pruning. Probably true. Hopefully we can start logging it and then we'll know for sure. That ultimately, our current method for determining when and what to vacuum is rather crude, and likely results in wasted effort during scans as well as not firing autovac often enough. Keep in mind that autovac started as a user-space utility and the best it could possibly do was to keep a table of stats counters. I agree that we should trigger autovacuum more often. It's *intentionally* not triggered *at all* for insert only workloads (if you discount anti wraparound vacuums). I think it's time to change that. For that we'd need to make vacuums that don't delete any tuples cheaper. We already rescan only the changed parts of the heaps - but we always scan indexes fully... Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. Instead of relying on the crude methods, if we reliably tracked certain txids on a per-block basis in a fork, we could cheaply scan the fork and make an extremely informed decision on how much a vacuum would gain us, and exactly what blocks it should hit. Let me use freezing as an example. If we had a reliable list of the lowest txid for each block of a relation that would allow us to do a freeze scan by hitting only blocks with minimum txid within our freeze range. The same could be done for multixacts. It'd also become a prime contention point because you'd need to constantly update it. In contrast to a simple 'is frozen' bit (akin to is_visible) which only changes infrequently, and only in one direction. Actually, the contention on freeze would very possibly be minimal, because it probably doesn't change very often. Even if it did, it's OK if the value isn't 100% accurate, so long as the recorded XID is guaranteed older than what's actually on the page. If we stored 3 txids for each block in a fork, we could fit information for ~680 heap blocks in each fork block. So in a database with 680G of heap data, we could fully determine every *block* (not table) we needed to vacuum by scanning just 1GB of data. That would allow for far better autovacuum scheduling than what we do today. It's not that simple. Wraparounds and locking complicate it significantly. I realize what I'm talking about isn't trivial (though, I'm confused by your comment about wraparound since presumably TransactionIdPrecedes() and it's ilk solve that problem...) My ultimate point here is that we're using what are (today) very crude methods to control what gets vacuumed when, and I think that now that we have resource forks would could do *much* better without a tremendous amount of work. But to make a big advancement here we'll need to take a step back and rethink some things (like vacuum is the only way to handle these problems). Let me put some thought into this. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/20/2014 05:39 PM, Jim Nasby wrote: Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. +1 -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-20 17:43:26 -0700, Josh Berkus wrote: On 10/20/2014 05:39 PM, Jim Nasby wrote: Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. +1 A page read is a page read. What's the point of heaving another process do it? Vacuum doesn't dirty pages if they don't have to be dirtied. Especially stuff like freezing cannot really be dealt with outside of vacuum unless you make already complex stuff more complex for a marginal benefit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-18 21:36:48 -0500, Jim Nasby wrote: On 10/9/14, 4:19 PM, Andres Freund wrote: On 2014-10-09 18:16:46 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. Isn't that*precisely* what a plain vacuum run does? Well, it also scans for dead tuples, removes them, and needs to go through indexes to remove their references. IIRC it doesn't do most of that if that there's no need. And if it's a insert only table without rollbacks. I*do* think there's some optimizations we could make in general. No, it always attempts dead tuple removal. I said some steps, not all steps. Check it out: /* If any tuples need to be deleted, perform final vacuum cycle */ /* XXX put a threshold on min number of tuples here? */ if (vacrelstats-num_dead_tuples 0) { /* Log cleanup info before we touch indexes */ vacuum_log_cleanup_info(onerel, vacrelstats); /* Remove index entries */ for (i = 0; i nindexes; i++) lazy_vacuum_index(Irel[i], indstats[i], vacrelstats); /* Remove tuples from heap */ lazy_vacuum_heap(onerel, vacrelstats); vacrelstats-num_index_scans++; } There's rub here though. We unconditionally do: /* Do post-vacuum cleanup and statistics update for each index */ for (i = 0; i nindexes; i++) lazy_cleanup_index(Irel[i], indstats[i], vacrelstats); and that's not particularly cheap. Maybe we should make that conditional when there's been no lazy_vacuum_index/heap calls at all? The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I don't think that's particularly wierd. Otherwise vacuum can get stuck behind a single very hot page - leading to much, much more bloat. I have to believe that could seriously screw up autovacuum scheduling. Why? Now that we have forks, I'm wondering if it would be best to come up with a per-page system that could be used to determine when a table needs background work to be done. The visibility map could serve a lot of this purpose, but I'm not sure if it would work for getting hint bits set in the background. It would. Per definition, all tuples that are 'all visible' need to be fully hint bitted. I think it would also be a win if we had a way to advance relfrozenxid and relminmxid. Perhaps something that simply remembered the last XID that touched each page... Not sure what you're getting at here? I think the big missing piece lest something like Heikki's xid lsn ranges thing gets finished is a freeze map. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Andres Freund and...@2ndquadrant.com writes: There's rub here though. We unconditionally do: /* Do post-vacuum cleanup and statistics update for each index */ for (i = 0; i nindexes; i++) lazy_cleanup_index(Irel[i], indstats[i], vacrelstats); and that's not particularly cheap. Maybe we should make that conditional when there's been no lazy_vacuum_index/heap calls at all? Absolutely not. If the cleanup step is skippable, it would be the province of the index AM to make that decision. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-19 12:50:30 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: There's rub here though. We unconditionally do: /* Do post-vacuum cleanup and statistics update for each index */ for (i = 0; i nindexes; i++) lazy_cleanup_index(Irel[i], indstats[i], vacrelstats); and that's not particularly cheap. Maybe we should make that conditional when there's been no lazy_vacuum_index/heap calls at all? Absolutely not. If the cleanup step is skippable, it would be the province of the index AM to make that decision. Fair point. At the moment we're doing a full of nbtree indexes everytime we do a vacuum. Even when the heap vacuum only scanned a couple hundred pages of a huge table. That makes partial vacuum noticeably less useful. So I do think we need to do something to improve upon the situation. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/19/14, 11:41 AM, Andres Freund wrote: On 2014-10-18 21:36:48 -0500, Jim Nasby wrote: On 10/9/14, 4:19 PM, Andres Freund wrote: On 2014-10-09 18:16:46 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. Isn't that*precisely* what a plain vacuum run does? Well, it also scans for dead tuples, removes them, and needs to go through indexes to remove their references. IIRC it doesn't do most of that if that there's no need. And if it's a insert only table without rollbacks. I*do* think there's some optimizations we could make in general. No, it always attempts dead tuple removal. I said some steps, not all steps. Check it out: /* If any tuples need to be deleted, perform final vacuum cycle */ /* XXX put a threshold on min number of tuples here? */ if (vacrelstats-num_dead_tuples 0) { /* Log cleanup info before we touch indexes */ vacuum_log_cleanup_info(onerel, vacrelstats); /* Remove index entries */ for (i = 0; i nindexes; i++) lazy_vacuum_index(Irel[i], indstats[i], vacrelstats); /* Remove tuples from heap */ lazy_vacuum_heap(onerel, vacrelstats); vacrelstats-num_index_scans++; } There's rub here though. We unconditionally do: /* Do post-vacuum cleanup and statistics update for each index */ for (i = 0; i nindexes; i++) lazy_cleanup_index(Irel[i], indstats[i], vacrelstats); and that's not particularly cheap. Maybe we should make that conditional when there's been no lazy_vacuum_index/heap calls at all? We could possibly pass in to lazy_cleanup_index whether we actually removed any tuples. The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I don't think that's particularly wierd. Otherwise vacuum can get stuck behind a single very hot page - leading to much, much more bloat. I have to believe that could seriously screw up autovacuum scheduling. Why? I'm worried there could be some pathological cases where we'd skip a large number of pages, perhaps if a vacuum scan and a seqscan ended up running alongside each other. Perhaps this is just paranoia, but we have no idea how bad things might be, because we don't have any logging for how many pages we skipped because we couldn't lock them. Also, if this really is that big a deal for heap pages, how come we don't get screwed by it on Btree index pages, where we mandate that we acquire a cleanup lock? Now that we have forks, I'm wondering if it would be best to come up with a per-page system that could be used to determine when a table needs background work to be done. The visibility map could serve a lot of this purpose, but I'm not sure if it would work for getting hint bits set in the background. It would. Per definition, all tuples that are 'all visible' need to be fully hint bitted. I think it would also be a win if we had a way to advance relfrozenxid and relminmxid. Perhaps something that simply remembered the last XID that touched each page... Not sure what you're getting at here? That ultimately, our current method for determining when and what to vacuum is rather crude, and likely results in wasted effort during scans as well as not firing autovac often enough. Keep in mind that autovac started as a user-space utility and the best it could possibly do was to keep a table of stats counters. The visibility map obviously helps cut down on extra work during a scan, but it only goes so far in that regard. Instead of relying on the crude methods, if we reliably tracked certain txids on a per-block basis in a fork, we could cheaply scan the fork and make an extremely informed decision on how much a vacuum would gain us, and exactly what blocks it should hit. Let me use freezing as an example. If we had a reliable list of the lowest txid for each block of a relation that would allow us to do a freeze scan by hitting only blocks with minimum txid within our freeze range. The same could be done for multixacts. That's just one example. We could do something similar for background hinting (oldest xmin/xmax of all non-hinted tuples), and whether it's worth trying
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/9/14, 4:19 PM, Andres Freund wrote: On 2014-10-09 18:16:46 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. Isn't that*precisely* what a plain vacuum run does? Well, it also scans for dead tuples, removes them, and needs to go through indexes to remove their references. IIRC it doesn't do most of that if that there's no need. And if it's a insert only table without rollbacks. I*do* think there's some optimizations we could make in general. No, it always attempts dead tuple removal. The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I have to believe that could seriously screw up autovacuum scheduling. Now that we have forks, I'm wondering if it would be best to come up with a per-page system that could be used to determine when a table needs background work to be done. The visibility map could serve a lot of this purpose, but I'm not sure if it would work for getting hint bits set in the background. I think it would also be a win if we had a way to advance relfrozenxid and relminmxid. Perhaps something that simply remembered the last XID that touched each page... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Hello! Autovacuum daemon performs vacuum when the number of rows updated/deleted (n_dead_tuples) reaches some threshold. Similarly it performs analyze when the number of rows changed in any way (incl. inserted). When a table is mostly insert-only, its visibility map is not updated as vacuum threshold is almost never reached, but analyze does not update visibility map. Why could it be a bad idea to run vacuum after some number of any changes including inserts, like analyze? Or at least make it tunable by user (add a second bunch of paramters to control second vacuum threshold, disabled by default)? Best regards, Alexey Bashtanov -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On Thu, Oct 9, 2014 at 02:34:17PM +0400, Alexey Bashtanov wrote: Hello! Autovacuum daemon performs vacuum when the number of rows updated/deleted (n_dead_tuples) reaches some threshold. Similarly it performs analyze when the number of rows changed in any way (incl. inserted). When a table is mostly insert-only, its visibility map is not updated as vacuum threshold is almost never reached, but analyze does not update visibility map. Why could it be a bad idea to run vacuum after some number of any changes including inserts, like analyze? Or at least make it tunable by user (add a second bunch of paramters to control second vacuum threshold, disabled by default)? I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. (Of course, this idea needs refinement to avoid running over and over when the bit cannot be set on some pages for whatever reason.) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. Isn't that *precisely* what a plain vacuum run does? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Alvaro Herrera alvhe...@2ndquadrant.com wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. (Of course, this idea needs refinement to avoid running over and over when the bit cannot be set on some pages for whatever reason.) Wouldn't we get substantially the same thing just by counting tuple inserts toward the autovacuum vacuum threshold? I mean, it unless the table is due for wraparound prevention autovacuum, it will only visit pages that don't have the all-visible bit set, right? And how much work would that do beyond what you're describing if none of the tuples are dead? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Andres Freund wrote: On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. Isn't that *precisely* what a plain vacuum run does? Well, it also scans for dead tuples, removes them, and needs to go through indexes to remove their references. I'm thinking in something very lightweight. Otherwise, why don't we just reduce the vacuum_scale_factor default to something very small, so that vacuum is triggered more often? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Kevin Grittner wrote: Wouldn't we get substantially the same thing just by counting tuple inserts toward the autovacuum vacuum threshold? I mean, it unless the table is due for wraparound prevention autovacuum, it will only visit pages that don't have the all-visible bit set, right? And how much work would that do beyond what you're describing if none of the tuples are dead? The problem is precisely what happens if there are some dead tuples, but not enough to reach the 20% threshold: this vacuum now has to scan the table twice and has to clean up indexes also. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-09 18:16:46 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. Isn't that *precisely* what a plain vacuum run does? Well, it also scans for dead tuples, removes them, and needs to go through indexes to remove their references. IIRC it doesn't do most of that if that there's no need. And if it's a insert only table without rollbacks. I *do* think there's some optimizations we could make in general. I'm thinking in something very lightweight. Otherwise, why don't we just reduce the vacuum_scale_factor default to something very small, so that vacuum is triggered more often? The problem here is that that doesn't trigger for inserts. Just for updates/deletes or rollbacks. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/9/14, 4:03 PM, Alvaro Herrera wrote: Bruce Momjian wrote: I agree this is a serious problem. We have discussed various options, but have not decided on anything. The TODO list has: https://wiki.postgresql.org/wiki/Todo Improve setting of visibility map bits for read-only and insert-only workloads http://www.postgresql.org/message-id/20130906001437.ga29...@momjian.us I hate to repeat myself, but I think autovacuum could be modified to run actions other than vacuum and analyze. In this specific case we could be running a table scan that checks only pages that don't have the all-visible bit set, and see if it can be set. (Of course, this idea needs refinement to avoid running over and over when the bit cannot be set on some pages for whatever reason.) If we go down that road we should also think about having it proactively set hint bits... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers