Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Cédric Villemain
2011/5/29 Tom Lane t...@sss.pgh.pa.us: Greg Stark gsst...@mit.edu writes: On Sat, May 28, 2011 at 12:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: I also found that Greg was right in thinking that it would help if we tweaked lazy_scan_heap to not always scan the first SKIP_PAGES_THRESHOLD-1 pages

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Tom Lane
=?ISO-8859-1?Q?C=E9dric_Villemain?= cedric.villemain.deb...@gmail.com writes: 2011/5/29 Tom Lane t...@sss.pgh.pa.us: OK, do you like the attached version of that logic?  (Other fragments of the patch as before.) The idea was that remove only one page from the VACUUM will prevent relfrozenxid

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Pavan Deolasee
On Sun, May 29, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: =?ISO-8859-1?Q?C=E9dric_Villemain?= cedric.villemain.deb...@gmail.com writes: 2011/5/29 Tom Lane t...@sss.pgh.pa.us: OK, do you like the attached version of that logic?  (Other fragments of the patch as before.) The idea

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes: On Sun, May 29, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: That would require proof, not just suggestion.  Skipping pages will defeat the OS read-ahead algorithm, and so could easily cost more than reading them. My worry is what we

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Pavan Deolasee
On Sun, May 29, 2011 at 9:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Pavan Deolasee pavan.deola...@gmail.com writes: On Sun, May 29, 2011 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote: That would require proof, not just suggestion.  Skipping pages will defeat the OS read-ahead algorithm, and so

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes: I am sorry if I sounded terse above. But my gripe is that sometimes we are too reluctant to listen to ideas and insist on producing some hard numbers first which might take significant efforts. But we are not equally strict when such changes are

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Pavan Deolasee
On Sun, May 29, 2011 at 10:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Pavan Deolasee pavan.deola...@gmail.com writes: I am sorry if I sounded terse above. But my gripe is that sometimes we are too reluctant to listen to ideas and insist on producing some hard numbers first which might take

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-29 Thread Cédric Villemain
2011/5/29 Tom Lane t...@sss.pgh.pa.us: =?ISO-8859-1?Q?C=E9dric_Villemain?= cedric.villemain.deb...@gmail.com writes: 2011/5/29 Tom Lane t...@sss.pgh.pa.us: OK, do you like the attached version of that logic?  (Other fragments of the patch as before.) The idea was that remove only one page

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-28 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Still, maybe we don't have a better option. If it were me, I'd add an additional safety valve: use your formula if the percentage of the relation scanned is above some threshold where there's unlikely to be too much skew. But if the percentage

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-28 Thread Greg Stark
On Sat, May 28, 2011 at 12:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: I also found that Greg was right in thinking that it would help if we tweaked lazy_scan_heap to not always scan the first SKIP_PAGES_THRESHOLD-1 pages even if they were all_visible_according_to_vm.  That seemed to skew the

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-28 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Sat, May 28, 2011 at 12:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: I also found that Greg was right in thinking that it would help if we tweaked lazy_scan_heap to not always scan the first SKIP_PAGES_THRESHOLD-1 pages even if they were

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-27 Thread Robert Haas
On Thu, May 26, 2011 at 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: When we prune or vacuum a page, I don't suppose we have enough information about that page's previous state to calculate a tuple count delta, do we?  That would allow a far

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-27 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Still, maybe we don't have a better option. If it were me, I'd add an additional safety valve: use your formula if the percentage of the relation scanned is above some threshold where there's unlikely to be too much skew. But if the percentage

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Wed, May 25, 2011 at 9:41 AM, Tom Lane t...@sss.pgh.pa.us wrote: ... What I'm currently imagining is to do a smoothed moving average, where we factor in the new density estimate with a weight dependent on the percentage of the table we did scan. That is,

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Robert Haas
On Thu, May 26, 2011 at 11:25 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm still of the opinion that an incremental estimation process like the above is a lot saner than what we're doing now, snarky Dilbert references notwithstanding.  The only thing that seems worthy of debate from here is

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: I would feel a lot better about something that is deterministic, like, I dunno, if VACUUM visits more than 25% of the table, we use its estimate. And we always use ANALYZE's estimate. Or something. This argument seems to rather miss the point. The

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Robert Haas
On Thu, May 26, 2011 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Another thought: Couldn't relation_needs_vacanalyze() just scale up reltuples by the ratio of the current number of pages in the relation to relpages, just as the query planner does? Hmm ... that would fix Florian's

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: I think we should really consider replacing reltuples with reltupledensity at some point. I continue to be afraid that using a decaying average in this case is going to end up overweighting the values from some portion of the table that's getting

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Robert Haas
On Thu, May 26, 2011 at 1:28 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: I think we should really consider replacing reltuples with reltupledensity at some point.  I continue to be afraid that using a decaying average in this case is going to

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: Given how trivial it would be to adjust reltuples to keep its ratio to relpages about the same when we don't have a new hard number, but some evidence that we should fudge our previous value, I don't

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Robert Haas
On Thu, May 26, 2011 at 2:05 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: I'm a bit confused by this - what the current design obfuscates is the fact that reltuples and relpages are not really independent columns; you can't update one without updating the other, unless you want screwy

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Except that's not how it works. At least in the case of ANALYZE, we *aren't* counting all the tuples in the table. We're selecting a random sample of pages and inferring a tuple density, which we then extrapolate to the whole table and store. Then

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Thu, May 26, 2011 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Another thought: Couldn't relation_needs_vacanalyze() just scale up reltuples by the ratio of the current number of pages in the relation to relpages, just as the query planner does?

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: By storing the ratio and one count you make changes to the other count implied and less visible. It seems more understandable and less prone to error (to me, anyway) to keep the two raw numbers and

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-26 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: When we prune or vacuum a page, I don't suppose we have enough information about that page's previous state to calculate a tuple count delta, do we? That would allow a far more accurate number to be maintained than anything suggested so far,

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Tom Lane
Florian Helmberger f...@25th-floor.com writes: On 25.05.11 04:47, Tom Lane wrote: Florian Helmbergerf...@25th-floor.com writes: I'm running a production database with PostgreSQL 9.0.3 (64-bit) on Debian 5.0.4 and have an issue with a TOAST table and far to frequent autovacuum runs. I

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mié may 25 11:47:52 -0400 2011: I think I see what must be going on here: that toast table must contain a long run of all-visible-according-to-the-VM pages (which is hardly a surprising situation). This results in VACUUM choosing not to update the pg_class

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Robert Haas
On Wed, May 25, 2011 at 11:47 AM, Tom Lane t...@sss.pgh.pa.us wrote: 2. Revise the vacuum code so that it doesn't skip updating the pg_class entries.  We could have it count the number of pages it skipped, rather than just keeping a bool, and then scale up the rel_tuples count to be

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Tom Lane's message of mié may 25 11:47:52 -0400 2011: I can see two basic approaches we might take here: 1. Modify autovacuum to use something from the stats collector, rather than reltuples, to make its decisions. I'm not too

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: I don't know what client-side code might be looking at relpages/reltuples. I know that I find reltuples useful for getting an accurate enough sense of rows in a table (or set of tables) without resorting to count(*). I'd be OK with any two of pages,

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Wed, May 25, 2011 at 11:47 AM, Tom Lane t...@sss.pgh.pa.us wrote: 2. Revise the vacuum code so that it doesn't skip updating the pg_class entries.  We could have it count the number of pages it skipped, rather than just keeping a bool, and then

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Robert Haas
On Wed, May 25, 2011 at 12:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yeah, I had been thinking about the latter point.  We could be conservative and just keep the reported tuple density the same (ie, update relpages to the new correct value, while setting reltuples so that the density ratio

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Alvaro Herrera
Excerpts from Kevin Grittner's message of mié may 25 12:37:24 -0400 2011: Tom Lane t...@sss.pgh.pa.us wrote: I don't know what client-side code might be looking at relpages/reltuples. I know that I find reltuples useful for getting an accurate enough sense of rows in a table (or set

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Wed, May 25, 2011 at 12:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yeah, I had been thinking about the latter point.  We could be conservative and just keep the reported tuple density the same (ie, update relpages to the new correct value, while

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mié may 25 12:54:28 -0400 2011: I don't know. That's maybe better, but I'd be willing to wager that in some cases it will just slow down the rate at which we converge to a completely incorrect value, while in other cases it'll fail to update the data

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Robert Haas
On Wed, May 25, 2011 at 1:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: [ shrug... ]  When you don't have complete information, it's *always* the case that you will sometimes make a mistake.  That's not justification for paralysis, especially not when the existing code is demonstrably broken.

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Robert Haas
On Wed, May 25, 2011 at 1:15 PM, Robert Haas robertmh...@gmail.com wrote: I agree that if VACUUM scanned 99% of the table, it's probably fine to use its numbers.  It's also fine to use the numbers from ANALYZE, because those pages are chosen randomly.  What bothers me is the idea of using a

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Cédric Villemain
2011/5/25 Alvaro Herrera alvhe...@commandprompt.com: Excerpts from Kevin Grittner's message of mié may 25 12:37:24 -0400 2011: Tom Lane t...@sss.pgh.pa.us wrote: I don't know what client-side code might be looking at relpages/reltuples. I know that I find reltuples useful for getting an

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Alvaro Herrera
Excerpts from Cédric Villemain's message of mié may 25 13:24:01 -0400 2011: Well, we only actually need to store one number, because you can figure out a much more precise number-of-pages figure with pg_relation_size() divided by configured page size. I may miss something but we need

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Cédric Villemain
2011/5/25 Alvaro Herrera alvhe...@commandprompt.com: Excerpts from Cédric Villemain's message of mié may 25 13:24:01 -0400 2011: Well, we only actually need to store one number, because you can figure out a much more precise number-of-pages figure with pg_relation_size() divided by

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Wed, May 25, 2011 at 1:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Because the problem is not specific to TOAST tables.  As things currently stand, we will accept the word of an ANALYZE as gospel even if it scanned 1% of the table, and completely

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Robert Haas
On Wed, May 25, 2011 at 5:54 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, May 25, 2011 at 1:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Because the problem is not specific to TOAST tables.  As things currently stand, we will accept the word of an

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Greg Stark
On Wed, May 25, 2011 at 10:05 PM, Greg Stark gsst...@mit.edu wrote: updated_density = old_density + (new_density - old_density) * reliability new_reltuples = updated_density * new_relpages This amounts to assuming that the pages observed in the vacuum have the density observed and the pages

Re: [HACKERS] [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum

2011-05-25 Thread Greg Stark
On Wed, May 25, 2011 at 9:41 AM, Tom Lane t...@sss.pgh.pa.us wrote: Yeah, I had been thinking about the latter point.  We could be conservative and just keep the reported tuple density the same (ie, update relpages to the new correct value, while setting reltuples so that the density ratio