> No, it's not that easy. The question you have to ask is "when has that > initial write burst stopped?". As an example, if autovacuum happened to > see that table in the instant after CREATE, it might autovacuum it while > it's still empty, and then this rule fails to trigger any further effort.
Well, frankly, it would be useful to know it's empty too. If you really wanna see that 1000-row default estimate bite you on the tuchas, try a JOIN against an empty new table. > Personally I've always thought that autovacuum's rules should be based > on a percentage of rows changed, not an absolute threshold (or maybe in > addition to an absolute threshold). This way, if you create a table and > insert 10 rows, that would make it subject to analyze on-sight, even if > autovac had managed to pass by while it was still empty, because the > percentage-changed is infinite. Then, if you insert the other 35 rows > you meant to insert, it's *again* subject to autoanalyze on the next > pass, because the percentage-changed is still 350% > I remember having got voted down on the percentage approach back when > we first put AV into core, but I remain convinced that decision was a > bad one. Yeah, I was one of the ones voting against you. The reason not to have percentage-only is for small tables. Imagine that you have a table with 18 rows, and analyze_threshold is 0 and analyze_scale_factor is 0.1. Add two rows --> ANALYZE UPDATE two rows --> ANALYZE UPDATE three more rows --> ANALYZE DELETE three rows --> ANALYZE Without a threshold, any table under 100 rows which gets UPDATEs would be continually in the autoanalyze queue, which would not be beneficial; the churn in pg_statistic alone would be detrimental. I guess the real issue is that we tried to solve the issue of ANALYZEing tables of radically different sizes with a simple heuristic of threshold+scale_factor, and that's proving too simple for actual production sites. The other end where autoanalyze often falls down is the high end (tables with a million rows). Can anyone think of a new heuristic which doesn't involve adding 2-6 new GUCS knobs? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers