Hi, So here is an updated version of the patch/fix, addressing the remaining issues in v3 posted by Tom in November.
The 0001 part is actually a bugfix in bloom and spgist index AM, which did something like this: reltuples = IndexBuildHeapScan(...) result->heap_tuples = result->index_tuples = reltuples; That is, these two AMs simply used the number of heap rows for the index. That does not work for partial indexes, of course, where the correct index reltuples value is likely much lower. 0001 fixes this by tracking the number of actually indexed rows in the build states, just like in the other index AMs. A VACUUM or ANALYZE will fix the estimate, of course, but for tables that are not changing very much it may take quite a while. So I think this is something we definitely need to back-patch. The 0002 part is the main part, unifying the definition of reltuples on three main places: a) acquire_sample_rows (ANALYZE) b) lazy_scan_heap (VACUUM) c) IndexBuildHeapRangeScan (CREATE INDEX) As the ANALYZE case seems the most constrained, the other two places were updated to use the same criteria for which rows to include in the reltuples estimate: * HEAPTUPLE_LIVE * HEAPTUPLE_INSERT_IN_PROGRESS (same transaction) * HEAPTUPLE_DELETE_IN_PROGRESS (not the same trasaction) This resolves the issue with oscillating reltuples estimates, produced by VACUUM and ANALYZE (with many non-removable dead tuples). I've checked all IndexBuildHeapRangeScan callers, and none of them is using the reltuples estimate for anything except for passing it to index_update_stats. Aside from the bug fixed in 0001, of course. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
0001-Compute-index_tuples-correctly-in-bloom-and-spgist.patch.gz
Description: application/gzip
0002-Unify-the-definition-of-reltuples-in-VACUUM-ANALYZE-.patch.gz
Description: application/gzip