On Mon, Nov 2, 2020 at 10:03 AM Peter Geoghegan <p...@bowt.ie> wrote:
> Attached is my proposed fix, which takes this approach. I will commit
> this on Wednesday or Thursday, barring any objections.

Just to be clear: I am not proposing that we set
'IndexBulkDeleteResult.estimated_count = true' here, even though there
is a certain sense in which we now accept an unreliable figure in
Postgres 13. This is what GIN does. That approach doesn't seem
appropriate for nbtree + deduplication, which is much closer to nbtree
in Postgres 12 than to GIN. I believe that the final num_index_tuples
value (generated during cleanup-only nbtree VACUUM) is in general
sufficiently reliable to not be treated as an estimate by vacuumlazy.c
-- the pg_class entry for the index should still be updated in
update_index_statistics().

In other words, I think that the remaining posting-list related
inaccuracies are comparable to the existing inaccuracies caused by
concurrent page splits during nbtree vacuuming (I describe the problem
right next to an old comment about that issue, in fact). What we have
in both cases is an artifact of how the data is physically represented
and the difficulty it causes us during vacuuming, in certain cases.
There are known error bars. That's why we shouldn't treat
num_index_tuples as merely an estimate.

--
Peter Geoghegan
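
For readers who want the rule spelled out, here is a minimal standalone
sketch of the behavior described above: a count flagged as an estimate
does not overwrite the index's stored tuple count, while a count that is
not flagged does. This is not the actual vacuumlazy.c code. The Sketch*
types and sketch_update_index_statistics() are hypothetical stand-ins
that model only the two IndexBulkDeleteResult fields named in this
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two IndexBulkDeleteResult fields discussed. */
typedef struct SketchBulkDeleteResult
{
    double num_index_tuples;    /* tuple count reported by the index AM */
    bool   estimated_count;     /* true: the count is only an estimate */
} SketchBulkDeleteResult;

/* Hypothetical stand-in for the index's pg_class entry. */
typedef struct SketchIndexEntry
{
    double reltuples;
} SketchIndexEntry;

/*
 * Copy the reported count into the catalog entry unless there is no
 * result or the result is flagged as an estimate. Returns true if the
 * entry was updated.
 */
bool
sketch_update_index_statistics(SketchIndexEntry *entry,
                               const SketchBulkDeleteResult *stats)
{
    assert(entry != NULL);

    if (stats == NULL || stats->estimated_count)
        return false;           /* leave the existing value alone */

    entry->reltuples = stats->num_index_tuples;
    return true;
}
```

Under this model, leaving estimated_count unset for nbtree means the
final num_index_tuples value still reaches the index's pg_class entry,
which is the outcome argued for above.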