On Mon, Feb 15, 2021 at 7:26 PM Peter Geoghegan <p...@bowt.ie> wrote: > Actually, there is one more reason why I bring up idea 1 now: I want > to hear your thoughts on the index AM API questions now, which idea 1 > touches on. Ideally all of the details around the index AM VACUUM APIs > (i.e. when and where the extra work happens -- btvacuumcleanup() vs > btbulkdelete()) won't need to change much in the future. I worry about > getting this index AM API stuff right, at least a little.
Speaking of problems like this, I think I spotted an old one: we call _bt_update_meta_cleanup_info() in either btbulkdelete() or btvacuumcleanup(). I think that we should always call it in btvacuumcleanup(), though -- even in cases where there is no call to btvacuumscan() inside btvacuumcleanup() (because btvacuumscan() happened earlier instead, during the btbulkdelete() call). This makes the value of IndexVacuumInfo.num_heap_tuples (which is what we store in the metapage) much more accurate -- right now it's always pg_class.reltuples from *before* the VACUUM started. And so the btm_last_cleanup_num_heap_tuples value in a nbtree metapage is often kind of inaccurate. This "estimate during ambulkdelete" issue is documented here (kind of): /* * Struct for input arguments passed to ambulkdelete and amvacuumcleanup * * num_heap_tuples is accurate only when estimated_count is false; * otherwise it's just an estimate (currently, the estimate is the * prior value of the relation's pg_class.reltuples field, so it could * even be -1). It will always just be an estimate during ambulkdelete. */ typedef struct IndexVacuumInfo { ... } The name of the metapage field is already btm_last_cleanup_num_heap_tuples, which already suggests the approach that I propose now. So why don't we do it like that already? (Thinks some more...) I wonder: did this detail change at the last minute during the development of the feature (just before commit 857f9c36) back in early 2018? That change have made it easier to deal with oldestBtpoXact/btm_oldest_btpo_xact, which IIRC was a late addition to the patch -- so maybe it's truly an accident that the code doesn't work the way that I suggest it should already. (It's annoying to make state from btbulkdelete() appear in btvacuumcleanup(), unless it's from IndexVacuumInfo or something -- I can imagine this changing at the last minute, just for that reason.) Do you think that this needs to be treated as a bug in the backbranches, Masahiko? I'm not sure... In any case we should probably make this change as part of Postgres 14. Don't you think? It's certainly easy to do it this way now, since there will be no need to keep around a oldestBtpoXact value until btvacuumcleanup() (in the common case where btbulkdelete() is where we call btvacuumscan()). The new btm_last_cleanup_num_delpages field (which replaces btm_oldest_btpo_xact) has a value that just comes from the bulk stats, which is easy anyway. -- Peter Geoghegan