On Sat, Apr 5, 2025 at 2:02 AM Konstantin Knizhnik <knizh...@garret.ru> wrote: > > A more targeted solution to your specific problem would be to update > the visibility map on access. Then, the first time you have to fetch > that heap page, you could mark it all-visible (assuming the long > running transaction has ended) and then the next index-only scan > wouldn't have to do the same heap fetch. It doesn't add any overhead > in the case that the long running transaction has not ended, unlike > trying to trigger another autovacuum. > > I really considered this alternative when thinking about the solution of the > problem. It is more consistent with hint bit approach. > I declined it in favor of this solution because of the following reasons: > > 1. Index-only scan holds read-only lock on heap page. In order to update it, > we need to upgrade this lock to exclusive. > 2. We need to check visibility for all elements on the page (actually do > something like `heap_page_is_all_visible`) but if there is large number > elements at the page it can be quite expensive. And I afraid that it can > slowdown speed of index-only scan. Yes, only in "slow case" - when it has to > access heap to perform visibility check. But still it may be not acceptable. > Also it is not clear how to mark page as already checked. Otherwise we will > have to repeat this check for all tids referring this page. > 3. `heap_page_is_all_visible` is local to lazyvaccum.c. So to use it in > index-only scan we either have to make it global, either cut&paste it's code. > Just removing "static" is not possible, because it is using local > `LVRelState`, so some refactoring is needed in any case. > 4. We need to wal-log VM page and heap pages in case of setting all-visible > bit. It is quite expensive operation. Doing it inside index-only scan can > significantly increase time of select. Certainly Postgres is not a real-time > DBMS. But still it is better to provide some predictable query execution > time. This is why I think that it is better to do such workt in background > (in vaccum).
I wasn't thinking about adding a new VM setting functionality to index only scan in particular. heapam_index_fetch_tuple() already calls heap_page_prune_opt() which will do pruning under certain conditions. I was thinking that we start updating the VM after pruning in the on-access case too (not just when pruning is invoked by vacuum). If you look at the callers of heap_page_prune_opt(), it includes bitmap heap scan and also heap_prepare_pagescan() which is invoked as part of sequential scans and other operations. - Melanie