> BTW, I just realized another bug in the patch: btbulkdelete fails to
> guarantee that it visits every page in the index. It was OK for
> btvacuumcleanup to ignore pages added to the index after it starts,
> but btbulkdelete has to deal with such pages.
Actually, as written this patch does not work. At all. btbulkdelete
has to guarantee that it removes every index entry it's told to, and
it cannot guarantee that in the presence of concurrent page splits.
A split could move index items from a page that btbulkdelete hasn't
visited to one it's already passed over. This is not possible with an
index-order traversal (because splits only move items to the right)
but it's definitely possible with a physical-order traversal.
I was toying with the idea of remembering deletable pages (which
btvacuumcleanup does anyway), which are the only ones that page splits
could move items to, and then rescanning those after the completion
of the primary pass. This has a couple of pretty unpleasant
* We have to remember *every* deletable page for correctness, compared
to the current situation where btvacuumcleanup can bound the number of
pages it tracks. This creates a situation where VACUUM may fail
outright if it doesn't have gobs of memory. Since one of the main
reasons for developing lazy VACUUM was to ensure we could vacuum
arbitrarily large tables in bounded memory, I'm not happy with this.
* The rescan could be far from cheap if there are many such pages.
Also, I'm not sure when you can stop: it certainly seems possible that
items could be moved down during the primary scan and then moved down
again while btbulkdelete is reconsidering the deletable pages. Without
locking out splits entirely, it's far from obvious that we can make it
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?