On 13/07/18 21:28, Andrey Borodin wrote:
13 июля 2018 г., в 18:25, Heikki Linnakangas <hlinn...@iki.fi>
написал(а):
Looking at the second patch, to scan the GiST index in physical
order, that seems totally unsafe, if there are any concurrent page
splits. In the logical scan, pushStackIfSplited() deals with that,
by comparing the page's NSN with the parent's LSN. But I don't see
anything like that in the physical scan code.
Leaf page can be pointed by internal page and rightlink
simultaneously. The purpose of NSN is to visit this page exactly once
by following only on of two links in a scan. This is achieved
naturally if we read everything from the beginning to the end. (That
is how I understand, I can be wrong)
The scenario where this fails goes like this:
1. Vacuum scans physical pages 1-10
2. A concurrent insertion splits page 15. The new left half stays on
page 15, but the new right half goes to page 5
3. Vacuum scans pages 11-20
Now, if there were any dead tuples on the right half of the split, moved
to page 5, the vacuum would miss them.
The way this is handled in B-tree is that when a page is split, the page
is stamped with current "vacuum cycle id". When the vacuum scan
encounters a page with the current cycle id, whose right-link points to
a lower-numbered page, it immediately follows the right link, and
re-scans it. I.e. in the above example, if it was a B-tree, in step 3
when vacuum scans page 15, it would see that it was concurrently split.
It would immediately vacuum page 5 again, before continuing the scan in
physical order.
We'll need to do something similar in GiST.
- Heikki