On Thu, 26 Feb 2026 at 22:03, Madhav Madhusoodanan <[email protected]> wrote:
>
> On Tue, Aug 26, 2025 at 2:11 PM Kirk Wolak <[email protected]> wrote:
> > I do have a question, one of the IDEAS we discussed was to ADD a new page
> > that combined the 2 pages.
>
> Would the flow then be as follows? Please correct me if I'm wrong:
>
> Start: Parent page P, with adjacent child pages A -> B -> C -> D.
> Pages B and C are sparse enough and are about to be merged.
>
> 1: Acquire locks on pages B and C
> 2: Create a new page N, which copies the tuples in pages B and C
> 3: Acquire lock on parent page P, update the separator keys in P,
>    release lock on P
> 4: Update pointers such that pages link like so: A -> N -> D
> 5: Release locks on pages B and C
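The quoted five-step flow can be sketched as a toy in-memory model. This is purely illustrative: `Page`, its `next` sibling pointer, and the lock objects are hypothetical stand-ins, not PostgreSQL's actual nbtree structures, and the separator-key update is elided.

```python
import threading

class Page:
    """Toy B-tree leaf page: a name, some tuples, a right-sibling link."""
    def __init__(self, name, tuples):
        self.name = name
        self.tuples = list(tuples)
        self.next = None                  # right-sibling pointer
        self.lock = threading.Lock()

def merge_siblings(parent_lock, a, b, c, d):
    """Merge sparse siblings B and C into a fresh page N: A -> N -> D."""
    # Step 1: lock the pages being merged.
    with b.lock, c.lock:
        # Step 2: create N holding the union of B's and C's tuples.
        n = Page("N", b.tuples + c.tuples)
        # Step 3: lock the parent, fix separator keys (elided), unlock.
        with parent_lock:
            pass                          # separator-key maintenance here
        # Step 4: relink the sibling chain. Note that A is modified
        # without ever being locked, and D is never locked either.
        a.next = n
        n.next = d
    # Step 5: B's and C's locks are released by the `with` block.
    return n
```

The sketch makes the gap in the flow visible: step 4 mutates page A while holding no lock on A or D, which is exactly the issue raised in the reply below the quote.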
That is one part of the picture (the merging part), but it's missing
a lot of details:

- How do concurrent workloads (index scans, backwards index scans,
  index insertions) detect and recover from concurrent merges when
  they step through pages?
- Do you have a theory or proof of correctness for the above?
- Can this scheme be implemented without adding significant overhead
  to current workloads that don't benefit from the new feature?

One example of a problem with the given flow: where (and how) do you
update the sibling pointers in pages A (to N, from B) and D (to N,
from C)? You haven't explicitly locked pages A and D by the time you
get to step 4, even though locking pages is critical to guaranteeing
correct and safe operations. Simply locking them in step 4 isn't
sufficient either: another process may have locked page A in
preparation for a split and would then be waiting to get a lock on
page B. If you've already locked page B before locking page A, you
have a deadlock.

Additionally, how does this work when you find out in step 3 that
pages B and C are each linked to from a different parent page? You'd
have to update both parent pages, but that would require locking more
than just the one parent page per level that currently gets locked
during page deletion.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)
