https://www.postgresql.org/message-id/CAJEAwVE4UAmm8fr%2BNW8XTnKV6M--ACoNhL3ES8yoKL2sKhbaiw%40mail.gmail.com
Let me re-summarize what's been done here to make sure I understand: Each key in GIN has its own posting tree, which is itself a btree, holding all of the tuples that contain that key. This posting tree is really just a set of tuples, and searches always scan an entire posting tree (because all the tuples contain the key, so all are a potential match). Currently, the cleanup process requires blocking all reads and writes in the posting list. I assume this is only a problem when a key is common enough that its posting list is quite large (how large?). This patch makes a couple changes to improve concurrency: 1. Vacuum leaves first, without removing any pages. Instead, just keep track of pages which are completely empty (more generally, subtrees that contain empty pages). 2. Free the empty pages in those subtrees. That requires blocking reads and writes to that subtree, but not the entire posting list. Blocking writes is easy: acquiring a cleanup lock on the root of any subtree always blocks any writes to that subtree, because the write keeps a pin on all pages in that path. Blocking reads is accomplished by locking the page to be deleted, then locking/unlocking the page one left of that (which may be outside the subtree?). Maybe a basic question, but why is the posting list a btree at all, and not, say a doubly-linked list? Review of the code itself: * Nice and simple, with a net line count of only +45. * Remove commented-out code. * In the README, "leaf-to-right" should be left-to-right; and "takinf" should be "taking". * When you say the leftmost page is never deleted; do you mean the leftmost of the entire posting tree, or leftmost of the subtree that you are removing pages from? * In order to keep out concurrent reads, you need to lock/unlock the left page while holding exclusive lock on the page being deleted, but I didn't see how that happens exactly in the code. Where does that happen? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers