On Fri, Jun 22, 2018 at 4:24 AM, Andrey V. Lepikhov
<a.lepik...@postgrespro.ru> wrote:
> Following your feedback, I have developed a second version of the patch.
> In this version:
> 1. The high-level functions index_beginscan() and index_rescan() are no
> longer used. Tree descent is done by _bt_search(), and _bt_binsrch() is
> used for positioning within the page.
> 2. A TID list was introduced in the amtargetdelete() interface. Now only
> one tree descent is needed to delete all TIDs from the list that share
> an equal scan key value -- the logical duplicates deletion problem.
> 3. Only one WAL record is written for index tuple deletion per leaf
> page per amtargetdelete() call.

Cool. What is this "race" code about?

> +    buffer = ReadBufferExtended(rel, MAIN_FORKNUM,
> +                                ItemPointerGetBlockNumber(tid),
> +                                RBM_NORMAL, NULL);
> +    LockBuffer(buffer, BUFFER_LOCK_SHARE);
> +
> +    page = (Page) BufferGetPage(buffer);
> +    offnum = ItemPointerGetOffsetNumber(tid);
> +    lp = PageGetItemId(page, offnum);
> +
> +    /*
> +     * VACUUM Races: someone already remove the tuple from HEAP. Ignore it.
> +     */
> +    if (!ItemIdIsUsed(lp))
> +        return NULL;

It looks wrong -- why should another process have set the item as
unused? And if we assume that that's possible at all, what's to stop a
third process from actually reusing the item pointer before we arrive
(at get_tuple_by_tid()), leaving us to find a tuple that is totally
unrelated to the original tuple to be deleted? (Also, you're not
releasing the buffer lock before you return.)
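To be concrete about that last point: even if the early return stays,
it must not leak the buffer pin and content lock. An untested fragment
showing the shape I'd expect (this uses only names from the quoted
code, plus bufmgr's UnlockReleaseBuffer(); it says nothing about
whether the ItemIdIsUsed() test is justified in the first place):

    if (!ItemIdIsUsed(lp))
    {
        /* Drop the content lock and the pin before bailing out */
        UnlockReleaseBuffer(buffer);
        return NULL;
    }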
> 4. VACUUM can sort the TID list in advance, for a faster search of
> duplicates.

This version of the patch prevents my own "force unique keys" patch
from working, since you're not using my proposed new
_bt_search()/_bt_binsrch()/_bt_compare() interface (you're not passing
them a heap TID).

It is essential that your patch be able to quickly reach any tuple that
it needs to kill. Otherwise, the worst-case performance is totally
unacceptable; it will never be okay to go through 10%+ of the index to
kill a single tuple hundreds or even thousands of times per VACUUM. It
seems to me that doing this tid_list_search() binary search is
pointless -- you should instead be relying on logical duplicates using
their heap TID as a tie-breaker. Rather than doing a binary search
within tid_list_search(), you should instead match the presorted heap
TIDs at the leaf level against the sorted in-memory TID list. You know,
a bit like a merge join (there's a rough sketch of what I mean at the
end of this mail).

I suggest that you go even further than this: I think that you should
just start distributing my patch as part of your patch series. You can
change my code if you need to. I also suggest using "git format-patch"
with simple, short commit messages to produce patches. That makes it a
lot easier to track the versions of the patch, the changes over time,
and so on.

I understand why you'd hesitate to take ownership of my code (it has
big problems!), but the reality is that all the problems that my patch
has are also problems for your patch. One patch cannot get committed
without the other, so they are already the same project. As a bonus, my
patch will probably improve the best-case performance for your patch,
since multi-deletions will now have much better locality of access.

--
Peter Geoghegan
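PS: Here is the merge-join-style matching I have in mind, as a rough,
untested sketch. The function name, the deltids/deletable arrays, and
the assumption that every data item on the page shares one scan key
value (so that, with my tie-breaker patch, items appear in heap TID
order) are all mine, purely for illustration -- nothing here is taken
from either patch as it stands:

    #include "postgres.h"
    #include "access/itup.h"
    #include "access/nbtree.h"
    #include "storage/bufpage.h"

    /*
     * Match the heap TIDs of a leaf page's items against a sorted TID
     * list in a single coordinated pass.  Assumes all data items on
     * the page share one scan key value, so both sides are ordered by
     * heap TID and each side only ever advances forward -- no binary
     * search per TID.  Returns the number of matching offsets stored
     * in deletable[].
     */
    static int
    leaf_match_deltids(Page page, ItemPointer deltids, int ndeltids,
                       OffsetNumber *deletable)
    {
        BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
        OffsetNumber offnum = P_FIRSTDATAKEY(opaque);
        OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
        int          i = 0;
        int          ndeletable = 0;

        while (offnum <= maxoff && i < ndeltids)
        {
            ItemId     itemid = PageGetItemId(page, offnum);
            IndexTuple itup = (IndexTuple) PageGetItem(page, itemid);
            int32      cmp = ItemPointerCompare(&itup->t_tid, &deltids[i]);

            if (cmp == 0)
            {
                /* Heap TIDs match: mark this index tuple for deletion */
                deletable[ndeletable++] = offnum;
                offnum = OffsetNumberNext(offnum);
                i++;
            }
            else if (cmp < 0)
            {
                /* Leaf item sorts before current TID; advance leaf side */
                offnum = OffsetNumberNext(offnum);
            }
            else
            {
                /* Current TID isn't on this page; advance the list side */
                i++;
            }
        }

        return ndeletable;
    }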