On Tue, Oct 1, 2019 at 5:55 AM Alexander Korotkov
<a.korot...@postgrespro.ru> wrote:
> On Mon, Sep 30, 2019 at 10:54 PM Peter Geoghegan <p...@bowt.ie> wrote:
> >
> > On Sun, Sep 29, 2019 at 8:12 AM Alexander Korotkov
> > <a.korot...@postgrespro.ru> wrote:
> > > I just managed to reproduce this using two sessions on master branch.
> > >
> > > session 1
> > > session 2
> >
> > Was the involvement of the pending list stuff in Chen's example just a
> > coincidence? Can you recreate the problem while eliminating that
> > factor (i.e. while setting fastupdate to off)?
> >
> > Chen's example involved an INSERT that deadlocked against VACUUM --
> > not a SELECT. Is this just a coincidence?
>
> Chen wrote.
>
> > Unfortunately the insert process(run by gcore) held no lwlock, it should be
> > another process(we did not fetch core file) that hold the lwlock needed for
> > autovacuum process.
>
> So, he catched backtrace for INSERT and post it for information. But
> since INSERT has no lwlocks held, it couldn't participate deadlock.
> It was just side waiter.
>
> I've rerun my reproduction case and it still deadlocks. Just the same
> steps but GIN index with (fastupdate = off).

BTW, while trying to revise the README I found another bug. It appears
to be possible to reach a deleted page from a downlink. The reproduction
case is as follows.

session 1                               session 2

# create table tmp (ar int[]) with (autovacuum_enabled = false);
# insert into tmp (select '{1}' from generate_series(1,10000000) i);
# insert into tmp values ('{1,2}');
# insert into tmp (select '{1}' from generate_series(1,10000000) i);
# create index tmp_idx on tmp using gin(ar);
# delete from tmp;

# set max_parallel_workers_per_gather = 0;
/* Breakpoint where entryLoadMoreItems() calls ginFindLeafPage() to
   search GIN posting tree */
gdb> b ginget.c:682
gdb> select * from tmp where ar @> '{1,2}';
gdb> /* step till ReleaseAndReadBuffer() releases a buffer */

                                        # vacuum tmp;

# continue

It also appears that the previous version of the deadlock fix didn't
supply a left sibling for the leftmost child of any page. As a result,
internal pages were never deleted. The first attached patch is the
revised fix.

The second patch fixes traversal to a deleted page via a downlink.
Similarly to nbtree, we just always move right if we land on a deleted
page. Also, it appears that we clear all other flags while marking a
page as deleted. That causes an assert to fire. With the patch, we just
add the deleted flag without erasing the others. Also, I had to remove
the assert that ginStepRight() never steps to a deleted page. If we
land on a deleted page from a downlink, then we can find another
deleted page by following its rightlink.

I'm planning to continue work on the README, comments and commit
messages.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
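
For readers following along, the two behaviours described for the second patch (adding the deleted flag without erasing the others, and moving right whenever we land on a deleted page) can be illustrated with a minimal standalone sketch. This is not code from the patches: the DemoPage struct, the DEMO_* flag values and all demo_* function names are hypothetical and chosen only for illustration, and the "pages" are plain array entries rather than real GIN buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only; not taken from the GIN headers. */
#define DEMO_DATA          (1 << 0)
#define DEMO_LEAF          (1 << 1)
#define DEMO_DELETED       (1 << 2)

#define DEMO_INVALID_BLOCK (-1)

/* Hypothetical stand-in for a page: a flag word plus a rightlink. */
typedef struct DemoPage
{
    uint16_t flags;
    int      rightlink;   /* array index of right sibling, or DEMO_INVALID_BLOCK */
} DemoPage;

/* Assignment overwrites the flag word, so DATA/LEAF and any other bits are lost. */
void
demo_mark_deleted_clobber(DemoPage *page)
{
    page->flags = DEMO_DELETED;
}

/* OR-ing in the bit marks the page deleted and keeps every other flag intact. */
void
demo_mark_deleted_keep(DemoPage *page)
{
    page->flags |= DEMO_DELETED;
}

bool
demo_is_deleted(const DemoPage *page)
{
    return (page->flags & DEMO_DELETED) != 0;
}

/*
 * Starting from the page a downlink pointed at, keep following rightlinks
 * while the current page is deleted. Several deleted pages may be chained,
 * so this is a loop rather than a single step. Returns the first live page,
 * or DEMO_INVALID_BLOCK if the chain ends without one.
 */
int
demo_skip_deleted(const DemoPage *pages, int blkno)
{
    assert(pages != NULL);

    while (blkno != DEMO_INVALID_BLOCK && demo_is_deleted(&pages[blkno]))
        blkno = pages[blkno].rightlink;

    return blkno;
}
```

With three chained pages where the first two are marked via demo_mark_deleted_keep(), demo_skip_deleted() starting at the first returns the third, and the deleted pages still carry their DEMO_DATA and DEMO_LEAF bits, which is the property an assertion on the page type would rely on.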
0001-gin_ginDeletePage_ginStepRight_deadlock_fix-2.patch
0002-gin-fix-traversing-to-deleted-page-by-downlink-2.patch