During Vacuum, cleanup in btree need to acquire cleanup lock on the
block and the non-MVCC scans or scans on unlogged index retain the pin
on a buffer during scan of a block.  Now, if the scan is suspended
(use cursor and then fetch one row) after holding a pin on a buffer in
Session-1 and in Session-2 we initiate the vacuum which will wait for
the suspended scan in Session-1 to finish (by this time vacuum has a
ShareUpdateExclusiveLock on the table), now if in Session-1, user
tries to acquire a conflicting lock on the same table, it will create
a deadlock.

Simple test to reproduce deadlock:

create unlogged table ul_t1(c1 int);
create index idx_ul_t1 on ul_t1 (c1);
insert into ul_t1 values(1);
insert into ul_t1 values(2);
delete from ul_t1 where c1=1;

set enable_seqscan = off;
begin transaction;
declare c1 cursor for select * from ul_t1 where c1=2;
fetch next from c1; --every thing is good till here

vacuum ul_t1;  -- This will wait for session-1 to complete the scan

lock table ul_t1 in Access Exclusive Mode;  -- This will wait for
session-2 to complete the vacuum

If we agree that above is a problematic case, then some of the options
to solve it could be (a) Vacuum should not wait for a cleanup lock and
instead just give up and start again which I think is a bad idea (b)
don't allow to take lock of higher granularity after the scan is
suspended, not sure if that is feasible (c) document the above danger,
this sounds okay on the ground that nobody has reported the problem
till now

I have found this problem while analysing similar case for hash index
interaction with vacuum as is being discussed on that thread [1].  We
could think of solving it in a way as proposed on that thread, but I
think as proposed that is not safe for non-MVCC scans.


[1] - 

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to