Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

Masahiko Sawada Wed, 08 Nov 2017 06:42:12 -0800

On Wed, Nov 8, 2017 at 5:41 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Nov 6, 2017 at 4:42 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
>>>> I suggest that a good thing to do more or less immediately, regardless
>>>> of when this patch ends up being ready, would be to insert an
>>>> insertion that LockAcquire() is never called while holding a lock of
>>>> one of these types.  If that assertion ever fails, then the whole
>>>> theory that these lock types don't need deadlock detection is wrong,
>>>> and we'd like to find out about that sooner or later.
>>>
>>> I understood. I'll check that first.
>>
>> I've checked whether LockAcquire is called while holding a lock of one
>> of four types: LOCKTAG_RELATION_EXTENSION, LOCKTAG_PAGE,
>> LOCKTAG_TUPLE, and LOCKTAG_SPECULATIVE_TOKEN. To summary, I think that
>> we cannot move these four lock types together out of heavy-weight
>> lock, but can move only the relation extension lock with tricks.
>>
>> Here is detail of the survey.
>
> Thanks for these details, but I'm not sure I fully understand.
>
>> * LOCKTAG_RELATION_EXTENSION
>> There is a path that LockRelationForExtension() could be called while
>> holding another relation extension lock. In brin_getinsertbuffer(), we
>> acquire a relation extension lock for a index relation and could
>> initialize a new buffer (brin_initailize_empty_new_buffer()). During
>> initializing a new buffer, we call RecordPageWithFreeSpace() which
>> eventually can call fsm_readbuf(rel, addr, true) where the third
>> argument is "extend". We can process this problem by having the list
>> (or local hash) of acquired locks and skip acquiring the lock if
>> already had. For other call paths calling LockRelationForExtension, I
>> don't see any problem.
>
> Does calling fsm_readbuf(rel,addr,true) take some heavyweight lock?


No, I meant fsm_readbuf(rel,addr,true) can acquire a relation
extension lock. So it's not problem.

> Basically, what matters here in the end is whether we can articulate a
> deadlock-proof rule around the order in which these locks are
> acquired.

You're right, my survey was not enough to make a decision.

As far as the acquiring these four lock types goes, there are two call
paths that acquire any type of locks while holding another type of
lock. The one is that acquiring a relation extension lock and then
acquiring a relation extension lock for the same relation again. As
explained before, this can be resolved by remembering the holding lock
(perhaps holding only last one is enough). Another is that acquiring
either a tuple lock, a page lock or a speculative insertion lock and
then acquiring a relation extension lock. In the second case, we try
to acquire these two locks in the same order; acquiring 3 types lock
and then extension lock. So it's not problem if we apply the rule that
is that we disallow to try acquiring these three lock types while
holding any relation extension lock. Also, as far as I surveyed there
is no path to acquire a relation lock while holding other 3 type
locks.

>  The simplest such rule would be "you can only acquire one
> lock of any of these types at a time, and you can't subsequently
> acquire a heavyweight lock".  But a more complicated rule would be OK
> too, e.g. "you can acquire as many heavyweight locks as you want, and
> after that you can optionally acquire one page, tuple, or speculative
> token lock, and after that you can acquire a relation extension lock".
> The latter rule, although more complex, is still deadlock-proof,
> because the heavyweight locks still use the deadlock detector, and the
> rest has a consistent order of lock acquisition that precludes one
> backend taking A then B while another backend takes B then A.  I'm not
> entirely clear whether your survey leads us to a place where we can
> articulate such a deadlock-proof rule.

Speaking of the acquiring these four lock types and heavy weight lock,
there obviously is a call path to acquire any of four lock types while
holding a heavy weight lock. In reverse, there also is a call path
that we acquire a heavy weight lock while holding any of four lock
types. The call path I found is that in heap_delete we acquire a tuple
lock and call XactLockTableWait or MultiXactIdWait which eventually
could acquire LOCKTAG_TRANSACTION in order to wait for the concurrent
transactions finish. But IIUC since these functions acquire the lock
for the concurrent transaction's transaction id, deadlocks doesn't
happen.
However, there might be other similar call paths if I'm missing
something. For example, we do some operations that might acquire any
heavy weight locks other than LOCKTAG_TRANSACTION, while holding a
page lock (in ginInsertCleanup) or holding a specualtive insertion
lock (in nodeModifyTable).

To summary, I think we can put the following rules in order to move
four lock types out of heavy weight lock.

1. Do not acquire either a tuple lock, a page lock or a speculative
insertion lock while holding a extension lock.
2. Do not acquire any heavy weight lock except for LOCKTAG_TRANSACTION
while holding any of these four lock types.

Also I'm concerned that it imposes the rules for developers which is
difficult to check statically. We can put several assertions to source
code but it's hard to test the all possible paths by regression tests.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

Reply via email to