On Wed, Nov 8, 2017 at 5:41 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Mon, Nov 6, 2017 at 4:42 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote: >>>> I suggest that a good thing to do more or less immediately, regardless >>>> of when this patch ends up being ready, would be to insert an >>>> insertion that LockAcquire() is never called while holding a lock of >>>> one of these types. If that assertion ever fails, then the whole >>>> theory that these lock types don't need deadlock detection is wrong, >>>> and we'd like to find out about that sooner or later. >>> >>> I understood. I'll check that first. >> >> I've checked whether LockAcquire is called while holding a lock of one >> of four types: LOCKTAG_RELATION_EXTENSION, LOCKTAG_PAGE, >> LOCKTAG_TUPLE, and LOCKTAG_SPECULATIVE_TOKEN. To summary, I think that >> we cannot move these four lock types together out of heavy-weight >> lock, but can move only the relation extension lock with tricks. >> >> Here is detail of the survey. > > Thanks for these details, but I'm not sure I fully understand. > >> * LOCKTAG_RELATION_EXTENSION >> There is a path that LockRelationForExtension() could be called while >> holding another relation extension lock. In brin_getinsertbuffer(), we >> acquire a relation extension lock for a index relation and could >> initialize a new buffer (brin_initailize_empty_new_buffer()). During >> initializing a new buffer, we call RecordPageWithFreeSpace() which >> eventually can call fsm_readbuf(rel, addr, true) where the third >> argument is "extend". We can process this problem by having the list >> (or local hash) of acquired locks and skip acquiring the lock if >> already had. For other call paths calling LockRelationForExtension, I >> don't see any problem. > > Does calling fsm_readbuf(rel,addr,true) take some heavyweight lock?
No, I meant fsm_readbuf(rel,addr,true) can acquire a relation extension lock. So it's not problem. > Basically, what matters here in the end is whether we can articulate a > deadlock-proof rule around the order in which these locks are > acquired. You're right, my survey was not enough to make a decision. As far as the acquiring these four lock types goes, there are two call paths that acquire any type of locks while holding another type of lock. The one is that acquiring a relation extension lock and then acquiring a relation extension lock for the same relation again. As explained before, this can be resolved by remembering the holding lock (perhaps holding only last one is enough). Another is that acquiring either a tuple lock, a page lock or a speculative insertion lock and then acquiring a relation extension lock. In the second case, we try to acquire these two locks in the same order; acquiring 3 types lock and then extension lock. So it's not problem if we apply the rule that is that we disallow to try acquiring these three lock types while holding any relation extension lock. Also, as far as I surveyed there is no path to acquire a relation lock while holding other 3 type locks. > The simplest such rule would be "you can only acquire one > lock of any of these types at a time, and you can't subsequently > acquire a heavyweight lock". But a more complicated rule would be OK > too, e.g. "you can acquire as many heavyweight locks as you want, and > after that you can optionally acquire one page, tuple, or speculative > token lock, and after that you can acquire a relation extension lock". > The latter rule, although more complex, is still deadlock-proof, > because the heavyweight locks still use the deadlock detector, and the > rest has a consistent order of lock acquisition that precludes one > backend taking A then B while another backend takes B then A. I'm not > entirely clear whether your survey leads us to a place where we can > articulate such a deadlock-proof rule. Speaking of the acquiring these four lock types and heavy weight lock, there obviously is a call path to acquire any of four lock types while holding a heavy weight lock. In reverse, there also is a call path that we acquire a heavy weight lock while holding any of four lock types. The call path I found is that in heap_delete we acquire a tuple lock and call XactLockTableWait or MultiXactIdWait which eventually could acquire LOCKTAG_TRANSACTION in order to wait for the concurrent transactions finish. But IIUC since these functions acquire the lock for the concurrent transaction's transaction id, deadlocks doesn't happen. However, there might be other similar call paths if I'm missing something. For example, we do some operations that might acquire any heavy weight locks other than LOCKTAG_TRANSACTION, while holding a page lock (in ginInsertCleanup) or holding a specualtive insertion lock (in nodeModifyTable). To summary, I think we can put the following rules in order to move four lock types out of heavy weight lock. 1. Do not acquire either a tuple lock, a page lock or a speculative insertion lock while holding a extension lock. 2. Do not acquire any heavy weight lock except for LOCKTAG_TRANSACTION while holding any of these four lock types. Also I'm concerned that it imposes the rules for developers which is difficult to check statically. We can put several assertions to source code but it's hard to test the all possible paths by regression tests. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers