On Tue, Mar 10, 2020 at 4:11 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Mon, Feb 24, 2020 at 3:38 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Thu, Feb 20, 2020 at 8:06 AM Andres Freund <and...@anarazel.de> wrote: > > > What I'm advocating is that extension locks should continue to go > > > through lock.c. And yes, that requires some changes to group locking, > > > but I still don't see why they'd be complicated. > > > > > > > Fair position, as per initial analysis, I think if we do below three > > things, it should work out without changing to a new way of locking > > for relation extension or page type locks. > > a. As per the discussion above, ensure in code we will never try to > > acquire another heavy-weight lock after acquiring relation extension > > or page type locks (probably by having Asserts in code or maybe some > > other way). > > I have done an analysis of the relation extension lock (which can be > acquired via LockRelationForExtension or > ConditionalLockRelationForExtension) and found that we don't acquire > any other heavyweight lock after acquiring it. However, we do > sometimes try to acquire it again in the places where we update FSM > after extension, see points (e) and (f) described below. The usage of > this lock can be broadly divided into six categories and each one is > explained as follows: > > a. Where after taking the relation extension lock we call ReadBuffer > (or its variant) and then LockBuffer. The LockBuffer internally calls > either LWLock to acquire or release neither of which acquire another > heavy-weight lock. It is quite obvious as well that while taking some > lightweight lock, there is no reason to acquire another heavyweight > lock on any object. The specs/comments of ReadBufferExtended (which > gets called from variants of ReadBuffer) API says that if the blknum > requested is P_NEW, only one backend can call it at-a-time which > indicates that we don't need to acquire any heavy-weight lock inside > this API. Otherwise, also, this API won't need a heavy-weight lock to > read the existing block into shared buffer as two different backends > are allowed to read the same block. I have also gone through all the > functions called/used in this path to ensure that we don't use > heavy-weight locks inside it. > > The usage by APIs BloomNewBuffer, GinNewBuffer, gistNewBuffer, > _bt_getbuf, and SpGistNewBuffer falls in this category. Another API > that falls under this category is revmap_physical_extend which uses > ReadBuffer, LocakBuffer and ReleaseBuffer. The ReleaseBuffer API > unpins aka decrement the reference count for buffer and disassociates > a buffer from the resource owner. None of that requires heavy-weight > lock. T > > b. After taking relation extension lock, we call > RelationGetNumberOfBlocks which primarily calls file-level functions > to determine the size of the file. This doesn't acquire any other > heavy-weight lock after relation extension lock. > > The usage by APIs ginvacuumcleanup, gistvacuumscan, btvacuumscan, and > spgvacuumscan falls in this category. > > c. There is a usage in API brin_page_cleanup() where we just acquire > and release the relation extension lock to avoid reinitializing the > page. As there is no call in-between acquire and release, so there is > no chance of another heavy-weight lock acquire after having relation > extension lock. > > d. In fsm_extend() and vm_extend(), after acquiring relation extension > lock, we perform various file-level operations like RelationOpenSmgr, > smgrexists, smgrcreate, smgrnblocks, smgrextend. First, from theory, > we don't have any heavy-weight lock other than relation extension lock > which can cover such operations and then I have verified it by going > through these APIs that these don't acquire any other heavy-weight > lock. Then these APIs also call PageSetChecksumInplace computes a > checksum of the page and sets the same in page header which is quite > straight-forward and doesn't acquire any heavy-weight lock. > > In vm_extend, we additionally call CacheInvalidateSmgr to send a > shared-inval message to force other backends to close any smgr > references they may have for the relation for which we extending > visibility map which has no reason to acquire any heavy-weight lock. > I have checked the code path as well and I didn't find any > heavy-weight lock call in that. > > e. In brin_getinsertbuffer, we call ReadBuffer() and LockBuffer(), the > usage of which is the same as what is mentioned in (a). In addition > to that it calls brin_initialize_empty_new_buffer() which further > calls RecordPageWithFreeSpace which can again acquire relation > extension lock for same relation. This usage is safe because we have > a mechanism in heavy-weight lock manager that if we already hold a > lock and a request came for the same lock and in same mode, the lock > will be granted. > > f. In RelationGetBufferForTuple(), there are multiple APIs that get > called and like (e), it can try to reacquire the relation extension > lock in one of those APIs. The main APIs it calls after acquiring > relation extension lock are described as follows: > - GetPageWithFreeSpace: This tries to find a page in the given > relation with at least the specified amount of free space. This > mainly checks the FSM pages and in one of the paths might call > fsm_extend which can again try to acquire the relation extension lock > on the same relation. > - RelationAddExtraBlocks: This adds multiple pages in a relation if > there is contention around relation extension lock. This calls > RelationExtensionLockWaiterCount which is mainly to check how many > lockers are waiting for the same lock, then call ReadBufferBI which as > explained above won't require heavy-weight locks and FSM APIs which > can acquire Relation extension lock on the same relation, but that is > safe as discussed previously. > > The Page locks can be acquired via LockPage and ConditionalLockPage. > This is acquired from one place in the code during Gin index cleanup > (ginInsertCleanup). The basic idea is that it will scan the pending > list and move entries into the main index. While moving entries to > the main page, it might need to add a new page that will require us to > take a relation extension lock. Now, unlike relation extension lock, > after acquiring page lock, we do acquire another heavy-weight lock > (relation extension lock), but as we never acquire it in reverse > order, this is safe. > > So, as per this analysis, we can add Asserts for relation extension > and page locks which will indicate that they won't participate in > deadlocks. It would be good if someone else can also do independent > analysis and verify my findings.
I have also analyzed the usage for the RelationExtensioLock and the PageLock. And, my findings are on the same lines. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com