On 2014-06-17 12:41:26 +0530, Amit Kapila wrote:
> On Fri, May 23, 2014 at 10:01 PM, Amit Kapila <[email protected]>
> wrote:
> > On Fri, Jan 31, 2014 at 3:24 PM, Andres Freund <[email protected]>
> wrote:
> > > I've pushed a rebased version of the patchset to
> > > http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git
> > > branch rwlock contention.
> > > 220b34331f77effdb46798ddd7cca0cffc1b2858 actually was the small problem,
> > > ea9df812d8502fff74e7bc37d61bdc7d66d77a7f was the major PITA.
> >
> > As per discussion in developer meeting, I wanted to test shared
> > buffer scaling patch with this branch. I am getting merge
> > conflicts as per HEAD. Could you please get it resolved, so that
> > I can get the data.
>
> I have started looking into this patch and have few questions/
> findings which are shared below:
>
> 1. I think stats for lwstats->ex_acquire_count will be counted twice,
> first it is incremented in LWLockAcquireCommon() and then in
> LWLockAttemptLock()
Hrmpf. Will fix.
> 2.
> Handling of potentialy_spurious case seems to be pending
> in LWLock functions like LWLockAcquireCommon().
>
> LWLockAcquireCommon()
> {
> ..
> /* add to the queue */
> LWLockQueueSelf(l, mode);
>
> /* we're now guaranteed to be woken up if necessary */
> mustwait = LWLockAttemptLock(l, mode, false, &potentially_spurious);
>
> }
>
> I think it can lead to some problems, example:
>
> Session -1
> 1. Acquire Exclusive LWlock
>
> Session -2
> 1. Acquire Shared LWlock
>
> 1a. Unconditionally incrementing shared count by session-2
>
> Session -1
> 2. Release Exclusive lock
>
> Session -3
> 1. Acquire Exclusive LWlock
> It will put itself to wait queue by seeing the lock count incremented
> by Session-2
>
> Session-2
> 1b. Decrement the shared count and add it to wait queue.
>
> Session-4
> 1. Acquire Exclusive lock
> This session will get the exclusive lock, because even
> though other lockers are waiting, lockcount is zero.
>
> Session-2
> 2. Try second time to take shared lock, it won't get
> as session-4 already has an exclusive lock, so it will
> start waiting
>
> Session-4
> 2. Release Exclusive lock
> it will not wake the waiters because waiters have been added
> before acquiring this lock.
I don't understand this step here? When releasing the lock it'll notice
that the waiters is <> 0 and acquire the spinlock which should protect
against badness here?
> 3.
> LWLockAcquireCommon()
> {
> for (;;)
> {
> PGSemaphoreLock(&proc->sem, false);
> if (!proc->lwWaiting)
> ..
> }
> proc->lwWaiting is checked, updated without spinklock where
> as previously it was done under spinlock, won't it be unsafe?
It was previously checked/unset without a spinlock as well:
/*
* Awaken any waiters I removed from the queue.
*/
while (head != NULL)
{
LOG_LWDEBUG("LWLockRelease", T_NAME(l), T_ID(l), "release
waiter");
proc = head;
head = proc->lwWaitLink;
proc->lwWaitLink = NULL;
proc->lwWaiting = false;
PGSemaphoreUnlock(&proc->sem);
}
I don't think there's dangers here, lwWaiting will only ever be
manipulated by the the PGPROC's owner. As discussed elsewhere there
needs to be a write barrier before the proc->lwWaiting = false, even in
upstream code.
> 4.
> LWLockAcquireCommon()
> {
> ..
> for (;;)
> {
> /* "false" means cannot accept cancel/die interrupt here. */
> PGSemaphoreLock(&proc->sem, false);
> if (!proc->lwWaiting)
> break;
> extraWaits++;
> }
> lock->releaseOK = true;
> }
>
> lock->releaseOK is updated/checked without spinklock where
> as previously it was done under spinlock, won't it be unsafe?
Hm. That's probably buggy. Good catch. Especially if you have a compiler
that does byte manipulation by reading e.g. 4 bytes from a struct and
then write the wider variable back... So the releaseOk bit needs to move
into LWLockDequeueSelf().
Thanks for looking!
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers