Re: [HACKERS] Wait free LW_SHARED acquisition - v0.9

Andres Freund Wed, 08 Oct 2014 13:00:09 -0700

On 2014-10-08 15:23:22 -0400, Robert Haas wrote:
> On Wed, Oct 8, 2014 at 9:35 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> > 1) Convert PGPROC->lwWaitLink into a dlist. The old code was frail and
> >    verbose. This also does:
> >     * changes the logic in LWLockRelease() to release all shared lockers
> >       when waking up any. This can yield some significant performance
> >       improvements - and the fairness isn't really much worse than
> >       before,
> >       as we always allowed new shared lockers to jump the queue.
> >
> >     * adds a memory pg_write_barrier() in the wakeup paths between
> >       dequeuing and unsetting ->lwWaiting. That was always required on
> >       weakly ordered machines, but f4077cda2 made it more urgent. I can
> >       reproduce crashes without it.
>
> I think it's a really bad idea to mix a refactoring change (like
> converting PGPROC->lwWaitLink into a dlist) with an attempted
> performance enhancement (like changing the rules for jumping the lock
> queue) and a bug fix (like adding pg_write_barrier where needed).  I'd
> suggest that the last of those be done first, and perhaps
> back-patched.


I think it makes sense to separate out the write barrier one. I don't
really see the point of separating the other two changes.

I've indeed previously started a thread
(http://archives.postgresql.org/message-id/20140210134625.GA15246%40awork2.anarazel.de)
about the barrier issue. IIRC you argued that that might be to
expensive.

> The current coding, using a hand-rolled list, touches shared memory
> fewer times.  When many waiters are awoken at once, we clip them all
> out of the list at one go.  Your revision moves them to a
> backend-private list one at a time, and then pops them off one at a
> time.  The backend-private memory accesses don't seem like they matter
> much, but the shared memory accesses would be nice to avoid.

I can't imagine this to matter.  We're entering the kernel for each PROC
for the PGSemaphoreUnlock() and we're dirtying the cacheline for
proc->lwWaiting = false anyway. This really is the slow path.

> Does LWLockUpdateVar's wake-up loop need a write barrier per
> iteration, or just one before the loop starts?  How about commenting
> the pg_write_barrier() with the read-fence to which it pairs?

Hm. Are you picking out LWLockUpdateVar for a reason or just as an
example? Because I don't see a difference between the different wakeup
loops?
It needs to be a barrier per iteration.

Currently the loop looks like
        while (head != NULL)
        {
                proc = head;
                head = proc->lwWaitLink;
                proc->lwWaitLink = NULL;
                proc->lwWaiting = false;
                PGSemaphoreUnlock(&proc->sem);
        }

Consider what happens if either the compiler or the cpu reorders this
to:
                proc->lwWaiting = false;
                head = proc->lwWaitLink;
                proc->lwWaitLink = NULL;
                PGSemaphoreUnlock(&proc->sem);

as soon as lwWaiting = false, 'proc' can wake up and acquire a new
lock. Backends can wake up prematurely because proc->sem is used for
other purposes than this (that's why the loops around PGSemaphoreLock
exist). Then it could reset lwWaitLink while acquiring a new lock. And
some processes wouldn't be woken up anymore.

The barrier it pairs with is the spinlock acquiration before
requeuing. To be more obviously correct we could add a read barrier
before
                                        if (!proc->lwWaiting)
                                                break;
but I don't think it's needed.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.9

Reply via email to