> Good day, everyone.
> This patch improves the performance of contended LWLocks.
> It was tested on a 4-socket, 72-core x86 server (144 HT) running CentOS 7.1
> with gcc 4.8.5.
> The patch acquires the lock in a single CAS loop:
> 1. LWLock->state is read, and it is determined whether the lock can be
>   acquired. If the lock can be taken, a CAS is attempted.
>   If the CAS succeeds, the lock is acquired (same as the original version).
> 2. If the lock is currently held by another backend, we check whether the
>   wait list lock can be taken. If the wait list lock is not held by anyone, a
>   CAS is performed to take the wait list lock and set LW_FLAG_HAS_WAITERS at
>   once. If the CAS succeeds, then the LWLock was still held at the moment the
>   wait list lock was taken - this proves the correctness of the new algorithm.
>   The Proc is then queued on the wait list.
> 3. Otherwise spin_delay is performed, and the loop returns to step 1.

Interesting work.  Thanks for posting.
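
The three quoted steps can be sketched roughly as follows. This is only a
minimal illustration in C11 atomics, not the patch itself: it covers exclusive
mode only, and the bit layout, the SketchLock type and the sketch_* function
names are hypothetical stand-ins (SK_HAS_WAITERS plays the role of
LW_FLAG_HAS_WAITERS). The spin delay in step 3 is reduced to a plain reload.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout; the real LWLock state word is laid out differently. */
#define SK_LOCKED        ((uint32_t) 1u << 0)   /* lock held exclusively */
#define SK_WAITLIST_LOCK ((uint32_t) 1u << 1)   /* wait list lock held */
#define SK_HAS_WAITERS   ((uint32_t) 1u << 2)   /* stand-in for LW_FLAG_HAS_WAITERS */

typedef struct
{
    _Atomic uint32_t state;
} SketchLock;

typedef enum
{
    SK_ACQUIRED,    /* caller now holds the lock */
    SK_MUST_QUEUE   /* caller holds the wait list lock and must enqueue itself */
} SketchResult;

static inline SketchResult
sketch_acquire_or_lock_waitlist(SketchLock *lock)
{
    uint32_t old = atomic_load(&lock->state);

    for (;;)
    {
        if ((old & SK_LOCKED) == 0)
        {
            /* Step 1: the lock looks free, so try to take it. */
            if (atomic_compare_exchange_weak(&lock->state, &old,
                                             old | SK_LOCKED))
                return SK_ACQUIRED;
            /* On failure the CAS stored the current value in 'old'; retry. */
        }
        else if ((old & SK_WAITLIST_LOCK) == 0)
        {
            /*
             * Step 2: the lock is held but the wait list lock is free.
             * Take the wait list lock and set the waiters flag in one CAS.
             * Success means the lock was still held at that instant.
             */
            if (atomic_compare_exchange_weak(&lock->state, &old,
                                             old | SK_WAITLIST_LOCK | SK_HAS_WAITERS))
                return SK_MUST_QUEUE;
        }
        else
        {
            /* Step 3: both are busy; a real version would spin-delay here. */
            old = atomic_load(&lock->state);
        }
    }
}

/* Release the wait list lock after the caller has queued itself. */
static inline void
sketch_unlock_waitlist(SketchLock *lock)
{
    uint32_t prev = atomic_fetch_and(&lock->state, ~SK_WAITLIST_LOCK);

    assert(prev & SK_WAITLIST_LOCK);
    (void) prev;
}
```

A caller that gets SK_MUST_QUEUE would add itself to the wait list, call
sketch_unlock_waitlist(), and then sleep until woken.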

