Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA

Jesper Pedersen Mon, 11 Sep 2017 08:02:53 -0700

Hi,

On 09/08/2017 03:35 PM, Sokolov Yura wrote:

I'm seeing


-M prepared: Up to 11% improvement
-M prepared -S: No improvement, no regression ("noise")
-M prepared -N: Up to 12% improvement

for all runs the improvement shows up the closer you get to the number
of CPU threads, or above. Although I'm not seeing the same
improvements as you on very large client counts there are definitely
improvements :)


It is expected:
- patch "fixes NUMA": for example, it doesn't give improvement on 1 socket
   at all (I've tested it using numactl to bind to 1 socket)
- and certainly it gives less improvement on 2 sockets than on 4 sockets
   (and 28 cores vs 72 cores also gives difference),
- one of hot points were CLogControlLock, and it were fixed with
   "Group mode for CLOG updates" [1]


I'm planning to re-test that patch.


+static inline bool
+LWLockAttemptLockOrQueueSelf(LWLock *lock, LWLockMode mode,
LWLockMode waitmode)

I'll leave it to the Committer to decide if this method is too big to
be "inline".


GCC 4.9 doesn't want to inline it without directive, and function call
is then remarkable in profile.

Attach contains version with all suggestions applied except remove of
"inline".


Yes, ideally the method will be kept at "inline".

Open questions:
---------------
* spins_per_delay as extern
* Calculation of skip_wait_list


Currently calculation of skip_wait_list is mostly empirical (ie best
i measured).


Ok, good to know.

I strongly think that instead of spins_per_delay something dependent
on concrete lock should be used. I tried to store it in a LWLock
itself, but it were worse.

Yes, LWLock should be kept as small as possible, and cache line aligneddue to the cache storms, as shown by perf c2c.

Recently I understand it should be stored in array indexed by tranche,
but I didn't implement it yet, and therefore didn't measure.

Different constants for the LWLock could have an impact, but theconstants would also be dependent on machine setup, and work load.


Thanks for working on this !

Best regards,
 Jesper


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA

Reply via email to