Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

Robert Haas Fri, 12 Feb 2016 06:31:23 -0800

On Thu, Feb 11, 2016 at 3:26 PM, Robert Haas <[email protected]> wrote:
> On Thu, Feb 11, 2016 at 9:53 AM, Robert Haas <[email protected]> wrote:
>> The fact that InitLocks() doesn't do this has been discussed before
>> and there's no consensus on changing it.  It is, at any rate, a
>> separate issue.  I'll go through the rest of this patch again now.
>
> I did a little bit of cleanup on this patch and the results are
> attached.  I also did some benchmarking of this version.  I tested
> this on two systems, in each case using five-minute, read-only pgbench
> runs at scale factor 3000, varying the client count and taking the
> median of three runs.  First, I tested it on hydra, a 2-socket,
> 16-processor, 64-thread POWER box.  This is a community resource
> hosted at OSUOSL.  Second, I tested it on cthulhu, an 8-socket,
> 64-processor, 128-thread x86_64 box.  This is an EnterpriseDB
> resource.  Here are the results:
>
> hydra, master vs. patched
> 1 client: 8042.872109 vs. 7839.587491 (-2.5%)
> 64 clients: 213311.852043 vs. 214002.314071 (+0.3%)
> 96 clients: 219551.356907 vs. 221908.397489 (+1.1%)
> 128 clients: 210279.022760 vs. 217974.079171 (+3.7%)
>
> cthulhu, master vs. patched
> 1 client: 3407.705820 vs. 3645.129360 (+7.0%)
> 64 clients: 88667.681890 vs. 82636.914343 (-6.8%)
> 96 clients: 79303.750250 vs. 105442.993869 (+33.0%)
> 128 clients: 74684.510668 vs. 120984.938371 (+62.0%)
>
> Obviously, the high-client count results on cthulhu are stellar.  I'm
> *assuming* that the regressions are just random variation.  I am
> however wondering if it to set the freelist affinity based on
> something other than the hash value, like say the PID, so that the
> same process doesn't keep switching to a different freelist for every
> buffer eviction.  It also strikes me that it's probably quite likely
> that slock_t mutex[NUM_FREELISTS] is a poor way to lay out this data
> in memory.  For example, on a system where slock_t is just one byte,
> most likely all of those mutexes are going to be in the same cache
> line, which means you're going to get a LOT of false sharing.  It
> seems like it would be sensible to define a new struct that contains
> an slock_t, a long, and a HASHELEMENT *, and then make an array of
> those structs.  That wouldn't completely eliminate false sharing, but
> it would reduce it quite a bit.  My guess is that if you did that, you
> could reduce the number of freelists to 8 or less and get pretty much
> the same performance benefit that you're getting right now with 32.
> And that, too, seems likely to be good for single-client performance.


One other thing: I don't see what prevents the nentries count for a
particular freelist from going negative, which would subsequently
cause an Assertion failure.  I wonder if we should restructure the
code so that we keep track of the total number of elements allocated
and the number on each freelist, so that the count of elements =
elements allocation - sum(elements on each freelist).  This would
require a little more code churn but it seems that the result would be
more clearly correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

Reply via email to