On 2016-04-11 14:40:29 -0700, Andres Freund wrote:
> On 2016-04-11 12:17:20 -0700, Andres Freund wrote:
> I did get access to the machine (thanks!). My testing shows that
> performance is sensitive to various parameters influencing memory
> allocation. E.g. twiddling with max_connections changes
> performance. With max_connections=400 and the previous patches applied I
> get ~1220000 tps, with 402 ~1620000 tps. This sorta confirms that we're
> dealing with an alignment/sharing related issue.
> Padding PGXACT to a full cache-line seems to take care of the largest
> part of the performance irregularity. I looked at perf profiles and saw
> that most cache misses stem from there, and that the percentage (not
> absolute amount!) changes between fast/slow settings.
> To me it makes intuitive sense why you'd want PGXACTs to be on separate
> cachelines - they're constantly dirtied via SnapshotResetXmin(). Indeed
> making it immediately return propels performance up to 1720000, without
> other changes. Additionally cacheline-padding PGXACT speeds things up to
> 1750000 tps.
> But I'm unclear why the magnitude of the effect depends on other
> allocations. With the previously posted patches allPgXact is always
I've spent considerable amount experimenting around this. The alignment
of allPgXact does *not* apear to play a significant role; rather it
apears to be the the "distance" between the allPgXact and pgprocno
Alexander, could you post dmidecode output, and install numactl &
numastat on the machine? I wonder if the box has cluster-on-die
activated or not. Do I see correctly that this is a system that could
potentially have 8 sockets, but actually has only four? Because I see
physical id : 3 in /proc/cpuinfo only going up to three (from zero),
not 7? And there's only 144 processorcs, while each E7-8890 v3 should
have 36 threads.
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription: