On 2016-04-11 12:17:20 -0700, Andres Freund wrote:
> On 2016-04-11 22:08:15 +0300, Alexander Korotkov wrote:
> > On Mon, Apr 11, 2016 at 5:04 PM, Alexander Korotkov <
> > a.korot...@postgrespro.ru> wrote:
> >
> > > On Mon, Apr 11, 2016 at 8:10 AM, Andres Freund <and...@anarazel.de> wrote:
> > >
> > >> Could you retry after applying the attached series of patches?
> > >>
> > >
> > > Yes, I will try with these patches and snapshot too old reverted.
> > >
> >
> > I've run the same benchmark with 279d86af and 848ef42b reverted. I've
> > tested of all 3 patches from you applied and, for comparison, 3 patches +
> > clog buffers reverted back to 32.
> >
> > clients    patches  patches + clog_32
> >       1      12594              12556
> >       2      26705              26258
> >       4      50985              53254
> >       8     103234             104416
> >      10     135321             130893
> >      20     268675             267648
> >      30     370437             409710
> >      40     486512             482382
> >      50     539910             525667
> >      60     616401             672230
> >      70     667864             660853
> >      80     924606             737768
> >      90    1217435             799581
> >     100    1326054             863066
> >     110    1446380             980206
> >     120    1484920            1000963
> >     130    1512440            1058852
> >     140    1536181            1088958
> >     150    1504750            1134354
> >     160    1461513            1132173
> >     170    1453943            1158656
> >     180    1426288            1120511
> Any chance that I could run some tests on that machine myself? It's very
> hard to investigate that kind of issue without access; the only thing I
> otherwise can do is lob patches at you, till we find the relevant
> memory.

I did get access to the machine (thanks!). My testing shows that
performance is sensitive to various parameters influencing memory
allocation. E.g. twiddling with max_connections changes
performance. With max_connections=400 and the previous patches applied I
get ~1220000 tps, with 402 ~1620000 tps. This sorta confirms that we're
dealing with an alignment/sharing related issue.

Padding PGXACT to a full cache-line seems to take care of the largest
part of the performance irregularity. I looked at perf profiles and saw
that most cache misses stem from there, and that the percentage (not
absolute amount!) changes between fast/slow settings.

To me it makes intuitive sense why you'd want PGXACTs to be on separate
cachelines - they're constantly dirtied via SnapshotResetXmin(). Indeed,
making that function return immediately propels performance up to
1720000 tps, without other changes. Additionally cacheline-padding
PGXACT speeds things up to 1750000 tps.

But I'm unclear why the magnitude of the effect depends on other
allocations. With the previously posted patches allPgXact is always
cacheline-aligned.

Greetings,

Andres Freund
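
For illustration only (this is not taken from the posted patches): a
minimal C sketch of the cacheline-padding idea discussed above. The
struct DemoXact, the union DemoXactPadded, the helper functions and the
64-byte line size are all assumptions made up for this example; the real
PGXACT layout and the actual padding approach may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is common on x86-64 but not universal. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for a small, frequently written per-backend struct. */
typedef struct DemoXact
{
	uint32_t	xid;
	uint32_t	xmin;
	uint8_t		flags;
} DemoXact;

/* Padded variant: the union rounds every array element up to one full line. */
typedef union DemoXactPadded
{
	DemoXact	data;
	char		pad[CACHE_LINE_SIZE];
} DemoXactPadded;

static_assert(sizeof(DemoXactPadded) == CACHE_LINE_SIZE,
			  "padded element must occupy exactly one cache line");

/*
 * Cache line index on which element 'index' starts, for an array whose base
 * address is cacheline-aligned. Only the start offset is considered, so an
 * unpadded element that straddles two lines is attributed to the first one.
 */
static size_t
cache_line_of(size_t index, size_t elem_size)
{
	return (index * elem_size) / CACHE_LINE_SIZE;
}

/* Returns 1 if elements i and j start on the same cache line, else 0. */
static int
shares_line(size_t i, size_t j, size_t elem_size)
{
	return cache_line_of(i, elem_size) == cache_line_of(j, elem_size);
}
```

With the unpadded struct, neighbouring array elements land on the same
line, so a write by one backend invalidates that line for its
neighbours; with the padded union each element gets a line to itself, at
the cost of a larger array.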