Re: [HACKERS] strict aliasing
Robert Haas wrote:
> Kevin Grittner wrote:

>> This suggests that in the long term, it might be worth [...]

> The other possibility is that the OS is smart enough about moving
> things around to get good locality that sticking locality hints on
> top doesn't really make any difference. Certainly, I would expect
> any OS to be smart enough to allocate backend-local memory on the
> same processor where the task is running, and to avoid moving
> processes between cells more than necessary.

Right. I'm not sure that it will make any more sense to do this than to do raw access to a disk partition. I don't think it's a given that we can do a better job of this than the OS does.

> Regarding results instability [...] My working theory is that this
> is the result of spinlock contention.

> So my theory is that now the performance goes down more or less
> "permanently", unless or until there's some momentary break in the
> action that lets the queue of people waiting for that spinlock
> drain out. This is just a wild-ass guess, and I might be totally
> wrong...

Well, I suspect that you're basing that guess on enough evidence that it's more likely to be right than the wild-assed guesses I've been throwing out there. :-) I can't say it's inconsistent with anything I've seen.

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] strict aliasing
On Wed, Nov 16, 2011 at 9:47 AM, Kevin Grittner wrote:
> This suggests that in the long term, it might be worth investigating
> whether we can arrange for a connection's process to have some
> degree of core affinity and encourage each process to allocate local
> memory from RAM controlled by that core. To some extent I would
> expect the process-based architecture of PostgreSQL to help with
> that, as you would expect a NUMA-aware OS to try to arrange that to
> some degree.

I've done some testing on HP/UX-Itanium and have not been able to demonstrate any significant performance benefit from overriding the operating system's default policies regarding processor affinity. For example, I hacked the code to request that the shared memory be allocated as cell-local memory, then used mpsched with the FILL_TREE policy to bind everything to a single cell, and sure enough it all ran in that cell, but it wasn't any better than 4 clients running on different cells with the shared memory segment allocated interleaved. This result didn't really make much sense to me, because it seemed like it SHOULD have helped. So it's possible I did something wrong. But if so, I couldn't find it.

The other possibility is that the OS is smart enough about moving things around to get good locality that sticking locality hints on top doesn't really make any difference. Certainly, I would expect any OS to be smart enough to allocate backend-local memory on the same processor where the task is running, and to avoid moving processes between cells more than necessary.

Regarding results instability, on some patch sets I've tried, I've seen very unstable performance. I've also noticed that a very short run sometimes gives much higher performance than a longer run. My working theory is that this is the result of spinlock contention.

Suppose you have a spinlock that is getting passed around very quickly between, say, 32 processes. Since the data protected by the spinlock is on the same cache line as the lock, what ideally happens is that the process gets the lock and then finishes its work and releases the lock before anyone else tries to pull the cache line away. And at the beginning of the run, that's what does actually happen. But then for some reason a process gets context-switched out while it holds the lock, or maybe it's just that somebody gets unlucky enough to have the cache line stolen before they can dump the spinlock and can't quite get it back fast enough. Now people start to pile up trying to get that spinlock, and that means trouble, because now it's much harder for any given process to get the cache line and finish its work before the cache line gets stolen away.

So my theory is that now the performance goes down more or less "permanently", unless or until there's some momentary break in the action that lets the queue of people waiting for that spinlock drain out. This is just a wild-ass guess, and I might be totally wrong...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
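To make the scenario above concrete, here is a minimal sketch of a spinlock that shares a cache line with the data it protects. It uses C11 `<stdatomic.h>` rather than PostgreSQL's own spinlock code, and the names `locked_counter`, `locked_counter_init` and `locked_counter_increment` are hypothetical, invented only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical layout: the lock and the data it protects are adjacent, so
 * the whole struct (16 bytes on common ABIs) normally sits in one cache line.
 */
typedef struct
{
    atomic_flag lock;       /* the spinlock itself */
    uint64_t    counter;    /* data protected by the lock */
} locked_counter;

void
locked_counter_init(locked_counter *lc)
{
    atomic_flag_clear(&lc->lock);
    lc->counter = 0;
}

void
locked_counter_increment(locked_counter *lc)
{
    /*
     * Spin until the flag was previously clear.  Each failed attempt is
     * still a write to the cache line, so waiters pull the line away from
     * the current holder while it is trying to finish its work.
     */
    while (atomic_flag_test_and_set_explicit(&lc->lock, memory_order_acquire))
        ;

    lc->counter++;          /* the "work" done while holding the lock */

    atomic_flag_clear_explicit(&lc->lock, memory_order_release);
}
```

With a single caller the loop never spins; the pile-up described above would only show once many processes or threads call `locked_counter_increment` on the same struct at once, which this sketch does not attempt to measure.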
Re: [HACKERS] strict aliasing
Ants Aasma wrote:
> Concurrency 8 results should probably be ignored - variance was
> huge, definitely more than the differences.

I'm not so sure it should be ignored -- one thing I noticed in looking at the raw numbers from my benchmarks was that the -O2 code was much more consistent from run to run than the -O3 code. I doubt that the more aggressive optimizations were developed under NUMA architecture, and I suspect that the aggressively optimized code may be more sensitive to whether memory is directly accessed by the core running the process or routed through the memory controller on another core. (I hit on this idea this morning when I remembered seeing similar variations in run times of STREAM against our new servers with NUMA.)

This suggests that in the long term, it might be worth investigating whether we can arrange for a connection's process to have some degree of core affinity and encourage each process to allocate local memory from RAM controlled by that core. To some extent I would expect the process-based architecture of PostgreSQL to help with that, as you would expect a NUMA-aware OS to try to arrange that to some degree.

-Kevin
Re: [HACKERS] strict aliasing
On Tue, Nov 15, 2011 at 9:02 PM, Kevin Grittner wrote:
> From my reading, it appears that if we get safe code in terms of
> strict aliasing, we might be able to use the "restrict" keyword to
> get further optimizations which bring it to a net win, but I think
> there is currently lower-hanging fruit than monkeying with these
> compiler options. I'm letting this go, although I still favor the
> const-ifying which started this discussion, on the grounds of API
> clarity.

Speaking of lower-hanging fruit... I ran a series of tests to see how different optimization flags affect performance. I was particularly interested in what effect link time optimization has. The results are somewhat interesting.

Benchmark machine is my laptop, Intel Core i5 M 540 @ 2.53GHz. 2 cores + hyperthreading for a total of 4 threads. Ubuntu 11.10. Compiled with GCC 4.6.1-9ubuntu3.

I ran the pgbench read-only test with scale factor 10, default options except for shared_buffers = 256MB. The dataset fits fully in shared buffers.

I tried the following configurations:

default: plain old ./configure; make; make install
-O3: what it says on the label
lto: CFLAGS="-O3 -flto" This should do some global optimizations at link time.
PGO: compiled with CFLAGS="-O3 -fprofile-generate", then ran pgbench -T 30 on a scalefactor 100 database (IO bound rw load to mix the profile up a bit). Then did
    # sed -i s/-fprofile-generate/-fprofile-use/ src/Makefile.global
  and recompiled and installed.
lto + PGO: same as previous, but with added -flto.
Median tps of three 5-minute runs at different concurrency levels:

-c     default        -O3        lto        PGO  lto + PGO
==========================================================
 1     6753.40    6689.76    6498.37    6614.73    5918.65
 2    11600.87   11659.33   12074.63   12957.81   13353.54
 4    18852.86   18918.32   19008.89   20006.49   20652.93
 8    15232.30   15762.70   14568.06   15880.19   16091.24
16    15693.93   15625.87   16563.91   17088.28   18223.02

Percentage increase from default flags:

-c     default        -O3        lto        PGO  lto + PGO
==========================================================
 1     6753.40     -0.94%     -3.78%     -2.05%    -12.36%
 2    11600.87      0.50%      4.08%     11.70%     15.11%
 4    18852.86      0.35%      0.83%      6.12%      9.55%
 8    15232.30      3.48%     -4.36%      4.25%      5.64%
16    15693.93     -0.43%      5.54%      8.88%     16.12%

Concurrency 8 results should probably be ignored - variance was huge, definitely more than the differences. For other results, variance was ~1%.

I don't know what to make of the single client results, why they seem to be going in the opposite direction of all other results. Other than that both profile guided optimization and link time optimization give a pretty respectable boost. If anyone can suggest some more diverse workloads to test, I could try to see if the PGO results persist when profiling and benchmark loads differ more. These results suggest that giving the compiler information about hot and cold paths results in a significant improvement in generated code quality.

--
Ants Aasma
Re: [HACKERS] strict aliasing
Florian Weimer wrote:
> * Andres Freund:
>
>> I don't think gcc will ever be able to catch all possible misuses.
>> E.g. the List API is a case where it's basically impossible to
>> catch everything (as gcc won't be able to figure out what the
>> ListCell.data.ptr_value pointed to originally in the general
>> case).
>
> Correct, if code is not strict-aliasing-safe and you compile with
> -fstrict-aliasing, GCC may silently produce wrong code. (Same
> with -fwrapv, by the way.)

I've spent a little time trying to get my head around what to look for in terms of unsafe code, but am not really there yet. Meanwhile, I've run a few more benchmarks of -fstrict-aliasing (without also changing to the -O3 switch) compared to a normal build. In no benchmark so far has strict aliasing by itself performed better on my i7, and in most cases it is slightly worse. (This means that some of the optimizations in -O3 probably *did* have a small net positive, since the benchmarks combining both showed a gain as long as there weren't more clients than cores, and the net loss on just strict aliasing would account for the decrease at higher client counts.)

From my reading, it appears that if we get safe code in terms of strict aliasing, we might be able to use the "restrict" keyword to get further optimizations which bring it to a net win, but I think there is currently lower-hanging fruit than monkeying with these compiler options. I'm letting this go, although I still favor the const-ifying which started this discussion, on the grounds of API clarity.

-Kevin
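For readers unfamiliar with the "restrict" keyword mentioned above, here is a minimal sketch of what it promises the compiler. The function names `scale_add` and `scale_add_restrict` are hypothetical and not taken from the PostgreSQL source; whether a given compiler actually generates faster code for the second version is something to measure rather than assume.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Without restrict, the compiler has to allow for dst and src overlapping,
 * so a store to dst[i] might change a later src[j].
 */
void
scale_add(double *dst, const double *src, double factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += factor * src[i];
}

/*
 * With restrict, the caller promises that dst and src do not overlap for
 * the duration of the call.  Breaking that promise is undefined behaviour.
 */
void
scale_add_restrict(double *restrict dst, const double *restrict src,
                   double factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += factor * src[i];
}
```

Both functions compute the same result when the arrays are distinct; the difference is only in what the optimizer is allowed to assume.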
Re: [HACKERS] strict aliasing
* Andres Freund:

> I don't think gcc will ever be able to catch all possible misuses. E.g. the
> List API is a case where it's basically impossible to catch everything
> (as gcc won't be able to figure out what the ListCell.data.ptr_value
> pointed to originally in the general case).

Correct, if code is not strict-aliasing-safe and you compile with -fstrict-aliasing, GCC may silently produce wrong code. (Same with -fwrapv, by the way.)

--
Florian Weimer
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99
Re: [HACKERS] strict aliasing (was: const correctness)
On Monday, November 14, 2011 10:22:52 PM Tom Lane wrote:
> "Kevin Grittner" writes:
> >> Tom Lane wrote:
> >>> Dunno ... where were the warnings exactly?
> >
> > From HEAD checkout of a few minutes ago I now see only 9:
>
> Hmm ... well, none of those look likely to be in performance-sensitive
> areas. But I wonder just how good the trouble-detection code is these
> days.

No idea about how good it is, but you can make the detection code more aggressive with -Wstrict-aliasing=1 (which will produce more false positives).

I don't think gcc will ever be able to catch all possible misuses. E.g. the List API is a case where it's basically impossible to catch everything (as gcc won't be able to figure out what the ListCell.data.ptr_value pointed to originally in the general case).

Andres
Re: [HACKERS] strict aliasing (was: const correctness)
On Monday, November 14, 2011 10:25:19 PM Alvaro Herrera wrote:
> Excerpts from Kevin Grittner's message of lun nov 14 17:30:50 -0300 2011:
> > Tom Lane wrote:
> > > "Kevin Grittner" writes:
> > >> Also, is there something I should do to deal with the warnings
> > >> before this would be considered a meaningful test?
> > >
> > > Dunno ... where were the warnings exactly?
> >
> > All 10 were like this:
> >   warning: dereferencing type-punned pointer will break
> >   strict-aliasing rules
>
> Uhm, shouldn't we expect there to be one warning for each use of a Node
> using some specific node pointer type as well as something generic such
> as inside a ListCell etc?

The case of Nodes being accessed as SomethingNode is legal to my knowledge, as the individual memory locations are accessed by variables of the same type. That follows from the rules "an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union)" and "a type compatible with the effective type of the object".

And the ListCell case is ok as well, unless there is a wrong cast in code using the ListCell somewhere. E.g. it's afaics safe to do something like:

    void do_something_int(int);

    int bla;
    void* foo = &bla;
    ...
    do_something_int(*(int*)foo);

but

    do_something_short(*(short*)foo);

is illegal.

The compiler obviously can't prove all misuse of the void* pointers in e.g. ListCells though...

Andres
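The two legal cases described above can be sketched as follows. This is a simplified illustration only: `NodeTag`, `IntBox`, `Cell`, `tag_of` and `cell_as_int` are hypothetical stand-ins, not the real PostgreSQL definitions, and the tag is read through a pointer to the first member's own type, which sidesteps any debate about reading it through a different struct type.

```c
#include <assert.h>

/* Hypothetical tag and node types, loosely modelled on the Node idea. */
typedef enum { T_Invalid = 0, T_IntBox } NodeTag;

typedef struct IntBox
{
    NodeTag type;           /* the tag must be the first member */
    int     value;
} IntBox;

/* Hypothetical generic cell holding an untyped pointer, like a list cell. */
typedef struct Cell
{
    void   *ptr_value;
} Cell;

/*
 * Legal: a pointer to a struct, suitably converted, points to its first
 * member, so the tag is read through an lvalue of its own type.
 */
NodeTag
tag_of(const void *node)
{
    return *(const NodeTag *) node;
}

/*
 * Legal only if the cell really holds the address of an int.  The compiler
 * cannot check that, which is the limitation discussed above.
 */
int
cell_as_int(const Cell *cell)
{
    return *(const int *) cell->ptr_value;
}
```

Storing `&bla` (an `int`) in a `Cell` and reading it back with `cell_as_int` corresponds to the safe `do_something_int(*(int*)foo)` case; reading the same cell through a `short *` would be the illegal one.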
Re: [HACKERS] strict aliasing (was: const correctness)
On Mon, Nov 14, 2011 at 06:25:19PM -0300, Alvaro Herrera wrote:
> > All 10 were like this:
> >
> >   warning: dereferencing type-punned pointer will break
> >   strict-aliasing rules
>
> Uhm, shouldn't we expect there to be one warning for each use of a Node
> using some specific node pointer type as well as something generic such
> as inside a ListCell etc?

Maybe they're safe? But in any case, given the use of Node, it may be an idea to mark it with __attribute__((__may_alias__)); that should clear up most of the problems in that area.

http://ohse.de/uwe/articles/gcc-attributes.html#type-may_alias

Have a nice day,

--
Martijn van Oosterhout http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer
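As a small illustration of what the may_alias attribute does, here is a sketch applied to an integer typedef rather than to Node itself. The attribute is a GCC extension (Clang accepts it too); the names `aliasing_u32`, `float_bits` and `float_bits_memcpy` are hypothetical, and the code assumes a 32-bit `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* A uint32_t that is allowed to alias objects of any type (GCC extension). */
typedef uint32_t __attribute__((__may_alias__)) aliasing_u32;

/*
 * Reads the bytes of a float through the may_alias type.  With a plain
 * uint32_t pointer this would break the strict-aliasing rules.
 */
uint32_t
float_bits(const float *f)
{
    return *(const aliasing_u32 *) f;
}

/* Portable alternative that needs no extension: copy the bytes. */
uint32_t
float_bits_memcpy(const float *f)
{
    uint32_t bits;

    memcpy(&bits, f, sizeof(bits));
    return bits;
}
```

Both functions should return the same value for any input; the memcpy version is the one to prefer where the extension is not available.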
Re: [HACKERS] strict aliasing (was: const correctness)
Excerpts from Kevin Grittner's message of lun nov 14 17:30:50 -0300 2011:
> Tom Lane wrote:
> > "Kevin Grittner" writes:
> >> Also, is there something I should do to deal with the warnings
> >> before this would be considered a meaningful test?
> >
> > Dunno ... where were the warnings exactly?
>
> All 10 were like this:
>
>   warning: dereferencing type-punned pointer will break
>   strict-aliasing rules

Uhm, shouldn't we expect there to be one warning for each use of a Node using some specific node pointer type as well as something generic such as inside a ListCell etc?

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] strict aliasing (was: const correctness)
"Kevin Grittner" writes:
>> Tom Lane wrote:
>>> Dunno ... where were the warnings exactly?

> From HEAD checkout of a few minutes ago I now see only 9:

Hmm ... well, none of those look likely to be in performance-sensitive areas. But I wonder just how good the trouble-detection code is these days.

regards, tom lane
Re: [HACKERS] strict aliasing (was: const correctness)
"Kevin Grittner" wrote:
> Tom Lane wrote:
>> "Kevin Grittner" writes:
>>> Also, is there something I should do to deal with the warnings
>>> before this would be considered a meaningful test?
>>
>> Dunno ... where were the warnings exactly?
>
> All 10 were like this:
>
>   warning: dereferencing type-punned pointer will break
>   strict-aliasing rules

From HEAD checkout of a few minutes ago I now see only 9:

parse_type.c: In function *typenameTypeMod*:
parse_type.c:313:4
parse_type.c:318:4
parse_type.c:319:7
guc.c: In function *flatten_set_variable_args*:
guc.c:6036:3
guc.c:6087:7
plpython.c: In function *PLy_plan_status*:
plpython.c:3213:3
btree_utils_var.c: In function *gbt_var_node_truncate*:
btree_utils_var.c:213:2
trgm_gist.c: In function *gtrgm_consistent*:
trgm_gist.c:262:5
trgm_gist.c:262:5

-Kevin
Re: [HACKERS] strict aliasing (was: const correctness)
Tom Lane wrote:
> Dunno ... where were the warnings exactly?

Ah, you asked "where", not "what". I don't think I saved that, and I had to reboot for a new kernel, so I don't have the buffer sitting around. I'll do a new build and let you know shortly.

-Kevin
Re: [HACKERS] strict aliasing (was: const correctness)
Tom Lane wrote:
> "Kevin Grittner" writes:
>> The results were interesting. While the small overlap between
>> samples from the two builds at most levels means that this was
>> somewhat unlikely to be just sampling noise, there could have
>> been alignment issues that account for some of the differences.
>> In short, the strict aliasing build always beat the other with 4
>> clients or fewer (on this 4 core machine), but always lost with
>> more than 4 clients.
>
> That is *weird*.

Yeah, my only theories are that it was an unlucky set of samples (which seems a little thin looking at the numbers) or that some of the optimizations in -O3 are about improving pipelining at what would otherwise be an increase in cycles, but that context switching breaks up the pipelining enough that it's a net loss at high concurrency. That doesn't seem quite as thin as the other explanation, but it's not very satisfying without some sort of confirmation.

>> Also, is there something I should do to deal with the warnings
>> before this would be considered a meaningful test?
>
> Dunno ... where were the warnings exactly?

All 10 were like this:

  warning: dereferencing type-punned pointer will break strict-aliasing rules

The warning is about reading a union using a different type than was last stored there. It seems like there might sometimes be legitimate reasons to do that, and that if it was broken with strict aliasing it might be broken without. But strict aliasing is new territory for me.

> Also, did you run the regression tests (particularly the parallel
> version) against the build?

Yes. The normal parallel `make check-world`, the `make installcheck-world` against an install with default_transaction_isolation = 'serializable' and max_prepared_transactions = 10, and `make -C src/test/isolation installcheck`. All ran without problem.

I'm inclined to try -O3 and -fstrict-aliasing separately, with more iterations; but I want to fix anything that's wrong with the aliasing first.

-Kevin
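For context on that warning, here is a hedged sketch of the kind of code that typically triggers it and of one alternative. GCC usually emits the warning for a dereferenced pointer cast, shown below only inside a comment; reading a different union member than the one last stored is something the GCC manual documents as supported even with -fstrict-aliasing. The type `double_bits` and the function `bits_of_double` are hypothetical, and the code assumes a 64-bit `double`.

```c
#include <assert.h>
#include <stdint.h>

_Static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/*
 * The pattern GCC typically warns about looks like this.  It is shown only
 * as a comment because it breaks the strict-aliasing rules:
 *
 *     double   d = 1.0;
 *     uint64_t u = *(uint64_t *) &d;
 */

/* Hypothetical helper type: both members share the same storage. */
typedef union
{
    double      d;
    uint64_t    u;
} double_bits;

/*
 * Store one member, read the other.  The access goes through the union
 * type itself, which is the form GCC documents as allowed.
 */
uint64_t
bits_of_double(double d)
{
    double_bits b;

    b.d = d;
    return b.u;
}
```

On a platform with IEEE 754 doubles, `bits_of_double(1.0)` yields 0x3FF0000000000000.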
Re: [HACKERS] strict aliasing (was: const correctness)
"Kevin Grittner" writes:
> The results were interesting. While the small overlap between
> samples from the two builds at most levels means that this was
> somewhat unlikely to be just sampling noise, there could have been
> alignment issues that account for some of the differences. In
> short, the strict aliasing build always beat the other with 4
> clients or fewer (on this 4 core machine), but always lost with more
> than 4 clients.

That is *weird*.

> Also, is there something I should do to deal with the warnings
> before this would be considered a meaningful test?

Dunno ... where were the warnings exactly?

Also, did you run the regression tests (particularly the parallel version) against the build?

regards, tom lane
Re: [HACKERS] strict aliasing (was: const correctness)
Florian Pflug wrote:
> If we're concerned about helping the compiler produce better code,
> I think we should try to make our code safe under strict aliasing
> rules. AFAIK, that generally helps much more than
> const-correctness. (Dunno how feasible that is, though)

To get a preliminary feel for how much this might help, I set my workstation with an i7-2600 and 16GB RAM to run Robert Haas's pgbench concurrency tests against PostgreSQL built with (default) -O2 and no strict aliasing versus -O3 and strict aliasing. I ignored the ten warnings about punning under strict aliasing. Both builds were with asserts disabled. No other changes from Friday's HEAD. All runs were at the REPEATABLE READ isolation level. I scheduled it for a window of time where the box wasn't running any scheduled maintenance.

The results were interesting. While the small overlap between samples from the two builds at most levels means that this was somewhat unlikely to be just sampling noise, there could have been alignment issues that account for some of the differences. In short, the strict aliasing build always beat the other with 4 clients or fewer (on this 4 core machine), but always lost with more than 4 clients.

 1 client:   +0.8%
 2 clients:  +2.0%
 4 clients:  +3.2%
 8 clients:  -0.9%
16 clients:  -0.5%
32 clients:  -0.9%

I wouldn't want to make too much out of this without repeating the tests and trying different hardware, but I'm wondering whether the abrupt difference at the number of cores makes sense to anybody. Also, is there something I should do to deal with the warnings before this would be considered a meaningful test?

Raw numbers:

no-strict-aliasing.1 tps = 7140.253910
no-strict-aliasing.1 tps = 7291.465297
no-strict-aliasing.1 tps = 7219.054359
no-strict-aliasing.2 tps = 16592.613779
no-strict-aliasing.2 tps = 15418.602945
no-strict-aliasing.2 tps = 16826.200551
no-strict-aliasing.4 tps = 48145.69
no-strict-aliasing.4 tps = 47141.611960
no-strict-aliasing.4 tps = 47263.175254
no-strict-aliasing.8 tps = 93466.397174
no-strict-aliasing.8 tps = 93757.111493
no-strict-aliasing.8 tps = 93422.349453
no-strict-aliasing.16 tps = 88758.623319
no-strict-aliasing.16 tps = 88976.546555
no-strict-aliasing.16 tps = 88521.025343
no-strict-aliasing.32 tps = 87799.019143
no-strict-aliasing.32 tps = 88006.881881
no-strict-aliasing.32 tps = 88295.826711
strict-aliasing.1 tps = 7067.461710
strict-aliasing.1 tps = 7415.244823
strict-aliasing.1 tps = 7277.643321
strict-aliasing.2 tps = 14576.820162
strict-aliasing.2 tps = 16928.746994
strict-aliasing.2 tps = 19958.285834
strict-aliasing.4 tps = 48780.830247
strict-aliasing.4 tps = 49067.751657
strict-aliasing.4 tps = 48303.413578
strict-aliasing.8 tps = 93155.601896
strict-aliasing.8 tps = 92279.973490
strict-aliasing.8 tps = 92629.332125
strict-aliasing.16 tps = 88328.799197
strict-aliasing.16 tps = 88283.503270
strict-aliasing.16 tps = 88463.673815
strict-aliasing.32 tps = 87148.701204
strict-aliasing.32 tps = 87398.233624
strict-aliasing.32 tps = 87201.021722

-Kevin
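The percentages in this message appear to be consistent with comparing the median of the three runs at each client count, although the message does not say so explicitly. A minimal sketch of that calculation, with hypothetical helper names `median3` and `percent_change`:

```c
#include <assert.h>

/* Median of three samples: the value that lies between the other two. */
double
median3(double a, double b, double c)
{
    if ((a <= b && b <= c) || (c <= b && b <= a))
        return b;
    if ((b <= a && a <= c) || (c <= a && a <= b))
        return a;
    return c;
}

/* Relative change of candidate against baseline, in percent. */
double
percent_change(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}
```

For the 4-client runs, the medians are 47263.175254 (no strict aliasing) and 48780.830247 (strict aliasing), and `percent_change` on those two values gives roughly +3.2%, in line with the figure quoted in the message.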