On 09/10/2018 02:15 PM, Gustavo Romero wrote:
> Hi Severin,
>
> On 09/10/2018 06:27 AM, Severin Gehwolf wrote:
>> On Mon, 2018-09-10 at 10:05 +0100, Andrew Haley wrote:
>>> On 09/05/2018 02:12 PM, Severin Gehwolf wrote:
>>>> Is there a good
>>>> reason to not use -O3 -ffp-contract=off everywhere?
>>>
>>> Is there a good reason to use -O3 rather than -O2?
>>
>> Not sure. I was following what JDK-8170153 did, which was using
>> OPTIMIZATION := HIGH corresponding to -O3. cc'ing Gustavo. Gustavo,
>> would you know why HIGH was chosen over LOW?
>
> I don't remember exactly, but at least for ppc64 I discussed that a bit with
> the toolchain folks (also regarding the precision issue, etc) and they never
> said anything against using -O3. Unfortunately it was a long time ago so I
> don't remember exactly the numbers on ppc64 for -O2 to check if it was
> worse and so I selected -O3 instead.
>
>>> -O3 can bloat the
>>> code which can increase cache pressure, which is not always noticeable
>>> in benchmarks but hurts real-world programs. Unless benchmarks are
>>> significantly better at -O3, -O2 is a good default choice.
>>
>> OK, thanks! I'll re-test and change to LOW (-O2) if it gives similar
>> results.
>
> That's interesting. Andrew, do you mean bloat in the sense of final code size
> (for instance, due to unrolling), right?

Yes. With one of my other hats on: I'm also an occasional GCC maintainer,
and we've always had the problem that people assume that O3 > O2,
therefore O3 is better. It can be, but inlining can cause problems due
to code size and high register pressure, so it's good to check. Let's
see.

> BTW (I just remembered that), on RISC the lack of optimization hurts way more
> than the lack of optimization on CISC,

Mmm, yes. Inlining is cool if you have a ton of registers, and can
cause frantic spilling if you don't.

> so I recall that it puzzled me the fact that turning on the
> optimization on x86_64 did not change much the scenario, contrary to
> the conspicuous gains on ppc64 when turning on the optimization.
> It took me some time to understand that the optimization flag was
> the culprit (a much simpler case, luckily), because I tried first to
> profile and optimize the fdlibm code (after extracting it from JVM
> for detailed analysis) and only after getting to a dead end I turned
> to look at simpler causes.
>
> Are you checking the difference between -O2 and -O3 only on x86_64?

x86_64 has hand-carved code for a lot of this stuff, so it might not
be much affected.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671