I understand what roundoff=2 means. Please go ahead.
sun

On Wed, Apr 27, 2011 at 8:47 PM, Ramanarayanan, Ramshankar
<ramshankar.ramanaraya...@amd.com> wrote:
> Sun,
>
> I think my choice of words contradict with the keywords used in the flags.
>
> The main change here is enabling -OPT:roundoff=2 at -O3 instead of
> -OPT:roundoff=1. This change enables fast-math functions, aggressive loop
> nest optimizations, reassociation on floating point expressions and more
> aggressive round-off settings. These aggressive floating point optimizations
> improve performance when using the -O3 flag but may affect floating point
> accuracy. The use of -fp-accuracy=relaxed in addition to -O3 is recommended
> for cases which need more floating point accuracy. -fp-accuracy=relaxed
> automatically sets -OPT:roundoff=1. User may also use -OPT:roundoff=1.
>
> Please let me know if you have questions.
>
> Ram
>
> -----Original Message-----
> From: Sun Chan [mailto:sun.c...@gmail.com]
> Sent: Wednesday, April 27, 2011 5:02 PM
> To: Ramanarayanan, Ramshankar
> Cc: open64-devel@lists.sourceforge.net
> Subject: Re: [Open64-devel] code review request for update to O3 flag
>
> you are really saying that to get back previous behavior, one needs to
> -fp_accuracy=not_relaxed or something to that effect? I don't follow
> your message.
> Sun
>
> On Wed, Apr 27, 2011 at 6:37 PM, Ramanarayanan, Ramshankar
> <ramshankar.ramanaraya...@amd.com> wrote:
>> Could a gate keeper approve this patch?
>>
>> This update enhances performance of the compiled code on X8664 when using
>> the O3 flag. Improvements come mainly from relaxing the floating point
>> accuracy setting at O3. This enables a wide range of optimizations including
>> loop nest optimizations and associative redundancy elimination
>> optimizations. Given this change, users will need to use
>> -fp-accuracy=relaxed flag in addition to -O3 if they require the earlier
>> floating point precision. During subsequent tuning we found that the bad
>> reference bias heuristic affects the computed cache costs and leads to
>> incorrect choice of inner loops and is thus ignored.
>>
>> Following tests have been conducted with this change.
>>
>> 1: No compiler time failure for x86 build
>>
>> 2: SPEC CPU 2006 validated with AMD flags and with O3 flag
>>
>> 3: The gcc regression suite has no new failures on x86/Linux
>>
>> Best regards,
>>
>> Ram
>>
>> Ramshankar Ramanarayanan
>> Member of Technical Staff
>> Open Source Compiler Engineering
>> Advanced Micro Devices, Bangalore
>>
>> ------------------------------------------------------------------------------
>> WhatsUp Gold - Download Free Network Management Software
>> The most intuitive, comprehensive, and cost-effective network
>> management toolset available today. Delivers lowest initial
>> acquisition cost and overall TCO of any competing solution.
>> http://p.sf.net/sfu/whatsupgold-sd
>> _______________________________________________
>> Open64-devel mailing list
>> Open64-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/open64-devel
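
Editorial illustration (not part of the original thread): the message above notes that reassociating floating point expressions may affect accuracy. The sketch below shows why, using plain IEEE double arithmetic in Python rather than anything compiled by Open64. The function names and the input values are hypothetical and chosen only to make the effect visible; the sketch does not model what -OPT:roundoff=2 actually does to any particular program.

```python
def sum_left_to_right(values):
    """Add the values strictly in source order: ((v0 + v1) + v2) + ..."""
    total = 0.0
    for v in values:
        total = total + v
    return total


def sum_reassociated(values):
    """Add the same values with a different grouping: v0 + (v1 + (v2 + ...)).

    This stands in for the kind of regrouping an optimizer is allowed to
    perform once reassociation of floating point expressions is enabled.
    """
    total = 0.0
    for v in reversed(values):
        total = v + total
    return total


# Hypothetical input: two large values that cancel, plus one small value.
values = [1e16, -1e16, 1.0]

# In source order the large values cancel first, so the 1.0 survives.
print(sum_left_to_right(values))

# Regrouped, the 1.0 is absorbed into -1e16 by rounding before the
# cancellation happens, so it is lost.
print(sum_reassociated(values))
```

The two functions add exactly the same numbers, yet they return different results because floating point addition is not associative. Code that depends on a particular evaluation order is the kind that might want to stay at -OPT:roundoff=1, as the message recommends.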