Could a gatekeeper approve this patch?
This update improves the performance of compiled code on X8664 when the -O3
flag is used. The improvement comes mainly from relaxing the floating-point
accuracy setting at -O3: the default roundoff level changes from
ROUNDOFF_SIMPLE to ROUNDOFF_ASSOC (OPT:RO=2). This enables a wide range of
optimizations, including loop nest optimizations and associative redundancy
elimination. With this change, users who require the earlier floating-point
precision will need to pass -fp-accuracy=relaxed in addition to -O3. The patch
also sets OPT_Malloc_Alg to 1 at -O3 unless it has been set explicitly. During
subsequent tuning we found that the bad reference bias heuristic distorts the
computed cache costs and leads to an incorrect choice of inner loop, so it is
now skipped on X8664.
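
To illustrate why allowing reassociation changes floating-point results, here
is a minimal standalone sketch. It is not part of the patch; the function
names are hypothetical, and the second function only imitates by hand the kind
of reordering a compiler may perform once associative roundoff is permitted.

```cpp
// Sum strictly left to right, as the source is written (no reassociation).
float sum_in_order(const float* v, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
        s += v[i];
    return s;
}

// Sum with two partial accumulators. This hand-written form mimics the
// reordering a compiler may apply when reassociation is allowed, for
// example to unroll or vectorize a reduction. (Illustrative assumption,
// not the transformation Open64 necessarily performs.)
float sum_reassociated(const float* v, int n) {
    float s0 = 0.0f, s1 = 0.0f;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];      // even-indexed elements
        s1 += v[i + 1];  // odd-indexed elements
    }
    if (i < n)
        s0 += v[i];      // leftover element when n is odd
    return s0 + s1;
}
```

For the input {1e8f, 1.0f, -1e8f, 1.0f}, with plain IEEE single-precision
evaluation the in-order sum is 1.0f (the first 1.0f is absorbed by 1e8f),
while the reassociated sum is 2.0f. Both orderings are mathematically
equivalent; only the rounding differs.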
The following tests were run with this change:
1: No compile-time failures for the x86 build
2: SPEC CPU 2006 validated with the AMD flags and with the O3 flag
3: The gcc regression suite shows no new failures on x86/Linux
Best regards,
Ram
Ramshankar Ramanarayanan
Member of Technical Staff
Open Source Compiler Engineering
Advanced Micro Devices, Bangalore
Index: osprey/be/lno/cache_model.cxx
===================================================================
--- osprey/be/lno/cache_model.cxx (revision 3559)
+++ osprey/be/lno/cache_model.cxx (working copy)
@@ -6739,6 +6739,9 @@
iloop[s] = i;
}
+#ifndef TARG_X8664
+ // Ignoring bad reference bias heuristic to allow the right choice of inner loop
+ // Only do this for X8664
if (depth != required_inner && arl->Num_Bad()) {
INT nbodies = 1;
INT i;
@@ -6751,6 +6754,8 @@
}
*cycles_per_iter += bias / nbodies;
}
+#endif /* TARG_X8664 */
+
if (Debug_Cache_Model) {
fprintf(TFile, "*** END CACHE MODEL (REQUIRED INNER LOOP=%d, ",
required_inner);
Index: osprey/common/com/config.cxx
===================================================================
--- osprey/common/com/config.cxx (revision 3559)
+++ osprey/common/com/config.cxx (working copy)
@@ -1319,6 +1319,14 @@
Aggregate_Alignment = 16;
if ( !Vcast_Complex_Set && Opt_Level > 1 )
Vcast_Complex = TRUE;
+ if (Opt_Level > 2) {
+ //
+ // Enabling malloc_algorithm at O3
+ //
+ if (!OPT_Malloc_Alg_Set)
+ OPT_Malloc_Alg = 1;
+ }
+
#endif
}
@@ -1634,7 +1642,8 @@
#if defined(TARG_IA64) || defined(TARG_LOONGSON)
Roundoff_Level = ROUNDOFF_ASSOC;
#else
- Roundoff_Level = ROUNDOFF_SIMPLE;
+ // Enabling OPT:RO=2 at O3
+ Roundoff_Level = ROUNDOFF_ASSOC;
#endif
#endif
}
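
As a rough picture of why the cache_model.cxx hunk matters, the toy model
below shows how a per-loop bias term can flip which loop looks cheapest as the
inner loop. It is only a sketch under assumptions: LoopCost, choose_inner and
the numbers used with it are hypothetical and are not the LNO data structures;
only the "bias / nbodies" term is taken from the patched code.

```cpp
#include <vector>

// Hypothetical per-candidate estimate; not an Open64 type.
struct LoopCost {
    double cycles_per_iter;  // modeled cost with this loop innermost
    double bad_ref_bias;     // penalty attributed to "bad" references
    int nbodies;             // number of loop bodies the bias is spread over
};

// Return the index of the cheapest candidate inner loop, or -1 if empty.
// With apply_bias set, bias / nbodies is added to the cost, mirroring the
// "*cycles_per_iter += bias / nbodies" statement guarded by the patch.
int choose_inner(const std::vector<LoopCost>& loops, bool apply_bias) {
    int best = -1;
    double best_cost = 0.0;
    for (int i = 0; i < static_cast<int>(loops.size()); ++i) {
        double cost = loops[i].cycles_per_iter;
        if (apply_bias && loops[i].nbodies > 0)
            cost += loops[i].bad_ref_bias / loops[i].nbodies;
        if (best < 0 || cost < best_cost) {
            best = i;
            best_cost = cost;
        }
    }
    return best;
}
```

With two candidates {4.0, 3.0, 1} and {5.0, 0.0, 1}, the biased model picks
the second loop (7.0 versus 5.0), while the unbiased model picks the first
(4.0 versus 5.0). Skipping the bias on X8664 corresponds to the unbiased case.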
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel