Could a gatekeeper approve this patch?

This update improves the performance of compiled code on X8664 at -O3. The 
improvement comes mainly from relaxing the floating-point accuracy setting at 
-O3 (the patch enables OPT:RO=2, i.e. ROUNDOFF_ASSOC, at that level). This 
enables a wide range of optimizations, including loop nest optimizations and 
associative redundancy elimination. With this change, users who require the 
earlier floating-point precision will need to pass -fp-accuracy=relaxed in 
addition to -O3. The patch also enables the malloc algorithm at -O3 unless the 
user has set it explicitly. During subsequent tuning we found that the bad 
reference bias heuristic distorts the computed cache costs and leads to an 
incorrect choice of inner loop, so the patch disables it on X8664.
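
As a rough illustration of why relaxing the floating-point accuracy setting 
can change results, the sketch below evaluates the same three-term sum in two 
association orders. It is not part of the patch: the function names are 
hypothetical, and the second order is just one reordering a compiler could 
choose once reassociation is allowed.

```cpp
#include <cassert>

// Evaluates (a + b) + c, the order written in the source.
float sum_as_written(float a, float b, float c) { return (a + b) + c; }

// Evaluates a + (b + c): one order a compiler could choose once
// reassociation is permitted (hypothetical, for illustration only).
float sum_reassociated(float a, float b, float c) { return a + (b + c); }

// Self-check: the two orders disagree for these inputs.
void check_orders_differ() {
    const float big = 1.0e20f;
    // big and -big cancel first, so the 1.0f survives.
    assert(sum_as_written(big, -big, 1.0f) == 1.0f);
    // 1.0f is absorbed into -big before the cancellation happens.
    assert(sum_reassociated(big, -big, 1.0f) == 0.0f);
}
```

Both functions are mathematically the same sum, but with these inputs they 
return different values. Code that depends on the as-written order is the kind 
that would need the stricter setting.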

The following tests were run with this change:

1: No compile-time failures for the x86 build
2: SPEC CPU 2006 validated with the AMD flags and with the -O3 flag
3: No new failures in the gcc regression suite on x86/Linux



Best regards,
Ram

Ramshankar Ramanarayanan
Member of Technical Staff
Open Source Compiler Engineering
Advanced Micro Devices, Bangalore


Index: osprey/be/lno/cache_model.cxx
===================================================================
--- osprey/be/lno/cache_model.cxx       (revision 3559)
+++ osprey/be/lno/cache_model.cxx       (working copy)
@@ -6739,6 +6739,9 @@
     iloop[s] = i;
   }
 
+#ifndef TARG_X8664
+  // Ignoring bad reference bias heuristic to allow the right choice of inner loop
+  // Only do this for X8664
   if (depth != required_inner && arl->Num_Bad()) {
     INT nbodies = 1;
     INT i;
@@ -6751,6 +6754,8 @@
     } 
     *cycles_per_iter += bias / nbodies;
   }
+#endif /* TARG_X8664 */
+
   if (Debug_Cache_Model) {
     fprintf(TFile, "*** END CACHE MODEL (REQUIRED INNER LOOP=%d, ", 
       required_inner); 
Index: osprey/common/com/config.cxx
===================================================================
--- osprey/common/com/config.cxx        (revision 3559)
+++ osprey/common/com/config.cxx        (working copy)
@@ -1319,6 +1319,14 @@
     Aggregate_Alignment = 16;
   if ( !Vcast_Complex_Set && Opt_Level > 1 )
     Vcast_Complex = TRUE;
+  if (Opt_Level > 2) {
+    // 
+    // Enabling malloc_algorithm at O3  
+    // 
+    if (!OPT_Malloc_Alg_Set)
+        OPT_Malloc_Alg = 1;
+  }
+
 #endif
 }
 
@@ -1634,7 +1642,8 @@
 #if defined(TARG_IA64) || defined(TARG_LOONGSON)
     Roundoff_Level = ROUNDOFF_ASSOC;
 #else
-    Roundoff_Level = ROUNDOFF_SIMPLE;
+    // Enabling OPT:RO=2 at O3
+    Roundoff_Level = ROUNDOFF_ASSOC;
 #endif
 #endif
   }
_______________________________________________
Open64-devel mailing list
Open64-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/open64-devel
