On 30/07/2025 18:13, Thomas Schwinge wrote:
Hi Andrew!

On 2025-07-25T14:44:06+0000, Andrew Stubbs <a...@baylibre.com> wrote:
The optimization options are deliberately passed through to the LTO compiler,
but when the same mechanism is reused for offloading it ends up forcing the
host compiler settings onto the device compiler.

Yeah, that's a problem: for performance-tuning-related 'Param's, as
discussed here, but also stuff like <https://gcc.gnu.org/PR114717>
"'-fcf-protection' vs. offloading compilation", for example.

Maybe this should be removed completely

What is "this" that you suggest to remove, is it just 'Param's?  (That
is, classifying them the same as 'Target's ('-m[...]') -- which cannot
even be passed through, as they usually don't apply to the heterogeneous
offloading back ends.)  On the other hand, we want to continue passing
through more general flags, like 'Optimization's: '-O', and most (I
suppose, but not all...) '-f[...]' ones?

but this patch just fixes a few of them.

"This" was a general hand-wave at letting the LTO option passing affect offloading at all. We certainly don't want users to have to specify -foffload-options=-O2 explicitly every time, but almost everything else is questionable.

In particular, param_vect_partial_vector_usage is disabled by x86 and this really hurts amdgcn.

        * optc-save-gen.awk: Don't pass through options marked "NoOffload".

Please document 'NoOffload' in 'gcc/doc/options.texi'.

OK, I'll post something today.
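
For readers following along, the intended effect of the 'NoOffload' tag can be modelled roughly as below. This is only a toy sketch in Python, not GCC code: the dictionaries, the function name 'options_for_offload', and the host value 0 for the partial-vector param are illustrative assumptions. The one behaviour taken from the patch is that an option tagged 'NoOffload' keeps the device compiler's own value instead of the host's.

```python
# Toy model of host-to-device option passing; NOT GCC code.
# All names and values here are illustrative assumptions.

HOST_OPTIONS = {
    "optimize": 2,                         # -O2: should reach the device compiler
    "param_vect_partial_vector_usage": 0,  # assumed host back end setting
}

DEVICE_DEFAULTS = {
    "optimize": 0,
    "param_vect_partial_vector_usage": 2,  # Init(2) in params.opt
}

# Options tagged NoOffload are never copied from host to device.
NO_OFFLOAD = {"param_vect_partial_vector_usage"}


def options_for_offload(host, device_defaults, no_offload):
    """Return the settings the device compiler ends up with."""
    result = dict(device_defaults)
    for name, value in host.items():
        if name in no_offload:
            continue  # keep the device compiler's own value
        result[name] = value
    return result


if __name__ == "__main__":
    print(options_for_offload(HOST_OPTIONS, DEVICE_DEFAULTS, NO_OFFLOAD))
```

In this model '-O2' still reaches the device compiler, while the vectorizer param stays at the device's own setting.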

Do we want '--help=params' to indicate which 'Param's have been tagged
'NoOffload'?

        * params.opt (-param=vect-epilogues-nomask): Add NoOffload.
        (-param=vect-partial-vector-usage): Likewise.
        (-param=vect-inner-loop-cost-factor): Likewise.

Do we want to document these three in 'gcc/doc/invoke.texi' as
"not passed through to offload compilation", or similar?

I feel like neither of these would improve the readability of the documentation for the vast majority of readers. Also, it's really not clear to me that end users ever *expected* these options to be passed through.

The documentation should probably explain somewhere what users should expect. Can we auto-generate the list of options that are exceptions to the rule?
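
As a rough idea of what auto-generating that list could look like, here is a minimal standalone sketch in Python. It is not part of the patch and not how the GCC build does it (the real machinery is the awk scripts). It assumes the usual '.opt' record layout: records separated by blank lines, option name on the first line, properties on the second. The function name 'no_offload_options' and the abridged 'SAMPLE' text are made up for illustration.

```python
import re

# Abridged sample in the style of gcc/params.opt, with the patch applied.
SAMPLE = """\
-param=use-canonical-types=
Common Joined UInteger Var(param_use_canonical_types) Init(1) IntegerRange(0, 1)
Whether to use canonical types.

-param=vect-epilogues-nomask=
Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1) IntegerRange(0, 1) Param Optimization NoOffload
Enable loop epilogue vectorization using smaller vector size.

-param=vect-partial-vector-usage=
Common Joined UInteger Var(param_vect_partial_vector_usage) Init(2) IntegerRange(0, 2) Param Optimization NoOffload
Controls how loop vectorizer uses partial vectors.
"""


def no_offload_options(opt_text):
    """Return the names of option records whose properties include NoOffload."""
    names = []
    # Records are separated by one or more blank lines.
    for record in re.split(r"\n\s*\n", opt_text.strip()):
        # Drop comment lines, which start with ';' in .opt files.
        lines = [l for l in record.splitlines() if not l.startswith(";")]
        if len(lines) < 2:
            continue
        name, properties = lines[0], lines[1]
        if "NoOffload" in properties.split():
            names.append(name)
    return names


if __name__ == "__main__":
    for name in no_offload_options(SAMPLE):
        print(name)
```

Something along these lines could feed a generated table in the manual, so the list of exceptions would not need to be maintained by hand.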


Grüße
  Thomas


diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index a3d7e5a478e..31756ec380d 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -1313,6 +1313,12 @@ for (i = 0; i < n_opts; i++) {
                # offloading is enabled.
                if (flag_set_p("Target", flags[i]))
                        var_target_opt[n_opt_val] = 1;
+
+               # These options should not be passed from host to target, but
+               # are not actually target specific.
+               if (flag_set_p("NoOffload", flags[i]))
+                       var_target_opt[n_opt_val] = 2;
+
                n_opt_val++;
        }
  }
@@ -1393,7 +1399,7 @@ for (i = 0; i < n_opt_val; i++) {
                # Do not stream out target-specific opts if offloading is
                # enabled.
                if (var_target_opt[i])
-                       print "  if (!lto_stream_offload_p)"
+                       print "  if (!lto_stream_offload_p) {"
                # If applicable, encode the streamed value.
                if (var_opt_optimize_init[i]) {
                        print "  if (" var_opt_optimize_init[i] " > (" var_opt_val_type[i] ") 10)";
@@ -1403,6 +1409,8 @@ for (i = 0; i < n_opt_val; i++) {
                } else {
                        print "  bp_pack_var_len_" sgn " (bp, ptr->" name");";
                }
+               if (var_target_opt[i])
+                       print "}"
        }
  }
  print "  for (size_t i = 0; i < ARRAY_SIZE (ptr->explicit_mask); i++)";
@@ -1418,10 +1426,14 @@ print "                           struct cl_optimization *ptr ATTRIBUTE_UNUSED)"
  print "{";
  for (i = 0; i < n_opt_val; i++) {
        name = var_opt_val[i]
-        if (var_target_opt[i]) {
+        if (var_target_opt[i] == 1) {
                print "#ifdef ACCEL_COMPILER"
                print "#error accel compiler cannot define Optimization attribute for target-specific option " name;
                print "#else"
+       } else if (var_target_opt[i] == 2) {
+               print "#ifdef ACCEL_COMPILER"
+               print "  ptr->" name " = global_options." name ";"
+               print "#else"
        }
        otype = var_opt_val_type[i];
        if (otype ~ "^const char \\**$") {
@@ -1489,6 +1501,9 @@ for (i = 0; i < n_opts; i++) {
        if (flag_set_p("Warning", flags[i]))
                continue;
+       if (flag_set_p("NoOffload", flags[i]))
+               continue;
+
        if (name in checked_options)
                continue;
        checked_options[name]++
diff --git a/gcc/params.opt b/gcc/params.opt
index c7d5fd4d13b..ac1b2c7eb26 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1226,7 +1226,7 @@ Common Joined UInteger Var(param_use_canonical_types) Init(1) IntegerRange(0, 1)
  Whether to use canonical types.
-param=vect-epilogues-nomask=
-Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1) IntegerRange(0, 1) Param Optimization
+Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1) IntegerRange(0, 1) Param Optimization NoOffload
  Enable loop epilogue vectorization using smaller vector size.
-param=vect-max-layout-candidates=
@@ -1246,11 +1246,11 @@ Common Joined UInteger Var(param_vect_max_version_for_alignment_checks) Init(6)
  Bound on number of runtime checks inserted by the vectorizer's loop versioning for alignment check.
-param=vect-partial-vector-usage=
-Common Joined UInteger Var(param_vect_partial_vector_usage) Init(2) IntegerRange(0, 2) Param Optimization
+Common Joined UInteger Var(param_vect_partial_vector_usage) Init(2) IntegerRange(0, 2) Param Optimization NoOffload
  Controls how loop vectorizer uses partial vectors.  0 means never, 1 means only for loops whose need to iterate can be removed, 2 means for all loops.  The default value is 2.
-param=vect-inner-loop-cost-factor=
-Common Joined UInteger Var(param_vect_inner_loop_cost_factor) Init(50) IntegerRange(1, 10000) Param Optimization
+Common Joined UInteger Var(param_vect_inner_loop_cost_factor) Init(50) IntegerRange(1, 10000) Param Optimization NoOffload
  The maximum factor which the loop vectorizer applies to the cost of statements in an inner loop relative to the loop being vectorized.
-param=vect-induction-float=
