On Thu, 4 Jan 2024, Jan Hubicka wrote:

> Hi,
> this patch adds new option -falign-all-functions which works like
> -falign-functions, but applies to all functions including those in cold
> regions.  As discussed in the PR log, this is needed for atomically
> patching function entries in the kernel.
> 
> An option would be to make -falign-function mandatory, but I think it is not a
> good idea, since original purpose of -falign-funtions is optimization of
> instruction decode and cache size.  Having -falign-all-functions is
> backwards compatible.  Richi also suggested extending syntax of the
> -falign-functions parameters (which is already non-trivial) but it seems
> to me that having separate flag is more readable.
> 
> Bootstrapped/regtested x86_64-linux, OK for master and later
> backports to release branches?
> 
> gcc/ChangeLog:
>       
>       PR middle-end/88345
>       * common.opt: Add -falign-all-functions
>       * doc/invoke.texi: Add -falign-all-functions.
>       (-falign-functions, -falign-labels, -falign-loops): Document
>       that alignment is ignored in cold code.
>       * flags.h (align_loops): Reindent.
>       (align_jumps): Reindent.
>       (align_labels): Reindent.
>       (align_functions): Reindent.
>       (align_all_functions): New macro.
>       * opts.cc (common_handle_option): Handle -falign-all-functions.
>       * toplev.cc (parse_alignment_opts): Likewise.
>       * varasm.cc (assemble_start_function): Likewise.
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d263a959df3..fea2c855fcf 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1033,6 +1033,13 @@ faggressive-loop-optimizations
>  Common Var(flag_aggressive_loop_optimizations) Optimization Init(1)
>  Aggressively optimize loops using language constraints.
>  
> +falign-all-functions
> +Common Var(flag_align_all_functions) Optimization
> +Align the start of functions.

all functions

or maybe "of every function."?

> +
> +falign-all-functions=
> +Common RejectNegative Joined Var(str_align_all_functions) Optimization
> +
>  falign-functions
>  Common Var(flag_align_functions) Optimization
>  Align the start of functions.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d272b9228dd..ad3d75d310c 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -543,6 +543,7 @@ Objective-C and Objective-C++ Dialects}.
>  @xref{Optimize Options,,Options that Control Optimization}.
>  @gccoptlist{-faggressive-loop-optimizations
>  -falign-functions[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
> +-falign-all-functions=[@var{n}]
>  -falign-jumps[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
>  -falign-labels[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
>  -falign-loops[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
> @@ -14177,6 +14178,9 @@ Align the start of functions to the next power-of-two 
> greater than or
>  equal to @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at
>  least the first @var{m} bytes of the function can be fetched by the CPU
>  without crossing an @var{n}-byte alignment boundary.
> +This is an optimization of code performance and alignment is ignored for
> +functions considered cold.  If alignment is required for all functions,
> +use @option{-falign-all-functions}.
>  
>  If @var{m} is not specified, it defaults to @var{n}.
>  
> @@ -14210,6 +14214,12 @@ overaligning functions. It attempts to instruct the 
> assembler to align
>  by the amount specified by @option{-falign-functions}, but not to
>  skip more bytes than the size of the function.
>  
> +@opindex falign-all-functions=@var{n}
> +@item -falign-all-functions
> +Specify minimal alignment for function entry. Unlike 
> @option{-falign-functions}
> +this alignment is applied also to all functions (even those considered cold).
> +The alignment is also not affected by @option{-flimit-function-alignment}
> +

For functions with two entries (like on powerpc), which entry does this
apply to?  I suppose the external ABI entry, not the local one?  But
how does this then help to align the patchable entry (the common
local entry should be aligned?).  Should we align _both_ entries?

>  @opindex falign-labels
>  @item -falign-labels
>  @itemx -falign-labels=@var{n}
> @@ -14240,6 +14250,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
>  Align loops to a power-of-two boundary.  If the loops are executed
>  many times, this makes up for any execution of the dummy padding
>  instructions.
> +This is an optimization of code performance and alignment is ignored for
> +loops considered cold.
>  
>  If @option{-falign-labels} is greater than this value, then its value
>  is used instead.
> @@ -14262,6 +14274,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
>  Align branch targets to a power-of-two boundary, for branch targets
>  where the targets can only be reached by jumping.  In this case,
>  no dummy operations need be executed.
> +This is an optimization of code performance and alignment is ignored for
> +jumps considered cold.
>  
>  If @option{-falign-labels} is greater than this value, then its value
>  is used instead.
> @@ -14371,7 +14385,7 @@ To use the link-time optimizer, @option{-flto} and 
> optimization
>  options should be specified at compile time and during the final link.
>  It is recommended that you compile all the files participating in the
>  same link with the same options and also specify those options at
> -link time.  
> +link time.
>  For example:
>  
>  @smallexample
> diff --git a/gcc/flags.h b/gcc/flags.h
> index e4bafa310d6..ecf4fb9e846 100644
> --- a/gcc/flags.h
> +++ b/gcc/flags.h
> @@ -89,6 +89,7 @@ public:
>    align_flags x_align_jumps;
>    align_flags x_align_labels;
>    align_flags x_align_functions;
> +  align_flags x_align_all_functions;
>  };
>  
>  extern class target_flag_state default_target_flag_state;
> @@ -98,10 +99,11 @@ extern class target_flag_state *this_target_flag_state;
>  #define this_target_flag_state (&default_target_flag_state)
>  #endif
>  
> -#define align_loops   (this_target_flag_state->x_align_loops)
> -#define align_jumps   (this_target_flag_state->x_align_jumps)
> -#define align_labels  (this_target_flag_state->x_align_labels)
> -#define align_functions       (this_target_flag_state->x_align_functions)
> +#define align_loops      (this_target_flag_state->x_align_loops)
> +#define align_jumps      (this_target_flag_state->x_align_jumps)
> +#define align_labels     (this_target_flag_state->x_align_labels)
> +#define align_functions          (this_target_flag_state->x_align_functions)
> +#define align_all_functions (this_target_flag_state->x_align_all_functions)
>  
>  /* Returns TRUE if generated code should match ABI version N or
>     greater is in use.  */
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 7a3830caaa3..3fa521501ff 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -3342,6 +3342,12 @@ common_handle_option (struct gcc_options *opts,
>                               &opts->x_str_align_functions);
>        break;
>  
> +    case OPT_falign_all_functions_:
> +      check_alignment_argument (loc, arg, "all-functions",
> +                             &opts->x_flag_align_all_functions,
> +                             &opts->x_str_align_all_functions);
> +      break;
> +
>      case OPT_ftabstop_:
>        /* It is documented that we silently ignore silly values.  */
>        if (value >= 1 && value <= 100)
> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
> index 85450d97a1a..3dd6f4e1ef7 100644
> --- a/gcc/toplev.cc
> +++ b/gcc/toplev.cc
> @@ -1219,6 +1219,7 @@ parse_alignment_opts (void)
>    parse_N_M (str_align_jumps, align_jumps);
>    parse_N_M (str_align_labels, align_labels);
>    parse_N_M (str_align_functions, align_functions);
> +  parse_N_M (str_align_all_functions, align_all_functions);
>  }
>  
>  /* Process the options that have been parsed.  */
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 69f8f8ee018..ddb8536a337 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -1919,6 +1919,37 @@ assemble_start_function (tree decl, const char *fnname)
>        ASM_OUTPUT_ALIGN (asm_out_file, align);
>      }
>  
> +  /* Handle forced alignment.  This really ought to apply to all functions,
> +     since it is used by patchable entries.  */
> +  if (align_all_functions.levels[0].log > align)
> +    {
> +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
> +      int align_log = align_all_functions.levels[0].log;
> +#endif
> +      int max_skip = align_all_functions.levels[0].maxskip;
> +      if (flag_limit_function_alignment && crtl->max_insn_address > 0
> +       && max_skip >= crtl->max_insn_address)
> +     max_skip = crtl->max_insn_address - 1;
> +
> +#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
> +      ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
> +      if (max_skip >= (1 << align_log) - 1)
> +     align = align_functions.levels[0].log;
> +      if (max_skip == align_all_functions.levels[0].maxskip)
> +     {
> +       ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file,
> +                                  align_all_functions.levels[1].log,
> +                                  align_all_functions.levels[1].maxskip);
> +       if (align_all_functions.levels[1].maxskip
> +           >= (1 << align_all_functions.levels[1].log) - 1)
> +         align = align_all_functions.levels[1].log;
> +     }
> +#else
> +      ASM_OUTPUT_ALIGN (asm_out_file, align_all_functions.levels[0].log);
> +      align = align_all_functions.levels[0].log;
> +#endif

This looks close to the handling of user-specified alignment
(factor sth out?), but I wonder if for -falign-all-functions
we should only allow a hard alignment (no max_skip) and also
not allow (but diagnose?) conflicts with limit-function-alignment?

The interaction with the other flags also doesn't seem to be
well documented?  The main docs suggest it should be
-fmin-function-alignment= which to me then suggests
-flimit-function-alignment should not have an effect on it
and even very small functions should be aligned.

Richard.

> +    }
> +
>    /* Handle a user-specified function alignment.
>       Note that we still need to align to DECL_ALIGN, as above,
>       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to