Re: [PATCH] predict: Adjust optimize_function_for_size_p [PR105818]

2022-09-27 Thread Kewen.Lin via Gcc-patches
on 2022/8/29 14:35, Kewen.Lin via Gcc-patches wrote:
> on 2022/8/15 16:33, Kewen.Lin via Gcc-patches wrote:
>> on 2022/7/11 11:42, Kewen.Lin wrote:
>>> on 2022/6/15 14:20, Kewen.Lin wrote:
 Hi Honza,

 Thanks for the comments!  Some replies are inlined below.

 on 2022/6/14 19:37, Jan Hubicka wrote:
>> Hi,
>>
>> Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO
>> if func->decl is not null but no cgraph node is available for it.
>> As PR105818 shows, this can give an unexpected result.  For the
>> case in PR105818, when parsing the bar decl in function foo, cfun
>> is the function structure for foo, for which there is no cgraph
>> node, so it returns OPTIMIZE_SIZE_NO.  But that is incorrect, since
>> the context is to optimize for size: the flag optimize_size is
>> true.
>>
>> The patch makes optimize_function_for_size_p check
>> optimize_size, as it already does when func->decl is unavailable.
>>
>> One regression failure got exposed on aarch64-linux-gnu:
>>
>> PASS->FAIL: gcc.dg/guality/pr54693-2.c   -Os \
>>  -DPREVENT_OPTIMIZATION  line 21 x == 10 - i
>>
>> The difference comes from the macro LOGICAL_OP_NON_SHORT_CIRCUIT
>> used in function fold_range_test during C parsing; it uses
>> optimize_function_for_speed_p, which is the inversion
>> of optimize_function_for_size_p.  At that time cfun->decl is valid
>> but there is no cgraph node for it, so without this patch
>> optimize_function_for_speed_p eventually returns true, while it
>> returns false with this patch.  Since the command line option -Os
>> is specified, there is no reason to interpret it as "for speed".
>> I think this failure is expected, so I adjusted the test case
>> accordingly.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -
>>
>>  PR target/105818
>>
>> gcc/ChangeLog:
>>
>>  * predict.cc (optimize_function_for_size_p): Check optimize_size when
>>  func->decl is valid but its cgraph node is unavailable.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/powerpc/pr105818.c: New test.
>>  * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
>> ---
>>  gcc/predict.cc  | 2 +-
>>  gcc/testsuite/gcc.dg/guality/pr54693-2.c| 2 +-
>>  gcc/testsuite/gcc.target/powerpc/pr105818.c | 9 +
>>  3 files changed, 11 insertions(+), 2 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr105818.c
>>
>> diff --git a/gcc/predict.cc b/gcc/predict.cc
>> index 5734e4c8516..6c60a973236 100644
>> --- a/gcc/predict.cc
>> +++ b/gcc/predict.cc
>> @@ -268,7 +268,7 @@ optimize_function_for_size_p (struct function *fun)
>>cgraph_node *n = cgraph_node::get (fun->decl);
>>if (n)
>>  return n->optimize_for_size_p ();
>> -  return OPTIMIZE_SIZE_NO;
>> +  return optimize_size ? OPTIMIZE_SIZE_MAX : OPTIMIZE_SIZE_NO;
>
> We could also do opt_for_fn (cfun->decl, optimize_size), which is
> probably better since one can change optimize_size with the optimization
> attribute.

 Good point, agree!
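
 The fallback order under discussion can be modelled outside GCC.  The
 sketch below is a stand-alone C model, not the code in predict.cc: the
 types and names (size_level, node_model, fn_model, fn_for_size_p) are
 hypothetical stand-ins, and fn_optimize_size plays the role of the value
 that opt_for_fn (decl, optimize_size) would yield for that function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC internals; only the control flow of the
   proposed fallback is modelled.  */
enum size_level { LEVEL_NO, LEVEL_MAX };

struct node_model { enum size_level level; };  /* plays the cgraph node */

struct fn_model
{
  bool has_decl;                  /* models fun->decl != NULL */
  const struct node_model *node;  /* NULL while no cgraph node exists */
  bool fn_optimize_size;          /* models a per-function optimize_size */
};

/* Models the global optimize_size flag, e.g. -Os on the command line.  */
bool global_optimize_size = true;

enum size_level
fn_for_size_p (const struct fn_model *fun)
{
  /* No function or no decl: only the global flag is available.  */
  if (fun == NULL || !fun->has_decl)
    return global_optimize_size ? LEVEL_MAX : LEVEL_NO;

  /* A cgraph node exists: let it decide.  */
  if (fun->node != NULL)
    return fun->node->level;

  /* Decl without a node (the parsing-time case): consult the
     per-function setting rather than answering "no" unconditionally.  */
  return fun->fn_optimize_size ? LEVEL_MAX : LEVEL_NO;
}
```

 With has_decl set, node NULL and fn_optimize_size true, the model
 answers LEVEL_MAX, which is the behaviour argued for above when -Os is
 in effect.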

> However, I think in most cases where we check for optimize_size early
> we are doing something wrong, since at that time there is no profile
> available.  Why exactly does PR105818 hit the flag change issue?

 For PR105818, the reason why the flag changes is as follows:

 Firstly, the inconsistent flag is the OPTION_MASK_SAVE_TOC_INDIRECT bit
 of rs6000_isa_flags_explicit; it is set as below:

 /* If we can shrink-wrap the TOC register save separately, then use
-msave-toc-indirect unless explicitly disabled.  */
 if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
 && flag_shrink_wrap_separate
 && optimize_function_for_speed_p (cfun))
   rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;

 Initially, rs6000 initializes target_option_default_node with
 OPTION_MASK_SAVE_TOC_INDIRECT unset; at that time cfun is NULL
 and optimize_size is true.

 Later, when the C parser handles function foo, it builds the target
 option node as target_option_default_node in function
 handle_optimize_attribute, which also saves and verifies the
 global options; at that time cfun is NULL, so no
 issue is found.  Function store_parm_decls then allocates a
 struct function for foo, so cfun becomes the function struct
 for foo.  When the C parser continues to handle the decl bar in
 foo, handle_optimize_attribute works as before and
 tries to restore the target options at the end: it calls
 targetm.target_option.restore (rs6000_function_specific_restore),
 which calls rs6000_option_override_internal again;
 at this time cfun is not NULL while there is no cgraph
 node for its 

PING^1 [PATCH] Adjust the symbol for SECTION_LINK_ORDER linked_to section [PR99889]

2022-09-27 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600190.html

BR,
Kewen

on 2022/8/24 16:17, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> As discussed in PR98125, -fpatchable-function-entry with
> SECTION_LINK_ORDER support doesn't work well on powerpc64
> ELFv1 because the filled "Symbol" in
> 
>   .section name,"flags"o,@type,Symbol
> 
> sits in .opd section instead of in the function_section
> like .text or named .text*.
> 
> Since we already generate one label LPFE* which sits in the
> function_section of current_function_decl, this patch
> reuses it as the symbol for the linked_to section.  That
> avoids the above ABI-specific issue that arises when using the
> symbol derived from current_function_decl.
> 
> Besides, with this support some previous workarounds for
> powerpc64 ELFv1 can be reverted.
> 
> By the way, rs6000_print_patchable_function_entry could be dropped,
> but another rs6000 patch needs this rs6000-specific hook
> rs6000_print_patchable_function_entry.  I am not sure which one
> will land first, so I just leave it here.
> 
> Bootstrapped and regtested on the following:
> 
>   1) powerpc64-linux-gnu P8 with default binutils 2.27
>  and latest binutils 2.39.
>   2) powerpc64le-linux-gnu P9 (default binutils 2.30).
>   3) powerpc64le-linux-gnu P10 (default binutils 2.30).
>   4) x86_64-redhat-linux with default binutils 2.30
>  and latest binutils 2.39.
>   5) aarch64-linux-gnu  with default binutils 2.30
>  and latest binutils 2.39.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
> 
>   PR target/99889
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry):
>   Adjust to call function default_print_patchable_function_entry.
>   * targhooks.cc (default_print_patchable_function_entry_1): Remove and
>   move the flags preparation ...
>   (default_print_patchable_function_entry): ... here, adjust to use
>   current_function_funcdef_no for label no.
>   * targhooks.h (default_print_patchable_function_entry_1): Remove.
>   * varasm.cc (default_elf_asm_named_section): Adjust code for
>   __patchable_function_entries section support with LPFE label.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/pr93195a.C: Remove the skip on powerpc*-*-* 64-bit.
>   * gcc.target/aarch64/pr92424-2.c: Adjust LPFE1 with LPFE0.
>   * gcc.target/aarch64/pr92424-3.c: Likewise.
>   * gcc.target/i386/pr93492-2.c: Likewise.
>   * gcc.target/i386/pr93492-3.c: Likewise.
>   * gcc.target/i386/pr93492-4.c: Likewise.
>   * gcc.target/i386/pr93492-5.c: Likewise.
> ---
>  gcc/config/rs6000/rs6000.cc  | 13 +-
>  gcc/varasm.cc| 15 ---
>  gcc/targhooks.cc | 45 +++-
>  gcc/targhooks.h  |  3 --
>  gcc/testsuite/g++.dg/pr93195a.C  |  1 -
>  gcc/testsuite/gcc.target/aarch64/pr92424-2.c |  4 +-
>  gcc/testsuite/gcc.target/aarch64/pr92424-3.c |  4 +-
>  gcc/testsuite/gcc.target/i386/pr93492-2.c|  4 +-
>  gcc/testsuite/gcc.target/i386/pr93492-3.c|  4 +-
>  gcc/testsuite/gcc.target/i386/pr93492-4.c|  4 +-
>  gcc/testsuite/gcc.target/i386/pr93492-5.c|  4 +-
>  11 files changed, 40 insertions(+), 61 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index df491bee2ea..dba28b8e647 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -14771,18 +14771,9 @@ rs6000_print_patchable_function_entry (FILE *file,
>  unsigned HOST_WIDE_INT patch_area_size,
>  bool record_p)
>  {
> -  unsigned int flags = SECTION_WRITE | SECTION_RELRO;
> -  /* When .opd section is emitted, the function symbol
> - default_print_patchable_function_entry_1 is emitted into the .opd section
> - while the patchable area is emitted into the function section.
> - Don't use SECTION_LINK_ORDER in that case.  */
> -  if (!(TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2)
> -  && HAVE_GAS_SECTION_LINK_ORDER)
> -flags |= SECTION_LINK_ORDER;
> -  default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
> - flags);
> +  default_print_patchable_function_entry (file, patch_area_size, record_p);
>  }
> -
> 
> +
>  enum rtx_code
>  rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
>  {
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 4db8506b106..d4de6e164ee 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -6906,11 +6906,16 @@ default_elf_asm_named_section (const char *name, unsigned int flags,
>   fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
>if (flags & SECTION_LINK_ORDER)
>   {
> -   tree id = DECL_ASSEMBLER_NAME (decl);
> -   ultimate_transparent_alias_target ();
> -   const char *name = IDENTIFIER_POINTER (id);
> -   name = 

PING^1 [PATCH v4] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-09-27 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600277.html

BR,
Kewen

on 2022/8/25 13:50, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> As PR99888 and its related PRs show, the current support for
> -fpatchable-function-entry on powerpc ELFv2 doesn't work
> well when a global entry exists.  For example, with the
> command line option -fpatchable-function-entry=3,2, we get
> the following without this patch:
> 
>   .LPFE1:
> nop
> nop
> .type   foo, @function
>   foo:
> nop
>   .LFB0:
> .cfi_startproc
>   .LCF0:
>   0:  addis 2,12,.TOC.-.LCF0@ha
> addi 2,2,.TOC.-.LCF0@l
> .localentry foo,.-foo
> 
> This assembly is unexpected, since the patched nops have
> no effect when the function is entered from the local entry.
> 
> This patch places the patched nops before and after the
> local entry instead, so it looks like:
> 
> .type   foo, @function
>   foo:
>   .LFB0:
> .cfi_startproc
>   .LCF0:
>   0:  addis 2,12,.TOC.-.LCF0@ha
> addi 2,2,.TOC.-.LCF0@l
> nop
> nop
> .localentry foo,.-foo
> nop
> 
> Bootstrapped and regtested on powerpc64-linux-gnu P7 & P8,
> and powerpc64le-linux-gnu P9 & P10.
> 
> v4: Change the remaining NOP to nop and update documentation of option
> -fpatchable-function-entry for PowerPC ELFv2 ABI dual entry points
> as Segher suggested.
> 
> v3: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599925.html
> 
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599617.html
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599461.html
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> 
>   PR target/99888
>   PR target/105649
> 
> gcc/ChangeLog:
> 
>   * doc/invoke.texi (option -fpatchable-function-entry): Adjust the
>   documentation for PowerPC ELFv2 ABI dual entry points.
>   * config/rs6000/rs6000-internal.h
>   (rs6000_print_patchable_function_entry): New function declaration.
>   * config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue):
>   Support patchable-function-entry by emitting nops before and after
>   local entry for the function that needs global entry.
>   * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): Skip
>   the function that needs global entry till global entry has been
>   emitted.
>   * config/rs6000/rs6000.h (struct machine_function): New bool member
>   global_entry_emitted.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr99888-1.c: New test.
>   * gcc.target/powerpc/pr99888-2.c: New test.
>   * gcc.target/powerpc/pr99888-3.c: New test.
>   * gcc.target/powerpc/pr99888-4.c: New test.
>   * gcc.target/powerpc/pr99888-5.c: New test.
>   * gcc.target/powerpc/pr99888-6.c: New test.
>   * c-c++-common/patchable_function_entry-default.c: Adjust for
>   powerpc_elfv2 to avoid compilation error.
> ---
>  gcc/config/rs6000/rs6000-internal.h   |  5 +++
>  gcc/config/rs6000/rs6000-logue.cc | 32 ++
>  gcc/config/rs6000/rs6000.cc   | 10 -
>  gcc/config/rs6000/rs6000.h|  4 ++
>  gcc/doc/invoke.texi   |  8 +++-
>  .../patchable_function_entry-default.c|  3 ++
>  gcc/testsuite/gcc.target/powerpc/pr99888-1.c  | 43 +++
>  gcc/testsuite/gcc.target/powerpc/pr99888-2.c  | 43 +++
>  gcc/testsuite/gcc.target/powerpc/pr99888-3.c  | 11 +
>  gcc/testsuite/gcc.target/powerpc/pr99888-4.c  | 13 ++
>  gcc/testsuite/gcc.target/powerpc/pr99888-5.c  | 13 ++
>  gcc/testsuite/gcc.target/powerpc/pr99888-6.c  | 14 ++
>  12 files changed, 195 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-2.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-3.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-4.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-5.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-6.c
> 
> diff --git a/gcc/config/rs6000/rs6000-internal.h b/gcc/config/rs6000/rs6000-internal.h
> index 8ee8c987b81..d80c04b5ae5 100644
> --- a/gcc/config/rs6000/rs6000-internal.h
> +++ b/gcc/config/rs6000/rs6000-internal.h
> @@ -183,10 +183,15 @@ extern tree rs6000_fold_builtin (tree fndecl ATTRIBUTE_UNUSED,
>tree *args ATTRIBUTE_UNUSED,
>bool ignore ATTRIBUTE_UNUSED);
> 
> +extern void rs6000_print_patchable_function_entry (FILE *,
> +unsigned HOST_WIDE_INT,
> +bool);
> +
>  extern bool rs6000_passes_float;
>  extern bool rs6000_passes_long_double;
>  extern bool rs6000_passes_vector;
>  extern bool rs6000_returns_struct;
>  extern bool cpu_builtin_p;
> 
> +
>  

PING^1 [PATCH] rs6000/test: Adjust pr104992.c with vect_int_mod [PR106516]

2022-09-27 Thread Kewen.Lin via Gcc-patches
Hi,

I assumed the generic part introducing check_effective_target_vect_int_mod
needs the approval from global maintainers.

So gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600191.html

BR,
Kewen

on 2022/8/24 16:17, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> As PR106516 shows, we can get unexpected gimple outputs for
> function thud on targets which support the modulus operation
> for vector int.  This patch introduces an effective target
> vect_int_mod for it, then adjusts the test case to use it.
> 
> Tested on x86_64-redhat-linux and powerpc64{,le}-linux-gnu,
> especially powerpc64le Power10.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
>   PR testsuite/106516
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/pr104992.c: Adjust with vect_int_mod.
>   * lib/target-supports.exp (check_effective_target_vect_int_mod): New
>   proc for effective target vect_int_mod.
> ---
>  gcc/testsuite/gcc.dg/pr104992.c   | 3 ++-
>  gcc/testsuite/lib/target-supports.exp | 8 
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/pr104992.c b/gcc/testsuite/gcc.dg/pr104992.c
> index 217c89a458c..82f8c75559c 100644
> --- a/gcc/testsuite/gcc.dg/pr104992.c
> +++ b/gcc/testsuite/gcc.dg/pr104992.c
> @@ -54,4 +54,5 @@ __attribute__((noipa)) unsigned waldo (unsigned x, unsigned y, unsigned z) {
>  return x / y * z == x;
>  }
> 
> -/* { dg-final {scan-tree-dump-times " % " 9 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " % " 9 "optimized" { target { ! vect_int_mod } } } } */
> +/* { dg-final { scan-tree-dump-times " % " 6 "optimized" { target vect_int_mod } } } */
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 04a2a8e8659..a4bdd23bed0 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -8239,6 +8239,14 @@ proc check_effective_target_vect_long_mult { } {
>  return $answer
>  }
> 
> +# Return 1 if the target supports vector int modulus, 0 otherwise.
> +
> +proc check_effective_target_vect_int_mod { } {
> +return [check_cached_effective_target_indexed vect_int_mod {
> +  expr { [istarget powerpc*-*-*]
> +  && [check_effective_target_power10_ok] }}]
> +}
> +
>  # Return 1 if the target supports vector even/odd elements extraction, 0 otherwise.
> 
>  proc check_effective_target_vect_extract_even_odd { } {
> --
> 2.27.0


[PATCH] rs6000: Rework option -mpowerpc64 handling [PR106680]

2022-09-27 Thread Kewen.Lin via Gcc-patches
Hi,

PR106680 shows that -m32 -mpowerpc64 behaves differently from
-mpowerpc64 -m32; this is caused by the way we
handle option powerpc64 in rs6000_handle_option.

Segher pointed out that this difference should be treated as
a bug and that we should ensure option powerpc64 is
independent of -m32/-m64.  So this patch removes the
handling in rs6000_handle_option and adds the necessary
support in rs6000_option_override_internal instead.

With this patch, if users specify -m{no-,}powerpc64, the
specified value is honoured; otherwise, for 64-bit it
always enables OPTION_MASK_POWERPC64, while for 32-bit
it disables OPTION_MASK_POWERPC64 if OS_MISSING_POWERPC64.
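
The resulting decision can be summarised as a small stand-alone C
model.  This is only a sketch of the rule as described above, not the
code added to rs6000_option_override_internal: the struct, its field
names and resolve_powerpc64 are hypothetical, and the final fallback
(keeping whatever the selected CPU implies on 32-bit when
OS_MISSING_POWERPC64 is false) is an assumption about the case the
description leaves open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs; none of these names come from rs6000.cc.  */
struct ppc64_inputs
{
  bool explicit_p;      /* user passed -mpowerpc64 or -mno-powerpc64 */
  bool explicit_value;  /* the value the user asked for */
  bool is_64bit;        /* compiling for 64-bit */
  bool os_missing;      /* models OS_MISSING_POWERPC64 */
  bool cpu_default;     /* assumed: value implied by the selected CPU */
};

/* Returns whether the powerpc64 mask ends up enabled.  */
bool
resolve_powerpc64 (struct ppc64_inputs in)
{
  /* An explicit option wins, independent of -m32/-m64 ordering.  */
  if (in.explicit_p)
    return in.explicit_value;

  /* No explicit option: 64-bit always enables it.  */
  if (in.is_64bit)
    return true;

  /* 32-bit on an OS that cannot save the high register halves.  */
  if (in.os_missing)
    return false;

  /* Assumption: otherwise the CPU default is left untouched.  */
  return in.cpu_default;
}
```

Because the explicit value is checked first, the order in which -m32
and -mpowerpc64 appear on the command line no longer matters in this
model.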

Bootstrapped and regress-tested on:
  - powerpc64-linux-gnu P7 and P8 {-m64,-m32}
  - powerpc64le-linux-gnu P9 and P10
  - powerpc-ibm-aix7.2.0.0 {-maix64,-maix32}

Hi Iain, could you help to test this on darwin to ensure
it won't break darwin's build and new tests are fine?
Thanks in advance!

Is it ok for trunk if darwin testing goes well?

BR,
Kewen
-
PR target/106680

gcc/ChangeLog:

* common/config/rs6000/rs6000-common.cc (rs6000_handle_option): Remove
the adjustment for option powerpc64 in -m64 handling, and remove the
whole -m32 handling.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): When no
explicit powerpc64 option is provided, enable it at -m64 and disable it
for OS_MISSING_POWERPC64.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106680-1.c: New test.
* gcc.target/powerpc/pr106680-2.c: New test.
* gcc.target/powerpc/pr106680-3.c: New test.
* gcc.target/powerpc/pr106680-4.c: New test.
---
 gcc/common/config/rs6000/rs6000-common.cc | 11 ---
 gcc/config/rs6000/rs6000.cc   | 33 ++-
 gcc/testsuite/gcc.target/powerpc/pr106680-1.c | 12 +++
 gcc/testsuite/gcc.target/powerpc/pr106680-2.c | 13 
 gcc/testsuite/gcc.target/powerpc/pr106680-3.c | 12 +++
 gcc/testsuite/gcc.target/powerpc/pr106680-4.c | 16 +
 6 files changed, 77 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106680-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106680-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106680-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106680-4.c

diff --git a/gcc/common/config/rs6000/rs6000-common.cc b/gcc/common/config/rs6000/rs6000-common.cc
index 8e393d08a23..c76b5c27bb6 100644
--- a/gcc/common/config/rs6000/rs6000-common.cc
+++ b/gcc/common/config/rs6000/rs6000-common.cc
@@ -119,19 +119,8 @@ rs6000_handle_option (struct gcc_options *opts, struct gcc_options *opts_set,
 #else
 case OPT_m64:
 #endif
-  opts->x_rs6000_isa_flags |= OPTION_MASK_POWERPC64;
   opts->x_rs6000_isa_flags |= (~opts_set->x_rs6000_isa_flags
   & OPTION_MASK_PPC_GFXOPT);
-  opts_set->x_rs6000_isa_flags |= OPTION_MASK_POWERPC64;
-  break;
-
-#ifdef TARGET_USES_AIX64_OPT
-case OPT_maix32:
-#else
-case OPT_m32:
-#endif
-  opts->x_rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
-  opts_set->x_rs6000_isa_flags |= OPTION_MASK_POWERPC64;
   break;

 case OPT_mminimal_toc:
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index e6fa3ad0eb7..605d35893f9 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3648,17 +3648,12 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_pointer_size = 32;
 }

-  /* Some OSs don't support saving the high part of 64-bit registers on context
- switch.  Other OSs don't support saving Altivec registers.  On those OSs,
- we don't touch the OPTION_MASK_POWERPC64 or OPTION_MASK_ALTIVEC settings;
- if the user wants either, the user must explicitly specify them and we
- won't interfere with the user's specification.  */
+  /* Some OSs don't support saving Altivec registers.  On those OSs, we don't
+ touch the OPTION_MASK_POWERPC64 or OPTION_MASK_ALTIVEC settings; if the
+ user wants either, the user must explicitly specify them and we won't
+ interfere with the user's specification.  */

   set_masks = POWERPC_MASKS;
-#ifdef OS_MISSING_POWERPC64
-  if (OS_MISSING_POWERPC64)
-set_masks &= ~OPTION_MASK_POWERPC64;
-#endif
 #ifdef OS_MISSING_ALTIVEC
   if (OS_MISSING_ALTIVEC)
 set_masks &= ~(OPTION_MASK_ALTIVEC | OPTION_MASK_VSX
@@ -3753,6 +3748,26 @@ rs6000_option_override_internal (bool global_init_p)
error ("AltiVec not supported in this target");
 }

+  /* With option powerpc64 specified explicitly (either on or off), even if
+ being compiled for 64 bit we don't need to check if it's disabled here,
+ since subtargets will check and raise an error message if necessary
+ later.  But without option powerpc64 specified explicitly, we need to
+ ensure powerpc64 enabled for 64 bit and disabled on those OSes with
+ 

Re: [PATCH v2] Libvtv-test: Fix bug that scansarif.exp cannot be found in libvtv regression test.

2022-09-27 Thread Lulu Cheng



On 2022/9/27 at 10:01 PM, David Malcolm wrote:

On Tue, 2022-09-27 at 14:02 +0800, Lulu Cheng wrote:

 SARIF support was added in r13-967 but libvtv wasn't updated.

Sorry about breaking this.  The patch looks reasonable to me, FWIW,
assuming that it fixes the issue, of course!

Looks like my normal testing process missed this when I was testing the
SARIF patch; presumably we need to configure with
--enable-vtable-verify=yes to enable this feature.

Thanks
Dave


Hi, Dave:

 The patch passes the test; if there are no objections, I will merge it into the
master branch.




Re: [PATCH] i386: Mark XMM4-XMM6 as clobbered by encodekey128/encodekey256

2022-09-27 Thread Hongtao Liu via Gcc-patches
On Wed, Sep 28, 2022 at 7:35 AM H.J. Lu via Gcc-patches
 wrote:
>
> encodekey128 and encodekey256 operations clear XMM4-XMM6.  But it is
> documented that XMM4-XMM6 are reserved for future usages and software
> should not rely upon them being zeroed.  Change encodekey128 and
Indeed. Ok for trunk and backport.
> encodekey256 to clobber XMM4-XMM6.
>
> gcc/
>
> PR target/107061
> * config/i386/predicates.md (encodekey128_operation): Check
> XMM4-XMM6 as clobbered.
> (encodekey256_operation): Likewise.
> * config/i386/sse.md (encodekey128u32): Clobber XMM4-XMM6.
> (encodekey256u32): Likewise.
>
> gcc/testsuite/
>
> PR target/107061
> * gcc.target/i386/keylocker-encodekey128.c: Don't check
> XMM4-XMM6.
> * gcc.target/i386/keylocker-encodekey256.c: Likewise.
> ---
>  gcc/config/i386/predicates.md | 20 +--
>  gcc/config/i386/sse.md|  4 ++--
>  .../gcc.target/i386/keylocker-encodekey128.c  |  1 -
>  .../gcc.target/i386/keylocker-encodekey256.c  |  1 -
>  4 files changed, 12 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> index 655eabf793b..c4141a96735 100644
> --- a/gcc/config/i386/predicates.md
> +++ b/gcc/config/i386/predicates.md
> @@ -2107,11 +2107,11 @@ (define_predicate "encodekey128_operation"
>for(i = 4; i < 7; i++)
>  {
>elt = XVECEXP (op, 0, i);
> -  if (GET_CODE (elt) != SET
> - || GET_CODE (SET_DEST (elt)) != REG
> - || GET_MODE (SET_DEST (elt)) != V2DImode
> - || REGNO (SET_DEST (elt)) != GET_SSE_REGNO (i)
> - || SET_SRC (elt) != CONST0_RTX (V2DImode))
> +  if (GET_CODE (elt) != CLOBBER
> + || GET_MODE (elt) != VOIDmode
> + || GET_CODE (XEXP (elt, 0)) != REG
> + || GET_MODE (XEXP (elt, 0)) != V2DImode
> + || REGNO (XEXP (elt, 0)) != GET_SSE_REGNO (i))
> return false;
>  }
>
> @@ -2157,11 +2157,11 @@ (define_predicate "encodekey256_operation"
>for(i = 4; i < 7; i++)
>  {
>elt = XVECEXP (op, 0, i + 1);
> -  if (GET_CODE (elt) != SET
> - || GET_CODE (SET_DEST (elt)) != REG
> - || GET_MODE (SET_DEST (elt)) != V2DImode
> - || REGNO (SET_DEST (elt)) != GET_SSE_REGNO (i)
> - || SET_SRC (elt) != CONST0_RTX (V2DImode))
> +  if (GET_CODE (elt) != CLOBBER
> + || GET_MODE (elt) != VOIDmode
> + || GET_CODE (XEXP (elt, 0)) != REG
> + || GET_MODE (XEXP (elt, 0)) != V2DImode
> + || REGNO (XEXP (elt, 0)) != GET_SSE_REGNO (i))
> return false;
>  }
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 5c189635124..076064f97e6 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -29015,7 +29015,7 @@ (define_expand "encodekey128u32"
>
>for (i = 4; i < 7; i++)
>  XVECEXP (operands[2], 0, i)
> -  = gen_rtx_SET (xmm_regs[i], CONST0_RTX (V2DImode));
> +  = gen_rtx_CLOBBER (VOIDmode, xmm_regs[i]);
>
>XVECEXP (operands[2], 0, 7)
>  = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
> @@ -29072,7 +29072,7 @@ (define_expand "encodekey256u32"
>
>for (i = 4; i < 7; i++)
>  XVECEXP (operands[2], 0, i + 1)
> -  = gen_rtx_SET (xmm_regs[i], CONST0_RTX (V2DImode));
> +  = gen_rtx_CLOBBER (VOIDmode, xmm_regs[i]);
>
>XVECEXP (operands[2], 0, 8)
>  = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
> diff --git a/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c b/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
> index 805e0628673..57fa9bdc831 100644
> --- a/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
> +++ b/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
> @@ -6,7 +6,6 @@
>  /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm0,\[^\\n\\r\]*" } } */
>  /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm1,\[^\\n\\r\]*16\[^\\n\\r\]*" } } */
>  /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm2,\[^\\n\\r\]*32\[^\\n\\r\]*" } } */
> -/* { dg-final { scan-assembler "(?:movdqa|movaps)\[ \\t\]+\[^\\n\]*%xmm\[4-6\],\[^\\n\\r\]*" } } */
>
>  #include 
>
> diff --git a/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c b/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c
> index 26f04dcf014..a9398b4e7a2 100644
> --- a/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c
> +++ b/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c
> @@ -8,7 +8,6 @@
>  /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm1,\[^\\n\\r\]*16\[^\\n\\r\]*" } } */
>  /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm2,\[^\\n\\r\]*32\[^\\n\\r\]*" } } */
>  /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm3,\[^\\n\\r\]*48\[^\\n\\r\]*" } } */
> -/* { dg-final { scan-assembler 

Re: [PATCH] Move reload_completed and other rtl.h globals to crtl structure.

2022-09-27 Thread Jeff Law via Gcc-patches


On 7/10/22 12:19, Roger Sayle wrote:

This patch builds upon Richard Biener's suggestion of avoiding global
variables to track state/identify which passes have already been run.
In the early middle-end, the tree-ssa passes use the curr_properties
field in cfun to track this.  This patch uses a new rtl_pass_progress
int field in crtl to do something similar.

This patch allows the global variables lra_in_progress, reload_in_progress,
reload_completed, epilogue_completed and regstack_completed to be removed
from rtl.h and implemented as bits within the new crtl->rtl_pass_progress.
I've also taken the liberty of adding a new combine_completed bit at the
same time [to respond to Segher's comment, it's easy to change this to
combine1_completed and combine2_completed if we ever perform multiple
combine passes (or multiple reload/regstack passes)].  At the same time,
I've also refactored bb_reorder_complete into the same new field;
interestingly bb_reorder_complete was already a bool in crtl.

One very minor advantage of this implementation/refactoring is that the
predicate "can_create_pseudo_p ()" which is semantically defined to be
!reload_in_progress && !reload_completed, can now be performed very
efficiently as effectively the test (progress & 12) == 0, i.e. a single
test instruction on x86.
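
As an illustration of the bit test, here is a stand-alone C sketch.
The bit assignments are hypothetical: they are chosen only so that the
two reload bits together form the mask 12 mentioned above, and the
actual names and values in the patch may differ.  The struct and
function names are likewise made up for this model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout; only the two reload bits (4 and 8, together
   12) are implied by the description.  */
enum
{
  PROGRESS_COMBINE_COMPLETED  = 1 << 0,
  PROGRESS_LRA_IN_PROGRESS    = 1 << 1,
  PROGRESS_RELOAD_IN_PROGRESS = 1 << 2,  /* 4 */
  PROGRESS_RELOAD_COMPLETED   = 1 << 3,  /* 8 */
  PROGRESS_EPILOGUE_COMPLETED = 1 << 4
};

/* Stand-in for the part of crtl that holds the progress bits.  */
struct rtl_state
{
  int rtl_pass_progress;
};

/* Models can_create_pseudo_p (): true only while reload has neither
   started nor finished, checked with a single mask test.  */
bool
can_create_pseudo (const struct rtl_state *s)
{
  return (s->rtl_pass_progress
	  & (PROGRESS_RELOAD_IN_PROGRESS | PROGRESS_RELOAD_COMPLETED)) == 0;
}
```

Unrelated bits such as the combine or epilogue flags do not affect the
result, since they fall outside the mask.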

For consistency, I've also moved cse_not_expected (the last remaining
global variable in rtl.h) into crtl, as its own bool field.

The vast majority of this patch is then churn to handle these changes.
Thanks to macros, most code is unaffected, assuming it treats those
global variables as r-values, though some source files required/may
require tweaks as these "variables" are now defined in emit-rtl.h
instead of rtl.h.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Might this clean-up be acceptable in stage 1,
given the possible temporary disruption transitioning some backends?
I'll start checking various backends myself with cross-compilers, but if
Jeff Law could spin this patch on his build farm, that would help
identify targets that need attention.


Do you have any intention of moving forward with this?  I like what
you've done and I've been carrying around changes to make it work across
a much wider set of targets.  Many targets want to treat
reload_in_progress and reload_completed as lvalues in their thunk code,
so they need simple adjustments.



I suspect some of the untested targets like ia64 will need obvious 
adjustments.  The patch I've had in my tester for the last few months is 
attached...



Jeff



diff --git a/gcc/bb-reorder.cc b/gcc/bb-reorder.cc
index 5cd48255f8a..1c1cbaeb1e2 100644
--- a/gcc/bb-reorder.cc
+++ b/gcc/bb-reorder.cc
@@ -2570,7 +2570,7 @@ reorder_basic_blocks (void)
 
   /* Signal that rtl_verify_flow_info_1 can now verify that there
  is at most one switch between hot/cold sections.  */
-  crtl->bb_reorder_complete = true;
+  crtl->rtl_pass_progress |= PROGRESS_bb_reorder_complete;
 }
 
 /* Determine which partition the first basic block in the function
diff --git a/gcc/cfgrtl.cc b/gcc/cfgrtl.cc
index a05c338a4c8..8fe367e53c5 100644
--- a/gcc/cfgrtl.cc
+++ b/gcc/cfgrtl.cc
@@ -1907,7 +1907,7 @@ rtl_split_edge (edge edge_in)
  an extra partition crossing in the chain, which is illegal.
  It can't go after the src, because src may have a fall-through
  to a different block.  */
-  if (crtl->bb_reorder_complete
+  if (bb_reorder_complete
   && (edge_in->flags & EDGE_CROSSING))
 {
   after = last_bb_in_partition (edge_in->src);
@@ -2444,7 +2444,7 @@ fixup_partitions (void)
   while (! bbs_to_fix.is_empty ());
 
   /* Fix up hot cold block grouping if needed.  */
-  if (crtl->bb_reorder_complete && current_ir_type () == IR_RTL_CFGRTL)
+  if (bb_reorder_complete && current_ir_type () == IR_RTL_CFGRTL)
 	{
 	  basic_block bb, first = NULL, second = NULL;
 	  int current_partition = BB_UNPARTITIONED;
@@ -2507,7 +2507,7 @@ verify_hot_cold_block_grouping (void)
   /* Even after bb reordering is complete, we go into cfglayout mode
  again (in compgoto). Ensure we don't call this before going back
  into linearized RTL when any layout fixes would have been committed.  */
-  if (!crtl->bb_reorder_complete
+  if (!bb_reorder_complete
   || current_ir_type () != IR_RTL_CFGRTL)
 return err;
 
@@ -4481,7 +4481,7 @@ cfg_layout_initialize (int flags)
  layout required moving a block from the hot to the cold
  section. This would create an illegal partitioning unless some
  manual fixup was performed.  */
-  gcc_assert (!crtl->bb_reorder_complete || !crtl->has_bb_partition);
+  gcc_assert (!bb_reorder_complete || !crtl->has_bb_partition);
 
   initialize_original_copy_tables ();
 
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 

Re: [PATCH v2] stack-protector: Check stack canary before throwing exception

2022-09-27 Thread Jeff Law via Gcc-patches



On 8/17/22 16:18, H.J. Lu via Gcc-patches wrote:

Check stack canary before throwing exception to avoid stack corruption.

gcc/

PR middle-end/58245
* calls.cc: Include "tree-eh.h".
(expand_call): Check stack canary before throwing exception.

gcc/testsuite/

PR middle-end/58245
* g++.dg/fstack-protector-strong.C: Adjusted.
* g++.dg/pr58245-1.C: New test.


OK.

Jeff



Re: [PATCH V2] place `const volatile' objects in read-only sections

2022-09-27 Thread Jeff Law via Gcc-patches



On 8/5/22 05:41, Jose E. Marchesi via Gcc-patches wrote:

[Changes from V1:
- Added a test.]

It is common for C BPF programs to use variables that are implicitly
set by the BPF loader and run-time.  It is also necessary for these
variables to be stored in read-only storage so the BPF verifier
recognizes them as such.  This leads to declarations using both
`const' and `volatile' qualifiers, like this:

   const volatile unsigned char is_allow_list = 0;

Here `volatile' keeps the compiler from optimizing out the variable
or turning it into a constant, and `const' makes sure it is placed
in .rodata.

Now, it happens that:

- GCC places `const volatile' objects in the .data section, under the
   assumption that `volatile' somehow voids the `const'.

- LLVM places `const volatile' objects in .rodata, under the
   assumption that `volatile' is orthogonal to `const'.

So there is a divergence that has practical consequences: BPF
programs compiled with GCC do not work properly.
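
To make the "orthogonal qualifiers" reading concrete, here is a minimal sketch in C++.  It is not part of the patch; the helper name allow_list_enabled is made up for illustration.  It only shows that both qualifiers stay on the object's type and that reads go through the object.  It says nothing about which section the object lands in, which is exactly what the patch changes.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as the declaration quoted above.
const volatile unsigned char is_allow_list = 0;

// Both qualifiers are part of the declared type; neither cancels the other.
using flag_type = decltype(is_allow_list);
static_assert(std::is_const<flag_type>::value, "const is part of the type");
static_assert(std::is_volatile<flag_type>::value, "volatile is part of the type");

// Hypothetical reader: because the object is volatile, every call has to
// load it rather than fold the initializer into a constant.
bool allow_list_enabled() { return is_allow_list != 0; }

// is_allow_list = 1;  // rejected at compile time: the object is const
```

With the zero initializer above, allow_list_enabled() returns false unless something outside the program (a loader, in the BPF case) rewrites the storage.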

When looking into this, I found this bugzilla:

   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25521
   "change semantics of const volatile variables"

which was filed back in 2005, long ago.  This report was already
asking to put `const volatile' objects in .rodata, questioning the
current behavior.

While discussing this in the #gcc IRC channel I was pointed to the
following excerpt from the C18 spec:

6.7.3 Type qualifiers / 5 The properties associated with qualified
  types are meaningful only for expressions that are
  lvalues [note 135]

135) The implementation may place a const object that is not
 volatile in a read-only region of storage. Moreover, the
 implementation need not allocate storage for such an object if
 its address is never used.

This footnote may be interpreted as if const objects that are volatile
shouldn't be put in read-only storage.  Even if I personally was not
very convinced of that interpretation (see my earlier comment in BZ
25521) I filed the following issue in the LLVM tracker in order to
discuss the matter:

   https://github.com/llvm/llvm-project/issues/56468

As you can see, Aaron Ballman, one of the LLVM hackers, asked the WG14
reflectors about this.  He reported that the reflectors don't think
footnote 135 has any normative value.

So, not having a normative mandate in either direction, there are two
options:

a) To change GCC to place `const volatile' objects in .rodata instead
of .data.

b) To change LLVM to place `const volatile' objects in .data instead
of .rodata.

Considering that:

- One target (bpf-unknown-none) breaks with the current GCC behavior.

- No target/platform relies on the GCC behavior, that we know of.

- Changing the LLVM behavior at this point would be very severely
   traumatic for the BPF people and their users.

I think the right thing to do at this point is a).
Therefore this patch.

Regtested on x86_64-linux-gnu and bpf-unknown-none.
No regressions observed.

gcc/ChangeLog:

PR middle-end/25521
* varasm.cc (categorize_decl_for_section): Place `const volatile'
objects in read-only sections.
(default_select_section): Likewise.

gcc/testsuite/ChangeLog:

PR middle-end/25521
* lib/target-supports.exp (check_effective_target_elf): Define.
* gcc.dg/pr25521.c: New test.


The best use I've heard for const volatile is stuff like hardware status 
registers which are readonly from the standpoint of the compiler, but 
which are changed by the hardware.   But for those, we're looking for 
the const to trigger compiler diagnostics if we try to write the value.  
The volatile (of course) indicates the value changes behind our back.


What you're trying to do seems to parallel that case reasonably well for 
the volatile aspect.  You want to force the compiler to read the data 
for every access.
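
A minimal sketch of that volatile aspect, written in C++ for illustration.  The names StatusPair and read_twice are hypothetical and appear nowhere in the patch; `reg` stands in for a read-only status register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type holding two consecutive reads.
struct StatusPair {
  std::uint32_t first;
  std::uint32_t second;
};

// `reg` models a status register that software may read but not write.
StatusPair read_twice(const volatile std::uint32_t *reg) {
  StatusPair out;
  out.first = *reg;   // first load
  out.second = *reg;  // second load; volatile forbids reusing the first value
  // *reg = 0;        // rejected at compile time: the pointee is const
  return out;
}
```

The const gives the diagnostic on writes and the volatile keeps both loads in place; neither says anything about where the pointed-to storage lives.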


Your need for the const is a bit different.  Instead of looking to get a 
diagnostic out of the compiler if it's modified, you need the data to 
live in .rodata so the BPF verifier knows the compiler/code won't change 
the value.  Presumably the BPF verifier can't read debug info to 
determine the const-ness.



I'm not keen on the behavior change, but nobody else is stepping in to 
review and I don't have a strong case to reject.  So OK for the trunk.


jeff




[PATCH][PUSHED] Fix AutoFDO tests to not look for hot/cold splitting.

2022-09-27 Thread Eugene Rozenfeld via Gcc-patches
AutoFDO counts are not reliable and we are currently not
performing hot/cold splitting based on them. This change adjusts
several tree-prof tests not to check for hot/cold splitting
when run with AutoFDO.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/cold_partition_label.c: Don't check for hot/cold 
splitting with AutoFDO.
* gcc.dg/tree-prof/section-attr-1.c: Don't check for hot/cold splitting 
with AutoFDO.
* gcc.dg/tree-prof/section-attr-2.c: Don't check for hot/cold splitting 
with AutoFDO.
* gcc.dg/tree-prof/section-attr-3.c: Don't check for hot/cold splitting 
with AutoFDO.
---
 gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c | 4 ++--
 gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c   | 4 ++--
 gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c   | 4 ++--
 gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c   | 4 ++--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
index 511b61067c0..b85e6c1f93d 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
@@ -43,6 +43,6 @@ main (int argc, char *argv[])
   return 0;
 }
 
-/* { dg-final-use { scan-assembler "foo\[._\]+cold" { target *-*-linux* *-*-gnu* } } } */
-/* { dg-final-use { scan-assembler "size\[ \ta-zA-Z0-0\]+foo\[._\]+cold" { target *-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "foo\[._\]+cold" { target *-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "size\[ \ta-zA-Z0-0\]+foo\[._\]+cold" { target *-*-linux* *-*-gnu* } } } */
 /* { dg-final-use { scan-tree-dump-not "Invalid sum" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c b/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c
index 2087d0d2059..5376de14a2f 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c
@@ -52,5 +52,5 @@ foo (int path)
 }
 }
 
-/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */
-/* { dg-final-use { scan-assembler {.section[\t ]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler {.section[\t ]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c b/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c
index b02526beaea..90de2c08ca4 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c
@@ -51,5 +51,5 @@ foo (int path)
 }
 }
 
-/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */
-/* { dg-final-use { scan-assembler {.section[\t ]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler {.section[\t ]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c b/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c
index da064070653..29a48f05feb 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c
@@ -52,5 +52,5 @@ foo (int path)
 }
 }
 
-/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */
-/* { dg-final-use { scan-assembler {.section[\t ]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler {.section[\t ]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
-- 
2.25.1



Re: [PATCH 1/2] cselib: Keep track of further subvalue relations

2022-09-27 Thread Jeff Law via Gcc-patches



On 9/7/22 08:20, Stefan Schulze Frielinghaus via Gcc-patches wrote:

Whenever a new cselib value is created, check whether a smaller value
exists which is contained in the bigger one.  If so, add a subreg
relation to locs of the smaller one.

gcc/ChangeLog:

* cselib.cc (new_cselib_val): Keep track of further subvalue
relations.


OK

jeff




Re: [PATCH 2/2] var-tracking: Add entry values up to max register mode

2022-09-27 Thread Jeff Law via Gcc-patches



On 9/7/22 08:20, Stefan Schulze Frielinghaus via Gcc-patches wrote:

For parameters of integer type which do not consume a whole register
(modulo sign/zero extension), this patch adds entry values up to the
maximal register mode.

gcc/ChangeLog:

* var-tracking.cc (vt_add_function_parameter): Add entry values
up to maximal register mode.


OK

jeff




Re: [PATCH v3] c++: Implement C++23 P2266R1, Simpler implicit move [PR101165]

2022-09-27 Thread Marek Polacek via Gcc-patches
On Tue, Sep 27, 2022 at 05:44:12PM -0400, Jason Merrill wrote:
> On 9/27/22 16:26, Marek Polacek wrote:
> > --- a/gcc/cp/typeck.cc
> > +++ b/gcc/cp/typeck.cc
> > @@ -11042,8 +11042,13 @@ check_return_expr (tree retval, bool *no_warning)
> >  the conditions for the named return value optimization.  */
> > bool converted = false;
> > tree moved;
> > -  /* This is only interesting for class type.  */
> > -  if (CLASS_TYPE_P (functype)
> > +  /* Until C++23, this was only interesting for class type...  */
> > +  if ((CLASS_TYPE_P (functype)
> > +  /* ...but in C++23, we should do the below when we're converting
> > + from/to a class/reference (a non-scalar type).  */
> > +  || (cxx_dialect >= cxx23
> > +  && (!SCALAR_TYPE_P (functype)
> > +  || !SCALAR_TYPE_P (TREE_TYPE (retval)
> 
> You might reformat this as
> (cxx_dialect < cxx23
>  ? CLASS...
>  : (!SCALAR...

Done, I like that better.
 
> > --- a/gcc/testsuite/g++.dg/cpp0x/move-return3.C
> > +++ b/gcc/testsuite/g++.dg/cpp0x/move-return3.C
> > @@ -1,6 +1,7 @@
> >   // PR c++/91212
> >   // Test that C++11 implicit move semantics don't call the const copy.
> > -// { dg-do link }
> > +// In C++23, we call #2.
> 
> I guess that behavior is tested by elision2.C:twelve()?

Yeah, I think that's exactly the same case.
 
> > --- a/gcc/testsuite/g++.old-deja/g++.mike/p2846b.C
> > +++ b/gcc/testsuite/g++.old-deja/g++.mike/p2846b.C
> > @@ -1,4 +1,4 @@
> > -// { dg-do run  }
> > +// { dg-do run { target c++20_down } }
> >   // Shows that problem of initializing one object's secondary base from
> >   // another object via a user defined copy constructor for that base,
> >   // the pointer for the secondary vtable is not set after implicit
> > @@ -11,6 +11,8 @@
> >   // prms-id: 2846
> > +// This test fails in C++23 due to P2266.
> 
> Instead of disabling this test for C++23, let's add a cast to B& in the
> return statement.

Fixed.

> OK with that change and optionally the ?: reformatting above.

Thanks a lot; patch pushed.

Marek



[PATCH] i386: Mark XMM4-XMM6 as clobbered by encodekey128/encodekey256

2022-09-27 Thread H.J. Lu via Gcc-patches
encodekey128 and encodekey256 operations clear XMM4-XMM6.  But it is
documented that XMM4-XMM6 are reserved for future use and software
should not rely upon them being zeroed.  Change encodekey128 and
encodekey256 to clobber XMM4-XMM6.

gcc/

PR target/107061
* config/i386/predicates.md (encodekey128_operation): Check
XMM4-XMM6 as clobbered.
(encodekey256_operation): Likewise.
* config/i386/sse.md (encodekey128u32): Clobber XMM4-XMM6.
(encodekey256u32): Likewise.

gcc/testsuite/

PR target/107061
* gcc.target/i386/keylocker-encodekey128.c: Don't check
XMM4-XMM6.
* gcc.target/i386/keylocker-encodekey256.c: Likewise.
---
 gcc/config/i386/predicates.md | 20 +--
 gcc/config/i386/sse.md|  4 ++--
 .../gcc.target/i386/keylocker-encodekey128.c  |  1 -
 .../gcc.target/i386/keylocker-encodekey256.c  |  1 -
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 655eabf793b..c4141a96735 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -2107,11 +2107,11 @@ (define_predicate "encodekey128_operation"
   for(i = 4; i < 7; i++)
 {
   elt = XVECEXP (op, 0, i);
-  if (GET_CODE (elt) != SET
- || GET_CODE (SET_DEST (elt)) != REG
- || GET_MODE (SET_DEST (elt)) != V2DImode
- || REGNO (SET_DEST (elt)) != GET_SSE_REGNO (i)
- || SET_SRC (elt) != CONST0_RTX (V2DImode))
+  if (GET_CODE (elt) != CLOBBER
+ || GET_MODE (elt) != VOIDmode
+ || GET_CODE (XEXP (elt, 0)) != REG
+ || GET_MODE (XEXP (elt, 0)) != V2DImode
+ || REGNO (XEXP (elt, 0)) != GET_SSE_REGNO (i))
return false;
 }
 
@@ -2157,11 +2157,11 @@ (define_predicate "encodekey256_operation"
   for(i = 4; i < 7; i++)
 {
   elt = XVECEXP (op, 0, i + 1);
-  if (GET_CODE (elt) != SET
- || GET_CODE (SET_DEST (elt)) != REG
- || GET_MODE (SET_DEST (elt)) != V2DImode
- || REGNO (SET_DEST (elt)) != GET_SSE_REGNO (i)
- || SET_SRC (elt) != CONST0_RTX (V2DImode))
+  if (GET_CODE (elt) != CLOBBER
+ || GET_MODE (elt) != VOIDmode
+ || GET_CODE (XEXP (elt, 0)) != REG
+ || GET_MODE (XEXP (elt, 0)) != V2DImode
+ || REGNO (XEXP (elt, 0)) != GET_SSE_REGNO (i))
return false;
 }
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 5c189635124..076064f97e6 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -29015,7 +29015,7 @@ (define_expand "encodekey128u32"
 
   for (i = 4; i < 7; i++)
 XVECEXP (operands[2], 0, i)
-  = gen_rtx_SET (xmm_regs[i], CONST0_RTX (V2DImode));
+  = gen_rtx_CLOBBER (VOIDmode, xmm_regs[i]);
 
   XVECEXP (operands[2], 0, 7)
 = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
@@ -29072,7 +29072,7 @@ (define_expand "encodekey256u32"
 
   for (i = 4; i < 7; i++)
 XVECEXP (operands[2], 0, i + 1)
-  = gen_rtx_SET (xmm_regs[i], CONST0_RTX (V2DImode));
+  = gen_rtx_CLOBBER (VOIDmode, xmm_regs[i]);
 
   XVECEXP (operands[2], 0, 8)
 = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
diff --git a/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c b/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
index 805e0628673..57fa9bdc831 100644
--- a/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
+++ b/gcc/testsuite/gcc.target/i386/keylocker-encodekey128.c
@@ -6,7 +6,6 @@
 /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm0,\[^\\n\\r\]*" } } */
 /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm1,\[^\\n\\r\]*16\[^\\n\\r\]*" } } */
 /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm2,\[^\\n\\r\]*32\[^\\n\\r\]*" } } */
-/* { dg-final { scan-assembler "(?:movdqa|movaps)\[ \\t\]+\[^\\n\]*%xmm\[4-6\],\[^\\n\\r\]*" } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c b/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c
index 26f04dcf014..a9398b4e7a2 100644
--- a/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c
+++ b/gcc/testsuite/gcc.target/i386/keylocker-encodekey256.c
@@ -8,7 +8,6 @@
 /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm1,\[^\\n\\r\]*16\[^\\n\\r\]*" } } */
 /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm2,\[^\\n\\r\]*32\[^\\n\\r\]*" } } */
 /* { dg-final { scan-assembler "(?:movdqu|movups)\[ \\t\]+\[^\\n\]*%xmm3,\[^\\n\\r\]*48\[^\\n\\r\]*" } } */
-/* { dg-final { scan-assembler "(?:movdqa|movaps)\[ \\t\]+\[^\\n\]*%xmm\[4-6\],\[^\\n\\r\]*" } } */
 
 #include 
 
-- 
2.37.3



Re: [PATCH v3] c++: Implement C++23 P2266R1, Simpler implicit move [PR101165]

2022-09-27 Thread Jason Merrill via Gcc-patches

On 9/27/22 16:26, Marek Polacek wrote:

On Mon, Sep 26, 2022 at 01:29:35PM -0400, Jason Merrill wrote:

On 9/20/22 14:19, Marek Polacek wrote:

There's one FIXME in elision1.C:five, which we should compile but reject
with "passing 'Mutt' as 'this' argument discards qualifiers".  That
looks bogus to me, I think I'll open a PR for it.


Let's fix that now, I think.


OK, copypasting this bit from the other email so that we can have one
thread:


Can of worms.   The test is

 struct Mutt {
 operator int*() &&;
 };

 int* five(Mutt x) {
 return x;  // OK since C++20 because P1155
 }

'x' should be treated as an rvalue, therefore the operator fn taking
an rvalue ref to Mutt should be used to convert 'x' to int*.  We fail
because we don't treat 'x' as an rvalue because the function doesn't
return a class.  So the patch should be just

--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -10875,10 +10875,7 @@ check_return_expr (tree retval, bool *no_warning)
 Note that these conditions are similar to, but not as strict as,
the conditions for the named return value optimization.  */
  bool converted = false;
-  tree moved;
-  /* This is only interesting for class type.  */
-  if (CLASS_TYPE_P (functype)
- && (moved = treat_lvalue_as_rvalue_p (retval, /*return*/true)))
+  if (tree moved = treat_lvalue_as_rvalue_p (retval, /*return*/true))
   {
 if (cxx_dialect < cxx20)
   {

which fixes the test, but breaks a lot of middle-end warnings.  For instance
g++.dg/warn/nonnull3.C, where the patch above changes .gimple:

bool A::foo (struct A * const this, <<< Unknown tree: offset_type >>> p)
{
-  bool D.2146;
+  bool D.2150;
-  D.2146 = p != -1;
-  return D.2146;
+  p.0_1 = p;
+  D.2150 = p.0_1 != -1;
+  return D.2150;
}

and we no longer get the warning.  I thought maybe I could undo the implicit
rvalue conversion in cp_fold, when it sees implicit_rvalue_p, but that didn't
work.  So currently I'm stuck.  Should we try to figure this out or push aside?



Can you undo the implicit rvalue conversion within check_return_expr,
where we can still refer back to the original expression?


Unfortunately no, one problem is that treat_lvalue_as_rvalue_p modifies
the underlying decl by setting TREE_ADDRESSABLE, which then presumably
breaks warnings.  That is, treat_ can get 'VCE(x)' and produce
'*NLE<(X&) >' where 'x' flags have been modified, since we're taking
x's address.


Or avoid the rvalue conversion if the return type is scalar?


I wish :(.  In the 'five' example above, the return type is a pointer,
a scalar, but we have to convert to rvalue.


OK, then when both the return type and the type of the return value are
scalar?


Good news!  After more poking it seems we only need to do the rvalue
conversion when either to/from types is a class/reference!  That is,
if either is a non-scalar type.  And that doesn't upset the middle-end
diagnostics!  I'm still limiting the broader conversion to C++23, but
the whole condition could be:

   if ((!SCALAR_TYPE_P (functype) || !SCALAR_TYPE_P (rettype))
   && treat_lvalue_as_rvalue_p ())
   ...
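
The type test in that condition can be modelled outside the compiler.  What follows is only a sketch: it uses std::is_scalar as an approximation of GCC's SCALAR_TYPE_P (the two are not guaranteed to agree on every type), the helper name wants_rvalue_conversion is invented, and the treat_lvalue_as_rvalue_p half of the condition is left out entirely.

```cpp
#include <cassert>
#include <type_traits>

struct Mutt {};  // stand-in for the class type in the 'five' example

// Hypothetical model: true when either the function's return type or the
// type of the returned expression is non-scalar (a class or a reference).
template <typename FuncType, typename RetvalType>
constexpr bool wants_rvalue_conversion() {
  return !std::is_scalar<FuncType>::value ||
         !std::is_scalar<RetvalType>::value;
}

// five(): returns int*, but the operand is a class, so convert.
static_assert(wants_rvalue_conversion<int *, Mutt>(), "class to pointer");
// bool from bool (the shape of nonnull3.C's foo): left alone.
static_assert(!wants_rvalue_conversion<bool, bool>(), "scalar to scalar");
// Reference return type, as in the P2266 case.
static_assert(wants_rvalue_conversion<Mutt &&, Mutt>(), "reference return");
```

The scalar-to-scalar case is the one that upset the middle-end warnings, and under this model it is the only one that skips the conversion.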
  
Therefore I think we don't need ...



It's sort of sad that this corner case causes so much trouble: I think
we have to do the conversion only because of ref-qualifiers, so that
the correct operator function is chosen.
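
For illustration, a small C++ sketch of why the value category matters once ref-qualifiers are involved.  Probe, as_lvalue and as_rvalue are hypothetical names, not taken from the testsuite, and the explicit std::move stands in for the implicit conversion being discussed.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Two overloads that differ only in their ref-qualifier.
struct Probe {
  const char *kind() & { return "lvalue"; }
  const char *kind() && { return "rvalue"; }
};

// The object expression `p` is an lvalue, so the & overload is chosen.
const char *as_lvalue(Probe p) { return p.kind(); }

// After an explicit move the object expression is an xvalue, so the &&
// overload is chosen instead.
const char *as_rvalue(Probe p) { return std::move(p).kind(); }
```

A conversion operator declared with && behaves the same way: it is only viable when the operand is treated as an rvalue.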

A way out may be setting a flag on the V_C_E that indicates it is an
rvalue, rather than performing the conversion above.  This was your
idea so I don't want to take credit for it.  Should I go ahead and
try it?


Sure, probably in build_static_cast_1.


... this, after all, which is just fantastic.  Besides the check_return_expr
hunk and removing a FIXME there are no other changes.


Yay!


Next step: remove the double overload.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements https://wg21.link/p2266, which, once again,
changes the implicit move rules.  Here's a brief summary of various
changes in this area:

r125211: Introduced moving from certain lvalues when returning them
r171071: CWG 1148, enable move from value parameter on return
r212099: CWG 1579, it's OK to call a converting ctor taking an rvalue
r251035: CWG 1579, do maybe-rvalue overload resolution twice
r11-2411: Avoid calling const copy ctor on implicit move
r11-2412: C++20 implicit move changes, remove the fallback overload
   resolution, allow move on throw of parameters and implicit
  move of rvalue references

P2266 enables the implicit move even for functions that return references.
That is, we will now perform a move in

   X&& foo (X&& x) {
 return x;
   }
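
A hedged sketch of how code that must build both before and after P2266 might be written.  It assumes the compiler defines the __cpp_implicit_move feature-test macro when the new rule is active; the function name pass_through is invented for this example.

```cpp
#include <cassert>
#include <utility>

struct X {
  int v;
};

// Returns a reference to the same object it was given.
X &&pass_through(X &&x) {
#ifdef __cpp_implicit_move
  return x;             // P2266: `x` is move-eligible, hence an xvalue here
#else
  return std::move(x);  // older rules: an explicit cast is needed to bind X&&
#endif
}
```

Either branch yields a reference to the caller's object; no X is copied or moved, only the value category of the returned expression differs.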

P2266 also removes the fallback overload resolution, but this was
resolved by r11-2412: we only do convert_for_initialization with
LOOKUP_PREFER_RVALUE in C++17 and older.
P2266 also says that a returned move-eligible id-expression is always an
xvalue.  This required some 

Re: [PATCH 2/2] c++: implement __remove_cv, __remove_reference and __remove_cvref

2022-09-27 Thread Jason Merrill via Gcc-patches

On 9/27/22 15:50, Patrick Palka wrote:

This uses TRAIT_TYPE from the previous patch to implement efficient
built-ins for std::remove_cv, std::remove_reference and std::remove_cvref.
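
As a rough sketch of how user code could pick up such a built-in when it is available: the macro SKETCH_HAS_REMOVE_CVREF and the alias strip_cvref_t are made up for this example, and the fallback branch is the ordinary library spelling.  Detection goes through __has_builtin, which the has-builtin-1.C change in this patch also exercises.

```cpp
#include <cassert>
#include <type_traits>

// Detect the built-in without assuming any particular compiler version.
#ifdef __has_builtin
#  if __has_builtin(__remove_cvref)
#    define SKETCH_HAS_REMOVE_CVREF 1
#  endif
#endif

#ifdef SKETCH_HAS_REMOVE_CVREF
// Built-in path: no class template is instantiated.
template <typename T>
using strip_cvref_t = __remove_cvref(T);
#else
// Library path: two class template instantiations per distinct T.
template <typename T>
using strip_cvref_t =
    typename std::remove_cv<typename std::remove_reference<T>::type>::type;
#endif

static_assert(std::is_same<strip_cvref_t<const volatile int &>, int>::value,
              "reference and cv-qualifiers are stripped");
static_assert(std::is_same<strip_cvref_t<int &&>, int>::value,
              "rvalue references are stripped too");
static_assert(std::is_same<strip_cvref_t<const int *>, const int *>::value,
              "only top-level cv-qualifiers are removed");
```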

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?


OK once the previous patch goes in.


gcc/c-family/ChangeLog:

* c-common.cc (c_common_reswords): Add __remove_cv,
__remove_reference and __remove_cvref.
* c-common.h (enum rid): Add RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cp-objcp-common.cc (names_builtin_p): Likewise.
* cp-tree.h (enum cp_trait_kind): Add CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cxx-pretty-print.cc (pp_cxx_trait): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* parser.cc (cp_keyword_starts_decl_specifier_p): Return true
for RID_REMOVE_CV, RID_REMOVE_REFERENCE and RID_REMOVE_CVREF.
(cp_parser_trait): Handle RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.
(cp_parser_simple_type_specifier): Likewise.
* semantics.cc (finish_trait_type): Likewise.

libstdc++-v3/ChangeLog:

* include/bits/unique_ptr.h (unique_ptr<_Tp[], _Dp>): Remove
__remove_cv and use __remove_cv_t instead.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test __remove_cv,
__remove_reference and __remove_cvref.
* g++.dg/ext/remove_cv.C: New test.
* g++.dg/ext/remove_reference.C: New test.
* g++.dg/ext/remove_cvref.C: New test.
---
  gcc/c-family/c-common.cc|  3 +++
  gcc/c-family/c-common.h |  1 +
  gcc/cp/constraint.cc|  3 +++
  gcc/cp/cp-objcp-common.cc   |  3 +++
  gcc/cp/cp-tree.h|  5 +++-
  gcc/cp/cxx-pretty-print.cc  |  9 +++
  gcc/cp/parser.cc| 18 ++
  gcc/cp/semantics.cc | 10 
  gcc/testsuite/g++.dg/ext/has-builtin-1.C|  9 +++
  gcc/testsuite/g++.dg/ext/remove_cv.C| 26 +
  gcc/testsuite/g++.dg/ext/remove_cvref.C | 26 +
  gcc/testsuite/g++.dg/ext/remove_reference.C | 26 +
  libstdc++-v3/include/bits/unique_ptr.h  |  5 +---
  13 files changed, 139 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/ext/remove_cv.C
  create mode 100644 gcc/testsuite/g++.dg/ext/remove_cvref.C
  create mode 100644 gcc/testsuite/g++.dg/ext/remove_reference.C

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index cda6910e8c5..6e0af863a49 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -547,6 +547,9 @@ const struct c_common_resword c_common_reswords[] =
D_CXXONLY },
{ "__reference_converts_from_temporary", RID_REF_CONVERTS_FROM_TEMPORARY,
D_CXXONLY },
+  { "__remove_cv", RID_REMOVE_CV, D_CXXONLY },
+  { "__remove_reference", RID_REMOVE_REFERENCE, D_CXXONLY },
+  { "__remove_cvref", RID_REMOVE_CVREF, D_CXXONLY },
  
/* C++ transactional memory.  */

{ "synchronized", RID_SYNCHRONIZED, D_CXX_OBJC | D_TRANSMEM },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50a4691cda6..d5c98d306ce 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -187,6 +187,7 @@ enum rid
RID_IS_CONVERTIBLE, RID_IS_NOTHROW_CONVERTIBLE,
RID_REF_CONSTRUCTS_FROM_TEMPORARY,
RID_REF_CONVERTS_FROM_TEMPORARY,
+  RID_REMOVE_CV, RID_REMOVE_REFERENCE, RID_REMOVE_CVREF,
  
/* C++11 */

RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 266ec581a20..ca73aff3f38 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3714,6 +3714,9 @@ diagnose_trait_expr (tree expr, tree args)
  case CPTK_BASES:
  case CPTK_DIRECT_BASES:
  case CPTK_UNDERLYING_TYPE:
+case CPTK_REMOVE_CV:
+case CPTK_REMOVE_REFERENCE:
+case CPTK_REMOVE_CVREF:
/* We shouldn't see these non-expression traits.  */
gcc_unreachable ();
  /* We deliberately omit the default case so that when adding a new
diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
index 380f288a7f1..2d3f206b530 100644
--- a/gcc/cp/cp-objcp-common.cc
+++ b/gcc/cp/cp-objcp-common.cc
@@ -467,6 +467,9 @@ names_builtin_p (const char *name)
  case RID_IS_NOTHROW_CONVERTIBLE:
  case RID_REF_CONSTRUCTS_FROM_TEMPORARY:
  case RID_REF_CONVERTS_FROM_TEMPORARY:
+case RID_REMOVE_CV:
+case RID_REMOVE_REFERENCE:
+case RID_REMOVE_CVREF:
return true;
  default:
break;
diff 

Re: [PATCH 1/2] c++: introduce TRAIT_TYPE alongside TRAIT_EXPR

2022-09-27 Thread Jason Merrill via Gcc-patches

On 9/27/22 15:50, Patrick Palka wrote:

We already have generic support for predicate-like traits that yield a
boolean via TRAIT_EXPR, but we lack the same support for transform-like
traits that yield a type.  Such support would be useful for implementing
efficient built-ins for std::decay and std::remove_cvref and other
conceptually simple type traits that are otherwise relatively expensive
to implement.
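
For context, this is roughly what a pure-library transform trait looks like.  It is a sketch with a made-up name (lib_remove_reference), not the libstdc++ source.  Every distinct argument type instantiates one of these class templates, which is presumably the cost being referred to.

```cpp
#include <cassert>
#include <type_traits>

// Primary template plus one partial specialization per reference kind.
template <typename T>
struct lib_remove_reference {
  using type = T;
};

template <typename T>
struct lib_remove_reference<T &> {
  using type = T;
};

template <typename T>
struct lib_remove_reference<T &&> {
  using type = T;
};

static_assert(std::is_same<lib_remove_reference<int &>::type, int>::value,
              "lvalue reference stripped");
static_assert(std::is_same<lib_remove_reference<int &&>::type, int>::value,
              "rvalue reference stripped");
static_assert(std::is_same<lib_remove_reference<int>::type, int>::value,
              "non-reference types pass through");
```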

This patch implements a generic TRAIT_TYPE type and replaces the
existing hardcoded UNDERLYING_TYPE type to use TRAIT_TYPE instead.


Sounds good, perhaps we also want to convert BASES to e.g. 
TRAIT_TYPE_PACK at some point...



gcc/cp/ChangeLog:

* cp-objcp-common.cc (cp_common_init_ts): Replace
UNDERLYING_TYPE with TRAIT_TYPE.
* cp-tree.def (TRAIT_TYPE): Define.
(UNDERLYING_TYPE): Remove.
* cp-tree.h (TRAIT_TYPE_KIND_RAW): Define.
(TRAIT_TYPE_KIND): Define.
(TRAIT_TYPE_TYPE1): Define.
(TRAIT_TYPE_TYPE2): Define.
(WILDCARD_TYPE_P): Return true for TRAIT_TYPE.
(finish_trait_type): Declare.
* cxx-pretty-print.cc (cxx_pretty_printer::primary_expression):
Adjust after renaming pp_cxx_trait_expression.
(cxx_pretty_printer::type_id): Replace UNDERLYING_TYPE with
TRAIT_TYPE.
(pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait): ... this.  Handle TRAIT_TYPE as well.  Correct
pretty printing of the trailing arguments.
* cxx-pretty-print.h (pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait_type): ... this.
* error.cc (dump_type) : Remove.
: New.
(dump_type_prefix): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dump_type_suffix): Likewise.
* mangle.cc (write_type) : Remove.
: New.
* module.cc (trees_out::type_node) :
Remove.
: New.
(trees_in::tree_node): Likewise.
* parser.cc (cp_parser_primary_expression): Adjust after
renaming cp_parser_trait_expr.
(cp_parser_trait_expr): Rename to ...
(cp_parser_trait): ... this.  Call finish_trait_type for traits
that yield a type.
(cp_parser_simple_type_specifier): Adjust after renaming
cp_parser_trait_expr.
* pt.cc (for_each_template_parm_r) :
Remove.
: New.
(tsubst): Likewise.
(unify): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dependent_type_p_r): Likewise.
* semantics.cc (finish_underlying_type): Don't return
UNDERLYING_TYPE anymore when processing_template_decl.
(finish_trait_type): Define.
* tree.cc (strip_typedefs) : Remove.
: New.
(cp_walk_subtrees): Likewise.
* typeck.cc (structural_comptypes): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/underlying_type7.C: Adjust expected error message.
---
  gcc/cp/cp-objcp-common.cc   |  2 +-
  gcc/cp/cp-tree.def  |  9 ++--
  gcc/cp/cp-tree.h| 18 
  gcc/cp/cxx-pretty-print.cc  | 49 ++---
  gcc/cp/cxx-pretty-print.h   |  2 +-
  gcc/cp/error.cc | 14 +++---
  gcc/cp/mangle.cc|  5 ++-
  gcc/cp/module.cc| 24 +-
  gcc/cp/parser.cc| 24 +-
  gcc/cp/pt.cc| 26 +++
  gcc/cp/semantics.cc | 41 -
  gcc/cp/tree.cc  | 22 ++---
  gcc/cp/typeck.cc|  7 ++-
  gcc/testsuite/g++.dg/cpp0x/alias-decl-59.C  |  4 +-
  gcc/testsuite/g++.dg/ext/underlying_type7.C |  2 +-
  15 files changed, 171 insertions(+), 78 deletions(-)

diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
index 64975699351..380f288a7f1 100644
--- a/gcc/cp/cp-objcp-common.cc
+++ b/gcc/cp/cp-objcp-common.cc
@@ -518,7 +518,7 @@ cp_common_init_ts (void)
MARK_TS_TYPE_NON_COMMON (DECLTYPE_TYPE);
MARK_TS_TYPE_NON_COMMON (TYPENAME_TYPE);
MARK_TS_TYPE_NON_COMMON (TYPEOF_TYPE);
-  MARK_TS_TYPE_NON_COMMON (UNDERLYING_TYPE);
+  MARK_TS_TYPE_NON_COMMON (TRAIT_TYPE);
MARK_TS_TYPE_NON_COMMON (BOUND_TEMPLATE_TEMPLATE_PARM);
MARK_TS_TYPE_NON_COMMON (TEMPLATE_TEMPLATE_PARM);
MARK_TS_TYPE_NON_COMMON (TEMPLATE_TYPE_PARM);
diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index f9cbd339f19..f83b4c54d43 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -444,9 +444,12 @@ DEFTREECODE (BIT_CAST_EXPR, "bit_cast_expr", tcc_expression, 1)
  
  /** C++ extensions. */
  
-/* Represents a trait expression during template expansion.  */

+/* Represents a templated trait that yields an expression.  */
  DEFTREECODE (TRAIT_EXPR, "trait_expr", tcc_exceptional, 0)
  
+/* Represents a templated trait that yields a type.  */

+DEFTREECODE (TRAIT_TYPE, "trait_type", tcc_type, 0)
+
  /* 

Re: [PATCH 2/2] c++: implement __remove_cv, __remove_reference and __remove_cvref

2022-09-27 Thread Jonathan Wakely via Gcc-patches
On Tue, 27 Sept 2022 at 20:50, Patrick Palka via Libstdc++
 wrote:
> libstdc++-v3/ChangeLog:
>
> * include/bits/unique_ptr.h (unique_ptr<_Tp[], _Dp>): Remove
> __remove_cv and use __remove_cv_t instead.

This part is OK. I added that __remove_cv in 2012, and could have
replaced it with __remove_cv_t when I added that in 2019.



[PATCH v3] c++: Implement C++23 P2266R1, Simpler implicit move [PR101165]

2022-09-27 Thread Marek Polacek via Gcc-patches
On Mon, Sep 26, 2022 at 01:29:35PM -0400, Jason Merrill wrote:
> On 9/20/22 14:19, Marek Polacek wrote:
> > > > There's one FIXME in elision1.C:five, which we should compile but reject
> > > > with "passing 'Mutt' as 'this' argument discards qualifiers".  That
> > > > looks bogus to me, I think I'll open a PR for it.
> > > 
> > > Let's fix that now, I think.
> > 
> > OK, copypasting this bit from the other email so that we can have one
> > thread:
> > 
> > > Can of worms.   The test is
> > > 
> > > struct Mutt {
> > > operator int*() &&;
> > > };
> > > 
> > > int* five(Mutt x) {
> > > return x;  // OK since C++20 because P1155
> > > }
> > > 
> > > 'x' should be treated as an rvalue, therefore the operator fn taking
> > > an rvalue ref to Mutt should be used to convert 'x' to int*.  We fail
> > > because we don't treat 'x' as an rvalue because the function doesn't
> > > return a class.  So the patch should be just
> > > 
> > > --- a/gcc/cp/typeck.cc
> > > +++ b/gcc/cp/typeck.cc
> > > @@ -10875,10 +10875,7 @@ check_return_expr (tree retval, bool *no_warning)
> > > Note that these conditions are similar to, but not as strict 
> > > as,
> > >the conditions for the named return value optimization.  */
> > >  bool converted = false;
> > > -  tree moved;
> > > -  /* This is only interesting for class type.  */
> > > -  if (CLASS_TYPE_P (functype)
> > > - && (moved = treat_lvalue_as_rvalue_p (retval, /*return*/true)))
> > > +  if (tree moved = treat_lvalue_as_rvalue_p (retval, /*return*/true))
> > >   {
> > > if (cxx_dialect < cxx20)
> > >   {
> > > 
> > > which fixes the test, but breaks a lot of middle-end warnings.  For 
> > > instance
> > > g++.dg/warn/nonnull3.C, where the patch above changes .gimple:
> > > 
> > >bool A::foo (struct A * const this, <<< Unknown tree: offset_type 
> > > >>> p)
> > >{
> > > -  bool D.2146;
> > > +  bool D.2150;
> > > -  D.2146 = p != -1;
> > > -  return D.2146;
> > > +  p.0_1 = p;
> > > +  D.2150 = p.0_1 != -1;
> > > +  return D.2150;
> > >}
> > > 
> > > and we no longer get the warning.  I thought maybe I could undo the 
> > > implicit
> > > rvalue conversion in cp_fold, when it sees implicit_rvalue_p, but that 
> > > didn't
> > > work.  So currently I'm stuck.  Should we try to figure this out or push 
> > > aside?
> > 
> > > Can you undo the implicit rvalue conversion within check_return_expr,
> > > where we can still refer back to the original expression?
> > 
> > Unfortunately no, one problem is that treat_lvalue_as_rvalue_p modifies
> > the underlying decl by setting TREE_ADDRESSABLE, which then presumably
> > breaks warnings.  That is, treat_ can get 'VCE(x)' and produce
> > '*NLE<(X&) >' where 'x' flags have been modified, since we're taking
> > x's address.
> > 
> > > Or avoid the rvalue conversion if the return type is scalar?
> > 
> > I wish :(.  In the 'five' example above, the return type is a pointer,
> > a scalar, but we have to convert to rvalue.
> 
> OK, then when both the return type and the type of the return value are
> scalar?

Good news!  After more poking it seems we only need to do the rvalue
conversion when either to/from types is a class/reference!  That is,
if either is a non-scalar type.  And that doesn't upset the middle-end
diagnostics!  I'm still limiting the broader conversion to C++23, but
the whole condition could be:

  if ((!SCALAR_TYPE_P (functype) || !SCALAR_TYPE_P (rettype))
  && treat_lvalue_as_rvalue_p ())
  ...
 
Therefore I think we don't need ...

> > It's sort of sad that this corner case causes so much trouble: I think
> > we have to do the conversion only because of ref-qualifiers, so that
> > the correct operator function is chosen.
> > 
> > A way out may be setting a flag on the V_C_E that indicates it is an
> > rvalue, rather than performing the conversion above.  This was your
> > idea so I don't want to take credit for it.  Should I go ahead and
> > try it?
> 
> Sure, probably in build_static_cast_1.

... this, after all, which is just fantastic.  Besides the check_return_expr
hunk and removing a FIXME there are no other changes.
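
For illustration only (this is not part of the patch, and `storage` and
`five_explicit` are made-up names), here is a minimal sketch of why the
ref-qualifier matters.  It spells out the move that `return x;` is meant to
perform implicitly, so it should build in any C++11-or-later mode:

```cpp
#include <utility>

static int storage = 5;   // hypothetical backing object for the demo

struct Mutt {
  // The && ref-qualifier means only an rvalue Mutt can be converted.
  operator int*() && { return &storage; }
};

// Explicit form: std::move turns 'x' into an rvalue, so overload
// resolution can select the &&-qualified conversion operator even though
// the function returns a scalar (a pointer) rather than a class.
int* five_explicit(Mutt x) {
  return std::move(x);
}
```

The return type is scalar but the returned operand is a class, which is the
"either to/from type is non-scalar" case from the condition above.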

Next step: remove the double overload.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements https://wg21.link/p2266, which, once again,
changes the implicit move rules.  Here's a brief summary of various
changes in this area:

r125211: Introduced moving from certain lvalues when returning them
r171071: CWG 1148, enable move from value parameter on return
r212099: CWG 1579, it's OK to call a converting ctor taking an rvalue
r251035: CWG 1579, do maybe-rvalue overload resolution twice
r11-2411: Avoid calling const copy ctor on implicit move
r11-2412: C++20 implicit move changes, remove the fallback overload
  resolution, allow move on throw of parameters and implicit
  move of rvalue references

P2266 enables the implicit move even 

[RFA] Avoid unnecessary load-immediate in coremark

2022-09-27 Thread Jeff Law


This is another minor improvement to coremark.   I suspect this only 
improves code size as the load-immediate was likely issuing with the ret 
statement on multi-issue machines.



Basically we're failing to utilize conditional equivalences during the 
post-reload CSE pass.  So if a particular block is only reached when a 
certain condition holds (say for example a4 == 0) and the block has an 
assignment like a4 = 0, we would fail to eliminate the unnecessary 
assignment.
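
A hypothetical source-level shape of the situation, as a rough sketch (the
function name `pick` is made up, and whether a redundant load-immediate is
actually emitted depends on the target and optimization level):

```cpp
// The final 'return 0' is reached only when x == 0, so the register that
// holds 'x' already contains the value being returned.  Materializing the
// constant 0 again in that block is the kind of redundant assignment
// described above.
int pick(int x, int fallback) {
  if (x != 0)
    return fallback;
  return 0;
}
```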



So the way this works, as we enter each block in reload_cse_regs_1 we 
look at the block's predecessors to see if all of them have the same 
implicit assignment.  If they do, then we create a dummy insn 
representing that implicit assignment.



Before processing the first real insn, we enter the implicit assignment 
into the cselib hash tables.    This deferred action is necessary 
because of CODE_LABEL handling in cselib -- when it sees a CODE_LABEL it 
wipes state.  So we have to add the implicit assignment after processing 
the (optional) CODE_LABEL, but before processing real insns.
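
To make the ordering concern concrete, here is a toy model (plain C++, not
cselib; `ToyTable`, `ToyInsn`, `ImplicitSet` and `walk_block` are invented
names).  It assumes only what is stated above: a label wipes the table, so
an equivalence entered before the label is lost, while one entered just
before the first real insn survives.

```cpp
#include <map>
#include <vector>

enum class Kind { Label, Insn };

struct ToyInsn { Kind kind; };

// Toy stand-in for the value table: register number -> known constant.
struct ToyTable {
  std::map<int, long> known;
  void process(const ToyInsn &i) {
    if (i.kind == Kind::Label)
      known.clear();          // a label wipes everything learned so far
  }
};

struct ImplicitSet { int reg; long value; };

// Walk a block.  With DEFER the implicit set is entered just before the
// first real insn; without it, the set is entered up front.  Returns true
// if the equivalence is still in the table at the end of the block.
bool walk_block(const std::vector<ToyInsn> &block, ImplicitSet s, bool defer) {
  ToyTable t;
  bool pending = true;
  if (!defer) {
    t.known[s.reg] = s.value;
    pending = false;
  }
  for (const ToyInsn &i : block) {
    if (pending && i.kind == Kind::Insn) {
      t.known[s.reg] = s.value;
      pending = false;
    }
    t.process(i);
  }
  return t.known.count(s.reg) != 0;
}
```

For a block that starts with a label, only the deferred variant keeps the
equivalence.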



Note we have to walk all the block's predecessors to verify they all 
have the same implicit assignment.  That could potentially be expensive, 
so we limit it to cases where there are only a few predecessors.  For 
reference on x86_64, 81% of the cases where implicit assignments can be 
found are for single predecessor blocks.  Cumulatively, 96% have two or 
fewer preds, 99.1% have three or fewer, 99.6% have four or fewer, 99.8% 
have five or fewer, and so on.  While there were cases where all 19 preds 
had the same implicit assignment, capturing those cases just doesn't seem 
terribly important.  I put the clamp at 3 preds.  If folks think it's 
important, I could certainly make that a PARAM.
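
The predecessor check can be sketched in isolation like this.  It is a toy
model rather than the patch itself: `ToyEquiv`, `kMaxPreds` and
`common_implicit_set` are invented names, and each incoming edge is reduced
to "does it imply reg == value?".

```cpp
#include <cstddef>
#include <vector>

struct ToyEquiv {
  bool valid;   // does this incoming edge imply "reg == value"?
  int reg;
  long value;
};

const std::size_t kMaxPreds = 3;   // mirrors the clamp described above

// Returns true and fills OUT when the block has between 1 and kMaxPreds
// predecessors and every incoming edge implies the same equivalence.
bool common_implicit_set(const std::vector<ToyEquiv> &preds, ToyEquiv *out) {
  if (preds.empty() || preds.size() > kMaxPreds)
    return false;
  for (const ToyEquiv &p : preds) {
    if (!p.valid)
      return false;
    if (p.reg != preds[0].reg || p.value != preds[0].value)
      return false;
  }
  *out = preds[0];
  return true;
}
```

Giving up early on the first mismatch keeps the walk cheap, and the clamp
bounds the worst case.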



Bootstrapped and regression tested on x86.  Bootstrapped on riscv as well.


OK for the trunk?


Jeff


gcc/
* postreload.cc (reload_cse_regs_1): Record implicit sets from
conditional branches into the cselib tables.

gcc/testsuite/

* gcc.target/riscv/implict-set.c: New test.


diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index 41f61d32648..2f155a239ae 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "emit-rtl.h"
 #include "recog.h"
 
+#include "cfghooks.h"
 #include "cfgrtl.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
@@ -221,13 +222,108 @@ reload_cse_regs_1 (void)
   init_alias_analysis ();
 
   FOR_EACH_BB_FN (bb, cfun)
-FOR_BB_INSNS (bb, insn)
-  {
-   if (INSN_P (insn))
- cfg_changed |= reload_cse_simplify (insn, testreg);
+{
+  /* If BB has a small number of predecessors, see if each of them
+has the same implicit set.  If so, record that implicit set so
+that we can add it to the cselib tables.  */
+  rtx_insn *implicit_set;
 
-   cselib_process_insn (insn);
-  }
+  implicit_set = NULL;
+  if (EDGE_COUNT (bb->preds) <= 3)
+   {
+ edge e;
+ edge_iterator ei;
+ rtx src = NULL_RTX;
+ rtx dest = NULL_RTX;
+ bool found = true;
+
+ /* Iterate over each incoming edge and see if they
+all have the same implicit set.  */
+ FOR_EACH_EDGE (e, ei, bb->preds)
+   {
+ /* If the predecessor does not end in a conditional
+jump, then it does not have an implicit set.  */
+ if (e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
+ && !block_ends_with_condjump_p (e->src))
+   {
+ found = false;
+ break;
+   }
+
+ /* We know the predecessor ends with a conditional
+ jump.  Now dig into the actual form of the jump
+to potentially extract an implicit set.  */
+ rtx_insn *condjump = BB_END (e->src);
+ if (condjump
+ && any_condjump_p (condjump)
+ && onlyjump_p (condjump))
+   {
+ /* Extract the condition.  */
+ rtx pat = PATTERN (condjump);
+ rtx i_t_e = SET_SRC (pat);
+ gcc_assert (GET_CODE (i_t_e) == IF_THEN_ELSE);
+ rtx cond = XEXP (i_t_e, 0);
+ if ((GET_CODE (cond) == EQ
+  && GET_CODE (XEXP (i_t_e, 1)) == LABEL_REF
+  && XEXP (XEXP (i_t_e, 1), 0) == BB_HEAD (bb))
+ || (GET_CODE (cond) == NE
+ && XEXP (i_t_e, 2) == pc_rtx
+ && e->src->next_bb == bb))
+   {
+ /* If this is the first time through record
+the source and destination.  */
+ if (!dest)
+   {
+ dest = XEXP (cond, 0);
+ src = XEXP (cond, 1);
+   }
+ /* If this is not the first time 

[PATCH 2/2] c++: implement __remove_cv, __remove_reference and __remove_cvref

2022-09-27 Thread Patrick Palka via Gcc-patches
This uses TRAIT_TYPE from the previous patch to implement efficient
built-ins for std::remove_cv, std::remove_reference and std::remove_cvref.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?
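
As a rough sketch of the intended semantics (not taken from the new tests),
the library traits below show what each transformation does.  The built-in
is only exercised when the compiler advertises it through `__has_builtin`;
the `__remove_cv(T)` call syntax and the helper `toy_remove_cvref` are
assumptions made for this example.

```cpp
#include <type_traits>

// Portable library spellings of the transformations.
static_assert(
    std::is_same<std::remove_cv<const volatile int>::type, int>::value,
    "remove_cv strips top-level cv-qualifiers");
static_assert(
    std::is_same<std::remove_reference<int&&>::type, int>::value,
    "remove_reference strips & and &&");

// remove_cvref is remove_reference followed by remove_cv.
template <class T>
struct toy_remove_cvref {
  typedef typename std::remove_cv<
      typename std::remove_reference<T>::type>::type type;
};
static_assert(std::is_same<toy_remove_cvref<const int&>::type, int>::value,
              "remove_cvref strips the reference, then the qualifiers");

// Only touch the built-in when the compiler says it exists.
#ifdef __has_builtin
# if __has_builtin(__remove_cv)
static_assert(std::is_same<__remove_cv(const int), int>::value,
              "built-in agrees with the library trait");
# endif
#endif
```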

gcc/c-family/ChangeLog:

* c-common.cc (c_common_reswords): Add __remove_cv,
__remove_reference and __remove_cvref.
* c-common.h (enum rid): Add RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cp-objcp-common.cc (names_builtin_p): Likewise.
* cp-tree.h (enum cp_trait_kind): Add CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* cxx-pretty-print.cc (pp_cxx_trait): Handle CPTK_REMOVE_CV,
CPTK_REMOVE_REFERENCE and CPTK_REMOVE_CVREF.
* parser.cc (cp_keyword_starts_decl_specifier_p): Return true
for RID_REMOVE_CV, RID_REMOVE_REFERENCE and RID_REMOVE_CVREF.
(cp_parser_trait): Handle RID_REMOVE_CV, RID_REMOVE_REFERENCE
and RID_REMOVE_CVREF.
(cp_parser_simple_type_specifier): Likewise.
* semantics.cc (finish_trait_type): Likewise.

libstdc++-v3/ChangeLog:

* include/bits/unique_ptr.h (unique_ptr<_Tp[], _Dp>): Remove
__remove_cv and use __remove_cv_t instead.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test __remove_cv,
__remove_reference and __remove_cvref.
* g++.dg/ext/remove_cv.C: New test.
* g++.dg/ext/remove_reference.C: New test.
* g++.dg/ext/remove_cvref.C: New test.
---
 gcc/c-family/c-common.cc|  3 +++
 gcc/c-family/c-common.h |  1 +
 gcc/cp/constraint.cc|  3 +++
 gcc/cp/cp-objcp-common.cc   |  3 +++
 gcc/cp/cp-tree.h|  5 +++-
 gcc/cp/cxx-pretty-print.cc  |  9 +++
 gcc/cp/parser.cc| 18 ++
 gcc/cp/semantics.cc | 10 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C|  9 +++
 gcc/testsuite/g++.dg/ext/remove_cv.C| 26 +
 gcc/testsuite/g++.dg/ext/remove_cvref.C | 26 +
 gcc/testsuite/g++.dg/ext/remove_reference.C | 26 +
 libstdc++-v3/include/bits/unique_ptr.h  |  5 +---
 13 files changed, 139 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/remove_cv.C
 create mode 100644 gcc/testsuite/g++.dg/ext/remove_cvref.C
 create mode 100644 gcc/testsuite/g++.dg/ext/remove_reference.C

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index cda6910e8c5..6e0af863a49 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -547,6 +547,9 @@ const struct c_common_resword c_common_reswords[] =
D_CXXONLY },
   { "__reference_converts_from_temporary", RID_REF_CONVERTS_FROM_TEMPORARY,
D_CXXONLY },
+  { "__remove_cv", RID_REMOVE_CV, D_CXXONLY },
+  { "__remove_reference", RID_REMOVE_REFERENCE, D_CXXONLY },
+  { "__remove_cvref", RID_REMOVE_CVREF, D_CXXONLY },
 
   /* C++ transactional memory.  */
   { "synchronized",RID_SYNCHRONIZED, D_CXX_OBJC | D_TRANSMEM },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50a4691cda6..d5c98d306ce 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -187,6 +187,7 @@ enum rid
   RID_IS_CONVERTIBLE,  RID_IS_NOTHROW_CONVERTIBLE,
   RID_REF_CONSTRUCTS_FROM_TEMPORARY,
   RID_REF_CONVERTS_FROM_TEMPORARY,
+  RID_REMOVE_CV, RID_REMOVE_REFERENCE, RID_REMOVE_CVREF,
 
   /* C++11 */
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 266ec581a20..ca73aff3f38 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3714,6 +3714,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_BASES:
 case CPTK_DIRECT_BASES:
 case CPTK_UNDERLYING_TYPE:
+case CPTK_REMOVE_CV:
+case CPTK_REMOVE_REFERENCE:
+case CPTK_REMOVE_CVREF:
   /* We shouldn't see these non-expression traits.  */
   gcc_unreachable ();
 /* We deliberately omit the default case so that when adding a new
diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
index 380f288a7f1..2d3f206b530 100644
--- a/gcc/cp/cp-objcp-common.cc
+++ b/gcc/cp/cp-objcp-common.cc
@@ -467,6 +467,9 @@ names_builtin_p (const char *name)
 case RID_IS_NOTHROW_CONVERTIBLE:
 case RID_REF_CONSTRUCTS_FROM_TEMPORARY:
 case RID_REF_CONVERTS_FROM_TEMPORARY:
+case RID_REMOVE_CV:
+case RID_REMOVE_REFERENCE:
+case RID_REMOVE_CVREF:
   return true;
 default:
   break;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index c9adf1b3822..5c8f585b821 100644
--- a/gcc/cp/cp-tree.h
+++ 

[PATCH 1/2] c++: introduce TRAIT_TYPE alongside TRAIT_EXPR

2022-09-27 Thread Patrick Palka via Gcc-patches
We already have generic support for predicate-like traits that yield a
boolean via TRAIT_EXPR, but we lack the same support for transform-like
traits that yield a type.  Such support would be useful for implementing
efficient built-ins for std::decay and std::remove_cvref and other
conceptually simple type traits that are otherwise relatively expensive
to implement.

This patch implements a generic TRAIT_TYPE type and replaces the
existing hardcoded UNDERLYING_TYPE type to use TRAIT_TYPE instead.
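
To illustrate the distinction between the two kinds of trait, here is a
small sketch using built-ins that predate this patch (`__is_enum` and
`__underlying_type`); `Color`, `ColorRep` and `rep_of` are made-up names.

```cpp
#include <type_traits>

enum class Color : unsigned char { Red, Green };

// Predicate-like trait: the result is a boolean expression.
constexpr bool color_is_enum = __is_enum(Color);

// Transform-like trait: the result is a type.
typedef __underlying_type(Color) ColorRep;

static_assert(color_is_enum, "Color is an enumeration");
static_assert(std::is_same<ColorRep, unsigned char>::value,
              "the underlying type is the declared one");

// Inside a template the operand is dependent, so the compiler has to keep
// a placeholder type around until E is known at instantiation time.
template <class E>
struct rep_of {
  typedef __underlying_type(E) type;
};
```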

gcc/cp/ChangeLog:

* cp-objcp-common.cc (cp_common_init_ts): Replace
UNDERLYING_TYPE with TRAIT_TYPE.
* cp-tree.def (TRAIT_TYPE): Define.
(UNDERLYING_TYPE): Remove.
* cp-tree.h (TRAIT_TYPE_KIND_RAW): Define.
(TRAIT_TYPE_KIND): Define.
(TRAIT_TYPE_TYPE1): Define.
(TRAIT_TYPE_TYPE2): Define.
(WILDCARD_TYPE_P): Return true for TRAIT_TYPE.
(finish_trait_type): Declare.
* cxx-pretty-print.cc (cxx_pretty_printer::primary_expression):
Adjust after renaming pp_cxx_trait_expression.
(cxx_pretty_printer::type_id): Replace UNDERLYING_TYPE with
TRAIT_TYPE.
(pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait): ... this.  Handle TRAIT_TYPE as well.  Correct
pretty printing of the trailing arguments.
* cxx-pretty-print.h (pp_cxx_trait_expression): Rename to ...
(pp_cxx_trait_type): ... this.
* error.cc (dump_type) : Remove.
: New.
(dump_type_prefix): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dump_type_suffix): Likewise.
* mangle.cc (write_type) : Remove.
: New.
* module.cc (trees_out::type_node) :
Remove.
: New.
(trees_in::tree_node): Likewise.
* parser.cc (cp_parser_primary_expression): Adjust after
renaming cp_parser_trait_expr.
(cp_parser_trait_expr): Rename to ...
(cp_parser_trait): ... this.  Call finish_trait_type for traits
that yield a type.
(cp_parser_simple_type_specifier): Adjust after renaming
cp_parser_trait_expr.
* pt.cc (for_each_template_parm_r) :
Remove.
: New.
(tsubst): Likewise.
(unify): Replace UNDERLYING_TYPE with TRAIT_TYPE.
(dependent_type_p_r): Likewise.
* semantics.cc (finish_underlying_type): Don't return
UNDERLYING_TYPE anymore when processing_template_decl.
(finish_trait_type): Define.
* tree.cc (strip_typedefs) : Remove.
: New.
(cp_walk_subtrees): Likewise.
* typeck.cc (structural_comptypes): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/underlying_type7.C: Adjust expected error message.
---
 gcc/cp/cp-objcp-common.cc   |  2 +-
 gcc/cp/cp-tree.def  |  9 ++--
 gcc/cp/cp-tree.h| 18 
 gcc/cp/cxx-pretty-print.cc  | 49 ++---
 gcc/cp/cxx-pretty-print.h   |  2 +-
 gcc/cp/error.cc | 14 +++---
 gcc/cp/mangle.cc|  5 ++-
 gcc/cp/module.cc| 24 +-
 gcc/cp/parser.cc| 24 +-
 gcc/cp/pt.cc| 26 +++
 gcc/cp/semantics.cc | 41 -
 gcc/cp/tree.cc  | 22 ++---
 gcc/cp/typeck.cc|  7 ++-
 gcc/testsuite/g++.dg/cpp0x/alias-decl-59.C  |  4 +-
 gcc/testsuite/g++.dg/ext/underlying_type7.C |  2 +-
 15 files changed, 171 insertions(+), 78 deletions(-)

diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
index 64975699351..380f288a7f1 100644
--- a/gcc/cp/cp-objcp-common.cc
+++ b/gcc/cp/cp-objcp-common.cc
@@ -518,7 +518,7 @@ cp_common_init_ts (void)
   MARK_TS_TYPE_NON_COMMON (DECLTYPE_TYPE);
   MARK_TS_TYPE_NON_COMMON (TYPENAME_TYPE);
   MARK_TS_TYPE_NON_COMMON (TYPEOF_TYPE);
-  MARK_TS_TYPE_NON_COMMON (UNDERLYING_TYPE);
+  MARK_TS_TYPE_NON_COMMON (TRAIT_TYPE);
   MARK_TS_TYPE_NON_COMMON (BOUND_TEMPLATE_TEMPLATE_PARM);
   MARK_TS_TYPE_NON_COMMON (TEMPLATE_TEMPLATE_PARM);
   MARK_TS_TYPE_NON_COMMON (TEMPLATE_TYPE_PARM);
diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index f9cbd339f19..f83b4c54d43 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -444,9 +444,12 @@ DEFTREECODE (BIT_CAST_EXPR, "bit_cast_expr", 
tcc_expression, 1)
 
 /** C++ extensions. */
 
-/* Represents a trait expression during template expansion.  */
+/* Represents a templated trait that yields an expression.  */
 DEFTREECODE (TRAIT_EXPR, "trait_expr", tcc_exceptional, 0)
 
+/* Represents a templated trait that yields a type.  */
+DEFTREECODE (TRAIT_TYPE, "trait_type", tcc_type, 0)
+
 /* A lambda expression.  This is a C++0x extension.
LAMBDA_EXPR_DEFAULT_CAPTURE_MODE is an enum for the default, which may be
none.
@@ -466,10 +469,6 @@ 

Re: [RFC] postreload cse'ing vector constants

2022-09-27 Thread H.J. Lu via Gcc-patches
On Tue, Sep 27, 2022 at 10:46 AM Robin Dapp via Gcc-patches
 wrote:
>
> > I did bootstrapping and ran the testsuite on x86(-64), aarch64, Power9
> > and s390.  Everything looks good except two additional fails on x86
> > where code actually looks worse.
> >
> > gcc.target/i386/keylocker-encodekey128.c
> >
> > 17c17,18
> > <   movaps  %xmm4, k2(%rip)
> > ---
> >>   pxor%xmm0, %xmm0
> >>   movaps  %xmm0, k2(%rip)
> >
> > gcc.target/i386/keylocker-encodekey256.c:
> >
> > 19c19,20
> > <   movaps  %xmm4, k3(%rip)
> > ---
> >>   pxor%xmm0, %xmm0
> >>   movaps  %xmm0, k3(%rip)
>
> Before the patch and after postreload we have:
>
> (insn (set (reg:V2DI xmm0)
> (reg:V2DI xmm4))
>  (expr_list:REG_DEAD (reg:V2DI 24 xmm4)
> (expr_list:REG_EQUIV (const_vector:V2DI [
> (const_int 0 [0]) repeated x2
> ])
> (insn (set (mem/c:V2DI (symbol_ref:DI ("k2"))
> (reg:V2DI xmm0
>
> which is converted by cprop_hardreg to:
>
> (insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
> (reg:V2DI xmm4
>
> With the change there is:
>
> (insn (set (reg:V2DI xmm0)
> (const_vector:V2DI [
> (const_int 0 [0]) repeated x2
> ])))
> (insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
> (reg:V2DI xmm0
>
> which is not simplified further because xmm0 needs to be explicitly
> zeroed while xmm4 is assumed to be zeroed by encodekey128.  I'm not
> familiar with this so I'm supposing this is correct even though I found
> "XMM4 through XMM6 are reserved for future usages and software should
> not rely upon them being zeroed." online.

I opened:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107061

> Even if xmm4 were zeroed explicitly, I guess in this case the simple
> costing of mov reg,reg vs mov reg,imm (with the latter not being more
> expensive) falls short?  cprop_hardreg can actually propagate the zeroed
> xmm4 into the next move.
> The same mechanism could possibly even elide many such moves which would
> mean we'd unnecessarily emit many mov reg,0?  Hmm...

This sounds like an issue.

-- 
H.J.


Re: [PATCH] Fortran: error recovery while simplifying intrinsic UNPACK [PR107054]

2022-09-27 Thread Mikael Morin

On 27/09/2022 at 21:05, Harald Anlauf via Fortran wrote:

Dear all,

invalid input may trigger an assert while trying to simplify an
expression involving the intrinsic UNPACK and when the constructor
is lacking sufficient valid elements.  The obvious solution is to
replace the assert by a condition that terminates simplification
in that case.

Report and testcase by Gerhard.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

This is a 10/11/12/13 regression and shall be backported.


OK, thanks.



[PATCH] Fortran: error recovery while simplifying intrinsic UNPACK [PR107054]

2022-09-27 Thread Harald Anlauf via Gcc-patches
Dear all,

invalid input may trigger an assert while trying to simplify an
expression involving the intrinsic UNPACK and when the constructor
is lacking sufficient valid elements.  The obvious solution is to
replace the assert by a condition that terminates simplification
in that case.
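
The shape of the fix can be sketched outside the compiler like this.  It is
a toy model over plain integers, not the gfortran code, and `toy_unpack` is
an invented name: where the old code would have asserted, the function now
reports failure so the caller can give up on simplification.

```cpp
#include <cstddef>
#include <vector>

// Toy UNPACK: scatter VEC into the positions where MASK is true and take
// FIELD elsewhere.  Returns false (leaving OUT empty) when VEC runs out of
// elements instead of asserting.
bool toy_unpack(const std::vector<int> &vec, const std::vector<bool> &mask,
                const std::vector<int> &field, std::vector<int> *out) {
  out->clear();
  if (field.size() != mask.size())
    return false;
  std::size_t next = 0;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i]) {
      if (next == vec.size()) {   // previously: an assertion failure
        out->clear();
        return false;
      }
      out->push_back(vec[next++]);
    } else {
      out->push_back(field[i]);
    }
  }
  return true;
}
```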

Report and testcase by Gerhard.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

This is a 10/11/12/13 regression and shall be backported.

Thanks,
Harald

From 80285cdad1fe98c52ebf38f9f66070b2a50191c6 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 27 Sep 2022 20:54:28 +0200
Subject: [PATCH] Fortran: error recovery while simplifying intrinsic UNPACK
 [PR107054]

gcc/fortran/ChangeLog:

	PR fortran/107054
	* simplify.cc (gfc_simplify_unpack): Replace assert by condition
	that terminates simplification when there are not enough elements
	in the constructor of argument VECTOR.

gcc/testsuite/ChangeLog:

	PR fortran/107054
	* gfortran.dg/pr107054.f90: New test.
---
 gcc/fortran/simplify.cc| 13 ++---
 gcc/testsuite/gfortran.dg/pr107054.f90 | 13 +
 2 files changed, 23 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr107054.f90

diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index c0fbd0ed7c2..6ac92cf9db8 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -8458,9 +8458,16 @@ gfc_simplify_unpack (gfc_expr *vector, gfc_expr *mask, gfc_expr *field)
 {
   if (mask_ctor->expr->value.logical)
 	{
-	  gcc_assert (vector_ctor);
-	  e = gfc_copy_expr (vector_ctor->expr);
-	  vector_ctor = gfc_constructor_next (vector_ctor);
+	  if (vector_ctor)
+	{
+	  e = gfc_copy_expr (vector_ctor->expr);
+	  vector_ctor = gfc_constructor_next (vector_ctor);
+	}
+	  else
+	{
+	  gfc_free_expr (result);
+	  return NULL;
+	}
 	}
   else if (field->expr_type == EXPR_ARRAY)
 	e = gfc_copy_expr (field_ctor->expr);
diff --git a/gcc/testsuite/gfortran.dg/pr107054.f90 b/gcc/testsuite/gfortran.dg/pr107054.f90
new file mode 100644
index 000..bbfe646beba
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr107054.f90
@@ -0,0 +1,13 @@
+! { dg-do compile }
+! PR fortran/107054 - ICE in gfc_simplify_unpack
+! Contributed by G.Steinmetz
+
+program p
+  type t
+ integer :: n = 0
+  end type
+  type(t), parameter :: a(4) = t(2)
+  type(t), parameter :: b(4) = reshape(a,[2]) ! { dg-error "Different shape" }
+  type(t), parameter :: c(2) = pack(b,[.false.,.true.,.false.,.true.]) ! { dg-error "Different shape" }
+  type(t), parameter :: d(4) = unpack(c,[.false.,.true.,.false.,.true.],a)
+end
--
2.35.3



Re: [RFC] postreload cse'ing vector constants

2022-09-27 Thread Robin Dapp via Gcc-patches
> I did bootstrapping and ran the testsuite on x86(-64), aarch64, Power9
> and s390.  Everything looks good except two additional fails on x86
> where code actually looks worse.
> 
> gcc.target/i386/keylocker-encodekey128.c
> 
> 17c17,18
> <   movaps  %xmm4, k2(%rip)
> ---
>>   pxor%xmm0, %xmm0
>>   movaps  %xmm0, k2(%rip)
> 
> gcc.target/i386/keylocker-encodekey256.c:
> 
> 19c19,20
> <   movaps  %xmm4, k3(%rip)
> ---
>>   pxor%xmm0, %xmm0
>>   movaps  %xmm0, k3(%rip)

Before the patch and after postreload we have:

(insn (set (reg:V2DI xmm0)
(reg:V2DI xmm4))
 (expr_list:REG_DEAD (reg:V2DI 24 xmm4)
(expr_list:REG_EQUIV (const_vector:V2DI [
(const_int 0 [0]) repeated x2
])
(insn (set (mem/c:V2DI (symbol_ref:DI ("k2"))
(reg:V2DI xmm0

which is converted by cprop_hardreg to:

(insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
(reg:V2DI xmm4

With the change there is:

(insn (set (reg:V2DI xmm0)
(const_vector:V2DI [
(const_int 0 [0]) repeated x2
])))
(insn (set (mem/c:V2DI (symbol_ref:DI ("k2")))
(reg:V2DI xmm0

which is not simplified further because xmm0 needs to be explicitly
zeroed while xmm4 is assumed to be zeroed by encodekey128.  I'm not
familiar with this so I'm supposing this is correct even though I found
"XMM4 through XMM6 are reserved for future usages and software should
not rely upon them being zeroed." online.

Even if xmm4 were zeroed explicitly, I guess in this case the simple
costing of mov reg,reg vs mov reg,imm (with the latter not being more
expensive) falls short?  cprop_hardreg can actually propagate the zeroed
xmm4 into the next move.
The same mechanism could possibly even elide many such moves which would
mean we'd unnecessarily emit many mov reg,0?  Hmm...


[PATCH 1/1] p1689r5: initial support

2022-09-27 Thread Ben Boeckel via Gcc-patches
This patch implements support for [P1689R5][] to communicate the C++20
module dependencies of a source file to build systems so that they may
build `.gcm` files in the proper order.

Support is communicated through the following three new flags:

- `-fdeps-format=` specifies the format for the output. Currently named
  `p1689r5`.

- `-fdeps-file=` specifies the path to the file to write the format to.

- `-fdep-output=` specifies the `.o` that will be written for the TU
  that is scanned. This is required so that the build system can
  correlate the dependency output with the actual compilation that will
  occur.

CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.

Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: 
https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst

TODO:

- header-unit information fields

Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.

- non-utf8 paths

The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously representable in UTF-8" case).

- figure out why junk gets placed at the end of the file

Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.

Signed-off-by: Ben Boeckel 
---
 gcc/ChangeLog   |   9 ++
 gcc/c-family/ChangeLog  |   6 +
 gcc/c-family/c-opts.cc  |  40 ++-
 gcc/c-family/c.opt  |  12 ++
 gcc/cp/ChangeLog|   5 +
 gcc/cp/module.cc|   3 +-
 gcc/doc/invoke.texi |  15 +++
 gcc/fortran/ChangeLog   |   5 +
 gcc/fortran/cpp.cc  |   4 +-
 gcc/genmatch.cc |   2 +-
 gcc/input.cc|   4 +-
 libcpp/ChangeLog|  11 ++
 libcpp/include/cpplib.h |  12 +-
 libcpp/include/mkdeps.h |  17 ++-
 libcpp/init.cc  |  14 ++-
 libcpp/mkdeps.cc| 235 ++--
 16 files changed, 368 insertions(+), 26 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6dded16c0e3..2d61de6adde 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2022-09-20  Ben Boeckel  
+
+   * doc/invoke.texi: Document -fdeps-format=, -fdep-file=, and
+   -fdep-output= flags.
+   * genmatch.cc (main): Add new preprocessor parameter used for C++
+   module tracking.
+   * input.cc (test_lexer): Add new preprocessor parameter used for C++
+   module tracking.
+
 2022-09-19  Torbjörn SVENSSON  
 
* targhooks.cc (default_zero_call_used_regs): Improve sorry
diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
index ba3d76dd6cb..569dcd96e8c 100644
--- a/gcc/c-family/ChangeLog
+++ b/gcc/c-family/ChangeLog
@@ -1,3 +1,9 @@
+2022-09-20  Ben Boeckel  
+
+   * c-opts.cc (c_common_handle_option): Add fdeps_file variable and
+   -fdeps-format=, -fdep-file=, and -fdep-output= parsing.
+   * c.opt: Add -fdeps-format=, -fdep-file=, and -fdep-output= flags.
+
 2022-09-15  Richard Biener  
 
* c-common.h (build_void_list_node): Remove.
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index babaa2fc157..617d0e93696 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -77,6 +77,9 @@ static bool verbose;
 /* Dependency output file.  */
 static const char *deps_file;
 
+/* Enhanced dependency output file.  */
+static const char *fdeps_file;
+
 /* The prefix given by -iprefix, if any.  */
 static const char *iprefix;
 
@@ -360,6 +363,23 @@ c_common_handle_option (size_t scode, const char *arg, 
HOST_WIDE_INT value,
   deps_file = arg;
   break;
 
+case OPT_fdep_format_:
+  if (!strcmp (arg, "p1689r5"))
+   cpp_opts->deps.format = DEPS_FMT_P1689R5;
+  else
+   error ("%<-fdep-format=%> unknown format %s", arg);
+  break;
+
+case OPT_fdep_file_:
+  deps_seen = true;
+  fdeps_file = arg;
+  break;
+
+case OPT_fdep_output_:
+  deps_seen = true;
+  defer_opt (code, arg);
+  break;
+
 case OPT_MF:
   deps_seen = true;
   

[PATCH 0/1] RFC: P1689R5 support

2022-09-27 Thread Ben Boeckel via Gcc-patches
This patch adds initial support for ISO C++'s [P1689R5][], a format for
describing C++ module requirements and provisions based on the source
code. This is required because compiling C++ with modules is not
embarrassingly parallel: compilations need to be ordered so that `import
some_module;` can be satisfied in time by making sure that the TU with
`export module some_module;` is compiled first.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
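
As a rough sketch of what a build system does with this information (this
is not the P1689R5 format itself, and `ToyTU` and `build_order` are names
invented for the example), each TU is reduced to the modules it provides
and requires, and compilations are scheduled once their imports are
available:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct ToyTU {
  std::string source;
  std::vector<std::string> provides;   // modules this TU exports
  std::vector<std::string> requires_;  // modules this TU imports
};

// Returns the sources in an order where every provider precedes its
// importers.  Returns an empty vector on a cycle or an unsatisfied import.
std::vector<std::string> build_order(const std::vector<ToyTU> &tus) {
  std::set<std::string> built;                 // modules available so far
  std::vector<bool> done(tus.size(), false);
  std::vector<std::string> order;
  bool progress = true;
  while (order.size() < tus.size() && progress) {
    progress = false;
    for (std::size_t i = 0; i < tus.size(); ++i) {
      if (done[i])
        continue;
      bool ready = true;
      for (const std::string &m : tus[i].requires_)
        if (!built.count(m)) { ready = false; break; }
      if (!ready)
        continue;
      for (const std::string &m : tus[i].provides)
        built.insert(m);
      done[i] = true;
      order.push_back(tus[i].source);
      progress = true;
    }
  }
  if (order.size() != tus.size())
    order.clear();
  return order;
}
```

A real build system would run independent compilations in parallel; the
sequential loop here only shows the ordering constraint.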

I'd like feedback on the approach taken here with respect to the
user-visible flags. I'll also note that header units are not supported
at this time because the current `-E` behavior with respect to `import
;` is to search for an appropriate `.gcm` file which is not
something such a "scan" can support. A new mode will likely need to be
created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
where headers are looked up "normally" and processed only as much as
scanning requires.

Testing is currently happening in CMake's CI using a prior revision of
this patch (the differences are basically the changelog, some style, and
`trtbd` instead of `p1689r5` as the format name).

For testing within GCC, I'll work on the following:

- scanning non-module source
- scanning module-importing source (`import X;`)
- scanning module-exporting source (`export module X;`)
- scanning module implementation unit (`module X;`)
- flag combinations?

Are there existing tools for handling JSON output for testing purposes?
Basically, something that I can add to the test suite that doesn't care
about whitespace, but checks the structure (with sensible replacements
for absolute paths where relevant)?

For the record, Clang has patches with similar flags and behavior by
Chuanqi Xu here:

https://reviews.llvm.org/D134269

with the same flags (though using my old `trtbd` spelling for the
format name).

Thanks,

--Ben

Ben Boeckel (1):
  p1689r5: initial support

 gcc/ChangeLog   |   9 ++
 gcc/c-family/ChangeLog  |   6 +
 gcc/c-family/c-opts.cc  |  40 ++-
 gcc/c-family/c.opt  |  12 ++
 gcc/cp/ChangeLog|   5 +
 gcc/cp/module.cc|   3 +-
 gcc/doc/invoke.texi |  15 +++
 gcc/fortran/ChangeLog   |   5 +
 gcc/fortran/cpp.cc  |   4 +-
 gcc/genmatch.cc |   2 +-
 gcc/input.cc|   4 +-
 libcpp/ChangeLog|  11 ++
 libcpp/include/cpplib.h |  12 +-
 libcpp/include/mkdeps.h |  17 ++-
 libcpp/init.cc  |  14 ++-
 libcpp/mkdeps.cc| 235 ++--
 16 files changed, 368 insertions(+), 26 deletions(-)


base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f
-- 
2.37.3



Re: [PATCH v2] libgo: Portable access to thread ID in struct sigevent

2022-09-27 Thread Ian Lance Taylor via Gcc-patches
On Fri, Sep 23, 2022 at 6:59 AM  wrote:
>
> From: Sören Tempel 
>
> Tested on x86_64 Arch Linux (glibc) and Alpine Linux (musl libc).
>
> Previously, libgo relied on the _sigev_un implementation-specific
> field in struct sigevent, which is only available on glibc. This
> patch uses the sigev_notify_thread_id macro instead which is mandated
> by timer_create(2). In theory, this should work with any libc
> implementation for Linux. Unfortunately, there is an open glibc bug
> as glibc does not define this macro. For this reason, a glibc-specific
> workaround is required. Other libcs (such as musl) define the macro
> and don't require the workaround.
>
> This makes go_signal compatible with musl libc.
>
> See: https://sourceware.org/bugzilla/show_bug.cgi?id=27417

Thanks.  Committed with some changes, as appended.

Sorry for the delay.

Ian
e73d9fcafbd07bc3714fbaf8a82db71d50015c92
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 73aa712dbdf..4793c821eba 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-0140cca9bc0fad1108c7ed369376ac71cc4bfecf
+8f1a91aeff400d572857895b7f5e863ec5a4d93e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/runtime/os_linux.go b/libgo/go/runtime/os_linux.go
index 96fb178870e..2b2d827cee8 100644
--- a/libgo/go/runtime/os_linux.go
+++ b/libgo/go/runtime/os_linux.go
@@ -22,6 +22,12 @@ type mOS struct {
profileTimerValid uint32
 }
 
+// setSigeventTID is written in C to set the sigev_notify_thread_id
+// field of a sigevent struct.
+//
+//go:noescape
+func setSigeventTID(*_sigevent, int32)
+
 func getProcID() uint64 {
return uint64(gettid())
 }
@@ -52,9 +58,12 @@ const (
 )
 
 // Atomically,
+//
 // if(*addr == val) sleep
+//
 // Might be woken up spuriously; that's allowed.
 // Don't sleep longer than ns; ns < 0 means forever.
+//
 //go:nosplit
 func futexsleep(addr *uint32, val uint32, ns int64) {
// Some Linux kernels have a bug where futex of
@@ -73,6 +82,7 @@ func futexsleep(addr *uint32, val uint32, ns int64) {
 }
 
 // If any procs are sleeping on addr, wake up at most cnt.
+//
 //go:nosplit
 func futexwakeup(addr *uint32, cnt uint32) {
ret := futex(unsafe.Pointer(addr), _FUTEX_WAKE_PRIVATE, cnt, nil, nil, 
0)
@@ -365,7 +375,7 @@ func setThreadCPUProfiler(hz int32) {
var sevp _sigevent
sevp.sigev_notify = _SIGEV_THREAD_ID
sevp.sigev_signo = _SIGPROF
-   *((*int32)(unsafe.Pointer(_sigev_un))) = int32(mp.procid)
+   setSigeventTID(, int32(mp.procid))
ret := timer_create(_CLOCK_THREAD_CPUTIME_ID, , )
if ret != 0 {
// If we cannot create a timer for this M, leave 
profileTimerValid false
diff --git a/libgo/runtime/go-signal.c b/libgo/runtime/go-signal.c
index 528d9b6d9fe..aa1b6305ad0 100644
--- a/libgo/runtime/go-signal.c
+++ b/libgo/runtime/go-signal.c
@@ -183,6 +183,24 @@ setSigactionHandler(struct sigaction* sa, uintptr handler)
sa->sa_sigaction = (void*)(handler);
 }
 
+#ifdef __linux__
+
+// Workaround for https://sourceware.org/bugzilla/show_bug.cgi?id=27417
+#ifndef sigev_notify_thread_id
+  #define sigev_notify_thread_id _sigev_un._tid
+#endif
+
+void setSigeventTID(struct sigevent*, int32_t)
+   __asm__ (GOSYM_PREFIX "runtime.setSigeventTID");
+
+void
+setSigeventTID(struct sigevent *sev, int32_t v)
+{
+   sev->sigev_notify_thread_id = v;
+}
+
+#endif // defined(__linux__)
+
 // C code to fetch values from the siginfo_t and ucontext_t pointers
 // passed to a signal handler.
 


libgo patch committed: Synchronize empty struct field handling

2022-09-27 Thread Ian Lance Taylor via Gcc-patches
This libgo patch by Funan Zeng synchronizes the handling of empty
struct fields between the Go frontend and the libgo FFI code.  In the
compiler the logic for allocating one byte for the last field of a
struct is:
1. the last field has zero size
2. the struct itself does not have zero size
3. the last field is not blank
This patch adds the last two conditions to runtime.structToFFI.  This
is for https://go.dev/issue/55146.  Bootstrapped and ran Go testsuite
on x86_64-pc-linux-gnu.  Committed to mainline.
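
For illustration only (not part of the committed patch): a minimal C++ sketch of the three conditions listed above. The Field type and the function name are hypothetical stand-ins, not libgo or gofrontend identifiers, and the sketch follows the conditions as stated rather than the exact control flow of runtime.structToFFI.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a struct field (assumption).
struct Field {
  std::string name;   // "_" marks a blank field
  std::size_t size;   // size of the field in bytes
};

// Returns true when one byte of trailing padding would be allocated
// according to the three conditions in the message above.
bool needs_trailing_byte(const std::vector<Field>& fields) {
  if (fields.empty())
    return false;

  // Condition 2: the struct itself does not have zero size, which is
  // modelled here as "at least one field has non-zero size".
  bool struct_nonzero = false;
  for (const Field& f : fields)
    if (f.size != 0)
      struct_nonzero = true;

  const Field& last = fields.back();
  return last.size == 0        // condition 1: last field has zero size
         && struct_nonzero     // condition 2: struct is not zero-sized
         && last.name != "_";  // condition 3: last field is not blank
}
```

A struct with an 8-byte field followed by a named zero-sized field gets the extra byte under this model; the same struct with a blank ("_") last field, or a struct consisting only of zero-sized fields, does not.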

Ian
085bacba3502ff77c70a7660c19a68f50e9b7877
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index f7a7985287d..73aa712dbdf 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-42efec8c126cf3787bc7c89d9c7f224eff7c5a21
+0140cca9bc0fad1108c7ed369376ac71cc4bfecf
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/runtime/ffi.go b/libgo/go/runtime/ffi.go
index cd8479ef551..86ce5b85d04 100644
--- a/libgo/go/runtime/ffi.go
+++ b/libgo/go/runtime/ffi.go
@@ -4,6 +4,7 @@
 
 // Only build this file if libffi is supported.
 
+//go:build libffi
 // +build libffi
 
 package runtime
@@ -221,9 +222,6 @@ func stringToFFI() *__ffi_type {
 // structToFFI returns an ffi_type for a Go struct type.
 func structToFFI(typ *structtype) *__ffi_type {
c := len(typ.fields)
-   if c == 0 {
-   return emptyStructToFFI()
-   }
if typ.typ.kind != 0 {
return ffi_type_pointer()
}
@@ -231,6 +229,7 @@ func structToFFI(typ *structtype) *__ffi_type {
fields := make([]*__ffi_type, 0, c+1)
checkPad := false
lastzero := false
+   sawnonzero := false
for i, v := range typ.fields {
// Skip zero-sized fields; they confuse libffi,
// and there is no value to pass in any case.
@@ -239,10 +238,13 @@ func structToFFI(typ *structtype) *__ffi_type {
// next field.
if v.typ.size == 0 {
checkPad = true
-   lastzero = true
+   if v.name == nil || *v.name != "_" {
+   lastzero = true
+   }
continue
}
lastzero = false
+   sawnonzero = true
 
if checkPad {
off := uintptr(0)
@@ -263,6 +265,10 @@ func structToFFI(typ *structtype) *__ffi_type {
fields = append(fields, typeToFFI(v.typ))
}
 
+   if !sawnonzero {
+   return emptyStructToFFI()
+   }
+
if lastzero {
// The compiler adds one byte padding to non-empty struct ending
// with a zero-sized field (types.cc:get_backend_struct_fields).


Re: [PATCH][RFC] tree-optimization/105646 - re-interpret always executed in uninit diag

2022-09-27 Thread Jeff Law via Gcc-patches



On 8/22/22 00:16, Richard Biener via Gcc-patches wrote:

The following fixes PR105646, not diagnosing

int f1();
int f3(){
 auto const & a = f1();
 bool v3{v3};
 return a;
}

with optimization because the early uninit diagnostic pass only
diagnoses always executed cases.  The patch does this by
re-interpreting what always executed means and choosing to
ignore exceptional and abnormal control flow for this.  At the
same time it improves things as suggested in a comment - when
the value-numbering run done without optimizing figures there's
a fallthru path, consider blocks on it as always executed.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

PR tree-optimization/105646
* tree-ssa-uninit.cc (warn_uninitialized_vars): Pre-compute
the set of fallthru reachable blocks from function entry
and use that to determine wlims.always_executed.

* g++.dg/uninit-pr105646.C: New testcase.


I'm torn on this.  On one hand, ignoring abnormal flow control in the 
early pass is almost certainly going to result in false positives but 
it's also going to result in fixing some false negatives.


I'm willing to ACK and see what the real world fallout is in the spring 
when the distros run their builds.  Your call.



Jeff




RE: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2022-09-27 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Andrea Corallo 
> Sent: Tuesday, September 27, 2022 11:06 AM
> To: Kyrylo Tkachov 
> Cc: Andrea Corallo via Gcc-patches ; Richard
> Earnshaw ; nd 
> Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
> popping if necessary
> 
> Kyrylo Tkachov  writes:
> 
> > Hi Andrea,
> >
> >> -Original Message-
> >> From: Gcc-patches  >> bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
> >> Corallo via Gcc-patches
> >> Sent: Friday, August 12, 2022 4:34 PM
> >> To: Andrea Corallo via Gcc-patches 
> >> Cc: Richard Earnshaw ; nd 
> >> Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
> popping
> >> if necessary
> >>
> >> Hi all,
> >>
> >> this patch enables 'arm_emit_multi_reg_pop' to set again the stack
> >> pointer as CFA reg when popping if this is necessary.
> >>
> >
> > From what I can tell from similar functions this is correct, but could you
> elaborate on why this change is needed for my understanding please?
> > Thanks,
> > Kyrill
> 
> Hi Kyrill,
> 
> sure, if the frame pointer was set, then it is the current CFA register.
> If we request to adjust the current CFA register offset indicating it
> being SP (while it's actually FP) that is indeed not correct and the
> incoherence will be detected by an assertion in the dwarf emission
> machinery.

Thanks,  the patch is ok
Kyrill

> 
> Best Regards
> 
>   Andrea


Re: [Documentation] Correct RTL documentation: (use (mem ...)) is allowed.

2022-09-27 Thread Jeff Law via Gcc-patches



On 7/23/22 03:26, Roger Sayle wrote:

This patch is a one line correction/clarification to GCC's current
RTL documentation that explains a USE of a MEM is permissible.

PR rtl-optimization/99930 is an interesting example on x86_64 where
the backend generates better code when a USE is a (const) MEM than
when it is a REG. In fact the backend relies on CSE to propagate the
MEM (a constant pool reference) into the USE, to enable combine to
merge/simplify instructions.

This change has been tested with a make bootstrap, but as it might
provoke a discussion, I've decided to not consider it "obvious".
Ok for mainline (to document the actual current behavior)?


2022-07-23  Roger Sayle   

gcc/ChangeLog
 * doc/rtl.texi (use): Document that the operand may be a MEM.


Given this is documenting existing behavior, and it's not hard to
envision the MEM being useful in this context, OK.


jeff




[PATCH] Don't ICE running selftests if errors were raised [PR99723]

2022-09-27 Thread Andrea Corallo via Gcc-patches
Hi all

this is to address PR 99723.

In the PR GCC crashes as the initialization of common trees is not
performed as no compilation is happening, this is because we raise an
error earlier while processing the arch flags.

This patch changes the code to execute selftests only if no errors
were raised before.

Bootstrapped on aarch64, okay for trunk?

Best Regards

  Andrea

2022-09-27  Andrea Corallo  

* toplev.cc (toplev::main): Don't run self tests in case of
previous error.
---
 gcc/toplev.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index 924871fa9a8..b53a78bbaf1 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -2276,7 +2276,7 @@ toplev::main (int argc, char **argv)
start_timevars ();
   do_compile (no_backend);
 
-  if (flag_self_test)
+  if (flag_self_test && !seen_error ())
{
  if (no_backend)
error_at (UNKNOWN_LOCATION, "self-tests incompatible with %<-E%>");
-- 
2.25.1



Re: [PATCH] docs: update abi version info

2022-09-27 Thread Jeff Law via Gcc-patches



On 7/21/22 01:18, Kim Kuparinen via Gcc-patches wrote:

Subject:
[PATCH] docs: update abi version info
From:
Kim Kuparinen via Gcc-patches 
Date:
7/21/22, 01:18

To:
"gcc-patches@gcc.gnu.org" 


Synchronize gcc/common.opts and gcc/doc/invoke.texi w.r.t -fabi-version, and
correct -fabi-compat-version from ABIv11 to ABIv13, since it was changed in
a37e8ce3b66325f0c6de55c80d50ac1664c3d0eb

gcc/ChangeLog:

* doc/invoke.texi: update abi version info


Thanks.  I pushed this to the trunk.  Sorry for the long wait.


jeff



[COMMITTED] range-ops: Calculate the popcount of a singleton.

2022-09-27 Thread Aldy Hernandez via Gcc-patches
The legacy popcount folding didn't actually fold singleton ranges.
I don't think anyone noticed because there are match.pd patterns that
pick up the slack using the global nonzero bits set by CCP.

It's good form to handle this, even without CCP's help.
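
For illustration only: a minimal standalone C++ sketch of the idea, assuming a hypothetical closed range type (Range) rather than GCC's irange, and a fixed 32-bit precision. When the argument range is a singleton, the result range collapses to the exact popcount; otherwise the sketch falls back to the conservative [0, precision] range and omits the non-zero refinement visible in the patch context.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed integer range [lo, hi]; not GCC's irange class.
struct Range {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Portable bit count (clears the lowest set bit on each iteration).
std::uint32_t popcount32(std::uint32_t v) {
  std::uint32_t n = 0;
  for (; v != 0; v &= v - 1)
    ++n;
  return n;
}

// Range of popcount(x) for any x in r.
Range popcount_range(Range r) {
  assert(r.lo <= r.hi);
  if (r.lo == r.hi) {
    // Singleton: the result is known exactly.
    std::uint32_t p = popcount32(r.lo);
    return {p, p};
  }
  // Conservative fallback: any value in [0, precision].
  return {0, 32};
}
```

With this model, popcount_range({8, 8}) yields the singleton [1, 1], while popcount_range({1, 5}) yields [0, 32].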

Tested on x86-64 Linux.

p.s. This doesn't fix anything else in PR107043, except
the first two testcases with -fno-tree-ccp, so nothing new.

gcc/ChangeLog:

* gimple-range-op.cc (cfn_popcount): Calculate the popcount of a
singleton.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/popcount6b.c: New test.
---
 gcc/gimple-range-op.cc | 8 
 gcc/testsuite/gcc.dg/tree-ssa/popcount6b.c | 6 ++
 2 files changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount6b.c

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index d7c6dfa933d..3f5e5852e5a 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -397,6 +397,14 @@ public:
   {
 if (lh.undefined_p ())
   return false;
+// Calculating the popcount of a singleton is trivial.
+if (lh.singleton_p ())
+  {
+   wide_int nz = lh.get_nonzero_bits ();
+   wide_int pop = wi::shwi (wi::popcount (nz), TYPE_PRECISION (type));
+   r.set (type, pop, pop);
+   return true;
+  }
 // __builtin_ffs* and __builtin_popcount* return [0, prec].
 int prec = TYPE_PRECISION (lh.type ());
 // If arg is non-zero, then ffs or popcount are non-zero.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount6b.c 
b/gcc/testsuite/gcc.dg/tree-ssa/popcount6b.c
new file mode 100644
index 000..90336ecb070
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount6b.c
@@ -0,0 +1,6 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-evrp -fno-tree-ccp" }
+
+#include "popcount6.c"
+
+// { dg-final { scan-tree-dump "return 1;" "evrp" } }
-- 
2.37.1



[PATCH] testsuite: Skip intrinsics test if arm

2022-09-27 Thread Torbjörn SVENSSON via Gcc-patches
In the test cases, it's clearly written that intrinsics are not
implemented on arm*. A simple xfail does not help since there is a
link error, and that would cause an UNRESOLVED testcase rather than
XFAIL.
By changing to dg-skip-if, the entire test case is omitted.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Rephrase
to unimplemented.
* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: Replace
dg-xfail-if with dg-skip-if.
* gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: Likewise.

Co-Authored-By: Yvan ROUX  
Signed-off-by: Torbjörn SVENSSON  
---
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x2.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x3.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x4.c | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
index f933102be47..0c45a2b227b 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
 /* { dg-do run } */
-/* { dg-skip-if "unsupported" { arm*-*-* } } */
+/* { dg-skip-if "unimplemented" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
index b20dec061b5..4174dcd064a 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
 /* { dg-do run } */
-/* { dg-skip-if "unsupported" { arm*-*-* } } */
+/* { dg-skip-if "unimplemented" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
index e59f845880e..89b289bb21d 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x4.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
 /* { dg-do run } */
-/* { dg-skip-if "unsupported" { arm*-*-* } } */
+/* { dg-skip-if "unimplemented" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x2.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x2.c
index cb13da0caed..6d20a46b8b6 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x2.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
 /* { dg-do run } */
+/* { dg-skip-if "unimplemented" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x3.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x3.c
index 3ce272a5007..87eae4d2f35 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x3.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x3.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
 /* { dg-do run } */
+/* { dg-skip-if "unimplemented" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x4.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x4.c
index 1f17b5342de..829a18ddac0 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x4.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vst1x4.c
@@ -1,6 +1,6 @@
 /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
 /* { dg-do run } */
+/* { dg-skip-if "unimplemented" { arm*-*-* } } */
 /* { dg-options "-O3" } */
 
 #include 
-- 
2.25.1



Re: [PATCH v2] Libvtv-test: Fix bug that scansarif.exp cannot be found in libvtv regression test.

2022-09-27 Thread David Malcolm via Gcc-patches
On Tue, 2022-09-27 at 14:02 +0800, Lulu Cheng wrote:
> SARIF support was added in r13-967 but libvtv wasn't updated.

Sorry about breaking this.  The patch looks reasonable to me, FWIW,
assuming that it fixes the issue, of course!

Looks like my normal testing process missed this when I was testing the
SARIF patch; presumably we need to configure with
--enable-vtable-verify=yes to enable this feature.

Thanks
Dave

> 
> libvtv/ChangeLog:
> 
> * testsuite/lib/libvtv-dg.exp: Add load_gcc_lib of
> scansarif.exp.
> ---
>  libvtv/testsuite/lib/libvtv-dg.exp | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/libvtv/testsuite/lib/libvtv-dg.exp
> b/libvtv/testsuite/lib/libvtv-dg.exp
> index b140c194cdc..454d916e556 100644
> --- a/libvtv/testsuite/lib/libvtv-dg.exp
> +++ b/libvtv/testsuite/lib/libvtv-dg.exp
> @@ -12,6 +12,8 @@
>  # along with this program; if not, write to the Free Software
>  # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301, USA.
>  
> +load_gcc_lib scansarif.exp
> +
>  proc libvtv-dg-test { prog do_what extra_tool_flags } {
>  return [gcc-dg-test-1 libvtv_target_compile $prog $do_what
> $extra_tool_flags]
>  }



Re: VN, len_store and endianness

2022-09-27 Thread Robin Dapp via Gcc-patches
> Yes, because the native_interpret always starts at offset zero
> (we can't easily feed in a "shifted" RHS).  So what I assumed is
> that IFN_LEN_STORE always stores elements [0, len + adj].

Hmm, but this assumption is not violated here or am I missing something?
 It's not like we're storing [vec_size - (len + adj) - 1, vec_size - 1]
but indeed [0, len + adj].  Just the access to the buffer later is
reversed which it should not be.
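
For illustration only: a minimal C++ sketch of the assumption being discussed, namely that a length-limited store writes the first `count` bytes of the vector's in-memory image, so elements arrive in element order regardless of host endianness. The function name and the `count` parameter (standing in for the effective len/bias byte count) are hypothetical; this is not the GCC implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy the first `count` bytes of the in-memory image of `vec` to `dst`.
// Bytes of `dst` beyond `count` are left untouched.
template <std::size_t N>
void len_store_sketch(unsigned char* dst,
                      const std::int32_t (&vec)[N],
                      std::size_t count) {
  assert(count <= sizeof(vec));
  std::memcpy(dst, vec, count);
}
```

Storing 12 bytes of {-1, 1, -1, 1} into a zeroed 16-byte buffer and reading it back as four 32-bit elements gives -1, 1, -1 and an untouched fourth element, on both little- and big-endian hosts.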


Re: [PATCH] support -gz=zstd for both linker and assembler

2022-09-27 Thread Martin Liška
PING^1

On 9/22/22 14:51, Martin Liška wrote:
> Hi.
> 
> Tested with Fangrui's patch set sent to binutils ML and mold linker.
> 
> $ gcc -g -gz=zstd a.c --save-temps --verbose 2>&1 | grep debug-sections
>  /home/marxin/Programming/binutils/objdir/gas/as-new -v --gdwarf-5 
> --compress-debug-sections=zstd --64 -o a.o a.s
>  /home/marxin/bin/gcc/libexec/gcc/x86_64-pc-linux-gnu/13.0.0/collect2 -plugin 
> /home/marxin/bin/gcc/libexec/gcc/x86_64-pc-linux-gnu/13.0.0/liblto_plugin.so 
> -plugin-opt=/home/marxin/bin/gcc/libexec/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper
>  -plugin-opt=-fresolution=a.res -plugin-opt=-pass-through=-lgcc 
> -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc 
> -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s 
> --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 
> --compress-debug-sections=zstd /lib/../lib64/crt1.o /lib/../lib64/crti.o 
> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/13.0.0/crtbegin.o 
> -L/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/13.0.0 
> -L/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/13.0.0/../../../../lib64 
> -L/lib/../lib64 -L/usr/lib/../lib64 
> -L/home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/13.0.0/../../.. a.o 
> -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s 
> --no-as-needed 
> /home/marxin/bin/gcc/lib64/gcc/x86_64-pc-linux-gnu/13.0.0/crtend.o 
> /lib/../lib64/crtn.o
> 
> $ gdb a.out
> ...
> BFD: /home/marxin/Programming/testcases/a.out: unable to initialize 
> decompress status for section .debug_abbrev
> BFD: /home/marxin/Programming/testcases/a.out: unable to initialize 
> decompress status for section .debug_abbrev
> "/home/marxin/Programming/testcases/a.out": not in executable format: file 
> format not recognized
> 
> So it's really compressed with zstd. I'm going to write ChangeLog entry for 
> zlib-gnu once this gets merged as well.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
>   PR driver/106897
> 
> gcc/ChangeLog:
> 
>   * common.opt: Add -gz=zstd value.
>   * configure.ac: Detect --compress-debug-sections=zstd
>   for both linker and assembler.
>   * configure: Regenerate.
>   * gcc.cc (LINK_COMPRESS_DEBUG_SPEC): Handle -gz=zstd.
>   (ASM_COMPRESS_DEBUG_SPEC): Likewise.
> ---
>  gcc/common.opt   |  5 -
>  gcc/configure| 11 +--
>  gcc/configure.ac | 11 +--
>  gcc/gcc.cc   | 15 +++
>  4 files changed, 37 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 06ef768ab78..68370db816b 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3419,7 +3419,10 @@ EnumValue
>  Enum(compressed_debug_sections) String(zlib) Value(1)
>  
>  EnumValue
> -Enum(compressed_debug_sections) String(zlib-gnu) Value(2)
> +Enum(compressed_debug_sections) String(zstd) Value(2)
> +
> +EnumValue
> +Enum(compressed_debug_sections) String(zlib-gnu) Value(3)
>  
>  gz
>  Common Driver
> diff --git a/gcc/configure b/gcc/configure
> index 70a013e9a30..ce4e1859e1f 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -29727,13 +29727,16 @@ else
> if $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s 2>&1 | 
> grep -i warning > /dev/null
> then
>   gcc_cv_as_compress_debug=0
> -   # Since binutils 2.26, gas supports --compress-debug-sections=zlib,
> -   # defaulting to the ELF gABI format.
> elif $gcc_cv_as --compress-debug-sections=zlib -o conftest.o conftest.s > 
> /dev/null 2>&1
> then
>   gcc_cv_as_compress_debug=1
>   gcc_cv_as_compress_debug_option="--compress-debug-sections"
>   gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
> + # Since binutils 2.40, gas supports --compress-debug-sections=zstd.
> + if $gcc_cv_as --compress-debug-sections=zstd -o conftest.o conftest.s > 
> /dev/null 2>&1
> + then
> +   gcc_cv_as_compress_debug=2
> + fi
> else
>   gcc_cv_as_compress_debug=0
> fi
> @@ -30251,6 +30254,10 @@ $as_echo_n "checking linker for compressed debug 
> sections... " >&6; }
>  if $gcc_cv_ld --help 2>&1 | grep -- 
> '--compress-debug-sections.*\' > /dev/null; then
>  gcc_cv_ld_compress_debug=1
>  gcc_cv_ld_compress_debug_option="--compress-debug-sections"
> +# Detect zstd debug section compression support
> +if $gcc_cv_ld --help 2>&1 | grep -- 
> '--compress-debug-sections.*\' > /dev/null; then
> +  gcc_cv_ld_compress_debug=2
> +fi
>  else
>case "${target}" in
>  *-*-solaris2*)
> diff --git a/gcc/configure.ac b/gcc/configure.ac
> index 96e10d7c194..b6bafa8b7d6 100644
> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -5732,13 +5732,16 @@ gcc_GAS_CHECK_FEATURE([compressed debug sections],
> if $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s 2>&1 | 
> grep -i warning > /dev/null
> then
>   gcc_cv_as_compress_debug=0
> -   # Since binutils 2.26, gas supports --compress-debug-sections=zlib,
> -   # defaulting to 

Re: VN, len_store and endianness

2022-09-27 Thread Richard Biener via Gcc-patches
On Tue, Sep 27, 2022 at 3:19 PM Robin Dapp  wrote:
>
> > The error is probably in vn_reference_lookup_3 which assumes that
> > 'len' applies to the vector elements in element order.  See the part
> > of the code where it checks for internal_store_fn_p.  If 'len' is with
> > respect to the memory and thus endianess has to be taken into
> > account then for the IFN_LEN_STORE
> >
> >   else if (fn == IFN_LEN_STORE)
> > {
> >   pd.rhs_off = 0;
> >   pd.offset = offset2i;
> >   pd.size = (tree_to_uhwi (len)
> >  + -tree_to_shwi (bias)) * BITS_PER_UNIT;
> >   if (ranges_known_overlap_p (offset, maxsize,
> >   pd.offset, pd.size))
> > return data->push_partial_def (pd, set, set,
> >offseti, maxsizei);
> >
> > likely needs to adjust rhs_off from zero for big endian?
>
> Not sure I follow entirely.  rhs_off only seems to be used for
> native_encode_expr which properly encodes already ({-1, 1, -1, 1} in
> that order in memory).  A 'len' of 12 is the first three elements (in
> the same order or element order as well).

Yes, because the native_interpret always starts at offset zero
(we can't easily feed in a "shifted" RHS).  So what I assumed is
that IFN_LEN_STORE always stores elements [0, len + adj].

> If the constant were encoded in little endian ({1, -1, 1, -1}) 'q' would
> kind of address the right elements (using always the second, or
> "reversed third" element while shifting the buffer by 4 bytes each time).


Re: VN, len_store and endianness

2022-09-27 Thread Robin Dapp via Gcc-patches
> The error is probably in vn_reference_lookup_3 which assumes that
> 'len' applies to the vector elements in element order.  See the part
> of the code where it checks for internal_store_fn_p.  If 'len' is with
> respect to the memory and thus endianess has to be taken into
> account then for the IFN_LEN_STORE
> 
>   else if (fn == IFN_LEN_STORE)
> {
>   pd.rhs_off = 0;
>   pd.offset = offset2i;
>   pd.size = (tree_to_uhwi (len)
>  + -tree_to_shwi (bias)) * BITS_PER_UNIT;
>   if (ranges_known_overlap_p (offset, maxsize,
>   pd.offset, pd.size))
> return data->push_partial_def (pd, set, set,
>offseti, maxsizei);
> 
> likely needs to adjust rhs_off from zero for big endian?

Not sure I follow entirely.  rhs_off only seems to be used for
native_encode_expr which properly encodes already ({-1, 1, -1, 1} in
that order in memory).  A 'len' of 12 is the first three elements (in
the same order or element order as well).

If the constant were encoded in little endian ({1, -1, 1, -1}) 'q' would
kind of address the right elements (using always the second, or
"reversed third" element while shifting the buffer by 4 bytes each time).


Re: [PATCH v2] c++: Don't quote nothrow in diagnostic

2022-09-27 Thread Jason Merrill via Gcc-patches

On 9/27/22 04:41, Richard Biener wrote:

On Mon, Sep 26, 2022 at 9:54 PM Marek Polacek  wrote:


On Mon, Sep 26, 2022 at 12:34:04PM -0400, Jason Merrill wrote:

On 9/26/22 03:50, Richard Biener wrote:

On Fri, Sep 23, 2022 at 8:41 PM Marek Polacek via Gcc-patches
 wrote:


In 
Jason noticed that we quote "nothrow" in diagnostics even though it's
not a keyword in C++.  Just removing the quotes didn't work because
then -Wformat-diag complains, so this patch replaces it with "no-throw".

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


That doesn't look like an improvement to me.  Can we quote 'nothrow()' instead?


Understood.


nothrow() is a syntax error; the C++11 keyword is 'noexcept'. std::nothrow
is a dummy placement argument used to indicate that a new-expression should
return null rather than throw on failure.

But bizarrely, the library traits use the word "nothrow".  Marek's patch
clarifies that we are not trying to refer to anything in the language.


I'd rather leave it alone than changing it to no-throw.  Why does -Wformat-diag
complain?  If we shouldn't quote nothrow that should be adjusted?


I think -Wformat-diag complains because "nothrow" is an attribute; it also
includes some other attribute names in the list of "keywords".

I would also be fine with just removing the quotes and removing nothrow from
c_keywords.


Like below?   Bootstrapped/regtested on x86_64-pc-linux-gnu.


Yes.  I assume that terms like "nothrow constructible" are used in the
C++ standard?


Not in the language, only in library names like 
std::is_nothrow_constructible.
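
For illustration only: a small C++ sketch of the three distinct things mentioned in this thread, the `noexcept` keyword, the `std::nothrow` placement argument, and a library trait with "nothrow" in its name. The types Widget and Risky and the helper function are hypothetical.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

struct Widget {
  Widget() noexcept {}         // `noexcept` is the language keyword
};

struct Risky {
  Risky() noexcept(false) {}   // explicitly allowed to throw
};

// The library trait uses the word "nothrow" in its name.
static_assert(std::is_nothrow_constructible<Widget>::value,
              "Widget() is declared noexcept");
static_assert(!std::is_nothrow_constructible<Risky>::value,
              "Risky() may throw");

// std::nothrow is a placement argument: the new-expression returns a
// null pointer on allocation failure instead of throwing std::bad_alloc.
bool allocate_and_release() {
  int* p = new (std::nothrow) int(42);
  if (p == nullptr)
    return false;
  bool ok = (*p == 42);
  delete p;
  return ok;
}
```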


Jason



Re: [Patch] libgomp/gcn: Prepare for reverse-offload callback handling

2022-09-27 Thread Tobias Burnus

For those without a working crystal ball, I have now also included the patch.

On 27.09.22 15:15, Tobias Burnus wrote:

This patch adds support to handle reverse offload to libgomp's plugin-gcn.c and
to AMD GCN's libgomp target.c.

In theory, that's all that's needed for GCN; in practice there is a known issue with
private stack variables which has to be addressed independently. Once this and
the target.c generic code is committed, omp requires reverse-offload
support can be claimed for the device (→ GOMP_OFFLOAD_get_num_devices).

Note: Contrary to nvptx, the code to handle reverse offload is already enabled
if there is 'omp requires reverse_offload' (+ target functions) in the code;
for nvptx, an actual reverse-offload-target region has to exist in the code.
This probably does not matter that much in practice.

The '#if 1' code block inside plugin-gcn.c has to be replaced by the
target.c/libgomp-plugins.h/libgomp.h/libgomp.map patch that is part
of the patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602321.html
"[Patch] libgomp/nvptx: Prepare for reverse-offload callback handling"

Andrew did suggest a while back to piggyback on the console_output handling,
avoiding another atomic access. - If this is still wanted, I like to have some
guidance regarding how to actually implement it.

Comments, suggestions?
If not, OK for mainline?*

Tobias

*Without the '#if 1' code and once the non-nvptx bits of the other patch
have been approved and committed.


libgomp/gcn: Prepare for reverse-offload callback handling

libgomp/ChangeLog:

	* config/gcn/libgomp-gcn.h: New file.
	* config/gcn/target.c: Include it.
	(GOMP_ADDITIONAL_ICVS): Declare as extern var.
	(GOMP_target_ext): Handle reverse offload.
	* plugin/plugin-gcn.c (struct kernargs): Add 'int64_t rev_ptr' as
	6th argument and 'struct rev_offload rev_data'.
	(struct agent_info): Add has_reverse_offload; move prog_finalized
	up to reduce padding.
	(create_kernel_dispatch): Init kernargs' rev_ptr and rev_data.
	(reverse_offload): New.
	(run_kernel): Call it.
	(GOMP_OFFLOAD_init_device, GOMP_OFFLOAD_load_image): Set
	has_reverse_offload.

 libgomp/config/gcn/libgomp-gcn.h | 50 +
 libgomp/config/gcn/target.c  | 35 --
 libgomp/plugin/plugin-gcn.c  | 54 +---
 3 files changed, 129 insertions(+), 10 deletions(-)

diff --git a/libgomp/config/gcn/libgomp-gcn.h b/libgomp/config/gcn/libgomp-gcn.h
new file mode 100644
index 000..884f0094d05
--- /dev/null
+++ b/libgomp/config/gcn/libgomp-gcn.h
@@ -0,0 +1,50 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Tobias Burnus .
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+/* This file contains defines and type definitions shared between the
+   nvptx target's libgomp.a and the plugin-nvptx.c, but that is only
+   needed for this target.  */
+
+#ifndef LIBGOMP_GCN_H
+#define LIBGOMP_GCN_H 1
+
+
+struct rev_offload {
+  uint64_t fn;
+  uint64_t mapnum;
+  uint64_t addrs;
+  uint64_t sizes;
+  uint64_t kinds;
+  int32_t dev_num;
+  uint32_t lock;
+};
+
+#if (__SIZEOF_SHORT__ != 2 \
+ || __SIZEOF_SIZE_T__ != 8 \
+ || __SIZEOF_POINTER__ != 8)
+#error "Data-type conversion required for rev_offload"
+#endif
+
+#endif  /* LIBGOMP_GCN_H */
diff --git a/libgomp/config/gcn/target.c b/libgomp/config/gcn/target.c
index c8484fa18d9..ecbf3f337d0 100644
--- a/libgomp/config/gcn/target.c
+++ b/libgomp/config/gcn/target.c
@@ -24,8 +24,11 @@
.  */
 
 #include "libgomp.h"
+#include "libgomp-gcn.h"
 #include 
 
+extern volatile struct gomp_offload_icvs GOMP_ADDITIONAL_ICVS;
+
 bool
 GOMP_teams4 (unsigned int num_teams_lower, 

[Patch] libgomp/gcn: Prepare for reverse-offload callback handling

2022-09-27 Thread Tobias Burnus

This patch adds support to handle reverse offload to libgomp's plugin-gcn.c and
to AMD GCN's libgomp target.c.

In theory, that's all that is needed for GCN – in practice there is a known
issue with private stack variables, which has to be addressed independently.
Once this and the target.c generic code are committed, omp requires
reverse-offload support can be claimed for the device
(→ GOMP_OFFLOAD_get_num_devices).

Note: Contrary to nvptx, the code to handle reverse offload is already enabled
if there is 'omp requires reverse_offload' (+ target functions) in the code;
for nvptx, an actual reverse-offload-target region has to exist in the code.
This probably does not matter that much in practice.

The '#if 1' code block inside plugin-gcn.c has to be replaced by the
target.c/libgomp-plugins.h/libgomp.h/libgomp.map patch that is part
of the patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602321.html
"[Patch] libgomp/nvptx: Prepare for reverse-offload callback handling"

Andrew did suggest a while back to piggyback on the console_output handling,
avoiding another atomic access. If this is still wanted, I would like to have
some guidance regarding how to actually implement it.

Comments, suggestions?
If not, OK for mainline?*

Tobias

*Without the '#if 1' code and once the non-nvptx bits of the other patch
have been approved and committed.




Re: [PATCH v2] c++: Don't quote nothrow in diagnostic

2022-09-27 Thread Marek Polacek via Gcc-patches
On Tue, Sep 27, 2022 at 10:41:29AM +0200, Richard Biener wrote:
> On Mon, Sep 26, 2022 at 9:54 PM Marek Polacek  wrote:
> >
> > On Mon, Sep 26, 2022 at 12:34:04PM -0400, Jason Merrill wrote:
> > > On 9/26/22 03:50, Richard Biener wrote:
> > > > On Fri, Sep 23, 2022 at 8:41 PM Marek Polacek via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > In 
> > > > > 
> > > > > Jason noticed that we quote "nothrow" in diagnostics even though it's
> > > > > not a keyword in C++.  Just removing the quotes didn't work because
> > > > > then -Wformat-diag complains, so this patch replaces it with 
> > > > > "no-throw".
> > > > >
> > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > >
> > > > That doesn't look like an improvement to me.  Can we quote 'nothrow()' 
> > > > instead?
> >
> > Understood.
> >
> > > nothrow() is a syntax error; the C++11 keyword is 'noexcept'. std::nothrow
> > > is a dummy placement argument used to indicate that a new-expression 
> > > should
> > > return null rather than throw on failure.
> > >
> > > But bizarrely, the library traits use the word "nothrow".  Marek's patch
> > > clarifies that we are not trying to refer to anything in the language.
> > >
> > > > I'd rather leave it alone than changing it to no-throw.  Why does 
> > > > -Wformat-diag
> > > > complain?  If we shouldn't quote nothrow that should be adjusted?
> > >
> > > I think -Wformat-diag complains because "nothrow" is an attribute; it also
> > > includes some other attribute names in the list of "keywords".
> > >
> > > I would also be fine with just removing the quotes and removing nothrow 
> > > from
> > > c_keywords.
> >
> > Like below?   Bootstrapped/regtested on x86_64-pc-linux-gnu.
> 
> Yes.  I assume that terms like "nothrow constructible" are used in the
> C++ standard?

I don't really see that.  [meta.unary.prop] says "known not to throw any
exceptions" for the _nothrow built-ins.  That may be too long to use in
diagnostics.  And the warning would probably complain about the unquoted
'throw' in it.  :)
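
To keep the three meanings discussed above apart, here is a minimal sketch. The types Quiet and Loud and the function read_via_nothrow_new are made up for this illustration and are not part of the patch: 'noexcept' is the language keyword, the library traits spell the same property "nothrow", and std::nothrow is only a placement argument for new.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// Hypothetical types used only for this illustration.
struct Quiet { Quiet() noexcept {} };  // constructor promises not to throw
struct Loud  { Loud() {} };            // user-provided constructor, may throw

// 1. 'noexcept' is the language keyword; its operator form queries the promise.
static_assert(noexcept(Quiet()), "Quiet() is declared noexcept");
static_assert(!noexcept(Loud()), "Loud() is not declared noexcept");

// 2. The library traits use the word "nothrow" for the same property.
static_assert(std::is_nothrow_default_constructible<Quiet>::value, "");
static_assert(!std::is_nothrow_default_constructible<Loud>::value, "");

// 3. std::nothrow is a placement argument: on allocation failure the
//    new-expression yields a null pointer instead of throwing.
int read_via_nothrow_new()
{
  int *p = new (std::nothrow) int(42);
  if (!p)
    return -1;  // allocation failed, nothing was thrown
  int v = *p;
  delete p;
  return v;
}
```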
 
> > Note that now I see warnings with my system compiler (gcc-12.2.1).  Can
> > I commit the c-format.cc hunk to gcc 12 so that eventually even gcc 12
> > stops warning?
> 
> Sure.

Thanks.

Marek



Re: [PATCH] Rewrite NAN and sign handling in frange

2022-09-27 Thread Mikael Morin

Hello,

On 16/09/2022 at 15:26, Aldy Hernandez via Gcc-patches wrote:

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index d759fcf178c..55a216efd8b 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -617,21 +602,24 @@ frange::contains_p (tree cst) const
   if (varying_p ())
 return true;
 

 (...)
 
   if (real_compare (GE_EXPR, rv, &m_min) && real_compare (LE_EXPR, rv, &m_max))

 {
+  // Make sure the signs are equal for signed zeros.
   if (HONOR_SIGNED_ZEROS (m_type) && real_iszero (rv))
-   {
- // FIXME: This is still using get_signbit() instead of
- // known_signbit() because the latter bails on possible NANs
- // (for now).
- if (get_signbit ().yes_p ())
-   return real_isneg (rv);
- else if (get_signbit ().no_p ())
-   return !real_isneg (rv);
- else
-   return true;
-   }
+   return m_min.sign == m_max.sign && m_min.sign == rv->sign;
   return true;
 }
   return false;


It seems that this won't report any range with mismatching bound signs 
as containing zero.

Maybe a selftest explains it better: the following fails.

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 9ca442478c9..8fc909171bc 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -3780,6 +3780,14 @@ range_tests_signed_zeros ()
   ASSERT_TRUE (r0.contains_p (neg_zero));
   ASSERT_FALSE (r0.contains_p (zero));

+  r0 = frange_float ("-3", "5");
+  ASSERT_TRUE (r0.contains_p (neg_zero));
+  ASSERT_TRUE (r0.contains_p (zero));
+
+  r0 = frange (neg_zero, zero);
+  ASSERT_TRUE (r0.contains_p (neg_zero));
+  ASSERT_TRUE (r0.contains_p (zero));
+
   // The intersection of zeros that differ in sign is a NAN (or
   // undefined if not honoring NANs).
   r0 = frange (neg_zero, neg_zero);
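
The concern can also be reproduced outside GCC with a toy model. The sketch below is not frange: toy_range, contains_as_posted and contains_alternative are hypothetical names, plain doubles stand in for REAL_VALUE_TYPE, and the "alternative" is only one possible reading of the intended semantics, not a proposed fix.

```cpp
#include <cassert>
#include <cmath>

// Toy stand-in for a floating-point range [lo, hi]; not GCC's frange.
struct toy_range { double lo, hi; };

// Mirrors the posted condition: a zero is contained only if both bounds
// have the same sign as each other and as the zero being tested.
bool contains_as_posted(const toy_range &r, double z)
{
  if (!(r.lo <= z && z <= r.hi))
    return false;
  return std::signbit(r.lo) == std::signbit(r.hi)
         && std::signbit(r.lo) == std::signbit(z);
}

// One possible alternative (an assumption): a zero is excluded only when it
// compares equal to a zero bound but lies on the wrong side of it.
bool contains_alternative(const toy_range &r, double z)
{
  if (!(r.lo <= z && z <= r.hi))
    return false;
  if (r.lo == 0.0 && !std::signbit(r.lo) && std::signbit(z))
    return false;  // [+0, x] does not contain -0
  if (r.hi == 0.0 && std::signbit(r.hi) && !std::signbit(z))
    return false;  // [x, -0] does not contain +0
  return true;
}
```

With this model, [-3, 5] and [-0, +0] are reported as not containing either zero by the posted condition, while the alternative accepts both zeros and still rejects +0 for [-0, -0].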



Re: [PATCH] c++: Make __is_{,nothrow_}convertible SFINAE on access [PR107049]

2022-09-27 Thread Jason Merrill via Gcc-patches

On 9/27/22 06:35, Jonathan Wakely wrote:

Tested powerpc64le-linux. OK for trunk?


OK, thanks.


-- >8 --

The is_convertible built-ins should return false if the conversion fails
an access check, not report an error.
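
As a user-level illustration of the expected behaviour, here is a minimal sketch using the standard trait rather than the built-in directly. The Public type is added only for contrast and is not part of the patch.

```cpp
#include <cassert>
#include <type_traits>

class Private
{
  operator int() const { return 0; }  // private conversion operator
};

struct Public
{
  operator int() const { return 0; }  // accessible conversion operator
};

// An inaccessible conversion makes the trait false; it is not a hard error.
static_assert(!std::is_convertible<Private, int>::value, "");
static_assert(std::is_convertible<Public, int>::value, "");
```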

PR c++/107049

gcc/cp/ChangeLog:

* method.cc (is_convertible_helper): Use access check sentinel.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible4.C: New test.
* g++.dg/ext/is_nothrow_convertible4.C: New test.

libstdc++-v3/ChangeLog:

* testsuite/20_util/is_convertible/requirements/access.cc: New
test.
---
  gcc/cp/method.cc  |  1 +
  gcc/testsuite/g++.dg/ext/is_convertible4.C| 33 +++
  .../g++.dg/ext/is_nothrow_convertible4.C  | 33 +++
  .../is_convertible/requirements/access.cc | 18 ++
  4 files changed, 85 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible4.C
  create mode 100644 gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
  create mode 100644 
libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index 9f917f13134..55af5c43c18 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2246,6 +2246,7 @@ is_convertible_helper (tree from, tree to)
  return integer_one_node;
cp_unevaluated u;
tree expr = build_stub_object (from);
+  deferring_access_check_sentinel acs (dk_no_deferred);
return perform_implicit_conversion (to, expr, tf_none);
  }
  
diff --git a/gcc/testsuite/g++.dg/ext/is_convertible4.C b/gcc/testsuite/g++.dg/ext/is_convertible4.C

new file mode 100644
index 000..8a7724c5852
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_convertible4.C
@@ -0,0 +1,33 @@
+// PR c++/107049
+// { dg-do compile { target c++11 } }
+// Failed access check should be a substitution failure, not an error.
+
+template<bool B>
+struct bool_constant { static constexpr bool value = B; };
+
+template<typename From, typename To>
+struct is_convertible
+: public bool_constant<__is_convertible(From, To)>
+{ };
+
+#if __cpp_variable_templates
+template<typename From, typename To>
+constexpr bool is_convertible_v = __is_convertible(From, To);
+#endif
+
+class Private
+{
+  operator int() const
+  {
+    static_assert( not is_convertible<Private, int>::value, "" );
+#if __cpp_variable_templates
+    static_assert( not is_convertible_v<Private, int>, "" );
+#endif
+    return 0;
+  }
+};
+
+static_assert( not is_convertible<Private, int>::value, "" );
+#if __cpp_variable_templates
+static_assert( not is_convertible_v<Private, int>, "" );
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C 
b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
new file mode 100644
index 000..f81b5944ca2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
@@ -0,0 +1,33 @@
+// PR c++/107049
+// { dg-do compile { target c++11 } }
+// Failed access check should be a substitution failure, not an error.
+
+template<bool B>
+struct bool_constant { static constexpr bool value = B; };
+
+template<typename From, typename To>
+struct is_nt_convertible
+: public bool_constant<__is_nothrow_convertible(From, To)>
+{ };
+
+#if __cpp_variable_templates
+template<typename From, typename To>
+constexpr bool is_nt_convertible_v = __is_nothrow_convertible(From, To);
+#endif
+
+class Private
+{
+  operator int() const
+  {
+    static_assert( not is_nt_convertible<Private, int>::value, "" );
+#if __cpp_variable_templates
+    static_assert( not is_nt_convertible_v<Private, int>, "" );
+#endif
+    return 0;
+  }
+};
+
+static_assert( not is_nt_convertible<Private, int>::value, "" );
+#if __cpp_variable_templates
+static_assert( not is_nt_convertible_v<Private, int>, "" );
+#endif
diff --git 
a/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc 
b/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc
new file mode 100644
index 000..04a8c525961
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc
@@ -0,0 +1,18 @@
+// { dg-do compile { target  c++11 } }
+// PR c++/107049
+
+#include <type_traits>
+
+class Private
+{
+  operator int() const
+  {
+    static_assert( not std::is_convertible<Private, int>::value, "" );
+#if __cpp_lib_type_trait_variable_templates
+    static_assert( not std::is_convertible_v<Private, int>, "" );
+#endif
+    return 0;
+  }
+};
+
+static_assert( not std::is_convertible<Private, int>::value, "" );




Re: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into BIT_FIELD_REFs alone

2022-09-27 Thread Richard Biener via Gcc-patches
On Mon, Sep 26, 2022 at 5:25 PM Andrew Pinski via Gcc-patches
 wrote:
>
> On Sun, Sep 25, 2022 at 9:56 PM Tamar Christina  
> wrote:
> >
> > > -Original Message-
> > > From: Andrew Pinski 
> > > Sent: Saturday, September 24, 2022 8:57 PM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de
> > > Subject: Re: [PATCH 1/2]middle-end Fold BIT_FIELD_REF and Shifts into
> > > BIT_FIELD_REFs alone
> > >
> > > > On Fri, Sep 23, 2022 at 4:43 AM Tamar Christina via Gcc-patches wrote:
> > > >
> > > > Hi All,
> > > >
> > > > This adds a match.pd rule that can fold right shifts and
> > > > bit_field_refs of integers into just a bit_field_ref by adjusting the
> > > > offset and the size of the extract and adds an extend to the previous 
> > > > size.
> > > >
> > > > Concretely turns:
> > > >
> > > > > #include <arm_neon.h>
> > > >
> > > > unsigned int foor (uint32x4_t x)
> > > > {
> > > > return x[1] >> 16;
> > > > }
> > > >
> > > > which used to generate:
> > > >
> > > >   _1 = BIT_FIELD_REF ;
> > > >   _3 = _1 >> 16;
> > > >
> > > > into
> > > >
> > > >   _4 = BIT_FIELD_REF ;
> > > >   _2 = (unsigned int) _4;
> > > >
> > > > I currently limit the rewrite to only doing it if the resulting
> > > > extract is in a mode the target supports. i.e. it won't rewrite it to
> > > > extract say 13-bits because I worry that for targets that won't have a
> > > > bitfield extract instruction this may be a de-optimization.
> > >
> > > It is only a de-optimization for the following case:
> > > * vector extraction
> > >
> > > All other cases should be handled correctly in the middle-end when
> > > expanding to RTL because they need to be handled for bit-fields anyways.
> > > Plus SIGN_EXTRACT and ZERO_EXTRACT would be used in the integer case
> > > for the RTL.
> > > Getting SIGN_EXTRACT/ZERO_EXTRACT early on in the RTL is better than
> > > waiting until combine really.
> > >
> >
> > Fair enough, I've dropped the constraint.
>
> Well the constraint should be done still for VECTOR_TYPE I think.
> Attached is what I had done for left shift for integer types.
> Note the BYTES_BIG_ENDIAN part which you missed for the right shift case.

Note we formerly had BIT_FIELD_REF_UNSIGNED and allowed the precision
of the TREE_TYPE of the BIT_FIELD_REF to not match the extracted size.  That
might have mapped directly to zero/sign_extract.

Now that this is no more we should think of a canonical way to express this
and make sure we can synthesize those early.

Richard.
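
For readers following the quoted patch, the arithmetic behind the fold can be sketched with plain integers. This is only an illustration under stated assumptions: extract_bits and shift_equals_narrow_extract are hypothetical helpers, bits are numbered from the least significant end, and the BYTES_BIG_ENDIAN handling mentioned in the thread is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: zero-extended extraction of 'size' bits starting at
// bit 'offset', where bit 0 is the least significant bit.
std::uint32_t extract_bits(std::uint32_t value, unsigned offset, unsigned size)
{
  assert(size >= 1 && offset + size <= 32);
  const std::uint64_t mask = (std::uint64_t{1} << size) - 1;
  return static_cast<std::uint32_t>((value >> offset) & mask);
}

// A right shift by 'shift' keeps the top (32 - shift) bits, which is the same
// as a narrower extract at a larger offset followed by a zero extension.
bool shift_equals_narrow_extract(std::uint32_t value, unsigned shift)
{
  assert(shift < 32);
  return (value >> shift) == extract_bits(value, shift, 32 - shift);
}
```

For a 32-bit lane shifted right by 16, this is a 16-bit extract at offset 16 within the lane, widened back to 32 bits.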

> Thanks,
> Andrew Pinski
>
> >
> > >
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > > > and no issues.
> > > >
> > > > Testcase are added in patch 2/2.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * match.pd: Add bitfield and shift folding.
> > > >
> > > > --- inline copy of patch --
> > > > diff --git a/gcc/match.pd b/gcc/match.pd index
> > > >
> > > 1d407414bee278c64c00d425d9f025c1c58d853d..b225d36dc758f1581502c8d03
> > > 761
> > > > 544bfd499c01 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -7245,6 +7245,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > >&& ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P
> > > (TREE_TYPE(@0)))
> > > >(IFN_REDUC_PLUS_WIDEN @0)))
> > > >
> > > > +/* Canonicalize BIT_FIELD_REFS and shifts to BIT_FIELD_REFS.  */ (for
> > > > +shift (rshift)
> > > > + op (plus)
> > > > + (simplify
> > > > +  (shift (BIT_FIELD_REF @0 @1 @2) integer_pow2p@3)
> > > > +  (if (INTEGRAL_TYPE_P (type))
> > > > +   (with { /* Can't use wide-int here as the precision differs between
> > > > + @1 and @3.  */
> > > > +  unsigned HOST_WIDE_INT size = tree_to_uhwi (@1);
> > > > +  unsigned HOST_WIDE_INT shiftc = tree_to_uhwi (@3);
> > > > +  unsigned HOST_WIDE_INT newsize = size - shiftc;
> > > > +  tree nsize = wide_int_to_tree (bitsizetype, newsize);
> > > > +  tree ntype
> > > > += build_nonstandard_integer_type (newsize, 1); }
> > >
> > > Maybe use `build_nonstandard_integer_type (newsize, /* unsignedp = */
> > > true);` or better yet `build_nonstandard_integer_type (newsize,
> > > UNSIGNED);`
> >
> > Ah, will do,
> > Tamar.
> >
> > >
> > > I had started to convert some of the unsignedp into enum signop but I 
> > > never
> > > finished or submitted the patch.
> > >
> > > Thanks,
> > > Andrew Pinski
> > >
> > >
> > > > +(if (ntype)
> > > > + (convert:type (BIT_FIELD_REF:ntype @0 { nsize; } (op @2
> > > > + @3
> > > > +
> > > >  (simplify
> > > >   (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
> > > >   (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4);
> > > > }))
> > > >
> > > >
> > > >
> > > >
> > > > --


Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-09-27 Thread Richard Biener via Gcc-patches
On Mon, 26 Sep 2022, Andre Vieira (lists) wrote:

> 
> On 08/09/2022 12:51, Richard Biener wrote:
> >
> > I'm curious, why the push to redundant_ssa_names?  That could use
> > a comment ...
> So I purposefully left a #if 0 #else #endif in there so you can see the two
> options. But the reason I used redundant_ssa_names is because ifcvt seems to
> use that as a container for all pairs of (old, new) ssa names to replace
> later. So I just piggy backed on that. I don't know if there's a specific
> reason they do the replacement at the end? Maybe some ordering issue? Either
> way both adding it to redundant_ssa_names or doing the replacement inline work
> for the bitfield lowering (or work in my testing at least).

Possibly because we (in the past?) inserted/copied stuff based on
predicates generated at analysis time after we decide to elide something
so we need to watch for later appearing uses.  But who knows ... my mind
fails me here.

If it works to replace uses immediately please do so.  But now
I wonder why we need this - the value shouldn't change so you
should get away with re-using the existing SSA name for the final value?

> > Note I fear we will have endianness issues when translating
> > bit-field accesses to BIT_FIELD_REF/INSERT and then to shifts.  Rules
> > for memory and register operations do not match up (IIRC, I repeatedly
> > run into issues here myself).  The testcases all look like they
> > won't catch this - I think an example would be sth like
> > struct X { unsigned a : 23; unsigned b : 9; }, can you see to do
> > testing on a big-endian target?
> I've done some testing and you were right, it did fall apart on big-endian. I
> fixed it by changing the way we compute the 'shift' value and added two extra
> testcases for read and write each.
> >
> > Sorry for the delay in reviewing.
> No worries, apologies myself for the delay in reworking this, had a nice
> little week holiday in between :)
> 
> I'll write the ChangeLogs once the patch has stabilized.

Thanks,
Richard.
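
To make the endianness concern concrete, here is a small model of how the shift for a bit-field could depend on byte order. It assumes the common convention that the first-declared field occupies the least significant bits on little-endian targets and the most significant bits on big-endian targets; real layouts are ABI-defined, and field_shift is a hypothetical helper, not code from the patch.

```cpp
#include <cassert>

// Hypothetical model: the right-shift needed to bring a bit-field down to
// bit 0 of its container, given its declaration-order bit offset.
unsigned field_shift(unsigned container_bits, unsigned bit_offset,
                     unsigned field_bits, bool big_endian)
{
  assert(bit_offset + field_bits <= container_bits);
  return big_endian ? container_bits - bit_offset - field_bits
                    : bit_offset;
}
```

For struct X { unsigned a : 23; unsigned b : 9; } in a 32-bit container, this model gives shifts of 0 and 23 for a and b on little-endian, but 9 and 0 on big-endian.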


Re: [PATCH] c++: Make __is_{,nothrow_}convertible SFINAE on access [PR107049]

2022-09-27 Thread Marek Polacek via Gcc-patches
On Tue, Sep 27, 2022 at 11:35:10AM +0100, Jonathan Wakely wrote:
> Tested powerpc64le-linux. OK for trunk?
> 
> -- >8 --
> 
> The is_convertible built-ins should return false if the conversion fails
> an access check, not report an error.

Ah, so we do need that sentinel after all.

Patch looks good, thanks.
 
>   PR c++/107049
> 
> gcc/cp/ChangeLog:
> 
>   * method.cc (is_convertible_helper): Use access check sentinel.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/ext/is_convertible4.C: New test.
>   * g++.dg/ext/is_nothrow_convertible4.C: New test.
> 
> libstdc++-v3/ChangeLog:
> 
>   * testsuite/20_util/is_convertible/requirements/access.cc: New
>   test.
> ---
>  gcc/cp/method.cc  |  1 +
>  gcc/testsuite/g++.dg/ext/is_convertible4.C| 33 +++
>  .../g++.dg/ext/is_nothrow_convertible4.C  | 33 +++
>  .../is_convertible/requirements/access.cc | 18 ++
>  4 files changed, 85 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible4.C
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
>  create mode 100644 
> libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc
> 
> diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
> index 9f917f13134..55af5c43c18 100644
> --- a/gcc/cp/method.cc
> +++ b/gcc/cp/method.cc
> @@ -2246,6 +2246,7 @@ is_convertible_helper (tree from, tree to)
>  return integer_one_node;
>cp_unevaluated u;
>tree expr = build_stub_object (from);
> +  deferring_access_check_sentinel acs (dk_no_deferred);
>return perform_implicit_conversion (to, expr, tf_none);
>  }
>  
> diff --git a/gcc/testsuite/g++.dg/ext/is_convertible4.C 
> b/gcc/testsuite/g++.dg/ext/is_convertible4.C
> new file mode 100644
> index 000..8a7724c5852
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_convertible4.C
> @@ -0,0 +1,33 @@
> +// PR c++/107049
> +// { dg-do compile { target c++11 } }
> +// Failed access check should be a substitution failure, not an error.
> +
> +template<bool B>
> +struct bool_constant { static constexpr bool value = B; };
> +
> +template<typename From, typename To>
> +struct is_convertible
> +: public bool_constant<__is_convertible(From, To)>
> +{ };
> +
> +#if __cpp_variable_templates
> +template<typename From, typename To>
> +constexpr bool is_convertible_v = __is_convertible(From, To);
> +#endif
> +
> +class Private
> +{
> +  operator int() const
> +  {
> +    static_assert( not is_convertible<Private, int>::value, "" );
> +#if __cpp_variable_templates
> +    static_assert( not is_convertible_v<Private, int>, "" );
> +#endif
> +    return 0;
> +  }
> +};
> +
> +static_assert( not is_convertible<Private, int>::value, "" );
> +#if __cpp_variable_templates
> +static_assert( not is_convertible_v<Private, int>, "" );
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C 
> b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
> new file mode 100644
> index 000..f81b5944ca2
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
> @@ -0,0 +1,33 @@
> +// PR c++/107049
> +// { dg-do compile { target c++11 } }
> +// Failed access check should be a substitution failure, not an error.
> +
> +template<bool B>
> +struct bool_constant { static constexpr bool value = B; };
> +
> +template<typename From, typename To>
> +struct is_nt_convertible
> +: public bool_constant<__is_nothrow_convertible(From, To)>
> +{ };
> +
> +#if __cpp_variable_templates
> +template<typename From, typename To>
> +constexpr bool is_nt_convertible_v = __is_nothrow_convertible(From, To);
> +#endif
> +
> +class Private
> +{
> +  operator int() const
> +  {
> +    static_assert( not is_nt_convertible<Private, int>::value, "" );
> +#if __cpp_variable_templates
> +    static_assert( not is_nt_convertible_v<Private, int>, "" );
> +#endif
> +    return 0;
> +  }
> +};
> +
> +static_assert( not is_nt_convertible<Private, int>::value, "" );
> +#if __cpp_variable_templates
> +static_assert( not is_nt_convertible_v<Private, int>, "" );
> +#endif
> diff --git 
> a/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc 
> b/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc
> new file mode 100644
> index 000..04a8c525961
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc
> @@ -0,0 +1,18 @@
> +// { dg-do compile { target  c++11 } }
> +// PR c++/107049
> +
> +#include <type_traits>
> +
> +class Private
> +{
> +  operator int() const
> +  {
> +    static_assert( not std::is_convertible<Private, int>::value, "" );
> +#if __cpp_lib_type_trait_variable_templates
> +    static_assert( not std::is_convertible_v<Private, int>, "" );
> +#endif
> +    return 0;
> +  }
> +};
> +
> +static_assert( not std::is_convertible<Private, int>::value, "" );
> -- 
> 2.37.3
> 

Marek



Re: [PATCH] c++ modules: ICE with class NTTP argument [PR100616]

2022-09-27 Thread Nathan Sidwell via Gcc-patches

On 9/26/22 15:05, Patrick Palka wrote:

On Mon, 26 Sep 2022, Patrick Palka wrote:


On Mon, 26 Sep 2022, Nathan Sidwell wrote:


On 9/26/22 10:08, Nathan Sidwell wrote:

On 9/23/22 09:32, Patrick Palka wrote:


Judging by the two commits that introduced/modified this part of
maybe_register_incomplete_var, r196852 and r214333, ISTM the code
is really only concerned with constexpr static data members (whose
initializer may contain a pointer-to-member for a currently open class).
So maybe we ought to restrict the branch like so, which effectively
disables this part of maybe_register_incomplete_var during stream-in, and
guarantees that outermost_open_class doesn't return NULL if the branch is
taken?


I think the problem is that we're streaming these VAR_DECLs as regular
VAR_DECLS, when we should be handling them as a new kind of object fished
out from the template they're instantiating. (I'm guessing that'll just be a
new tag, a type and an initializer?)

Then on stream-in we can handle them in the same way as a non-modules
compilation handles such redeclarations.  I.e. how does:

template<auto> struct C { };
struct A { };
C<A{}> c1; // #1
C<A{}> c2; // #2

work.  Presumably at some point #2's A{} gets unified such that we find the
instantiation that occurred at #1?


This works because the lookup in get_template_parm_object for #2's A{}
finds and reuses the VAR_DECL created for #1's A{}.

But IIUC this lookup (performed via get_global_binding) isn't
import-aware, which I suppose explains why we don't find the VAR_DECL
from another TU.



I notice the template arg for C is a var decl mangled as _ZTAXtl1AEE,
which is a 'template parameter object for A{}'.  I see that's a special
mangler 'mangle_template_parm_object', called from
get_template_parm_object.  Perhaps these VAR_DECLs need an additional
in-tree flag that the streamer can check for?


I wonder if we're setting the module attachment for these variables sanely?
They should be attached to the global module.  My guess is the
pushdecl_top_level_and_finish call in get_template_parm_object is not doing
what is needed (as well as the other issues).


This is a bit of a shot in the dark, but the following seems to work:
when pushing the VAR_DECL, we need to call set_originating_module to
attach it to the global module, and when looking it up, we need to do so
in an import-aware way.  Hopefully something like this is sufficient
to properly handle these VAR_DECLs and we don't need to stream them
specially?


Err, rather than changing the behavior of get_namespace_binding (which
has many unrelated callers), I guess we could just use the already
import-aware lookup_qualified_name instead where appropriate.  WDYT of
the following? (testing in progress)


I'm going to have to think further.  Morally these VAR_DECLs are like
the typeinfo objects -- which we have to handle specially.  Reminding
myself, I see rtti.cc does the pushdecl_top_level stuff creating them --
so they go into the slot for the current TU.  But the streaming code
writes tt_tinfo_var/tt_tinfo_typedef records, and recreates the typeinfo
on stream in, using the same pushdecl_top_level path as above.  So even
though the tinfo decls might seem attached to the current module, that
doesn't confuse the streaming, nor create collisions on read back.  Nor
do we stream out tinfo decls that are not reachable through the streamed
AST (if we treated them as normal decls, we'd stream them because they're
in the current TU in the symbol table.  I have the same fear for these
NTTPs.)


It looks like TREE_LANG_FLAG_5 can be used to note these VAR_DECLs are 
NTTPs, and then the streaming can deal with them.  Let me look further.



@@ -7307,6 +7307,7 @@ get_template_parm_object (tree expr, tsubst_flags_t complain)
hash_map_safe_put (tparm_obj_values, decl, copy);
  }
  
+  set_originating_module (decl);

pushdecl_top_level_and_finish (decl, expr);


This is wrong.  You're attaching the decl to the current module, which
will mean conflicts when reading in such VAR_DECLs for the same NTTP
from different modules.  Your test case might be hiding that as you have
an interface and implementation unit from the same module (but you
should be getting some kind of multiple definition error anyway?)



  
return decl;

@@ -29150,9 +29151,10 @@ finish_concept_definition (cp_expr id, tree init)
  static tree
  listify (tree arg)
  {
-  tree std_init_list = get_namespace_binding (std_node, init_list_identifier);
+  tree std_init_list = lookup_qualified_name (std_node, init_list_identifier);
  
-  if (!std_init_list || !DECL_CLASS_TEMPLATE_P (std_init_list))

+  if (std_init_list == error_mark_node
+  || !DECL_CLASS_TEMPLATE_P (std_init_list))
  {
gcc_rich_location richloc (input_location);
maybe_add_include_fixit (&richloc, "<initializer_list>", false);
diff --git a/gcc/testsuite/g++.dg/modules/pr100616_a.C 
b/gcc/testsuite/g++.dg/modules/pr100616_a.C
new file mode 100644
index 000..788af2eb533
--- /dev/null
+++ 

Re: [PATCH v2] LoongArch: Libvtv add loongarch support.

2022-09-27 Thread Xi Ruoyao via Gcc-patches
On Tue, 2022-09-27 at 15:49 +0800, Lulu Cheng wrote:
>  #if defined (__CYGWIN__) || defined (__MINGW32__)
>    if (VTV_PAGE_SIZE != sysconf_SC_PAGE_SIZE())
> +#elif defined (__loongarch_lp64)
> +  /* I think that under the LoongArch 64-bit system, VTV_PAGE_SIZE is set
> + to the maximum value of 64K supported by the system, so there is no
> + need to judge here.  */
> +  if (false)

I think "if (false)" can trigger some compiler warnings...

Still not sure if the maximum value is always correct (+ Caroline for a
confirmation).  If it's correct I'd suggest...

>  #else
>    if (VTV_PAGE_SIZE != sysconf (_SC_PAGE_SIZE))

if (VTV_PAGE_SIZE % sysconf (_SC_PAGE_SIZE) != 0)

>  #endif
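
The difference between the two checks can be sketched as follows. This is a hypothetical illustration, not libvtv code: kAssumedMaxPageSize stands in for VTV_PAGE_SIZE with an assumed value of 64K, and the helper names are made up.

```cpp
#include <cassert>
#include <unistd.h>

// Hypothetical stand-in for VTV_PAGE_SIZE, assumed here to be 64K.
constexpr long kAssumedMaxPageSize = 64 * 1024;

// The existing style of check: the compile-time value must equal the
// page size reported at run time.
bool matches_exactly(long runtime_page_size)
{
  return kAssumedMaxPageSize == runtime_page_size;
}

// The suggested style of check: the compile-time value only needs to be a
// whole multiple of the run-time page size.
bool is_whole_multiple(long runtime_page_size)
{
  return runtime_page_size > 0
         && kAssumedMaxPageSize % runtime_page_size == 0;
}

// Applies the multiple-based check to the system this runs on.
bool current_system_ok()
{
  return is_whole_multiple(sysconf(_SC_PAGE_SIZE));
}
```

With these assumptions, a system using 4K or 16K pages fails the exact comparison but passes the multiple-based one.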

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[committed] libstdc++: Adjust deduction guides for static operator() [PR106651]

2022-09-27 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

Adjust the deduction guides for std::function and std::packaged_task to
work with static call operators. This finishes the implementation of
P1169R4 for C++23.
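
A minimal sketch of what this enables at the user level. The types Plain and Static and the two functions are made up for illustration; the static case is guarded by the feature-test macro and additionally assumes a standard library that already contains this change.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Ordinary call operator: deduction already worked before this change.
struct Plain { long operator()() const { return 1; } };

long call_plain()
{
  std::function f = Plain{};  // class template argument deduction (C++17)
  static_assert(std::is_same<decltype(f), std::function<long()>>::value, "");
  return f();
}

#if defined(__cpp_static_call_operator) && __cpp_static_call_operator >= 202207L
// Static call operator (C++23): &Static::operator() is a plain function
// pointer, so the deduction guide needs the adjusted helper.
struct Static { static long operator()() { return 2; } };

long call_static()
{
  std::function f = Static{};
  static_assert(std::is_same<decltype(f), std::function<long()>>::value, "");
  return f();
}
#endif
```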

libstdc++-v3/ChangeLog:

PR c++/106651
* include/bits/std_function.h (__function_guide_t): New alias
template.
[__cpp_static_call_operator] (__function_guide_static_helper):
New class template.
(function): Use __function_guide_t in deduction guide.
* include/std/future (packaged_task): Use __function_guide_t in
deduction guide.
* testsuite/20_util/function/cons/deduction_c++23.cc: New test.
* testsuite/30_threads/packaged_task/cons/deduction_c++23.cc:
New test.
---
 libstdc++-v3/include/bits/std_function.h  | 25 ---
 libstdc++-v3/include/std/future   |  4 +--
 .../20_util/function/cons/deduction_c++23.cc  | 23 +
 .../packaged_task/cons/deduction_c++23.cc | 23 +
 4 files changed, 70 insertions(+), 5 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/20_util/function/cons/deduction_c++23.cc
 create mode 100644 
libstdc++-v3/testsuite/30_threads/packaged_task/cons/deduction_c++23.cc

diff --git a/libstdc++-v3/include/bits/std_function.h 
b/libstdc++-v3/include/bits/std_function.h
index 96918a04a35..f5423a3a5c7 100644
--- a/libstdc++-v3/include/bits/std_function.h
+++ b/libstdc++-v3/include/bits/std_function.h
@@ -697,12 +697,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 >
 { using type = _Res(_Args...); };
 
+#if __cpp_static_call_operator >= 202207L && __cpp_concepts >= 202002L
+  template
+struct __function_guide_static_helper
+{ };
+
+  template
+struct __function_guide_static_helper<_Res (*) (_Args...) noexcept(_Nx)>
+{ using type = _Res(_Args...); };
+
+  template
+using __function_guide_t = typename __conditional_t<
+  requires (_Fn& __f) { (void) __f.operator(); },
+  __function_guide_static_helper<_Op>,
+  __function_guide_helper<_Op>>::type;
+#else
+  template
+using __function_guide_t = typename __function_guide_helper<_Op>::type;
+#endif
+
   template
 function(_Res(*)(_ArgTypes...)) -> function<_Res(_ArgTypes...)>;
 
-  template::type>
-function(_Functor) -> function<_Signature>;
+  template>
+function(_Fn) -> function<_Signature>;
 #endif
 
   // [20.7.15.2.6] null pointer comparisons
diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index a1b2d7f1d3a..cf08c155a24 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -1649,8 +1649,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 packaged_task(_Res(*)(_ArgTypes...)) -> packaged_task<_Res(_ArgTypes...)>;
 
-  template::type>
+  template>
 packaged_task(_Fun) -> packaged_task<_Signature>;
 #endif
 
diff --git a/libstdc++-v3/testsuite/20_util/function/cons/deduction_c++23.cc 
b/libstdc++-v3/testsuite/20_util/function/cons/deduction_c++23.cc
new file mode 100644
index 000..17454ea4108
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/function/cons/deduction_c++23.cc
@@ -0,0 +1,23 @@
+// { dg-options "-std=gnu++23" }
+// { dg-do compile { target c++23 } }
+
+#include 
+
+template struct require_same;
+template struct require_same { using type = void; };
+
+template
+  typename require_same::type
+  check_type(U&) { }
+
+void
+test_static_call_operator()
+{
+  struct F1 { static long operator()() { return 0; } };
+  std::function func1 = F1{};
+  check_type>(func1);
+
+  struct F2 { static float operator()(char, void*) noexcept { return 0; } };
+  std::function func2 = F2{};
+  check_type>(func2);
+}
diff --git 
a/libstdc++-v3/testsuite/30_threads/packaged_task/cons/deduction_c++23.cc 
b/libstdc++-v3/testsuite/30_threads/packaged_task/cons/deduction_c++23.cc
new file mode 100644
index 000..e36edfa0359
--- /dev/null
+++ b/libstdc++-v3/testsuite/30_threads/packaged_task/cons/deduction_c++23.cc
@@ -0,0 +1,23 @@
+// { dg-options "-std=gnu++23" }
+// { dg-do compile { target c++23 } }
+
+#include 
+
+template struct require_same;
+template struct require_same { using type = void; };
+
+template
+  typename require_same::type
+  check_type(U&) { }
+
+void
+test_static_call_operator()
+{
+  struct F1 { static long operator()() { return 0; } };
+  std::packaged_task task1{ F1{} };
+  check_type>(task1);
+
+  struct F2 { static float operator()(char, void*) noexcept { return 0; } };
+  std::packaged_task task2{ F2{} };
+  check_type>(task2);
+}
-- 
2.37.3



[committed] fixincludes: Fix up for Debian/Ubuntu includes

2022-09-27 Thread Jakub Jelinek via Gcc-patches
Hi!

As reported by Tobias, my C++ _Float{16,32,64,128,32x,64x,128x} support
patch broke Debian/Ubuntu bootstraps.  The problem is that there the
glibc headers bits/floatn.h and bits/floatn-common.h aren't in /usr/include/
directly, but in a subdirectory like /usr/include/x86_64-linux-gnu/.
Other fixincludes rules for bits/* headers seem to use
files = bits/whatever.h, "*/bits/whatever.h";
so this patch just follows suit.
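
To illustrate why the extra patterns matter, here is a minimal sketch. It assumes the file lists are applied with fnmatch-style globbing to paths relative to the include directory (an assumption about fixincludes, not something stated above); `matches_any`, `old_files` and `new_files` are hypothetical names used only for this example.

```cpp
#include <fnmatch.h>  // POSIX glob matching; not available on every platform

// Returns true if `path` (relative to an include root) matches any of the
// patterns in a hypothetical "files" list.
inline bool matches_any(const char* const* patterns, int count, const char* path) {
  for (int i = 0; i < count; ++i)
    if (fnmatch(patterns[i], path, 0) == 0)  // 0 means "matched"
      return true;
  return false;
}

// Hypothetical pattern lists mirroring the old and new "files" entries.
static const char* const old_files[] = {"bits/floatn.h", "bits/floatn-common.h"};
static const char* const new_files[] = {"bits/floatn.h", "bits/floatn-common.h",
                                        "*/bits/floatn.h", "*/bits/floatn-common.h"};
```

Under that assumption, a multiarch path such as x86_64-linux-gnu/bits/floatn.h is missed by the old list and picked up by the new one.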

Lightly tested on x86_64-linux (Fedora) and Tobias has tested it
on Ubuntu, committed to trunk as obvious to unbreak build for those
using Debian/Ubuntu.

2022-06-27  Jakub Jelinek  

* inclhack.def (glibc_cxx_floatn_1, glibc_cxx_floatn_2,
glibc_cxx_floatn_3): Add to files also "*/bits/floatn.h"
and "*/bits/floatn-common.h".
* fixincl.x: Regenerated.

--- fixincludes/inclhack.def.jj 2022-09-27 08:03:27.183981867 +0200
+++ fixincludes/inclhack.def2022-09-27 12:21:34.215129010 +0200
@@ -2021,7 +2021,7 @@ fix = {
  */
 fix = {
 hackname  = glibc_cxx_floatn_1;
-files = bits/floatn.h, bits/floatn-common.h;
+files = bits/floatn.h, bits/floatn-common.h, "*/bits/floatn.h", 
"*/bits/floatn-common.h";
 select= "^([ \t]*#[ \t]*if !__GNUC_PREREQ \\(7, 0\\) \\|\\| )defined 
__cplusplus\n"
"(([ \t]*/\\*[^\n]*\\*/\n)?"
"([ \t]*#[ \t]*if[^\n]*\n)?"
@@ -2059,7 +2059,7 @@ fix = {
 
 fix = {
 hackname  = glibc_cxx_floatn_2;
-files = bits/floatn.h, bits/floatn-common.h;
+files = bits/floatn.h, bits/floatn-common.h, "*/bits/floatn.h", 
"*/bits/floatn-common.h";
 select= "^([ \t]*#[ \t]*if !__GNUC_PREREQ \\(7, 0\\) \\|\\| )defined 
__cplusplus\n"
"(([ \t]*/\\*[^\n]*\\*/\n)?"
"[ \t]*typedef[ \t]+[^\n]*[ \t]+_Float(16|32|64|128)x?([ 
\t]+__attribute__ \\(\\(__mode__ \\(__HF__\\)\\)\\))?;)";
@@ -2077,7 +2077,7 @@ fix = {
 
 fix = {
 hackname  = glibc_cxx_floatn_3;
-files = bits/floatn.h, bits/floatn-common.h;
+files = bits/floatn.h, bits/floatn-common.h, "*/bits/floatn.h", 
"*/bits/floatn-common.h";
 select= "^([ \t]*#[ \t]*if !__GNUC_PREREQ \\(7, 0\\) \\|\\| )defined 
__cplusplus\n"
"(([ \t]*/\\*[^\n]*\n?[^\n]*\\*/\n)?"
"([ \t]*#[ \t]*if[^\n]*\n)?"
--- fixincludes/fixincl.x.jj2022-09-27 08:03:27.189981786 +0200
+++ fixincludes/fixincl.x   2022-09-27 12:21:44.813904865 +0200
@@ -2,11 +2,11 @@
  *
  * DO NOT EDIT THIS FILE   (fixincl.x)
  *
- * It has been AutoGen-ed  September 27, 2022 at 12:49:21 AM by AutoGen 5.18.16
+ * It has been AutoGen-ed  September 27, 2022 at 12:21:44 PM by AutoGen 5.18.16
  * From the definitionsinclhack.def
  * and the template file   fixincl
  */
-/* DO NOT SVN-MERGE THIS FILE, EITHER Tue Sep 27 00:49:21 CEST 2022
+/* DO NOT SVN-MERGE THIS FILE, EITHER Tue Sep 27 12:21:44 CEST 2022
  *
  * You must regenerate it.  Use the ./genfixes script.
  *
@@ -4116,7 +4116,7 @@ tSCC zGlibc_Cxx_Floatn_1Name[] =
  *  File name selection pattern
  */
 tSCC zGlibc_Cxx_Floatn_1List[] =
-  "bits/floatn.h\0bits/floatn-common.h\0";
+  
"bits/floatn.h\0bits/floatn-common.h\0*/bits/floatn.h\0*/bits/floatn-common.h\0";
 /*
  *  Machine/OS name selection pattern
  */
@@ -4157,7 +4157,7 @@ tSCC zGlibc_Cxx_Floatn_2Name[] =
  *  File name selection pattern
  */
 tSCC zGlibc_Cxx_Floatn_2List[] =
-  "bits/floatn.h\0bits/floatn-common.h\0";
+  
"bits/floatn.h\0bits/floatn-common.h\0*/bits/floatn.h\0*/bits/floatn-common.h\0";
 /*
  *  Machine/OS name selection pattern
  */
@@ -4197,7 +4197,7 @@ tSCC zGlibc_Cxx_Floatn_3Name[] =
  *  File name selection pattern
  */
 tSCC zGlibc_Cxx_Floatn_3List[] =
-  "bits/floatn.h\0bits/floatn-common.h\0";
+  
"bits/floatn.h\0bits/floatn-common.h\0*/bits/floatn.h\0*/bits/floatn-common.h\0";
 /*
  *  Machine/OS name selection pattern
  */

Jakub



[PATCH] c++: Make __is_{, nothrow_}convertible SFINAE on access [PR107049]

2022-09-27 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. OK for trunk?

-- >8 --

The is_convertible built-ins should return false if the conversion fails
an access check, not report an error.
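
A minimal sketch of the intended behaviour, written against the standard std::is_convertible trait rather than the built-in directly. `Private` follows the naming used in the new tests; `Public` is a hypothetical counterpart added only for contrast.

```cpp
#include <type_traits>

class Private {
  // Private conversion operator: inaccessible from outside the class.
  operator int() const { return 0; }
};

class Public {
public:
  // Accessible conversion operator.
  operator int() const { return 0; }
};

// A failed access check should simply yield "not convertible".
static_assert(!std::is_convertible<Private, int>::value,
              "inaccessible conversion is reported as false");
static_assert(std::is_convertible<Public, int>::value,
              "accessible conversion is reported as true");
```

The point is that the first assertion compiles cleanly and evaluates to false instead of producing a hard error about the private member.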

PR c++/107049

gcc/cp/ChangeLog:

* method.cc (is_convertible_helper): Use access check sentinel.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_convertible4.C: New test.
* g++.dg/ext/is_nothrow_convertible4.C: New test.

libstdc++-v3/ChangeLog:

* testsuite/20_util/is_convertible/requirements/access.cc: New
test.
---
 gcc/cp/method.cc  |  1 +
 gcc/testsuite/g++.dg/ext/is_convertible4.C| 33 +++
 .../g++.dg/ext/is_nothrow_convertible4.C  | 33 +++
 .../is_convertible/requirements/access.cc | 18 ++
 4 files changed, 85 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_convertible4.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
 create mode 100644 
libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index 9f917f13134..55af5c43c18 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2246,6 +2246,7 @@ is_convertible_helper (tree from, tree to)
 return integer_one_node;
   cp_unevaluated u;
   tree expr = build_stub_object (from);
+  deferring_access_check_sentinel acs (dk_no_deferred);
   return perform_implicit_conversion (to, expr, tf_none);
 }
 
diff --git a/gcc/testsuite/g++.dg/ext/is_convertible4.C 
b/gcc/testsuite/g++.dg/ext/is_convertible4.C
new file mode 100644
index 000..8a7724c5852
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_convertible4.C
@@ -0,0 +1,33 @@
+// PR c++/107049
+// { dg-do compile { target c++11 } }
+// Failed access check should be a substitution failure, not an error.
+
+template
+struct bool_constant { static constexpr bool value = B; };
+
+template
+struct is_convertible
+: public bool_constant<__is_convertible(From, To)>
+{ };
+
+#if __cpp_variable_templates
+template
+constexpr bool is_convertible_v = __is_convertible(From, To);
+#endif
+
+class Private
+{
+  operator int() const
+  {
+static_assert( not is_convertible::value, "" );
+#if __cpp_variable_templates
+static_assert( not is_convertible_v, "" );
+#endif
+return 0;
+  }
+};
+
+static_assert( not is_convertible::value, "" );
+#if __cpp_variable_templates
+static_assert( not is_convertible_v, "" );
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C 
b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
new file mode 100644
index 000..f81b5944ca2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_nothrow_convertible4.C
@@ -0,0 +1,33 @@
+// PR c++/107049
+// { dg-do compile { target c++11 } }
+// Failed access check should be a substitution failure, not an error.
+
+template
+struct bool_constant { static constexpr bool value = B; };
+
+template
+struct is_nt_convertible
+: public bool_constant<__is_nothrow_convertible(From, To)>
+{ };
+
+#if __cpp_variable_templates
+template
+constexpr bool is_nt_convertible_v = __is_nothrow_convertible(From, To);
+#endif
+
+class Private
+{
+  operator int() const
+  {
+static_assert( not is_nt_convertible::value, "" );
+#if __cpp_variable_templates
+static_assert( not is_nt_convertible_v, "" );
+#endif
+return 0;
+  }
+};
+
+static_assert( not is_nt_convertible::value, "" );
+#if __cpp_variable_templates
+static_assert( not is_nt_convertible_v, "" );
+#endif
diff --git 
a/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc 
b/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc
new file mode 100644
index 000..04a8c525961
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/is_convertible/requirements/access.cc
@@ -0,0 +1,18 @@
+// { dg-do compile { target  c++11 } }
+// PR c++/107049
+
+#include 
+
+class Private
+{
+  operator int() const
+  {
+static_assert( not std::is_convertible::value, "" );
+#if __cpp_lib_type_trait_variable_templates
+static_assert( not std::is_convertible_v, "" );
+#endif
+return 0;
+  }
+};
+
+static_assert( not std::is_convertible::value, "" );
-- 
2.37.3



[committed] d: Merge upstream dmd d579c467c1, phobos 88aa69b14.

2022-09-27 Thread ibuclaw--- via Gcc-patches
Hi,

This patch merges the D front-end/run-time library with upstream dmd
d579c467c1, and standard library with phobos 88aa69b14.

D front-end changes:

- Throwing from contracts of `nothrow' functions has been
  deprecated, as this breaks the guarantees of `nothrow'.
- Added language support for initializing the interior pointer of
  associative arrays using the `new' keyword.

Phobos changes:

- The std.digest.digest module has been removed.
- The std.xml module has been removed.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32,
committed to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd d579c467c1.
* decl.cc (layout_struct_initializer): Update for new front-end
interface.
* expr.cc (ExprVisitor::visit (AssignExp *)): Remove lowering of array
assignments.
(ExprVisitor::visit (NewExp *)): Add new lowering of new'ing
associative arrays to an _aaNew() library call.
* runtime.def (ARRAYSETASSIGN): Remove.
(AANEW): Define.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime d579c467c1.
* libdruntime/Makefile.am (DRUNTIME_DSOURCES): Remove
rt/arrayassign.d.
* libdruntime/Makefile.in: Regenerate.
* src/MERGE: Merge upstream phobos 88aa69b14.
* src/Makefile.am (PHOBOS_DSOURCES): Remove std/digest/digest.d,
std/xml.d.
* src/Makefile.in: Regenerate.
---
 gcc/d/decl.cc |2 +-
 gcc/d/dmd/MERGE   |2 +-
 gcc/d/dmd/aggregate.d |   13 +-
 gcc/d/dmd/aggregate.h |2 +-
 gcc/d/dmd/apply.d |   25 +-
 gcc/d/dmd/arrayop.d   |   12 +-
 gcc/d/dmd/attrib.d|4 +-
 gcc/d/dmd/canthrow.d  |6 +-
 gcc/d/dmd/chkformat.d |  600 +---
 gcc/d/dmd/clone.d |6 +-
 gcc/d/dmd/cparse.d|   19 +-
 gcc/d/dmd/dcast.d |4 +-
 gcc/d/dmd/declaration.h   |4 +-
 gcc/d/dmd/dimport.d   |7 +-
 gcc/d/dmd/dinterpret.d|   12 +-
 gcc/d/dmd/dmangle.d   |   17 +
 gcc/d/dmd/doc.d   |4 +-
 gcc/d/dmd/dsymbol.d   |6 +
 gcc/d/dmd/dsymbol.h   |2 +-
 gcc/d/dmd/dsymbolsem.d|   48 +-
 gcc/d/dmd/dtemplate.d |   71 +-
 gcc/d/dmd/escape.d|5 +-
 gcc/d/dmd/expression.d|   20 +
 gcc/d/dmd/expression.h|   22 +-
 gcc/d/dmd/expressionsem.d |   92 +-
 gcc/d/dmd/func.d  |   19 +-
 gcc/d/dmd/iasmgcc.d   |8 +-
 gcc/d/dmd/id.d|2 +
 gcc/d/dmd/init.d  |1 +
 gcc/d/dmd/init.h  |1 +
 gcc/d/dmd/initsem.d   |  553 +--
 gcc/d/dmd/lexer.d |9 +-
 gcc/d/dmd/module.h|2 +-
 gcc/d/dmd/mtype.d |  649 ++--
 gcc/d/dmd/mtype.h |4 +-
 gcc/d/dmd/opover.d|9 +-
 gcc/d/dmd/parse.d |  102 +-
 gcc/d/dmd/root/object.h   |2 +-
 gcc/d/dmd/semantic3.d |   40 +-
 gcc/d/dmd/transitivevisitor.d |   73 +-
 gcc/d/dmd/typesem.d   |   18 +-
 gcc/d/expr.cc |   33 +-
 gcc/d/runtime.def |5 +-
 .../gdc.test/compilable/commontype.d  |   20 +-
 .../gdc.test/compilable/imports/cimports2a.i  |4 +
 .../gdc.test/compilable/imports/cimports2b.i  |4 +
 .../gdc.test/compilable/imports/format23327.d |7 +
 .../compilable/imports/format23327/write.d|0
 .../gdc.test/compilable/segfaultgolf.d|   50 +
 .../gdc.test/compilable/statictemplatethis.d  |   45 +
 gcc/testsuite/gdc.test/compilable/test13123.d |   38 +
 gcc/testsuite/gdc.test/compilable/test21243.d |   21 +
 gcc/testsuite/gdc.test/compilable/test21956.d |   16 +
 gcc/testsuite/gdc.test/compilable/test22674.d |   10 +
 gcc/testsuite/gdc.test/compilable/test23173.d |6 +
 gcc/testsuite/gdc.test/compilable/test23258.d |   21 +
 gcc/testsuite/gdc.test/compilable/test23306.d |7 +
 gcc/testsuite/gdc.test/compilable/test23327.d |3 +
 gcc/testsuite/gdc.test/compilable/vararg.d|   20 +
 .../gdc.test/fail_compilation/diag10169.d |2 +-
 .../gdc.test/fail_compilation/diag10783.d |2 +-
 .../gdc.test/fail_compilation/diag13528.d |  

Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2022-09-27 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

> Hi Andrea,
>
>> -Original Message-
>> From: Gcc-patches > bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
>> Corallo via Gcc-patches
>> Sent: Friday, August 12, 2022 4:34 PM
>> To: Andrea Corallo via Gcc-patches 
>> Cc: Richard Earnshaw ; nd 
>> Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping
>> if necessary
>> 
>> Hi all,
>> 
>> this patch enables 'arm_emit_multi_reg_pop' to set again the stack
>> pointer as CFA reg when popping if this is necessary.
>> 
>
> From what I can tell from similar functions this is correct, but could you 
> elaborate on why this change is needed for my understanding please?
> Thanks,
> Kyrill

Hi Kyrill,

Sure, if the frame pointer was set, then it is the current CFA register.
If we request to adjust the current CFA register offset while indicating
that it is SP (when it is actually FP), that is indeed incorrect, and the
inconsistency will be detected by an assertion in the DWARF emission
machinery.

Best Regards

  Andrea


[PATCH v2] LoongArch: Libvtv add loongarch support.

2022-09-27 Thread Lulu Cheng


v1 -> v2:

1. When the macro __loongarch_lp64 is defined, VTV_PAGE_SIZE is set to 64K.
2. When the macro __loongarch_lp64 is defined, the __vtv_malloc_init
   function in vtv_malloc.cc does not check whether VTV_PAGE_SIZE is equal
   to the system page size.
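
One possible reading of why the check can be skipped, sketched under the assumption that what matters is that a VTV_PAGE_SIZE-sized, VTV_PAGE_SIZE-aligned region always covers whole system pages. The names `kVtvPageSize` and `chunk_is_page_aligned` are hypothetical and exist only in this example.

```cpp
#include <cstddef>

// Hypothetical constant mirroring the value chosen for LoongArch64.
constexpr std::size_t kVtvPageSize = 65536;

// True when every address aligned to `chunk` is also aligned to `page`,
// i.e. when `chunk` is a whole multiple of `page`.
constexpr bool chunk_is_page_aligned(std::size_t chunk, std::size_t page) {
  return page != 0 && chunk % page == 0;
}

// 64K is a multiple of each page size the architecture supports.
static_assert(chunk_is_page_aligned(kVtvPageSize, 4096), "4K pages");
static_assert(chunk_is_page_aligned(kVtvPageSize, 16384), "16K pages");
static_assert(chunk_is_page_aligned(kVtvPageSize, 65536), "64K pages");
// The reverse does not hold: a 4K chunk is not aligned on a 64K-page system.
static_assert(!chunk_is_page_aligned(4096, 65536), "4K chunk on 64K pages");
```

Picking the maximum supported page size is therefore the only choice of the three that stays page-aligned on every configuration.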

All regression tests of libvtv passed.

=== libvtv Summary ===

# of expected passes            176

But I haven't tested the performance yet.

---
Co-Authored-By: qijingwen 

include/ChangeLog:

* vtv-change-permission.h (defined):
(VTV_PAGE_SIZE): Under the loongarch64 architecture,
set VTV_PAGE_SIZE to 64K.

libvtv/ChangeLog:

* configure.tgt: Add loongarch support.
* vtv_malloc.cc (defined): If macro __loongarch_lp64 is
defined, then don't check whether VTV_PAGE_SIZE is the
same as the system page size.
---
 include/vtv-change-permission.h | 4 
 libvtv/configure.tgt| 3 +++
 libvtv/vtv_malloc.cc| 5 +
 3 files changed, 12 insertions(+)

diff --git a/include/vtv-change-permission.h b/include/vtv-change-permission.h
index 70bdad92bca..64e419c29d5 100644
--- a/include/vtv-change-permission.h
+++ b/include/vtv-change-permission.h
@@ -48,6 +48,10 @@ extern void __VLTChangePermission (int);
 #else 
 #if defined(__sun__) && defined(__svr4__) && defined(__sparc__)
 #define VTV_PAGE_SIZE 8192
+/* LoongArch architecture 64-bit system supports 4k,16k and 64k
+   page size, which is set to the maximum value here.  */
+#elif defined(__loongarch_lp64)
+#define VTV_PAGE_SIZE 65536
 #else
 #define VTV_PAGE_SIZE 4096
 #endif
diff --git a/libvtv/configure.tgt b/libvtv/configure.tgt
index aa2a3f675b8..6cdd1e97ab1 100644
--- a/libvtv/configure.tgt
+++ b/libvtv/configure.tgt
@@ -50,6 +50,9 @@ case "${target}" in
;;
   x86_64-*-darwin[1]* | i?86-*-darwin[1]*)
;;
+  loongarch*-*-linux*)
+   VTV_SUPPORTED=yes
+   ;;
   *)
;;
 esac
diff --git a/libvtv/vtv_malloc.cc b/libvtv/vtv_malloc.cc
index 67c5de6d4e9..45804b8d7f8 100644
--- a/libvtv/vtv_malloc.cc
+++ b/libvtv/vtv_malloc.cc
@@ -212,6 +212,11 @@ __vtv_malloc_init (void)
 
 #if defined (__CYGWIN__) || defined (__MINGW32__)
   if (VTV_PAGE_SIZE != sysconf_SC_PAGE_SIZE())
+#elif defined (__loongarch_lp64)
+  /* I think that under the LoongArch 64-bit system, VTV_PAGE_SIZE is set
+ to the maximum value of 64K supported by the system, so there is no
+ need to judge here.  */
+  if (false)
 #else
   if (VTV_PAGE_SIZE != sysconf (_SC_PAGE_SIZE))
 #endif
-- 
2.31.1



Re: [COMMITTED] Optimize [0 = x & MASK] in range-ops.

2022-09-27 Thread Mikael Morin

On 26/09/2022 at 19:24, Aldy Hernandez via Gcc-patches wrote:

For [0 = x & MASK], we can determine that x is ~MASK.

Suggestion: as AND is a bitwise operator, a bit can be cleared in X's
non-zero bits wherever the corresponding result bit is cleared and the
MASK bit is set, so what you do here can be extended to non-zero result
values.
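
A minimal sketch of that generalisation on plain 32-bit integers (not on irange, and not the range-ops code itself). The helper name `x_maybe_nonzero` is hypothetical.

```cpp
#include <cstdint>

// Given result = x & mask, compute which bits of x may still be nonzero.
// `result_nonzero` has a 1 for every bit of the result that may be set.
constexpr std::uint32_t x_maybe_nonzero(std::uint32_t result_nonzero,
                                        std::uint32_t mask) {
  // Bits outside the mask are unconstrained; bits inside the mask must
  // equal the corresponding result bits.
  return (result_nonzero & mask) | ~mask;
}

// Zero result: x can only have bits from ~MASK, as in the original patch.
static_assert(x_maybe_nonzero(0x0u, 0xF0u) == 0xFFFFFF0Fu, "zero result");
// Nonzero result: bit 4 may be set in x, bits 5..7 are still known zero.
static_assert(x_maybe_nonzero(0x10u, 0xF0u) == 0xFFFFFF1Fu, "nonzero result");
```

The zero-result case falls out as the special case where `result_nonzero` is 0.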




[PATCH] RISC-V: Add ABI-defined RVV types.

2022-09-27 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config.gcc: Add riscv-vector-builtins.o.
* config/riscv/riscv-builtins.cc (riscv_init_builtins): Add RVV builtin 
function.
* config/riscv/riscv-protos.h (riscv_v_ext_enabled_vector_mode_p): New 
function.
* config/riscv/riscv.cc (ENTRY): New macro.
(riscv_v_ext_enabled_vector_mode_p): New function.
(riscv_mangle_type): Add RVV mangle.
(riscv_vector_mode_supported_p): Adjust RVV machine mode.
(riscv_verify_type_context): Add context check for RVV.
(riscv_vector_alignment): Add RVV alignment target hook support.
(TARGET_VECTOR_MODE_SUPPORTED_P): New target hook support.
(TARGET_VERIFY_TYPE_CONTEXT): Ditto.
(TARGET_VECTOR_ALIGNMENT): Ditto.
* config/riscv/t-riscv: Add riscv-vector-builtins.o
* config/riscv/riscv-vector-builtins.cc: New file.
* config/riscv/riscv-vector-builtins.def: New file.
* config/riscv/riscv-vector-builtins.h: New file.
* config/riscv/riscv-vector-switch.def: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-1.c: New test.
* gcc.target/riscv/rvv/base/abi-2.c: New test.
* gcc.target/riscv/rvv/base/abi-3.c: New test.
* gcc.target/riscv/rvv/base/abi-4.c: New test.
* gcc.target/riscv/rvv/base/abi-5.c: New test.
* gcc.target/riscv/rvv/base/abi-6.c: New test.
* gcc.target/riscv/rvv/base/abi-7.c: New test.
* gcc.target/riscv/rvv/rvv.exp: New test.

---
 gcc/config.gcc|   1 +
 gcc/config/riscv/riscv-builtins.cc|   2 +
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-vector-builtins.cc | 281 ++
 gcc/config/riscv/riscv-vector-builtins.def| 199 +
 gcc/config/riscv/riscv-vector-builtins.h  |  79 +
 gcc/config/riscv/riscv-vector-switch.def  | 164 ++
 gcc/config/riscv/riscv.cc |  95 +-
 gcc/config/riscv/t-riscv  |  10 +
 .../gcc.target/riscv/rvv/base/abi-1.c |  63 
 .../gcc.target/riscv/rvv/base/abi-2.c |  63 
 .../gcc.target/riscv/rvv/base/abi-3.c |  63 
 .../gcc.target/riscv/rvv/base/abi-4.c |  63 
 .../gcc.target/riscv/rvv/base/abi-5.c |  63 
 .../gcc.target/riscv/rvv/base/abi-6.c |  63 
 .../gcc.target/riscv/rvv/base/abi-7.c |  63 
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp|  47 +++
 17 files changed, 1319 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/riscv/riscv-vector-builtins.cc
 create mode 100644 gcc/config/riscv/riscv-vector-builtins.def
 create mode 100644 gcc/config/riscv/riscv-vector-builtins.h
 create mode 100644 gcc/config/riscv/riscv-vector-switch.def
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/abi-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 27ffce3fb50..615a06f87dd 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -516,6 +516,7 @@ pru-*-*)
 riscv*)
cpu_type=riscv
extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o 
riscv-shorten-memrefs.o riscv-selftests.o"
+   extra_objs="${extra_objs} riscv-vector-builtins.o"
d_target_objs="riscv-d.o"
;;
 rs6000*-*-*)
diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index 3009311604d..a51037a8f7a 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "expr.h"
 #include "langhooks.h"
+#include "riscv-vector-builtins.h"
 
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define RISCV_FTYPE_NAME0(A) RISCV_##A##_FTYPE
@@ -213,6 +214,7 @@ void
 riscv_init_builtins (void)
 {
   riscv_init_builtin_types ();
+  riscv_vector::init_builtins ();
 
   for (size_t i = 0; i < ARRAY_SIZE (riscv_builtins); i++)
 {
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index f9a2baa46c7..101361a4b44 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -75,6 +75,7 @@ extern bool riscv_store_data_bypass_p (rtx_insn *, rtx_insn 
*);
 extern rtx riscv_gen_gpr_save_insn (struct riscv_frame_info *);
 extern bool riscv_gpr_save_operation_p (rtx);
 extern void riscv_reinit (void);
+extern bool riscv_v_ext_enabled_vector_mode_p (machine_mode);
 
 /* Routines 

Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-09-27 Thread Tobias Burnus

Hi,

On 26.09.22 19:45, Alexander Monakov wrote:

My main concerns remain not addressed:
1) what I said in the opening paragraphs of my previous email;

(i.e. the general disagreement whether the feature itself should be
implemented for nvptx or not.)

2) device-issued atomics are not guaranteed to appear atomic to the host
unless using atom.sys and translating for CUDA compute capability 6.0+.

As you seem to have no other rough review comments, this can now be
addressed :-)

We do support
 #if __PTX_SM__ >= 600  (CUDA >= 8.0, ptx isa >= 5.0)
and we also can configure GCC with
 --with-arch=sm_70 (or sm_80 or ...)
Thus, adding atomics with .sys scope is possible.

See attached patch. This seems to work fine and I hope I got the
assembly right in terms of atomic use. (And I do believe that the
.release/.acquire do not need an additional __sync_synchronize()/"membar.sys".)

Ignoring (1), does the overall patch and this part otherwise look okay(ish)?


Caveat: The .sys scope works well with >= sm_60 but does not handle older
versions. For those, the __atomic_{load/store}_n are used.
I do not see a good solution beyond documentation. In the way it is used
(only one thread setting only an on/off flag, no atomic increments etc.),
I think it is unlikely to cause races without .sys scope, but as always it
is difficult to rule out some special unfortunate case where it does. At
least we now have some documentation (in general), which still needs to be
expanded and improved.
For this feature, I did not add any wording in this patch: until the feature
is actually enabled, it would be more confusing than helpful.


On Mon, 26 Sep 2022, Tobias Burnus wrote:


In theory, compiling with "-m32 -foffload-options=-m64" or "-m32
-foffload-options=-m32" or "-m64 -foffload-options=-m32" is supported.


I have no words.

@node Nvidia PTX Options
...
@item -m64
@opindex m64
Ignored, but preserved for backward compatibility.  Only 64-bit ABI is
supported.

And in config/nvptx/mkoffload.cc you also still find leftovers from -m32.

Tobias


libgomp/nvptx: Prepare for reverse-offload callback handling

This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
later handle the reverse offload.
For nvptx, it adds support for forwarding the offload gomp_target_ext call
to the host by setting values in a struct on the device and querying it on
the host - invoking gomp_target_rev on the result.
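
A single-threaded sketch of the general "set a struct, poll it from the other side" handshake described above, using C++ atomics in place of PTX ones. Everything here is hypothetical: `RevOffloadMailbox`, its fields, `device_request` and `host_poll_once` are illustrative names and do not reflect the actual layout or functions in the patch.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical mailbox shared between "device" and host.
struct RevOffloadMailbox {
  std::uint64_t arg = 0;             // payload written by the device
  std::atomic<std::uint64_t> fn{0};  // nonzero means "request pending"
};

// Device side: write the payload first, then publish the function id with
// release ordering so the host sees `arg` once it observes `fn`.
inline void device_request(RevOffloadMailbox& box, std::uint64_t fn,
                           std::uint64_t arg) {
  box.arg = arg;
  box.fn.store(fn, std::memory_order_release);
}

// Host side: one polling step.  Returns true if a request was handled and
// stores a stand-in result (fn + arg) in `out`.
inline bool host_poll_once(RevOffloadMailbox& box, std::uint64_t& out) {
  std::uint64_t fn = box.fn.load(std::memory_order_acquire);
  if (fn == 0)
    return false;                    // nothing pending
  out = fn + box.arg;                // placeholder for running the host function
  box.fn.store(0, std::memory_order_release);  // signal completion
  return true;
}
```

In a real setup the device would spin until `fn` returns to zero, and the atomics would need system-wide scope for the host to observe them reliably, which is the concern discussed earlier in this thread.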

include/ChangeLog:

	* cuda/cuda.h (enum CUdevice_attribute): Add
	CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
	(cuMemHostAlloc): Add prototype.

libgomp/ChangeLog:

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
	'static' for this variable.
	* config/nvptx/libgomp-nvptx.h: New file.
	* config/nvptx/target.c: Include it.
	(GOMP_ADDITIONAL_ICVS): Declare extern var.
	(GOMP_REV_OFFLOAD_VAR): Declare var.
	(GOMP_target_ext): Handle reverse offload.
	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
	* target.c (gomp_target_rev): ... this new stub function.
	* libgomp.h (gomp_target_rev): Declare.
	* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
	* plugin/cuda-lib.def (cuMemHostAlloc): Add.
	* plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
	(struct ptx_device): Add rev_data member. 
	(nvptx_open_device): #if 0 unused check; add
	unified address assert check.
	(GOMP_OFFLOAD_get_num_devices): Claim unified address
	support.
	(GOMP_OFFLOAD_load_image): Free rev_fn_table if no
	offload functions exist. Make offload var available
	on host and device.
	(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
	(GOMP_OFFLOAD_run): Handle reverse offload.

 include/cuda/cuda.h  |   3 +
 libgomp/config/nvptx/icv-device.c|   2 +-
 libgomp/config/nvptx/libgomp-nvptx.h |  52 
 libgomp/config/nvptx/target.c|  48 ---
 libgomp/libgomp-plugin.c |  12 
 libgomp/libgomp-plugin.h |   7 +++
 libgomp/libgomp.h|   5 ++
 libgomp/libgomp.map  |   5 ++
 libgomp/plugin/cuda-lib.def  |   1 +
 libgomp/plugin/plugin-nvptx.c| 111 +--
 libgomp/target.c |  19 ++
 11 files changed, 251 insertions(+), 14 deletions(-)

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index 3938d05..e081f04 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -77,6 +77,7 @@ typedef enum {
   CU_DEVICE_ATTRIBUTE_CONCURRENT_KERNELS = 31,
   CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR = 39,
   CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT = 40,
+  CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING = 41,
   

RE: [PATCH 12/15] arm: implement bti injection

2022-09-27 Thread Kyrylo Tkachov via Gcc-patches
Hi Andrea,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
> Corallo via Gcc-patches
> Sent: Friday, August 12, 2022 4:42 PM
> To: Andrea Corallo via Gcc-patches 
> Cc: Richard Earnshaw ; nd 
> Subject: [PATCH 12/15] arm: implement bti injection
> 
> Hi all,
> 
> this patch enables Branch Target Identification Armv8.1-M Mechanism
> [1].
> 
> This is achieved by using the bti pass made common with Aarch64.
> 
> The pass iterates through the instructions and adds the necessary BTI
> instructions at the beginning of every function and at every landing
> pads targeted by indirect jumps.
> 
> Best Regards
> 
>   Andrea
> 
> [1]
>  products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-
> authentication-and-branch-target-identification-extension>
> 
> gcc/ChangeLog
> 
> 2022-04-07  Andrea Corallo  
> 
>   * config.gcc (arm*-*-*): Add 'aarch-bti-insert.o' object.
>   * config/arm/arm-protos.h: Update.
>   * config/arm/arm.cc (aarch_bti_enabled, aarch_bti_j_insn_p)
>   (aarch_pac_insn_p, aarch_gen_bti_c, aarch_gen_bti_j): New
>   functions.
>   * config/arm/arm.md (bti_nop): New insn.
>   * config/arm/t-arm (PASSES_EXTRA): Add 'arm-passes.def'.
>   (aarch-bti-insert.o): New target.
>   * config/arm/unspecs.md (UNSPEC_BTI_NOP): New unspec.
>   * config/arm/aarch-bti-insert.cc (rest_of_insert_bti): Update
>   to verify arch compatibility.
>   * config/arm/arm-passes.def: New file.
> 
> gcc/testsuite/ChangeLog
> 
> 2022-04-07  Andrea Corallo  
> 
>   * gcc.target/arm/bti-1.c: New testcase.
>   * gcc.target/arm/bti-2.c: Likewise.

diff --git a/gcc/config/arm/aarch-bti-insert.cc 
b/gcc/config/arm/aarch-bti-insert.cc
index 2d1d2e334a9..8f045c247bf 100644
--- a/gcc/config/arm/aarch-bti-insert.cc
+++ b/gcc/config/arm/aarch-bti-insert.cc
@@ -41,6 +41,7 @@
 #include "cfgrtl.h"
 #include "tree-pass.h"
 #include "cgraph.h"
+#include "diagnostic-core.h"
 
This change doesn't seem to match what's in the ChangeLog and doesn't make 
sense to me.

@@ -32985,6 +32979,58 @@ arm_current_function_pac_enabled_p (void)
&& !crtl->is_leaf);
 }
 
+/* Return TRUE if Branch Target Identification Mechanism is enabled.  */
+bool
+aarch_bti_enabled (void)
+{
+  return aarch_enable_bti == 1;
+}
+
+/* Check if INSN is a BTI J insn.  */
+bool
+aarch_bti_j_insn_p (rtx_insn *insn)
+{
+  if (!insn || !INSN_P (insn))
+return false;
+
+  rtx pat = PATTERN (insn);
+  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) == UNSPEC_BTI_NOP;
+}
+
+/* Check if X (or any sub-rtx of X) is a PACIASP/PACIBSP instruction.  */

The arm instructions are not PACIASP/PACIBSP.
This comment should be rewritten.

+bool
+aarch_pac_insn_p (rtx x)
+{

..

+rtx
+aarch_gen_bti_c (void)
+{
+  return gen_bti_nop ();
+}
+
+rtx
+aarch_gen_bti_j (void)
+{
+  return gen_bti_nop ();
+}
+

A reader may be confused as to why we have a bti_c and a bti_j function
that have identical functionality.
Please add function comments explaining the situation.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 92269a7819a..90c8c1d66f5 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -12913,6 +12913,13 @@
   "aut\t%|ip, %|lr, %|sp"
   [(set_attr "length" "4")])
 
+(define_insn "bti_nop"
+  [(unspec_volatile [(const_int 0)] UNSPEC_BTI_NOP)]
+  "arm_arch7 && arm_arch_cmse"

That seems like a copy-paste mistake. CMSE has nothing to do with this 
functionality?

+  "bti"
+  [(set_attr "length" "4")

The length of instructions in the arm backend is 4 by default, so this
set_attr can be omitted.

+   (set_attr "type" "mov_reg")])
+
Probably better to use the "nop" attribute here?

Thanks,
Kyrill


RE: [PATCH 11/15] aarch64: Make bti pass generic so it can be used by the arm backend

2022-09-27 Thread Kyrylo Tkachov via Gcc-patches
Hi Andrea,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
> Corallo via Gcc-patches
> Sent: Friday, August 12, 2022 4:40 PM
> To: Andrea Corallo via Gcc-patches 
> Cc: Richard Earnshaw ; nd 
> Subject: [PATCH 11/15] aarch64: Make bti pass generic so it can be used by
> the arm backend
> 
> Hi all,
> 
> this patch splits and restructures the aarch64 bti pass code in order
> to have it usable by the arm backend as well.  These changes have no
> functional impact.
> 
> The original patch was approved here:
> .
> 
> After that Richard E. noted that was better to move the new pass
> definition for arm in the following patch and so I did.
> 

Ok. The renaming and splits look fine, and as long as it builds without
problems on arm and aarch64 it's all good.
Thanks,
Kyrill


> Best Regards
> 
>   Andrea
> 
> gcc/Changelog
> 
>   * config.gcc (aarch64*-*-*): Rename 'aarch64-bti-insert.o' into
>   'aarch-bti-insert.o'.
>   * config/aarch64/aarch64-protos.h: Remove 'aarch64_bti_enabled'
>   proto.
>   * config/aarch64/aarch64.cc (aarch_bti_enabled): Rename.
>   (aarch_bti_j_insn_p, aarch_pac_insn_p): New functions.
>   (aarch64_output_mi_thunk)
>   (aarch64_print_patchable_function_entry)
>   (aarch64_file_end_indicate_exec_stack): Update renamed function
>   calls to renamed functions.
>   * config/aarch64/t-aarch64 (aarch-bti-insert.o): Update target.
>   * config/arm/aarch-bti-insert.cc: New file including and
>   generalizing code from aarch64-bti-insert.cc.
>   * config/arm/aarch-common-protos.h: Update.



RE: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2022-09-27 Thread Kyrylo Tkachov via Gcc-patches
Hi Andrea,

> -----Original Message-----
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
> Corallo via Gcc-patches
> Sent: Friday, August 12, 2022 4:34 PM
> To: Andrea Corallo via Gcc-patches 
> Cc: Richard Earnshaw ; nd 
> Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping
> if necessary
> 
> Hi all,
> 
> this patch enables 'arm_emit_multi_reg_pop' to set again the stack
> pointer as CFA reg when popping if this is necessary.
> 

From what I can tell from similar functions this is correct, but could you 
elaborate on why this change is needed for my understanding please?
Thanks,
Kyrill

> /gcc/
> 
>   * config/arm/arm.cc (arm_emit_multi_reg_pop): If the frame pointer
>   was set define again the stack pointer as CFA reg when popping.


[COMMITTED] irange: keep better track of powers of 2.

2022-09-27 Thread Aldy Hernandez via Gcc-patches
When setting the nonzero bits to a mask containing only one bit, set
the range immediately, as it can be divined from the mask.  This helps
us keep better track of powers of two.

For example, with this patch a nonzero mask of 0x8000 is set to a
range of [0,0][0x8000,0x8000] with a nonzero mask of 0x8000.
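
As a rough illustration of why a single-bit mask pins down the range
(a standalone sketch, not the irange code; values_for_single_bit_mask
is a hypothetical helper and plain uint32_t stands in for wide_int):
if only one bit may be nonzero, the value is either 0 or exactly that
bit.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of GCC: list every value a variable can
// take when MASK is its nonzero-bits mask and MASK has exactly one bit set.
// MAY_BE_ZERO says whether the original range contained zero.
// Returns an empty vector when MASK does not have exactly one bit set.
std::vector<uint32_t>
values_for_single_bit_mask (uint32_t mask, bool may_be_zero)
{
  std::vector<uint32_t> values;
  bool one_bit = mask != 0 && (mask & (mask - 1)) == 0;
  if (!one_bit)
    return values;
  if (may_be_zero)
    values.push_back (0);
  values.push_back (mask);
  return values;
}
```

With mask 0x8000 and zero allowed, the helper yields the two values 0
and 0x8000, which matches the [0,0][0x8000,0x8000] range quoted above.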

Tested on x86-64 Linux.

p.s. Thanks for the bug report Uli.

gcc/ChangeLog:

* value-range.cc (irange::set_nonzero_bits): Set range when known.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/popcount6.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/popcount6.c | 12 
 gcc/value-range.cc| 13 +
 2 files changed, 25 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount6.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount6.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount6.c
new file mode 100644
index 00000000000..1406ad9d33b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount6.c
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-evrp" }
+
+int g(int n)
+{
+  n &= 0x8000;
+  if (n == 0)
+    return 1;
+  return __builtin_popcount(n);
+}
+
+// { dg-final { scan-tree-dump "return 1;" "evrp" } }
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 754379add19..6154d73ccf5 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2930,6 +2930,19 @@ irange::set_nonzero_bits (const wide_int_ref &bits)
   set_nonzero_bits (NULL);
   return;
 }
+  // If we have only one bit set in the mask, we can figure out the
+  // range immediately.
+  if (wi::popcount (bits) == 1)
+    {
+      bool has_zero = contains_p (build_zero_cst (type ()));
+      set (type (), bits, bits);
+      if (has_zero)
+        {
+          int_range<2> zero;
+          zero.set_zero (type ());
+          union_ (zero);
+        }
+    }
   set_nonzero_bits (wide_int_to_tree (type (), bits));
 }
 
-- 
2.37.1



[RFC PATCH] libstdc++: Partial library support for std::float{16,32,64,128}_t

2022-09-27 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch is partial support for std::float{16,32,64,128}_t
in libstdc++.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1467r9.html
says that , , ,  and 
need changes too, but before doing that, it would be nice to get an
agreement on what macros to use etc.
The support for cmath/complex is possible in multiple ways:
1) glibc >= 2.26 supports whatever{f32,f64,f128} APIs next to
   whatever{f,,l} APIs, so I think we can use
   __builtin_sin{f32,f64,f128} etc. to support _Float{32,64,128}
   overloads
2) if not, pretty much everywhere float is actually the same mode as
   _Float32 and double as _Float64, I guess one could use guards like
   #if __FLT32_MANT_DIG__ == __FLT_MANT_DIG__ \
   && __FLT32_MAX_EXP__ == __FLT_MAX_EXP__
   or
   #if __FLT64_MANT_DIG__ == __DBL_MANT_DIG__ \
   && __FLT64_MAX_EXP__ == __DBL_MAX_EXP__
   or even
   #if __FLT128_MANT_DIG__ == __LDBL_MANT_DIG__ \
   && __FLT128_MAX_EXP__ == __LDBL_MAX_EXP__
   for the few arches like aarch64 where long double and _Float128
   have the same mode.  Then we can use
   __builtin_sin{f,,l} etc. to support _Float{32,64,128} overloads
3) not sure what to do as fallback if neither 1) nor 2) work out,
   not provide the overloads, or even undef __STDCPP_FLOAT32_T__
   etc. in the library?
4) AFAIK glibc doesn't have _Float16 APIs, so we need to widen and
   use __builtin_sinf32 (for 1) or __builtin_sinf (for 2) for _Float16
   and explicitly do a narrowing cast
5) bfloat16_t needs more work even on the compiler side, but once
   the support is there and in stdfloat and other headers touched
   by the patch below (testcases already include bfloat16_t though),
   it will need something like 4) too.
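
A minimal sketch of the widen/compute/narrow approach from 4), under
the assumption that the narrow type converts to and from float.  The
name sin_via_float is made up for illustration and is not libstdc++
code; since _Float16 is only available on some compilers and targets,
the sketch is a template and is only exercised with float here.

```cpp
#include <cmath>

// Hypothetical helper, not libstdc++ code: evaluate sin for a narrow
// floating-point type by widening to float, calling the float routine,
// and narrowing the result back with an explicit cast.
template <typename Narrow>
Narrow
sin_via_float (Narrow x)
{
  float wide = static_cast<float> (x);   // widen the argument
  float result = std::sin (wide);        // float overload from <cmath>
  return static_cast<Narrow> (result);   // explicit narrowing cast
}
```

On a compiler that provides _Float16, the intended use would be
sin_via_float<_Float16>, with the float call replaced by
__builtin_sinf32 or __builtin_sinf depending on which of 1) or 2)
applies.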

The patch also doesn't include any testcases to cover the 
changes, it isn't clear to me where to put that.

Tested on x86_64-linux.

2022-09-27  Jakub Jelinek  

* include/std/stdfloat: New file.
* include/std/numbers (__glibcxx_numbers): Define and use it
for __float128 explicit instantiations as well as
_Float{16,32,64,128}.
* include/std/atomic (atomic<_Float16>, atomic<_Float32>,
atomic<_Float64>, atomic<_Float128>): New explicit instantiations.
* include/std/type_traits (__is_floating_point_helper<_Float16>,
__is_floating_point_helper<_Float32>,
__is_floating_point_helper<_Float64>,
__is_floating_point_helper<_Float128>): Likewise.
* include/std/limits (__glibcxx_concat3_, __glibcxx_concat3,
__glibcxx_float_n): Define.
(numeric_limits<_Float16>, numeric_limits<_Float32>,
numeric_limits<_Float64>, numeric_limits<_Float128>): New explicit
instantiations.
* include/Makefile.am (std_headers): Add stdfloat.
* include/Makefile.in: Regenerated.
* include/precompiled/stdc++.h: Include stdfloat.
* testsuite/18_support/headers/stdfloat/types_std.cc: New test.
* testsuite/18_support/headers/limits/synopsis_cxx23.cc: New test.
* testsuite/26_numerics/numbers/4.cc: New test.
* testsuite/29_atomics/atomic_float/requirements_cxx23.cc: New test.

--- libstdc++-v3/include/std/stdfloat.jj   2022-09-27 08:49:45.932769534 +0200
+++ libstdc++-v3/include/std/stdfloat   2022-09-27 08:49:45.932769534 +0200
@@ -0,0 +1,58 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2022 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file include/stdfloat
+ *  This is a Standard C++ Library header.
+ */
+
+#ifndef _GLIBCXX_STDFLOAT
+#define _GLIBCXX_STDFLOAT 1
+
+#if __cplusplus > 202002L
+#include 
+
+namespace std
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  #ifdef __STDCPP_FLOAT16_T__
+  using float16_t = _Float16;
+  #endif
+
+  #ifdef __STDCPP_FLOAT32_T__
+  using float32_t = _Float32;
+  #endif
+
+  #ifdef __STDCPP_FLOAT64_T__
+  using float64_t = _Float64;
+  #endif
+
+  #ifdef __STDCPP_FLOAT128_T__
+  using float128_t = _Float128;
+  #endif
+
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace std
+#endif // C++23

[COMMITTED] Add an irange setter for wide_ints.

2022-09-27 Thread Aldy Hernandez via Gcc-patches
Just the same way as we have real_value setters for franges, we should
have a wide_int version for irange.  This matches the irange
constructor for wide_ints, and paves the way for the eventual
conversion of irange to wide ints.

gcc/ChangeLog:

* value-range.h (irange::set): New version taking wide_int_ref.
---
 gcc/value-range.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/value-range.h b/gcc/value-range.h
index 413e54bda6f..556e31aece1 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -117,6 +117,8 @@ class GTY((user)) irange : public vrange
 public:
   // In-place setters.
   virtual void set (tree, tree, value_range_kind = VR_RANGE) override;
+  void set (tree type, const wide_int_ref &, const wide_int_ref &,
+   value_range_kind = VR_RANGE);
   virtual void set_nonzero (tree type) override;
   virtual void set_zero (tree type) override;
   virtual void set_nonnegative (tree type) override;
@@ -687,6 +689,13 @@ irange::varying_compatible_p () const
   return true;
 }
 
+inline void
+irange::set (tree type, const wide_int_ref &min, const wide_int_ref &max,
+             value_range_kind kind)
+{
+  set (wide_int_to_tree (type, min), wide_int_to_tree (type, max), kind);
+}
+
 inline bool
 vrange::varying_p () const
 {
-- 
2.37.1



Re: [PATCH v2] c++: Don't quote nothrow in diagnostic

2022-09-27 Thread Richard Biener via Gcc-patches
On Mon, Sep 26, 2022 at 9:54 PM Marek Polacek  wrote:
>
> On Mon, Sep 26, 2022 at 12:34:04PM -0400, Jason Merrill wrote:
> > On 9/26/22 03:50, Richard Biener wrote:
> > > On Fri, Sep 23, 2022 at 8:41 PM Marek Polacek via Gcc-patches
> > >  wrote:
> > > >
> > > > In 
> > > > 
> > > > Jason noticed that we quote "nothrow" in diagnostics even though it's
> > > > not a keyword in C++.  Just removing the quotes didn't work because
> > > > then -Wformat-diag complains, so this patch replaces it with "no-throw".
> > > >
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > >
> > > That doesn't look like an improvement to me.  Can we quote 'nothrow()' 
> > > instead?
>
> Understood.
>
> > nothrow() is a syntax error; the C++11 keyword is 'noexcept'. std::nothrow
> > is a dummy placement argument used to indicate that a new-expression should
> > return null rather than throw on failure.
> >
> > But bizarrely, the library traits use the word "nothrow".  Marek's patch
> > clarifies that we are not trying to refer to anything in the language.
> >
> > > I'd rather leave it alone than change it to no-throw.  Why does
> > > -Wformat-diag complain?  If we shouldn't quote nothrow, that should be
> > > adjusted?
> >
> > I think -Wformat-diag complains because "nothrow" is an attribute; it also
> > includes some other attribute names in the list of "keywords".
> >
> > I would also be fine with just removing the quotes and removing nothrow from
> > c_keywords.
>
> Like below?   Bootstrapped/regtested on x86_64-pc-linux-gnu.

Yes.  I assume that terms like "nothrow constructible" are used in the
C++ standard?
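
For reference, the three things that get conflated in this thread can
be told apart in a small, self-contained sketch: the noexcept keyword,
the std::nothrow placement argument, and the library traits spelled
is_nothrow_*.  The types Quiet and Loud and the function
nothrow_new_works are made up for illustration.

```cpp
#include <new>
#include <type_traits>

// Every special member is declared noexcept.
struct Quiet
{
  Quiet () noexcept {}
  Quiet (const Quiet &) noexcept {}
  Quiet &operator= (const Quiet &) noexcept { return *this; }
};

// User-provided members without noexcept are treated as potentially throwing.
struct Loud
{
  Loud () {}
  Loud (const Loud &) {}
  Loud &operator= (const Loud &) { return *this; }
};

// std::nothrow is a placement argument: on allocation failure the
// new-expression yields a null pointer instead of throwing.
bool
nothrow_new_works ()
{
  int *p = new (std::nothrow) int (42);
  bool ok = p != nullptr && *p == 42;
  delete p;
  return ok;
}
```

The traits std::is_nothrow_default_constructible and
std::is_nothrow_copy_assignable hold for Quiet and not for Loud, which
is the property the diagnostics in diagnose_trait_expr describe.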

> Note that now I see warnings with my system compiler (gcc-12.2.1).  Can
> I commit the c-format.cc hunk to gcc 12 so that eventually even gcc 12
> stops warning?

Sure.

Thanks,
Richard.

> -- >8 --
> In 
> Jason noticed that we quote "nothrow" in diagnostics even though it's
> not a keyword in C++.  This patch removes the quotes and also drops
> "nothrow" from c_keywords.
>
> gcc/c-family/ChangeLog:
>
> * c-format.cc (c_keywords): Drop nothrow.
>
> gcc/cp/ChangeLog:
>
> * constraint.cc (diagnose_trait_expr): Say "nothrow" without quotes
> rather than in quotes.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp2a/concepts-traits3.C: Adjust expected diagnostics.
> ---
>  gcc/c-family/c-format.cc  |  3 +--
>  gcc/cp/constraint.cc  | 14 +++---
>  gcc/testsuite/g++.dg/cpp2a/concepts-traits3.C |  8 
>  3 files changed, 12 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
> index a6c380bf1c8..a2026591ed1 100644
> --- a/gcc/c-family/c-format.cc
> +++ b/gcc/c-family/c-format.cc
> @@ -2900,7 +2900,7 @@ static const token_t cxx_opers[] =
>};
>
>  /* Common C/C++ keywords that are expected to be quoted within the format
> -   string.  Keywords like auto, inline, or volatile are exccluded because
> +   string.  Keywords like auto, inline, or volatile are excluded because
> they are sometimes used in common terms like /auto variables/, /inline
> function/, or /volatile access/ where they should not be quoted.  */
>
> @@ -2927,7 +2927,6 @@ static const token_t c_keywords[] =
> NAME ("noinline", NULL),
> NAME ("nonnull", NULL),
> NAME ("noreturn", NULL),
> -   NAME ("nothrow", NULL),
> NAME ("offsetof", NULL),
> NAME ("readonly", "read-only"),
> NAME ("readwrite", "read-write"),
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 5839bfb4b52..266ec581a20 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3592,13 +3592,13 @@ diagnose_trait_expr (tree expr, tree args)
>switch (TRAIT_EXPR_KIND (expr))
>  {
>  case CPTK_HAS_NOTHROW_ASSIGN:
> -  inform (loc, "  %qT is not %<nothrow%> copy assignable", t1);
> +  inform (loc, "  %qT is not nothrow copy assignable", t1);
>break;
>  case CPTK_HAS_NOTHROW_CONSTRUCTOR:
> -  inform (loc, "  %qT is not %<nothrow%> default constructible", t1);
> +  inform (loc, "  %qT is not nothrow default constructible", t1);
>break;
>  case CPTK_HAS_NOTHROW_COPY:
> -  inform (loc, "  %qT is not %<nothrow%> copy constructible", t1);
> +  inform (loc, "  %qT is not nothrow copy constructible", t1);
>break;
>  case CPTK_HAS_TRIVIAL_ASSIGN:
>inform (loc, "  %qT is not trivially copy assignable", t1);
> @@ -3674,7 +3674,7 @@ diagnose_trait_expr (tree expr, tree args)
>inform (loc, "  %qT is not trivially assignable from %qT", t1, t2);
>break;
>  case CPTK_IS_NOTHROW_ASSIGNABLE:
> -  inform (loc, "  %qT is not %<nothrow%> assignable from %qT", t1, t2);
> +  inform (loc, "  %qT is not nothrow assignable from %qT", t1, t2);
>break;
>  case CPTK_IS_CONSTRUCTIBLE:
>if 

Re: VN, len_store and endianness

2022-09-27 Thread Richard Biener via Gcc-patches
On Mon, Sep 26, 2022 at 4:21 PM Robin Dapp  wrote:
>
> Hi,
>
> I'm locally testing a branch that enables vll/vstl for partial vector
> usage i.e. len_load and len_store on s390.  I see a FAIL in
> testsuite/gfortran.dg/power_3.f90.
> Since r13-1777-gbd9837bc3ca134 we also perform VN for masked/len stores
> and things go wrong there.  The problem seems to be that we evaluate a
> vector constant {-1, 1, -1, 1} loaded with length 11 + 1(bias) = 12 as
> {1, -1, 1} instead of {-1, 1, -1}.
>
> I found it a bit difficult to navigate through the logic due to several
> sizes, offsets, lengths and "amounts" :)  From what I can tell the
> culprit code is (guarded by BYTES_BIG_ENDIAN)
>
>if (TREE_CODE (pd.rhs) != CONSTRUCTOR)
>  {
>  q = (this_buffer + len
>   - (ROUND_UP (size - amnt, BITS_PER_UNIT)
>  / BITS_PER_UNIT));
>  }
>
> where, with pd.rhs = { 255, 255, 255, 255, 0, 0, 0, 1, 255, 255, 255,
> 255, 0, 0, 0, 1 }, len = 16 bytes, size = 96 bits, we read after the
> first 32 bits.  What is supposed to happen here?  It looks like going
> backwards (when size grows), but actually size shrinks for my example
> with each successive element via pd.offset 0, -32 and -64.
>
> When skipping the block with && TREE_CODE (pd.rhs) != VECTOR_CST the
> test and various others succeed but I didn't pursue testing further and
> figured I'd rather ask here for more insight.

The error is probably in vn_reference_lookup_3 which assumes that
'len' applies to the vector elements in element order.  See the part
of the code where it checks for internal_store_fn_p.  If 'len' is with
respect to the memory and thus endianness has to be taken into
account, then for the IFN_LEN_STORE

  else if (fn == IFN_LEN_STORE)
{
  pd.rhs_off = 0;
  pd.offset = offset2i;
  pd.size = (tree_to_uhwi (len)
 + -tree_to_shwi (bias)) * BITS_PER_UNIT;
  if (ranges_known_overlap_p (offset, maxsize,
  pd.offset, pd.size))
return data->push_partial_def (pd, set, set,
   offseti, maxsizei);

likely needs to adjust rhs_off from zero for big endian?
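
To make the expected result concrete, here is a standalone sketch of
the byte-level reading Robin describes, assuming the length counts
bytes in memory order.  decode_be_prefix is a hypothetical helper, not
GCC code, and it says nothing about where the VN code goes wrong; it
only shows what the first 12 bytes of the big-endian image contain.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not GCC code: interpret the first LEN bytes of a
// big-endian byte image as 32-bit signed elements, in memory order.
// Only whole elements that fit in both LEN and the buffer are decoded.
std::vector<int32_t>
decode_be_prefix (const std::vector<uint8_t> &bytes, std::size_t len)
{
  std::vector<int32_t> elts;
  for (std::size_t i = 0; i + 4 <= len && i + 4 <= bytes.size (); i += 4)
    {
      uint32_t v = (uint32_t (bytes[i]) << 24)
                   | (uint32_t (bytes[i + 1]) << 16)
                   | (uint32_t (bytes[i + 2]) << 8)
                   | uint32_t (bytes[i + 3]);
      elts.push_back (static_cast<int32_t> (v));
    }
  return elts;
}
```

Feeding it the pd.rhs bytes quoted above with a length of 12 gives the
elements -1, 1, -1, which is the value the report expects.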

>
> Regards
>  Robin


Re: [PATCH] gcc: honour -ffile-prefix-map in ASM_MAP [PR93371]

2022-09-27 Thread Rasmus Villemoes
On 12/09/2022 11.46, Rasmus Villemoes wrote:
> On 29/08/2022 11.29, Rasmus Villemoes wrote:
>> -ffile-prefix-map is supposed to be a superset of -fmacro-prefix-map
>> and -fdebug-prefix-map. However, when building .S or .s files, gas is
>> not called with the appropriate --debug-prefix-map option when
>> -ffile-prefix-map is used.
>>
>> While the user can specify -fdebug-prefix-map when building assembly
>> files via gcc, it's more ergonomic to also support -ffile-prefix-map;
>> especially since for .S files that could contain the __FILE__ macro,
>> one would then also have to specify -fmacro-prefix-map.
>>
>> gcc:
>>  PR driver/93371
>>  * gcc.cc (ASM_MAP): Honour -ffile-prefix-map.
>> ---
>>
>> I've tested that this works as expected, both by looking at how gas is
>> now invoked, and by running 'strings' on the generated .o file. But I
>> don't know how to add something to the testsuite for this.
> 
> Is this ok for trunk? If so, how about older maintained branches?
> 
> And does anyone have ideas for how I could add a test case?

ping.

> 
>>
>> I stumbled on this since it came up on the U-Boot mailing list:
>> https://lore.kernel.org/u-boot/4ed9f811-5244-54ef-b58e-83dba5151...@prevas.dk/
>> .
>>
>>  gcc/gcc.cc | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
>> index b6d562a92f0..44eafc60187 100644
>> --- a/gcc/gcc.cc
>> +++ b/gcc/gcc.cc
>> @@ -878,7 +878,7 @@ proper position among the other output files.  */
>>  #endif
>>  
>>  #ifdef HAVE_AS_DEBUG_PREFIX_MAP
>> -#define ASM_MAP " %{fdebug-prefix-map=*:--debug-prefix-map %*}"
>> +#define ASM_MAP " %{ffile-prefix-map=*:--debug-prefix-map %*} %{fdebug-prefix-map=*:--debug-prefix-map %*}"
>>  #else
>>  #define ASM_MAP ""
>>  #endif
> 
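
A rough model of what the patched ASM_MAP spec does, written as a
plain function so the mapping is easy to see.  asm_map_args is a
hypothetical name and this is not driver code; it assumes the
assembler supports --debug-prefix-map (the HAVE_AS_DEBUG_PREFIX_MAP
case) and that the two %{...} groups are expanded in the order they
appear in the spec string.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the patched ASM_MAP spec, not driver code: every
// -ffile-prefix-map=VALUE and every -fdebug-prefix-map=VALUE switch turns
// into "--debug-prefix-map VALUE" for the assembler.  Switches matching
// the first group are emitted before those matching the second group.
std::vector<std::string>
asm_map_args (const std::vector<std::string> &driver_args)
{
  const std::string prefixes[] = { "-ffile-prefix-map=",
                                   "-fdebug-prefix-map=" };
  std::vector<std::string> out;
  for (const std::string &prefix : prefixes)
    for (const std::string &arg : driver_args)
      if (arg.compare (0, prefix.size (), prefix) == 0)
        {
          out.push_back ("--debug-prefix-map");
          out.push_back (arg.substr (prefix.size ()));
        }
  return out;
}
```

Before the patch only the second prefix would be in the list, so a
build that passed just -ffile-prefix-map handed no mapping to gas.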



Re: [patch] libgompd: Add thread handles

2022-09-27 Thread Bernhard Reutner-Fischer via Gcc-patches
On Tue, 27 Sep 2022 03:20:51 +0200
Ahmed Sayed Mousse via Gcc-patches  wrote:

> diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
> index 6d913a93e7f..23f5bede1bf 100644
> --- a/libgomp/Makefile.am
> +++ b/libgomp/Makefile.am
> @@ -94,7 +94,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c error.c \
>   priority_queue.c affinity-fmt.c teams.c allocator.c oacc-profiling.c \
>   oacc-target.c ompd-support.c
>  
> -libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c
> +libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c ompd-threads.c
>  
>  include $(top_srcdir)/plugin/Makefrag.am
>  
> diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
> index 40f896b5f03..7acdcbf31d5 100644
> --- a/libgomp/Makefile.in
> +++ b/libgomp/Makefile.in
> @@ -233,7 +233,8 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo critical.lo \
>   affinity-fmt.lo teams.lo allocator.lo oacc-profiling.lo \
>   oacc-target.lo ompd-support.lo $(am__objects_1)
>  libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
> -am_libgompd_la_OBJECTS = ompd-init.lo ompd-helper.lo ompd-icv.lo
> +am_libgompd_la_OBJECTS = ompd-init.lo ompd-helper.lo ompd-icv.lo \
> + ompd-threads.lo
>  libgompd_la_OBJECTS = $(am_libgompd_la_OBJECTS)
>  AM_V_P = $(am__v_P_@AM_V@)
>  am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
> @@ -583,7 +584,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \
>   oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
>   affinity-fmt.c teams.c allocator.c oacc-profiling.c \
>   oacc-target.c ompd-support.c $(am__append_7)
> -libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c
> +libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c ompd-threads.c
>  
>  # Nvidia PTX OpenACC plugin.
>  @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
> @@ -801,6 +802,7 @@ distclean-compile:
>  @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-icv.Plo@am__quote@
>  @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-init.Plo@am__quote@
>  @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-support.Plo@am__quote@
> +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-threads.Plo@am__quote@
>  @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ordered.Plo@am__quote@
>  @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/parallel.Plo@am__quote@
>  @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/priority_queue.Plo@am__quote@
> diff --git a/libgomp/ompd-support.c b/libgomp/ompd-support.c
> index 27c5ad148e0..5b1afd37788 100644
> --- a/libgomp/ompd-support.c
> +++ b/libgomp/ompd-support.c
> @@ -33,6 +33,8 @@ const unsigned short gompd_sizeof_gomp_thread_handle
>__attribute__ ((used)) OMPD_SECTION = 0;
>  #endif
>  
> +unsigned long gompd_thread_initial_tls_bias __attribute__ ((used));
> +
>  /* Get offset of the member m in struct t.  */
>  #define gompd_get_offset(t, m) \
>const unsigned short gompd_access_##t##_##m __attribute__ ((used)) \
> @@ -67,6 +69,11 @@ gompd_load (void)
>gompd_state |= OMPD_ENABLED;
>ompd_dll_locations = &ompd_dll_locations_array[0];
>ompd_dll_locations_valid ();
> +
> +  #if defined(LIBGOMP_USE_PTHREADS) && !defined(GOMP_NEEDS_THREAD_HANDLE)
> +  gompd_thread_initial_tls_bias = (unsigned long) ((char *) &gomp_tls_data
> +- (char *) pthread_self ());
> +  #endif
>  }
>  
>  #ifndef __ELF__
> diff --git a/libgomp/ompd-threads.c b/libgomp/ompd-threads.c
> new file mode 100644
> index 00000000000..723ef740181
> --- /dev/null
> +++ b/libgomp/ompd-threads.c
> @@ -0,0 +1,222 @@
> +/* Copyright (C) The GNU Toolchain Authors.
> +   Contributed by Ahmed Sayed .
> +   This file is part of the GNU Offloading and Multi Processing Library
> +   (libgomp).
> +
> +   Libgomp is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> +   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +   more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +/* This file contains the implementation of functions defined in
> +   Section 5.5 ThreadHandles. */
> +
> +
> +#include "ompd-helper.h"
> +
> +ompd_rc_t
> +ompd_get_thread_in_parallel (ompd_parallel_handle_t 

[committed] Fix ICE's due to jump-to-return optimization changes

2022-09-27 Thread Jeff Law


v850 and rl78 failed to build newlib with an ICE.  I've also got a 
report from an ARM automated tester that looks like the same underlying 
problem.



Basically we need to check if simple_return and return insns are 
available before trying to use them.
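
The shape of the check can be sketched outside of GCC as follows.
ReturnPattern, TargetInfo and can_use_return are hypothetical
stand-ins for the RTL patterns and the targetm.have_* hooks; the point
is only that each pattern is paired with its own availability test.

```cpp
// Hypothetical stand-ins for the RTL patterns and target hooks involved;
// none of these names are GCC's own.
enum class ReturnPattern { SimpleReturn, Return, Other };

struct TargetInfo
{
  bool have_simple_return;  // target provides a simple_return insn
  bool have_return;         // target provides a return insn
};

// The jump may be turned into a return only if the block's return
// pattern is one the target can actually emit.
bool
can_use_return (ReturnPattern pat, const TargetInfo &target)
{
  return (pat == ReturnPattern::SimpleReturn && target.have_simple_return)
         || (pat == ReturnPattern::Return && target.have_return);
}
```

A target that lacks one of the two insns then simply keeps the jump
for that pattern instead of hitting the ICE.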



Bootstrapped on x86_64 (regression testing in progress). Verified this 
fixes the v850 and rl78 build failures.   Installing on the trunk 
momentarily.



Jeff
commit fe527a06a77093bc3de4ee2007516a4e9fa30f18
Author: Jeff Law 
Date:   Tue Sep 27 01:44:38 2022 -0400

Fix ICEs due to recent jump-to-return optimization

gcc/
* cfgrtl.cc (fixup_reorder_chain): Verify that simple_return
and return are available before trying to use them.

diff --git a/gcc/cfgrtl.cc b/gcc/cfgrtl.cc
index 90cd6ee56a7..281a432f6a6 100644
--- a/gcc/cfgrtl.cc
+++ b/gcc/cfgrtl.cc
@@ -4049,7 +4049,8 @@ fixup_reorder_chain (void)
   rtx_insn *ret, *use;
   basic_block dest;
   if (bb_is_just_return (e_fall->dest, &ret, &use)
- && (PATTERN (ret) == simple_return_rtx || PATTERN (ret) == ret_rtx))
+ && ((PATTERN (ret) == simple_return_rtx && targetm.have_simple_return ())
+ || (PATTERN (ret) == ret_rtx && targetm.have_return ())))
{
  ret_label = PATTERN (ret);
  dest = EXIT_BLOCK_PTR_FOR_FN (cfun);


[PATCH v2] Libvtv-test: Fix bug that scansarif.exp cannot be found in libvtv regression test.

2022-09-27 Thread Lulu Cheng
SARIF support was added in r13-967 but libvtv wasn't updated.

libvtv/ChangeLog:

* testsuite/lib/libvtv-dg.exp: Add load_gcc_lib of scansarif.exp.
---
 libvtv/testsuite/lib/libvtv-dg.exp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libvtv/testsuite/lib/libvtv-dg.exp b/libvtv/testsuite/lib/libvtv-dg.exp
index b140c194cdc..454d916e556 100644
--- a/libvtv/testsuite/lib/libvtv-dg.exp
+++ b/libvtv/testsuite/lib/libvtv-dg.exp
@@ -12,6 +12,8 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 
+load_gcc_lib scansarif.exp
+
 proc libvtv-dg-test { prog do_what extra_tool_flags } {
 return [gcc-dg-test-1 libvtv_target_compile $prog $do_what $extra_tool_flags]
 }
-- 
2.31.1