Re: [18/18] hash table: enforce testing is_empty before is_deleted

2023-01-12 Thread Richard Biener via Gcc-patches
On Thu, Jan 12, 2023 at 10:32 PM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> Existing hash_table traits that use the same representation for empty
> and deleted slots reject marking slots as deleted, and to not pass
> is_deleted for slots that pass is_empty.
>
> Nevertheless, nearly everywhere, we only test for is_deleted after
> checking that !is_empty first.  The one exception was the copy
> constructor, that would fail if traits recognized is_empty slots as
> is_deleted, but then refused to mark_deleted.
>
> This asymmetry is neither necessary nor desirable, and there is a
> theoretical risk that traits might not only fail to refuse to
> mark_deleted, but also return is_deleted for is_empty slots.
>
> This patch introduces checks that detect these potentially problematic
> situations, and reorders the tests in the copy constructor so as to
> use the conventional testing order and thus avoid them.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

>
> for  gcc/ChangeLog
>
> * hash-table.h (is_deleted): Precheck !is_empty.
> (mark_deleted): Postcheck !is_empty.
> (copy constructor): Test is_empty before is_deleted.
> ---
>  gcc/hash-table.h |   16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> index 1d3166504c38e..e37625dc315bf 100644
> --- a/gcc/hash-table.h
> +++ b/gcc/hash-table.h
> @@ -534,6 +534,11 @@ private:
>void expand ();
>static bool is_deleted (value_type )
>{
> +/* Traits are supposed to avoid recognizing elements as both empty
> +   and deleted, but to fail safe in case custom traits fail to do
> +   that, make sure we never test for is_deleted without having
> +   first ruled out is_empty.  */
> +gcc_checking_assert (!Descriptor::is_empty (v));
>  return Descriptor::is_deleted (v);
>}
>
> @@ -545,6 +550,11 @@ private:
>static void mark_deleted (value_type )
>{
>  Descriptor::mark_deleted (v);
> +/* Traits are supposed to refuse to set elements as deleted if
> +   those would be indistinguishable from empty, but to fail safe
> +   in case custom traits fail to do that, check that the
> +   just-deleted element does not look empty.  */
> +gcc_checking_assert (!Descriptor::is_empty (v));
>}
>
>static void mark_empty (value_type )
> @@ -700,9 +710,11 @@ hash_table::hash_table 
> (const hash_table ,
>for (size_t i = 0; i < size; ++i)
> {
>   value_type  = h.m_entries[i];
> - if (is_deleted (entry))
> + if (is_empty (entry))
> +   continue;
> + else if (is_deleted (entry))
> mark_deleted (nentries[i]);
> - else if (!is_empty (entry))
> + else
> new ((void*) (nentries + i)) value_type (entry);
> }
>m_entries = nentries;
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


Re: [PATCH 1/1] Replace flag_strict_flex_arrays with DECL_NOT_FLEXARRAY in middle-end

2023-01-12 Thread Richard Biener via Gcc-patches
On Thu, 12 Jan 2023, Qing Zhao wrote:

> We should not directly check flag_strict_flex_arrays in the middle end.
> Instead, check DECL_NOT_FLEXARRAY(array_field_decl) which is set by
> C/C++ FEs according to -fstrict-flex-arrays and the corresponding attribute
> attached to the array_field.
> 
> As a result, We will lose the LEVEL information of -fstrict-flex-arrays in
> the middle end. -Wstrict-flex-arrays will not be able to issue such
> information. update the testing cases accordingly.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
>   * attribs.cc (strict_flex_array_level_of): Move this function to ...
>   * attribs.h (strict_flex_array_level_of): Remove the declaration.
>   * gimple-array-bounds.cc (array_bounds_checker::check_array_ref):
>   replace the referece to strict_flex_array_level_of with
>   DECL_NOT_FLEXARRAY.
>   * tree.cc (component_ref_size): Likewise.
> 
> gcc/c/ChangeLog:
> 
>   * c-decl.cc (strict_flex_array_level_of): ... here.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/Warray-bounds-flex-arrays-1.c: Delete the level information
>   from the message issued by -Wstrict-flex-arrays.
>   * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
>   * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
>   * gcc.dg/Wstrict-flex-arrays-2.c: Likewise.
>   * gcc.dg/Wstrict-flex-arrays-3.c: Likewise.
>   * gcc.dg/Wstrict-flex-arrays.c: Likewise.
> ---
>  gcc/attribs.cc| 30 
>  gcc/attribs.h |  2 -
>  gcc/c/c-decl.cc   | 28 +++
>  gcc/gimple-array-bounds.cc| 46 ---
>  .../gcc.dg/Warray-bounds-flex-arrays-1.c  |  2 +-
>  .../gcc.dg/Warray-bounds-flex-arrays-2.c  |  4 +-
>  .../gcc.dg/Warray-bounds-flex-arrays-3.c  |  6 +--
>  .../gcc.dg/Warray-bounds-flex-arrays-4.c  |  2 +-
>  .../gcc.dg/Warray-bounds-flex-arrays-5.c  |  4 +-
>  .../gcc.dg/Warray-bounds-flex-arrays-6.c  |  6 +--
>  gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c  |  8 ++--
>  gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c  |  8 ++--
>  gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c|  8 ++--
>  gcc/tree.cc   | 37 +++
>  14 files changed, 68 insertions(+), 123 deletions(-)
> 
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index 49cb299a3c1..7db730cf831 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -2456,36 +2456,6 @@ init_attr_rdwr_indices (rdwr_map *rwm, tree attrs)
>  }
>  }
>  
> -/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
> -   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
> -unsigned int
> -strict_flex_array_level_of (tree array_field)
> -{
> -  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
> -  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
> -
> -  tree attr_strict_flex_array
> -= lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
> -  /* If there is a strict_flex_array attribute attached to the field,
> - override the flag_strict_flex_arrays.  */
> -  if (attr_strict_flex_array)
> -{
> -  /* Get the value of the level first from the attribute.  */
> -  unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
> -  gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
> -  attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
> -  gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
> -  attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
> -  gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
> -  attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
> -
> -  /* The attribute has higher priority than flag_struct_flex_array.  */
> -  strict_flex_array_level = attr_strict_flex_array_level;
> -}
> -  return strict_flex_array_level;
> -}
> -
> -
>  /* Return the access specification for a function parameter PARM
> or null if the current function has no such specification.  */
>  
> diff --git a/gcc/attribs.h b/gcc/attribs.h
> index e7540f71d6a..140e70b64e0 100644
> --- a/gcc/attribs.h
> +++ b/gcc/attribs.h
> @@ -398,6 +398,4 @@ extern void init_attr_rdwr_indices (rdwr_map *, tree);
>  extern attr_access *get_parm_access (rdwr_map &, tree,
>tree = current_function_decl);
>  
> -extern unsigned int strict_flex_array_level_of (tree);
> -
>  #endif // GCC_ATTRIBS_H
> diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
> index d76ffb3380d..38647c100c1 100644
> --- a/gcc/c/c-decl.cc
> +++ b/gcc/c/c-decl.cc
> @@ -9047,6 +9047,34 @@ finish_incomplete_vars (tree incomplete_vars, bool 
> toplevel)
>  }
>  }
>  
> +/* Get the LEVEL of the 

Re: [PATCH] x86: Avoid -Wuninitialized warnings on _mm*_undefined_* in C++ [PR105593]

2023-01-12 Thread Uros Bizjak via Gcc-patches
On Fri, Jan 13, 2023 at 1:36 AM Jakub Jelinek  wrote:
>
> Hi!
>
> In https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609844.html
> I've posted a patch to allow ignoring -Winit-self using GCC diagnostic
> pragmas, such that one can mark self-initialization as intentional
> disabling of -Wuninitialized warnings.
>
> The following incremental patch uses that in the x86 intrinsic
> headers.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
> if the above mentioned patch is approved/committed?
>
> 2023-01-13  Jakub Jelinek  
>
> PR c++/105593
> gcc/
> * config/i386/xmmintrin.h (_mm_undefined_ps): Temporarily
> disable -Winit-self using pragma GCC diagnostic ignored.
> * config/i386/emmintrin.h (_mm_undefined_pd, _mm_undefined_si128):
> Likewise.
> * config/i386/avxintrin.h (_mm256_undefined_pd, _mm256_undefined_ps,
> _mm256_undefined_si256): Likewise.
> * config/i386/avx512fintrin.h (_mm512_undefined_pd,
> _mm512_undefined_ps, _mm512_undefined_epi32): Likewise.
> * config/i386/avx512fp16intrin.h (_mm_undefined_ph,
> _mm256_undefined_ph, _mm512_undefined_ph): Likewise.
> gcc/testsuite/
> * g++.target/i386/pr105593.C: New test.

LGTM.

Thanks,
Uros.

>
> --- gcc/config/i386/xmmintrin.h.jj  2022-11-07 10:30:42.741629972 +0100
> +++ gcc/config/i386/xmmintrin.h 2023-01-12 15:33:21.685483809 +0100
> @@ -112,7 +112,10 @@ typedef float __v4sf __attribute__ ((__v
>  extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm_undefined_ps (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m128 __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
> --- gcc/config/i386/emmintrin.h.jj  2022-03-14 14:48:04.299055735 +0100
> +++ gcc/config/i386/emmintrin.h 2023-01-12 15:33:04.148735997 +0100
> @@ -99,7 +99,10 @@ _mm_setr_pd (double __W, double __X)
>  extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm_undefined_pd (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m128d __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
> @@ -785,7 +788,10 @@ _mm_move_epi64 (__m128i __A)
>  extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm_undefined_si128 (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m128i __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
> --- gcc/config/i386/avxintrin.h.jj  2022-01-11 23:11:21.763298965 +0100
> +++ gcc/config/i386/avxintrin.h 2023-01-12 15:34:15.395711408 +0100
> @@ -1207,21 +1207,30 @@ _mm256_movemask_ps (__m256 __A)
>  extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm256_undefined_pd (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m256d __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
>  extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm256_undefined_ps (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m256 __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
>  extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm256_undefined_si256 (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m256i __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
> --- gcc/config/i386/avx512fintrin.h.jj  2022-04-25 13:54:38.277309716 +0200
> +++ gcc/config/i386/avx512fintrin.h 2023-01-12 15:35:09.868928035 +0100
> @@ -185,7 +185,10 @@ extern __inline __m512
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm512_undefined_ps (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m512 __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
> @@ -195,7 +198,10 @@ extern __inline __m512d
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm512_undefined_pd (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m512d __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
> @@ -203,7 +209,10 @@ extern __inline __m512i
>  __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>  _mm512_undefined_epi32 (void)
>  {
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Winit-self"
>__m512i __Y = __Y;
> +#pragma GCC diagnostic pop
>return __Y;
>  }
>
> --- gcc/config/i386/avx512fp16intrin.h.jj   2022-11-21 10:28:24.284544313 
> +0100
> +++ gcc/config/i386/avx512fp16intrin.h  2023-01-12 15:35:41.070479329 +0100
> @@ -204,7 +204,10 @@ extern __inline __m128h
>  

[PATCH v5] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-12 Thread Lulu Cheng
Co-authored-by: Yang Yujie 

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_classify_address):
Add precessint for CONST_INT.
(loongarch_print_operand_reloc): Operand modifier 'c' is supported.
(loongarch_print_operand): Increase the processing of '%c'.
* doc/extend.texi: Adds documents for LoongArch operand modifiers.
And port the public operand modifiers information to this document.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/tst-asm-const.c: Moved to...
* gcc.target/loongarch/pr107731.c: ...here.

---
V2 -> v3:
1. Correct a clerical error.
2. Adding document for loongarch operand modifiers.

v3 -> v4:
Copy the description of "%c" "%n" "%a" "%l" from gccint.pdf to gcc.pdf.

v4 -> v5:
Move the operand modifiers description of "%c", "%n", "%a", "%l" to the top of 
the
x86Operandmodifiers section.

---
 gcc/config/loongarch/loongarch.cc |  14 ++
 gcc/doc/extend.texi   | 135 --
 .../loongarch/{tst-asm-const.c => pr107731.c} |   6 +-
 3 files changed, 106 insertions(+), 49 deletions(-)
 rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} (78%)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index c6b03fcf2f9..cdf190b985e 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -2075,6 +2075,11 @@ loongarch_classify_address (struct 
loongarch_address_info *info, rtx x,
   return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
  && loongarch_valid_lo_sum_p (info->symbol_type, mode,
   info->offset));
+case CONST_INT:
+  /* Small-integer addresses don't occur very often, but they
+are legitimate if $r0 is a valid base register.  */
+  info->type = ADDRESS_CONST_INT;
+  return IMM12_OPERAND (INTVAL (x));
 
 default:
   return false;
@@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool 
hi64_part,
 
'A' Print a _DB suffix if the memory model requires a release.
'b' Print the address of a memory operand, without offset.
+   'c'  Print an integer.
'C' Print the integer branch condition for comparison OP.
'd' Print CONST_INT OP in decimal.
'F' Print the FPU branch condition for comparison OP.
@@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
fputs ("_db", file);
   break;
 
+case 'c':
+  if (CONST_INT_P (op))
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
+  else
+   output_operand_lossage ("unsupported operand for code '%c'", letter);
+
+  break;
+
 case 'C':
   loongarch_print_int_branch_condition (file, code, letter);
   break;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1103e9936f7..256811cb8f5 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -10402,8 +10402,10 @@ ensures that modifying @var{a} does not affect the 
address referenced by
 is undefined if @var{a} is modified before using @var{b}.
 
 @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
-instead of simply @samp{%2}). Typically these qualifiers are hardware 
-dependent. The list of supported modifiers for x86 is found at 
+instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
+Generic Operand modifiers} lists the modifiers that are available
+on all targets.  Other modifiers are hardware dependent.
+For example, the list of supported modifiers for x86 is found at
 @ref{x86Operandmodifiers,x86 Operand modifiers}.
 
 If the C code that follows the @code{asm} makes no use of any of the output 
@@ -10671,8 +10673,10 @@ optimizers may discard the @code{asm} statement as 
unneeded
 (see @ref{Volatile}).
 
 @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
-instead of simply @samp{%2}). Typically these qualifiers are hardware 
-dependent. The list of supported modifiers for x86 is found at 
+instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
+Generic Operand modifiers} lists the modifiers that are available
+on all targets.  Other modifiers are hardware dependent.
+For example, the list of supported modifiers for x86 is found at
 @ref{x86Operandmodifiers,x86 Operand modifiers}.
 
 In this example using the fictitious @code{combine} instruction, the 
@@ -11024,9 +11028,8 @@ lab:
 @}
 @end example
 
-@anchor{x86Operandmodifiers}
-@subsubsection x86 Operand Modifiers
-
+@anchor{GenericOperandmodifiers}
+@subsubsection Generic Operand Modifiers
 References to input, output, and goto operands in the assembler template
 of extended @code{asm} statements can use 
 modifiers to affect the way the operands are formatted in 
@@ -11045,48 +11048,31 @@ These modifiers generate this assembler code:
 xchg %ah, %al
 @end example
 
-The rest of this discussion uses the following code for illustrative purposes.
-
-@example
-int 

Re: [PATCH] c, c++: Allow ignoring -Winit-self through pragmas [PR105593]

2023-01-12 Thread Jason Merrill via Gcc-patches

On 1/12/23 19:32, Jakub Jelinek wrote:

Hi!

As mentioned in the PR, various x86 intrinsics need to return
an uninitialized vector.  Currently they use self initialization
to avoid -Wuninitialized warnings, which works fine in C, but
doesn't work in C++ where -Winit-self is enabled in -Wall.
We don't have an attribute to mark a variable as knowingly
uninitialized (the uninitialized attribute exists but means
something else, only in the -ftrivial-auto-var-init context),
and trying to suppress either -Wuninitialized or -Winit-self
inside of the _mm_undefined_ps etc. intrinsic definitions
doesn't work, one needs to currently disable through pragmas
-Wuninitialized warning at the point where _mm_undefined_ps etc.
result is actually used, but that goes against the intent of
those intrinsics.

The -Winit-self warning option actually doesn't do any warning,
all we do is record a suppression for -Winit-self if !warn_init_self
on the decl definition and later look that up in uninit pass.

The following patch changes those !warn_init_self tests which
are true only based on the command line option setting, not based
on GCC diagnostic pragma overrides to
!warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self)
such that it takes them into account.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK on Monday if the C maintainers don't comment.


Will post incremental patch for the intrinsic headers.

2023-01-13  Jakub Jelinek  

PR c++/105593
gcc/c/
* c-parser.cc (c_parser_initializer): Check warning_enabled_at
at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead
of warn_init_self.
gcc/cp/
* decl.cc (cp_finish_decl): Check warning_enabled_at
at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead
of warn_init_self.
gcc/testsuite/
* c-c++-common/Winit-self3.c: New test.
* c-c++-common/Winit-self4.c: New test.
* c-c++-common/Winit-self5.c: New test.

--- gcc/c/c-parser.cc.jj2023-01-11 22:18:25.560492345 +0100
+++ gcc/c/c-parser.cc   2023-01-12 15:30:10.460233783 +0100
@@ -5701,7 +5701,7 @@ c_parser_initializer (c_parser *parser,
  && !DECL_EXTERNAL (decl)
  && !TREE_STATIC (decl)
  && ret.value == decl
- && !warn_init_self)
+ && !warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self))
suppress_warning (decl, OPT_Winit_self);
if (TREE_CODE (ret.value) != STRING_CST
  && (TREE_CODE (ret.value) != COMPOUND_LITERAL_EXPR
--- gcc/cp/decl.cc.jj   2023-01-04 18:42:24.597997547 +0100
+++ gcc/cp/decl.cc  2023-01-12 15:26:01.257817526 +0100
@@ -8407,7 +8407,7 @@ cp_finish_decl (tree decl, tree init, bo
if (!DECL_EXTERNAL (decl)
  && !TREE_STATIC (decl)
  && decl == tree_strip_any_location_wrapper (init)
- && !warn_init_self)
+ && !warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self))
suppress_warning (decl, OPT_Winit_self);
  }
  
--- gcc/testsuite/c-c++-common/Winit-self3.c.jj	2023-01-12 15:49:56.759172518 +0100

+++ gcc/testsuite/c-c++-common/Winit-self3.c2023-01-12 15:50:51.512384963 
+0100
@@ -0,0 +1,36 @@
+/* PR c++/105593 */
+/* { dg-do compile } */
+/* { dg-options "-W -Wall" } */
+
+void bar (int);
+
+static inline int
+baz (void)
+{
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
+  int u = u;   /* { dg-bogus "'u' is used uninitialized" } */
+#pragma GCC diagnostic pop
+  return u;
+}
+
+void
+foo (void)
+{
+  int u = baz ();
+  bar (u);
+}
+
+static inline int
+qux (void)
+{
+  int u = u;   /* { dg-warning "'u' is used uninitialized" "" { target 
c++ } } */
+  return u;/* { dg-message "'u' was declared here" "" { target c++ 
} .-1 } */
+}
+
+void
+corge (void)
+{
+  int u = qux ();
+  bar (u);
+}
--- gcc/testsuite/c-c++-common/Winit-self4.c.jj 2023-01-12 15:50:15.233906776 
+0100
+++ gcc/testsuite/c-c++-common/Winit-self4.c2023-01-12 15:50:42.445515372 
+0100
@@ -0,0 +1,36 @@
+/* PR c++/105593 */
+/* { dg-do compile } */
+/* { dg-options "-W -Wall -Winit-self" } */
+
+void bar (int);
+
+static inline int
+baz (void)
+{
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
+  int u = u;   /* { dg-bogus "'u' is used uninitialized" } */
+#pragma GCC diagnostic pop
+  return u;
+}
+
+void
+foo (void)
+{
+  int u = baz ();
+  bar (u);
+}
+
+static inline int
+qux (void)
+{
+  int u = u;   /* { dg-warning "'u' is used uninitialized" } */
+  return u;/* { dg-message "'u' was declared here" "" { target 
*-*-* } .-1 } */
+}
+
+void
+corge (void)
+{
+  int u = qux ();
+  bar (u);
+}
--- gcc/testsuite/c-c++-common/Winit-self5.c.jj 2023-01-12 15:51:01.045247847 
+0100
+++ gcc/testsuite/c-c++-common/Winit-self5.c2023-01-12 15:51:12.929076923 
+0100
@@ -0,0 +1,36 @@
+/* PR c++/105593 */
+/* { dg-do compile } */
+/* { dg-options "-W -Wall 

Re: [PATCH] modula-2: Handle pass '-v' option to the compiler.

2023-01-12 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> Tested on x86-64-darwin21.
> OK for trunk?
> Iain

yes LGTM,

thanks,
Gaius

> --- 8< ---
>
> Somehow this setting had been missed, and we really need the verbose
> flag to enable useful debug output.
>
> Signed-off-by: Iain Sandoe 
>
> gcc/m2/ChangeLog:
>
>   * gm2-gcc/m2options.h (M2Options_SetVerbose): Export the
>   function.
>   * gm2-lang.cc: Handle OPT_v, passing it to the compiler.
>   * lang-specs.h: Pass -v to cc1gm2.
> ---
>  gcc/m2/gm2-gcc/m2options.h | 1 +
>  gcc/m2/gm2-lang.cc | 3 +++
>  gcc/m2/lang-specs.h| 2 +-
>  3 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/m2/gm2-gcc/m2options.h b/gcc/m2/gm2-gcc/m2options.h
> index 4b32c911b09..beaa460ffa9 100644
> --- a/gcc/m2/gm2-gcc/m2options.h
> +++ b/gcc/m2/gm2-gcc/m2options.h
> @@ -106,6 +106,7 @@ EXTERN int M2Options_GetCpp (void);
>  EXTERN int M2Options_GetM2g (void);
>  EXTERN void M2Options_SetM2g (int value);
>  EXTERN void M2Options_SetLowerCaseKeywords (int value);
> +EXTERN void M2Options_SetVerbose (int value);
>  EXTERN void M2Options_SetUnusedVariableChecking (int value);
>  EXTERN void M2Options_SetUnusedParameterChecking (int value);
>  EXTERN void M2Options_SetStrictTypeChecking (int value);
> diff --git a/gcc/m2/gm2-lang.cc b/gcc/m2/gm2-lang.cc
> index 49f93901d3c..073d1dd55db 100644
> --- a/gcc/m2/gm2-lang.cc
> +++ b/gcc/m2/gm2-lang.cc
> @@ -451,6 +451,9 @@ gm2_langhook_handle_option (
>  case OPT_save_temps_:
>M2Options_SetSaveTempsDir (arg);
>return 1;
> +case OPT_v:
> +  M2Options_SetVerbose (value);
> +  return 1;
>  default:
>if (insideCppArgs)
>   {
> diff --git a/gcc/m2/lang-specs.h b/gcc/m2/lang-specs.h
> index 0a34b0bc6d7..bf882649b21 100644
> --- a/gcc/m2/lang-specs.h
> +++ b/gcc/m2/lang-specs.h
> @@ -34,6 +34,6 @@ along with GCC; see the file COPYING3.  If not see
>{"@modula-2",
>"cc1gm2 " M2CPP
>"  %(cc1_options) %{B*} %{c*} %{+e*} %{I*} "
> -  "  %{i*} %{save-temps*} "
> +  "  %{i*} %{save-temps*} %{v} "
>"  %i %{!fsyntax-only:%(invoke_as)}",
>0, 0, 0},


Re: [PATCH v2] libstdc++: Fix Unicode codecvt and add tests [PR86419]

2023-01-12 Thread Jonathan Wakely via Gcc-patches
On Tue, 10 Jan 2023 at 12:59, Dimitrij Mijoski via Libstdc++
 wrote:
>
> Fixes the conversion from UTF-8 to UTF-16 to properly return partial
> instead ok.
> Fixes the conversion from UTF-16 to UTF-8 to properly return partial
> instead ok.
> Fixes the conversion from UTF-8 to UCS-2 to properly return partial
> instead error.
> Fixes the conversion from UTF-8 to UCS-2 to treat 4-byte UTF-8 sequences
> as error just by seeing the leading byte.
> Fixes UTF-8 decoding for all codecvts so they detect error at the end of
> the input range when the last code point is also incomplete.
>
> libstdc++-v3/ChangeLog:
> PR libstdc++/86419
> * src/c++11/codecvt.cc: Fix bugs.
> * testsuite/22_locale/codecvt/codecvt_unicode.cc: New tests.
> * testsuite/22_locale/codecvt/codecvt_unicode.h: New tests.
> * testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc: New
>   tests.

I'm just finishing testing and will push this. I had to add 
to the new codecvt_unicode.h header, and fixed some formatting in
codecvt (we don't put a space before function parameter lists in the
libstdc++ code, unlike in the code for the GCC compiler itself). I've
also made the changelog entry a bit more descriptive.

Thanks again for the bug report and for fixing it!



Re: [PATCH] libstdc++: Fix unintended layout change to std::basic_filebuf [PR108331]

2023-01-12 Thread Jonathan Yong via Gcc-patches

On 1/12/23 21:10, Jonathan Wakely wrote:

Tested x86_64-linux. Bootstrapped x86_64-w64-mingw32 with both
--enable-threads=posix and --enable-threads=win32.

I need a libgcc or mingw maintainer to approve the gthr-win32.h part.

OK for trunk?



Looks good to me even if its a dummy member. Thanks for the patch.


[PATCH] x86: Avoid -Wuninitialized warnings on _mm*_undefined_* in C++ [PR105593]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
Hi!

In https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609844.html
I've posted a patch to allow ignoring -Winit-self using GCC diagnostic
pragmas, such that one can mark self-initialization as intentional
disabling of -Wuninitialized warnings.

The following incremental patch uses that in the x86 intrinsic
headers.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
if the above mentioned patch is approved/committed?

2023-01-13  Jakub Jelinek  

PR c++/105593
gcc/
* config/i386/xmmintrin.h (_mm_undefined_ps): Temporarily
disable -Winit-self using pragma GCC diagnostic ignored.
* config/i386/emmintrin.h (_mm_undefined_pd, _mm_undefined_si128):
Likewise.
* config/i386/avxintrin.h (_mm256_undefined_pd, _mm256_undefined_ps,
_mm256_undefined_si256): Likewise.
* config/i386/avx512fintrin.h (_mm512_undefined_pd,
_mm512_undefined_ps, _mm512_undefined_epi32): Likewise.
* config/i386/avx512fp16intrin.h (_mm_undefined_ph,
_mm256_undefined_ph, _mm512_undefined_ph): Likewise.
gcc/testsuite/
* g++.target/i386/pr105593.C: New test.

--- gcc/config/i386/xmmintrin.h.jj  2022-11-07 10:30:42.741629972 +0100
+++ gcc/config/i386/xmmintrin.h 2023-01-12 15:33:21.685483809 +0100
@@ -112,7 +112,10 @@ typedef float __v4sf __attribute__ ((__v
 extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_undefined_ps (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m128 __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
--- gcc/config/i386/emmintrin.h.jj  2022-03-14 14:48:04.299055735 +0100
+++ gcc/config/i386/emmintrin.h 2023-01-12 15:33:04.148735997 +0100
@@ -99,7 +99,10 @@ _mm_setr_pd (double __W, double __X)
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_undefined_pd (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m128d __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
@@ -785,7 +788,10 @@ _mm_move_epi64 (__m128i __A)
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_undefined_si128 (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m128i __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
--- gcc/config/i386/avxintrin.h.jj  2022-01-11 23:11:21.763298965 +0100
+++ gcc/config/i386/avxintrin.h 2023-01-12 15:34:15.395711408 +0100
@@ -1207,21 +1207,30 @@ _mm256_movemask_ps (__m256 __A)
 extern __inline __m256d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm256_undefined_pd (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m256d __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm256_undefined_ps (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m256 __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
 extern __inline __m256i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm256_undefined_si256 (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m256i __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
--- gcc/config/i386/avx512fintrin.h.jj  2022-04-25 13:54:38.277309716 +0200
+++ gcc/config/i386/avx512fintrin.h 2023-01-12 15:35:09.868928035 +0100
@@ -185,7 +185,10 @@ extern __inline __m512
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_undefined_ps (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m512 __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
@@ -195,7 +198,10 @@ extern __inline __m512d
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_undefined_pd (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m512d __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
@@ -203,7 +209,10 @@ extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_undefined_epi32 (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m512i __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
--- gcc/config/i386/avx512fp16intrin.h.jj   2022-11-21 10:28:24.284544313 
+0100
+++ gcc/config/i386/avx512fp16intrin.h  2023-01-12 15:35:41.070479329 +0100
@@ -204,7 +204,10 @@ extern __inline __m128h
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_undefined_ph (void)
 {
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
   __m128h __Y = __Y;
+#pragma GCC diagnostic pop
   return __Y;
 }
 
@@ -212,7 +215,10 @@ extern __inline __m256h
 __attribute__ ((__gnu_inline__, __always_inline__, 

[PATCH] c, c++: Allow ignoring -Winit-self through pragmas [PR105593]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, various x86 intrinsics need to return
an uninitialized vector.  Currently they use self initialization
to avoid -Wuninitialized warnings, which works fine in C, but
doesn't work in C++ where -Winit-self is enabled in -Wall.
We don't have an attribute to mark a variable as knowingly
uninitialized (the uninitialized attribute exists but means
something else, only in the -ftrivial-auto-var-init context),
and trying to suppress either -Wuninitialized or -Winit-self
inside of the _mm_undefined_ps etc. intrinsic definitions
doesn't work, one needs to currently disable through pragmas
-Wuninitialized warning at the point where _mm_undefined_ps etc.
result is actually used, but that goes against the intent of
those intrinsics.

The -Winit-self warning option actually doesn't do any warning,
all we do is record a suppression for -Winit-self if !warn_init_self
on the decl definition and later look that up in uninit pass.

The following patch changes those !warn_init_self tests which
are true only based on the command line option setting, not based
on GCC diagnostic pragma overrides to
!warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self)
such that it takes them into account.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Will post incremental patch for the intrinsic headers.

2023-01-13  Jakub Jelinek  

PR c++/105593
gcc/c/
* c-parser.cc (c_parser_initializer): Check warning_enabled_at
at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead
of warn_init_self.
gcc/cp/
* decl.cc (cp_finish_decl): Check warning_enabled_at
at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead
of warn_init_self.
gcc/testsuite/
* c-c++-common/Winit-self3.c: New test.
* c-c++-common/Winit-self4.c: New test.
* c-c++-common/Winit-self5.c: New test.

--- gcc/c/c-parser.cc.jj2023-01-11 22:18:25.560492345 +0100
+++ gcc/c/c-parser.cc   2023-01-12 15:30:10.460233783 +0100
@@ -5701,7 +5701,7 @@ c_parser_initializer (c_parser *parser,
  && !DECL_EXTERNAL (decl)
  && !TREE_STATIC (decl)
  && ret.value == decl
- && !warn_init_self)
+ && !warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self))
suppress_warning (decl, OPT_Winit_self);
   if (TREE_CODE (ret.value) != STRING_CST
  && (TREE_CODE (ret.value) != COMPOUND_LITERAL_EXPR
--- gcc/cp/decl.cc.jj   2023-01-04 18:42:24.597997547 +0100
+++ gcc/cp/decl.cc  2023-01-12 15:26:01.257817526 +0100
@@ -8407,7 +8407,7 @@ cp_finish_decl (tree decl, tree init, bo
   if (!DECL_EXTERNAL (decl)
  && !TREE_STATIC (decl)
  && decl == tree_strip_any_location_wrapper (init)
- && !warn_init_self)
+ && !warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self))
suppress_warning (decl, OPT_Winit_self);
 }
 
--- gcc/testsuite/c-c++-common/Winit-self3.c.jj 2023-01-12 15:49:56.759172518 
+0100
+++ gcc/testsuite/c-c++-common/Winit-self3.c2023-01-12 15:50:51.512384963 
+0100
@@ -0,0 +1,36 @@
+/* PR c++/105593 */
+/* { dg-do compile } */
+/* { dg-options "-W -Wall" } */
+
+void bar (int);
+
+static inline int
+baz (void)
+{
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
+  int u = u;   /* { dg-bogus "'u' is used uninitialized" } */
+#pragma GCC diagnostic pop
+  return u;
+}
+
+void
+foo (void)
+{
+  int u = baz ();
+  bar (u);
+}
+
+static inline int
+qux (void)
+{
+  int u = u;   /* { dg-warning "'u' is used uninitialized" "" { target 
c++ } } */
+  return u;/* { dg-message "'u' was declared here" "" { target c++ 
} .-1 } */
+}
+
+void
+corge (void)
+{
+  int u = qux ();
+  bar (u);
+}
--- gcc/testsuite/c-c++-common/Winit-self4.c.jj 2023-01-12 15:50:15.233906776 
+0100
+++ gcc/testsuite/c-c++-common/Winit-self4.c2023-01-12 15:50:42.445515372 
+0100
@@ -0,0 +1,36 @@
+/* PR c++/105593 */
+/* { dg-do compile } */
+/* { dg-options "-W -Wall -Winit-self" } */
+
+void bar (int);
+
+static inline int
+baz (void)
+{
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Winit-self"
+  int u = u;   /* { dg-bogus "'u' is used uninitialized" } */
+#pragma GCC diagnostic pop
+  return u;
+}
+
+void
+foo (void)
+{
+  int u = baz ();
+  bar (u);
+}
+
+static inline int
+qux (void)
+{
+  int u = u;   /* { dg-warning "'u' is used uninitialized" } */
+  return u;/* { dg-message "'u' was declared here" "" { target 
*-*-* } .-1 } */
+}
+
+void
+corge (void)
+{
+  int u = qux ();
+  bar (u);
+}
--- gcc/testsuite/c-c++-common/Winit-self5.c.jj 2023-01-12 15:51:01.045247847 
+0100
+++ gcc/testsuite/c-c++-common/Winit-self5.c2023-01-12 15:51:12.929076923 
+0100
@@ -0,0 +1,36 @@
+/* PR c++/105593 */
+/* { dg-do compile } */
+/* { dg-options "-W -Wall -Wno-init-self" } */
+
+void bar (int);
+
+static inline int
+baz (void)
+{
+#pragma GCC diagnostic push

Re: [PATCH] c, c++, v2: Avoid incorrect shortening of divisions [PR108365]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 09:31:23PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Thu, Jan 12, 2023 at 08:55:32PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > > > So, the following patch for the NOP_EXPR cases checks just in case that
> > > > it is from integral type and more importantly checks it is a widening
> > > > conversion, and then next to it also allows op0 to be just unsigned,
> > > > promoted or not, as that is what the C FE will do for those cases too
> > > > and I believe it must work - either the division/modulo common type
> > > > will be that unsigned type, then we can shorten and don't need to worry
> > > > about UB, or it will be some wider signed type but then it can't be most
> > > > negative value of the wider type.
> > > 
> > > Why not use the same condition in C and C++?
> > 
> > I can test that.  Do you mean change the C FE to match the patched C++
> > or change C++ FE to just test TYPE_UNSIGNED (orig_op0)?
> > I think both should work, though what I wrote perhaps can shorten in more
> > cases.  Can try to construct testcases where it differs...
> 
> E.g.
> int f1 (int x, int y) { return (unsigned) x / y; }
> unsigned short f2 (unsigned short x, unsigned short y) { return (unsigned) x 
> / y; }
> unsigned int f3 (unsigned int x, unsigned int y) { return (long long) x / y; }
> C++ FE before and after the patch shortens the division in f2 and f3,
> C FE shortens only in f2.  So using the C FE condition would be a regression
> for C++.
> 
> Therefore I'm going to test following patch:

Bootstrapped/regtested successfully on x86_64-linux and i686-linux.

Jakub



[PATCH] modula-2: Handle pass '-v' option to the compiler.

2023-01-12 Thread Iain Sandoe via Gcc-patches
Tested on x86-64-darwin21.
OK for trunk?
Iain

--- 8< ---

Somehow this setting had been missed, and we really need the verbose
flag to enable useful debug output.

Signed-off-by: Iain Sandoe 

gcc/m2/ChangeLog:

* gm2-gcc/m2options.h (M2Options_SetVerbose): Export the
function.
* gm2-lang.cc: Handle OPT_v, passing it to the compiler.
* lang-specs.h: Pass -v to cc1gm2.
---
 gcc/m2/gm2-gcc/m2options.h | 1 +
 gcc/m2/gm2-lang.cc | 3 +++
 gcc/m2/lang-specs.h| 2 +-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/m2/gm2-gcc/m2options.h b/gcc/m2/gm2-gcc/m2options.h
index 4b32c911b09..beaa460ffa9 100644
--- a/gcc/m2/gm2-gcc/m2options.h
+++ b/gcc/m2/gm2-gcc/m2options.h
@@ -106,6 +106,7 @@ EXTERN int M2Options_GetCpp (void);
 EXTERN int M2Options_GetM2g (void);
 EXTERN void M2Options_SetM2g (int value);
 EXTERN void M2Options_SetLowerCaseKeywords (int value);
+EXTERN void M2Options_SetVerbose (int value);
 EXTERN void M2Options_SetUnusedVariableChecking (int value);
 EXTERN void M2Options_SetUnusedParameterChecking (int value);
 EXTERN void M2Options_SetStrictTypeChecking (int value);
diff --git a/gcc/m2/gm2-lang.cc b/gcc/m2/gm2-lang.cc
index 49f93901d3c..073d1dd55db 100644
--- a/gcc/m2/gm2-lang.cc
+++ b/gcc/m2/gm2-lang.cc
@@ -451,6 +451,9 @@ gm2_langhook_handle_option (
 case OPT_save_temps_:
   M2Options_SetSaveTempsDir (arg);
   return 1;
+case OPT_v:
+  M2Options_SetVerbose (value);
+  return 1;
 default:
   if (insideCppArgs)
{
diff --git a/gcc/m2/lang-specs.h b/gcc/m2/lang-specs.h
index 0a34b0bc6d7..bf882649b21 100644
--- a/gcc/m2/lang-specs.h
+++ b/gcc/m2/lang-specs.h
@@ -34,6 +34,6 @@ along with GCC; see the file COPYING3.  If not see
   {"@modula-2",
   "cc1gm2 " M2CPP
   "  %(cc1_options) %{B*} %{c*} %{+e*} %{I*} "
-  "  %{i*} %{save-temps*} "
+  "  %{i*} %{save-temps*} %{v} "
   "  %i %{!fsyntax-only:%(invoke_as)}",
   0, 0, 0},
-- 
2.37.1 (Apple Git-137.1)



[committed] libstdc++: Fix exports for IEEE128 versions of __try_use_facet [PR108327]

2023-01-12 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux and powerpc64le-linux. Pushed to trunk.

-- >8 --

The new symbols need to be exported, as well as some of the
std::locale::facet::id globals, which are not new but were presumably
not needed by any inline functions before now.

libstdc++-v3/ChangeLog:

PR libstdc++/108327
* config/os/gnu-linux/ldbl-extra.ver (GLIBCXX_LDBL_3.4.31):
Export __try_use_facet specializations for facets in namespace
__gnu_cxx_ldbl128.
* config/os/gnu-linux/ldbl-ieee128-extra.ver
(GLIBCXX_IEEE128_3.4.31): Likewise for facets in namespace
__gnu_cxx_ieee128.
* testsuite/util/testsuite_abi.cc: Add to lists of known and
latest versions.
---
 libstdc++-v3/config/os/gnu-linux/ldbl-extra.ver | 4 
 libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver | 7 +++
 libstdc++-v3/testsuite/util/testsuite_abi.cc| 9 ++---
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/config/os/gnu-linux/ldbl-extra.ver 
b/libstdc++-v3/config/os/gnu-linux/ldbl-extra.ver
index 8c7c783ba58..ba7cc3308f7 100644
--- a/libstdc++-v3/config/os/gnu-linux/ldbl-extra.ver
+++ b/libstdc++-v3/config/os/gnu-linux/ldbl-extra.ver
@@ -45,6 +45,10 @@ GLIBCXX_LDBL_3.4.29 {
   _ZSt8to_charsPcS_g*;
 } GLIBCXX_LDBL_3.4.21;
 
+GLIBCXX_LDBL_3.4.31 {
+  _ZSt15__try_use_facetINSt17__gnu_cxx_ldbl128[789]*;
+} GLIBCXX_LDBL_3.4.29;
+
 CXXABI_LDBL_1.3 {
   _ZT[IS]g;
   _ZT[IS]Pg;
diff --git a/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver 
b/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
index 830cb8c40f4..17d696364c4 100644
--- a/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
+++ b/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
@@ -50,6 +50,13 @@ GLIBCXX_IEEE128_3.4.30 {
   _ZNSt17__gnu_cxx_ieee12816__convert_from_vERKP15__locale_structPciPKcz;
 } GLIBCXX_3.4.30;
 
+GLIBCXX_IEEE128_3.4.31 {
+  _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1287num_get*;
+  _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1287num_put*;
+  
_ZNSt19__gnu_cxx11_ieee1289money_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE2idE;
+  
_ZNSt19__gnu_cxx11_ieee1289money_putI[cw]St19ostreambuf_iteratorI[cw]St11char_traitsI[cw]EEE2idE;
+} GLIBCXX_3.4.31;
+
 CXXABI_IEEE128_1.3.13 {
 
   _ZT[IS]u9__ieee128;
diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc 
b/libstdc++-v3/testsuite/util/testsuite_abi.cc
index 27616f5987e..f838d27cfb5 100644
--- a/libstdc++-v3/testsuite/util/testsuite_abi.cc
+++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc
@@ -210,11 +210,13 @@ check_version(symbol& test, bool added)
   known_versions.push_back("GLIBCXX_3.4.27");
   known_versions.push_back("GLIBCXX_3.4.28");
   known_versions.push_back("GLIBCXX_3.4.29");
+  known_versions.push_back("GLIBCXX_LDBL_3.4.29");
   known_versions.push_back("GLIBCXX_3.4.30");
   known_versions.push_back("GLIBCXX_3.4.31");
-  known_versions.push_back("GLIBCXX_LDBL_3.4.29");
+  known_versions.push_back("GLIBCXX_LDBL_3.4.31");
   known_versions.push_back("GLIBCXX_IEEE128_3.4.29");
   known_versions.push_back("GLIBCXX_IEEE128_3.4.30");
+  known_versions.push_back("GLIBCXX_IEEE128_3.4.31");
   known_versions.push_back("CXXABI_1.3");
   known_versions.push_back("CXXABI_LDBL_1.3");
   known_versions.push_back("CXXABI_1.3.1");
@@ -250,8 +252,9 @@ check_version(symbol& test, bool added)
 
   // Check that added symbols are added in the latest pre-release version.
   bool latestp = (test.version_name == "GLIBCXX_3.4.31"
- // XXX remove next line when baselines have been regenerated.
-|| test.version_name == "GLIBCXX_IEEE128_3.4.30"
+ // XXX remove next 2 lines when baselines have been regenerated.
+|| test.version_name == "GLIBCXX_IEEE128_3.4.31"
+|| test.version_name == "GLIBCXX_LDBL_3.4.31"
 || test.version_name == "CXXABI_1.3.14"
 || test.version_name == "CXXABI_FLOAT128"
 || test.version_name == "CXXABI_TM_1");
-- 
2.39.0



[committed] libstdc++: Do not include in concurrency headers

2023-01-12 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux and powerpc64le-linux. Pushed to trunk.

-- >8 --

The , , and  headers use
std::errc constants, but don't use std::system_error itself. They only
use the __throw_system_error(int) function, which is defined in
.

By including the header for the errc constants instead of the whole of
 we avoid depending on the whole std::string definition.

libstdc++-v3/ChangeLog:

* include/bits/std_mutex.h: Remove  include.
* include/std/condition_variable: Add 
include.
* include/std/mutex: Likewise.
* include/std/shared_mutex: Likewise.
---
 libstdc++-v3/include/bits/std_mutex.h   | 1 -
 libstdc++-v3/include/std/condition_variable | 3 ++-
 libstdc++-v3/include/std/mutex  | 2 +-
 libstdc++-v3/include/std/shared_mutex   | 1 +
 4 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/std_mutex.h 
b/libstdc++-v3/include/bits/std_mutex.h
index 68f5fb9ed65..bc515358d23 100644
--- a/libstdc++-v3/include/bits/std_mutex.h
+++ b/libstdc++-v3/include/bits/std_mutex.h
@@ -36,7 +36,6 @@
 # include 
 #else
 
-#include 
 #include 
 #include 
 
diff --git a/libstdc++-v3/include/std/condition_variable 
b/libstdc++-v3/include/std/condition_variable
index b885e1baa1b..f671fe4afe1 100644
--- a/libstdc++-v3/include/std/condition_variable
+++ b/libstdc++-v3/include/std/condition_variable
@@ -38,6 +38,7 @@
 #else
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -372,7 +373,7 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2)
 {
   return __p();
 }
- 
+
   std::stop_callback __cb(__stoken, [this] { notify_all(); });
   shared_ptr __mutex = _M_mutex;
   while (!__p())
diff --git a/libstdc++-v3/include/std/mutex b/libstdc++-v3/include/std/mutex
index aca5f91e03c..4eedbe5038c 100644
--- a/libstdc++-v3/include/std/mutex
+++ b/libstdc++-v3/include/std/mutex
@@ -40,8 +40,8 @@
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
 #include 
 #include 
 #if ! _GTHREAD_USE_MUTEX_TIMEDLOCK
diff --git a/libstdc++-v3/include/std/shared_mutex 
b/libstdc++-v3/include/std/shared_mutex
index 7b70697f178..57c3cc54d81 100644
--- a/libstdc++-v3/include/std/shared_mutex
+++ b/libstdc++-v3/include/std/shared_mutex
@@ -36,6 +36,7 @@
 #if __cplusplus >= 201402L
 
 #include 
+#include 
 #include 
 #include // move, __exchange
 #include// defer_lock_t
-- 
2.39.0



ping: [PATCH] libcpp: Improve location for macro names [PR66290]

2023-01-12 Thread Lewis Hyatt via Gcc-patches
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607647.html
May I please ping this one again? It will enable closing out the PR. Thanks!

-Lewis

On Thu, Dec 1, 2022 at 9:22 AM Lewis Hyatt  wrote:
>
> Hello-
>
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
>
> May I please ping this one? Thanks!
> I have also re-attached the rebased patch here.
>
> -Lewis
>
> On Wed, Oct 12, 2022 at 06:37:50PM -0400, Lewis Hyatt wrote:
> > Hello-
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> >
> > Since Jeff was kind enough to ack one of my other preprocessor patches
> > today, I have become emboldened to ping this one again too :). Would
> > anyone have some time to take a look at it please? Thanks!
> >
> > -Lewis
> >
> > On Thu, Sep 15, 2022 at 6:31 PM Lewis Hyatt  wrote:
> > >
> > > Hello-
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599397.html
> > > May I please ping this patch? Thank you.
> > >
> > > -Lewis
> > >
> > > On Fri, Aug 5, 2022 at 12:14 PM Lewis Hyatt  wrote:
> > > >
> > > >
> > > > When libcpp reports diagnostics whose locus is a macro name (such as for
> > > > -Wunused-macros), it uses the location in the cpp_macro object that was
> > > > stored by _cpp_new_macro. This is currently set to 
> > > > pfile->directive_line,
> > > > which contains the line number only and no column information. This 
> > > > patch
> > > > changes the stored location to the src_loc for the token defining the 
> > > > macro
> > > > name, which includes the location and range information.
> > > >
> > > > libcpp/ChangeLog:
> > > >
> > > > PR c++/66290
> > > > * macro.cc (_cpp_create_definition): Add location argument.
> > > > * internal.h (_cpp_create_definition): Adjust prototype.
> > > > * directives.cc (do_define): Pass new location argument to
> > > > _cpp_create_definition.
> > > > (do_undef): Stop passing inferior location to 
> > > > cpp_warning_with_line;
> > > > the default from cpp_warning is better.
> > > > (cpp_pop_definition): Pass new location argument to
> > > > _cpp_create_definition.
> > > > * pch.cc (cpp_read_state): Likewise.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR c++/66290
> > > > * c-c++-common/cpp/macro-ranges.c: New test.
> > > > * c-c++-common/cpp/line-2.c: Adapt to check for column 
> > > > information
> > > > on macro-related libcpp warnings.
> > > > * c-c++-common/cpp/line-3.c: Likewise.
> > > > * c-c++-common/cpp/macro-arg-count-1.c: Likewise.
> > > > * c-c++-common/cpp/pr58844-1.c: Likewise.
> > > > * c-c++-common/cpp/pr58844-2.c: Likewise.
> > > > * c-c++-common/cpp/warning-zero-location.c: Likewise.
> > > > * c-c++-common/pragma-diag-14.c: Likewise.
> > > > * c-c++-common/pragma-diag-15.c: Likewise.
> > > > * g++.dg/modules/macro-2_d.C: Likewise.
> > > > * g++.dg/modules/macro-4_d.C: Likewise.
> > > > * g++.dg/modules/macro-4_e.C: Likewise.
> > > > * g++.dg/spellcheck-macro-ordering.C: Likewise.
> > > > * gcc.dg/builtin-redefine.c: Likewise.
> > > > * gcc.dg/cpp/Wunused.c: Likewise.
> > > > * gcc.dg/cpp/redef2.c: Likewise.
> > > > * gcc.dg/cpp/redef3.c: Likewise.
> > > > * gcc.dg/cpp/redef4.c: Likewise.
> > > > * gcc.dg/cpp/ucnid-11-utf8.c: Likewise.
> > > > * gcc.dg/cpp/ucnid-11.c: Likewise.
> > > > * gcc.dg/cpp/undef2.c: Likewise.
> > > > * gcc.dg/cpp/warn-redefined-2.c: Likewise.
> > > > * gcc.dg/cpp/warn-redefined.c: Likewise.
> > > > * gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
> > > > * gcc.dg/cpp/warn-unused-macros.c: Likewise.
> > > > ---
> > > >
> > > > Notes:
> > > > Hello-
> > > >
> > > > The PR (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66290) was 
> > > > originally
> > > > about the entirely wrong location for -Wunused-macros in C++ mode, 
> > > > which
> > > > behavior was fixed by r13-1903, but before closing it out I wanted 
> > > > to also
> > > > address a second point brought up in the PR comments, namely that 
> > > > we do not
> > > > include column information when emitting diagnostics for macro 
> > > > names, such as
> > > > is done for -Wunused-macros. The attached patch updates the 
> > > > location stored in
> > > > the cpp_macro object so that it includes the column and range 
> > > > information for
> > > > the token comprising the macro name; previously, the location was 
> > > > just the
> > > > generic one pointing to the whole line.
> > > >
> > > > The change to libcpp is very small, the reason for all the 
> > > > testsuite changes is
> > > > that I have updated all tests explicitly looking for the columnless 
> > > > diagnostics
> > > > (with the "-:" syntax to dg-warning et al) so that they expect a 
> > > > column
> 

[PATCH 1/1] Replace flag_strict_flex_arrays with DECL_NOT_FLEXARRAY in middle-end

2023-01-12 Thread Qing Zhao via Gcc-patches
We should not directly check flag_strict_flex_arrays in the middle end.
Instead, check DECL_NOT_FLEXARRAY(array_field_decl) which is set by
C/C++ FEs according to -fstrict-flex-arrays and the corresponding attribute
attached to the array_field.

As a result, We will lose the LEVEL information of -fstrict-flex-arrays in
the middle end. -Wstrict-flex-arrays will not be able to issue such
information. update the testing cases accordingly.

gcc/ChangeLog:

* attribs.cc (strict_flex_array_level_of): Move this function to ...
* attribs.h (strict_flex_array_level_of): Remove the declaration.
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref):
replace the referece to strict_flex_array_level_of with
DECL_NOT_FLEXARRAY.
* tree.cc (component_ref_size): Likewise.

gcc/c/ChangeLog:

* c-decl.cc (strict_flex_array_level_of): ... here.

gcc/testsuite/ChangeLog:

* gcc.dg/Warray-bounds-flex-arrays-1.c: Delete the level information
from the message issued by -Wstrict-flex-arrays.
* gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
* gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
* gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
* gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
* gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
* gcc.dg/Wstrict-flex-arrays-2.c: Likewise.
* gcc.dg/Wstrict-flex-arrays-3.c: Likewise.
* gcc.dg/Wstrict-flex-arrays.c: Likewise.
---
 gcc/attribs.cc| 30 
 gcc/attribs.h |  2 -
 gcc/c/c-decl.cc   | 28 +++
 gcc/gimple-array-bounds.cc| 46 ---
 .../gcc.dg/Warray-bounds-flex-arrays-1.c  |  2 +-
 .../gcc.dg/Warray-bounds-flex-arrays-2.c  |  4 +-
 .../gcc.dg/Warray-bounds-flex-arrays-3.c  |  6 +--
 .../gcc.dg/Warray-bounds-flex-arrays-4.c  |  2 +-
 .../gcc.dg/Warray-bounds-flex-arrays-5.c  |  4 +-
 .../gcc.dg/Warray-bounds-flex-arrays-6.c  |  6 +--
 gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c  |  8 ++--
 gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c  |  8 ++--
 gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c|  8 ++--
 gcc/tree.cc   | 37 +++
 14 files changed, 68 insertions(+), 123 deletions(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 49cb299a3c1..7db730cf831 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -2456,36 +2456,6 @@ init_attr_rdwr_indices (rdwr_map *rwm, tree attrs)
 }
 }
 
-/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
-   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
-unsigned int
-strict_flex_array_level_of (tree array_field)
-{
-  gcc_assert (TREE_CODE (array_field) == FIELD_DECL);
-  unsigned int strict_flex_array_level = flag_strict_flex_arrays;
-
-  tree attr_strict_flex_array
-= lookup_attribute ("strict_flex_array", DECL_ATTRIBUTES (array_field));
-  /* If there is a strict_flex_array attribute attached to the field,
- override the flag_strict_flex_arrays.  */
-  if (attr_strict_flex_array)
-{
-  /* Get the value of the level first from the attribute.  */
-  unsigned HOST_WIDE_INT attr_strict_flex_array_level = 0;
-  gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
-  attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
-  gcc_assert (TREE_VALUE (attr_strict_flex_array) != NULL_TREE);
-  attr_strict_flex_array = TREE_VALUE (attr_strict_flex_array);
-  gcc_assert (tree_fits_uhwi_p (attr_strict_flex_array));
-  attr_strict_flex_array_level = tree_to_uhwi (attr_strict_flex_array);
-
-  /* The attribute has higher priority than flag_struct_flex_array.  */
-  strict_flex_array_level = attr_strict_flex_array_level;
-}
-  return strict_flex_array_level;
-}
-
-
 /* Return the access specification for a function parameter PARM
or null if the current function has no such specification.  */
 
diff --git a/gcc/attribs.h b/gcc/attribs.h
index e7540f71d6a..140e70b64e0 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -398,6 +398,4 @@ extern void init_attr_rdwr_indices (rdwr_map *, tree);
 extern attr_access *get_parm_access (rdwr_map &, tree,
 tree = current_function_decl);
 
-extern unsigned int strict_flex_array_level_of (tree);
-
 #endif // GCC_ATTRIBS_H
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index d76ffb3380d..38647c100c1 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9047,6 +9047,34 @@ finish_incomplete_vars (tree incomplete_vars, bool 
toplevel)
 }
 }
 
+/* Get the LEVEL of the strict_flex_array for the ARRAY_FIELD based on the
+   values of attribute strict_flex_array and the flag_strict_flex_arrays.  */
+static unsigned int
+strict_flex_array_level_of (tree array_field)
+{
+  gcc_assert (TREE_CODE (array_field) == 

[PATCH 0/1] Replace flag_strict_flex_arrays with DECL_NOT_FLEXARRAY in middle-end

2023-01-12 Thread Qing Zhao via Gcc-patches
this is the patch to replace all references to flag_strict_flex_arrays
with DECL_NOT_FLEXARRAY in middle-end per the discussion.

I have bootstrapped and regression tested on X86, no issues.

Okay for commit?

thanks.

Qing


Re: libstdc++: Fix deadlock in debug iterator increment [PR108288]

2023-01-12 Thread Jonathan Wakely via Gcc-patches
On Thu, 12 Jan 2023 at 18:25, François Dumont  wrote:
>
> On 12/01/23 13:00, Jonathan Wakely wrote:
> > On Thu, 12 Jan 2023 at 05:52, François Dumont wrote:
> >> Small update for an obvious compilation issue and to review new test
> >> case that could have lead to an infinite loop if the increment issue was
> >> not detected.
> >>
> >> I also forgot to ask if there is more chance for the instantiation to be
> >> elided when it is implemented like in the _Safe_local_iterator:
> >> return { __cur, this->_M_sequence };
> > No, that doesn't make any difference.
> >
> >> than in the _Safe_iterator:
> >> return _Safe_iterator(__cur, this->_M_sequence);
> >>
> >> In the case where the user code do not use it ?
> >>
> >> Fully tested now, ok to commit ?
> >>
> >> François
> >>
> >> On 11/01/23 07:03, François Dumont wrote:
> >>> Thanks for fixing this.
> >>>
> >>> Here is the extension of the fix to all post-increment/decrement
> >>> operators we have on _GLIBCXX_DEBUG iterator.
> > Thanks, I completely forgot we have other partial specializations, I
> > just fixed the one that showed a deadlock in the user's example!
> >
> >>> I prefer to restore somehow previous implementation to continue to
> >>> have _GLIBCXX_DEBUG post operators implemented in terms of normal post
> >>> operators.
> > Why?
> >
> > Implementing post-increment as:
> >
> >  auto tmp = *this;
> >  ++*this;
> >  return tmp;
> >
> > is the idiomatic way to write it, and it works fine in this case. I
> > don't think it performs any more work than your version, does it?
> > Why not use the idiomatic form?
> >
> > Is it just so that post-inc of a debug iterator uses post-inc of the
> > underlying iterator? Why does that matter?
> >
> A little yes, but that's a minor reason that is just making me happy.
>
> Main reason is that this form could produce a __msg_init_copy_singular
> before the __msg_bad_inc.

Ah yes, I see. That's a shame. I find the idiomatic form much simpler
to read, and it will generate better code (because it just reuses
existing functions, instead of adding new ones).

We could do this though, right?

_GLIBCXX_DEBUG_VERIFY(this->_M_incrementable(),
  _M_message(__msg_bad_inc)
  ._M_iterator(*this, "this"));
_Safe_iterator __tmp = *this;
++*this;
return __tmp;

That does the VERIFY check twice though.

> And moreover I plan to propose a patch later to skip any check in the
> call to _Safe_iterator(__cur, _M_sequence) as we already know that __cur
> is ok here like anywhere else in the lib.
>
> There will still be one in the constructor normally elided unless
> --no-elide-constructors but there is not much I can do about it.

Don't worry about it. Nobody should ever use -fno-elide-constructors
in any real cases (except maybe debugging some very strange corner
cases, and in that case the extra safe iterator checks are not going
to be their biggest problem).

The patch is OK for trunk then.



[18/18] hash table: enforce testing is_empty before is_deleted

2023-01-12 Thread Alexandre Oliva via Gcc-patches


Existing hash_table traits that use the same representation for empty
and deleted slots reject marking slots as deleted, and to not pass
is_deleted for slots that pass is_empty.

Nevertheless, nearly everywhere, we only test for is_deleted after
checking that !is_empty first.  The one exception was the copy
constructor, that would fail if traits recognized is_empty slots as
is_deleted, but then refused to mark_deleted.

This asymmetry is neither necessary nor desirable, and there is a
theoretical risk that traits might not only fail to refuse to
mark_deleted, but also return is_deleted for is_empty slots.

This patch introduces checks that detect these potentially problematic
situations, and reorders the tests in the copy constructor so as to
use the conventional testing order and thus avoid them.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* hash-table.h (is_deleted): Precheck !is_empty.
(mark_deleted): Postcheck !is_empty.
(copy constructor): Test is_empty before is_deleted.
---
 gcc/hash-table.h |   16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 1d3166504c38e..e37625dc315bf 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -534,6 +534,11 @@ private:
   void expand ();
   static bool is_deleted (value_type )
   {
+/* Traits are supposed to avoid recognizing elements as both empty
+   and deleted, but to fail safe in case custom traits fail to do
+   that, make sure we never test for is_deleted without having
+   first ruled out is_empty.  */
+gcc_checking_assert (!Descriptor::is_empty (v));
 return Descriptor::is_deleted (v);
   }
 
@@ -545,6 +550,11 @@ private:
   static void mark_deleted (value_type )
   {
 Descriptor::mark_deleted (v);
+/* Traits are supposed to refuse to set elements as deleted if
+   those would be indistinguishable from empty, but to fail safe
+   in case custom traits fail to do that, check that the
+   just-deleted element does not look empty.  */
+gcc_checking_assert (!Descriptor::is_empty (v));
   }
 
   static void mark_empty (value_type )
@@ -700,9 +710,11 @@ hash_table::hash_table (const 
hash_table ,
   for (size_t i = 0; i < size; ++i)
{
  value_type  = h.m_entries[i];
- if (is_deleted (entry))
+ if (is_empty (entry))
+   continue;
+ else if (is_deleted (entry))
mark_deleted (nentries[i]);
- else if (!is_empty (entry))
+ else
new ((void*) (nentries + i)) value_type (entry);
}
   m_entries = nentries;

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


[PATCH] libstdc++: Fix unintended layout change to std::basic_filebuf [PR108331]

2023-01-12 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Bootstrapped x86_64-w64-mingw32 with both
--enable-threads=posix and --enable-threads=win32.

I need a libgcc or mingw maintainer to approve the gthr-win32.h part.

OK for trunk?

-- >8 --

GCC 13 has a new implementation of gthr-win32.h which supports C++11
mutexes, threads etc. but this causes an unintended ABI break. The
__gthread_mutex_t type is always used in std::basic_filebuf even in
C++98, so independent of whether C++11 sync primitives work or not.
Because that type changed for the win32 thread model, we have a layout
change in std::basic_filebuf. The member is completely unused, it just
gets passed to the std::__basic_file constructor and ignored. So we
don't need that mutex to actually work, we just need its layout to not
change.

Introduce a new __gthr_win32_legacy_mutex_t struct in gthr-win32.h with
the old layout, and conditionally use that in std::basic_filebuf.

PR libstdc++/108331

libgcc/ChangeLog:

* config/i386/gthr-win32.h (__gthr_win32_legacy_mutex_t): New
struct matching the previous __gthread_mutex_t struct.
(__GTHREAD_LEGACY_MUTEX_T): Define.

libstdc++-v3/ChangeLog:

* config/io/c_io_stdio.h (__c_lock): Define as a typedef for
__GTHREAD_LEGACY_MUTEX_T if defined.
---
 libgcc/config/i386/gthr-win32.h | 8 
 libstdc++-v3/config/io/c_io_stdio.h | 7 +++
 2 files changed, 15 insertions(+)

diff --git a/libgcc/config/i386/gthr-win32.h b/libgcc/config/i386/gthr-win32.h
index 146357fa436..050c7a21fcc 100644
--- a/libgcc/config/i386/gthr-win32.h
+++ b/libgcc/config/i386/gthr-win32.h
@@ -381,6 +381,14 @@ typedef struct timespec __gthread_time_t;
 #define __GTHREAD_COND_INIT_FUNCTION __gthread_cond_init_function
 #define __GTHREAD_TIME_INIT {0, 0}
 
+// Libstdc++ std::basic_filebuf needs the old definition of __gthread_mutex_t
+// for layout purposes, but doesn't actually use it.
+typedef struct {
+  long __unused1;
+  void *__unused2;
+} __gthr_win32_legacy_mutex_t;
+#define __GTHREAD_LEGACY_MUTEX_T __gthr_win32_legacy_mutex_t
+
 #if defined (_WIN32) && !defined(__CYGWIN__)
 #define MINGW32_SUPPORTS_MT_EH 1
 /* Mingw runtime >= v0.3 provides a magic variable that is set to nonzero
diff --git a/libstdc++-v3/config/io/c_io_stdio.h 
b/libstdc++-v3/config/io/c_io_stdio.h
index 1a5e05a844a..e9e6e3ef4d8 100644
--- a/libstdc++-v3/config/io/c_io_stdio.h
+++ b/libstdc++-v3/config/io/c_io_stdio.h
@@ -39,7 +39,14 @@ namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+#ifdef __GTHREAD_LEGACY_MUTEX_T
+  // The layout of __gthread_mutex_t changed in GCC 13, but libstdc++ doesn't
+  // actually use the basic_filebuf::_M_lock member, so define it consistently
+  // with the old __gthread_mutex_t to avoid an unnecessary layout change:
+  typedef __GTHREAD_LEGACY_MUTEX_T __c_lock;
+#else
   typedef __gthread_mutex_t __c_lock;
+#endif
 
   // for basic_file.h
   typedef FILE __c_file;
-- 
2.39.0



[PATCH] libstdc++: Make forward to C version if included by C

2023-01-12 Thread Jonathan Wakely via Gcc-patches
I was asked about this privately. I don't really like that it would ever
be needed, but it also seems fairly harmless. We already do similar
things for some of the libstdc++ versions of  etc.

Any objections?

Tested x86_64-linux.

-- >8 --

This shouldn't be needed, but apparently the Bazel build tool has a
misfeature where it uses -nostdinc and explicitly provides the system
include paths, and it uses the same set of paths for compiling C and C++
code. This means that the libstdc++ header paths get used when compiling
C code, and so our new  gets included.

Just forward to the C header using #include_next if that happens.

libstdc++-v3/ChangeLog:

* include/c_compatibility/stdatomic.h: Use #include_next if
included by C code.
---
 libstdc++-v3/include/c_compatibility/stdatomic.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/c_compatibility/stdatomic.h 
b/libstdc++-v3/include/c_compatibility/stdatomic.h
index a51a84c2054..aa7bbb923dc 100644
--- a/libstdc++-v3/include/c_compatibility/stdatomic.h
+++ b/libstdc++-v3/include/c_compatibility/stdatomic.h
@@ -124,7 +124,7 @@ using std::atomic_flag_clear_explicit;
 using std::atomic_thread_fence;
 using std::atomic_signal_fence;
 
-#elif defined __clang__
+#elif !defined __cplusplus || defined __clang__
 # include_next 
 #endif // C++23
 #endif // _GLIBCXX_STDATOMIC_H
-- 
2.39.0



Re: [PATCH] Various fixes for DWARF register size computation

2023-01-12 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Thu, Jan 12, 2023 at 04:50:07PM +, Richard Sandiford wrote:
>> I'm jumping in here without fully understanding the context, so maybe this
>> is exactly your point, but: the SIMD/FP DWARF registers are supposed to be
>> size 8 regardless of which features are enabled.  That's already only half
>> of the hardware register size for base Armv8-A, since Advanced SIMD registers
>> are 16 bytes in size.
>> 
>> So yeah, if we're using the hardware register size then something is wrong.
>
> I'm talking about what the following compiles to
> static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>
> void
> foo (void)
> {
>   __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table);
> }
> (and therefore what libgcc/unwind-dw2.c (init_dwarf_reg_size_table) as well)
> with -O2 -fbuilding-libgcc -march=armv8-a vs. -O2 -fbuilding-libgcc 
> -march=armv8-a+sve
> The former is setting I think [0..31, 46, 48..63, 72..79, 96]=8, [64..71, 
> 80..95]=0
> (and leaving others untouched, which keeps them 0).
> While the latter is setting [0..31, 46, 72..79, 96]=8, [64..71, 80..95]=0
> and [48..63]=cntd

Ah, interesting.  So the SIMD/FP registers are OK, but the predicate
registers are causing a problem.

I think we should set the predicates to size 0 too, like we do for
call-clobbered FP registers.  Predicate registers should never need
to be represented in CFI.

Thanks,
Richard


[committed] libstdc++: Extend max_align_t special case to 64-bit HP-UX [PR77691]

2023-01-12 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

GCC's std::max_align_t doesn't agree with the system malloc on HP-UX, so
generalize the current hack for Solaris to apply to that target too.

libstdc++-v3/ChangeLog:

PR libstdc++/77691
* include/experimental/memory_resource
(_GLIBCXX_MAX_ALIGN_MATCHES_MALLOC): Define.
(do_allocate, do_deallocate): Check it.
* testsuite/experimental/memory_resource/new_delete_resource.cc:
Relax expected behaviour for 64-bit hppa-hp-hpux11.11.
---
 libstdc++-v3/include/experimental/memory_resource | 15 ---
 .../memory_resource/new_delete_resource.cc|  4 
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/experimental/memory_resource 
b/libstdc++-v3/include/experimental/memory_resource
index aa86c042d84..786392e8904 100644
--- a/libstdc++-v3/include/experimental/memory_resource
+++ b/libstdc++-v3/include/experimental/memory_resource
@@ -412,11 +412,20 @@ namespace pmr {
   allocator_type get_allocator() const noexcept { return _M_alloc; }
 
 protected:
+#if (defined __sun__ || defined __VXWORKS__) && defined __i386__
+// Cannot use max_align_t on 32-bit Solaris x86, see PR libstdc++/77691
+# define _GLIBCXX_MAX_ALIGN_MATCHES_MALLOC 0
+#elif defined __hpux__ && defined __hppa__ && defined __LP64__
+// Ignore inconsistent long double and malloc alignment (libstdc++/77691)
+# define _GLIBCXX_MAX_ALIGN_MATCHES_MALLOC 0
+#else
+# define _GLIBCXX_MAX_ALIGN_MATCHES_MALLOC 1
+#endif
+
   virtual void*
   do_allocate(size_t __bytes, size_t __alignment) override
   {
-   // Cannot use max_align_t on 32-bit Solaris x86, see PR libstdc++/77691
-#if ! ((defined __sun__ || defined __VXWORKS__) && defined __i386__)
+#if _GLIBCXX_MAX_ALIGN_MATCHES_MALLOC
if (__alignment == alignof(max_align_t))
  return _M_allocate(__bytes);
 #endif
@@ -442,7 +451,7 @@ namespace pmr {
   do_deallocate(void* __ptr, size_t __bytes, size_t __alignment) noexcept
   override
   {
-#if ! ((defined __sun__ || defined __VXWORKS__) && defined __i386__)
+#if _GLIBCXX_MAX_ALIGN_MATCHES_MALLOC
if (__alignment == alignof(max_align_t))
  return (void) _M_deallocate(__ptr, __bytes);
 #endif
diff --git 
a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc 
b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
index d121d5f2c41..a7ecb54b905 100644
--- a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
+++ b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
@@ -27,6 +27,10 @@
 // See PR libstdc++/77691
 # define BAD_MAX_ALIGN_T 1
 #endif
+#if defined __hpux__ && defined __hppa__ && defined __LP64__
+// Ignore inconsistent long double and malloc alignment (libstdc++/77691)
+# define BAD_MAX_ALIGN_T 1
+#endif
 
 bool new_called = false;
 bool delete_called = false;
-- 
2.39.0



[committed] libstdc++: Update shared library version history in manual

2023-01-12 Thread Jonathan Wakely via Gcc-patches
Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* doc/xml/manual/abi.xml: Add latest library versions.
* doc/html/manual/abi.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/abi.html | 4 ++--
 libstdc++-v3/doc/xml/manual/abi.xml   | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/doc/xml/manual/abi.xml 
b/libstdc++-v3/doc/xml/manual/abi.xml
index 0153395f477..008fb66e5b1 100644
--- a/libstdc++-v3/doc/xml/manual/abi.xml
+++ b/libstdc++-v3/doc/xml/manual/abi.xml
@@ -273,6 +273,8 @@ compatible.
 GCC 9.3.0: libstdc++.so.6.0.28
 GCC 10.1.0: libstdc++.so.6.0.28
 GCC 11.1.0: libstdc++.so.6.0.29
+GCC 12.1.0: libstdc++.so.6.0.30
+GCC 13.1.0: libstdc++.so.6.0.31
 
 
   Note 1: Error should be libstdc++.so.3.0.3.
@@ -349,7 +351,7 @@ compatible.
 GCC 10.1.0: GLIBCXX_3.4.28, CXXABI_1.3.12
 GCC 11.1.0: GLIBCXX_3.4.29, CXXABI_1.3.13
 GCC 12.1.0: GLIBCXX_3.4.30, CXXABI_1.3.13
-GCC 13.1.0: GLIBCXX_3.4.31, CXXABI_1.3.13
+GCC 13.1.0: GLIBCXX_3.4.31, CXXABI_1.3.14
 
 
 
-- 
2.39.0



[PATCH] c, c++, v2: Avoid incorrect shortening of divisions [PR108365]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 08:55:32PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > > So, the following patch for the NOP_EXPR cases checks just in case that
> > > it is from integral type and more importantly checks it is a widening
> > > conversion, and then next to it also allows op0 to be just unsigned,
> > > promoted or not, as that is what the C FE will do for those cases too
> > > and I believe it must work - either the division/modulo common type
> > > will be that unsigned type, then we can shorten and don't need to worry
> > > about UB, or it will be some wider signed type but then it can't be most
> > > negative value of the wider type.
> > 
> > Why not use the same condition in C and C++?
> 
> I can test that.  Do you mean change the C FE to match the patched C++
> or change C++ FE to just test TYPE_UNSIGNED (orig_op0)?
> I think both should work, though what I wrote perhaps can shorten in more
> cases.  Can try to construct testcases where it differs...

E.g.
int f1 (int x, int y) { return (unsigned) x / y; }
unsigned short f2 (unsigned short x, unsigned short y) { return (unsigned) x / 
y; }
unsigned int f3 (unsigned int x, unsigned int y) { return (long long) x / y; }
C++ FE before and after the patch shortens the division in f2 and f3,
C FE shortens only in f2.  So using the C FE condition would be a regression
for C++.

Therefore I'm going to test following patch:

2023-01-12  Jakub Jelinek  

PR c++/108365
* c-typeck.cc (build_binary_op): For integral division or modulo,
shorten if type0 is unsigned, or op0 is cast from narrower unsigned
integral type or op1 is INTEGER_CST other than -1.

* typeck.cc (cp_build_binary_op): For integral division or modulo,
shorten if type0 is unsigned, or op0 is cast from narrower unsigned
integral type or stripped_op1 is INTEGER_CST other than -1.

* c-c++-common/pr108365.c: New test.
* g++.dg/opt/pr108365.C: New test.
* g++.dg/warn/pr108365.C: New test.

--- gcc/c/c-typeck.cc.jj2022-11-13 12:29:08.197504249 +0100
+++ gcc/c/c-typeck.cc   2023-01-12 21:06:53.310875131 +0100
@@ -12431,7 +12431,14 @@ build_binary_op (location_t location, en
   undefined if the quotient can't be represented in the
   computation mode.  We shorten only if unsigned or if
   dividing by something we know != -1.  */
-   shorten = (TYPE_UNSIGNED (TREE_TYPE (orig_op0))
+   shorten = (TYPE_UNSIGNED (type0)
+  || (TREE_CODE (op0) == NOP_EXPR
+  && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0,
+   0)))
+  && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
+  && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0,
+   0)))
+  < TYPE_PRECISION (type0)))
   || (TREE_CODE (op1) == INTEGER_CST
   && !integer_all_onesp (op1)));
  common = 1;
@@ -12467,7 +12474,12 @@ build_binary_op (location_t location, en
 on some targets, since the modulo instruction is undefined if the
 quotient can't be represented in the computation mode.  We shorten
 only if unsigned or if dividing by something we know != -1.  */
- shorten = (TYPE_UNSIGNED (TREE_TYPE (orig_op0))
+ shorten = (TYPE_UNSIGNED (type0)
+|| (TREE_CODE (op0) == NOP_EXPR
+&& INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
+&& TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
+&& (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0, 0)))
+< TYPE_PRECISION (type0)))
 || (TREE_CODE (op1) == INTEGER_CST
 && !integer_all_onesp (op1)));
  common = 1;
--- gcc/cp/typeck.cc.jj 2023-01-11 12:47:56.099672340 +0100
+++ gcc/cp/typeck.cc2023-01-12 21:04:23.738022528 +0100
@@ -5455,8 +5455,15 @@ cp_build_binary_op (const op_location_t
 point, so we have to dig out the original type to find out if
 it was unsigned.  */
  tree stripped_op1 = tree_strip_any_location_wrapper (op1);
- shorten = ((TREE_CODE (op0) == NOP_EXPR
- && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0
+ shorten = (TYPE_UNSIGNED (type0)
+|| (TREE_CODE (op0) == NOP_EXPR
+&& INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0,
+ 0)))
+&& TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0,
+   0)))
+&& 

Re: [PATCH] c++: Avoid incorrect shortening of divisions [PR108365]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 02:37:13PM -0500, Jason Merrill wrote:
> > But the C++ FE was checking if op0 is a NOP_EXPR from TYPE_UNSIGNED.
> > First of all, not sure if the operand of NOP_EXPR couldn't be non-integral
> > type where TYPE_UNSIGNED wouldn't be meaningful, but more importantly,
> > even if it is a cast from unsigned integral type, we only know it can't be
> > minimum signed value if it is a widening cast, if it is same precision or
> > narrowing cast, we know nothing.
> 
> Curious, this divergence goes back to 1994, when the C++ front-end was
> merged and tege changed the condition in the C front-end.

And it was changed to match the modulo condition adjusted by rms in 1993.

> > So, the following patch for the NOP_EXPR cases checks just in case that
> > it is from integral type and more importantly checks it is a widening
> > conversion, and then next to it also allows op0 to be just unsigned,
> > promoted or not, as that is what the C FE will do for those cases too
> > and I believe it must work - either the division/modulo common type
> > will be that unsigned type, then we can shorten and don't need to worry
> > about UB, or it will be some wider signed type but then it can't be most
> > negative value of the wider type.
> 
> Why not use the same condition in C and C++?

I can test that.  Do you mean change the C FE to match the patched C++
or change C++ FE to just test TYPE_UNSIGNED (orig_op0)?
I think both should work, though what I wrote perhaps can shorten in more
cases.  Can try to construct testcases where it differs...

Jakub



Re: [PATCH] c++: Avoid incorrect shortening of divisions [PR108365]

2023-01-12 Thread Jason Merrill via Gcc-patches

On 1/11/23 12:58, Jakub Jelinek wrote:

Hi!

The following testcase is miscompiled, because we shorten the division
in a case where it should not be shortened.
Divisions (and modulos) can be shortened if it is unsigned division/modulo,
or if it is signed division/modulo where we can prove the dividend will
not be the minimum signed value or divisor will not be -1, because e.g.
on sizeof(long long)==sizeof(int)*2 && __INT_MAX__ == 0x7fff targets
(-2147483647 - 1) / -1 is UB
but
(int) (-2147483648LL / -1LL) is not, it is -2147483648.
The primary aim of both the C and C++ FE division/modulo shortening I assume
was for the implicit integral promotions of {,signed,unsigned} {char,short}
and because at this point we have no VRP information etc., the shortening
is done if the integral promotion is from unsigned type for the divisor
or if the dividend is an integer constant other than -1.
This works fine for char/short -> int promotions when char/short have
smaller precision than int - unsigned char -> int or unsigned short -> int
will always be a positive int, so never the most negative.

Now, the C FE checks whether orig_op0 is TYPE_UNSIGNED where op0 is either
the same as orig_op0 or that promoted to int, I think that works fine,
if it isn't promoted, either the division/modulo common type will have the
same precision as op0 but then the division/modulo is unsigned and so
without UB, or it will be done in wider precision (e.g. because op1 has
wider precision), but then op0 can't be minimum signed value.  Or it has
been promoted to int, but in that case it was again from narrower type and
so never minimum signed int.

But the C++ FE was checking if op0 is a NOP_EXPR from TYPE_UNSIGNED.
First of all, not sure if the operand of NOP_EXPR couldn't be non-integral
type where TYPE_UNSIGNED wouldn't be meaningful, but more importantly,
even if it is a cast from unsigned integral type, we only know it can't be
minimum signed value if it is a widening cast, if it is same precision or
narrowing cast, we know nothing.


Curious, this divergence goes back to 1994, when the C++ front-end was 
merged and tege changed the condition in the C front-end.



So, the following patch for the NOP_EXPR cases checks just in case that
it is from integral type and more importantly checks it is a widening
conversion, and then next to it also allows op0 to be just unsigned,
promoted or not, as that is what the C FE will do for those cases too
and I believe it must work - either the division/modulo common type
will be that unsigned type, then we can shorten and don't need to worry
about UB, or it will be some wider signed type but then it can't be most
negative value of the wider type.


Why not use the same condition in C and C++?


Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-11  Jakub Jelinek  

PR c++/108365
* typeck.cc (cp_build_binary_op): For integral division or modulo,
shorten if type0 is unsigned, or op0 is cast from narrower unsigned
integral type or stripped_op1 is INTEGER_CST other than -1.

* g++.dg/opt/pr108365.C: New test.
* g++.dg/warn/pr108365.C: New test.

--- gcc/cp/typeck.cc.jj 2022-12-15 19:17:37.828072458 +0100
+++ gcc/cp/typeck.cc2023-01-11 12:15:25.195284107 +0100
@@ -5455,8 +5455,15 @@ cp_build_binary_op (const op_location_t
 point, so we have to dig out the original type to find out if
 it was unsigned.  */
  tree stripped_op1 = tree_strip_any_location_wrapper (op1);
- shorten = ((TREE_CODE (op0) == NOP_EXPR
- && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0
+ shorten = (TYPE_UNSIGNED (type0)
+|| (TREE_CODE (op0) == NOP_EXPR
+&& INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0,
+ 0)))
+&& TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0,
+   0)))
+&& (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0,
+ 0)))
+< TYPE_PRECISION (type0)))
 || (TREE_CODE (stripped_op1) == INTEGER_CST
 && ! integer_all_onesp (stripped_op1)));
}
@@ -5491,8 +5498,12 @@ cp_build_binary_op (const op_location_t
 quotient can't be represented in the computation mode.  We shorten
 only if unsigned or if dividing by something we know != -1.  */
  tree stripped_op1 = tree_strip_any_location_wrapper (op1);
- shorten = ((TREE_CODE (op0) == NOP_EXPR
- && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0
+ shorten = (TYPE_UNSIGNED (type0)
+|| (TREE_CODE (op0) == NOP_EXPR
+  

Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 10, 2023 at 04:33:59PM +, Wilco Dijkstra via Gcc-patches wrote:
> Hi Szabolcs,
> 
> > i would keep the assert: how[reg] must be either UNSAVED or UNDEFINED
> > here, other how[reg] means the toggle cfi instruction is mixed with
> > incompatible instructions for the pseudo reg.
> >
> > and i would add a comment about this e.g. saying that UNSAVED/UNDEFINED
> > how[reg] is used for tracking the return address signing status and
> > other how[reg] is not allowed here.
> 
> I've added the assert back and updated the comment.

BTW, the patch doesn't apply to trunk cleanly (since the January 2nd
r13-4955-gcb775ecd6e437 commit).

> v3: Improve comments, add assert.
> 
> A recent change only initializes the regs.how[] during Dwarf unwinding
> which resulted in an uninitialized offset used in return address signing
> and random failures during unwinding.  The fix is to encode the return
> address signing state in REG_UNSAVED and REG_UNDEFINED.
> 
> Passes bootstrap & regress, OK for commit?
> 
> libgcc/
> PR target/107678
> * unwind-dw2.c (execute_cfa_program): Use REG_UNSAVED/UNDEFINED
> to encode return address signing state.
> * config/aarch64/aarch64-unwind.h (aarch64_demangle_return_addr)
> Check current return address signing state.
> (aarch64_frob_update_contex): Remove.

Jakub



Re: libstdc++: Fix deadlock in debug iterator increment [PR108288]

2023-01-12 Thread François Dumont via Gcc-patches

On 12/01/23 13:00, Jonathan Wakely wrote:

On Thu, 12 Jan 2023 at 05:52, François Dumont wrote:

Small update for an obvious compilation issue and to review new test
case that could have lead to an infinite loop if the increment issue was
not detected.

I also forgot to ask if there is more chance for the instantiation to be
elided when it is implemented like in the _Safe_local_iterator:
return { __cur, this->_M_sequence };

No, that doesn't make any difference.


than in the _Safe_iterator:
return _Safe_iterator(__cur, this->_M_sequence);

In the case where the user code do not use it ?

Fully tested now, ok to commit ?

François

On 11/01/23 07:03, François Dumont wrote:

Thanks for fixing this.

Here is the extension of the fix to all post-increment/decrement
operators we have on _GLIBCXX_DEBUG iterator.

Thanks, I completely forgot we have other partial specializations, I
just fixed the one that showed a deadlock in the user's example!


I prefer to restore somehow previous implementation to continue to
have _GLIBCXX_DEBUG post operators implemented in terms of normal post
operators.

Why?

Implementing post-increment as:

 auto tmp = *this;
 ++*this;
 return tmp;

is the idiomatic way to write it, and it works fine in this case. I
don't think it performs any more work than your version, does it?
Why not use the idiomatic form?

Is it just so that post-inc of a debug iterator uses post-inc of the
underlying iterator? Why does that matter?


A little yes, but that's a minor reason that is just making me happy.

Main reason is that this form could produce a __msg_init_copy_singular 
before the __msg_bad_inc.


And moreover I plan to propose a patch later to skip any check in the 
call to _Safe_iterator(__cur, _M_sequence) as we already know that __cur 
is ok here like anywhere else in the lib.


There will still be one in the constructor normally elided unless 
--no-elide-constructors but there is not much I can do about it.





Re: [PATCH] Various fixes for DWARF register size computation

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 04:50:07PM +, Richard Sandiford wrote:
> I'm jumping in here without fully understanding the context, so maybe this
> is exactly your point, but: the SIMD/FP DWARF registers are supposed to be
> size 8 regardless of which features are enabled.  That's already only half
> of the hardware register size for base Armv8-A, since Advanced SIMD registers
> are 16 bytes in size.
> 
> So yeah, if we're using the hardware register size then something is wrong.

I'm talking about what the following compiles to
static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1];

void
foo (void)
{
  __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table);
}
(and therefore what libgcc/unwind-dw2.c (init_dwarf_reg_size_table) as well)
with -O2 -fbuilding-libgcc -march=armv8-a vs. -O2 -fbuilding-libgcc 
-march=armv8-a+sve
The former is setting I think [0..31, 46, 48..63, 72..79, 96]=8, [64..71, 
80..95]=0
(and leaving others untouched, which keeps them 0).
While the latter is setting [0..31, 46, 72..79, 96]=8, [64..71, 80..95]=0
and [48..63]=cntd

Jakub
.arch armv8-a
.file   "8.c"
.text
.align  2
.align  4
.global foo
.type   foo, %function
foo:
.LFB0:
adrpx2, .LANCHOR0
add x0, x2, :lo12:.LANCHOR0
mov w1, 8
strbw1, [x2, #:lo12:.LANCHOR0]
strbw1, [x0, 1]
strbw1, [x0, 2]
strbw1, [x0, 3]
strbw1, [x0, 4]
strbw1, [x0, 5]
strbw1, [x0, 6]
strbw1, [x0, 7]
strbw1, [x0, 8]
strbw1, [x0, 9]
strbw1, [x0, 10]
strbw1, [x0, 11]
strbw1, [x0, 12]
strbw1, [x0, 13]
strbw1, [x0, 14]
strbw1, [x0, 15]
strbw1, [x0, 16]
strbw1, [x0, 17]
strbw1, [x0, 18]
strbw1, [x0, 19]
strbw1, [x0, 20]
strbw1, [x0, 21]
strbw1, [x0, 22]
strbw1, [x0, 23]
strbw1, [x0, 24]
strbw1, [x0, 25]
strbw1, [x0, 26]
strbw1, [x0, 27]
strbw1, [x0, 28]
strbw1, [x0, 29]
strbw1, [x0, 30]
strbw1, [x0, 31]
strbwzr, [x0, 64]
strbw1, [x0, 46]
strbwzr, [x0, 65]
strbwzr, [x0, 66]
strbwzr, [x0, 67]
strbwzr, [x0, 68]
strbwzr, [x0, 69]
strbwzr, [x0, 70]
strbwzr, [x0, 71]
strbw1, [x0, 72]
strbw1, [x0, 73]
strbw1, [x0, 74]
strbw1, [x0, 75]
strbw1, [x0, 76]
strbw1, [x0, 77]
strbw1, [x0, 78]
strbw1, [x0, 79]
strbwzr, [x0, 80]
strbwzr, [x0, 81]
strbwzr, [x0, 82]
strbwzr, [x0, 83]
strbwzr, [x0, 84]
strbwzr, [x0, 85]
strbwzr, [x0, 86]
strbwzr, [x0, 87]
strbwzr, [x0, 88]
strbwzr, [x0, 89]
strbwzr, [x0, 90]
strbwzr, [x0, 91]
strbwzr, [x0, 92]
strbwzr, [x0, 93]
strbwzr, [x0, 94]
strbwzr, [x0, 95]
strbw1, [x0, 48]
strbw1, [x0, 49]
strbw1, [x0, 50]
strbw1, [x0, 51]
strbw1, [x0, 52]
strbw1, [x0, 53]
strbw1, [x0, 54]
strbw1, [x0, 55]
strbw1, [x0, 56]
strbw1, [x0, 57]
strbw1, [x0, 58]
strbw1, [x0, 59]
strbw1, [x0, 60]
strbw1, [x0, 61]
strbw1, [x0, 62]
strbw1, [x0, 63]
strbw1, [x0, 96]
ret
.LFE0:
.size   foo, .-foo
.bss
.align  4
.set.LANCHOR0,. + 0
.type   dwarf_reg_size_table, %object
.size   dwarf_reg_size_table, 98
dwarf_reg_size_table:
.zero   98
.section.eh_frame,"aw",@progbits
.Lframe1:
.4byte  .LECIE1-.LSCIE1
.LSCIE1:
.4byte  0
.byte   0x3
.string "zR"
.byte   0x1
.byte   0x78
.byte   0x1e
.byte   0x1
.byte   0x1b
.byte   0xc
.byte   0x1f
.byte   0
.align  3
.LECIE1:
.LSFDE1:
.4byte  .LEFDE1-.LASFDE1
.LASFDE1:
.4byte  .LASFDE1-.Lframe1
.4byte  .LFB0-.
.4byte  .LFE0-.LFB0
.byte   0
.align  3
.LEFDE1:
.ident  "GCC: (GNU) 13.0.0 20230111 (experimental)"
.section.note.GNU-stack,"",@progbits
.arch armv8-a+sve
.file   "8.c"
.text
.align  2
.align  4
.global foo
.type   foo, %function
foo:
.LFB0:
adrpx3, .LANCHOR0
add x0, x3, :lo12:.LANCHOR0
mov w1, 8
cntdx2
strbw1, [x3, #:lo12:.LANCHOR0]
strbw1, [x0, 1]
strb   

[PING] [PATCH] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-01-12 Thread Surya Kumari Jangala via Gcc-patches
Ping

On 04/01/23 1:58 pm, Surya Kumari Jangala via Gcc-patches wrote:
> swap: Fix incorrect lane extraction by vec_extract() [PR106770]
> 
> In the routine rs6000_analyze_swaps(), special handling of swappable
> instructions is done even if the webs that contain the swappable
> instructions are not optimized, i.e., the webs do not contain any
> permuting load/store instructions along with the associated register
> swap instructions. Doing special handling in such webs will result in
> the extracted lane being adjusted unnecessarily for vec_extract.
> 
> Modifying swappable instructions is also incorrect in webs where
> loads/stores on quad word aligned addresses are changed to lvx/stvx.
> Similarly, in webs where swap(load(vector constant)) instructions are
> replaced with load(swapped vector constant), the swappable
> instructions should not be modified.
> 
> 2023-01-04  Surya Kumari Jangala  
> 
> gcc/
>   PR rtl-optimization/106770
>   * rs6000-p8swap.cc (rs6000_analyze_swaps): .
> 
> gcc/testsuite/
>   PR rtl-optimization/106770
>   * gcc.target/powerpc/pr106770.c: New test.
> ---
> 
> diff --git a/gcc/config/rs6000/rs6000-p8swap.cc 
> b/gcc/config/rs6000/rs6000-p8swap.cc
> index 19fbbfb67dc..7ed39251df9 100644
> --- a/gcc/config/rs6000/rs6000-p8swap.cc
> +++ b/gcc/config/rs6000/rs6000-p8swap.cc
> @@ -179,6 +179,9 @@ class swap_web_entry : public web_entry_base
>unsigned int special_handling : 4;
>/* Set if the web represented by this entry cannot be optimized.  */
>unsigned int web_not_optimizable : 1;
> +  /* Set if the web represented by this entry has been optimized, ie,
> + register swaps of permuting loads/stores have been removed.  */
> +  unsigned int web_is_optimized : 1;
>/* Set if this insn should be deleted.  */
>unsigned int will_delete : 1;
>  };
> @@ -2627,22 +2630,43 @@ rs6000_analyze_swaps (function *fun)
>/* For each load and store in an optimizable web (which implies
>   the loads and stores are permuting), find the associated
>   register swaps and mark them for removal.  Due to various
> - optimizations we may mark the same swap more than once.  Also
> - perform special handling for swappable insns that require it.  */
> + optimizations we may mark the same swap more than once. Fix up
> + the non-permuting loads and stores by converting them into
> + permuting ones.  */
>for (i = 0; i < e; ++i)
>  if ((insn_entry[i].is_load || insn_entry[i].is_store)
>   && insn_entry[i].is_swap)
>{
>   swap_web_entry* root_entry
> = (swap_web_entry*)((_entry[i])->unionfind_root ());
> - if (!root_entry->web_not_optimizable)
> + if (!root_entry->web_not_optimizable) {
> mark_swaps_for_removal (insn_entry, i);
> +  root_entry->web_is_optimized = true;
> +}
>}
> -else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
> +else if (insn_entry[i].is_swappable
> + && (insn_entry[i].special_handling == SH_NOSWAP_LD ||
> + insn_entry[i].special_handling == SH_NOSWAP_ST))
> +  {
> +swap_web_entry* root_entry
> +  = (swap_web_entry*)((_entry[i])->unionfind_root ());
> +if (!root_entry->web_not_optimizable) {
> +  handle_special_swappables (insn_entry, i);
> +  root_entry->web_is_optimized = true;
> +}
> +  }
> +
> +  /* Perform special handling for swappable insns that require it. 
> + Note that special handling should be done only for those 
> + swappable insns that are present in webs optimized above.  */
> +  for (i = 0; i < e; ++i)
> +if (insn_entry[i].is_swappable && insn_entry[i].special_handling &&
> +!(insn_entry[i].special_handling == SH_NOSWAP_LD || 
> +  insn_entry[i].special_handling == SH_NOSWAP_ST))
>{
>   swap_web_entry* root_entry
> = (swap_web_entry*)((_entry[i])->unionfind_root ());
> - if (!root_entry->web_not_optimizable)
> + if (root_entry->web_is_optimized)
> handle_special_swappables (insn_entry, i);
>}
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106770.c 
> b/gcc/testsuite/gcc.target/powerpc/pr106770.c
> new file mode 100644
> index 000..84e9aead975
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O3 " } */
> +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
> +
> +/* Test case to resolve PR106770  */
> +
> +#include 
> +
> +int cmp2(double a, double b)
> +{
> +vector double va = vec_promote(a, 1);
> +vector double vb = vec_promote(b, 1);
> +vector long long vlt = (vector long long)vec_cmplt(va, vb);
> +vector long long vgt = (vector long long)vec_cmplt(vb, va);
> +vector signed long long vr = vec_sub(vlt, vgt);
> +
> +return vec_extract(vr, 1);
> +}
> 

Re: [PATCH] Various fixes for DWARF register size computation

2023-01-12 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek via Gcc-patches  writes:
> On Tue, Jan 03, 2023 at 02:25:21PM +0100, Florian Weimer wrote:
>> > Though, I still wonder, because all of this is a hack for a single target
>> > - x86_64-linux -m64 - I think no other target has similar constant
>> > sizes,
>> 
>> Really?  That's odd.
>
> I've tried about 30 cross compilers I had around, I admit it isn't
> exhaustive.
>
>> Is it because other architectures track callee-saved vector registers
>> through unwinding?
>
> I think it is far more than just vector registers, it can be floating point
> registers, or just one integral or special register of a different size etc.
>
>> > Or, if you want to do it on the compiler side, instead of predefining
>> > __LIBGCC_DWARF_REG_SIZES_CONSTANT__ and __LIBGCC_DWARF_REG_MAXIMUM__
>> > register conditionally a new builtin, __builtin_dwarf_reg_size,
>> > which would be defined only if -fbuilding-libgcc and the compiler 
>> > determines
>> > dwarf_reg_size is desirable to be computed inline without a table and
>> > would fold the builtin to say that
>> > index <= 16U ? 8 : 0 on x86_64 -m64,
>> > index <= 9U ? 4 : index - 11U <= 5U ? 12 : 0 on x86_64 -m32 etc.
>> > and if the expression is too large/complex, wouldn't predefine the builtin.
>> 
>> I think the pre-computation of the size array is useful even for targets
>> where the expression is not so simple, but the array elements are still
>> constants.  A builtin like __builtin_dwarf_reg_size could use a
>> reference to a constant static array, so that we can get rid of the
>> array initialization code in libgcc.  Before we can do that, we need to
>
> I think constant static array might be sometimes an option too, but not
> always, not just because of AArch64, or other potential poly-int arches
> (RISCV?), but also if something could depend on some runtime check.
> It is true that most of the time it is constant and depends just on the
> target or more often on target + options combo (mostly ABI related options).
>
>> figure out if the fully dynamic register sizes on AArch64 with SVE are
>> actually correct—and if we need to fix the non-SVE unwinder to work
>> properly for SVE programs.
>> 
>> So I don't want to revert the size array computation just yet.
>> 
>> > Or, is it actually the use of table that is bad on the unwinder side,
>> > or lack of a small upper bound for what you get from the table?
>> > In that case you could predefine upper bound on the sizes instead (if
>> > constant) and simply add if (size > __LIBGCC_DWARF_REG_SIZE_MAX__)
>> > __builtin_unreachable ()).
>> 
>> It also matters what kind of register sizes are used in practice.
>
> Yes, but that is hard to find out easily.  E.g. some registers might be
> only saved/restored in signal frames and nowhere else, others only rarely
> touched but still we'd need their sizes if they are ever used.
>
>> Should I repost this patch with the three nits fixed?  Or should I
>> revert two of the three patches I committed instead?
>
> I lean towards reversion and trying to figure out something that works
> on more than one arch.  It doesn't have to improve all arches (say
> AArch64 is clearly a nightmare that isn't handled correctly even without
> any changes - as noted in the PRs, if libgcc is built without SVE, it will
> hardcode 8, while if it is built with SVE, it will be runtime dependent and
> will be wrong in theory when some HW has 2048 bit SVE vectors - when it is
> 256 bytes), but still watching into what we compile -O2 -nostdinc 
> -fbuilding-libgcc

I'm jumping in here without fully understanding the context, so maybe this
is exactly your point, but: the SIMD/FP DWARF registers are supposed to be
size 8 regardless of which features are enabled.  That's already only half
of the hardware register size for base Armv8-A, since Advanced SIMD registers
are 16 bytes in size.

So yeah, if we're using the hardware register size then something is wrong.

Thanks,
Richard

> static unsigned char dwarf_reg_size_table[__LIBGCC_DWARF_FRAME_REGISTERS__+1];
>
> void
> foo (void)
> {
>   __builtin_init_dwarf_reg_size_table (dwarf_reg_size_table);
> }
>
> on at least 10 most common arches would be useful and optimizing what is
> easily possible would be nice.
>
>   Jakub


Re: [PATCH] libgccjit: Fix float vector comparison

2023-01-12 Thread Antoni Boucher via Gcc-patches
Ping David:

Some more notes about the try/catch API:
I finally got unwinding implemented in rustc_codegen_gcc with the
following GCC patch:
https://github.com/antoyo/gcc/commit/fd603a3c715d3708f831cb637fbcc48bf4641859

It still requires clean-up, but you can have a look at it.
I'm still unsure for the CFG:
currently, it requires the finally to be terminated which would prevent
a finally reached through unwinding to work correctly; unless you call
unwind_resume, in which case, that would probably prevent a normal
finally (e.g. reached not by unwinding, but by falling off the try).
I'll try to not require the finally block to be terminated, but I
remember having issues making that work.

What are your thoughts on this? 

On Fri, 2022-12-02 at 09:29 -0500, Antoni Boucher wrote:
> On Thu, 2022-12-01 at 11:57 -0500, David Malcolm wrote:
> > On Thu, 2022-12-01 at 10:33 -0500, Antoni Boucher wrote:
> > > On Thu, 2022-12-01 at 10:25 -0500, David Malcolm wrote:
> > > > On Thu, 2022-12-01 at 10:01 -0500, Antoni Boucher wrote:
> > > > > Thanks, David.
> > > > > Since we're not in phase 1 anymore, do we need an approval
> > > > > before
> > > > > I
> > > > > merge like last year or can I merge immediately?
> > > > 
> > > > I think it counts as a bug fix and thus you can go ahead and
> > > > merge
> > > > (assuming you've done the usual testing).
> > > > 
> > > > > I also have many other patches (all in jit) that I need to
> > > > > prepare
> > > > > and
> > > > > post to this mailing list.
> > > > > What do you think?
> > > > 
> > > > Given that you're one of the main users of libgccjit I think
> > > > there's
> > > > a
> > > > case for stretching the deadlines a bit here.
> > > > 
> > > > Do you have a repo I can look at?
> > > 
> > > Yes! The commits are in my fork:
> > > https://github.com/antoyo/gcc
> > > 
> > > The only big one is the one adding support for target-dependent
> > > builtins:
> > > https://github.com/antoyo/gcc/commit/6d4313d4c02dd878f43917c978f299f5119330f0
> > > 
> > > Regarding this one, there's the issue that since we record the
> > > builtins
> > > on the first context run, we only have access to the builtins
> > > from
> > > the
> > > second run.
> > > Do you have any idea how to fix this?
> > > Or do you consider this is acceptable?
> > 
> > This is implemented behind the new
> > gcc_jit_context_get_target_builtin_function entrypoint, right?
> 
> Yes.
> 
> > 
> > If so, perhaps that recording::context::get_target_builtin_function
> > could detect if it's the first time it's been called on this
> > context,
> > and if so make a playback::context to do the detection?  That way
> > it
> > would be transparent to the user, and work first time.
> 
> Oh, the issue is actually with the type reflection API and also the
> type checking of function calls, so it's in the recording phase.
> While I could think of a workaround for the type checking (e.g.
> delayed
> type checking at the playback phase), I could not think of any better
> solution for the type reflection.
> 
> > 
> > 
> > I see you have patches to add function and variable attributes; I
> > wonder if this would be cleaner internally if there was a
> > recording::attribute class, rather than the std::pair currently in
> > use
> > (some attributes have int arguments rather than string, others have
> > multiple args).
> > 
> > I also wondered if a "gcc_jit_attribute" type could be exposed to
> > the
> > user, e.g.:
> > 
> >   attr1 = gcc_jit_context_new_attribute (ctxt, "noreturn");
> >   attr2 = gcc_jit_context_new_attribute_with_string (ctxt, "alias",
> > "__foo");
> >   gcc_jit_function_add_attribute (ctxt, attr1);
> >   gcc_jit_function_add_attribute (ctxt, attr2);
> > 
> > or somesuch?  But I think the API you currently have is OK.
> 
> Thanks for the suggestion; I'll look into that.
> 
> > 
> > 
> > > 
> > > I also have a WIP branch which adds support for try/catch:
> > > https://github.com/antoyo/gcc/commit/6219339fcacb079431596a0bc6cf8d430a1bd5a1
> > > I'm not sure if this one is going to be ready soon or not.
> > 
> > I see that the new entrypoints have e.g.:
> > 
> > /* Add a try/catch statement.
> >    This is equivalent to this C++ code:
> >  try {
> >     try_block
> >  }
> >  catch {
> >     catch_block
> >  }
> > */
> > 
> > void
> > gcc_jit_block_add_try_catch (gcc_jit_block *block,
> >  gcc_jit_location *loc,
> >  gcc_jit_block *try_block,
> >  gcc_jit_block *catch_block);
> > 
> > but I'm not sure how this is meant to interact with the CFG-like
> > model
> > used by the rest of the gcc_jit_block_* API.  What happens at the
> > end
> > of the blocks?  Does the generated code use the C++ ABI for
> > exception-
> > handling?
> 
> Currently, it requires the try and catch blocks to be terminated, but
> also require the block containing the try/catch to be terminated.
> That doesn't make sense.
> Would it be OK if it 

Re: [PATCH v4] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-12 Thread Richard Sandiford via Gcc-patches
Thanks for doing this and sorry for the slow review.

Lulu Cheng  writes:
> Hi, Richard:
>
>   Could you please help me look at this document? Is there any problem 
> with my modification?
>
> Thanks!
>
>
> 在 2022/12/27 下午2:42, Lulu Cheng 写道:
>> Co-authored-by: Yang Yujie 
>>
>> gcc/ChangeLog:
>>
>>  * config/loongarch/loongarch.cc (loongarch_classify_address):
>>  Add precessint for CONST_INT.
>>  (loongarch_print_operand_reloc): Operand modifier 'c' is supported.
>>  (loongarch_print_operand): Increase the processing of '%c'.
>>  * doc/extend.texi: Adds documents for LoongArch operand modifiers.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/loongarch/tst-asm-const.c: Moved to...
>>  * gcc.target/loongarch/pr107731.c: ...here.
>>
>> ---
>> V2 -> v3:
>> 1. Correct a clerical error.
>> 2. Adding document for loongarch operand modifiers.
>>
>> v3 -> v4:
>> Copy the description of "%c" "%n" "%a" "%l" from gccint.pdf to gcc.pdf.
>>
>>
>> ---
>>   gcc/config/loongarch/loongarch.cc | 14 
>>   gcc/doc/extend.texi   | 33 +++
>>   .../loongarch/{tst-asm-const.c => pr107731.c} |  6 ++--
>>   3 files changed, 50 insertions(+), 3 deletions(-)
>>   rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} 
>> (78%)
>>
>> diff --git a/gcc/config/loongarch/loongarch.cc 
>> b/gcc/config/loongarch/loongarch.cc
>> index c6b03fcf2f9..cdf190b985e 100644
>> --- a/gcc/config/loongarch/loongarch.cc
>> +++ b/gcc/config/loongarch/loongarch.cc
>> @@ -2075,6 +2075,11 @@ loongarch_classify_address (struct 
>> loongarch_address_info *info, rtx x,
>> return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
>>&& loongarch_valid_lo_sum_p (info->symbol_type, mode,
>> info->offset));
>> +case CONST_INT:
>> +  /* Small-integer addresses don't occur very often, but they
>> + are legitimate if $r0 is a valid base register.  */
>> +  info->type = ADDRESS_CONST_INT;
>> +  return IMM12_OPERAND (INTVAL (x));
>>   
>>   default:
>> return false;
>> @@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, 
>> bool hi64_part,
>>   
>>  'A' Print a _DB suffix if the memory model requires a release.
>>  'b' Print the address of a memory operand, without offset.
>> +   'c'  Print an integer.
>>  'C' Print the integer branch condition for comparison OP.
>>  'd' Print CONST_INT OP in decimal.
>>  'F' Print the FPU branch condition for comparison OP.
>> @@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int 
>> letter)
>>  fputs ("_db", file);
>> break;
>>   
>> +case 'c':
>> +  if (CONST_INT_P (op))
>> +fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
>> +  else
>> +output_operand_lossage ("unsupported operand for code '%c'", letter);
>> +
>> +  break;
>> +
>>   case 'C':
>> loongarch_print_int_branch_condition (file, code, letter);
>> break;
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index adba057c190..1bb960266ed 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -10234,6 +10234,24 @@ generates multiple assembler instructions.
>>   Outputs @samp{@{}, @samp{|}, and @samp{@}} characters (respectively)
>>   into the assembler code.  When unescaped, these characters have special
>>   meaning to indicate multiple assembler dialects, as described below.
>> +
>> +@item %c@var{digit}
>> +It can be used to substitute an operand that is a constant value without
>> +the syntax that normally indicates an immediate operand.

It would probably more idiomatic not to have the "it" for this kind of list,
so maybe:

Substitutes a constant value without the syntax that normally indicates
an immediate operand.

Similarly for %a and %l.

>> +
>> +@item %n@var{digit}
>> +It's like @samp{%c@var{digit}} except that the value of the constant is
>> +negated before printing.

Similarly here, I think just "Like ...", without the "It's".

>> +@item %a@var{digit}
>> +It Can be used to substitute an operand as if it were a memory reference,
>> +with the actual operand treated as the address.  This may be useful when
>> +outputting a ``load address'' instruction, because often the assembler
>> +syntax for such an instruction requires you to write the operand as if it
>> +were a memory reference.
>> +
>> +@item %l@var{digit}
>> +It's used to substitute a @code{label_ref} into a jump instruction.
>>   @end table

This list is describing uses of % that don't relate to operands.
The idea of using % with operand numbers is only introduced later, in:

@code{asm} supports operand modifiers on operands (for example @samp{%k2} 
instead of simply @samp{%2}). Typically these qualifiers are hardware 
dependent. The list of supported modifiers for x86 is found at 

Re: [PATCH] libcpp: suppress builtin macro redefined warnings for __LINE__

2023-01-12 Thread Longjun Luo via Gcc-patches

On 12/2/2022 5:10 AM, Joseph Myers wrote:

On Fri, 2 Dec 2022, Longjun Luo via Gcc-patches wrote:


They are ./gcc/testsuite/gcc.dg/cpp/warn-redefined.c and
./gcc/testsuite/gcc.dg/cpp/warn-redefined-2.c

These two cases redefine the __TIME__ macro when using the option
'-Wbuiltin-macro-redefined'.

I think I shoud add a test to verify __LINE__ macro in these two cases.


I think it should be a test that doesn't use either
-Wbuiltin-macro-redefined or -Wno-builtin-macro-redefined - a test of how
the compiler behaves by default.

Sorry for the delay. I have added a test case for this default 
situation. And also update another case to fully test the usage of 
builtin macros redefintions.

So, the patch itself has no problem. What I need do is to rich its test cases
and update change log, right?


The patch needs review, but I'm fine with the principle that
-Wno-builtin-macro-redefined should apply to __LINE__ as it does to
various other built-in macros.


Thanks for your patience.


Re: [PATCH] libcpp: suppress builtin macro redefined warnings for __LINE__

2023-01-12 Thread Longjun Luo via Gcc-patches

From 0821df518b264e754d698d399f98be1a62945e32 Mon Sep 17 00:00:00 2001
From: Longjun Luo 
Date: Thu, 12 Jan 2023 23:59:54 +0800
Subject: [PATCH] libcpp: suppress builtin macro redefined warnings for
 __LINE__

As implied in
gcc.gnu.org/legacy-ml/gcc-patches/2008-09/msg00076.html,
gcc provides -Wno-builtin-macro-redefined to suppress warning when
redefining builtin macro. However, at that time, there was no
scenario for __LINE__ macro.

But, when we try to build a live-patch, we compare sections by using
-ffunction-sections. Some same functions are considered changed because
of __LINE__ macro.

At present, to detect such a changed caused by __LINE__ macro, we
have to analyse code and maintain a function list. For example,
in kpatch, check this commit
github.com/dynup/kpatch/commit/0e1b95edeafa36edb7bcf11da6d1c00f76d7e03d.

So, in this scenario, when we try to compared sections, it would
be better to support suppress builtin macro redefined warnings for
__LINE__ macro.

Signed-off-by: Longjun Luo 
---
 gcc/testsuite/gcc.dg/builtin-redefine-1.c | 49 +++
 gcc/testsuite/gcc.dg/builtin-redefine.c   | 24 +--
 libcpp/init.cc|  2 +-
 3 files changed, 70 insertions(+), 5 deletions(-)
 create mode 100755 gcc/testsuite/gcc.dg/builtin-redefine-1.c

diff --git a/gcc/testsuite/gcc.dg/builtin-redefine-1.c 
b/gcc/testsuite/gcc.dg/builtin-redefine-1.c

new file mode 100755
index 000..c1e05b4fc7c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-redefine-1.c
@@ -0,0 +1,49 @@
+/* Test default warnings for redefining builtin macros.  */
+
+/* { dg-do compile } */
+/* { dg-options "-D__TIMESTAMP__=x -D__TIME__=x -D__DATE__=x 
-D__FILE__=x -D__FILE_NAME__=x -D__BASE_FILE__=x -D__LINE__=0" } */

+
+/* Check default behavior for builtin macros redefinition.  */
+
+/* { dg-message "\"__TIMESTAMP__\" redefined" "" {target "*-*-*"} 0 } */
+#ifndef __TIMESTAMP__
+#error "__TIMESTAMP__ builtin is not defined"
+/* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } 
.-1 } */

+#endif
+
+/* { dg-message "\"__TIME__\" redefined" "" {target "*-*-*"} 0 } */
+#ifndef __TIME__
+#error "__TIME__ builtin is not defined"
+/* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } 
.-1 } */

+#endif
+
+/* { dg-message "\"__DATE__\" redefined" "" {target "*-*-*"} 0 } */
+#ifndef __DATE__
+#error "__DATE__ builtin is not defined"
+/* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } 
.-1 } */

+#endif
+
+/* { dg-message "\"__FILE__\" redefined" "" {target "*-*-*"} 0 } */
+#ifndef __FILE__
+#error "__FILE__ builtin is not defined"
+/* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } 
.-1 } */

+#endif
+
+/* { dg-message "\"__FILE_NAME__\" redefined" "" {target "*-*-*"} 0 } */
+#ifndef __FILE_NAME__
+#error "__FILE_NAME__ builtin is not defined"
+/* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } 
.-1 } */

+#endif
+
+/* { dg-message "\"__BASE_FILE__\" redefined" "" {target "*-*-*"} 0 } */
+#ifndef __BASE_FILE__
+#error "__BASE_FILE__ builtin is not defined"
+/* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } 
.-1 } */

+#endif
+
+/* { dg-message "\"__LINE__\" redefined" "" {target "*-*-*"} 0 } */
+#ifndef __LINE__
+#error "__LINE__ builtin is not defined"
+/* { dg-bogus "Expected built-in is not defined" "" { target *-*-* } 
.-1 } */

+#endif
+
diff --git a/gcc/testsuite/gcc.dg/builtin-redefine.c 
b/gcc/testsuite/gcc.dg/builtin-redefine.c

index 882b2210992..fa27ee9aefc 100644
--- a/gcc/testsuite/gcc.dg/builtin-redefine.c
+++ b/gcc/testsuite/gcc.dg/builtin-redefine.c
@@ -1,9 +1,9 @@
 /* Test -Wno-builtin-macro-redefined warnings.  */

 /* { dg-do compile } */
-/* { dg-options "-Wno-builtin-macro-redefined -U__DATE__ -D__TIME__=X" } */
+/* { dg-options "-Wno-builtin-macro-redefined -U__DATE__ -D__TIME__=X 
-D__LINE__=0" } */


-/* Check date, time, and datestamp built-ins warnings may be 
suppressed.  */
+/* Check date, time, datestamp and line built-ins warnings may be 
suppressed.  */


 #if defined(__DATE__)
 #error "__DATE__ is defined, but should not be (-U command line error)"
@@ -15,6 +15,11 @@
 /* { dg-bogus "__TIME__ is not defined" "" { target *-*-* } .-1 } */
 #endif

+#if __LINE__ != 0
+#error "__LINE__ is not defined as expected (-D command line error)"
+/* { dg-bogus "__LINE__ is not defined" "" { target *-*-* } .-1 } */
+#endif
+
 #if !defined(__TIMESTAMP__)
 #error "__TIMESTAMP__ is not defined (built-in macro expectation error)"
 /* { dg-bogus "__TIMESTAMP__ is not defined" "" { target *-*-* } .-1 } */
@@ -53,6 +58,18 @@
 #undef __TIMESTAMP__ /* Undefine while defined.  */


+#undef __LINE__  /* Undefine while defined.  */
+#undef __LINE__  /* Undefine while already undefined.  */
+
+#define __LINE__ "1" /* Define while undefined.  */
+#define __LINE__ "1" /* Re-define while defined.  */ /* { 
dg-line line_prev } */

+
+#define 

Re: [aarch64] Use dup and zip1 for interleaving elements in initializing vector

2023-01-12 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> On Tue, 6 Dec 2022 at 07:01, Prathamesh Kulkarni
>  wrote:
>>
>> On Mon, 5 Dec 2022 at 16:50, Richard Sandiford
>>  wrote:
>> >
>> > Richard Sandiford via Gcc-patches  writes:
>> > > Prathamesh Kulkarni  writes:
>> > >> Hi,
>> > >> For the following test-case:
>> > >>
>> > >> int16x8_t foo(int16_t x, int16_t y)
>> > >> {
>> > >>   return (int16x8_t) { x, y, x, y, x, y, x, y };
>> > >> }
>> > >>
>> > >> Code gen at -O3:
>> > >> foo:
>> > >> dupv0.8h, w0
>> > >> ins v0.h[1], w1
>> > >> ins v0.h[3], w1
>> > >> ins v0.h[5], w1
>> > >> ins v0.h[7], w1
>> > >> ret
>> > >>
>> > >> For 16 elements, it results in 8 ins instructions which might not be
>> > >> optimal perhaps.
>> > >> I guess, the above code-gen would be equivalent to the following ?
>> > >> dup v0.8h, w0
>> > >> dup v1.8h, w1
>> > >> zip1 v0.8h, v0.8h, v1.8h
>> > >>
>> > >> I have attached patch to do the same, if number of elements >= 8,
>> > >> which should be possibly better compared to current code-gen ?
>> > >> Patch passes bootstrap+test on aarch64-linux-gnu.
>> > >> Does the patch look OK ?
>> > >>
>> > >> Thanks,
>> > >> Prathamesh
>> > >>
>> > >> diff --git a/gcc/config/aarch64/aarch64.cc 
>> > >> b/gcc/config/aarch64/aarch64.cc
>> > >> index c91df6f5006..e5dea70e363 100644
>> > >> --- a/gcc/config/aarch64/aarch64.cc
>> > >> +++ b/gcc/config/aarch64/aarch64.cc
>> > >> @@ -22028,6 +22028,39 @@ aarch64_expand_vector_init (rtx target, rtx 
>> > >> vals)
>> > >>return;
>> > >>  }
>> > >>
>> > >> +  /* Check for interleaving case.
>> > >> + For eg if initializer is (int16x8_t) {x, y, x, y, x, y, x, y}.
>> > >> + Generate following code:
>> > >> + dup v0.h, x
>> > >> + dup v1.h, y
>> > >> + zip1 v0.h, v0.h, v1.h
>> > >> + for "large enough" initializer.  */
>> > >> +
>> > >> +  if (n_elts >= 8)
>> > >> +{
>> > >> +  int i;
>> > >> +  for (i = 2; i < n_elts; i++)
>> > >> +if (!rtx_equal_p (XVECEXP (vals, 0, i), XVECEXP (vals, 0, i % 2)))
>> > >> +  break;
>> > >> +
>> > >> +  if (i == n_elts)
>> > >> +{
>> > >> +  machine_mode mode = GET_MODE (target);
>> > >> +  rtx dest[2];
>> > >> +
>> > >> +  for (int i = 0; i < 2; i++)
>> > >> +{
>> > >> +  rtx x = copy_to_mode_reg (GET_MODE_INNER (mode), XVECEXP 
>> > >> (vals, 0, i));
>> > >
>> > > Formatting nit: long line.
>> > >
>> > >> +  dest[i] = gen_reg_rtx (mode);
>> > >> +  aarch64_emit_move (dest[i], gen_vec_duplicate (mode, x));
>> > >> +}
>> > >
>> > > This could probably be written:
>> > >
>> > > for (int i = 0; i < 2; i++)
>> > >   {
>> > > rtx x = expand_vector_broadcast (mode, XVECEXP (vals, 0, i));
>> > > dest[i] = force_reg (GET_MODE_INNER (mode), x);
>> >
>> > Oops, I meant "mode" rather than "GET_MODE_INNER (mode)", sorry.
>> Thanks, I have pushed the change in
>> 769370f3e2e04823c8a621d8ffa756dd83ebf21e after running
>> bootstrap+test on aarch64-linux-gnu.
> Hi Richard,
> I have attached a patch that extends the transform if one half is dup
> and other is set of constants.
> For eg:
> int8x16_t f(int8_t x)
> {
>   return (int8x16_t) { x, 1, x, 2, x, 3, x, 4, x, 5, x, 6, x, 7, x, 8 };
> }
>
> code-gen trunk:
> f:
> adrpx1, .LC0
> ldr q0, [x1, #:lo12:.LC0]
> ins v0.b[0], w0
> ins v0.b[2], w0
> ins v0.b[4], w0
> ins v0.b[6], w0
> ins v0.b[8], w0
> ins v0.b[10], w0
> ins v0.b[12], w0
> ins v0.b[14], w0
> ret
>
> code-gen with patch:
> f:
> dup v0.16b, w0
> adrpx0, .LC0
> ldr q1, [x0, #:lo12:.LC0]
> zip1v0.16b, v0.16b, v1.16b
> ret
>
> Bootstrapped+tested on aarch64-linux-gnu.
> Does it look OK ?

Looks like a nice improvement.  It'll need to wait for GCC 14 now though.

However, rather than handle this case specially, I think we should instead
take a divide-and-conquer approach: split the initialiser into even and
odd elements, find the best way of loading each part, then compare the
cost of these sequences + ZIP with the cost of the fallback code (the code
later in aarch64_expand_vector_init).

For example, doing that would allow:

  { x, y, 0, y, 0, y, 0, y, 0, y }

to be loaded more easily, even though the even elements aren't wholly
constant.

Thanks,
Richard

>
> Thanks,
> Prathamesh
>>
>
>> Thanks,
>> Prathamesh
>> >
>> > >   }
>> > >
>> > > which avoids forcing constant elements into a register before the 
>> > > duplication.
>> > > OK with that change if it works.
>> > >
>> > > Thanks,
>> > > Richard
>> > >
>> > >> +
>> > >> +  rtvec v = gen_rtvec (2, dest[0], dest[1]);
>> > >> +  emit_set_insn (target, gen_rtx_UNSPEC (mode, v, UNSPEC_ZIP1));
>> > >> +  return;
>> > >> +}
>> > >> +}
>> > >> +
>> > >>enum insn_code icode 

Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Thu, Jan 12, 2023 at 03:22:56PM +, Richard Sandiford wrote:
>> If we have a new enum, I think we should handle it explicitly.  The fact
>> that the information isn't propagated between frames is a key part of
>> the semantics.
>> 
>> >> Another option is to just define the arch dependent value for how field
>> >> in the arch code, right now it is unsigned char type, so using say
>> >> (unsigned char) ~0 or (unsigned char) ~0 and (unsigned char) ~1 as arch
>> >> specific values might be ok too.
>> >
>> > Yet another option would be to define 1-2 extra REG_ values in the generic
>> > unwind-dw2.h header, but name them
>> >   REG_ARCH_SPECIFIC_1,
>> >   REG_ARCH_SPECIFIC_2,
>> > or so, and then the machine specific code can
>> > #define REG_AARCH64_TOGGLE_ON REG_ARCH_SPECIFIC_1
>> > Of course, all this depends on whether the arch specific codes can be
>> > handled in uw_update_context_1 by doing break; there and nothing else.
>> 
>> Yeah, personally I'd prefer for target-independent code to provide
>> the toggle representation, even if it isn't widely used.
>
> I can live even with that, I just hope it won't make code generation worse
> on other targets.
> Anyway, I understood aarch64 needs 2 states for the signing, so one would
> be REG_TOGGLE_ON and the other anything else?

The other is the default (no signing), so it needs to be REG_UNSAVED.

> Users can always create (invalid?) unwind info where they save the magic
> register, make it undefined etc.
>
> And
> void bar (void), baz (void), boo (void), qux (void), corge (void);
> enum {
>   REG_UNSAVED,
>   REG_SAVED_OFFSET,
>   REG_SAVED_REG,
>   REG_SAVED_EXP,
>   REG_SAVED_VAL_OFFSET,
>   REG_SAVED_VAL_EXP,
>   REG_UNDEFINED
> #ifdef ANOTHER
>   , REG_TOGGLE_ON
> #endif
> };
>
> void
> foo (unsigned char c)
> {
>   switch (c)
> {
>   case REG_UNSAVED:
>   case REG_UNDEFINED:
> #ifdef ANOTHER
>   case REG_TOGGLE_ON:
> #endif
> break;
>   
>   case REG_SAVED_OFFSET:
> bar ();
> break;
>  
>   case REG_SAVED_REG:
> baz ();
> break;
>   
>   case REG_SAVED_EXP:
> boo ();
> break;
>   
>   case REG_SAVED_VAL_OFFSET:
> qux ();
> break;
>
>   case REG_SAVED_VAL_EXP:
> corge ();
> break;
> }
> }
> suggests that it doesn't, already cfg pass turns the implicit default:
> into something that covers even the REG_UNSAVED, REG_UNDEFINED and maybe
> REG_TOGGLE_ON values into default:

OK, that's good.  Maybe having it behind a macro wouldn't be too bad though,
if it comes to that.

Thanks,
Richard


Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 03:22:56PM +, Richard Sandiford wrote:
> If we have a new enum, I think we should handle it explicitly.  The fact
> that the information isn't propagated between frames is a key part of
> the semantics.
> 
> >> Another option is to just define the arch dependent value for how field
> >> in the arch code, right now it is unsigned char type, so using say
> >> (unsigned char) ~0 or (unsigned char) ~0 and (unsigned char) ~1 as arch
> >> specific values might be ok too.
> >
> > Yet another option would be to define 1-2 extra REG_ values in the generic
> > unwind-dw2.h header, but name them
> >   REG_ARCH_SPECIFIC_1,
> >   REG_ARCH_SPECIFIC_2,
> > or so, and then the machine specific code can
> > #define REG_AARCH64_TOGGLE_ON REG_ARCH_SPECIFIC_1
> > Of course, all this depends on whether the arch specific codes can be
> > handled in uw_update_context_1 by doing break; there and nothing else.
> 
> Yeah, personally I'd prefer for target-independent code to provide
> the toggle representation, even if it isn't widely used.

I can live even with that, I just hope it won't make code generation worse
on other targets.
Anyway, I understood aarch64 needs 2 states for the signing, so one would
be REG_TOGGLE_ON and the other anything else?
Users can always create (invalid?) unwind info where they save the magic
register, make it undefined etc.

And
void bar (void), baz (void), boo (void), qux (void), corge (void);
enum {
  REG_UNSAVED,
  REG_SAVED_OFFSET,
  REG_SAVED_REG,
  REG_SAVED_EXP,
  REG_SAVED_VAL_OFFSET,
  REG_SAVED_VAL_EXP,
  REG_UNDEFINED
#ifdef ANOTHER
  , REG_TOGGLE_ON
#endif
};

void
foo (unsigned char c)
{
  switch (c)
{
  case REG_UNSAVED:
  case REG_UNDEFINED:
#ifdef ANOTHER
  case REG_TOGGLE_ON:
#endif
break;
  
  case REG_SAVED_OFFSET:
bar ();
break;
 
  case REG_SAVED_REG:
baz ();
break;
  
  case REG_SAVED_EXP:
boo ();
break;
  
  case REG_SAVED_VAL_OFFSET:
qux ();
break;

  case REG_SAVED_VAL_EXP:
corge ();
break;
}
}
suggests that it doesn't, already cfg pass turns the implicit default:
into something that covers even the REG_UNSAVED, REG_UNDEFINED and maybe
REG_TOGGLE_ON values into default:

Jakub



Re: Missed lowering to ld1rq from svld1rq for memory operand

2023-01-12 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> On Fri, 5 Aug 2022 at 17:49, Richard Sandiford
>  wrote:
>>
>> Prathamesh Kulkarni  writes:
>> > Hi Richard,
>> > Following from off-list discussion, in the attached patch, I wrote pattern
>> > similar to vec_duplicate_reg, which seems to work for the svld1rq 
>> > tests.
>> > Does it look OK ?
>> >
>> > Sorry, I didn't fully understand your suggestion on integrating with
>> > vec_duplicate_reg
>> > pattern. For vec_duplicate_reg, the operand to vec_duplicate expects
>> > mode to be , while the pattern in patch expects operand of
>> > vec_duplicate to have mode .
>> > How do we write a pattern so an operand can accept either of the 2 modes ?
>>
>> I quoted the wrong one, sorry, should have been
>> aarch64_vec_duplicate_vq_le.
>>
>> > Also it seems  cannot be used with SVE_ALL ?
>>
>> Yeah, these would be SVE_FULL only.
> Hi Richard,
> Sorry for the very late reply. I have attached patch, to integrate
> with vec_duplicate_vq_le.
> Bootstrapped+tested on aarch64-linux-gnu.
> OK to commit ?
>
> Thanks,
> Prathamesh
>>
>> Richard
>>
>
> gcc/
>   * config/aarch64/aarch64-sve.md (aarch64_vec_duplicate_vq_le):
>   Change to define_insn_and_split to fold ldr+dup to ld1rq.
>   * config/aarch64/predicates.md (aarch64_sve_dup_ld1rq_operand): New.
>
> testsuite/
>   * gcc.target/aarch64/sve/acle/general/pr96463-2.c: Adjust.
>
> diff --git a/gcc/config/aarch64/aarch64-sve.md 
> b/gcc/config/aarch64/aarch64-sve.md
> index b8cc47ef5fc..4548375b8d6 100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -2533,14 +2533,34 @@
>  )
>  
>  ;; Duplicate an Advanced SIMD vector to fill an SVE vector (LE version).
> -(define_insn "@aarch64_vec_duplicate_vq_le"
> -  [(set (match_operand:SVE_FULL 0 "register_operand" "=w")
> +
> +(define_insn_and_split "@aarch64_vec_duplicate_vq_le"
> +  [(set (match_operand:SVE_FULL 0 "register_operand" "=w, w")
>   (vec_duplicate:SVE_FULL
> -   (match_operand: 1 "register_operand" "w")))]
> +   (match_operand: 1 "aarch64_sve_dup_ld1rq_operand" "w, UtQ")))
> +   (clobber (match_scratch:VNx16BI 2 "=X, Upl"))]
>"TARGET_SVE && !BYTES_BIG_ENDIAN"
>{
> -operands[1] = gen_rtx_REG (mode, REGNO (operands[1]));
> -return "dup\t%0.q, %1.q[0]";
> +switch (which_alternative)
> +  {
> + case 0:
> +   operands[1] = gen_rtx_REG (mode, REGNO (operands[1]));
> +   return "dup\t%0.q, %1.q[0]";
> + case 1:
> +   return "#";
> + default:
> +   gcc_unreachable ();
> +  }
> +  }
> +  "&& MEM_P (operands[1])"
> +  [(const_int 0)]
> +  {
> +if (GET_CODE (operands[2]) == SCRATCH)
> +  operands[2] = gen_reg_rtx (VNx16BImode);
> +emit_move_insn (operands[2], CONSTM1_RTX (VNx16BImode));
> +rtx gp = gen_lowpart (mode, operands[2]);
> +emit_insn (gen_aarch64_sve_ld1rq (operands[0], operands[1], gp));
> +DONE;
>}
>  )
>  
> diff --git a/gcc/config/aarch64/predicates.md 
> b/gcc/config/aarch64/predicates.md
> index ff7f73d3f30..6062f37025e 100644
> --- a/gcc/config/aarch64/predicates.md
> +++ b/gcc/config/aarch64/predicates.md
> @@ -676,6 +676,10 @@
>(ior (match_operand 0 "register_operand")
> (match_operand 0 "aarch64_sve_ld1r_operand")))
>  
> +(define_predicate "aarch64_sve_dup_ld1rq_operand"
> +  (ior (match_operand 0 "register_operand")
> +   (match_operand 0 "aarch64_sve_ld1rq_operand")))
> +
>  (define_predicate "aarch64_sve_ptrue_svpattern_immediate"
>(and (match_code "const")
> (match_test "aarch64_sve_ptrue_svpattern_p (op, NULL)")))
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr96463-2.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr96463-2.c
> index 196de3f5e0a..c38204e6874 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr96463-2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr96463-2.c
> @@ -26,4 +26,4 @@ TEST(svfloat64_t, float64_t, f64)
>  
>  TEST(svbfloat16_t, bfloat16_t, bf16)
>  
> -/* { dg-final { scan-assembler-times {\tdup\tz[0-9]+\.q, z[0-9]+\.q\[0\]} 12 
> { target aarch64_little_endian } } } */
> +/* { dg-final { scan-assembler-not {\tdup\t} } } */

It would be good to add something like:

/* { dg-final { scan-assembler-times {\tld1rq\t} 12 } } */

(I assume it'll pass for both endiannesses, but please check!),
in addition to the scan-assembler-not.

OK with that change, thanks.

Richard


Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Thu, Jan 12, 2023 at 01:28:59PM +0100, Jakub Jelinek wrote:
>> > Although we don't AFAIK support using DW_CFA_undefined with RA signing,
>> > the failure mode would be non-obvious: it would effectively toggle the
>> > bit on.
>> 
>> We don't install unwind-dw2.h nor give user code access to the how array
>> (and it just lives on the stack of __frame_state_for/uw_init_context_1
>> functions and address of it is passed to functions called from it),
>> so I hope all this is private to the libgcc unwinder.  After all, otherwise
>> e.g. the change how "how" is represented couldn't be done.
>> That said, if new enum entries are added in the generic code, then
>> I think uw_update_context_1 will warn about unhandled case in a switch,
>> unless we e.g. change
>>   case REG_UNSAVED:
>>   case REG_UNDEFINED:
>> break;
>> to
>>   default:
>>  break;
>> (and provided that the new enums would want such handling).

If we have a new enum, I think we should handle it explicitly.  The fact
that the information isn't propagated between frames is a key part of
the semantics.

>> Another option is to just define the arch dependent value for how field
>> in the arch code, right now it is unsigned char type, so using say
>> (unsigned char) ~0 or (unsigned char) ~0 and (unsigned char) ~1 as arch
>> specific values might be ok too.
>
> Yet another option would be to define 1-2 extra REG_ values in the generic
> unwind-dw2.h header, but name them
>   REG_ARCH_SPECIFIC_1,
>   REG_ARCH_SPECIFIC_2,
> or so, and then the machine specific code can
> #define REG_AARCH64_TOGGLE_ON REG_ARCH_SPECIFIC_1
> Of course, all this depends on whether the arch specific codes can be
> handled in uw_update_context_1 by doing break; there and nothing else.

Yeah, personally I'd prefer for target-independent code to provide
the toggle representation, even if it isn't widely used.

Thanks,
Richard


Re: [PATCH] wwwdocs: Note that old reload is deprecated

2023-01-12 Thread Paul Koning via Gcc-patches



> On Jan 12, 2023, at 9:40 AM, Segher Boessenkool  
> wrote:
> 
> On Thu, Jan 12, 2023 at 09:17:31AM -0500, Paul Koning wrote:
>>> On Jan 12, 2023, at 4:50 AM, Segher Boessenkool 
>>>  wrote:
>>> I mean general_operand accepts that sp thing you saw.  But your
>>> constraints do not?  (I don't know your target well, maybe this isn't
>>> true).  Things like this should be sorted out by reload, but you get
>>> better code (and fewer problems ;-) ) if you make code that fits
>>> earlier.  The L in LRA means "local": it "just" makes things fit, it
>>> does not emphasise optimising your code.
>> 
>> The destination is "nonimmediate_operand" which matches what the machine 
>> actually does.  It's like VAX and M68k, instruction operands in general can 
>> be registers, memory references, register indirect or memory indirect, 
>> memory at register with offset, or autoinc/autodec off any register.
>> 
>> As far as operand type is concerned, SP is certainly a valid operand for an 
>> add operation, that isn't the problem.  The problem is that it's a two 
>> operand machine: the first operand of the add instruction is both source and 
>> destination.  And LRA assigned the source register to be the destination 
>> register as required, but then after doing so it replaced the destination 
>> (an FP reference) by a different register (SP reference), violating the 
>> constraint after having satisfied it earlier.
> 
> Ah okay.  Yes, something does not verify if the instructions are valid
> before doing some replacement.  Something around ELIMINABLE_REGS it
> looks like?  Maybe the dump file says more, or maybe the dump file can
> be improved.

The Reload dump file mentions in a couple of places that it sees r5 (FP) as 
eliminable, replaced by r6 (SP).  The IRA dump file says nothing.

Yes, the ELIMINABLE_REGS macro says FP can be eliminated, which is correct.  In 
fact, if it weren't for that, it would be wrong for the register allocator to 
pick it (R5) as a general register to hold the result of that addhi3.  So the 
trouble is that the replacement is made after that register allocation, and 
given the constraint the allocator actually needs to generate an additional 
instruction, a move from SP to R5 so it can then do the two-operand add the 
constraint requires.

This feels like a bug; should I file a bug report?  Or is there something the 
target code can do to make this work?  I looked through the internals manual a 
bit, it doesn't show anything helpful.  The register elimination hooks are 
rather generic and don't give me any handle to recognize cases like this one.

paul



Re: [PATCH] testsuite: Skip intrinsics test if arm

2023-01-12 Thread Richard Earnshaw via Gcc-patches




On 19/09/2022 17:16, Torbjörn SVENSSON via Gcc-patches wrote:

In the test case, it's clearly written that intrinsics is not
implemented on arm*. A simple xfail does not help since there are
link error and that would cause an UNRESOLVED testcase rather than
XFAIL.
By chaning to dg-skip-if, the entire test case is omitted.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Replace
dg-xfail-if with gd-skip-if.


Sorry for the delay reviewing this, I missed it at the time.

My problem with your suggested solution is that if these intrinsics are 
ever added this test will not automatically pick this up as it will have 
been disabled.  I presume from the comment (and the body of the test 
that contains an #ifdef for aarch64) that this is expected to be a 
temporary issue rather than something permanent.


So IMO I think it is correct to leave this as unresolved because the 
test cannot be built due to an issue with the compiler.


R.



Co-Authored-By: Yvan ROUX  
Signed-off-by: Torbjörn SVENSSON  
---
  gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
index 92a139bc523..f933102be47 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
@@ -1,6 +1,6 @@
  /* We haven't implemented these intrinsics for arm yet.  */
-/* { dg-xfail-if "" { arm*-*-* } } */
  /* { dg-do run } */
+/* { dg-skip-if "unsupported" { arm*-*-* } } */
  /* { dg-options "-O3" } */
  
  #include 


Re: [PATCH] [PR42093] [arm] [thumb2] disable tree-dce for test

2023-01-12 Thread Richard Earnshaw via Gcc-patches




On 02/12/2022 09:26, Alexandre Oliva via Gcc-patches wrote:


CD-DCE introduces blocks to share common PHI nodes, which replaces a
backwards branch that used to prevent the thumb2 jump table shortening
that PR42093 tested for.  In order to keep on testing that the
backward branch prevents the jumptable shortening, disable tree-dce.

Regstraped on x86_64-linux-gnu, also tested with crosses to riscv64-elf
and arm-eabi.  Ok to install?

 > for  gcc/testsuite/ChangeLog

PR target/42093
* gcc.target/arm/pr42093.c: Disable tree-dce.


OK.

R.


---
  gcc/testsuite/gcc.target/arm/pr42093.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr42093.c 
b/gcc/testsuite/gcc.target/arm/pr42093.c
index 7ba2f933eef81..69b1470607c7f 100644
--- a/gcc/testsuite/gcc.target/arm/pr42093.c
+++ b/gcc/testsuite/gcc.target/arm/pr42093.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mthumb -O2 -fno-reorder-blocks" }  */
+/* { dg-options "-mthumb -O2 -fno-reorder-blocks -fno-tree-dce" }  */
  /* { dg-require-effective-target arm_thumb2_ok } */
  /* { dg-final { scan-assembler-not "tbb" } } */
  /* { dg-final { scan-assembler-not "tbh" } } */



Re: [PATCH] wwwdocs: Note that old reload is deprecated

2023-01-12 Thread Segher Boessenkool
On Thu, Jan 12, 2023 at 09:17:31AM -0500, Paul Koning wrote:
> > On Jan 12, 2023, at 4:50 AM, Segher Boessenkool 
> >  wrote:
> > I mean general_operand accepts that sp thing you saw.  But your
> > constraints do not?  (I don't know your target well, maybe this isn't
> > true).  Things like this should be sorted out by reload, but you get
> > better code (and fewer problems ;-) ) if you make code that fits
> > earlier.  The L in LRA means "local": it "just" makes things fit, it
> > does not emphasise optimising your code.
> 
> The destination is "nonimmediate_operand" which matches what the machine 
> actually does.  It's like VAX and M68k, instruction operands in general can 
> be registers, memory references, register indirect or memory indirect, memory 
> at register with offset, or autoinc/autodec off any register.
> 
> As far as operand type is concerned, SP is certainly a valid operand for an 
> add operation, that isn't the problem.  The problem is that it's a two 
> operand machine: the first operand of the add instruction is both source and 
> destination.  And LRA assigned the source register to be the destination 
> register as required, but then after doing so it replaced the destination (an 
> FP reference) by a different register (SP reference), violating the 
> constraint after having satisfied it earlier.

Ah okay.  Yes, something does not verify if the instructions are valid
before doing some replacement.  Something around ELIMINABLE_REGS it
looks like?  Maybe the dump file says more, or maybe the dump file can
be improved.

> Interesting to know what LRA stands for, I didn't know that.

First line of lra.cc:
/* LRA (local register allocator) driver and LRA utilities.

:-)


Segher


Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 01:28:59PM +0100, Jakub Jelinek wrote:
> > Although we don't AFAIK support using DW_CFA_undefined with RA signing,
> > the failure mode would be non-obvious: it would effectively toggle the
> > bit on.
> 
> We don't install unwind-dw2.h nor give user code access to the how array
> (and it just lives on the stack of __frame_state_for/uw_init_context_1
> functions and address of it is passed to functions called from it),
> so I hope all this is private to the libgcc unwinder.  After all, otherwise
> e.g. the change how "how" is represented couldn't be done.
> That said, if new enum entries are added in the generic code, then
> I think uw_update_context_1 will warn about unhandled case in a switch,
> unless we e.g. change
>   case REG_UNSAVED:
>   case REG_UNDEFINED:
> break;
> to
>   default:
>   break;
> (and provided that the new enums would want such handling).
> Another option is to just define the arch dependent value for how field
> in the arch code, right now it is unsigned char type, so using say
> (unsigned char) ~0 or (unsigned char) ~0 and (unsigned char) ~1 as arch
> specific values might be ok too.

Yet another option would be to define 1-2 extra REG_ values in the generic
unwind-dw2.h header, but name them
  REG_ARCH_SPECIFIC_1,
  REG_ARCH_SPECIFIC_2,
or so, and then the machine specific code can
#define REG_AARCH64_TOGGLE_ON REG_ARCH_SPECIFIC_1
Of course, all this depends on whether the arch specific codes can be
handled in uw_update_context_1 by doing break; there and nothing else.

Jakub



Re: [PATCH] wwwdocs: Note that old reload is deprecated

2023-01-12 Thread Paul Koning via Gcc-patches



> On Jan 12, 2023, at 4:50 AM, Segher Boessenkool  
> wrote:
> 
> On Wed, Jan 11, 2023 at 05:07:47PM -0500, Paul Koning wrote:
>>> On Jan 11, 2023, at 2:28 PM, Segher Boessenkool 
>>>  wrote:
>>> I would say your predicates are way too lenient here (general_operand),
>>> but this needs more info.  A testcase to reproduce the problem, to
>>> start with :-)
>> 
>> I'll try to trim it down.
>> 
>> What do you mean "too lenient"?  The first input operand (which is supposed 
>> to be the same as the output since the instruction set is 2-address) is 
>> "general_operand".  The destination is "nonimmediate_operand" which fits the 
>> constraints applied to it.
> 
> I mean general_operand accepts that sp thing you saw.  But your
> constraints do not?  (I don't know your target well, maybe this isn't
> true).  Things like this should be sorted out by reload, but you get
> better code (and fewer problems ;-) ) if you make code that fits
> earlier.  The L in LRA means "local": it "just" makes things fit, it
> does not emphasise optimising your code.

The destination is "nonimmediate_operand" which matches what the machine 
actually does.  It's like VAX and M68k, instruction operands in general can be 
registers, memory references, register indirect or memory indirect, memory at 
register with offset, or autoinc/autodec off any register.

As far as operand type is concerned, SP is certainly a valid operand for an add 
operation, that isn't the problem.  The problem is that it's a two operand 
machine: the first operand of the add instruction is both source and 
destination.  And LRA assigned the source register to be the destination 
register as required, but then after doing so it replaced the destination (an 
FP reference) by a different register (SP reference), violating the constraint 
after having satisfied it earlier.

Interesting to know what LRA stands for, I didn't know that.

paul



nvptx: Avoid deadlock in 'cuStreamAddCallback' callback, error case (was: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes)

2023-01-12 Thread Thomas Schwinge
Hi Chung-Lin, Tom!

It's been a while:

On 2018-09-25T21:11:58+0800, Chung-Lin Tang  wrote:
> [...] NVPTX/CUDA-specific implementation
> of the new-style goacc_asyncqueues.

In an OpenACC 'async' setting, where the device kernel (expectedly)
crashes because of "an illegal memory access was encountered", I'm
running into a deadlock here:

> --- a/libgomp/plugin/plugin-nvptx.c
> +++ b/libgomp/plugin/plugin-nvptx.c

> +static void
> +cuda_callback_wrapper (CUstream stream, CUresult res, void *ptr)
> +{
> +  if (res != CUDA_SUCCESS)
> +GOMP_PLUGIN_fatal ("%s error: %s", __FUNCTION__, cuda_error (res));
> +  struct nvptx_callback *cb = (struct nvptx_callback *) ptr;
> +  cb->fn (cb->ptr);
> +  free (ptr);
> +}
> +
> +void
> +GOMP_OFFLOAD_openacc_async_queue_callback (struct goacc_asyncqueue *aq,
> +void (*callback_fn)(void *),
> +void *userptr)
> +{
> +  struct nvptx_callback *b = GOMP_PLUGIN_malloc (sizeof (*b));
> +  b->fn = callback_fn;
> +  b->ptr = userptr;
> +  b->aq = aq;
> +  CUDA_CALL_ASSERT (cuStreamAddCallback, aq->cuda_stream,
> + cuda_callback_wrapper, (void *) b, 0);
> +}

In my case, 'cuda_callback_wrapper' (expectedly) gets invoked with
'res != CUDA_SUCCESS' ("an illegal memory access was encountered").
When we invoke 'GOMP_PLUGIN_fatal', this attempts to shut down the device
(..., which deadlocks); that's generally problematic: per

"'cuStreamAddCallback' [...] Callbacks must not make any CUDA API calls".

Given that eventually we must reach a host/device synchronization point
(latest when the device is shut down at program termination), and the
non-'CUDA_SUCCESS' will be upheld until then, it does seem safe to
replace this 'GOMP_PLUGIN_fatal' with 'GOMP_PLUGIN_error' as per the
"nvptx: Avoid deadlock in 'cuStreamAddCallback' callback, error case"
attached.  OK to push?

(Might we even skip 'GOMP_PLUGIN_error' here, understanding that the
error will be caught and reported at the next host/device synchronization
point?  But I've not verified that.)


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
>From b7ddcc0807967750e3c884326ed4c53c05cde81f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 12 Jan 2023 14:39:46 +0100
Subject: [PATCH] nvptx: Avoid deadlock in 'cuStreamAddCallback' callback,
 error case

When we invoke 'GOMP_PLUGIN_fatal', this attempts to shut down the device
(..., which may deadlock); that's generally problematic: per

"'cuStreamAddCallback' [...] Callbacks must not make any CUDA API calls".

Given that eventually we must reach a host/device synchronization point
(latest when the device is shut down at program termination), and the
non-'CUDA_SUCCESS' will be upheld until then, it does seem safe to
replace this 'GOMP_PLUGIN_fatal' with 'GOMP_PLUGIN_error'.

	libgomp/
	* plugin/plugin-nvptx.c (cuda_callback_wrapper): Invoke
	'GOMP_PLUGIN_error' instead of 'GOMP_PLUGIN_fatal'.
---
 libgomp/plugin/plugin-nvptx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 395639537e83..cdb3d435bdc8 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1927,7 +1927,7 @@ static void
 cuda_callback_wrapper (CUstream stream, CUresult res, void *ptr)
 {
   if (res != CUDA_SUCCESS)
-GOMP_PLUGIN_fatal ("%s error: %s", __FUNCTION__, cuda_error (res));
+GOMP_PLUGIN_error ("%s error: %s", __FUNCTION__, cuda_error (res));
   struct nvptx_callback *cb = (struct nvptx_callback *) ptr;
   cb->fn (cb->ptr);
   free (ptr);
-- 
2.39.0



Re: [PATCH,WWWDOCS] htdocs: rotate news

2023-01-12 Thread Jose E. Marchesi via Gcc-patches


> On Fri, 23 Dec 2022, Jose E. Marchesi via Gcc-patches wrote:
>>  htdocs/index.html | 24 
>>  htdocs/news.html  | 24 
>>  2 files changed, 24 insertions(+), 24 deletions(-)
>
> Okay, thank you.
>
> And you can consider this kind of change preapproved. Or falling under 
> our "obvious rule". Whichever you prefer. :-)

Understood, thanks.
Pushed.


Re: [PATCH,WWWDOCS] htdocs: news: GCC BPF in Compiler Explorer

2023-01-12 Thread Jose E. Marchesi via Gcc-patches


> On Fri, 23 Dec 2022, Jose E. Marchesi via Gcc-patches wrote:
>> This patch adds an entry to the News section in index.html, announcing
>> the availability of a nightly build of bpf-unknown-none-gcc.
>
> Nice!
>
>> +https://godbolt.org;>GCC BPF in Compiler 
>> Explorer
>> + [2022-12-23]
>> +Support for a nightly build of the bpf-unknown-none-gcc compiler
>> +  has been contributed to Compiler Explorer (aka godbolt.org) by Marc
>> +  Poulhiès
>
> Usually I recommend active voice, something like "Compiler Explorer (aka 
> godbolt.org) now supports nightly builds of the bpf-unknown-none-gcc 
> compiler thanks to Marc Poulhiès", but your proposal is perfectly fine, 
> too.
>
> Which means only change if you like the alternative apprach better 
> yourself; otherwise simply use the existing one.
>
> Either way: Okay, and thank you!

Committed, thanks.


Re: [PATCH v3 1/2] aarch64: fix warning emission for ABI break since GCC 9.1

2023-01-12 Thread Christophe Lyon via Gcc-patches




On 1/12/23 14:03, Richard Sandiford wrote:

Christophe Lyon  writes:

While looking at PR 105549, which is about fixing the ABI break
introduced in GCC 9.1 in parameter alignment with bit-fields, we
noticed that the GCC 9.1 warning is not emitted in all the cases where
it should be.  This patch fixes that and the next patch in the series
fixes the GCC 9.1 break.

We split this into two patches since patch #2 introduces a new ABI
break starting with GCC 13.1.  This way, patch #1 can be back-ported
to release branches if needed to fix the GCC 9.1 warning issue.

The main idea is to add a new global boolean that indicates whether
we're expanding the start of a function, so that aarch64_layout_arg
can emit warnings for callees as well as callers.  This removes the
need for aarch64_function_arg_boundary to warn (with its incomplete
information).  However, in the first patch there are still cases where
we emit warnings were we should not; this is fixed in patch #2 where
we can distinguish between GCC 9.1 and GCC.13.1 ABI breaks properly.

The fix in aarch64_function_arg_boundary (replacing & with &&) looks
like an oversight of a previous commit in this area which changed
'abi_break' from a boolean to an integer.

We also take the opportunity to fix the comment above
aarch64_function_arg_alignment since the value of the abi_break
parameter was changed in a previous commit, no longer matching the
description.

v2->v3: removed a bogus comment, added C++ tests (copied from the C
ones)

2022-11-28  Christophe Lyon  
Richard Sandiford  

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix
comment.
(aarch64_layout_arg): Factorize warning conditions.
(aarch64_function_arg_boundary): Fix typo.
* function.cc (currently_expanding_function_start): New variable.
(expand_function_start): Handle
currently_expanding_function_start.
* function.h (currently_expanding_function_start): Declare.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New
test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New
test.
* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning.h: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New
test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New
test.
* g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning.h: New test.


OK for trunk, and for backports after a while.  Thanks for doing this,
and for your patience through it all.


Great, and thanks for your help & patience too :-)

Christophe



Richard


---
  gcc/config/aarch64/aarch64.cc |  28 +++-
  gcc/function.cc   |   5 +
  gcc/function.h|   2 +
  .../bitfield-abi-warning-align16-O2-extra.C   |  86 
  .../aarch64/bitfield-abi-warning-align16-O2.C |  87 
  .../bitfield-abi-warning-align32-O2-extra.C   | 119 +
  .../aarch64/bitfield-abi-warning-align32-O2.C | 119 +
  .../aarch64/bitfield-abi-warning-align8-O2.C  |  16 +++
  .../g++.target/aarch64/bitfield-abi-warning.h | 125 ++
  .../bitfield-abi-warning-align16-O2-extra.c   |  86 
  .../aarch64/bitfield-abi-warning-align16-O2.c |  87 
  .../bitfield-abi-warning-align32-O2-extra.c   | 119 +
  .../aarch64/bitfield-abi-warning-align32-O2.c | 119 +
  .../aarch64/bitfield-abi-warning-align8-O2.c  |  16 +++
  .../gcc.target/aarch64/bitfield-abi-warning.h | 125 ++
  15 files changed, 1132 insertions(+), 7 deletions(-)
  create mode 100644 
gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
  create mode 100644 
gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
  create mode 100644 
gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
  create mode 100644 
gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
  create mode 100644 
gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
  create mode 100644 gcc/testsuite/g++.target/aarch64/bitfield-abi-warning.h
  create mode 100644 
gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c
  create mode 100644 
gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2.c
  create mode 100644 

Re: [PATCH v3 2/2] aarch64: Fix bit-field alignment in param passing [PR105549]

2023-01-12 Thread Christophe Lyon via Gcc-patches




On 1/12/23 14:19, Richard Sandiford wrote:

Christophe Lyon  writes:

While working on enabling DFP for AArch64, I noticed new failures in
gcc.dg/compat/struct-layout-1.exp (t028) which were not actually
caused by DFP types handling. These tests are generated during 'make
check' and enabling DFP made generation different (not sure if new
non-DFP tests are generated, or if existing ones are generated
differently, the tests in question are huge and difficult to compare).

Anyway, I reduced the problem to what I attach at the end of the new
gcc.target/aarch64/aapcs64/va_arg-17.c test and rewrote it in the same
scheme as other va_arg* AArch64 tests.  Richard Sandiford further
reduced this to a non-vararg function, added as a second testcase.

This is a tough case mixing bit-fields and alignment, where
aarch64_function_arg_alignment did not follow what its descriptive
comment says: we want to use the natural alignment of the bit-field
type only if the user didn't reduce the alignment for the bit-field
itself.

The patch also adds a comment and assert that would help someone who
has to look at this area again.

The fix would be very small, except that this introduces a new ABI
break, and we have to warn about that.  Since this actually fixes a
problem introduced in GCC 9.1, we keep the old computation to detect
when we now behave differently.

This patch adds two new tests (va_arg-17.c and
pr105549.c). va_arg-17.c contains the reduced offending testcase from
struct-layout-1.exp for reference.  We update some tests introduced by
the previous patch, where parameters with bit-fields and packed
attribute now emit a different warning.

v2->v3: testcase update

2022-11-28  Christophe Lyon  
Richard Sandiford  

gcc/
PR target/105549
* config/aarch64/aarch64.cc (aarch64_function_arg_alignment):
Check DECL_PACKED for bitfield.
(aarch64_layout_arg): Warn when parameter passing ABI changes.
(aarch64_function_arg_boundary): Do not warn here.
(aarch64_gimplify_va_arg_expr): Warn when parameter passing ABI
changes.

gcc/testsuite/
PR target/105549
* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: Update.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: Update.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: Update.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: Update.
* gcc.target/aarch64/aapcs64/va_arg-17.c: New test.
* gcc.target/aarch64/pr105549.c: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: Update.
* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: Update.
* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: Update.
* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: Update.
---
  gcc/config/aarch64/aarch64.cc | 148 ++
  .../bitfield-abi-warning-align16-O2-extra.C   |  64 
  .../aarch64/bitfield-abi-warning-align16-O2.C |  48 +++---
  .../bitfield-abi-warning-align32-O2-extra.C   | 131 +++-
  .../aarch64/bitfield-abi-warning-align32-O2.C | 132 
  .../gcc.target/aarch64/aapcs64/va_arg-17.c| 105 +
  .../bitfield-abi-warning-align16-O2-extra.c   |  64 
  .../aarch64/bitfield-abi-warning-align16-O2.c |  48 +++---
  .../bitfield-abi-warning-align32-O2-extra.c   | 131 +++-
  .../aarch64/bitfield-abi-warning-align32-O2.c | 132 
  gcc/testsuite/gcc.target/aarch64/pr105549.c   |  12 ++
  11 files changed, 587 insertions(+), 428 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-17.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr105549.c
[...]
@@ -68,14 +62,14 @@
  /* { dg-note {parameter passing for argument of type 'S4' changed in GCC 9.1} 
"" { target *-*-* } 103 } f4_stdarg */
  /* { dg-note {parameter passing for argument of type 'S8' changed in GCC 9.1} 
"" { target *-*-* } 104 } f8_stdarg */
  
-/* Parameter passing for these should not have changed in GCC 9.1 (PR 105549).

+/* FIXME Parameter passing for these should not have changed in GCC 9.1 (PR 
105549).
 Fortunately we warn. Note the discrepancy with lines 120-124 below: we warn
 in the callee, but not in the caller.  */


Looks like a stray change.  Same for the C test.


Ha yes, thanks for catching this!



OK otherwise, thanks.

Richard


Re: [PATCH 9/8] middle-end: Allow build_popcount_expr to use an IFN

2023-01-12 Thread Richard Biener via Gcc-patches
On Thu, Dec 22, 2022 at 6:44 PM Andrew Carlotti via Gcc-patches
 wrote:
>
> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
> x86_64-pc-linux-gnu - ok to merge?

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> * tree-ssa-loop-niter.cc (build_popcount_expr): Add IFN support.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/tree-ssa/pr86544.C: Add .POPCOUNT to tree scan regex.
> * gcc.dg/tree-ssa/popcount.c: Likewise.
> * gcc.dg/tree-ssa/popcount2.c: Likewise.
> * gcc.dg/tree-ssa/popcount3.c: Likewise.
> * gcc.target/aarch64/popcount4.c: Likewise.
> * gcc.target/i386/pr95771.c: Likewise, and...
> * gcc.target/i386/pr95771-2.c: ...split int128 test from above,
> since this would emit just a single IFN if a TI optab is added.
>
> ---
>
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr86544.C 
> b/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
> index 
> ef438916a8019320564f444ace08e2f4b4190684..50befb36bac75de1cfa282e38358278b3288bd1c
>  100644
> --- a/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
> +++ b/gcc/testsuite/g++.dg/tree-ssa/pr86544.C
> @@ -12,5 +12,5 @@ int PopCount (long b) {
>  return c;
>  }
>
> -/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } 
> */
> +/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 1 
> "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "if" 0 "phiopt4" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
> index 
> b4694109411a4631697463519acbe7d9df65bf6e..efd906a0f5447f0beb3752eded3756999b02e6e6
>  100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
> @@ -39,4 +39,4 @@ void PopCount3 (long b1) {
>}
>  }
>
> -/* { dg-final { scan-tree-dump-times "__builtin_popcount" 3 "optimized" } } 
> */
> +/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 3 
> "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
> index 
> ef73e345573de721833e98e89c252640a55f7c60..ae38a329bd4d868a762300d3218d68864c0fc4be
>  100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
> @@ -26,4 +26,4 @@ int main()
>return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } 
> */
> +/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 1 
> "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
> index 
> ef438916a8019320564f444ace08e2f4b4190684..50befb36bac75de1cfa282e38358278b3288bd1c
>  100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
> @@ -12,5 +12,5 @@ int PopCount (long b) {
>  return c;
>  }
>
> -/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } 
> */
> +/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 1 
> "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "if" 0 "phiopt4" } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/popcount4.c 
> b/gcc/testsuite/gcc.target/aarch64/popcount4.c
> index 
> ee55b2e335223053ca024e95b7a13aa4af32550e..8aa15ff018d4b5fc6bb59e52af20d5c33cea2ee0
>  100644
> --- a/gcc/testsuite/gcc.target/aarch64/popcount4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/popcount4.c
> @@ -11,4 +11,4 @@ int PopCount (long b) {
>  return c;
>  }
>
> -/* { dg-final { scan-tree-dump-times "__builtin_popcount" 0 "optimized" } } 
> */
> +/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 0 
> "optimized" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr95771-2.c 
> b/gcc/testsuite/gcc.target/i386/pr95771-2.c
> new file mode 100644
> index 
> ..1db9dc94d0b66477667624012221d6844c141a26
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr95771-2.c
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/95771 */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2 -mpopcnt -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump " = __builtin_popcount| = \\.POPCOUNT" 
> "optimized" } } */
> +
> +int
> +corge (unsigned __int128 x)
> +{
> +  int i = 0;
> +  while (x)
> +{
> +  x &= x - 1;
> +  ++i;
> +}
> +  return i;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr95771.c 
> b/gcc/testsuite/gcc.target/i386/pr95771.c
> index 
> d7b67017800b705b9854f561916c20901ea76803..d41be445f4a68613a082b8956fea3ceaf33d7e0f
>  100644
> --- a/gcc/testsuite/gcc.target/i386/pr95771.c
> +++ b/gcc/testsuite/gcc.target/i386/pr95771.c
> @@ -1,8 +1,7 @@
>  /* PR tree-optimization/95771 */
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mpopcnt -fdump-tree-optimized" } */
> -/* { dg-final { scan-tree-dump-times " = __builtin_popcount" 6 "optimized" { 
> target int128 } } } */
> 

Re: [PATCH 5/8 v2] middle-end: Add cltz_complement idiom recognition

2023-01-12 Thread Richard Biener via Gcc-patches
On Thu, Dec 22, 2022 at 6:42 PM Andrew Carlotti  wrote:
>
> On Thu, Nov 24, 2022 at 11:41:31AM +0100, Richard Biener wrote:
> > Note we do have CTZ and CLZ
> > optabs and internal functions - in case there's a HImode CLZ this
> > could be elided.  More general you can avoid using the __builtin_
> > functions with their fixed types in favor of using IFN_C[TL]Z which
> > are type agnostic (but require optab support - you should be able
> > to check this via direct_internal_fn_supported_p).
>
> IFN support added. I've also renamed the defined_at_zero parameter to
> define_at_zero, since this is a request for the expression to define it,
> rather than a guarantee that it is already defined.
>
> New patch below, bootstrapped and regression tested on
> aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu - ok to merge?

OK, and sorry for the delay.

Richard.

> ---
>
> This recognises patterns of the form:
> while (n) { n >>= 1 }
>
> This patch results in improved (but still suboptimal) codegen:
>
> foo (unsigned int b) {
> int c = 0;
>
> while (b) {
> b >>= 1;
> c++;
> }
>
> return c;
> }
>
> foo:
> .LFB11:
> .cfi_startproc
> cbz w0, .L3
> clz w1, w0
> tst x0, 1
> mov w0, 32
> sub w0, w0, w1
> cselw0, w0, wzr, ne
> ret
>
> The conditional is unnecessary. phiopt could recognise a redundant csel
> (using cond_removal_in_builtin_zero_pattern) when one of the inputs is a
> clz call, but it cannot recognise the redunancy when the input is (e.g.)
> (32 - clz).
>
> I could perhaps extend this function to recognise this pattern in a later
> patch, if this is a good place to recognise more patterns.
>
> gcc/ChangeLog:
>
> PR tree-optimization/94793
> * tree-scalar-evolution.cc (expression_expensive_p): Add checks
> for c[lt]z optabs.
> * tree-ssa-loop-niter.cc (build_cltz_expr): New.
> (number_of_iterations_cltz_complement): New.
> (number_of_iterations_bitcount): Add call to the above.
>
> gcc/testsuite/ChangeLog:
>
> * lib/target-supports.exp (check_effective_target_clz)
> (check_effective_target_clzl, check_effective_target_clzll)
> (check_effective_target_ctz, check_effective_target_clzl)
> (check_effective_target_ctzll): New.
> * gcc.dg/tree-ssa/cltz-complement-max.c: New test.
> * gcc.dg/tree-ssa/clz-complement-char.c: New test.
> * gcc.dg/tree-ssa/clz-complement-int.c: New test.
> * gcc.dg/tree-ssa/clz-complement-long-long.c: New test.
> * gcc.dg/tree-ssa/clz-complement-long.c: New test.
> * gcc.dg/tree-ssa/ctz-complement-char.c: New test.
> * gcc.dg/tree-ssa/ctz-complement-int.c: New test.
> * gcc.dg/tree-ssa/ctz-complement-long-long.c: New test.
> * gcc.dg/tree-ssa/ctz-complement-long.c: New test.
>
> ---
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cltz-complement-max.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/cltz-complement-max.c
> new file mode 100644
> index 
> ..1a29ca52e42e50822e4e3213b2cb008b766d0318
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/cltz-complement-max.c
> @@ -0,0 +1,60 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-loop-optimize -fdump-tree-optimized" } */
> +
> +#define PREC (__CHAR_BIT__)
> +
> +int clz_complement_count1 (unsigned char b) {
> +int c = 0;
> +
> +while (b) {
> +   b >>= 1;
> +   c++;
> +}
> +if (c <= PREC)
> +  return 0;
> +else
> +  return 34567;
> +}
> +
> +int clz_complement_count2 (unsigned char b) {
> +int c = 0;
> +
> +while (b) {
> +   b >>= 1;
> +   c++;
> +}
> +if (c <= PREC - 1)
> +  return 0;
> +else
> +  return 76543;
> +}
> +
> +int ctz_complement_count1 (unsigned char b) {
> +int c = 0;
> +
> +while (b) {
> +   b <<= 1;
> +   c++;
> +}
> +if (c <= PREC)
> +  return 0;
> +else
> +  return 23456;
> +}
> +
> +int ctz_complement_count2 (unsigned char b) {
> +int c = 0;
> +
> +while (b) {
> +   b <<= 1;
> +   c++;
> +}
> +if (c <= PREC - 1)
> +  return 0;
> +else
> +  return 65432;
> +}
> +/* { dg-final { scan-tree-dump-times "34567" 0 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "76543" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "23456" 0 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "65432" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/clz-complement-char.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/clz-complement-char.c
> new file mode 100644
> index 
> ..2ebe8fabcaf0ce88f3a6a46e9ba4ba79b7d3672e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/clz-complement-char.c
> @@ -0,0 +1,31 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target clz } */
> +/* { dg-options "-O2 

Re: [PATCH v3 2/2] aarch64: Fix bit-field alignment in param passing [PR105549]

2023-01-12 Thread Richard Sandiford via Gcc-patches
Christophe Lyon  writes:
> While working on enabling DFP for AArch64, I noticed new failures in
> gcc.dg/compat/struct-layout-1.exp (t028) which were not actually
> caused by DFP types handling. These tests are generated during 'make
> check' and enabling DFP made generation different (not sure if new
> non-DFP tests are generated, or if existing ones are generated
> differently, the tests in question are huge and difficult to compare).
>
> Anyway, I reduced the problem to what I attach at the end of the new
> gcc.target/aarch64/aapcs64/va_arg-17.c test and rewrote it in the same
> scheme as other va_arg* AArch64 tests.  Richard Sandiford further
> reduced this to a non-vararg function, added as a second testcase.
>
> This is a tough case mixing bit-fields and alignment, where
> aarch64_function_arg_alignment did not follow what its descriptive
> comment says: we want to use the natural alignment of the bit-field
> type only if the user didn't reduce the alignment for the bit-field
> itself.
>
> The patch also adds a comment and assert that would help someone who
> has to look at this area again.
>
> The fix would be very small, except that this introduces a new ABI
> break, and we have to warn about that.  Since this actually fixes a
> problem introduced in GCC 9.1, we keep the old computation to detect
> when we now behave differently.
>
> This patch adds two new tests (va_arg-17.c and
> pr105549.c). va_arg-17.c contains the reduced offending testcase from
> struct-layout-1.exp for reference.  We update some tests introduced by
> the previous patch, where parameters with bit-fields and packed
> attribute now emit a different warning.
>
> v2->v3: testcase update
>
> 2022-11-28  Christophe Lyon  
>   Richard Sandiford  
>
>   gcc/
>   PR target/105549
>   * config/aarch64/aarch64.cc (aarch64_function_arg_alignment):
>   Check DECL_PACKED for bitfield.
>   (aarch64_layout_arg): Warn when parameter passing ABI changes.
>   (aarch64_function_arg_boundary): Do not warn here.
>   (aarch64_gimplify_va_arg_expr): Warn when parameter passing ABI
>   changes.
>
>   gcc/testsuite/
>   PR target/105549
>   * gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: Update.
>   * gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: Update.
>   * gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: Update.
>   * gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: Update.
>   * gcc.target/aarch64/aapcs64/va_arg-17.c: New test.
>   * gcc.target/aarch64/pr105549.c: New test.
>   * g++.target/aarch64/bitfield-abi-warning-align16-O2.C: Update.
>   * g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: Update.
>   * g++.target/aarch64/bitfield-abi-warning-align32-O2.C: Update.
>   * g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: Update.
> ---
>  gcc/config/aarch64/aarch64.cc | 148 ++
>  .../bitfield-abi-warning-align16-O2-extra.C   |  64 
>  .../aarch64/bitfield-abi-warning-align16-O2.C |  48 +++---
>  .../bitfield-abi-warning-align32-O2-extra.C   | 131 +++-
>  .../aarch64/bitfield-abi-warning-align32-O2.C | 132 
>  .../gcc.target/aarch64/aapcs64/va_arg-17.c| 105 +
>  .../bitfield-abi-warning-align16-O2-extra.c   |  64 
>  .../aarch64/bitfield-abi-warning-align16-O2.c |  48 +++---
>  .../bitfield-abi-warning-align32-O2-extra.c   | 131 +++-
>  .../aarch64/bitfield-abi-warning-align32-O2.c | 132 
>  gcc/testsuite/gcc.target/aarch64/pr105549.c   |  12 ++
>  11 files changed, 587 insertions(+), 428 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-17.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr105549.c
> [...]
> @@ -68,14 +62,14 @@
>  /* { dg-note {parameter passing for argument of type 'S4' changed in GCC 
> 9.1} "" { target *-*-* } 103 } f4_stdarg */
>  /* { dg-note {parameter passing for argument of type 'S8' changed in GCC 
> 9.1} "" { target *-*-* } 104 } f8_stdarg */
>  
> -/* Parameter passing for these should not have changed in GCC 9.1 (PR 
> 105549).
> +/* FIXME Parameter passing for these should not have changed in GCC 9.1 (PR 
> 105549).
> Fortunately we warn. Note the discrepancy with lines 120-124 below: we 
> warn
> in the callee, but not in the caller.  */

Looks like a stray change.  Same for the C test.

OK otherwise, thanks.

Richard


Re: [RFC] Introduce -finline-memset-loops

2023-01-12 Thread Richard Biener via Gcc-patches
On Tue, Dec 27, 2022 at 5:12 AM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> try_store_by_multiple_pieces was added not long ago, enabling
> variable-sized memset to be expanded inline when the worst-case
> in-range constant length would, using conditional blocks with powers
> of two to cover all possibilities of length and alignment.
>
> This patch extends the memset expansion to start with a loop, so as to
> still take advantage of known alignment even with long lengths, but
> without necessarily adding store blocks for every power of two.
>
> This makes it possible for any memset call to be expanded, even if
> storing a single byte per iteration.  Surely efficient implementations
> of memset can do better, with a pre-loop to increase alignment, but
> that would likely be excessive for inline expansions of memset.
>
> Still, in some cases, users prefer to inline memset, even if it's not
> as performant, or when it's known to be performant in ways the
> compiler can't tell, to avoid depending on a C runtime library.
>
> With this flag, global or per-function optimizations may enable inline
> expansion of memset to be selectively requested, while the
> infrastructure for that may enable us to introduce per-target tuning
> to enable such looping even advantageous, even if not explicitly
> requested.
>
>
> I realize this is late for new features in this cycle; I`d be happy to
> submit it again later, but I wonder whether there's any interest in this
> feature, or any objections to it.  FWIW, I've regstrapped this on
> x86_64-linux-gnu, and also tested earlier versions of this patch in
> earlier GCC branches with RISC-v crosses.  Is this ok for GCC 14?  Maybe
> even simple enough for GCC 13, considering it's disabled by default?

I wonder if that isn't better handled by targets via the setmem pattern,
like x86 has the stringop inline strathegy.  What is considered acceptable
in terms of size or performance will vary and I don't think there's much
room for improvements on this generic code support?

Richard.

> TIA,
>
>
> for  gcc/ChangeLog
>
> * builtins.cc (try_store_by_multiple_pieces): Support starting
> with a loop.
> * common.opt (finline-memset-loops): New.
> * doc/invoke.texi (-finline-memset-loops): Add.
>
> for  gcc/testsuite/ChangeLog
>
> * gcc.dg/torture/inline-mem-set-1.c: New.
> ---
>  gcc/builtins.cc |   50 
> ++-
>  gcc/common.opt  |4 ++
>  gcc/doc/invoke.texi |   13 ++
>  gcc/testsuite/gcc.dg/torture/inline-mem-set-1.c |   14 ++
>  4 files changed, 77 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/inline-mem-set-1.c
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 02c4fefa86f48..388bae58ce49e 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -4361,9 +4361,37 @@ try_store_by_multiple_pieces (rtx to, rtx len, 
> unsigned int ctz_len,
>if (max_bits >= 0)
>  xlenest += ((HOST_WIDE_INT_1U << max_bits) * 2
> - (HOST_WIDE_INT_1U << ctz_len));
> +  bool max_loop = false;
>if (!can_store_by_pieces (xlenest, builtin_memset_read_str,
> , align, true))
> -return false;
> +{
> +  if (!flag_inline_memset_loops)
> +   return false;
> +  while (--max_bits >= sctz_len)
> +   {
> + xlenest = ((HOST_WIDE_INT_1U << max_bits) * 2
> +- (HOST_WIDE_INT_1U << ctz_len));
> + if (can_store_by_pieces (xlenest + blksize,
> +  builtin_memset_read_str,
> +  , align, true))
> +   {
> + max_loop = true;
> + break;
> +   }
> + if (!blksize)
> +   continue;
> + if (can_store_by_pieces (xlenest,
> +  builtin_memset_read_str,
> +  , align, true))
> +   {
> + blksize = 0;
> + max_loop = true;
> + break;
> +   }
> +   }
> +  if (!max_loop)
> +   return false;
> +}
>
>by_pieces_constfn constfun;
>void *constfundata;
> @@ -4405,6 +4433,7 @@ try_store_by_multiple_pieces (rtx to, rtx len, unsigned 
> int ctz_len,
>   the least significant bit possibly set in the length.  */
>for (int i = max_bits; i >= sctz_len; i--)
>  {
> +  rtx_code_label *loop_label = NULL;
>rtx_code_label *label = NULL;
>blksize = HOST_WIDE_INT_1U << i;
>
> @@ -4423,14 +4452,24 @@ try_store_by_multiple_pieces (rtx to, rtx len, 
> unsigned int ctz_len,
>else if ((max_len & blksize) == 0)
> continue;
>
> +  if (max_loop && i == max_bits)
> +   {
> + loop_label = gen_label_rtx ();
> + emit_label (loop_label);
> + /* Since we may run this multiple times, don't assume we
> +know 

Re: [PATCH] PR tree-optimization/92342: Optimize b & -(a==c) in match.pd

2023-01-12 Thread Richard Biener via Gcc-patches
On Tue, Jan 3, 2023 at 2:17 PM Roger Sayle  wrote:
>
>
> This patch is an update/tweak of Andrew Pinski's two patches for
> PR tree-optimization/92342, that were originally posted back in November:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585111.html
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585112.html
>
> Technically, the first of those was approved by Richard Biener, though
> never committed, and my first thought was to simply push it for Andrew,
> but the review of the second piece expressed concerns over comparisons
> in non-integral modes, where the result may not be zero-one valued.
> Indeed both transformations misbehave in the presence of vector mode
> comparisons (these transformations are already implemented for
> vec_cond elsewhere in match.pd), so my minor contribution is to limit
> these new transformations to scalars, by testing that both the operands
> and results are INTEGRAL_TYPE_P.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?

OK.

Thanks,
Richard.

>
> 2023-01-03  Andrew Pinski  
> Roger Sayle  
>
> gcc/ChangeLog:
> PR tree-optimization/92342
> * match.pd ((m1 CMP m2) * d -> (m1 CMP m2) ? d : 0):
> Use tcc_comparison and :c for the multiply.
> (b & -(a CMP c) -> (a CMP c)?b:0): New pattern.
>
> gcc/testsuite/ChangeLog:
> PR tree-optimization/92342
> * gcc.dg/tree-ssa/andnegcmp-1.c: New test.
> * gcc.dg/tree-ssa/andnegcmp-2.c: New test.
> * gcc.dg/tree-ssa/multcmp-1.c: New test.
> * gcc.dg/tree-ssa/multcmp-1.c: New test.
>
>
> Thanks in advance (and thanks to Andrew).
> Roger
> --
>


Re: [PATCH v3 1/2] aarch64: fix warning emission for ABI break since GCC 9.1

2023-01-12 Thread Richard Sandiford via Gcc-patches
Christophe Lyon  writes:
> While looking at PR 105549, which is about fixing the ABI break
> introduced in GCC 9.1 in parameter alignment with bit-fields, we
> noticed that the GCC 9.1 warning is not emitted in all the cases where
> it should be.  This patch fixes that and the next patch in the series
> fixes the GCC 9.1 break.
>
> We split this into two patches since patch #2 introduces a new ABI
> break starting with GCC 13.1.  This way, patch #1 can be back-ported
> to release branches if needed to fix the GCC 9.1 warning issue.
>
> The main idea is to add a new global boolean that indicates whether
> we're expanding the start of a function, so that aarch64_layout_arg
> can emit warnings for callees as well as callers.  This removes the
> need for aarch64_function_arg_boundary to warn (with its incomplete
> information).  However, in the first patch there are still cases where
> we emit warnings were we should not; this is fixed in patch #2 where
> we can distinguish between GCC 9.1 and GCC.13.1 ABI breaks properly.
>
> The fix in aarch64_function_arg_boundary (replacing & with &&) looks
> like an oversight of a previous commit in this area which changed
> 'abi_break' from a boolean to an integer.
>
> We also take the opportunity to fix the comment above
> aarch64_function_arg_alignment since the value of the abi_break
> parameter was changed in a previous commit, no longer matching the
> description.
>
> v2->v3: removed a bogus comment, added C++ tests (copied from the C
> ones)
>
> 2022-11-28  Christophe Lyon  
>   Richard Sandiford  
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix
>   comment.
>   (aarch64_layout_arg): Factorize warning conditions.
>   (aarch64_function_arg_boundary): Fix typo.
>   * function.cc (currently_expanding_function_start): New variable.
>   (expand_function_start): Handle
>   currently_expanding_function_start.
>   * function.h (currently_expanding_function_start): Declare.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
>   * gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New
>   test.
>   * gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test.
>   * gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New
>   test.
>   * gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
>   * gcc.target/aarch64/bitfield-abi-warning.h: New test.
>   * g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test.
>   * g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New
>   test.
>   * g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test.
>   * g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New
>   test.
>   * g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test.
>   * g++.target/aarch64/bitfield-abi-warning.h: New test.

OK for trunk, and for backports after a while.  Thanks for doing this,
and for your patience through it all.

Richard

> ---
>  gcc/config/aarch64/aarch64.cc |  28 +++-
>  gcc/function.cc   |   5 +
>  gcc/function.h|   2 +
>  .../bitfield-abi-warning-align16-O2-extra.C   |  86 
>  .../aarch64/bitfield-abi-warning-align16-O2.C |  87 
>  .../bitfield-abi-warning-align32-O2-extra.C   | 119 +
>  .../aarch64/bitfield-abi-warning-align32-O2.C | 119 +
>  .../aarch64/bitfield-abi-warning-align8-O2.C  |  16 +++
>  .../g++.target/aarch64/bitfield-abi-warning.h | 125 ++
>  .../bitfield-abi-warning-align16-O2-extra.c   |  86 
>  .../aarch64/bitfield-abi-warning-align16-O2.c |  87 
>  .../bitfield-abi-warning-align32-O2-extra.c   | 119 +
>  .../aarch64/bitfield-abi-warning-align32-O2.c | 119 +
>  .../aarch64/bitfield-abi-warning-align8-O2.c  |  16 +++
>  .../gcc.target/aarch64/bitfield-abi-warning.h | 125 ++
>  15 files changed, 1132 insertions(+), 7 deletions(-)
>  create mode 100644 
> gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
>  create mode 100644 
> gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
>  create mode 100644 
> gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
>  create mode 100644 
> gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
>  create mode 100644 
> gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
>  create mode 100644 gcc/testsuite/g++.target/aarch64/bitfield-abi-warning.h
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2.c
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c
> 

[PATCH] tree-optimization/99412 - reassoc and reduction chains

2023-01-12 Thread Richard Biener via Gcc-patches
With -ffast-math we end up associating reduction chains and break
them - this is because of old code that tries to rectify reductions
into a shape likened by the vectorizer.  Nowadays the rank compute
produces correct association for reduction chains and the vectorizer
has robust support to fall back to a regular reductions (via
reduction path) when it turns out to be not a proper reduction chain.

So this patch removes the special code in reassoc which makes
the TSVC s352 vectorized with -Ofast (it is already without
-ffast-math).

Bootstrapped and tested on x86_64-unknown-linux-gnu, I'm going to
push this with the option to revert if it turns out my vectorizer
expectation is wrong.

PR tree-optimization/99412
* tree-ssa-reassoc.cc (is_phi_for_stmt): Remove.
(swap_ops_for_binary_stmt): Remove reduction handling.
(rewrite_expr_tree_parallel): Adjust.
(reassociate_bb): Likewise.
* tree-parloops.cc (build_new_reduction): Handle MINUS_EXPR.

* gcc.dg/vect/pr99412.c: New testcase.
* gcc.dg/tree-ssa/reassoc-47.c: Adjust comment.
* gcc.dg/tree-ssa/reassoc-48.c: Remove.
---
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c |  9 
 gcc/testsuite/gcc.dg/vect/pr99412.c| 24 +
 gcc/tree-parloops.cc   |  3 ++
 gcc/tree-ssa-reassoc.cc| 62 +++---
 5 files changed, 35 insertions(+), 67 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr99412.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c 
b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
index 1b0f0fdabe1..cd2cc740f6d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
@@ -4,6 +4,6 @@
 #define MODIFY
 #include "reassoc-46.h"
 
-/* Check that if the loop accumulator is saved into a global variable, it's
-   still added last.  */
+/* Check that if the loop accumulator is modified using a chain of operations
+   other than addition, its new value is still added last.  */
 /* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = 
(?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ 
(?:vect_)?_[\d._]+)} 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c 
b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
deleted file mode 100644
index 13836ebe8e6..000
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
+++ /dev/null
@@ -1,9 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
-
-#define STORE
-#include "reassoc-46.h"
-
-/* Check that if the loop accumulator is modified using a chain of operations
-   other than addition, its new value is still added last.  */
-/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = 
(?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ 
(?:vect_)?_[\d._]+)} 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr99412.c 
b/gcc/testsuite/gcc.dg/vect/pr99412.c
new file mode 100644
index 000..e3e94a052ca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr99412.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Ofast --param vect-epilogues-nomask=0" } */
+/* { dg-require-effective-target vect_float } */
+
+/* From TSVC s352.  */
+
+typedef float real_t;
+
+#define LEN_1D 32000
+#define LEN_2D 256
+
+real_t a[LEN_1D],b[LEN_1D];
+real_t foo ()
+{
+  real_t dot = (real_t)0.;
+  for (int i = 0; i < LEN_1D; i += 5) {
+  dot = dot + a[i] * b[i] + a[i + 1] * b[i + 1] + a[i + 2]
+ * b[i + 2] + a[i + 3] * b[i + 3] + a[i + 4] * b[i + 4];
+  }
+
+  return dot;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc
index 4f92c4be7d3..dfb75c369d6 100644
--- a/gcc/tree-parloops.cc
+++ b/gcc/tree-parloops.cc
@@ -3228,6 +3228,9 @@ build_new_reduction (reduction_info_table_type 
*reduction_list,
   /* Check for OpenMP supported reduction.  */
   switch (reduction_code)
 {
+case MINUS_EXPR:
+  reduction_code = PLUS_EXPR;
+  /* Fallthru.  */
 case PLUS_EXPR:
 case MULT_EXPR:
 case MAX_EXPR:
diff --git a/gcc/tree-ssa-reassoc.cc b/gcc/tree-ssa-reassoc.cc
index 580ec0ee0eb..5522a3ada8e 100644
--- a/gcc/tree-ssa-reassoc.cc
+++ b/gcc/tree-ssa-reassoc.cc
@@ -5117,35 +5117,6 @@ maybe_optimize_range_tests (gimple *stmt)
   return cfg_cleanup_needed;
 }
 
-/* Return true if OPERAND is defined by a PHI node which uses the LHS
-   of STMT in it's operands.  This is also known as a "destructive
-   update" operation.  */
-
-static bool
-is_phi_for_stmt (gimple *stmt, tree operand)
-{
-  gimple *def_stmt;
-  gphi *def_phi;
-  tree lhs;
-  use_operand_p arg_p;
-  ssa_op_iter i;
-
-  if (TREE_CODE (operand) != SSA_NAME)
-return false;
-
-  lhs = gimple_assign_lhs (stmt);
-
-  def_stmt = 

Re: [PATCH 2/2] xtensa: Optimize ctzsi2 and ffssi2 a bit

2023-01-12 Thread Max Filippov via Gcc-patches
On Wed, Jan 11, 2023 at 8:26 PM Takayuki 'January June' Suwa
 wrote:
>
> This patch saves one byte when the Code Density Option is enabled,
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (ctzsi2, ffssi2):
> Rearrange the emitting codes.
> ---
>  gcc/config/xtensa/xtensa.md | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


Re: [PATCH 1/2] xtensa: Tune "*btrue" insn pattern

2023-01-12 Thread Max Filippov via Gcc-patches
On Wed, Jan 11, 2023 at 8:26 PM Takayuki 'January June' Suwa
 wrote:
>
> This branch instruction has short encoding if EQ/NE comparison against
> immediate zero when the Code Density Option is enabled, but its "length"
> attribute was only for normal encoding.  This patch fixes it.
>
> This patch also prevents undesireable replacement the comparison immediate
> zero of the instruction (short encoding, as mentioned above) with a
> register that has value of zero (normal encoding) by the postreload pass.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (*btrue):
> Correct value of the attribute "length" that depends on
> TARGET_DENSITY and operands, and add '?' character to the register
> constraint of the compared operand.
> ---
>  gcc/config/xtensa/xtensa.md | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 12:05:45PM +, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > On Wed, Jan 11, 2023 at 11:59:27AM +, Wilco Dijkstra wrote:
> >> Hi,
> >> 
> >> > On 1/10/23 19:12, Jakub Jelinek via Gcc-patches wrote:
> >> >> Anyway, the sooner this makes it into gcc trunk, the better, it breaks 
> >> >> quite
> >> >> a lot of stuff.
> >> >
> >> > Yep, please, we're also waiting for this patch for pushing to our gcc13 
> >> > package.
> >> 
> >> Well I'm waiting for an OK from a maintainer... I believe Jakub can 
> >> approve it as well.
> >
> > Yes, I can, but am not sure it is appropriate.  While I'm familiar with the
> > unwinder, I'm not familiar with the pointer signing, and AArch64 has active
> > maintainers, so I'd prefer to defer the review to them.
> 
> I think my main question is: how clean vs hacky is it to use
> REG_UNDEFINED as effectively a toggle bit, rather than using
> REG_UNDEFINED for its intended purpose?
> 
> In the review for earlier (May) patch, I'd asked whether it would
> make sense to add a new enum.  Would that be OK from a target-independent
> point of view?  E.g. maybe REG_TOGGLE_ON.
> 
> Although we don't AFAIK support using DW_CFA_undefined with RA signing,
> the failure mode would be non-obvious: it would effectively toggle the
> bit on.

We don't install unwind-dw2.h nor give user code access to the how array
(and it just lives on the stack of __frame_state_for/uw_init_context_1
functions and address of it is passed to functions called from it),
so I hope all this is private to the libgcc unwinder.  After all, otherwise
e.g. the change how "how" is represented couldn't be done.
That said, if new enum entries are added in the generic code, then
I think uw_update_context_1 will warn about unhandled case in a switch,
unless we e.g. change
  case REG_UNSAVED:
  case REG_UNDEFINED:
break;
to
  default:
break;
(and provided that the new enums would want such handling).
Another option is to just define the arch dependent value for how field
in the arch code, right now it is unsigned char type, so using say
(unsigned char) ~0 or (unsigned char) ~0 and (unsigned char) ~1 as arch
specific values might be ok too.

> It would be good to remove the definition of RA_SIGNED_BIT as well,
> so that people don't accidentally use it in future.

Jakub



Re: [Patch] Fortran/OpenMP: Reject non-scalar 'holds' expr in 'omp assume(s)' [PR107706] (was: [PR107424])

2023-01-12 Thread Tobias Burnus

First, I messed up the PR number – it should be PR107706.

On 12.01.23 11:39, Jakub Jelinek wrote:

On Thu, Jan 12, 2023 at 11:22:40AM +0100, Tobias Burnus wrote:

Rather obvious fix for that ICE.

Comments? If there are none, I will commit it later as obvious.

I think the spec should be clarified, unlike clauses like if, novariants,
nocontext, indirect, final clause operands where we specify the argument
to be expression of logical type and glossary term says that OpenMP logical
expression [...] But for the holds clause, all we say is that holds clause
isn't inarguable and [...] that the listed expression evaluates to true in
the assumption scope. [...]
so I think making it clear that holds argument is expression of logical type
would be useful.


Actually, the spec does have (internally) hold-expr = "OpenMP logical
expression" in a JSON file but that does not show up in the generated
PDF. I have now filed an OpenMP spec issue for it (#3453).


That said, the patch is ok, a rank > 1 expression can't be considered to
evaluate to true...


Thanks! Committed as r13-5118-g2ce55247a8bf32985a96ed63a7a92d36746723dc
(with the fixed PR number).

Thanks.

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Richard Sandiford via Gcc-patches
Richard Sandiford via Gcc-patches  writes:
> Jakub Jelinek  writes:
>> On Wed, Jan 11, 2023 at 11:59:27AM +, Wilco Dijkstra wrote:
>>> Hi,
>>> 
>>> > On 1/10/23 19:12, Jakub Jelinek via Gcc-patches wrote:
>>> >> Anyway, the sooner this makes it into gcc trunk, the better, it breaks 
>>> >> quite
>>> >> a lot of stuff.
>>> >
>>> > Yep, please, we're also waiting for this patch for pushing to our gcc13 
>>> > package.
>>> 
>>> Well I'm waiting for an OK from a maintainer... I believe Jakub can approve 
>>> it as well.
>>
>> Yes, I can, but am not sure it is appropriate.  While I'm familiar with the
>> unwinder, I'm not familiar with the pointer signing, and AArch64 has active
>> maintainers, so I'd prefer to defer the review to them.
>
> I think my main question is: how clean vs hacky is it to use
> REG_UNDEFINED as effectively a toggle bit, rather than using
> REG_UNDEFINED for its intended purpose?
>
> In the review for earlier (May) patch, I'd asked whether it would
> make sense to add a new enum.  Would that be OK from a target-independent
> point of view?  E.g. maybe REG_TOGGLE_ON.
>
> Although we don't AFAIK support using DW_CFA_undefined with RA signing,
> the failure mode would be non-obvious: it would effectively toggle the
> bit on.
>
> It would be good to remove the definition of RA_SIGNED_BIT as well,
> so that people don't accidentally use it in future.

Sorry, just realised that the patch does do that.  But please remove
the comment too!

Richard


Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Wed, Jan 11, 2023 at 11:59:27AM +, Wilco Dijkstra wrote:
>> Hi,
>> 
>> > On 1/10/23 19:12, Jakub Jelinek via Gcc-patches wrote:
>> >> Anyway, the sooner this makes it into gcc trunk, the better, it breaks 
>> >> quite
>> >> a lot of stuff.
>> >
>> > Yep, please, we're also waiting for this patch for pushing to our gcc13 
>> > package.
>> 
>> Well I'm waiting for an OK from a maintainer... I believe Jakub can approve 
>> it as well.
>
> Yes, I can, but am not sure it is appropriate.  While I'm familiar with the
> unwinder, I'm not familiar with the pointer signing, and AArch64 has active
> maintainers, so I'd prefer to defer the review to them.

I think my main question is: how clean vs hacky is it to use
REG_UNDEFINED as effectively a toggle bit, rather than using
REG_UNDEFINED for its intended purpose?

In the review for earlier (May) patch, I'd asked whether it would
make sense to add a new enum.  Would that be OK from a target-independent
point of view?  E.g. maybe REG_TOGGLE_ON.

Although we don't AFAIK support using DW_CFA_undefined with RA signing,
the failure mode would be non-obvious: it would effectively toggle the
bit on.

It would be good to remove the definition of RA_SIGNED_BIT as well,
so that people don't accidentally use it in future.

Thanks,
Richard


Re: [PATCH][_GLIBCXX_INLINE_VERSION] Adapt to_chars/from_chars symbols

2023-01-12 Thread Jonathan Wakely via Gcc-patches
On Mon, 28 Nov 2022 at 20:44, François Dumont wrote:
>
> On 28/11/22 19:35, Jonathan Wakely wrote:
>
>
>
> On Mon, 28 Nov 2022 at 06:07, François Dumont via Libstdc++ 
>  wrote:
>>
>> This patch is fixing those tests:
>>
>> 20_util/to_chars/float128_c++23.cc
>> std/format/formatter/requirements.cc
>> std/format/functions/format.cc
>> std/format/functions/format_to_n.cc
>> std/format/functions/size.cc
>> std/format/functions/vformat_to.cc
>> std/format/string.cc
>>
>> Note that symbols used in  for __ibm128 and __iee128 are untested.
>
>
> We don't need to do this for those symbols, the ALT128 config is incompatible 
> with versioned namespace. If you're using the versioned namespace, you don't 
> need backwards compatibility with the old long double ABI.
>
>
>
>
> Here is the simplified patch then.
>
> libstdc++: [_GLIBCXX_INLINE_VERSION] Add to_chars/from_chars symbols 
> export
>
> libstdc++-v3/ChangeLog
>
> * include/std/format [_GLIBCXX_INLINE_VERSION](to_chars): Adapt 
> __asm symbol
> specifications.
> * config/abi/pre/gnu-versioned-namespace.ver: Add 
> to_chars/from_chars symbols
> export.
>
> Ok to commit ?

I still don't understand why the linker script change is needed, but I
can investigate that myself later.

OK for trunk, thanks.



Re: [PATCH] PR 107189 Remove useless _Alloc_node

2023-01-12 Thread Jonathan Wakely via Gcc-patches
On Thu, 12 Jan 2023 at 12:01, Jonathan Wakely wrote:
>
> OK for trunk, sorry for not getting to it sooner.

P.S. I do think this one would have been OK to commit as obvious,
since it's just removing an unused variable and so has no effect on
anything.


>
> On Wed, 4 Jan 2023 at 21:13, François Dumont via Libstdc++
>  wrote:
> >
> > Still no chance to review ?
> >
> > On 14/11/22 18:56, François Dumont wrote:
> > > Gentle reminder.
> > >
> > > Sorry if I should have committed it as trivial but I cannot do it
> > > anymore now that I asked :-)
> > >
> > >
> > > On 12/10/22 22:18, François Dumont wrote:
> > >> libstdc++: Remove _Alloc_node instance in _Rb_tree [PR107189]
> > >>
> > >> libstdc++-v3/ChangeLog:
> > >>
> > >> PR libstdc++/107189
> > >> * include/bits/stl_tree.h
> > >> (_Rb_tree<>::_M_insert_range_equal): Remove
> > >> unused _Alloc_node instance.
> > >>
> > >> Ok to commit ?
> > >>
> > >> François
> > >
> > >



Re: [PATCH] PR 107189 Remove useless _Alloc_node

2023-01-12 Thread Jonathan Wakely via Gcc-patches
OK for trunk, sorry for not getting to it sooner.

On Wed, 4 Jan 2023 at 21:13, François Dumont via Libstdc++
 wrote:
>
> Still no chance to review ?
>
> On 14/11/22 18:56, François Dumont wrote:
> > Gentle reminder.
> >
> > Sorry if I should have committed it as trivial but I cannot do it
> > anymore now that I asked :-)
> >
> >
> > On 12/10/22 22:18, François Dumont wrote:
> >> libstdc++: Remove _Alloc_node instance in _Rb_tree [PR107189]
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >> PR libstdc++/107189
> >> * include/bits/stl_tree.h
> >> (_Rb_tree<>::_M_insert_range_equal): Remove
> >> unused _Alloc_node instance.
> >>
> >> Ok to commit ?
> >>
> >> François
> >
> >



Re: libstdc++: Fix deadlock in debug iterator increment [PR108288]

2023-01-12 Thread Jonathan Wakely via Gcc-patches
On Thu, 12 Jan 2023 at 05:52, François Dumont wrote:
>
> Small update for an obvious compilation issue and to review new test
> case that could have lead to an infinite loop if the increment issue was
> not detected.
>
> I also forgot to ask if there is more chance for the instantiation to be
> elided when it is implemented like in the _Safe_local_iterator:
> return { __cur, this->_M_sequence };

No, that doesn't make any difference.

>
> than in the _Safe_iterator:
> return _Safe_iterator(__cur, this->_M_sequence);
>
> In the case where the user code do not use it ?
>
> Fully tested now, ok to commit ?
>
> François
>
> On 11/01/23 07:03, François Dumont wrote:
> > Thanks for fixing this.
> >
> > Here is the extension of the fix to all post-increment/decrement
> > operators we have on _GLIBCXX_DEBUG iterator.

Thanks, I completely forgot we have other partial specializations, I
just fixed the one that showed a deadlock in the user's example!

> > I prefer to restore somehow previous implementation to continue to
> > have _GLIBCXX_DEBUG post operators implemented in terms of normal post
> > operators.

Why?

Implementing post-increment as:

auto tmp = *this;
++*this;
return tmp;

is the idiomatic way to write it, and it works fine in this case. I
don't think it performs any more work than your version, does it?
Why not use the idiomatic form?

Is it just so that post-inc of a debug iterator uses post-inc of the
underlying iterator? Why does that matter?



Re: [RFA] choosing __platform_wait_t on targets without lock-free 64 atomics

2023-01-12 Thread Jonathan Wakely via Gcc-patches
On Thu, 12 Jan 2023 at 01:27, Thomas Rodgers wrote:
>
> I agree with this change.

Thanks, pushed to trunk.

>
> On Thu, Jan 5, 2023 at 4:22 PM Jonathan Wakely  wrote:
>>
>> How about this?
>>
>> I don't think we should worry about targets without atomic int, so don't
>> bother using types smaller than int.
>>
>>
>> -- >8 --
>>
>> For non-futex targets the __platform_wait_t type is currently uint64_t,
>> but that requires a lock in libatomic for some 32-bit targets. We don't
>> really need a 64-bit type, so use unsigned long if that is lock-free,
>> and int otherwise. This should mean it's lock-free on a wider set of
>> targets.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/bits/atomic_wait.h (__detail::__platform_wait_t):
>> Define as unsigned long if always lock-free, and unsigned int
>> otherwise.
>> ---
>>  libstdc++-v3/include/bits/atomic_wait.h | 6 +-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
>> b/libstdc++-v3/include/bits/atomic_wait.h
>> index bd1ed56d157..46f39f10cbc 100644
>> --- a/libstdc++-v3/include/bits/atomic_wait.h
>> +++ b/libstdc++-v3/include/bits/atomic_wait.h
>> @@ -64,7 +64,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>  // and __platform_notify() if there is a more efficient primitive supported
>>  // by the platform (e.g. __ulock_wait()/__ulock_wake()) which is better than
>>  // a mutex/condvar based wait.
>> -using __platform_wait_t = uint64_t;
>> +# if  ATOMIC_LONG_LOCK_FREE == 2
>> +using __platform_wait_t = unsigned long;
>> +# else
>> +using __platform_wait_t = unsigned int;
>> +# endif
>>  inline constexpr size_t __platform_wait_alignment
>>= __alignof__(__platform_wait_t);
>>  #endif
>> --
>> 2.39.0
>>



Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 11, 2023 at 11:59:27AM +, Wilco Dijkstra wrote:
> Hi,
> 
> > On 1/10/23 19:12, Jakub Jelinek via Gcc-patches wrote:
> >> Anyway, the sooner this makes it into gcc trunk, the better, it breaks 
> >> quite
> >> a lot of stuff.
> >
> > Yep, please, we're also waiting for this patch for pushing to our gcc13 
> > package.
> 
> Well I'm waiting for an OK from a maintainer... I believe Jakub can approve 
> it as well.

Yes, I can, but am not sure it is appropriate.  While I'm familiar with the
unwinder, I'm not familiar with the pointer signing, and AArch64 has active
maintainers, so I'd prefer to defer the review to them.

Jakub



Re: [Patch] Fortran/OpenMP: Reject non-scalar 'holds' expr in 'omp assume(s)' [PR107424]

2023-01-12 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 12, 2023 at 11:22:40AM +0100, Tobias Burnus wrote:
> Rather obvious fix for that ICE.
> 
> Comments? If there are none, I will commit it later as obvious.

I think the spec should be clarified, unlike clauses like if, novariants,
nocontext, indirect, final clause operands where we specify the argument
to be expression of logical type and glossary term says that OpenMP logical
expression is scalar expression for C/C++ and scalar logical expression
for Fortran.  But for the holds clause, all we say is that holds clause
isn't inarguable and
"the program guarantees that the listed expression evaluates to true in
the assumption scope. The effect of the clause does not include an
observable evaluation of the expression."
so I think making it clear that holds argument is expression of logical type
would be useful.

That said, the patch is ok, a rank > 1 expression can't be considered to
evaluate to true...

Jakub



[Patch] Fortran/OpenMP: Reject non-scalar 'holds' expr in 'omp assume(s)' [PR107424]

2023-01-12 Thread Tobias Burnus

Rather obvious fix for that ICE.

Comments? If there are none, I will commit it later as obvious.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran/OpenMP: Reject non-scalar 'holds' expr in 'omp assume(s)' [PR107424]

gcc/fortran/ChangeLog:

	PR fortran/107424
	* openmp.cc (gfc_resolve_omp_assumptions): Reject nonscalars.

gcc/testsuite/ChangeLog:

	PR fortran/107424
	* gfortran.dg/gomp/assume-2.f90: Update dg-error.
	* gfortran.dg/gomp/assumes-2.f90: Likewise.
	* gfortran.dg/gomp/assume-5.f90: New test.

 gcc/fortran/openmp.cc|  8 +---
 gcc/testsuite/gfortran.dg/gomp/assume-2.f90  |  2 +-
 gcc/testsuite/gfortran.dg/gomp/assume-5.f90  | 20 
 gcc/testsuite/gfortran.dg/gomp/assumes-2.f90 |  2 +-
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index b71ee467c01..916daeb1aa5 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -6911,9 +6911,11 @@ void
 gfc_resolve_omp_assumptions (gfc_omp_assumptions *assume)
 {
   for (gfc_expr_list *el = assume->holds; el; el = el->next)
-if (!gfc_resolve_expr (el->expr) || el->expr->ts.type != BT_LOGICAL)
-	gfc_error ("HOLDS expression at %L must be a logical expression",
-		   >expr->where);
+if (!gfc_resolve_expr (el->expr)
+	|| el->expr->ts.type != BT_LOGICAL
+	|| el->expr->rank != 0)
+  gfc_error ("HOLDS expression at %L must be a scalar logical expression",
+		 >expr->where);
 }
 
 
diff --git a/gcc/testsuite/gfortran.dg/gomp/assume-2.f90 b/gcc/testsuite/gfortran.dg/gomp/assume-2.f90
index ca3e04dfe95..dc306a9088a 100644
--- a/gcc/testsuite/gfortran.dg/gomp/assume-2.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/assume-2.f90
@@ -22,6 +22,6 @@ subroutine foo (i, a)
   end if
 !  !$omp end assume  - silence: 'Unexpected !$OMP END ASSUME statement'
 
-  !$omp assume holds (1.0)  ! { dg-error "HOLDS expression at .1. must be a logical expression" }
+  !$omp assume holds (1.0)  ! { dg-error "HOLDS expression at .1. must be a scalar logical expression" }
   !$omp end assume
 end
diff --git a/gcc/testsuite/gfortran.dg/gomp/assume-5.f90 b/gcc/testsuite/gfortran.dg/gomp/assume-5.f90
new file mode 100644
index 000..5c6c00750dd
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/assume-5.f90
@@ -0,0 +1,20 @@
+! PR fortran/107424
+!
+! Contributed by G. Steinmetz
+!
+
+integer function f(i)
+   implicit none
+   !$omp assumes holds(i < g())  ! { dg-error "HOLDS expression at .1. must be a scalar logical expression" }
+   integer, value :: i
+
+   !$omp assume holds(i < g())  ! { dg-error "HOLDS expression at .1. must be a scalar logical expression" }
+   block
+   end block
+   f = 3
+contains
+   function g()
+  integer :: g(2)
+  g = 4
+   end
+end
diff --git a/gcc/testsuite/gfortran.dg/gomp/assumes-2.f90 b/gcc/testsuite/gfortran.dg/gomp/assumes-2.f90
index 729c9737a1c..c8719a86a94 100644
--- a/gcc/testsuite/gfortran.dg/gomp/assumes-2.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/assumes-2.f90
@@ -4,7 +4,7 @@ module m
   !$omp assumes contains(target) holds(x > 0.0)
   !$omp assumes absent(target)
   !$omp assumes holds(0.0)
-! { dg-error "HOLDS expression at .1. must be a logical expression" "" { target *-*-* } .-1 }
+! { dg-error "HOLDS expression at .1. must be a scalar logical expression" "" { target *-*-* } .-1 }
 end module
 
 module m2


Re: [PATCH] wwwdocs: Note that old reload is deprecated

2023-01-12 Thread Segher Boessenkool
On Wed, Jan 11, 2023 at 05:07:47PM -0500, Paul Koning wrote:
> > On Jan 11, 2023, at 2:28 PM, Segher Boessenkool 
> >  wrote:
> > I would say your predicates are way too lenient here (general_operand),
> > but this needs more info.  A testcase to reproduce the problem, to
> > start with :-)
> 
> I'll try to trim it down.
> 
> What do you mean "too lenient"?  The first input operand (which is supposed 
> to be the same as the output since the instruction set is 2-address) is 
> "general_operand".  The destination is "nonimmediate_operand" which fits the 
> constraints applied to it.

I mean general_operand accepts that sp thing you saw.  But your
constraints do not?  (I don't know your target well, maybe this isn't
true).  Things like this should be sorted out by reload, but you get
better code (and fewer problems ;-) ) if you make code that fits
earlier.  The L in LRA means "local": it "just" makes things fit, it
does not emphasise optimising your code.


Segher


Re: [PATCH, nvptx, 2/2] Reimplement libgomp barriers for nvptx: bar.red instruction support in GCC

2023-01-12 Thread Thomas Schwinge
Hi Chung-Lin!

On 2022-09-21T15:45:54+0800, Chung-Lin Tang via Gcc-patches 
 wrote:
> [...] The attached patch adds bar.red instructions to the nvptx port [...]

I see GCC report:

[...]
build/genrecog [...]/source-gcc/gcc/common.md 
[...]/source-gcc/gcc/config/nvptx/nvptx.md \
  insn-conditions.md > tmp-recog.cc
[...]/source-gcc/gcc/config/nvptx/nvptx.md:2297:1: warning: source missing 
a mode?
[...]/source-gcc/gcc/config/nvptx/nvptx.md:2297:1: warning: source missing 
a mode?
[...]/source-gcc/gcc/config/nvptx/nvptx.md:2297:1: warning: source missing 
a mode?
Statistics for recog:
[...]

2297 (define_insn "nvptx_barred_"
2298   [(set (match_operand: 0 "nvptx_register_operand" "=R")
2299 (unspec_volatile
2300   [(match_operand:SI 1 "nvptx_nonmemory_operand" "Ri")
2301(match_operand:SI 2 "nvptx_nonmemory_operand" "Ri")
2302(match_operand:SI 3 "const_int_operand" "i")
2303(match_operand:BI 4 "nvptx_register_operand" "R")]
2304   BARRED))]
2305   ""
2306   "\\tbar.red.. \\t%0, %1, %2, %p3%4;";"
2307   [(set_attr "predicable" "no")])


Grüße
 Thomas


> gcc/ChangeLog:
>
>   * config/nvptx/nvptx.cc (nvptx_print_operand): Add 'p'
>   case, adjust comments.
>   (enum nvptx_builtins): Add NVPTX_BUILTIN_BAR_RED_AND,
>   NVPTX_BUILTIN_BAR_RED_OR, and NVPTX_BUILTIN_BAR_RED_POPC.
>   (nvptx_expand_bar_red): New function.
>   (nvptx_init_builtins):
>   Add DEFs of __builtin_nvptx_bar_red_[and/or/popc].
>   (nvptx_expand_builtin): Use nvptx_expand_bar_red to expand
>   NVPTX_BUILTIN_BAR_RED_[AND/OR/POPC] cases.
>
>   * config/nvptx/nvptx.md (define_c_enum "unspecv"): Add
>   UNSPECV_BARRED_AND, UNSPECV_BARRED_OR, and UNSPECV_BARRED_POPC.
>   (BARRED): New int iterator.
>   (barred_op,barred_mode,barred_ptxtype): New int attrs.
>   (nvptx_barred_): New define_insn.
> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
> index 49cc681..afc3a890 100644
> --- a/gcc/config/nvptx/nvptx.cc
> +++ b/gcc/config/nvptx/nvptx.cc
> @@ -2879,6 +2879,7 @@ nvptx_mem_maybe_shared_p (const_rtx x)
> t -- print a type opcode suffix, promoting QImode to 32 bits
> T -- print a type size in bits
> u -- print a type opcode suffix without promotions.
> +   p -- print a '!' for constant 0.
> x -- print a destination operand that may also be a bit bucket.  */
>
>  static void
> @@ -3012,6 +3013,11 @@ nvptx_print_operand (FILE *file, rtx x, int code)
>fprintf (file, "@!");
>goto common;
>
> +case 'p':
> +  if (INTVAL (x) == 0)
> + fprintf (file, "!");
> +  break;
> +
>  case 'c':
>mode = GET_MODE (XEXP (x, 0));
>switch (x_code)
> @@ -6151,9 +6157,90 @@ enum nvptx_builtins
>NVPTX_BUILTIN_CMP_SWAPLL,
>NVPTX_BUILTIN_MEMBAR_GL,
>NVPTX_BUILTIN_MEMBAR_CTA,
> +  NVPTX_BUILTIN_BAR_RED_AND,
> +  NVPTX_BUILTIN_BAR_RED_OR,
> +  NVPTX_BUILTIN_BAR_RED_POPC,
>NVPTX_BUILTIN_MAX
>  };
>
> +/* Expander for 'bar.red' instruction builtins.  */
> +
> +static rtx
> +nvptx_expand_bar_red (tree exp, rtx target,
> +   machine_mode ARG_UNUSED (m), int ARG_UNUSED (ignore))
> +{
> +  int code = DECL_MD_FUNCTION_CODE (TREE_OPERAND (CALL_EXPR_FN (exp), 0));
> +  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
> +
> +  if (!target)
> +target = gen_reg_rtx (mode);
> +
> +  rtx pred, dst;
> +  rtx bar = expand_expr (CALL_EXPR_ARG (exp, 0),
> +  NULL_RTX, SImode, EXPAND_NORMAL);
> +  rtx nthr = expand_expr (CALL_EXPR_ARG (exp, 1),
> +   NULL_RTX, SImode, EXPAND_NORMAL);
> +  rtx cpl = expand_expr (CALL_EXPR_ARG (exp, 2),
> +  NULL_RTX, SImode, EXPAND_NORMAL);
> +  rtx redop = expand_expr (CALL_EXPR_ARG (exp, 3),
> +NULL_RTX, SImode, EXPAND_NORMAL);
> +  if (CONST_INT_P (bar))
> +{
> +  if (INTVAL (bar) < 0 || INTVAL (bar) > 15)
> + {
> +   error_at (EXPR_LOCATION (exp),
> + "barrier value must be within [0,15]");
> +   return const0_rtx;
> + }
> +}
> +  else if (!REG_P (bar))
> +bar = copy_to_mode_reg (SImode, bar);
> +
> +  if (!CONST_INT_P (nthr) && !REG_P (nthr))
> +nthr = copy_to_mode_reg (SImode, nthr);
> +
> +  if (!CONST_INT_P (cpl))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "complement argument must be constant");
> +  return const0_rtx;
> +}
> +
> +  pred = gen_reg_rtx (BImode);
> +  if (!REG_P (redop))
> +redop = copy_to_mode_reg (SImode, redop);
> +  emit_insn (gen_rtx_SET (pred, gen_rtx_NE (BImode, redop, GEN_INT (0;
> +  redop = pred;
> +
> +  rtx pat;
> +  switch (code)
> +{
> +case NVPTX_BUILTIN_BAR_RED_AND:
> +  dst = gen_reg_rtx (BImode);
> +  pat = gen_nvptx_barred_and (dst, bar, nthr, cpl, redop);
> +  break;
> +case NVPTX_BUILTIN_BAR_RED_OR:
> +