Re: [v4][PATCH 1/2] Handle component_ref to a structure/union field including C99 FAM [PR101832]

2023-03-09 Thread Richard Biener via Gcc-patches
On Thu, 9 Mar 2023, Qing Zhao wrote:

> 
> 
> > On Mar 9, 2023, at 7:20 AM, Richard Biener  wrote:
> > 
> > On Fri, 24 Feb 2023, Qing Zhao wrote:
> > 
> >> GCC extension accepts the case when a struct with a C99 flexible array 
> >> member
> >> is embedded into another struct or union (possibly recursively).
> >> __builtin_object_size should treat such struct as flexible size.
> >> 
> >> gcc/c/ChangeLog:
> >> 
> >>PR tree-optimization/101832
> >>* c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for
> >>struct/union type.
> > 
> > I can't really comment on the correctness of this part but since
> > only the C frontend will ever set this and you are using it from
> > addr_object_size which is also used for other C family languages
> > (at least), I wonder if you can really test
> > 
> > +   if (!TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (v)))
> > 
> > there.
> 
> You mean for C++ and also other C family languages (other than C), the above 
> bit cannot be set?
> Yes, that’s true. The bit is only set for C. So is the bit 
> DECL_NOT_FLEXARRAY, which is only set for C too. 
> So, I am wondering whether the bit DECL_NOT_FLEXARRAY should be also set in 
> middle end? Or we can set DECL_NOT_FLEXARRAY in C++ FE too? And then we can 
> set TYPE_INCLUDE_FLEXARRAY also in C++ FE?
> What’s your suggestion?
> 
> (I personally feel that DECL_NOT_FLEXARRAY and TYPE_INCLUDE_FLEXARRAY should 
> be set in the same places).

I was wondering whether the above test errs on the conservative side
correctly: for all languages but C, will it now cut off something where
it didn't before?

> > 
> > Originally I was suggesting to set this flag in stor-layout.cc
> > which eventually all languages funnel their types through and
> > if there's language specific handling use a langhook (with the
> > default implementation preserving the status quo).
> 
> If we decide to set the bits in stor-layout.cc, where is the best place to do 
> it? I checked the stor-layout.cc code, and it looks like “layout_type” might be the 
> place where we can set these bits for RECORD_TYPE, UNION_TYPE? 

Yes, it would be layout_type.

> > 
> > Some more comments below ...
> > 
> >> gcc/cp/ChangeLog:
> >> 
> >>PR tree-optimization/101832
> >>* module.cc (trees_out::core_bools): Stream out new bit
> >>type_include_flexarray.
> >>(trees_in::core_bools): Stream in new bit type_include_flexarray.
> >> 
> >> gcc/ChangeLog:
> >> 
> >>PR tree-optimization/101832
> >>* print-tree.cc (print_node): Print new bit type_include_flexarray.
> >>* tree-core.h (struct tree_type_common): New bit
> >>type_include_flexarray.
> >>* tree-object-size.cc (addr_object_size): Handle structure/union type
> >>when it has flexible size.
> >>* tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream
> >>in new bit type_include_flexarray.
> >>* tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream
> >>out new bit type_include_flexarray.
> >>* tree.h (TYPE_INCLUDE_FLEXARRAY): New macro
> >>TYPE_INCLUDE_FLEXARRAY.
> >> 
> >> gcc/testsuite/ChangeLog:
> >> 
> >>PR tree-optimization/101832
> >>* gcc.dg/builtin-object-size-pr101832.c: New test.
> >> ---
> >> gcc/c/c-decl.cc   |  12 ++
> >> gcc/cp/module.cc  |   2 +
> >> gcc/print-tree.cc |   5 +
> >> .../gcc.dg/builtin-object-size-pr101832.c | 134 ++
> >> gcc/tree-core.h   |   4 +-
> >> gcc/tree-object-size.cc   |  79 +++
> >> gcc/tree-streamer-in.cc   |   1 +
> >> gcc/tree-streamer-out.cc  |   1 +
> >> gcc/tree.h|   6 +
> >> 9 files changed, 215 insertions(+), 29 deletions(-)
> >> create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
> >> 
> >> diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
> >> index 08078eadeb8..f589a2f5192 100644
> >> --- a/gcc/c/c-decl.cc
> >> +++ b/gcc/c/c-decl.cc
> >> @@ -9284,6 +9284,18 @@ finish_struct (location_t loc, tree t, tree 
> >> fieldlist, tree attributes,
> >>   /* Set DECL_NOT_FLEXARRAY flag for FIELD_DECL x.  */
> >>   DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, 
> >> x);
> >> 
> >> +  /* Set TYPE_INCLUDE_FLEXARRAY for the context of x, t
> >> +   * when x is an array.  */
> >> +  if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
> >> +  TYPE_INCLUDE_FLEXARRAY (t) = flexible_array_member_type_p (TREE_TYPE 
> >> (x)) ;
> >> +  /* Recursively set TYPE_INCLUDE_FLEXARRAY for the context of x, t
> >> +   when x is the last field.  */
> >> +  else if ((TREE_CODE (TREE_TYPE (x)) == RECORD_TYPE
> >> +  || TREE_CODE (TREE_TYPE (x)) == UNION_TYPE)
> >> + && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x))
> >> + && is_last_field)
> >> +  TYPE_INCLUDE_FLEXARRAY (t) = true;
> >> +
> >>   if (DECL_NAME (x)

Re: [PATCH]middle-end: don't form FMAs when multiplication is not single use. [PR108583]

2023-03-09 Thread Richard Biener via Gcc-patches
On Fri, 10 Mar 2023, Hongtao Liu wrote:

> On Fri, Mar 10, 2023 at 3:37 AM Tamar Christina via Gcc-patches
>  wrote:
> >
> > Hi All,
> >
> > The testcase
> >
> > typedef unsigned int vec __attribute__((vector_size(32)));
> > vec
> > f3 (vec a, vec b, vec c)
> > {
> >   vec d = a * b;
> >   return d + ((c + d) >> 1);
> > }
> >
> > shows a case where we don't want to form an FMA due to the MUL not being 
> > single
> > use.  In this case to form an FMA we have to redo the MUL as well as we no
> > longer have it to share.
> >
> > As such making an FMA here would be a de-optimization.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR target/108583
> > * tree-ssa-math-opts.cc (convert_mult_to_fma): Inhibit FMA in case 
> > not
> > single use.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/108583
> > * gcc.dg/mla_1.c: New test.
> >
> > Co-Authored-By: Richard Sandiford 
> >
> > --- inline copy of patch --
> > diff --git a/gcc/testsuite/gcc.dg/mla_1.c b/gcc/testsuite/gcc.dg/mla_1.c
> > new file mode 100644
> > index 
> > ..a92ecf248116d89b1bc4207a907ea5ed95728a28
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/mla_1.c
> > @@ -0,0 +1,40 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target vect_int } */
> > +/* { dg-options "-O2 -msve-vector-bits=256 -march=armv8.2-a+sve 
> > -fdump-tree-optimized" } */
> > +
> > +unsigned int
> > +f1 (unsigned int a, unsigned int b, unsigned int c) {
> > +  unsigned int d = a * b;
> > +  return d + ((c + d) >> 1);
> > +}
> > +
> > +unsigned int
> > +g1 (unsigned int a, unsigned int b, unsigned int c) {
> > +  return a * b + c;
> > +}
> > +
> > +__Uint32x4_t
> > +f2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
> > +  __Uint32x4_t d = a * b;
> > +  return d + ((c + d) >> 1);
> > +}
> > +
> > +__Uint32x4_t
> > +g2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
> > +  return a * b + c;
> > +}
> > +
> > +typedef unsigned int vec __attribute__((vector_size(32))); vec
> > +f3 (vec a, vec b, vec c)
> > +{
> > +  vec d = a * b;
> > +  return d + ((c + d) >> 1);
> > +}
> > +
> > +vec
> > +g3 (vec a, vec b, vec c)
> > +{
> > +  return a * b + c;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times {\.FMA } 1 "optimized" { target 
> > aarch64*-*-* } } } */
> > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > index 
> > 5ab5b944a573ad24ce8427aff24fc5215bf05dac..26ed91d58fa4709a67c903ad446d267a3113c172
> >  100644
> > --- a/gcc/tree-ssa-math-opts.cc
> > +++ b/gcc/tree-ssa-math-opts.cc
> > @@ -3346,6 +3346,20 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, 
> > tree op2,
> > param_avoid_fma_max_bits));
> >bool defer = check_defer;
> >bool seen_negate_p = false;
> > +
> > +  /* There is no numerical difference between fused and unfused integer 
> > FMAs,
> > + and the assumption below that FMA is as cheap as addition is unlikely
> > + to be true, especially if the multiplication occurs multiple times on
> > + the same chain.  E.g., for something like:
> > +
> > +(((a * b) + c) >> 1) + (a * b)
> > +
> > + we do not want to duplicate the a * b into two additions, not least
> > + because the result is not a natural FMA chain.  */
> > +  if (ANY_INTEGRAL_TYPE_P (type)
> > +  && !has_single_use (mul_result))
> What about floating point?

I think for a case like above, thus

 ((a * b) + c) + (a * b)

it's profitable to handle this as

  fma (a, b, fma (a, b, c))

as this saves one add and has one op less latency?  For the case
where the second use is not part of the dependence chain it's
less obvious, but since FMA is usually not (very much more) expensive
than an add, erring on the optimization side didn't look wrong
(IIRC the FMA forming analysis isn't "global", i.e. it doesn't count
untransformed mults left in the end).

Richard.

> > +return false;
> > +
> >/* Make sure that the multiplication statement becomes dead after
> >   the transformation, thus that all uses are transformed to FMAs.
> >   This means we assume that an FMA operation has the same cost
> >
> >
> >
> >
> > --
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH]middle-end: don't form FMAs when multiplication is not single use. [PR108583]

2023-03-09 Thread Richard Biener via Gcc-patches
On Thu, 9 Mar 2023, Tamar Christina wrote:

> Hi All,
> 
> The testcase
> 
> typedef unsigned int vec __attribute__((vector_size(32)));
> vec
> f3 (vec a, vec b, vec c)
> {
>   vec d = a * b;
>   return d + ((c + d) >> 1);
> }
> 
> shows a case where we don't want to form an FMA due to the MUL not being 
> single
> use.  In this case to form an FMA we have to redo the MUL as well as we no
> longer have it to share.
> 
> As such making an FMA here would be a de-optimization.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.

Thanks,
Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR target/108583
>   * tree-ssa-math-opts.cc (convert_mult_to_fma): Inhibit FMA in case not
>   single use.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/108583
>   * gcc.dg/mla_1.c: New test.
> 
> Co-Authored-By: Richard Sandiford 
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gcc.dg/mla_1.c b/gcc/testsuite/gcc.dg/mla_1.c
> new file mode 100644
> index 
> ..a92ecf248116d89b1bc4207a907ea5ed95728a28
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/mla_1.c
> @@ -0,0 +1,40 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-options "-O2 -msve-vector-bits=256 -march=armv8.2-a+sve 
> -fdump-tree-optimized" } */
> +
> +unsigned int
> +f1 (unsigned int a, unsigned int b, unsigned int c) {
> +  unsigned int d = a * b;
> +  return d + ((c + d) >> 1);
> +}
> +
> +unsigned int
> +g1 (unsigned int a, unsigned int b, unsigned int c) {
> +  return a * b + c;
> +}
> +
> +__Uint32x4_t
> +f2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
> +  __Uint32x4_t d = a * b;
> +  return d + ((c + d) >> 1);
> +}
> +
> +__Uint32x4_t
> +g2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
> +  return a * b + c;
> +}
> +
> +typedef unsigned int vec __attribute__((vector_size(32))); vec
> +f3 (vec a, vec b, vec c)
> +{
> +  vec d = a * b;
> +  return d + ((c + d) >> 1);
> +}
> +
> +vec
> +g3 (vec a, vec b, vec c)
> +{
> +  return a * b + c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times {\.FMA } 1 "optimized" { target 
> aarch64*-*-* } } } */
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index 
> 5ab5b944a573ad24ce8427aff24fc5215bf05dac..26ed91d58fa4709a67c903ad446d267a3113c172
>  100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -3346,6 +3346,20 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree 
> op2,
>   param_avoid_fma_max_bits));
>bool defer = check_defer;
>bool seen_negate_p = false;
> +
> +  /* There is no numerical difference between fused and unfused integer FMAs,
> + and the assumption below that FMA is as cheap as addition is unlikely
> + to be true, especially if the multiplication occurs multiple times on
> + the same chain.  E.g., for something like:
> +
> +  (((a * b) + c) >> 1) + (a * b)
> +
> + we do not want to duplicate the a * b into two additions, not least
> + because the result is not a natural FMA chain.  */
> +  if (ANY_INTEGRAL_TYPE_P (type)
> +  && !has_single_use (mul_result))
> +return false;
> +
>/* Make sure that the multiplication statement becomes dead after
>   the transformation, thus that all uses are transformed to FMAs.
>   This means we assume that an FMA operation has the same cost
> 
> 
> 
> 
> 



[PATCH] Fix PR 108874: aarch64 code regression with shift and ands

2023-03-09 Thread Andrew Pinski via Gcc-patches
After r6-2044-g98e30e515f184b, code like "((x & 0xff00ff00U) >> 8)"
is optimized to "(x >> 8) & 0xff00ffU", which is normally better.
On aarch64, however, the shift right could be combined with another
operation in some cases. So we need to add a few define_splits
to the aarch64 backend that match "((x >> shift) & CST0) OP Y"
and split it into:
TMP = X & CST1
(TMP >> shift) OP Y

Note this also gets us back to matching rev16, so I added a
testcase to make sure we don't lose that matching again.
Note that when the generic patch to recognize those as bswap ROT 16
lands, we might regress again and need to add a few more patterns to
the aarch64 backend, but we will deal with that once that happens.

OK? Bootstrapped and tested on aarch64 with no regressions.

gcc/ChangeLog:

* config/aarch64/aarch64.md: Add a new define_split
to help combine.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/rev16_2.c: New test.
* gcc.target/aarch64/shift_and_operator-1.c: New test.
---
 gcc/config/aarch64/aarch64.md | 21 ++
 gcc/testsuite/gcc.target/aarch64/rev16_2.c| 39 +++
 .../gcc.target/aarch64/shift_and_operator-1.c | 22 +++
 3 files changed, 82 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rev16_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shift_and_operator-1.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index af9087508ac..41cc563f10c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4656,6 +4656,27 @@ (define_insn "*_3"
   [(set_attr "type" "logic_shift_imm")]
 )
 
+(define_split
+  [(set (match_operand:GPI 0 "register_operand")
+   (LOGICAL_OR_PLUS:GPI
+ (and:GPI
+   (lshiftrt:GPI (match_operand:GPI 1 "register_operand")
+ (match_operand:QI 2 "aarch64_shift_imm_"))
+   (match_operand:GPI 3 "aarch64_logical_immediate"))
+ (match_operand:GPI 4 "register_operand")))]
+  "can_create_pseudo_p ()
+   && aarch64_bitmask_imm (UINTVAL (operands[3]) << UINTVAL (operands[2]), 
mode)"
+  [(set (match_dup 5) (and:GPI (match_dup 1) (match_dup 6)))
+   (set (match_dup 0) (match_dup 7))]
+  {
+operands[5] = gen_reg_rtx (mode);
+operands[6] = gen_int_mode (UINTVAL (operands[3]) << UINTVAL 
(operands[2]), mode);
+rtx shift = gen_rtx_LSHIFTRT (mode, operands[5], operands[2]);
+rtx_code new_code = ;
+operands[7] = gen_rtx_fmt_ee (new_code, mode, shift, operands[4]);
+  }
+)
+
 (define_split
   [(set (match_operand:GPI 0 "register_operand")
(LOGICAL_OR_PLUS:GPI
diff --git a/gcc/testsuite/gcc.target/aarch64/rev16_2.c 
b/gcc/testsuite/gcc.target/aarch64/rev16_2.c
new file mode 100644
index 000..621eb5dfbf0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/rev16_2.c
@@ -0,0 +1,39 @@
+/* { dg-options "-O2" } */
+/* { dg-do compile } */
+
+extern void abort (void);
+
+typedef unsigned int __u32;
+
+__u32
+__rev16_32_alt (__u32 x)
+{
+  return (((__u32)(x) & (__u32)0xff00ff00UL) >> 8)
+ | (((__u32)(x) & (__u32)0x00ff00ffUL) << 8);
+}
+
+__u32
+__rev16_32 (__u32 x)
+{
+  return (((__u32)(x) & (__u32)0x00ff00ffUL) << 8)
+ | (((__u32)(x) & (__u32)0xff00ff00UL) >> 8);
+}
+
+typedef unsigned long long __u64;
+
+__u64
+__rev16_64_alt (__u64 x)
+{
+  return (((__u64)(x) & (__u64)0xff00ff00ff00ff00UL) >> 8)
+ | (((__u64)(x) & (__u64)0x00ff00ff00ff00ffUL) << 8);
+}
+
+__u64
+__rev16_64 (__u64 x)
+{
+  return (((__u64)(x) & (__u64)0x00ff00ff00ff00ffUL) << 8)
+ | (((__u64)(x) & (__u64)0xff00ff00ff00ff00UL) >> 8);
+}
+
+/* { dg-final { scan-assembler-times "rev16\\tx\[0-9\]+" 2 } } */
+/* { dg-final { scan-assembler-times "rev16\\tw\[0-9\]+" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/shift_and_operator-1.c 
b/gcc/testsuite/gcc.target/aarch64/shift_and_operator-1.c
new file mode 100644
index 000..49152c5495a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/shift_and_operator-1.c
@@ -0,0 +1,22 @@
+/* { dg-options "-O2" } */
+/* { dg-do compile } */
+
+unsigned f(unsigned x, unsigned b)
+{
+  return ((x & 0xff00ff00U) >> 8) | b;
+}
+
+unsigned f0(unsigned x, unsigned b)
+{
+  return ((x & 0xff00ff00U) >> 8) ^ b;
+}
+unsigned f1(unsigned x, unsigned b)
+{
+  return ((x & 0xff00ff00U) >> 8) + b;
+}
+
+/* { dg-final { scan-assembler-times "lsr\\tw\[0-9\]+" 0 } } */
+/* { dg-final { scan-assembler-times "lsr 8" 3 } } */
+/* { dg-final { scan-assembler-times "eor\\tw\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "add\\tw\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "orr\\tw\[0-9\]+" 1 } } */
-- 
2.31.1



Re: [PATCH] Fortran: fix ICE with bind(c) in block data [PR104332]

2023-03-09 Thread Jerry D via Gcc-patches

On 3/9/23 10:08 AM, Harald Anlauf via Fortran wrote:

Dear all,

the attached almost obvious patch fixes a NULL pointer dereference
in a check of a symbol with the bind(c) attribute.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

This PR is marked as 10/11/12/13 regression, thus it should
qualify for a backport.  It's simple enough anyway.

Thanks,
Harald



OK, please proceed. Thanks for the patch.

Jerry


[pushed] c++: signed __int128_t [PR108099]

2023-03-09 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

The code for handling signed + typedef was breaking on __int128_t, because
it isn't a proper typedef: it doesn't have DECL_ORIGINAL_TYPE.

PR c++/108099

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Handle non-typedef typedef_decl.

gcc/testsuite/ChangeLog:

* g++.dg/ext/int128-7.C: New test.
---
 gcc/cp/decl.cc  | 11 ---
 gcc/testsuite/g++.dg/ext/int128-7.C |  4 
 2 files changed, 12 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/int128-7.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 30c7470974d..b1603859644 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -12440,10 +12440,15 @@ grokdeclarator (const cp_declarator *declarator,
{
  if (typedef_decl)
{
- pedwarn (loc, OPT_Wpedantic, "%qs specified with %qT",
-  key, type);
+ pedwarn (loc, OPT_Wpedantic, "%qs specified with %qD",
+  key, typedef_decl);
  ok = !flag_pedantic_errors;
- type = DECL_ORIGINAL_TYPE (typedef_decl);
+ if (is_typedef_decl (typedef_decl))
+   type = DECL_ORIGINAL_TYPE (typedef_decl);
+ else
+   /* PR108099: __int128_t comes from c_common_nodes_and_builtins,
+  and is not built as a typedef.  */
+   type = TREE_TYPE (typedef_decl);
  typedef_decl = NULL_TREE;
}
  else if (declspecs->decltype_p)
diff --git a/gcc/testsuite/g++.dg/ext/int128-7.C 
b/gcc/testsuite/g++.dg/ext/int128-7.C
new file mode 100644
index 000..bf5e8c40a4b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/int128-7.C
@@ -0,0 +1,4 @@
+// PR c++/108099
+// { dg-do compile { target { c++11 && int128 } } }
+
+using i128 = signed __int128_t;// { dg-error "specified with" }

base-commit: 68c5d92a1390ecccb61d3600a95eeff6caf7ccdf
-- 
2.31.1



[pushed] c++: overloaded fn in contract [PR108542]

2023-03-09 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

PR c++/108542

gcc/cp/ChangeLog:

* class.cc (instantiate_type): Strip location wrapper.

gcc/testsuite/ChangeLog:

* g++.dg/contracts/contracts-err1.C: New test.
---
 gcc/cp/class.cc | 2 ++
 gcc/testsuite/g++.dg/contracts/contracts-err1.C | 7 +++
 2 files changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/contracts/contracts-err1.C

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index 27a79829737..d37e9d4d576 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -8728,6 +8728,8 @@ instantiate_type (tree lhstype, tree rhs, tsubst_flags_t 
complain)
 
   complain &= ~tf_ptrmem_ok;
 
+  STRIP_ANY_LOCATION_WRAPPER (rhs);
+
   if (lhstype == unknown_type_node)
 {
   if (complain & tf_error)
diff --git a/gcc/testsuite/g++.dg/contracts/contracts-err1.C 
b/gcc/testsuite/g++.dg/contracts/contracts-err1.C
new file mode 100644
index 000..8437d94e2ad
--- /dev/null
+++ b/gcc/testsuite/g++.dg/contracts/contracts-err1.C
@@ -0,0 +1,7 @@
+// PR c++/108542
+// { dg-additional-options -fcontracts }
+// { dg-do compile { target c++11 } }
+
+template
+void f (T n) {}
+void g() [[pre: f]];   // { dg-error "overloaded" }

base-commit: e0324e2629e25a90c13c68b4eef1e47b091970c3
-- 
2.31.1



Re: [PATCH] libstdc++: Add missing free functions for atomic_flag [PR103934]

2023-03-09 Thread Thomas Rodgers via Gcc-patches
The second patch has now been backported and pushed to releases/gcc-12 and
releases/gcc-11.

On Mon, Feb 13, 2023 at 6:06 PM Thomas Rodgers  wrote:

> Tested x86_64-pc-linux-gnu. Pushed to trunk.
>
> The first patch has also been backported and pushed to releases/gcc-12 and
> releases/gcc-11
>
> The second patch fails to cleanly cherry-pick. Will resolve and push
> shortly.
>
> On Fri, Feb 10, 2023 at 4:41 PM Jonathan Wakely 
> wrote:
>
>> On Fri, 10 Feb 2023 at 18:25, Thomas Rodgers  wrote:
>> >
>> > This patch did not get committed in a timely manner after it was OK'd.
>> In revisiting the patch some issues were found that have led me to
>> resubmit for review -
>> >
>> > Specifically -
>> >
>> > The original commit to add C++20 atomic_flag::test did not include the
>> free functions for atomic_flag_test[_explicit]
>> > The original commit to add C++20 atomic_flag::wait/notify did not
>> include the free functions for atomic_flag_wait/notify[_explicit]
>> >
>> > These two commits landed in GCC10 and GCC11 respectively. My original
>> patch included both sets of free functions, but
>> > that complicates the backporting of these changes to GCC10, GCC11, and
>> GCC12.
>>
>> I don't think we need them in GCC 10.
>>
>> > Additionally commit 7c2155 removed const qualification from
>> atomic_flag::notify_one/notify_all but the original version of this
>> > patch accepts the atomic flag as const.
>> >
>> > The original version of this patch did not include test cases for the
>> atomic_flag_test[_explicit] free functions.
>> >
>> > I have split the original patch into two patches, one for the
>> atomic_flag_test free functions, and one for the atomic_flag_wait/notify
>> > free functions.
>>
>> Thanks.
>>
>> For [PATCH 1/2] please name the added functions in the changelog entry:
>>
>> * include/std/atomic (atomic_flag_test): Add.
>> (atomic_flag_test_explicit): Add.
>>
>> Similarly for the changelog in [PATCH 2/2], naming the four new
>> functions added to include/std/atomic.
>>
>> The indentation is off in [PATCH 2/2] for atomic_flag:
>>
>> +#if __cpp_lib_atomic_wait
>> +  inline void
>> +  atomic_flag_wait(atomic_flag* __a, bool __old) noexcept
>> +  { __a->wait(__old); }
>> +
>>
>> And similarly for the other three added functions.
>> The function names should start in the same column as the 'inline' and
>> opening brace of the function body.
>>
>>
>> Both patches are OK for trunk, gcc-12 and gcc-11 with those changes.
>>
>>
>>
>>
>> >
>> >
>> > On Wed, Feb 2, 2022 at 1:35 PM Jonathan Wakely 
>> wrote:
>> >>
>> >> >+  inline void
>> >> >+  atomic_flag_wait_explicit(const atomic_flag* __a, bool __old,
>> >> >+   std::memory_order __m) noexcept
>> >>
>> >> No need for the std:: qualification, and check the indentation.
>> >>
>> >>
>> >> > libstdc++-v3/ChangeLog:
>> >> >
>> >> >PR103934
>> >>
>> >> This needs to include the component: PR libstdc++/103934
>> >>
>> >> >* include/std/atomic: Add missing free functions.
>> >>
>> >> Please name the new functions in the changelog, in the usual format.
>> >> Just the names is fine, no need for the full signatures with
>> >> parameters.
>> >>
>> >> OK for trunk with those changes.
>> >>
>>
>>


Re: [PATCH]middle-end: don't form FMAs when multiplication is not single use. [PR108583]

2023-03-09 Thread Hongtao Liu via Gcc-patches
On Fri, Mar 10, 2023 at 3:37 AM Tamar Christina via Gcc-patches
 wrote:
>
> Hi All,
>
> The testcase
>
> typedef unsigned int vec __attribute__((vector_size(32)));
> vec
> f3 (vec a, vec b, vec c)
> {
>   vec d = a * b;
>   return d + ((c + d) >> 1);
> }
>
> shows a case where we don't want to form an FMA due to the MUL not being 
> single
> use.  In this case to form an FMA we have to redo the MUL as well as we no
> longer have it to share.
>
> As such making an FMA here would be a de-optimization.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> PR target/108583
> * tree-ssa-math-opts.cc (convert_mult_to_fma): Inhibit FMA in case not
> single use.
>
> gcc/testsuite/ChangeLog:
>
> PR target/108583
> * gcc.dg/mla_1.c: New test.
>
> Co-Authored-By: Richard Sandiford 
>
> --- inline copy of patch --
> diff --git a/gcc/testsuite/gcc.dg/mla_1.c b/gcc/testsuite/gcc.dg/mla_1.c
> new file mode 100644
> index 
> ..a92ecf248116d89b1bc4207a907ea5ed95728a28
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/mla_1.c
> @@ -0,0 +1,40 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-options "-O2 -msve-vector-bits=256 -march=armv8.2-a+sve 
> -fdump-tree-optimized" } */
> +
> +unsigned int
> +f1 (unsigned int a, unsigned int b, unsigned int c) {
> +  unsigned int d = a * b;
> +  return d + ((c + d) >> 1);
> +}
> +
> +unsigned int
> +g1 (unsigned int a, unsigned int b, unsigned int c) {
> +  return a * b + c;
> +}
> +
> +__Uint32x4_t
> +f2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
> +  __Uint32x4_t d = a * b;
> +  return d + ((c + d) >> 1);
> +}
> +
> +__Uint32x4_t
> +g2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
> +  return a * b + c;
> +}
> +
> +typedef unsigned int vec __attribute__((vector_size(32))); vec
> +f3 (vec a, vec b, vec c)
> +{
> +  vec d = a * b;
> +  return d + ((c + d) >> 1);
> +}
> +
> +vec
> +g3 (vec a, vec b, vec c)
> +{
> +  return a * b + c;
> +}
> +
> +/* { dg-final { scan-tree-dump-times {\.FMA } 1 "optimized" { target 
> aarch64*-*-* } } } */
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index 
> 5ab5b944a573ad24ce8427aff24fc5215bf05dac..26ed91d58fa4709a67c903ad446d267a3113c172
>  100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -3346,6 +3346,20 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree 
> op2,
> param_avoid_fma_max_bits));
>bool defer = check_defer;
>bool seen_negate_p = false;
> +
> +  /* There is no numerical difference between fused and unfused integer FMAs,
> + and the assumption below that FMA is as cheap as addition is unlikely
> + to be true, especially if the multiplication occurs multiple times on
> + the same chain.  E.g., for something like:
> +
> +(((a * b) + c) >> 1) + (a * b)
> +
> + we do not want to duplicate the a * b into two additions, not least
> + because the result is not a natural FMA chain.  */
> +  if (ANY_INTEGRAL_TYPE_P (type)
> +  && !has_single_use (mul_result))
What about floating point?
> +return false;
> +
>/* Make sure that the multiplication statement becomes dead after
>   the transformation, thus that all uses are transformed to FMAs.
>   This means we assume that an FMA operation has the same cost
>
>
>
>
> --



-- 
BR,
Hongtao


[PATCH V4] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Michael Meissner via Gcc-patches
This patch reworks how the complex multiply and divide built-in functions are
done.  Previously GCC created built-in declarations for doing long double 
complex
multiply and divide when long double is IEEE 128-bit.  However, it did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.

This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.

Originally, the patch was part of a larger patch set and the comments reflected
this.  I have removed the comments referring to the other patches.  While this
patch was originally developed as part of those other patches, it is a stand
alone patch.

I have tried to address the comments from the last patch review in this patch.
Note, I will be away from the computer from March 10 through the 13th, so I
will not be checking in the patches until I get back.  But I thought I would
share the results of the changes that were asked for.

I fixed the complex_multiply_builtin_code and complex_divide_builtin_code
functions to have an assert that the mode is within the proper modes.  I have
tried to make the code a little bit clearer.

I have cleaned up the tests to eliminate the target powerpc in the tests.  I
have eliminated the -mpower8-vector option.  I have changed the scan assembler
lines just to look for __divtc3 or __multc3, and not depend on the format of the
'bl' call to those functions.  I have kept the -Wno-psabi option, because this
is needed to prevent spurious errors on systems with older libraries (like big
endian) that don't have IEEE 128-bit support.

2023-03-09   Michael Meissner  

gcc/

PR target/109067
* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
(init_float128_ieee): Delete code to switch complex multiply and divide
for long double.
(complex_multiply_builtin_code): New helper function.
(complex_divide_builtin_code): Likewise.
(rs6000_mangle_decl_assembler_name): Add support for mangling the name
of complex 128-bit multiply and divide built-in functions.

gcc/testsuite/

PR target/109067
* gcc.target/powerpc/divic3-1.c: New test.
* gcc.target/powerpc/divic3-2.c: Likewise.
* gcc.target/powerpc/mulic3-1.c: Likewise.
* gcc.target/powerpc/mulic3-2.c: Likewise.
---
 gcc/config/rs6000/rs6000.cc | 111 +++-
 gcc/testsuite/gcc.target/powerpc/divic3-1.c |  21 
 gcc/testsuite/gcc.target/powerpc/divic3-2.c |  25 +
 gcc/testsuite/gcc.target/powerpc/mulic3-1.c |  21 
 gcc/testsuite/gcc.target/powerpc/mulic3-2.c |  25 +
 5 files changed, 156 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 8e0b0d022db..fa5f93a874f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11154,26 +11154,6 @@ init_float128_ibm (machine_mode mode)
 }
 }
 
-/* Create a decl for either complex long double multiply or complex long double
-   divide when long double is IEEE 128-bit floating point.  We can't use
-   __multc3 and __divtc3 because the original long double using IBM extended
-   double used those names.  The complex multiply/divide functions are encoded
-   as builtin functions with a complex result and 4 scalar inputs.  */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
-  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
- name, NULL_TREE);
-
-  set_builtin_decl (fncode, fndecl, true);
-
-  if (TARGET_DEBUG_BUILTIN)
-fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
-  return;
-}
-
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
arguments can be passed in a vector register.  The historical PowerPC
implementation of IEEE 128-bit floating point used _q_ for the names, so
@@ -11185,32 +11165,6 @@ init_float128_ieee (machine_mode mode)
 {
   if (FLOAT128_VECTOR_P (mode))
 {
-  static bool complex_muldiv_init_p = false;
-
-  /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  If
-we have clone or target attributes, this will be called a second
-time.  We want to create the built-in function only once.  */
- if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
-   {
-complex_muldiv_init_p = true;
-built_in_function fncode_mul =
-  (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
-

Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-09 Thread Peter Bergner via Gcc-patches
On 3/9/23 8:55 AM, Segher Boessenkool wrote:
> On Thu, Mar 09, 2023 at 05:30:53PM +0800, Kewen.Lin wrote:
>> on 2023/3/9 07:01, Peter Bergner via Gcc-patches wrote:
>>> This patch was tested in both GCC 11 and GCC 10 on powerpc64le-linux and
>>> showed no regressions.  Ok for backports?
> 
> It isn't truly a backport. You can put it on 11 and 10 at the same time,
> there is no benefit doing it on 11 only first.

Correct.  I just meant that they're targeted at the two release branches
and not trunk.



>>>   op[nopnds++] = build_pointer_type (void_type_node);
>>>   if (d->code == MMA_BUILTIN_DISASSEMBLE_ACC)
>>> -   op[nopnds++] = build_pointer_type (vector_quad_type_node);
>>> +   op[nopnds++] = build_pointer_type (build_qualified_type
>>> +(vector_quad_type_node,
>>> + TYPE_QUAL_CONST));
>>
>> Nit: Maybe we can build them out of the loop once and then just use the
>> built one in the loop.
> 
> Or as globals even.  Currently we have X and pointer to X, but no
> pointer to const X (and no const X either, but that isn't so useful).
> 
> The generic code doesn't have this either, hrm.

I can have a look at that, but was trying to keep the change as small
as possible.  Especially since we're not trying to create code that
will be "easier" to maintain in the future, because this is all changed
in GCC12 with Bill's builtin re-write.



>> Simply testing __builtin_mma_xxmtacc and __builtin_mma_xxmfacc as below:
>>
>> $ cat test.C
>> void foo0(const __vector_quad *acc) {
>>   __builtin_mma_xxmtacc(acc);
>>   __builtin_mma_xxmfacc(acc);
>> }
>>
>> test.C:2:25: error: invalid conversion from ‘const __vector_quad*’ to ‘__vector_quad*’ [-fpermissive]
>> 2 |   __builtin_mma_xxmtacc(acc);
>>
>> test.C:3:25: error: invalid conversion from ‘const __vector_quad*’ to ‘__vector_quad*’ [-fpermissive]
>> 3 |   __builtin_mma_xxmfacc(acc);
>>
>> They also suffered the same error on gcc11 branch but not on trunk.
> 
> Yeah, there is more to be done here.

Well I'm sure there are non-MMA builtins that have the same issue.
I was just fixing the ones Chip ran into and similar builtins.
I don't think we want to go and make everything work like it does
on trunk, especially when no one has complained about hitting
them.

As for the __builtin_mma_xxm[ft]acc() errors, I'm not sure any actual
code will ever hit this.  All realistic examples declare a __vector_quad
var and the pointer passed to the builtin comes from doing &var as the
operand.  Clearly we cannot have a "const __vector_quad var;" and
use that in the builtins.


>> Besides, I'm not sure if the existing bif declarations using
>> ptr_vector_pair_type_node and ptr_vector_quad_type_node are all
>> intentional, at least it looks weird to me that we declare
>> const __vector_pair* for this __builtin_vsx_stxvp, which is meant to
>> store 32 bytes into the memory provided by the pointer biasing the
>> sizetype offset, but the "const" qualifier seems to tell that this bif
>> doesn't modify the memory pointed by the given pointer.

I'm not a language lawyer and I don't play one on TV.  What we're accepting
here, is a pointer with a "const" value that points to non-const memory.
I'll double check the trunk code, but I don't think it allows (and we don't
want it to) using a pointer (const or non-const) that points to a const memory
...at least for the stxvp builtin.
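
To make the two readings of "const" concrete, here is a small C++ sketch.
The helpers load_through and store_through are made up for illustration and
are not rs6000 builtins.  A pointer to const promises not to write through
the pointer, which suits a load-like operation; a const pointer to non-const
data only promises that the pointer itself is not reseated, which is the most
a store-like operation such as stxvp could accept.

```cpp
// Illustration only: these helpers are hypothetical, not rs6000 builtins.

// Reads through a pointer-to-const: callers may pass const or non-const data.
int load_through (const int *p)
{
  return *p;
}

// Writes through a const pointer to non-const data: the pointer itself cannot
// be reseated, but the pointed-to memory is modified.
void store_through (int *const p, int value)
{
  *p = value;
}

int demo ()
{
  const int ro = 7;
  int rw = 0;
  store_through (&rw, load_through (&ro));   // OK
  // store_through (&ro, 1);   // error: invalid conversion from
  //                           // 'const int*' to 'int*'
  return rw;
}
```

Calling demo () copies the value of the const object into the writable one,
while the commented-out call is rejected in the same way as the
__builtin_mma_xxmtacc example quoted above.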


> Since the patch is a strict improvement already, it is okay for 11 and
> 10.  But you (Peter) may want to flesh it out a bit first?  Or first
> commit only this if that works better for you.

I'll see about making some of the changes above and then I'll report back.

Peter



[PATCH v2] ubsan: missed -fsanitize=bounds for compound ops [PR108060]

2023-03-09 Thread Marek Polacek via Gcc-patches
On Thu, Mar 09, 2023 at 09:44:49AM +0100, Jakub Jelinek wrote:
> On Thu, Mar 09, 2023 at 08:12:47AM +, Richard Biener wrote:
> > I think this is a reasonable way to address the regression, so OK.
> 
> It is true that both C and C++ (including c++14_down and c++17 and later
> where the latter have different ordering rules) evaluate the lhs of
> MODIFY_EXPR after rhs, so conceptually this patch makes sense.

Thank you both for taking a look.

> But I wonder why we do in ubsan_maybe_instrument_array_ref:
>   if (e != NULL_TREE)
> {
>   tree t = copy_node (*expr_p);
>   TREE_OPERAND (t, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
> e, op1);
>   *expr_p = t;
> }
> rather than modification of the ARRAY_REF's operand in place.  If we
> did that, we wouldn't really care about the order, shared tree would
> be instrumented once, with SAVE_EXPR in there making sure we don't
> compute that multiple times.  Is that because the 2 copies could
> have side-effects and we do want to evaluate those multiple times?

I'd assumed that that was the point of the copy_node.  But now that
I'm actually experimenting with it, I can't trigger any problems
without the copy_node.  So maybe we can use the following patch, which
also adds a new test, bounds-21.c, to check that side-effects are
evaluated correctly.  I didn't bother writing a description for this
patch yet because I sort of think we should apply both patches at the
same time.  
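
As a sketch of the property such a test wants to check (this is not the
contents of bounds-21.c, and the names next_index and calls are invented
here): the index expression of a compound assignment has to be evaluated
exactly once, whether or not the ARRAY_REF gets instrumented.

```cpp
int a[8];
int calls;

// Hypothetical helper with a visible side effect.
int next_index ()
{
  ++calls;
  return 3;
}

int demo ()
{
  calls = 0;
  a[3] = 1;
  a[next_index ()] |= 4;   // the index expression must run once, not twice
  // Encode both observations in one value: call count and array element.
  return calls * 100 + a[3];
}
```

If the instrumentation duplicated the side effect, calls would end up as 2
and the encoded result would change accordingly.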


Regtested on x86_64-pc-linux-gnu.

-- >8 --
PR sanitizer/108060
PR sanitizer/109050

gcc/c-family/ChangeLog:

* c-ubsan.cc (ubsan_maybe_instrument_array_ref): Don't copy_node.

gcc/testsuite/ChangeLog:

* c-c++-common/ubsan/bounds-17.c: New test.
* c-c++-common/ubsan/bounds-18.c: New test.
* c-c++-common/ubsan/bounds-19.c: New test.
* c-c++-common/ubsan/bounds-20.c: New test.
* c-c++-common/ubsan/bounds-21.c: New test.
---
 gcc/c-family/c-ubsan.cc  |  8 ++--
 gcc/testsuite/c-c++-common/ubsan/bounds-17.c | 17 +
 gcc/testsuite/c-c++-common/ubsan/bounds-18.c | 17 +
 gcc/testsuite/c-c++-common/ubsan/bounds-19.c | 20 
 gcc/testsuite/c-c++-common/ubsan/bounds-20.c | 16 
 gcc/testsuite/c-c++-common/ubsan/bounds-21.c | 18 ++
 6 files changed, 90 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-17.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-18.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-19.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-20.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-21.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 3e24198d7bb..8ce6421b61a 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -505,12 +505,8 @@ ubsan_maybe_instrument_array_ref (tree *expr_p, bool ignore_off_by_one)
   tree e = ubsan_instrument_bounds (EXPR_LOCATION (*expr_p), op0, &op1,
ignore_off_by_one);
   if (e != NULL_TREE)
-   {
- tree t = copy_node (*expr_p);
- TREE_OPERAND (t, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
-   e, op1);
- *expr_p = t;
-   }
+   TREE_OPERAND (*expr_p, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
+   e, op1);
 }
 }
 
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-17.c b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
new file mode 100644
index 000..b727e3235b8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
@@ -0,0 +1,17 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int c;
+
+int
+main ()
+{
+  int b = -32768;
+  a[b] |= c;
+}
+
+/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-18.c b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
new file mode 100644
index 000..556abc0e1c0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
@@ -0,0 +1,17 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-skip-if "" { *-*-* } "-flto" } */
+/* { dg-shouldfail "ubsan" } */
+
+int a[8];
+int c;
+
+int
+main ()
+{
+  int b = -32768;
+  a[b] = a[b] | c;
+}
+
+/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-19.c b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
new file mode 100644
index 000..54217ae399f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
@@ -0,0 +1,20 @@
+/* PR sanitizer/108060 */
+/* { dg-do run } */
+/* { dg-options 

Re: [PATCH v2 4/5] Update texinfo.tex, remove the @gol macro/alias

2023-03-09 Thread Gerald Pfeifer
On Thu, 9 Mar 2023, Sandra Loosemore wrote:
> This is OK, but I'd like to see this patch split into two separate 
> commits as well -- one for the texinfo.tex import, and one for the @gol 
> changes.

I believe Arsen does not have git write access.

Arsen, if that is indeed the case, I offer to push these two commits for
you if you send them by e-mail (as two attachments).

Gerald


Re: [PATCH v2 5/5] update_web_docs_git: Update CSS reference to new manual CSS

2023-03-09 Thread Sandra Loosemore

On 2/23/23 03:27, Arsen Arsenović via Gcc-patches wrote:

maintainer-scripts/ChangeLog:

* update_web_docs_git (CSS): Update CSS reference to point to
/texinfo-manuals.css.


I'm going to defer to Gerald on this one, since I am ignorant of how 
documents are produced for the GCC web site.  IIUC the online docs are 
built on a system with Texinfo 6.5; I don't know if it's reasonable to 
update that, otherwise I think somebody ought to give it a dry run to 
make sure that the style sheet does reasonable things with Texinfo 6.5 
output.


-Sandra


Re: [PATCH v2 4/5] Update texinfo.tex, remove the @gol macro/alias

2023-03-09 Thread Sandra Loosemore

On 2/23/23 03:27, Arsen Arsenović via Gcc-patches wrote:

The @gol macro appears to have existed as a workaround for a bug in old
versions of makeinfo and/or texinfo.tex, where they would, in some types
of output, fail to emit line breaks in @gccoptlists.  After updating
texinfo.tex, I noticed that this behavior appears to no longer be
exhibited, instead, both acted correctly and inserted newlines.  The
(groff) manual output also appears unaffected.

gcc/ChangeLog:

* doc/include/texinfo.tex: Update to 2023-01-17.19.
* doc/implement-c.texi: Remove usage of @gol.
* doc/invoke.texi: Ditto.
* doc/sourcebuild.texi: Ditto.
* doc/include/gcc-common.texi: Remove @gol.  In new Makeinfo and
texinfo.tex versions, the bug it was working around appears to
be gone.

gcc/fortran/ChangeLog:

* invoke.texi: Remove usages of @gol.
* intrinsic.texi: Ditto.


This is OK, but I'd like to see this patch split into two separate 
commits as well -- one for the texinfo.tex import, and one for the @gol 
changes.


-Sandra


Re: [PATCH v2 3/5] doc: Add @defbuiltin family of helpers, set documentlanguage

2023-03-09 Thread Sandra Loosemore

On 2/23/23 03:27, Arsen Arsenović via Gcc-patches wrote:

The @defbuiltin{,x} macros are convenience macros for the often-repeated
task of defining a built-in function in extend.texi.  Usage of this
macro should lead to a higher degree of consistency across pieces of
text written by different people, and provide a better reading
experience, as they prevent easy-to-make errors, like forgetting index
entries for these functions.

The documentlanguage omission was spotted by one of the people I asked
to "test drive" the updated manual, and so, it was added accordingly.

gcc/ChangeLog:

* doc/gcc.texi: Set document language to en_US.
(@copying): Wrap cover texts in @quotation, move description of
manual in.
* doc/include/gcc-common.texi: Add @defbuiltin(x), @enddefbuiltin
for defining built-in functions.
* doc/extend.texi: Fix copyright notice comment, switch to using
@defbuiltin for built-in function definitions.
(Object Size Checking): Add subsubsection for formatted output
function (printf et al.) checking.


This is OK, but I would like to see it split into two separate commits, 
one for the @defbuiltin parts and one for the other miscellaneous tweaks.


-Sandra


Re: [PATCH v2 0/5] A small Texinfo refinement

2023-03-09 Thread Sandra Loosemore via Gcc-patches

On 3/9/23 01:26, Richard Biener wrote:


SLES 12 has texinfo 4.13a, SLES 15 has texinfo 6.5.  We still provide
up-to-date GCC for SLES 12 but we can probably manage in some ways
when the texinfo requirement gets bumped.


OK, this seems to be the oldest version anyone admits to actually using. 
 I built the manual with Arsen's patches using 4.13a; the build was 
successful, and I didn't see any obvious issues with the @gol removal in 
either the PDF or HTML output, so I think we are OK for backward 
compatibility.


I will work up a patch to remove the references to version 4.7 and 
replace it with some generic language as I suggested earlier, that won't 
be so prone to bit rot.


-Sandra


Re: [PATCH] libcpp: Update to Unicode 15

2023-03-09 Thread Lewis Hyatt via Gcc-patches
On Fri, Nov 04, 2022 at 10:03:13AM +0100, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> The following pseudo-patch (for uname2c.h part
> just a pseudo patch with a lot of changes replaced with ...
> because it is too large but the important changes like
> -static const char uname2c_dict[59418] =
> +static const char uname2c_dict[59891] =
> -static const unsigned char uname2c_tree[208765] = {
> +static const unsigned char uname2c_tree[210697] = {
> are shown, full patch xz compressed will be posted separately
> due to mail limit) regenerates the libcpp tables with Unicode 15.0.0
> which added 4489 new characters.
> 
> As mentioned previously, this isn't just a matter of running the
> two libcpp/make*.cc programs on the new Unicode files, but one needs
> to manually update a table inside of makeuname2c.cc according to
> a table in Unicode text (which is partially reflected in the text
> files, but e.g. in Unicode 14.0.0 not 100% accurately, in 15.0.0
> actually accurately).
> I've also added some randomly chosen subset of those 4489 new
> characters to a testcase.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Hi Jakub-

In addition to these files you updated last year for Unicode 15, we also need
to update generated_cpp_wcwidth.h, which implements cpp_wcwidth() for
diagnostics so we can output correct column numbers. There is a procedure
outlined in the file contrib/unicode/README that accomplishes this. Is it OK
to push the attached patch (gzipped since it is large and uninformative),
which is the result of following the procedure? It went straightforwardly as
expected, and bootstrap+regtest on x86-64 Linux is clean. Thanks!

-Lewis
[PATCH] libcpp: Update cpp_wcwidth() to Unicode 15

Updates cpp_wcwidth() to Unicode 15, following the procedure in
contrib/unicode/README mechanically without incident.

contrib/ChangeLog:

* unicode/DerivedCoreProperties.txt: Update to Unicode 15.
* unicode/DerivedNormalizationProps.txt: Likewise.
* unicode/EastAsianWidth.txt: Likewise.
* unicode/PropList.txt: Likewise.
* unicode/README: Likewise.
* unicode/UnicodeData.txt: Likewise.

libcpp/ChangeLog:

* generated_cpp_wcwidth.h: Regenerated for Unicode 15.


unicode_15_wcwidth-1.txt.gz
Description: application/gunzip


Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Michael Meissner via Gcc-patches
On Thu, Mar 09, 2023 at 04:16:21PM -0600, Segher Boessenkool wrote:
> On Thu, Mar 09, 2023 at 11:11:34AM -0500, Michael Meissner wrote:
> > On Fri, Mar 03, 2023 at 03:35:44PM -0600, Segher Boessenkool wrote:
> > > > +/* { dg-final { scan-assembler "bl __divtc3" } } */
> > > 
> > > This name depends on what object format and ABI is in use (some have an
> > > extra leading underscore, or a dot, or whatever).
> > 
> > > Yes it is needed if GCC is configured against an older GLIBC before the full
> > > IEEE 128-bit support was added.  For example, on my big endian test system,
> > > you get warnings if you switch the floating point format.  I would imagine
> > > it would also fail on little endian system with older libraries.
> 
> The regexp is not good enough, that is all.  Maybe
>   {bl .?__divtc3}
> or similar?  We have many examples in the tests already.

I forgot to mention the regexp.  I think just doing:

/* { dg-final { scan-assembler "__multc3" } } */

is sufficient.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] c++: noexcept and copy elision [PR109030]

2023-03-09 Thread Jason Merrill via Gcc-patches

On 3/9/23 14:32, Patrick Palka wrote:

On Mon, 6 Mar 2023, Marek Polacek via Gcc-patches wrote:


When processing a noexcept, constructors aren't elided: build_over_call
has
 /* It's unsafe to elide the constructor when handling
a noexcept-expression, it may evaluate to the wrong
value (c++/53025).  */
 && (force_elide || cp_noexcept_operand == 0))
so the assert I added recently needs to be relaxed a little bit.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/109030

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Relax assert.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept77.C: New test.
---
  gcc/cp/constexpr.cc | 6 +-
  gcc/testsuite/g++.dg/cpp0x/noexcept77.C | 9 +
  2 files changed, 14 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept77.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 364695b762c..5384d0e8e46 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2869,7 +2869,11 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
  
/* We used to shortcut trivial constructor/op= here, but nowadays

   we can only get a trivial function here with -fno-elide-constructors.  */
-  gcc_checking_assert (!trivial_fn_p (fun) || !flag_elide_constructors);
+  gcc_checking_assert (!trivial_fn_p (fun)
+  || !flag_elide_constructors
+  /* We don't elide constructors when processing
+ a noexcept-expression.  */
+  || cp_noexcept_operand);


It seems weird that we're performing constant evaluation within an
unevaluated operand.  Would it make sense to also fix this a second way
by avoiding constant evaluation from maybe_constant_init when
cp_unevaluated_operand && !manifestly_const_eval, like in maybe_constant_value?


Sounds good.


IIUC since we could still have an evaluated subexpression within
noexcept, the two fixes would be complementary.
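
For readers following along, a minimal sketch of what "unevaluated operand"
means for noexcept, reusing the shape of the testcase.  The helper make_foo
and the counter evaluations are invented here for illustration; the compiler
only inspects the operand to decide whether it can throw and never runs it.

```cpp
struct foo { };

struct as_receiver {
  foo empty_env;
};

int evaluations;

// Hypothetical helper: would bump the counter if it were ever called.
// It is deliberately not declared noexcept.
foo make_foo ()
{
  ++evaluations;
  return foo{};
}

// The operand is only inspected for whether it can throw; make_foo () is
// never called here.
bool demo ()
{
  evaluations = 0;
  bool r = noexcept (as_receiver{make_foo ()});
  // r is false because make_foo may throw; the counter stays at zero.
  return !r && evaluations == 0;
}
```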

  
bool non_constant_args = false;

new_call.bindings
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept77.C b/gcc/testsuite/g++.dg/cpp0x/noexcept77.C
new file mode 100644
index 000..16db8eb79ee
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept77.C
@@ -0,0 +1,9 @@
+// PR c++/109030
+// { dg-do compile { target c++11 } }
+
+struct foo { };
+
+struct __as_receiver {
+  foo empty_env;
+};
+void sched(foo __fun) noexcept(noexcept(__as_receiver{__fun})) { }

base-commit: dfb14cdd796ad9df6b5f2def047ef36b29385902
--
2.39.2








[PATCH 2/2] libstdc++: Add a test for FTM redefinitions

2023-03-09 Thread Arsen Arsenović via Gcc-patches
This test detects redefinitions by compiling stdc++.h with
-Wsystem-headers.  Thanks Patrick Palka for the suggestion.

libstdc++-v3/ChangeLog:

* testsuite/17_intro/versionconflict.cc: New test.
---
 libstdc++-v3/testsuite/17_intro/versionconflict.cc | 6 ++
 1 file changed, 6 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/17_intro/versionconflict.cc

diff --git a/libstdc++-v3/testsuite/17_intro/versionconflict.cc b/libstdc++-v3/testsuite/17_intro/versionconflict.cc
new file mode 100644
index 000..4191c7a2b08
--- /dev/null
+++ b/libstdc++-v3/testsuite/17_intro/versionconflict.cc
@@ -0,0 +1,6 @@
+// { dg-do preprocess }
+// { dg-additional-options "-Wsystem-headers -Werror" }
+
+// Test for redefinitions of FTMs using bits/stdc++.h.
+#include <bits/stdc++.h>
+#include <version>
-- 
2.39.2



[PATCH 1/2] libstdc++: Harmonize <version> and other headers

2023-03-09 Thread Arsen Arsenović via Gcc-patches
Due to recent, large changes in libstdc++, the feature test macros
declared in <version> got out of sync with the other headers that
possibly declare them.  This patch resolves that.

libstdc++-v3/ChangeLog:

* include/bits/unique_ptr.h (__cpp_lib_constexpr_memory):
Synchronize the definition block with...
* include/bits/ptr_traits.h (__cpp_lib_constexpr_memory):
... this one here.  Also define the 202202L value, rather than
leaving it up to purely unique_ptr.h, so that the value is
synchronized across all headers.
(__gnu_debug::_Safe_iterator_base): Move into new conditional
block.
* include/std/memory (__cpp_lib_atomic_value_initialization):
Define on freestanding under the same conditions as in
atomic_base.h.
* include/std/version (__cpp_lib_robust_nonmodifying_seq_ops):
Also define on freestanding.
(__cpp_lib_to_chars): Ditto.
(__cpp_lib_gcd): Ditto.
(__cpp_lib_gcd_lcm): Ditto.
(__cpp_lib_raw_memory_algorithms): Ditto.
(__cpp_lib_array_constexpr): Ditto.
(__cpp_lib_nonmember_container_access): Ditto.
(__cpp_lib_clamp): Ditto.
(__cpp_lib_constexpr_char_traits): Ditto.
(__cpp_lib_constexpr_string): Ditto.
(__cpp_lib_sample): Ditto.
(__cpp_lib_lcm): Ditto.
(__cpp_lib_constexpr_iterator): Ditto.
(__cpp_lib_constexpr_char_traits): Ditto.
(__cpp_lib_interpolate): Ditto.
(__cpp_lib_constexpr_utility): Ditto.
(__cpp_lib_shift): Ditto.
(__cpp_lib_ranges): Ditto.
(__cpp_lib_constexpr_numeric): Ditto.
(__cpp_lib_constexpr_functional): Ditto.
(__cpp_lib_constexpr_algorithms): Ditto.
(__cpp_lib_constexpr_tuple): Ditto.
(__cpp_lib_constexpr_memory): Ditto.
(__cpp_lib_format): Define to 202106L, matching std/format.
---
Hi,

This patchset harmonizes the FTM inconsistencies that were discovered a
while back, and adds a test that crudely tries to detect them.

In the future, we should replace this with a common definition
mechanism, and selective exposure, as we discussed.
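
As a small sketch of why keeping the definition blocks identical matters
(DEMO_LIB_FEATURE is a made-up stand-in for a feature-test macro, not a real
libstdc++ name): the preprocessor accepts a second #define of a macro only if
the replacement list is the same, so two headers can both define the macro as
long as they agree on the value.

```cpp
// Hypothetical macro name; stands in for a feature-test macro that two
// headers both define.
#define DEMO_LIB_FEATURE 202202L
#define DEMO_LIB_FEATURE 202202L   // identical replacement list: well-formed

// A second definition with a different value, e.g.
//   #define DEMO_LIB_FEATURE 201811L
// would be diagnosed as a redefinition.

long demo_feature_value ()
{
  return DEMO_LIB_FEATURE;
}
```

Inside system headers GCC normally stays quiet about such redefinitions,
which is why the accompanying test asks for -Wsystem-headers.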

Tested on x86_64-pc-linux-gnu.

 libstdc++-v3/include/bits/ptr_traits.h | 13 ++--
 libstdc++-v3/include/bits/unique_ptr.h | 11 ++--
 libstdc++-v3/include/std/memory|  6 ++
 libstdc++-v3/include/std/version   | 84 ++
 4 files changed, 66 insertions(+), 48 deletions(-)

diff --git a/libstdc++-v3/include/bits/ptr_traits.h b/libstdc++-v3/include/bits/ptr_traits.h
index dc42a743c96..f6cc6b65f93 100644
--- a/libstdc++-v3/include/bits/ptr_traits.h
+++ b/libstdc++-v3/include/bits/ptr_traits.h
@@ -34,12 +34,15 @@
 
 #include 
 
+/* Duplicate definition with unique_ptr.h.  */
+#if __cplusplus > 202002L && defined(__cpp_constexpr_dynamic_alloc)
+# define __cpp_lib_constexpr_memory 202202L
+#elif __cplusplus > 201703L
+# include 
+# define __cpp_lib_constexpr_memory 201811L
+#endif
+
 #if __cplusplus > 201703L
-#include 
-# ifndef __cpp_lib_constexpr_memory
-// Defined to a newer value in bits/unique_ptr.h for C++23
-#  define __cpp_lib_constexpr_memory 201811L
-# endif
 namespace __gnu_debug { struct _Safe_iterator_base; }
 #endif
 
diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h
index c8daff41865..f0c6d2383b4 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -43,12 +43,11 @@
 # endif
 #endif
 
-#if __cplusplus > 202002L && __cpp_constexpr_dynamic_alloc
-# if __cpp_lib_constexpr_memory < 202202L
-// Defined with older value in bits/ptr_traits.h for C++20
-#  undef __cpp_lib_constexpr_memory
-#  define __cpp_lib_constexpr_memory 202202L
-# endif
+/* Duplicate definition with ptr_traits.h.  */
+#if __cplusplus > 202002L && defined(__cpp_constexpr_dynamic_alloc)
+# define __cpp_lib_constexpr_memory 202202L
+#elif __cplusplus > 201703L
+# define __cpp_lib_constexpr_memory 201811L
 #endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
diff --git a/libstdc++-v3/include/std/memory b/libstdc++-v3/include/std/memory
index 341f9857730..85c36d67ee1 100644
--- a/libstdc++-v3/include/std/memory
+++ b/libstdc++-v3/include/std/memory
@@ -91,6 +91,12 @@
 #  include 
 #endif
 
+/* As a hack, we declare __cpp_lib_atomic_value_initialization here even though
+   we don't include the bit that actually declares it, for consistency.  */
+#if !defined(__cpp_lib_atomic_value_initialization) && __cplusplus >= 202002L
+# define __cpp_lib_atomic_value_initialization 201911L
+#endif
+
 #if __cplusplus >= 201103L && __cplusplus <= 202002L && _GLIBCXX_HOSTED
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index 871d30db5b3..abc49d12e54 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -85,6 +85,12 @@
 #define __cpp_lib_transparent_operators 201510L
 #define 

Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Segher Boessenkool
On Thu, Mar 09, 2023 at 11:11:34AM -0500, Michael Meissner wrote:
> On Fri, Mar 03, 2023 at 03:35:44PM -0600, Segher Boessenkool wrote:
> > > +/* { dg-final { scan-assembler "bl __divtc3" } } */
> > 
> > This name depends on what object format and ABI is in use (some have an
> > extra leading underscore, or a dot, or whatever).
> 
> > Yes it is needed if GCC is configured against an older GLIBC before the full
> > IEEE 128-bit support was added.  For example, on my big endian test system,
> > you get warnings if you switch the floating point format.  I would imagine
> > it would also fail on little endian system with older libraries.

The regexp is not good enough, that is all.  Maybe
  {bl .?__divtc3}
or similar?  We have many examples in the tests already.
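
A quick way to see what the suggested pattern accepts is sketched below in
C++.  Two assumptions: std::regex with its default ECMAScript grammar treats
this particular pattern the same way the Tcl regexps behind scan-assembler
do, and the function name calls_divtc3 is invented for the illustration.

```cpp
#include <regex>
#include <string>

// Returns true if the assembler line contains a call matching the pattern.
bool calls_divtc3 (const std::string &line)
{
  // '.?' allows at most one arbitrary character before the symbol name,
  // which covers a leading dot or an extra leading underscore.
  static const std::regex pattern ("bl .?__divtc3");
  return std::regex_search (line, pattern);
}
```

With this, "bl __divtc3", "bl .__divtc3" and "bl ___divtc3" all match, while
a call to a differently named routine such as "bl __divkc3" does not.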


Segher


Re: Fwd: Bugzilla Bug 81649 [PATCH]: Clarify LeakSanitizer in documentation

2023-03-09 Thread Jonny Grant



On 07/03/2023 23:42, Sandra Loosemore wrote:
> On 3/1/23 05:53, Jonny Grant wrote:
>> Hello
>> I don't have write access, could someone review and apply this please?
>> Kind regards
>> Jonny
> 
> Looks good; I've gone ahead and pushed it for you.
> 
> -Sandra

Awesome - thank you. My first gcc patch.
Jonny


Ping^6: [PATCH] jit: Install jit headers in $(libsubincludedir) [PR 101491]

2023-03-09 Thread Lorenzo Salvadore via Gcc-patches
Hello,

Ping https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606450.html

Thanks,

Lorenzo Salvadore

> From f8e2c2ee89a7d8741bb65163d1f1c20edcd546ac Mon Sep 17 00:00:00 2001
> From: Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> Date: Wed, 16 Nov 2022 11:27:38 +0100
> Subject: [PATCH] jit: Install jit headers in $(libsubincludedir) [PR 101491]
> 
> Installing jit/libgccjit.h and jit/libgccjit++.h headers in
> $(includedir) can be a problem for machines where multiple versions of
> GCC are required simultaneously, see for example this bug report on
> FreeBSD:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257060
> 
> Hence,
> 
> - define $(libsubincludedir) the same way it is defined in libgomp;
> - install jit/libgccjit.h and jit/libgccjit++.h in $(libsubincludedir).
> 
> The patch has already been applied successfully in the official FreeBSD
> ports tree for the ports lang/gcc11 and lang/gcc12. Please see the
> following commits:
> 
> https://cgit.freebsd.org/ports/commit/?id=0338e04504ee269b7a95e6707f1314bc1c4239fe
> https://cgit.freebsd.org/ports/commit/?id=f1957296ed2dce8a09bb9582e9a5a715bf8b3d4d
> 
> gcc/ChangeLog:
> 
> 2022-11-16 Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> PR jit/101491
> * Makefile.in: Define and create $(libsubincludedir)
> 
> gcc/jit/ChangeLog:
> 
> 2022-11-16 Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> PR jit/101491
> * Make-lang.in: Install headers in $(libsubincludedir)
> ---
> gcc/Makefile.in | 3 +++
> gcc/jit/Make-lang.in | 4 ++--
> 2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index f672e6ea549..3bcf1c491ab 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -635,6 +635,8 @@ libexecdir = @libexecdir@
> 
> # Directory in which the compiler finds libraries etc.
> libsubdir = $(libdir)/gcc/$(real_target_noncanonical)/$(version)$(accel_dir_suffix)
> +# Directory in which the compiler finds headers.
> +libsubincludedir = $(libdir)/gcc/$(target_alias)/$(version)/include
> # Directory in which the compiler finds executables
> libexecsubdir = $(libexecdir)/gcc/$(real_target_noncanonical)/$(version)$(accel_dir_suffix)
> # Directory in which all plugin resources are installed
> @@ -3642,6 +3644,7 @@ install-cpp: installdirs cpp$(exeext)
> # $(libdir)/gcc/include isn't currently searched by cpp.
> installdirs:
> $(mkinstalldirs) $(DESTDIR)$(libsubdir)
> + $(mkinstalldirs) $(DESTDIR)$(libsubincludedir)
> $(mkinstalldirs) $(DESTDIR)$(libexecsubdir)
> $(mkinstalldirs) $(DESTDIR)$(bindir)
> $(mkinstalldirs) $(DESTDIR)$(includedir)
> diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
> index 248ec45b729..ba1b3e95da5 100644
> --- a/gcc/jit/Make-lang.in
> +++ b/gcc/jit/Make-lang.in
> @@ -360,9 +360,9 @@ selftest-jit:
> # Install hooks:
> jit.install-headers: installdirs
> $(INSTALL_DATA) $(srcdir)/jit/libgccjit.h \
> - $(DESTDIR)$(includedir)/libgccjit.h
> + $(DESTDIR)$(libsubincludedir)/libgccjit.h
> $(INSTALL_DATA) $(srcdir)/jit/libgccjit++.h \
> - $(DESTDIR)$(includedir)/libgccjit++.h
> + $(DESTDIR)$(libsubincludedir)/libgccjit++.h
> 
> ifneq (,$(findstring mingw,$(target)))
> jit.install-common: installdirs jit.install-headers
> --
> 2.38.0


[pushed] c++: allocator temps in list of arrays [PR108773]

2023-03-09 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

The optimization to reuse the same allocator temporary for all string
constructor calls was breaking on this testcase, because the temps were
already in the argument to build_vec_init, and replacing them with
references to one slot got confused with calls at multiple levels (for the
initializer_list backing array, and then again for the array member of the
std::array).  Fixed by reusing the whole TARGET_EXPR instead of pulling out
the slot; gimplification ensures that it's only initialized once.

I also moved the check for initializing a std:: class down into the tree
walk, and handle multiple temps within a single array element
initialization.
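
For orientation, a sketch of the general shape of code involved.  This is
not the committed testcase, and make_table is a name made up here: each
std::string constructed from a literal takes a defaulted std::allocator
argument, and the list-of-arrays form nests those constructor calls inside
both the initializer_list backing array and the std::array member.

```cpp
#include <array>
#include <string>
#include <vector>

// Each std::string constructor call takes a defaulted std::allocator
// temporary; the list-of-arrays shape nests those calls two levels deep.
std::vector<std::array<std::string, 2>> make_table ()
{
  return { {{"a", "b"}}, {{"c", "d"}} };
}
```

Whether a given program actually reaches the allocator-sharing path in
build_vec_init depends on the front end's choices, so treat this only as an
example of the source pattern.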

PR c++/108773

gcc/cp/ChangeLog:

* init.cc (find_allocator_temps_r): New.
(combine_allocator_temps): Replace find_allocator_temp.
(build_vec_init): Adjust.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array18.C: New test.
* g++.dg/cpp0x/initlist-array19.C: New test.
---
 gcc/cp/init.cc| 78 ++-
 gcc/testsuite/g++.dg/cpp0x/initlist-array18.C | 30 +++
 gcc/testsuite/g++.dg/cpp0x/initlist-array19.C | 23 ++
 3 files changed, 110 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-array18.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-array19.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 52e96fbe590..1b7d3d8fe3e 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -4330,8 +4330,54 @@ find_temps_r (tree *tp, int *walk_subtrees, void *data)
   return NULL_TREE;
 }
 
+/* walk_tree callback to collect temporaries in an expression that
+   are allocator arguments to standard library classes.  */
+
+static tree
+find_allocator_temps_r (tree *tp, int *walk_subtrees, void *data)
+{
+  vec<tree*> &temps = *static_cast<auto_vec<tree*> *>(data);
+  tree t = *tp;
+  if (TYPE_P (t))
+{
+  *walk_subtrees = 0;
+  return NULL_TREE;
+}
+
+  /* If this is a call to a constructor for a std:: class, look for
+ a reference-to-allocator argument.  */
+  tree fn = cp_get_callee_fndecl_nofold (t);
+  if (fn && DECL_CONSTRUCTOR_P (fn)
+  && decl_in_std_namespace_p (TYPE_NAME (DECL_CONTEXT (fn))))
+{
+  int nargs = call_expr_nargs (t);
+  for (int i = 1; i < nargs; ++i)
+   {
+ tree arg = get_nth_callarg (t, i);
+ tree atype = TREE_TYPE (arg);
+ if (TREE_CODE (atype) == REFERENCE_TYPE
+ && is_std_allocator (TREE_TYPE (atype)))
+   {
+ STRIP_NOPS (arg);
+ if (TREE_CODE (arg) == ADDR_EXPR)
+   {
+ tree *ap = &TREE_OPERAND (arg, 0);
+ if (TREE_CODE (*ap) == TARGET_EXPR)
+   temps.safe_push (ap);
+   }
+   }
+   }
+}
+
+  return NULL_TREE;
+}
+
 /* If INIT initializes a standard library class, and involves a temporary
-   std::allocator, return a pointer to the temp.
+   std::allocator, use ALLOC_OBJ for all such temporaries.
+
+   Note that this can clobber the input to build_vec_init; no unsharing is
+   done.  To make this safe we use the TARGET_EXPR in all places rather than
+   pulling out the TARGET_EXPR_SLOT.
 
Used by build_vec_init when initializing an array of e.g. strings to reuse
the same temporary allocator for all of the strings.  We can do this because
@@ -4341,22 +4387,18 @@ find_temps_r (tree *tp, int *walk_subtrees, void *data)
??? Add an attribute to allow users to assert the same property for other
classes, i.e. one object of the type is interchangeable with any other?  */
 
-static tree*
-find_allocator_temp (tree init)
+static void
+combine_allocator_temps (tree &init, tree &alloc_obj)
 {
-  if (TREE_CODE (init) == EXPR_STMT)
-init = EXPR_STMT_EXPR (init);
-  if (TREE_CODE (init) == CONVERT_EXPR)
-init = TREE_OPERAND (init, 0);
-  tree type = TREE_TYPE (init);
-  if (!CLASS_TYPE_P (type) || !decl_in_std_namespace_p (TYPE_NAME (type)))
-return NULL;
   auto_vec<tree*> temps;
-  cp_walk_tree_without_duplicates (&init, find_temps_r, &temps);
+  cp_walk_tree_without_duplicates (&init, find_allocator_temps_r, &temps);
   for (tree *p : temps)
-if (is_std_allocator (TREE_TYPE (*p)))
-  return p;
-  return NULL;
+{
+  if (!alloc_obj)
+   alloc_obj = *p;
+  else
+   *p = alloc_obj;
+}
 }
 
 /* `build_vec_init' returns tree structure that performs
@@ -4694,13 +4736,7 @@ build_vec_init (tree base, tree maxindex, tree init,
  if (one_init)
{
  /* Only create one std::allocator temporary.  */
- if (tree *this_alloc = find_allocator_temp (one_init))
-   {
- if (alloc_obj)
-   *this_alloc = alloc_obj;
- else
-   alloc_obj = TARGET_EXPR_SLOT (*this_alloc);
-   }
+ combine_allocator_temps (one_init, alloc_obj);
  finish_expr_stmt (one_init);
   

[pushed] testsuite: add various -Wanalyzer-null-dereference false +ve test cases

2023-03-09 Thread David Malcolm via Gcc-patches
There are various -Wanalyzer-null-dereference false +ves in bugzilla
that I've been attempting to fix.  Unfortunately I haven't made much
progress, but it seems worth at least capturing the reduced
reproducers as test cases, to make it easier to spot changes in
behavior.

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-6565-g4214bdb1d77ebe.

gcc/testsuite/ChangeLog:
PR analyzer/102671
PR analyzer/105755
PR analyzer/108251
PR analyzer/108400
* gcc.dg/analyzer/null-deref-pr102671-1.c: New test, reduced
from Emacs.
* gcc.dg/analyzer/null-deref-pr102671-2.c: Likewise.
* gcc.dg/analyzer/null-deref-pr105755.c: Likewise.
* gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
New test, reduced from haproxy's src/ssl_sample.c.
* gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:
Likewise.
* gcc.dg/analyzer/null-deref-pr108400-SoftEtherVPN-WebUi.c: New
test, reduced from SoftEtherVPN's src/Cedar/WebUI.c.

Signed-off-by: David Malcolm 
---
 .../gcc.dg/analyzer/null-deref-pr102671-1.c   | 167 +++
 .../gcc.dg/analyzer/null-deref-pr102671-2.c   |  78 +++
 .../gcc.dg/analyzer/null-deref-pr105755.c | 193 ++
 ...f-pr108251-smp_fetch_ssl_fc_has_early-O2.c |  98 +
 ...eref-pr108251-smp_fetch_ssl_fc_has_early.c |  96 +
 .../null-deref-pr108400-SoftEtherVPN-WebUi.c  |  77 +++
 6 files changed, 709 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/null-deref-pr102671-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/null-deref-pr102671-2.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/null-deref-pr105755.c
 create mode 100644 
gcc/testsuite/gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c
 create mode 100644 
gcc/testsuite/gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c
 create mode 100644 
gcc/testsuite/gcc.dg/analyzer/null-deref-pr108400-SoftEtherVPN-WebUi.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/null-deref-pr102671-1.c 
b/gcc/testsuite/gcc.dg/analyzer/null-deref-pr102671-1.c
new file mode 100644
index 000..12a0a48d658
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/null-deref-pr102671-1.c
@@ -0,0 +1,167 @@
+/* { dg-additional-options "-O2 -Wno-shift-count-overflow" } */
+
+struct lisp;
+union vectorlike_header { long size; };
+struct Lisp_Symbol { void *unused; };
+extern struct Lisp_Symbol lispsym[];
+
+static _Bool
+TAGGEDP (struct lisp *a, unsigned tag)
+{
+  return ! (((unsigned) (long) a - tag) & 7);
+}
+
+static _Bool
+VECTORLIKEP (struct lisp *x)
+{
+  return TAGGEDP (x, 5);
+}
+
+static _Bool
+PSEUDOVECTOR_TYPEP (union vectorlike_header const *a, int code)
+{
+  long PSEUDOVECTOR_FLAG = 1L << 62;
+  long PVEC_TYPE_MASK = 0x3fL << 24;
+  return ((a->size & (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK))
+ == (PSEUDOVECTOR_FLAG | (code << 24)));
+}
+
+static _Bool
+PSEUDOVECTORP (struct lisp *a, int code)
+{
+  if (! VECTORLIKEP (a))
+return 0;
+  else
+return PSEUDOVECTOR_TYPEP ((union vectorlike_header *) ((char *) a - 5),
+  code);
+}
+
+static struct lisp *
+builtin_lisp_symbol (int index)
+{
+  return (struct lisp *) (index * sizeof *lispsym);
+}
+
+static _Bool
+NILP (struct lisp *x)
+{
+  return x == builtin_lisp_symbol (0);
+}
+
+
+void wrong_type_argument (struct lisp *, struct lisp *);
+
+static void
+CHECK_TYPE (int ok, struct lisp *predicate, struct lisp *x)
+{
+  if (!ok)
+wrong_type_argument (predicate, x);
+}
+
+
+struct buffer
+{
+  union vectorlike_header header;
+  struct buffer *base_buffer;
+  int window_count;
+};
+
+static _Bool
+BUFFERP (struct lisp *a)
+{
+  return PSEUDOVECTORP (a, 12);
+}
+
+static struct buffer *
+XBUFFER (struct lisp *a)
+{
+  return (struct buffer *) ((char *) a - 5);
+}
+
+
+struct window
+{
+  union vectorlike_header header;
+  struct lisp *next;
+  struct lisp *contents;
+};
+
+static _Bool
+WINDOWP (struct lisp *a)
+{
+  return PSEUDOVECTORP (a, 12);
+}
+
+static void
+CHECK_WINDOW (struct lisp *x)
+{
+  CHECK_TYPE (WINDOWP (x), builtin_lisp_symbol (1360), x);
+}
+
+static struct window *
+XWINDOW (struct lisp *a)
+{
+  return (struct window *) ((char *) a - 5);
+}
+
+static void
+wset_combination (struct window *w, _Bool horflag, struct lisp *val)
+{
+  w->contents = val;
+}
+
+extern struct lisp *selected_window;
+
+struct window *
+decode_live_window (register struct lisp *window)
+{
+  if (NILP (window))
+return XWINDOW (selected_window);
+  CHECK_TYPE (WINDOWP (window) && BUFFERP (XWINDOW (window)->contents),
+ builtin_lisp_symbol (1351), window);
+  return XWINDOW (window);
+}
+
+struct window *
+decode_any_window (register struct lisp *window)
+{
+  struct window *w;
+  if (NILP (window))
+return XWINDOW (selected_window);
+  CHECK_WINDOW (window);
+  w = XWINDOW (window);
+  return w;
+}
+
+static void

Re: [PATCH v2 1/5] docs: Create Indices appendix

2023-03-09 Thread Arsen Arsenović via Gcc-patches

Sandra Loosemore  writes:

> On 2/23/23 03:27, Arsen Arsenović via Gcc-patches wrote:
>> The GCC manual has multiple indices.  By creating an appendix which
>> lists them, we help makeinfo present a more accessible way for the
>> reader to see all the indices.
>> gcc/ChangeLog:
>>  * doc/gcc.texi: Add the Indices appendix, to make texinfo
>>  generate nice indices overview page.
>>  (@copying): Move "This file documents the use of the GNU
>>  compilers" into @copying.  Add quotations around cover texts.
>
>
> I guess this patch is OK and is necessary to smooth over some misfeatures
> in newer versions of Texinfo.  In particular, comparing your sample output
> https://www.aarsen.me/~arsen/final/gcc.html/index.html
>
> to my own fresh Texinfo 6.7-generated version with your patches applied, and
> the existing online documentation like
>
> https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/index.html
>
> the order of the "Short Table of Contents" and longer "Table of Contents" have
> been switched, so that in the new version you have to scroll all the way down
> to the bottom of the page (ugh) to click on "Option Index".  (Frankly, this
> seems like a misfeature; the point of having a "Short Table of Contents" is
> *not* to have to page through the long one to find a particular chapter.)
>
> I guess that is a Texinfo change?  gcc.texi still has:
>
> @summarycontents
> @contents
>
> in that order.

Found the change.  HTML got support for CONTENTS_OUTPUT_LOCATION, which
defaults to after_top, which ignores the inline location of these
elements.  Here's a patch:

From 0a0c9469301fb25c4b420a1ed4e381f5ea921d07 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Arsen=20Arsenovi=C4=87?= 
Date: Thu, 9 Mar 2023 21:44:29 +0100
Subject: [PATCH] update_web_docs_git: Set CONTENTS_OUTPUT_LOCATION=inline

maintainer-scripts/ChangeLog:

	* update_web_docs_git: Set CONTENTS_OUTPUT_LOCATION=inline in
	order to put @shortcontents above contents. See
	9dd976a4-4e09-d901-b949-6d5037567...@codesourcery.com on
	gcc-patches.
---
 maintainer-scripts/update_web_docs_git | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/maintainer-scripts/update_web_docs_git b/maintainer-scripts/update_web_docs_git
index 9ded1744df4..c9f14d1a4d1 100755
--- a/maintainer-scripts/update_web_docs_git
+++ b/maintainer-scripts/update_web_docs_git
@@ -169,7 +169,7 @@ for file in $MANUALS; do
 if [ "$file" = "gnat_ugn" ]; then
   includes="$includes -I gcc/gcc/ada -I gcc/gcc/ada/doc/gnat_ugn"
 fi
-makeinfo --html --css-ref $CSS $includes -o ${file} ${filename}
+makeinfo --html -c CONTENTS_OUTPUT_LOCATION=inline --css-ref $CSS $includes -o ${file} ${filename}
 tar cf ${file}-html.tar ${file}/*.html
 texi2dvi $includes -o ${file}.dvi ${filename} /dev/null && dvips -o ${file}.ps ${file}.dvi
 texi2pdf $includes -o ${file}.pdf ${filename} 

I pushed the updated docs, and added that commit to my branch (as well
as rebasing it, as always).

> OTOH, I see that in your new version there is now a line with links
> [Contents][Index] before the Introduction.  If adding this new appendix makes
> the [Index] link point at the indices, I think it is OK, although I'm still
> worried that the overall effect (even without the new version of Texinfo) is
> making the indices harder to find.
>
> I wonder, could we add something to the Introduction text like
>
> Tip: This manual is very long.  If you're looking for something in particular,
> try searching the @ref{Option Index} or @ref{Concept and Symbol Index}.
>
> ???

Even with the above fixed, I think it'd be nice to add this text.

> -Sandra


-- 
Arsen Arsenović




[PATCH] c++, abi: Fix up class layout with bitfields [PR109039]

2023-03-09 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase FAILs, because starting with r12-6028
the S class has only 2 bytes, not enough to hold one 7-bit bitfield, one 8-bit
bitfield and one 8-bit char field.

The reason is that when end_of_class attempts to compute dsize, it simply
adds byte_position of the field and DECL_SIZE_UNIT (and uses maximum from
those offsets).
The problematic bit-field in question has bit_position 7, byte_position 0,
DECL_SIZE 8 and DECL_SIZE_UNIT 1.  So, byte_position + DECL_SIZE_UNIT is
1, even when the bitfield only has a single bit in the first byte and 7
further bits in the second byte, so per the Itanium ABI it should be 2:
"In either case, update dsize(C) to include the last byte
containing (part of) the bit-field, and update sizeof(C) to
max(sizeof(C),dsize(C))."

The following patch fixes it by computing bitsize of the end and using
CEIL_DIV_EXPR division to round it to next byte boundary and convert
from bits to bytes.
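
As a worked check of the arithmetic above, here is a small standalone
sketch.  It does not use GCC's tree API; end_by_bytes and end_by_bits are
hypothetical helpers, 8-bit units are assumed, and DECL_SIZE_UNIT is
modelled as the bit size rounded up to whole bytes.

```cpp
#include <cassert>

// Assumption: 8-bit storage units (BITS_PER_UNIT == 8).
constexpr unsigned bits_per_unit = 8;

// Old computation: byte_position (field) + DECL_SIZE_UNIT (field).
constexpr unsigned
end_by_bytes (unsigned bit_pos, unsigned size_bits)
{
  return bit_pos / bits_per_unit
         + (size_bits + bits_per_unit - 1) / bits_per_unit;
}

// New computation: (bit_position (field) + DECL_SIZE (field)) divided by
// BITS_PER_UNIT, rounded up.
constexpr unsigned
end_by_bits (unsigned bit_pos, unsigned size_bits)
{
  return (bit_pos + size_bits + bits_per_unit - 1) / bits_per_unit;
}

// The bit-field from the description: bit_position 7, DECL_SIZE 8.
static_assert (end_by_bytes (7, 8) == 1, "old result is one byte short");
static_assert (end_by_bits (7, 8) == 2, "new result covers both bytes");
```

For a field that starts on a byte boundary the two computations agree,
e.g. both give 1 for bit_position 0 and DECL_SIZE 7.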

While this is an ABI change, classes with such incorrect layout couldn't
have worked properly, so I doubt anybody is actually running it often
in the wild.  Thus I think adding some ABI warning for it is unnecessary.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
(and after a while for GCC 12)?

2023-03-09  Jakub Jelinek  

PR c++/109039
* class.cc (end_of_class): For bit-fields, instead of computing
offset as sum of byte_position (field) and DECL_SIZE_UNIT (field),
compute it as sum of bit_position (field) and DECL_SIZE (field)
divided by BITS_PER_UNIT rounded up.

* g++.dg/abi/no_unique_address7.C: New test.

--- gcc/cp/class.cc.jj  2023-02-04 06:22:17.053407477 +0100
+++ gcc/cp/class.cc 2023-03-09 18:02:43.967815721 +0100
@@ -6476,7 +6476,15 @@ end_of_class (tree t, eoc_mode mode)
 size of the type (usually 1) for computing nvsize.  */
  size = TYPE_SIZE_UNIT (TREE_TYPE (field));
 
-   offset = size_binop (PLUS_EXPR, byte_position (field), size);
+   if (DECL_BIT_FIELD_TYPE (field))
+ {
+   offset = size_binop (PLUS_EXPR, bit_position (field),
+DECL_SIZE (field));
+   offset = size_binop (CEIL_DIV_EXPR, offset, bitsize_unit_node);
+   offset = fold_convert (sizetype, offset);
+ }
+   else
+ offset = size_binop (PLUS_EXPR, byte_position (field), size);
if (tree_int_cst_lt (result, offset))
  result = offset;
   }
--- gcc/testsuite/g++.dg/abi/no_unique_address7.C.jj2023-03-09 
18:09:08.397205087 +0100
+++ gcc/testsuite/g++.dg/abi/no_unique_address7.C   2023-03-09 
18:08:56.439379395 +0100
@@ -0,0 +1,33 @@
+// PR c++/109039
+// { dg-do run { target c++11 } }
+
+struct X {
+  signed short x0 : 7;
+  signed short x1 : 8;
+  X () : x0 (1), x1 (2) {}
+  int get () { return x0 + x1; }
+};
+
+struct S {
+  [[no_unique_address]] X x;
+  signed char c;
+  S () : c (0) {}
+};
+
+S s;
+
+int
+main ()
+{
+  if (s.x.x0 != 1 || s.x.x1 != 2 || s.c != 0)
+__builtin_abort ();
+  s.x.x0 = -1;
+  s.x.x1 = -1;
+  if (s.x.x0 != -1 || s.x.x1 != -1 || s.c != 0)
+__builtin_abort ();
+  s.c = -1;
+  s.x.x0 = 0;
+  s.x.x1 = 0;
+  if (s.x.x0 != 0 || s.x.x1 != 0 || s.c != -1)
+__builtin_abort ();
+}

Jakub



RE: [PATCH 3/4]middle-end: Implement preferred_div_as_shifts_over_mult [PR108583]

2023-03-09 Thread Tamar Christina via Gcc-patches
Hi,

Here's the respun patch.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR target/108583
* target.def (preferred_div_as_shifts_over_mult): New.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Regenerate.
* targhooks.cc (default_preferred_div_as_shifts_over_mult): New.
* targhooks.h (default_preferred_div_as_shifts_over_mult): New.
* tree-vect-patterns.cc (vect_recog_divmod_pattern): Use it.

gcc/testsuite/ChangeLog:

PR target/108583
* gcc.dg/vect/vect-div-bitmask-4.c: New test.
* gcc.dg/vect/vect-div-bitmask-5.c: New test.
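
To make the "two addition-shift pairs" in the hook documentation concrete,
here is a scalar sketch.  The divisor (255) and the constant 257 are an
assumption about the kind of division meant, not something stated in this
message, and div255_shifts is a hypothetical name.  The sums are done in
32 bits so they cannot wrap.

```cpp
#include <cassert>
#include <cstdint>

// Divide a 16-bit value by 255 (2^8 - 1) using two add/shift pairs
// instead of a highpart multiplication.
inline std::uint16_t
div255_shifts (std::uint16_t x)
{
  std::uint32_t t = (std::uint32_t (x) + 257u) >> 8;    // first add + shift
  return std::uint16_t ((std::uint32_t (x) + t) >> 8);  // second add + shift
}

// Exhaustively compare against ordinary division over all 16-bit inputs.
inline bool
div255_matches_everywhere ()
{
  for (std::uint32_t x = 0; x <= 0xffffu; ++x)
    if (div255_shifts (std::uint16_t (x)) != x / 255u)
      return false;
  return true;
}
```

Whether this is cheaper than the multiply-based sequence depends on the
target, which is the question the hook is meant to answer.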

--- inline copy of patch ---

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 
50a8872a6695b18b9bed0d393bacf733833633db..bf7269e323de1a065d4d04376e5a2703cbb0f9fa
 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6137,6 +6137,12 @@ instruction pattern.  There is no need for the hook to 
handle these two
 implementation approaches itself.
 @end deftypefn
 
+@deftypefn {Target Hook} bool 
TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT (const_tree @var{type})
+Sometimes it is possible to implement a vector division using a sequence
+of two addition-shift pairs, giving four instructions in total.
+Return true if taking this approach for @var{vectype} is likely
+to be better than using a sequence involving highpart multiplication.
+Default is false if @code{can_mult_highpart_p}, otherwise true.
 @end deftypefn
 
 @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION 
(unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 
3e07978a02f4e6077adae6cadc93ea4273295f1f..0051017a7fd67691a343470f36ad4fc32c8e7e15
 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4173,6 +4173,7 @@ address;  but often a machine-dependent strategy can 
generate better code.
 
 @hook TARGET_VECTORIZE_VEC_PERM_CONST
 
+@hook TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT
 
 @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION
 
diff --git a/gcc/target.def b/gcc/target.def
index 
e0a5c7adbd962f5d08ed08d1d81afa2c2baa64a5..e4474a3ed6bd2f5f5c010bf0d40c2a371370490c
 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1868,6 +1868,18 @@ correct for most targets.",
  poly_uint64, (const_tree type),
  default_preferred_vector_alignment)
 
+/* Returns whether the target has a preference for decomposing divisions using
+   shifts rather than multiplies.  */
+DEFHOOK
+(preferred_div_as_shifts_over_mult,
+ "Sometimes it is possible to implement a vector division using a sequence\n\
+of two addition-shift pairs, giving four instructions in total.\n\
+Return true if taking this approach for @var{vectype} is likely\n\
+to be better than using a sequence involving highpart multiplication.\n\
+Default is false if @code{can_mult_highpart_p}, otherwise true.",
+ bool, (const_tree type),
+ default_preferred_div_as_shifts_over_mult)
+
 /* Return true if vector alignment is reachable (by peeling N
iterations) for the given scalar type.  */
 DEFHOOK
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 
a6a4809ca91baa5d7fad2244549317a31390f0c2..a207963b9e6eb9300df0043e1b79aa6c941d0f7f
 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -53,6 +53,8 @@ extern scalar_int_mode default_unwind_word_mode (void);
 extern unsigned HOST_WIDE_INT default_shift_truncation_mask
   (machine_mode);
 extern unsigned int default_min_divisions_for_recip_mul (machine_mode);
+extern bool default_preferred_div_as_shifts_over_mult
+  (const_tree);
 extern int default_mode_rep_extended (scalar_int_mode, scalar_int_mode);
 
 extern tree default_stack_protect_guard (void);
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index 
211525720a620d6f533e2da91e03877337a931e7..7f39ff9b7ec2bf66625d48a47bb76e96c05a3233
 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1483,6 +1483,15 @@ default_preferred_vector_alignment (const_tree type)
   return TYPE_ALIGN (type);
 }
 
+/* The default implementation of
+   TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT.  */
+
+bool
+default_preferred_div_as_shifts_over_mult (const_tree type)
+{
+  return !can_mult_highpart_p (TYPE_MODE (type), TYPE_UNSIGNED (type));
+}
+
 /* By default assume vectors of element TYPE require a multiple of the natural
alignment of TYPE.  TYPE is naturally aligned if IS_PACKED is false.  */
 bool
diff --git a/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c 
b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
new file mode 100644
index 
..c81f8946922250234bf759e0a0a04ea8c1f73e3c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-div-bitmask-4.c
@@ -0,0 +1,25 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+typedef unsigned __attribute__((__vector_size__ (16))) V;
+
+static __attribute__((__noinline__)) __attribute__((__noclone__)) V
+foo (V v, 

RE: [PATCH 2/4][ranger]: Add range-ops for widen addition and widen multiplication [PR108583]

2023-03-09 Thread Tamar Christina via Gcc-patches
Cheers,

Thanks! I'll wait for him to come back then.

Thanks,
Tamar

> -Original Message-
> From: Aldy Hernandez 
> Sent: Wednesday, March 8, 2023 8:57 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; amacl...@redhat.com
> Subject: Re: [PATCH 2/4][ranger]: Add range-ops for widen addition and
> widen multiplication [PR108583]
> 
> As Andrew has been advising on this one, I'd prefer for him to review it.
> However, he's on vacation this week.  FYI...
> 
> Aldy
> 
> On Mon, Mar 6, 2023 at 12:22 PM Tamar Christina
>  wrote:
> >
> > Ping.
> >
> > And updated the patch to reject cases that we don't expect or can handle
> cleanly for now.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR target/108583
> > * gimple-range-op.h (gimple_range_op_handler): Add
> maybe_non_standard.
> > * gimple-range-op.cc
> (gimple_range_op_handler::gimple_range_op_handler):
> > Use it.
> > (gimple_range_op_handler::maybe_non_standard): New.
> > * range-op.cc (class operator_widen_plus_signed,
> > operator_widen_plus_signed::wi_fold, class
> operator_widen_plus_unsigned,
> > operator_widen_plus_unsigned::wi_fold, class
> operator_widen_mult_signed,
> > operator_widen_mult_signed::wi_fold, class
> operator_widen_mult_unsigned,
> > operator_widen_mult_unsigned::wi_fold,
> > ptr_op_widen_mult_signed, ptr_op_widen_mult_unsigned,
> > ptr_op_widen_plus_signed, ptr_op_widen_plus_unsigned): New.
> > * range-op.h (ptr_op_widen_mult_signed,
> ptr_op_widen_mult_unsigned,
> > ptr_op_widen_plus_signed, ptr_op_widen_plus_unsigned): New
> >
> > Co-Authored-By: Andrew MacLeod 
> >
> > --- Inline copy of patch ---
> >
> > diff --git a/gcc/gimple-range-op.h b/gcc/gimple-range-op.h index
> >
> 743b858126e333ea9590c0f175aacb476260c048..1bf63c5ce6f5db924a1f5
> 907ab45
> > 39e376281bd0 100644
> > --- a/gcc/gimple-range-op.h
> > +++ b/gcc/gimple-range-op.h
> > @@ -41,6 +41,7 @@ public:
> >  relation_trio = TRIO_VARYING);
> >  private:
> >void maybe_builtin_call ();
> > +  void maybe_non_standard ();
> >gimple *m_stmt;
> >tree m_op1, m_op2;
> >  };
> > diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc index
> >
> d9dfdc56939bb62ade72726b15c3d5e87e4ddcd1..a5d625387e712c170e1e
> 68f6a7d4
> > 94027f6ef0d0 100644
> > --- a/gcc/gimple-range-op.cc
> > +++ b/gcc/gimple-range-op.cc
> > @@ -179,6 +179,8 @@
> gimple_range_op_handler::gimple_range_op_handler (gimple *s)
> >// statements.
> >if (is_a <gcall *> (m_stmt))
> >  maybe_builtin_call ();
> > +  else
> > +maybe_non_standard ();
> >  }
> >
> >  // Calculate what we can determine of the range of this unary @@
> > -764,6 +766,57 @@ public:
> >}
> >  } op_cfn_parity;
> >
> > +// Set up a gimple_range_op_handler for any nonstandard function which can be
> > +// supported via range-ops.
> > +
> > +void
> > +gimple_range_op_handler::maybe_non_standard () {
> > +  range_operator *signed_op = ptr_op_widen_mult_signed;
> > +  range_operator *unsigned_op = ptr_op_widen_mult_unsigned;
> > +  if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
> > +switch (gimple_assign_rhs_code (m_stmt))
> > +  {
> > +   case WIDEN_PLUS_EXPR:
> > +   {
> > + signed_op = ptr_op_widen_plus_signed;
> > + unsigned_op = ptr_op_widen_plus_unsigned;
> > +   }
> > +   gcc_fallthrough ();
> > +   case WIDEN_MULT_EXPR:
> > +   {
> > + m_valid = false;
> > + m_op1 = gimple_assign_rhs1 (m_stmt);
> > + m_op2 = gimple_assign_rhs2 (m_stmt);
> > + tree ret = gimple_assign_lhs (m_stmt);
> > + bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
> > + bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
> > + bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
> > +
> > + /* Normally these operands should all have the same sign, but
> > +some passes violate this by taking mismatched sign args.  At
> > +the moment the only one that's possible is mismatched inputs and
> > +unsigned output.  Once ranger supports signs for the operands we
> > +can properly fix it; for now only accept the case we can do
> > +correctly.  */
> > + if ((signed1 ^ signed2) && signed_ret)
> > +   return;
> > +
> > + m_valid = true;
> > + if (signed2 && !signed1)
> > +   std::swap (m_op1, m_op2);
> > +
> > + if (signed1 || signed2)
> > +   m_int = signed_op;
> > + else
> > +   m_int = unsigned_op;
> > + break;
> > +   }
> > +   default:
> > + break;
> > +  }
> > +}
> > +
> >  // Set up a gimple_range_op_handler for any built in function which can be
> >  // supported via range-ops.
> >
> > diff --git a/gcc/range-op.h 

[PATCH]middle-end: don't form FMAs when multiplication is not single use. [PR108583]

2023-03-09 Thread Tamar Christina via Gcc-patches
Hi All,

The testcase

typedef unsigned int vec __attribute__((vector_size(32)));
vec
f3 (vec a, vec b, vec c)
{
  vec d = a * b;
  return d + ((c + d) >> 1);
}

shows a case where we don't want to form an FMA because the MUL is not single
use.  Forming an FMA here means redoing the MUL as well, since we no longer
have its result to share.

As such making an FMA here would be a de-optimization.
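
A scalar sketch of the cost argument, using a hypothetical counted_mul
helper to count multiplications.  The second function only stands in for
what an FMA-forming transform would amount to; it is not what the pass
literally emits.

```cpp
#include <cassert>

static int mul_count = 0;  // hypothetical counter, for illustration only

unsigned
counted_mul (unsigned a, unsigned b)
{
  ++mul_count;
  return a * b;
}

// The multiplication result D is shared by both uses.
unsigned
shared_mul (unsigned a, unsigned b, unsigned c)
{
  unsigned d = counted_mul (a, b);
  return d + ((c + d) >> 1);
}

// Folding c + a * b into one fused operation leaves the other use of
// a * b without its operand, so the multiplication is done twice.
unsigned
duplicated_mul (unsigned a, unsigned b, unsigned c)
{
  unsigned fused = counted_mul (a, b) + c;  // stands in for .FMA (a, b, c)
  return counted_mul (a, b) + (fused >> 1);
}
```

Both functions compute the same value for unsigned operands; only the
number of multiplications differs (one versus two).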

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR target/108583
* tree-ssa-math-opts.cc (convert_mult_to_fma): Inhibit FMA in case not
single use.

gcc/testsuite/ChangeLog:

PR target/108583
* gcc.dg/mla_1.c: New test.

Co-Authored-By: Richard Sandiford 

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/mla_1.c b/gcc/testsuite/gcc.dg/mla_1.c
new file mode 100644
index 
..a92ecf248116d89b1bc4207a907ea5ed95728a28
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/mla_1.c
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-options "-O2 -msve-vector-bits=256 -march=armv8.2-a+sve 
-fdump-tree-optimized" } */
+
+unsigned int
+f1 (unsigned int a, unsigned int b, unsigned int c) {
+  unsigned int d = a * b;
+  return d + ((c + d) >> 1);
+}
+
+unsigned int
+g1 (unsigned int a, unsigned int b, unsigned int c) {
+  return a * b + c;
+}
+
+__Uint32x4_t
+f2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
+  __Uint32x4_t d = a * b;
+  return d + ((c + d) >> 1);
+}
+
+__Uint32x4_t
+g2 (__Uint32x4_t a, __Uint32x4_t b, __Uint32x4_t c) {
+  return a * b + c;
+}
+
+typedef unsigned int vec __attribute__((vector_size(32))); vec
+f3 (vec a, vec b, vec c)
+{
+  vec d = a * b;
+  return d + ((c + d) >> 1);
+}
+
+vec
+g3 (vec a, vec b, vec c)
+{
+  return a * b + c;
+}
+
+/* { dg-final { scan-tree-dump-times {\.FMA } 1 "optimized" { target 
aarch64*-*-* } } } */
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 
5ab5b944a573ad24ce8427aff24fc5215bf05dac..26ed91d58fa4709a67c903ad446d267a3113c172
 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -3346,6 +3346,20 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree 
op2,
param_avoid_fma_max_bits));
   bool defer = check_defer;
   bool seen_negate_p = false;
+
+  /* There is no numerical difference between fused and unfused integer FMAs,
+ and the assumption below that FMA is as cheap as addition is unlikely
+ to be true, especially if the multiplication occurs multiple times on
+ the same chain.  E.g., for something like:
+
+(((a * b) + c) >> 1) + (a * b)
+
+ we do not want to duplicate the a * b into two additions, not least
+ because the result is not a natural FMA chain.  */
+  if (ANY_INTEGRAL_TYPE_P (type)
+  && !has_single_use (mul_result))
+return false;
+
   /* Make sure that the multiplication statement becomes dead after
  the transformation, thus that all uses are transformed to FMAs.
  This means we assume that an FMA operation has the same cost





Re: [PATCH] c++: noexcept and copy elision [PR109030]

2023-03-09 Thread Patrick Palka via Gcc-patches
On Mon, 6 Mar 2023, Marek Polacek via Gcc-patches wrote:

> When processing a noexcept, constructors aren't elided: build_over_call
> has
>/* It's unsafe to elide the constructor when handling
>   a noexcept-expression, it may evaluate to the wrong
>   value (c++/53025).  */
>&& (force_elide || cp_noexcept_operand == 0))
> so the assert I added recently needs to be relaxed a little bit.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
>   PR c++/109030
> 
> gcc/cp/ChangeLog:
> 
>   * constexpr.cc (cxx_eval_call_expression): Relax assert.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp0x/noexcept77.C: New test.
> ---
>  gcc/cp/constexpr.cc | 6 +-
>  gcc/testsuite/g++.dg/cpp0x/noexcept77.C | 9 +
>  2 files changed, 14 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept77.C
> 
> diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> index 364695b762c..5384d0e8e46 100644
> --- a/gcc/cp/constexpr.cc
> +++ b/gcc/cp/constexpr.cc
> @@ -2869,7 +2869,11 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, 
> tree t,
>  
>/* We used to shortcut trivial constructor/op= here, but nowadays
>   we can only get a trivial function here with -fno-elide-constructors.  
> */
> -  gcc_checking_assert (!trivial_fn_p (fun) || !flag_elide_constructors);
> +  gcc_checking_assert (!trivial_fn_p (fun)
> +|| !flag_elide_constructors
> +/* We don't elide constructors when processing
> +   a noexcept-expression.  */
> +|| cp_noexcept_operand);

It seems weird that we're performing constant evaluation within an
unevaluated operand.  Would it make sense to also fix this a second way
by avoiding constant evaluation from maybe_constant_init when
cp_unevaluated_operand && !manifestly_const_eval, like in maybe_constant_value?
IIUC since we could still have an evaluated subexpression within
noexcept, the two fixes would be complementary.
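
For reference, a small sketch of what "unevaluated operand" means for
noexcept at the language level.  The names receiver, make_env and probe are
hypothetical and this is not the testcase from the patch.

```cpp
#include <cassert>

struct foo { };

struct receiver
{
  foo env;
};

static int calls = 0;

// Deliberately not noexcept; counts how often it actually runs.
foo
make_env ()
{
  ++calls;
  return foo ();
}

// The operand of noexcept is unevaluated: make_env is never called here,
// and the result only says whether the expression could throw.
bool
probe ()
{
  return noexcept (receiver{make_env ()});
}

// Aggregate initialization from a prvalue of a trivial type cannot throw.
static_assert (noexcept (receiver{foo ()}), "expected to be non-throwing");
```

Calling probe () returns false, because make_env is not declared noexcept,
and leaves calls at 0.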

>  
>bool non_constant_args = false;
>new_call.bindings
> diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept77.C 
> b/gcc/testsuite/g++.dg/cpp0x/noexcept77.C
> new file mode 100644
> index 000..16db8eb79ee
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/noexcept77.C
> @@ -0,0 +1,9 @@
> +// PR c++/109030
> +// { dg-do compile { target c++11 } }
> +
> +struct foo { };
> +
> +struct __as_receiver {
> +  foo empty_env;
> +};
> +void sched(foo __fun) noexcept(noexcept(__as_receiver{__fun})) { }
> 
> base-commit: dfb14cdd796ad9df6b5f2def047ef36b29385902
> -- 
> 2.39.2
> 
> 



[PATCH] testsuite: Handle default_packed targets in gcc.dg/plugin

2023-03-09 Thread Hans-Peter Nilsson via Gcc-patches
It's not obvious to me whether it's considered best to include or
exclude these tests that depend on structure layout details.
If excluding, the obvious alternative to this patch is then
to add a top one-liner (to dg-skip-if the test for
default_packed targets or a similar excluding expression).
I'm fine either way, just suggesting the following, which
handles the cris-elf test-case failures I see for these
tests, and causes no change in results for native
x86_64-pc-linux-gnu.

Beware that some of the tests have lines with trailing
whitespace.  Where lines are changed in this patch, the
trailing whitespace is removed.

Ok to commit?

-- >8 --
It's a judgement call whether to just skip some of these
tests rather than trying to match messages depending on the
layout of structures, but better include than exclude.

* gcc.dg/plugin/infoleak-2.c,
gcc.dg/plugin/infoleak-CVE-2011-1078-1.c,
gcc.dg/plugin/infoleak-CVE-2011-1078-2.c,
gcc.dg/plugin/infoleak-CVE-2017-18549-1.c,
gcc.dg/plugin/infoleak-CVE-2017-18550-1.c,
gcc.dg/plugin/infoleak-antipatterns-1.c,
gcc.dg/plugin/infoleak-fixit-1.c: Handle default_packed targets.
---
 gcc/testsuite/gcc.dg/plugin/infoleak-2.c| 13 -
 .../gcc.dg/plugin/infoleak-CVE-2011-1078-1.c| 10 +-
 .../gcc.dg/plugin/infoleak-CVE-2011-1078-2.c| 10 +-
 .../gcc.dg/plugin/infoleak-CVE-2017-18549-1.c   | 10 +-
 .../gcc.dg/plugin/infoleak-CVE-2017-18550-1.c   |  7 ---
 .../gcc.dg/plugin/infoleak-antipatterns-1.c | 10 +-
 gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c  | 10 ++
 7 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
index 252f8f25918a..4ba484b3c6be 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
@@ -18,16 +18,19 @@ struct st
   int b:1; /* { dg-message "field 'b' is uninitialized \\(1 bit\\)" "field" } 
*/
/* { dg-message "padding after field 'b' is uninitialized \\(7 
bits\\)" "padding" { target *-*-* } .-1 } */
   u8 d;/* { dg-message "field 'd' is uninitialized \\(1 byte\\)" } */
-  int c:7; /* { dg-message "padding after field 'c' is uninitialized \\(9 
bits\\)" } */
-  u16 e;   /* { dg-message "padding after field 'e' is uninitialized \\(2 
bytes\\)" } */  
+  int c:7; /* { dg-message "padding after field 'c' is uninitialized \\(9 
bits\\)" "padding" { target { ! default_packed } } } */
+   /* { dg-message "padding after field 'c' is uninitialized \\(1 
bit\\)" "padding" { target default_packed } .-1 } */
+  u16 e;   /* { dg-message "padding after field 'e' is uninitialized \\(2 
bytes\\)" "padding" { target { ! default_packed } } } */
 };
 
 void test (void __user *dst, u16 v)
 {
   struct st s; /* { dg-message "region created on stack here" "where" } */
-  /* { dg-message "capacity: 12 bytes" "capacity" { target *-*-* } .-1 } */
-  /* { dg-message "suggest forcing zero-initialization by providing a 
'\\{0\\}' initializer" "fix-it" { target *-*-* } .-2 } */  
+  /* { dg-message "capacity: 12 bytes" "capacity" { target { ! default_packed 
} } .-1 } */
+  /* { dg-message "capacity: 9 bytes" "capacity" { target default_packed } .-2 
} */
+  /* { dg-message "suggest forcing zero-initialization by providing a 
'\\{0\\}' initializer" "fix-it" { target *-*-* } .-3 } */
   s.e = v;
   copy_to_user(dst, &s, sizeof (struct st)); /* { dg-warning "potential 
exposure of sensitive information by copying uninitialized data from stack" 
"warning" } */
-  /* { dg-message "10 bytes are uninitialized" "note how much" { target *-*-* 
} .-1 } */
+  /* { dg-message "10 bytes are uninitialized" "note how much" { target { ! 
default_packed } } .-1 } */
+  /* { dg-message "7 bytes are uninitialized" "note how much" { target 
default_packed } .-2 } */
 }
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
index 3616fbe176b3..9269b911b22f 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
@@ -51,7 +51,7 @@ struct socket {
 
 struct sco_conninfo {
__u16 hci_handle;
-   __u8  dev_class[3]; /* { dg-message "padding after field 'dev_class' is 
uninitialized \\(1 byte\\)" } */
+   __u8  dev_class[3]; /* { dg-message "padding after field 'dev_class' is 
uninitialized \\(1 byte\\)" "padding" { target { ! default_packed } } } */
 };
 
 struct sco_conn {
@@ -83,8 +83,8 @@ static int sco_sock_getsockopt_old_broken(struct socket 
*sock, int optname, char
 {
struct sock *sk = sock->sk;
/* [...snip...] */
-   struct sco_conninfo cinfo; /* { dg-message "region created on stack 
here" "where" } */
-  /* { dg-message "capacity: 6 bytes" 
"capacity" { target *-*-* } .-1 } */
+ 

Re: [PATCHv2] Fix PR 108980: note without warning due to array bounds check

2023-03-09 Thread Jakub Jelinek via Gcc-patches
On Thu, Mar 09, 2023 at 10:03:20AM -0800, Andrew Pinski via Gcc-patches wrote:
> The problem here is that after r13-4748-g2a27ae32fabf85, in some
> cases we were calling inform without a corresponding warning.
> This changes the logic so that inform is called only if a
> warning was actually issued beforehand.
> 
> Changes since
> * v1: Fix formatting and dump message as suggested by Jakub.
> 
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/108980
>   * gimple-array-bounds.cc (array_bounds_checker::check_array_ref):
>   Reorganize the call to warning for not strict flexible arrays
>   to be before the check of warned.
> ---
>  gcc/gimple-array-bounds.cc | 41 --
>  1 file changed, 26 insertions(+), 15 deletions(-)

It would be nice to have a testcase with dg-bogus for the messages,
but seems we don't have one in the PR, so ok for trunk.

Jakub



[PATCH] Fortran: fix ICE with bind(c) in block data [PR104332]

2023-03-09 Thread Harald Anlauf via Gcc-patches
Dear all,

the attached almost obvious patch fixes a NULL pointer dereference
in a check of a symbol with the bind(c) attribute.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

This PR is marked as 10/11/12/13 regression, thus it should
qualify for a backport.  It's simple enough anyway.
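
A minimal sketch of the guarded condition, written in C++ with
simplified, hypothetical stand-ins for the gfortran structures (the
type and field names below are assumptions for illustration only):

```cpp
#include <cassert>  // for the assert checks mentioned in the note below

// Hypothetical, reduced model of the data involved in the check.
enum Flavor { FL_UNKNOWN, FL_MODULE, FL_BLOCK_DATA };

struct ProcName { Flavor flavor; };

struct Symbol {
  const ProcName *proc_name;  // may be null
  bool in_common;
};

// True when a BIND(C) variable must be rejected: it is neither declared
// at module level nor part of a COMMON block.  The null test comes first,
// so proc_name is never dereferenced when it is null.
bool bind_c_is_invalid (const Symbol &sym)
{
  bool at_module_level = sym.proc_name && sym.proc_name->flavor == FL_MODULE;
  return !at_module_level && !sym.in_common;
}
```

With a null proc_name and in_common false the function returns true
instead of crashing, e.g. assert (bind_c_is_invalid (Symbol{nullptr, false})).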

Thanks,
Harald

From ef96d7d360c088d68e3b405401bdb8b589d562f2 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 9 Mar 2023 18:59:08 +0100
Subject: [PATCH] Fortran: fix ICE with bind(c) in block data [PR104332]

gcc/fortran/ChangeLog:

	PR fortran/104332
	* resolve.cc (resolve_symbol): Avoid NULL pointer dereference while
	checking a symbol with the BIND(C) attribute.

gcc/testsuite/ChangeLog:

	PR fortran/104332
	* gfortran.dg/bind_c_usage_34.f90: New test.
---
 gcc/fortran/resolve.cc|  4 ++--
 gcc/testsuite/gfortran.dg/bind_c_usage_34.f90 | 21 +++
 2 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/bind_c_usage_34.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 2780c82c798..46585879ddc 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -15933,8 +15933,8 @@ resolve_symbol (gfc_symbol *sym)

   /* First, make sure the variable is declared at the
 	 module-level scope (J3/04-007, Section 15.3).	*/
-  if (sym->ns->proc_name->attr.flavor != FL_MODULE &&
-  sym->attr.in_common == 0)
+  if (!(sym->ns->proc_name && sym->ns->proc_name->attr.flavor == FL_MODULE)
+	  && !sym->attr.in_common)
 	{
 	  gfc_error ("Variable %qs at %L cannot be BIND(C) because it "
 		 "is neither a COMMON block nor declared at the "
diff --git a/gcc/testsuite/gfortran.dg/bind_c_usage_34.f90 b/gcc/testsuite/gfortran.dg/bind_c_usage_34.f90
new file mode 100644
index 000..40c8e9363cf
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/bind_c_usage_34.f90
@@ -0,0 +1,21 @@
+! { dg-do compile }
+! PR fortran/104332 - ICE with bind(c) in block data
+! Contributed by G. Steinmetz
+
+block data
+  bind(c) :: a ! { dg-error "cannot be BIND\\(C\\)" }
+end
+
+block data aa
+   real, bind(c) :: a ! { dg-error "cannot be BIND\\(C\\)" }
+end
+
+block data bb
+   real:: a ! { dg-error "cannot be BIND\\(C\\)" }
+   bind(c) :: a
+end
+
+block data cc
+   common /a/ x
+   bind(c) :: /a/
+end
--
2.35.3



[PATCHv2] Fix PR 108980: note without warning due to array bounds check

2023-03-09 Thread Andrew Pinski via Gcc-patches
The problem here is that after r13-4748-g2a27ae32fabf85, in some
cases we were calling inform without a corresponding warning.
This changes the logic so that inform is called only if a
warning was actually issued beforehand.
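
As a rough illustration of the intended behaviour (not the actual GCC
code), here is a self-contained sketch. The Diagnostics type, its
warn/note members and the message texts are all hypothetical; warn
mimics warning_at in that it reports whether a warning was really
emitted:

```cpp
#include <cassert>  // for the assert checks mentioned in the note below
#include <string>
#include <vector>

// Hypothetical diagnostic sink.  warn() returns false when the warning
// is disabled, so callers can tell whether anything was printed.
struct Diagnostics {
  bool warnings_enabled;
  std::vector<std::string> log;

  bool warn (const std::string &msg)
  {
    if (!warnings_enabled)
      return false;
    log.push_back ("warning: " + msg);
    return true;
  }

  void note (const std::string &msg) { log.push_back ("note: " + msg); }
};

// Emit the follow-up note only if some warning was really issued.
bool check_access (Diagnostics &diag, bool out_of_bound)
{
  bool warned = false;
  if (out_of_bound)
    warned |= diag.warn ("array subscript is outside array bounds");
  if (warned)
    diag.note ("while referencing the array");
  return warned;
}
```

With warnings disabled the log stays empty even for an out-of-bound
access, so no note can appear without its warning.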

Changes since
* v1: Fix formatting and dump message as suggested by Jakub.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/108980
* gimple-array-bounds.cc (array_bounds_checker::check_array_ref):
Reorganize the call to warning for not strict flexible arrays
to be before the check of warned.
---
 gcc/gimple-array-bounds.cc | 41 --
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 66fd46e9b6c..34e039adca7 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -397,27 +397,38 @@ array_bounds_checker::check_array_ref (location_t 
location, tree ref,
   "of an interior zero-length array %qT")),
 low_sub, artype);
 
-  if (warned || out_of_bound)
+  if (warned && dump_file && (dump_flags & TDF_DETAILS))
 {
-  if (warned && dump_file && (dump_flags & TDF_DETAILS))
+  fprintf (dump_file, "Array bound warning for ");
+  dump_generic_expr (MSG_NOTE, TDF_SLIM, ref);
+  fprintf (dump_file, "\n");
+}
+
+   /* Issue warnings for -Wstrict-flex-arrays according to the level of
+  flag_strict_flex_arrays.  */
+  if (out_of_bound && warn_strict_flex_arrays
+  && (sam == special_array_member::trail_0
+ || sam == special_array_member::trail_1
+ || sam == special_array_member::trail_n)
+  && DECL_NOT_FLEXARRAY (afield_decl))
+{
+  bool warned1
+   = warning_at (location, OPT_Wstrict_flex_arrays,
+ "trailing array %qT should not be used as "
+ "a flexible array member",
+ artype);
+
+  if (warned1 && dump_file && (dump_flags & TDF_DETAILS))
{
- fprintf (dump_file, "Array bound warning for ");
+ fprintf (dump_file, "Trailing non flexible-like array bound warning 
for ");
  dump_generic_expr (MSG_NOTE, TDF_SLIM, ref);
  fprintf (dump_file, "\n");
}
+  warned |= warned1;
+}
 
-  /* issue warnings for -Wstrict-flex-arrays according to the level of
-flag_strict_flex_arrays.  */
-  if ((out_of_bound && warn_strict_flex_arrays)
- && (((sam == special_array_member::trail_0)
-   || (sam == special_array_member::trail_1)
-   || (sam == special_array_member::trail_n))
- && DECL_NOT_FLEXARRAY (afield_decl)))
- warned = warning_at (location, OPT_Wstrict_flex_arrays,
-  "trailing array %qT should not be used as "
-  "a flexible array member",
-  artype);
-
+  if (warned)
+{
   /* Avoid more warnings when checking more significant subscripts
 of the same expression.  */
   ref = TREE_OPERAND (ref, 0);
-- 
2.31.1



Re: [PATCH] libstdc++: Implement LWG 3820/3849 changes to cartesian_product_view

2023-03-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 8 Mar 2023 at 15:13, Patrick Palka via Libstdc++
 wrote:
>
> The LWG 3820 testcase revealed a bug in _M_advance, which this patch
> also fixes.
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK

>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges
> (cartesian_product_view::_Iterator::_Iterator): Remove
> constraint on default constructor as per LWG 3849.
> (cartesian_product_view::_Iterator::_M_prev): Adjust position
> of _Nm > 0 test as per LWG 3820.
> (cartesian_product_view::_Iterator::_M_advance): Perform bound
> checking only on sized cartesian products.
> * testsuite/std/ranges/cartesian_product/1.cc (test08): New test.
> ---
>  libstdc++-v3/include/std/ranges   | 23 +++
>  .../std/ranges/cartesian_product/1.cc |  9 
>  2 files changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index 67566c6ebcf..14f38727198 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -8225,7 +8225,7 @@ namespace views::__adaptor
> range_reference_t<__maybe_const_t<_Const, 
> _Vs>>...>;
>  using difference_type = 
> decltype(cartesian_product_view::_S_difference_type());
>
> -_Iterator() requires forward_range<__maybe_const_t<_Const, _First>> = 
> default;
> +_Iterator() = default;
>
>  constexpr
>  _Iterator(_Iterator __i)
> @@ -8390,12 +8390,12 @@ namespace views::__adaptor
>  _M_prev()
>  {
>auto& __it = std::get<_Nm>(_M_current);
> -  if (__it == ranges::begin(std::get<_Nm>(_M_parent->_M_bases)))
> -   {
> - __it = 
> __detail::__cartesian_common_arg_end(std::get<_Nm>(_M_parent->_M_bases));
> - if constexpr (_Nm > 0)
> +  if constexpr (_Nm > 0)
> +   if (__it == ranges::begin(std::get<_Nm>(_M_parent->_M_bases)))
> + {
> +   __it = 
> __detail::__cartesian_common_arg_end(std::get<_Nm>(_M_parent->_M_bases));
> _M_prev<_Nm - 1>();
> -   }
> + }
>--__it;
>  }
>
> @@ -8416,10 +8416,13 @@ namespace views::__adaptor
>   if constexpr (_Nm == 0)
> {
>  #ifdef _GLIBCXX_ASSERTIONS
> - auto __size = ranges::ssize(__r);
> - auto __begin = ranges::begin(__r);
> - auto __offset = __it - __begin;
> - __glibcxx_assert(__offset + __x >= 0 && __offset + __x <= 
> __size);
> + if constexpr (sized_range<__maybe_const_t<_Const, _First>>)
> +   {
> + auto __size = ranges::ssize(__r);
> + auto __begin = ranges::begin(__r);
> + auto __offset = __it - __begin;
> + __glibcxx_assert(__offset + __x >= 0 && __offset + __x <= 
> __size);
> +   }
>  #endif
>   __it += __x;
> }
> diff --git a/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc 
> b/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc
> index f52c2b96d58..56ff3d152c6 100644
> --- a/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc
> @@ -201,6 +201,14 @@ test07()
>VERIFY( i == 5 );
>  }
>
> +void
> +test08()
> +{
> +  // LWG 3820
> +  auto r = std::views::cartesian_product(std::views::iota(0));
> +  r.begin() += 3; // hard error
> +}
> +
>  int
>  main()
>  {
> @@ -211,4 +219,5 @@ main()
>test05();
>static_assert(test06());
>test07();
> +  test08();
>  }
> --
> 2.40.0.rc0.57.g454dfcbddf
>



Re: [PATCH] libstdc++: extraneous begin in cartesian_product_view::end [PR107572]

2023-03-09 Thread Jonathan Wakely via Gcc-patches
On Tue, 7 Mar 2023 at 20:49, Patrick Palka via Libstdc++
 wrote:
>
> On Tue, 7 Mar 2023, Patrick Palka wrote:
>
> > ranges::begin() isn't guaranteed to be equality-preserving for
> > non-forward ranges, so in cartesian_product_view::end we need to be
> > careful about calling begin() on the first range (which could be
> > non-forward) in the (non-degenerate) case where __empty_tail is false.
> >
> > Since we're already using a variadic lambda to compute __empty_tail, we
> > might as well use that same lambda to build up the tuple of iterators
> > instead of doing it via __tuple_transform.
> >
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
> >
> >   PR libstdc++/107572
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * include/std/ranges (cartesian_product_view::end): When
> >   building the tuple of iterators, avoid calling ranges::begin on
> >   the first range if __empty_tail is false.
> >   * testsuite/std/ranges/cartesian_product/1.cc (test07): New test.
> > ---
> >  libstdc++-v3/include/std/ranges   | 36 +--
> >  .../std/ranges/cartesian_product/1.cc | 22 
> >  2 files changed, 48 insertions(+), 10 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/std/ranges 
> > b/libstdc++-v3/include/std/ranges
> > index e0cac15a64f..0de7bdef504 100644
> > --- a/libstdc++-v3/include/std/ranges
> > +++ b/libstdc++-v3/include/std/ranges
> > @@ -8078,26 +8078,42 @@ namespace views::__adaptor
> >  end() requires ((!__detail::__simple_view<_First> || ... || 
> > !__detail::__simple_view<_Vs>)
> >   && __detail::__cartesian_product_is_common<_First, 
> > _Vs...>)
> >  {
> > -  bool __empty_tail = [this](index_sequence<_Is...>) {
> > - return (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
> > +  auto __it = [this](index_sequence<_Is...>) {
> > + bool __empty_tail = (ranges::empty(std::get<1 + _Is>(_M_bases)) || 
> > ...);
> > + auto& __first = std::get<0>(_M_bases);
> > + auto __first_it = __empty_tail
> > +   ? ranges::begin(__first)
> > +   : __detail::__cartesian_common_arg_end(__first);
> > + // N.B. When implementing P2165R4 this should be changed to always 
> > return tuple.
> > + if constexpr (sizeof...(_Is) == 1)
> > +   return std::make_pair(std::move(__first_it),
> > +  ranges::begin(std::get<1 + 
> > _Is>(_M_bases))...);
> > + else
> > +   return std::make_tuple(std::move(__first_it),
> > +  ranges::begin(std::get<1 + 
> > _Is>(_M_bases))...);
>
> On second thought, it might be better to use __tuple_or_pair_t here
> instead of manually determining whether to use a pair or tuple, so that
> we don't forget to adjust this site when implementing P2165R4 (which
> removes __tuple_or_pair_t):

Good idea.

OK for trunk.
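
A small sketch of the idea, for illustration only. The names
tuple_or_pair, tuple_or_pair_t and make_result are hypothetical
stand-ins, not the libstdc++ internals: the pair-versus-tuple choice is
made once in an alias, so call sites need no if constexpr branch of
their own.

```cpp
#include <cassert>  // for the assert checks mentioned in the note below
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the library-internal helper: std::pair for
// exactly two element types, std::tuple otherwise.
template<typename... Ts>
struct tuple_or_pair { using type = std::tuple<Ts...>; };

template<typename T, typename U>
struct tuple_or_pair<T, U> { using type = std::pair<T, U>; };

template<typename... Ts>
using tuple_or_pair_t = typename tuple_or_pair<Ts...>::type;

// Build the result through the alias; if the alias later changes to
// always name std::tuple, this function needs no edit.
template<typename... Ts>
tuple_or_pair_t<Ts...> make_result (Ts... vals)
{
  return tuple_or_pair_t<Ts...>{vals...};
}

static_assert (std::is_same<tuple_or_pair_t<int, char>,
			    std::pair<int, char>>::value,
	       "two elements select std::pair");
static_assert (std::is_same<tuple_or_pair_t<int, char, long>,
			    std::tuple<int, char, long>>::value,
	       "any other count selects std::tuple");
```

make_result (1, 'x') yields a std::pair, while make_result (1, 'x', 2L)
yields a std::tuple.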

>
> -- >8 --
>
> Subject: [PATCH] libstdc++: extraneous begin in cartesian_product_view::end
>  [PR107572]
>
> ranges::begin() isn't guaranteed to be equality-preserving for
> non-forward ranges, so in cartesian_product_view::end we need to avoid
> calling begin() on the first range (which could be non-forward) in the
> (non-degenerate) case where __empty_tail is false.
>
> Since we're already using a variadic lambda to compute __empty_tail, we
> might as well use that same lambda to build up the tuple of iterators
> instead of doing it separately via __tuple_transform.
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
>
> PR libstdc++/107572
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (cartesian_product_view::end): When
> building the tuple of iterators, avoid calling ranges::begin on
> the first range if __empty_tail is false.
> * testsuite/std/ranges/cartesian_product/1.cc (test07): New test.
> ---
>  libstdc++-v3/include/std/ranges   | 28 ---
>  .../std/ranges/cartesian_product/1.cc | 25 +
>  2 files changed, 43 insertions(+), 10 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index e0cac15a64f..d2ab79179ca 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -8078,26 +8078,34 @@ namespace views::__adaptor
>  end() requires ((!__detail::__simple_view<_First> || ... || 
> !__detail::__simple_view<_Vs>)
> && __detail::__cartesian_product_is_common<_First, 
> _Vs...>)
>  {
> -  bool __empty_tail = [this](index_sequence<_Is...>) {
> -   return (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
> +  auto __it = [this](index_sequence<_Is...>) {
> +   using _Ret = __detail::__tuple_or_pair_t<iterator_t<_First>,
> +iterator_t<_Vs>...>;
> +   bool __empty_tail = (ranges::empty(std::get<1 + _Is>(_M_bases)) || 
> ...);
> +   auto& __first = std::get<0>(_M_bases);
> +   return _Ret{(__empty_tail
> +  

Re: [PATCH] libstdc++: Implement P2520R0 changes to move_iterator's iterator_concept

2023-03-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 8 Mar 2023 at 16:47, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps
> backports?

Yes for all.


>
> libstdc++-v3/ChangeLog:
>
> * include/bits/stl_iterator.h (move_iterator::_S_iter_concept):
> Define.
> (__cpp_lib_move_iterator_concept): Define for C++20.
> (move_iterator::iterator_concept): Strengthen as per P2520R0.
> * include/std/version (__cpp_lib_move_iterator_concept): Define
> for C++20.
> * testsuite/24_iterators/move_iterator/p2520r0.cc: New test.
> ---
>  libstdc++-v3/include/bits/stl_iterator.h  | 20 +-
>  libstdc++-v3/include/std/version  |  1 +
>  .../24_iterators/move_iterator/p2520r0.cc | 37 +++
>  3 files changed, 57 insertions(+), 1 deletion(-)
>  create mode 100644 
> libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc
>
> diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
> b/libstdc++-v3/include/bits/stl_iterator.h
> index c20dc9ecac5..a6a09dbac16 100644
> --- a/libstdc++-v3/include/bits/stl_iterator.h
> +++ b/libstdc++-v3/include/bits/stl_iterator.h
> @@ -1465,11 +1465,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> && convertible_to;
>  #endif
>
> +#if __cplusplus > 201703L && __cpp_lib_concepts
> +  static auto
> +  _S_iter_concept()
> +  {
> +   if constexpr (random_access_iterator<_Iterator>)
> + return random_access_iterator_tag{};
> +   else if constexpr (bidirectional_iterator<_Iterator>)
> + return bidirectional_iterator_tag{};
> +   else if constexpr (forward_iterator<_Iterator>)
> + return forward_iterator_tag{};
> +   else
> + return input_iterator_tag{};
> +  }
> +#endif
> +
>  public:
>using iterator_type = _Iterator;
>
>  #if __cplusplus > 201703L && __cpp_lib_concepts
> -  using iterator_concept = input_iterator_tag;
> +  // This is P2520R0, a C++23 change, but we treat it as a DR against 
> C++20.
> +# define __cpp_lib_move_iterator_concept 202207L
> +  using iterator_concept = decltype(_S_iter_concept());
> +
>// iterator_category defined in __move_iter_cat
>using value_type = iter_value_t<_Iterator>;
>using difference_type = iter_difference_t<_Iterator>;
> diff --git a/libstdc++-v3/include/std/version 
> b/libstdc++-v3/include/std/version
> index 871d30db5b3..25ebfc3e512 100644
> --- a/libstdc++-v3/include/std/version
> +++ b/libstdc++-v3/include/std/version
> @@ -289,6 +289,7 @@
>  #define __cpp_lib_polymorphic_allocator 201902L
>  #if __cpp_lib_concepts
>  # define __cpp_lib_ranges 202110L
> +# define __cpp_lib_move_iterator_concept 202207L
>  #endif
>  #if __cpp_lib_atomic_wait || _GLIBCXX_HAVE_POSIX_SEMAPHORE
>  # define __cpp_lib_semaphore 201907L
> diff --git a/libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc 
> b/libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc
> new file mode 100644
> index 000..883d6cc09e0
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/24_iterators/move_iterator/p2520r0.cc
> @@ -0,0 +1,37 @@
> +// { dg-options "-std=gnu++20" }
> +// { dg-do compile { target c++20 } }
> +
> +// Verify P2520R0 changes to move_iterator's iterator_concept, which we treat
> +// as a DR against C++20.
> +
> +#include 
> +#if __cpp_lib_move_iterator_concept != 202207L
> +# error "Feature-test macro __cpp_lib_move_iterator_concept has wrong value 
> in "
> +#endif
> +
> +#undef __cpp_lib_move_iterator_concept
> +#include 
> +#if __cpp_lib_move_iterator_concept != 202207L
> +# error "Feature-test macro __cpp_lib_move_iterator_concept has wrong value 
> in "
> +#endif
> +
> +#include 
> +
> +using __gnu_test::test_input_range;
> +using __gnu_test::test_forward_range;
> +using __gnu_test::test_bidirectional_range;
> +using __gnu_test::test_random_access_range;
> +
> +using ty1 = 
> std::move_iterator&>().begin())>;
> +static_assert(std::same_as);
> +
> +using ty2 = 
> std::move_iterator&>().begin())>;
> +static_assert(std::same_as std::forward_iterator_tag>);
> +
> +using ty3 = 
> std::move_iterator&>().begin())>;
> +static_assert(std::same_as std::bidirectional_iterator_tag>);
> +
> +using ty4 = 
> std::move_iterator&>().begin())>;
> +static_assert(std::same_as std::random_access_iterator_tag>);
> +
> +static_assert(std::random_access_iterator>);
> --
> 2.40.0.rc0.57.g454dfcbddf
>



Re: [PATCH] libstdc++: Implement LWG 3715 changes to view_interface::empty

2023-03-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 8 Mar 2023 at 15:53, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK. I think this would make sense for 12 too.
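
To illustrate what "use ranges::size when the range is sized" buys, a
C++17 sketch with hypothetical names (view_base, has_size, counted and
pointer_range are made up here, and the detection idiom replaces the
real concepts): empty() asks for the size when one is available and
only falls back to comparing begin() with end() otherwise.

```cpp
#include <cassert>  // for the assert checks mentioned in the note below
#include <cstddef>
#include <type_traits>
#include <utility>

// Detect a const-callable size() member (assumed stand-in for sized_range).
template<typename T, typename = void>
struct has_size : std::false_type {};

template<typename T>
struct has_size<T, std::void_t<decltype (std::declval<const T&> ().size ())>>
  : std::true_type {};

// CRTP base in the spirit of view_interface, reduced to empty().
template<typename Derived>
struct view_base {
  bool empty () const
  {
    const Derived &self = static_cast<const Derived &> (*this);
    if constexpr (has_size<Derived>::value)
      return self.size () == 0;		  // preferred: no iterators touched
    else
      return self.begin () == self.end (); // fallback for unsized ranges
  }
};

// Sized range that records how often begin() is called.
struct counted : view_base<counted> {
  std::size_t n;
  mutable int begin_calls = 0;
  explicit counted (std::size_t n) : n (n) {}
  std::size_t size () const { return n; }
  const int *begin () const { ++begin_calls; return nullptr; }
  const int *end () const { return nullptr; }
};

// Range without size(); empty() has to compare iterators.
struct pointer_range : view_base<pointer_range> {
  const int *first, *last;
  pointer_range (const int *f, const int *l) : first (f), last (l) {}
  const int *begin () const { return first; }
  const int *end () const { return last; }
};
```

For a counted object, empty() answers from size() alone and begin_calls
stays at 0, which matters when begin() is not safe to call repeatedly.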

>
> libstdc++-v3/ChangeLog:
>
> * include/bits/ranges_util.h (view_interface::empty): Add
> preferred overloads that use ranges::size when the range is
> sized as per LWG 3715.
> * testsuite/std/ranges/adaptors/lwg3715.cc: New test.
> ---
>  libstdc++-v3/include/bits/ranges_util.h   | 16 +++--
>  .../testsuite/std/ranges/adaptors/lwg3715.cc  | 33 +++
>  2 files changed, 47 insertions(+), 2 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/std/ranges/adaptors/lwg3715.cc
>
> diff --git a/libstdc++-v3/include/bits/ranges_util.h 
> b/libstdc++-v3/include/bits/ranges_util.h
> index e4643e31a20..880a0ce0143 100644
> --- a/libstdc++-v3/include/bits/ranges_util.h
> +++ b/libstdc++-v3/include/bits/ranges_util.h
> @@ -97,15 +97,27 @@ namespace ranges
>constexpr bool
>empty()
>noexcept(noexcept(_S_empty(_M_derived())))
> -  requires forward_range<_Derived>
> +  requires forward_range<_Derived> && (!sized_range<_Derived>)
>{ return _S_empty(_M_derived()); }
>
> +  constexpr bool
> +  empty()
> +  noexcept(noexcept(ranges::size(_M_derived()) == 0))
> +  requires sized_range<_Derived>
> +  { return ranges::size(_M_derived()) == 0; }
> +
>constexpr bool
>empty() const
>noexcept(noexcept(_S_empty(_M_derived())))
> -  requires forward_range<const _Derived>
> +  requires forward_range<const _Derived> && (!sized_range<const _Derived>)
>{ return _S_empty(_M_derived()); }
>
> +  constexpr bool
> +  empty() const
> +  noexcept(noexcept(ranges::size(_M_derived()) == 0))
> +  requires sized_range<const _Derived>
> +  { return ranges::size(_M_derived()) == 0; }
> +
>constexpr explicit
>operator bool() noexcept(noexcept(ranges::empty(_M_derived())))
>requires requires { ranges::empty(_M_derived()); }
> diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/lwg3715.cc 
> b/libstdc++-v3/testsuite/std/ranges/adaptors/lwg3715.cc
> new file mode 100644
> index 000..96ee7087be0
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/std/ranges/adaptors/lwg3715.cc
> @@ -0,0 +1,33 @@
> +// { dg-options "-std=gnu++23" }
> +// { dg-do run { target c++23 } }
> +
> +// Verify LWG 3715 changes.
> +
> +#include 
> +#include 
> +#include 
> +
> +void
> +test01()
> +{
> +  std::istringstream ints("0 1 2 3 4");
> +  auto i = std::views::istream(ints);
> +  auto r4 = std::views::counted(i.begin(), 4) | std::views::chunk(2);
> +  VERIFY( !r4.empty() );
> +}
> +
> +void
> +test02()
> +{
> +  std::istringstream ints("0 1 2 3 4");
> +  auto i = std::views::istream(ints);
> +  auto r0 = std::views::counted(i.begin(), 0) | std::views::chunk(2);
> +  VERIFY( r0.empty() );
> +}
> +
> +int
> +main()
> +{
> +  test01();
> +  test02();
> +}
> --
> 2.40.0.rc0.57.g454dfcbddf
>



Re: [PATCH] libstdc++: Make views::single/iota/istream SFINAE-friendly [PR108362]

2023-03-09 Thread Jonathan Wakely via Gcc-patches
On Wed, 8 Mar 2023 at 14:36, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 12?

Yes, OK for trunk and 12.

>
> PR libstdc++/108362
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (__detail::__can_single_view): New concept.
> (_Single::operator()): Constrain it.  Move [[nodiscard]] to the
> end of the function declarator.
> (__detail::__can_iota_view): New concept.
> (_Iota::operator()): Constrain it.  Move [[nodiscard]] to the
> end of the function declarator.
> (__detail::__can_istream_view): New concept.
> (_Istream::operator()): Constrain it.  Move [[nodiscard]] to the
> end of the function declarator.
> * testsuite/std/ranges/iota/iota_view.cc (test07): New test.
> * testsuite/std/ranges/istream_view.cc (test08): New test.
> * testsuite/std/ranges/single_view.cc (test07): New test.
> ---
>  libstdc++-v3/include/std/ranges   | 40 ++-
>  .../testsuite/std/ranges/iota/iota_view.cc| 10 +
>  .../testsuite/std/ranges/istream_view.cc  | 12 ++
>  .../testsuite/std/ranges/single_view.cc   | 13 ++
>  4 files changed, 65 insertions(+), 10 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index 0a65d74bb5b..67566c6ebcf 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -675,30 +675,41 @@ namespace views
>template
>  inline constexpr empty_view<_Tp> empty{};
>
> +  namespace __detail
> +  {
> +template
> +  concept __can_single_view
> +   = requires { single_view>(std::declval<_Tp>()); };
> +  } // namespace __detail
> +
>struct _Single
>{
> -template
> -  [[nodiscard]]
> +template<__detail::__can_single_view _Tp>
>constexpr auto
> -  operator()(_Tp&& __e) const
> +  operator() [[nodiscard]] (_Tp&& __e) const
>noexcept(noexcept(single_view>(std::forward<_Tp>(__e
>{ return single_view>(std::forward<_Tp>(__e)); }
>};
>
>inline constexpr _Single single{};
>
> +  namespace __detail
> +  {
> +template
> +  concept __can_iota_view = requires { 
> iota_view(std::declval<_Args>()...); };
> +  } // namespace __detail
> +
>struct _Iota
>{
> -template
> -  [[nodiscard]]
> +template<__detail::__can_iota_view _Tp>
>constexpr auto
> -  operator()(_Tp&& __e) const
> +  operator() [[nodiscard]] (_Tp&& __e) const
>{ return iota_view(std::forward<_Tp>(__e)); }
>
>  template
> -  [[nodiscard]]
> +  requires __detail::__can_iota_view<_Tp, _Up>
>constexpr auto
> -  operator()(_Tp&& __e, _Up&& __f) const
> +  operator() [[nodiscard]] (_Tp&& __e, _Up&& __f) const
>{ return iota_view(std::forward<_Tp>(__e), std::forward<_Up>(__f)); }
>};
>
> @@ -796,13 +807,22 @@ namespace views
>
>  namespace views
>  {
> +  namespace __detail
> +  {
> +template
> +concept __can_istream_view
> +  = requires (_Up __e) {
> +   basic_istream_view<_Tp, typename _Up::char_type, typename 
> _Up::traits_type>(__e);
> +  };
> +  };
> +
>template
>  struct _Istream
>  {
>template
> -   [[nodiscard]]
> constexpr auto
> -   operator()(basic_istream<_CharT, _Traits>& __e) const
> +   operator() [[nodiscard]] (basic_istream<_CharT, _Traits>& __e) const
> +   requires __detail::__can_istream_view<_Tp, 
> std::remove_reference_t>
> { return basic_istream_view<_Tp, _CharT, _Traits>(__e); }
>  };
>
> diff --git a/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc 
> b/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
> index 2dd17113536..0d2eaf1d0c2 100644
> --- a/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/iota/iota_view.cc
> @@ -110,6 +110,15 @@ test06()
>VERIFY( std::ranges::equal(v3, w3) );
>  }
>
> +template
> +void
> +test07()
> +{
> +  // Verify SFINAE behavior.
> +  static_assert(!requires { iota(nullptr); });
> +  static_assert(!requires { iota(nullptr, nullptr); });
> +}
> +
>  int
>  main()
>  {
> @@ -119,4 +128,5 @@ main()
>test04();
>test05();
>test06();
> +  test07();
>  }
> diff --git a/libstdc++-v3/testsuite/std/ranges/istream_view.cc 
> b/libstdc++-v3/testsuite/std/ranges/istream_view.cc
> index 26f109fdeaa..cc1c3e006b9 100644
> --- a/libstdc++-v3/testsuite/std/ranges/istream_view.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/istream_view.cc
> @@ -115,6 +115,17 @@ test07()
>VERIFY( sum == 10 );
>  }
>
> +template
> +concept can_istream_view = requires (U u) { views::istream(u); };
> +
> +void
> +test08()
> +{
> +  // Verify SFINAE behavior.
> +  struct S { };
> +  static_assert(!can_istream_view);
> +}
> +
>  int
>  main()
>  {
> @@ -125,4 +136,5 @@ main()
>test05();
>test06();
>test07();
> +  test08();
>  }
> diff 

Re: AArch64 bfloat16 mangling

2023-03-09 Thread Richard Sandiford via Gcc-patches
Sorry for the slow response.

Jakub Jelinek  writes:
> Hi!
>
> On Mon, Jan 30, 2023 at 11:07:23PM +, Richard Sandiford wrote:
>> Jakub Jelinek  writes:
>> > https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605965.html
>> >   - ABI - aarch64: Add bfloat16_t support for aarch64 (enabling it in GCC 
>> > 14
>> > will be harder)
>> 
>> Sorry for the delay on this.  There's still an ongoing debate about
>> whether to keep the current AArch64 mangling or switch to the new one.
>
> If it helps, I'll try to repeat the options I see:
> 1) don't do anything right now; problem is if it is done later (GCC 14+),
>libstdc++ would need to conditionalize the std::bfloat16_t RTTI symbols,
>have them in one symbol version for x86 and in another for aarch64
> 2) similarly to x86 __bf16 would be the underlying type for std::bfloat16_t
>where the latter needs to act as usable extended floating point type with
>all arithmetics, mangling is DF16b which is how std::bfloat16_t should
>mangle according to the Itanitum ABI pull request; decltype (0.0bf16) is
>__bf16; disadvantage is that existing code using __bf16 in argument
>passing and templates changes mangling
> 3) keep __bf16 as is with its u6__bf16 mangling and use for std::bfloat16_t
>a distinct type (the latter would be the bfloat16_type_node);
>decltype (0.0bf16) would be that new type which would mangle DF16b and
>would allow arithmetics/casts etc.  How exactly would the new type be
>named is up to you (__bfloat16_t, __bfloat16, __std_bfloat16_t,
>whatever else); in theory it could be created without a user accessible
>name as well; libstdc++ only uses decltype (0.0bf16) to get at it
> 4) like 3), including keeping the mangling of __bf16 as u6__bf16, but
>make also __bf16 a usable arithmetic type, not just a storage only type;
>for C++ FE it would be simply another non-standard type like say
>__float128 is on x86
> 5) like 2), but make the mangling of __bf16 depend on flag_abi_version;
>flag_abi_version >= 18 (aka GCC 13+ ABI) mangles it as DF16b,
>flag_abi_version < 18 mangles it as u6__bf16; the default for
>-fabi-compat-version= is I think GCC 8 ABI compatibility, so GCC normally
>emits mangling aliases, so say void foo (std::bfloat16_t) {} would
>mangle as _Z3fooDF16b and for a few years there would be
>an alias _Z3foou6__bf16 to it
>
> Of course, it is possible I've missed some options.
>
>   Jakub

We decided to keep the current mangling of __bf16 and use it for
std::bfloat16_t too.  __bf16 will become a non-standard arithmetic type.
This will be an explicit divergence from the Itanium ABI.

I think that's equivalent to your (2) without the part about following
the Itanium ABI.
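
To make the two candidate manglings concrete, here is a minimal sketch of how the names quoted above (_Z3fooDF16b and _Z3foou6__bf16) are assembled.  It only covers a non-member, non-template function with a single parameter; the helper names are hypothetical and are not GCC's mangler.

```cpp
#include <string>

// Itanium-style <source-name>: decimal length followed by the identifier.
std::string source_name(const std::string& id) {
  return std::to_string(id.size()) + id;
}

// Vendor extended type: 'u' followed by a <source-name>, e.g. u6__bf16.
std::string vendor_extended_type(const std::string& id) {
  return "u" + source_name(id);
}

// Mangled name of a global, non-template function taking one parameter
// whose type encoding is already known (hypothetical helper).
std::string mangle_unary_function(const std::string& name,
                                  const std::string& param_encoding) {
  return "_Z" + source_name(name) + param_encoding;
}
```

With the decision above, void foo (std::bfloat16_t) on AArch64 would use the vendor encoding, i.e. mangle_unary_function("foo", vendor_extended_type("__bf16")), rather than the DF16b encoding.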

Thanks,
Richard


Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Michael Meissner via Gcc-patches
On Fri, Mar 03, 2023 at 03:35:44PM -0600, Segher Boessenkool wrote:
> > +complex_multiply_builtin_code (machine_mode mode)
> > +{
> > +  return (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + mode
> > + - MIN_MODE_COMPLEX_FLOAT);
> > +}
> 
> There should be an assert that the mode is as expected
>   gcc_assert (IN_RANGE (mode, MIN_MODE_COMPLEX_FLOAT, 
> MAX_MODE_COMPLEX_FLOAT));
> or such.
> 
> Using more temporaries should make this simpler as well, obviate the
> need for explicit casts, and make everything fit on short lines.

While I can use a temporary to shorten the line, I can't eliminate the cast, or
I'll get a warning about implicit conversion from int to the enum
built_in_function.  Here is what I will use:

static inline built_in_function
complex_multiply_builtin_code (machine_mode mode)
{
  gcc_assert (IN_RANGE (mode, MIN_MODE_COMPLEX_FLOAT, MAX_MODE_COMPLEX_FLOAT));
  int func = BUILT_IN_COMPLEX_MUL_MIN + mode - MIN_MODE_COMPLEX_FLOAT;
  return (built_in_function) func;
}
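
For readers outside the GCC tree, the same pattern can be sketched standalone.  The enums below are hypothetical stand-ins for machine_mode and built_in_function (the real ones are generated), and static_cast is used where the function above uses a C-style cast; the point is only that C++ converts enum to int implicitly but not int back to enum.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's machine_mode and built_in_function.
enum machine_mode_sketch {
  QImode_s, SImode_s, SCmode_s, DCmode_s, TCmode_s,
  MIN_COMPLEX = SCmode_s,
  MAX_COMPLEX = TCmode_s
};

enum builtin_sketch {
  BUILTIN_OTHER, BUILTIN_MUL_MIN, BUILTIN_MUL_DC, BUILTIN_MUL_TC
};

inline builtin_sketch
complex_multiply_code (machine_mode_sketch mode)
{
  // Range check first, mirroring the gcc_assert above.
  assert (mode >= MIN_COMPLEX && mode <= MAX_COMPLEX);
  // Unscoped enums promote to int, so the arithmetic needs no cast ...
  int offset = mode - MIN_COMPLEX;
  int func = BUILTIN_MUL_MIN + offset;
  // ... but converting the int back to the enum must be explicit.
  return static_cast<builtin_sketch> (func);
}
```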

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[committed] libstdc++: Really fix symver for __gnu_cxx11_ieee128::__try_use_facet [PR108882]

2023-03-09 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

PR libstdc++/108882
* config/os/gnu-linux/ldbl-ieee128-extra.ver: Fix incorrect
patterns.
---
 libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver 
b/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
index 20d87a5e373..5f1d9a39264 100644
--- a/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
+++ b/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
@@ -53,8 +53,8 @@ GLIBCXX_IEEE128_3.4.30 {
 GLIBCXX_IEEE128_3.4.31 {
   _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1287num_get*;
   _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1287num_put*;
-  _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1289money_get*;
-  _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1289money_put*;
+  _ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_get*;
+  _ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_put*;
   
_ZNSt19__gnu_cxx11_ieee1289money_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE2idE;
   
_ZNSt19__gnu_cxx11_ieee1289money_putI[cw]St19ostreambuf_iteratorI[cw]St11char_traitsI[cw]EEE2idE;
 } GLIBCXX_3.4.31;
-- 
2.39.2



Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Michael Meissner via Gcc-patches
On Fri, Mar 03, 2023 at 03:35:44PM -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Feb 03, 2023 at 12:53:05AM -0500, Michael Meissner wrote:
> > This patch reworks how the complex multiply and divide built-in functions 
> > are
> > done.
> 
> > I tested all 3 patches for PR target/107299 on:
> 
> Is this part of the proposed commit message?  As Ke Wen pointed out, it
> is wrong.  Most of your mail does not belong in a commit message at all,
> but some probably does?  Please do this clearer with future patches.
> 
> > * config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
> > (init_float128_ieee): Delete code to switch complex multiply and divide
> > for long double.
> 
> I like this kind of patch :-)
> 
> > +/* Internal function to return the built-in function id for the complex
> > +   multiply operation for a given mode.  */
> > +
> > +static inline built_in_function
> > +complex_multiply_builtin_code (machine_mode mode)
> > +{
> > +  return (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + mode
> > + - MIN_MODE_COMPLEX_FLOAT);
> > +}
> 
> There should be an assert that the mode is as expected
>   gcc_assert (IN_RANGE (mode, MIN_MODE_COMPLEX_FLOAT, 
> MAX_MODE_COMPLEX_FLOAT));
> or such.

Ok.

> Using more temporaries should make this simpler as well, obviate the
> need for explicit casts, and make everything fit on short lines.
> 
> > +static inline built_in_function
> > +complex_divide_builtin_code (machine_mode mode)
> > +{
> > +  return (built_in_function) (BUILT_IN_COMPLEX_DIV_MIN + mode
> > + - MIN_MODE_COMPLEX_FLOAT);
> > +}
> 
> Ditto ofc.
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/divic3-1.c
> > @@ -0,0 +1,18 @@
> > +/* { dg-do compile { target { powerpc*-*-* } } } */
> 
> Leave the target clause out.

Ok.

> > +/* { dg-require-effective-target powerpc_p8vector_ok } */
> > +/* { dg-require-effective-target longdouble128 } */
> > +/* { dg-require-effective-target ppc_float128_sw } */
> > +/* { dg-options "-O2 -mpower8-vector -mabi=ieeelongdouble -Wno-psabi" } */
> 
> It would be nice if you did not try to add -mpower8-vector in more
> testcases :-(

Yep.

> Is -Wno-psabi needed here?  What is the error you get without it / on
> which configurations?  Cargo-culting hiding the warnings makes you see
> fewer warnings, but that is the opposite of a good idea.
> 
> > +/* { dg-final { scan-assembler "bl __divtc3" } } */
> 
> This name depends on what object format and ABI is in use (some have an
> extra leading underscore, or a dot, or whatever).

Yes it is needed if GCC is configured against an older GLIBC before the full
IEEE 128-bit support was added.  For example, on my big endian test system, you
get warnings if you switch the floating point format.  I would imagine it would
also fail on little endian system with older libraries.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [v4][PATCH 1/2] Handle component_ref to a structre/union field including C99 FAM [PR101832]

2023-03-09 Thread Qing Zhao via Gcc-patches


> On Mar 9, 2023, at 7:20 AM, Richard Biener  wrote:
> 
> On Fri, 24 Feb 2023, Qing Zhao wrote:
> 
>> GCC extension accepts the case when a struct with a C99 flexible array member
>> is embedded into another struct or union (possibly recursively).
>> __builtin_object_size should treat such struct as flexible size.
>> 
>> gcc/c/ChangeLog:
>> 
>>  PR tree-optimization/101832
>>  * c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for
>>  struct/union type.
> 
> I can't really comment on the correctness of this part but since
> only the C frontend will ever set this and you are using it from
> addr_object_size which is also used for other C family languages
> (at least), I wonder if you can really test
> 
> +   if (!TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (v)))
> 
> there.

You mean for C++ and also other C family languages (other than C), the above 
bit cannot be set?
Yes, that’s true. The bit is only set for C. So is the bit DECL_NOT_FLEXARRAY, 
which is only set for C too. 
So, I am wondering whether the bit DECL_NOT_FLEXARRAY should be also set in 
middle end? Or we can set DECL_NOT_FLEXARRAY in C++ FE too? And then we can set 
TYPE_INCLUDE_FLEXARRAY also in C++ FE?
What’s your suggestion?

(I personally feel that DECL_NOT_FLEXARRAY and TYPE_INCLUDE_FLEXARRAY should be 
set in the same places).

> 
> Originally I was suggesting to set this flag in stor-layout.cc
> which eventually all languages funnel their types through and
> if there's language specific handling use a langhook (with the
> default implementation preserving the status quo).

If we decide to set the bits in stor-layout.cc, where is the best place to do 
it? I checked the stor-layout.cc code, looks like “layout_type” might be the 
place where we can set these bits for RECORD_TYPE, UNION_TYPE? 
> 
> Some more comments below ...
> 
>> gcc/cp/ChangeLog:
>> 
>>  PR tree-optimization/101832
>>  * module.cc (trees_out::core_bools): Stream out new bit
>>  type_include_flexarray.
>>  (trees_in::core_bools): Stream in new bit type_include_flexarray.
>> 
>> gcc/ChangeLog:
>> 
>>  PR tree-optimization/101832
>>  * print-tree.cc (print_node): Print new bit type_include_flexarray.
>>  * tree-core.h (struct tree_type_common): New bit
>>  type_include_flexarray.
>>  * tree-object-size.cc (addr_object_size): Handle structure/union type
>>  when it has flexible size.
>>  * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream
>>  in new bit type_include_flexarray.
>>  * tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream
>>  out new bit type_include_flexarray.
>>  * tree.h (TYPE_INCLUDE_FLEXARRAY): New macro
>>  TYPE_INCLUDE_FLEXARRAY.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  PR tree-optimization/101832
>>  * gcc.dg/builtin-object-size-pr101832.c: New test.
>> ---
>> gcc/c/c-decl.cc   |  12 ++
>> gcc/cp/module.cc  |   2 +
>> gcc/print-tree.cc |   5 +
>> .../gcc.dg/builtin-object-size-pr101832.c | 134 ++
>> gcc/tree-core.h   |   4 +-
>> gcc/tree-object-size.cc   |  79 +++
>> gcc/tree-streamer-in.cc   |   1 +
>> gcc/tree-streamer-out.cc  |   1 +
>> gcc/tree.h|   6 +
>> 9 files changed, 215 insertions(+), 29 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
>> 
>> diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
>> index 08078eadeb8..f589a2f5192 100644
>> --- a/gcc/c/c-decl.cc
>> +++ b/gcc/c/c-decl.cc
>> @@ -9284,6 +9284,18 @@ finish_struct (location_t loc, tree t, tree 
>> fieldlist, tree attributes,
>>   /* Set DECL_NOT_FLEXARRAY flag for FIELD_DECL x.  */
>>   DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, 
>> x);
>> 
>> +  /* Set TYPE_INCLUDE_FLEXARRAY for the context of x, t
>> +   * when x is an array.  */
>> +  if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
>> +TYPE_INCLUDE_FLEXARRAY (t) = flexible_array_member_type_p (TREE_TYPE 
>> (x)) ;
>> +  /* Recursively set TYPE_INCLUDE_FLEXARRAY for the context of x, t
>> + when x is the last field.  */
>> +  else if ((TREE_CODE (TREE_TYPE (x)) == RECORD_TYPE
>> +|| TREE_CODE (TREE_TYPE (x)) == UNION_TYPE)
>> +   && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x))
>> +   && is_last_field)
>> +TYPE_INCLUDE_FLEXARRAY (t) = true;
>> +
>>   if (DECL_NAME (x)
>>|| RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
>>  saw_named_field = true;
>> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
>> index ac2fe66b080..c750361b704 100644
>> --- a/gcc/cp/module.cc
>> +++ b/gcc/cp/module.cc
>> @@ -5371,6 +5371,7 @@ trees_out::core_bools (tree t)
>>   WB (t->type_common.lang_flag_5);
>>   WB (t->type_common.lang_flag_6);
>>   WB 

Re: [PATCH RFC 1/3] c++: add __is_deducible trait [PR105841]

2023-03-09 Thread Jason Merrill via Gcc-patches

On 2/20/23 11:58, Patrick Palka wrote:

On Sat, 18 Feb 2023, Jason Merrill via Gcc-patches wrote:


Tested x86_64-pc-linux-gnu.  Since this is fixing experimental (C++20)
functionality, I think it's reasonable to apply now; I'm interested in other
opinions, and thoughts about the user-facing functionality.  I'm thinking to
make it internal-only for GCC 13 at least by adding a space in the name, but
does this look useful to the library?


IIUC this looks like a generalization of an __is_specialization_of trait
that returns whether a type is a specialization of a given class template,
which seems potentially useful for the library to me.  We already define
some ad-hoc predicates for testing this, e.g. __is_reverse_view,
__is_span etc in  as well as a more general __is_specialization_of
in  for templates that take only type arguments.  Using a built-in
trait should be more efficient.
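
As a point of comparison, a library-side predicate of the kind described here, restricted to templates that take only type arguments, might look like the sketch below.  The name is_specialization_of is illustrative and is not claimed to match the libstdc++ internals.

```cpp
#include <vector>

// Primary template: by default T is not a specialization of Tmpl.
template<typename T, template<typename...> class Tmpl>
inline constexpr bool is_specialization_of = false;

// Partial specialization: matches exactly when T is Tmpl<Args...>.
template<template<typename...> class Tmpl, typename... Args>
inline constexpr bool is_specialization_of<Tmpl<Args...>, Tmpl> = true;

static_assert(is_specialization_of<std::vector<int>, std::vector>);
static_assert(!is_specialization_of<int, std::vector>);
```

A template with a non-type parameter (std::array, for instance) cannot be passed as Tmpl here, which is the limitation a built-in trait would avoid.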

[...]

Since the first argument of a TRAIT_EXPR can now be a TEMPLATE_DECL, I
suppose cp_tree_equal needs to be changed too.


>[...]


For sake of the __is_specialization_of use case, I wonder if it'd
be possible to have a "fast path" that avoids deduction/coercion when
the given template is a class template?


Thanks, done.  I've also fixed array bounds type deduction and added 
more comments about the relationship of the implementation and the 
specification in terms of partial specialization.


The second patch makes it internal-only for GCC 13; you can revert that 
if you want to experiment with using it in the library.


Tested x86_64-pc-linux-gnu, applying to trunk.

From 81f820cff3316cea454ba81dc38ddf55b1afa852 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Thu, 9 Feb 2023 12:51:51 -0800
Subject: [PATCH] c++: add __is_deducible trait [PR105841]
To: gcc-patches@gcc.gnu.org

C++20 class template argument deduction for an alias template involves
adding a constraint that the template arguments for the alias template can
be deduced from the return type of the deduction guide for the underlying
class template.  In the standard, this is modeled as defining a class
template with a partial specialization, but it's much more efficient to
implement with a trait that directly tries to perform the deduction.

The first argument to the trait is a template rather than a type, so various
places needed to be adjusted to accommodate that.

	PR c++/105841

gcc/ChangeLog:

	* doc/extend.texi (Type Traits): Document __is_deducible.

gcc/cp/ChangeLog:

	* cp-trait.def (IS_DEDUCIBLE): New.
	* cxx-pretty-print.cc (pp_cxx_trait): Handle non-type.
	* parser.cc (cp_parser_trait): Likewise.
	* tree.cc (cp_tree_equal): Likewise.
	* pt.cc (tsubst_copy_and_build): Likewise.
	(type_targs_deducible_from): New.
	(alias_ctad_tweaks): Use it.
	* semantics.cc (trait_expr_value): Handle CPTK_IS_DEDUCIBLE.
	(finish_trait_expr): Likewise.
	* constraint.cc (diagnose_trait_expr): Likewise.
	* cp-tree.h (type_targs_deducible_from): Declare.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/is_deducible1.C: New test.
---
 gcc/doc/extend.texi  |  4 ++
 gcc/cp/cp-tree.h |  1 +
 gcc/cp/constraint.cc |  3 ++
 gcc/cp/cxx-pretty-print.cc   |  5 +-
 gcc/cp/parser.cc | 20 +--
 gcc/cp/pt.cc | 69 
 gcc/cp/semantics.cc  | 11 
 gcc/cp/tree.cc   |  2 +-
 gcc/testsuite/g++.dg/ext/is_deducible1.C | 31 +++
 gcc/cp/cp-trait.def  |  1 +
 10 files changed, 131 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_deducible1.C

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index c1122916255..b64a85722db 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -25213,6 +25213,10 @@ type.  A diagnostic is produced if this requirement is not met.
 If @code{type} is a cv-qualified class type, and not a union type
 ([basic.compound]) the trait is @code{true}, else it is @code{false}.
 
+@item __is_deducible (template, type)
+If template arguments for @code{template} can be deduced from
+@code{type} or obtained from default template arguments.
+
 @item __is_empty (type)
 If @code{__is_class (type)} is @code{false} then the trait is @code{false}.
 Otherwise @code{type} is considered empty if and only if: @code{type}
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index fb21c064141..dfc1c845768 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7361,6 +7361,7 @@ extern tree fn_type_unification			(tree, tree, tree,
 		 bool, bool);
 extern void mark_decl_instantiated		(tree, int);
 extern int more_specialized_fn			(tree, tree, int);
+extern bool type_targs_deducible_from		(tree, tree);
 extern void do_decl_instantiation		(tree, tree);
 extern void do_type_instantiation		(tree, tree, tsubst_flags_t);
 extern bool always_instantiate_p		(tree);
diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 

Re: Enable UTF-8 code page in driver and compiler on 64-bit mingw host [PR108865]

2023-03-09 Thread Jonathan Yong via Gcc-patches

On 3/9/23 13:33, Costas Argyris wrote:

Pinging the list and mingw maintainer.

Analysis and pre-approval here:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108865



Thanks, pushed to master branch.




Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-09 Thread Segher Boessenkool
Hi!

On Thu, Mar 09, 2023 at 05:30:53PM +0800, Kewen.Lin wrote:
> on 2023/3/9 07:01, Peter Bergner via Gcc-patches wrote:
> > PR109073 shows a problem where GCC 11 and GCC 10 do not accept a const
> > __vector_pair pointer operand to some MMA builtins, which GCC 12 and later
> > correctly accept.  Fixed here by initializing the builtins to accept const
> > pointers.

"Pointers to const" is the more correct.  A "const pointer" is e.g.
  int *const p;
not the same thing at all, and sometimes this is useful to have ;-)
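
A small sketch of the distinction, with illustrative variable names: the first declaration is what the builtins need to accept, the second is the "const pointer" proper.

```cpp
#include <type_traits>

int value = 1;

// Pointer to const: the pointee cannot be modified through it,
// but the pointer itself can be reseated.
const int *pointer_to_const = &value;

// Const pointer: the pointer cannot be reseated,
// but the pointee can be modified through it.
int *const const_pointer = &value;

static_assert(std::is_const_v<std::remove_pointer_t<decltype(pointer_to_const)>>);
static_assert(!std::is_const_v<decltype(pointer_to_const)>);
static_assert(std::is_const_v<decltype(const_pointer)>);
static_assert(!std::is_const_v<std::remove_pointer_t<decltype(const_pointer)>>);

// Writing through the const pointer is fine; the pointee is plain int.
int bump() {
  *const_pointer += 1;
  return value;
}
```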

> > This patch was tested in both GCC 11 and GCC 10 on powerpc64le-linux and
> > showed no regressions.  Ok for backports?

It isn't truly a backport. You can put it on 11 and 10 at the same time,
there is no benefit doing it on 11 only first.

> > {
> >   op[nopnds++] = build_pointer_type (void_type_node);
> >   if (d->code == MMA_BUILTIN_DISASSEMBLE_ACC)
> > -   op[nopnds++] = build_pointer_type (vector_quad_type_node);
> > +   op[nopnds++] = build_pointer_type (build_qualified_type
> > +(vector_quad_type_node,
> > + TYPE_QUAL_CONST));
> 
> Nit: Maybe we can build them out of the loop once and then just use the
> built one in the loop.

Or as globals even.  Currently we have X and pointer to X, but no
pointer to const X (and no const X either, but that isn't so useful).

The generic code doesn't have this either, hrm.

(snip)

> Simply testing __builtin_mma_xxmtacc and __builtin_mma_xxmfacc as below:
> 
> $ cat test.C
> void foo0(const __vector_quad *acc) {
>   __builtin_mma_xxmtacc(acc);
>   __builtin_mma_xxmfacc(acc);
> }
> 
> test.C:2:25: error: invalid conversion from ‘const __vector_quad*’ to 
> ‘__vector_quad*’ [-fpermissive]
> 2 |   __builtin_mma_xxmtacc(acc);
> 
> test.C:3:25: error: invalid conversion from ‘const __vector_quad*’ to 
> ‘__vector_quad*’ [-fpermissive]
> 3 |   __builtin_mma_xxmfacc(acc);
> 
> They also suffered the same error on gcc11 branch but not on trunk.

Yeah, there is more to be done here.

> Besides, I'm not sure if the existing bif declarations using 
> ptr_vector_pair_type_node
> and ptr_vector_quad_type_node are all intentional, at least it looks weird to 
> me that
> we declare const __vector_pair* for this __builtin_vsx_stxvp, which is meant 
> to store 32
> bytes into the memory provided by the pointer biasing the sizetype offset, 
> but the "const"
> qualifier seems to tell that this bif doesn't modify the memory pointed by 
> the given pointer.

That looks like a bug.  Well it is one even.  Is it fixed on trunk?

Since the patch is a strict improvement already, it is okay for 11 and
10.  But you (Peter) may want to flesh it out a bit first?  Or first
commit only this if that works better for you.


Segher


[pushed] [PR108999] LRA: For clobbered regs use operand mode instead of the biggest mode

2023-03-09 Thread Vladimir Makarov via Gcc-patches

The following patch solves

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108999

The patch was successfully bootstrapped and tested on i686, x86-64, 
aarch64, and ppc64 be/le.
commit 3c75631fc09a22f2513fab80ef502c2a8b0f9121
Author: Vladimir N. Makarov 
Date:   Thu Mar 9 08:41:09 2023 -0500

LRA: For clobbered regs use operand mode instead of the biggest mode

LRA is too conservative in calculation of conflicts with clobbered regs by
using the biggest access mode.  This results in failure of possible reg
coalescing and worse code.  This patch solves the problem.

PR rtl-optimization/108999

gcc/ChangeLog:

* lra-constraints.cc (process_alt_operands): Use operand modes for
clobbered regs instead of the biggest access mode.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr108999.c: New.

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index dbfaf0485a5..c38566a7451 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -3108,7 +3108,8 @@ process_alt_operands (int only_alternative)
 	  lra_assert (operand_reg[i] != NULL_RTX);
 	  clobbered_hard_regno = hard_regno[i];
 	  CLEAR_HARD_REG_SET (temp_set);
-	  add_to_hard_reg_set (&temp_set, biggest_mode[i], clobbered_hard_regno);
+	  add_to_hard_reg_set (&temp_set, GET_MODE (*curr_id->operand_loc[i]),
+			   clobbered_hard_regno);
 	  first_conflict_j = last_conflict_j = -1;
 	  for (j = 0; j < n_operands; j++)
 	if (j == i
diff --git a/gcc/testsuite/gcc.target/aarch64/pr108999.c b/gcc/testsuite/gcc.target/aarch64/pr108999.c
new file mode 100644
index 00000000000..a34db85be83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr108999.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=armv8.2-a+sve" } */
+#include <arm_sve.h>
+
+void subreg_coalesce5 (
+svbool_t pg, int64_t* base, int n,
+int64_t *in1, int64_t *in2, int64_t*out
+)
+{
+svint64x2_t result = svld2_s64 (pg, base);
+
+for (int i = 0; i < n; i += 1) {
+svint64_t v18 = svld1_s64(pg, in1 + i);
+svint64_t v19 = svld1_s64(pg, in2 + i);
+result.__val[0] = svmad_s64_z(pg, v18, v19, result.__val[0]);
+result.__val[1] = svmad_s64_z(pg, v18, v19, result.__val[1]);
+}
+svst2_s64(pg, out, result);
+}
+
+/* { dg-final { scan-assembler-not {[ \t]*mov[ \t]*z[0-9]+\.d} } } */


Re: [PATCH] driver: Treat include path args the same way between cpp_unique_options and asm_options. [PR71850]

2023-03-09 Thread Costas Argyris via Gcc-patches
Pinging list and driver reviewer.

Details here:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71850

On Thu, 2 Mar 2023 at 19:25, Costas Argyris 
wrote:

> This is a proposal to fix PR71850 by applying the existing logic for
> passing include paths to cc1 to as.
>
> Thanks,
> Costas
>
From 393aff0d006ee9372cc8b9321c612c2dfb4b0a31 Mon Sep 17 00:00:00 2001
From: Costas Argyris 
Date: Thu, 2 Mar 2023 18:27:22 +0000
Subject: [PATCH] driver: Treat include path args the same way between
 cpp_unique_options and asm_options. [PR71850]

On Windows, when a @file with many include paths is passed to gcc, it forwards those include paths to cc1 through a temporary @file as well, so they don't end up in the command line.  This is because cpp_unique_options has %@{I* which passes -I args in a temporary file, if a temporary file was passed to the driver in the first place.

The same logic is not applied in asm_options, and this leads to the include paths being passed as command line arguments to the assembler, which causes the failure on Windows seen in PR71850.

Treating the -I args to the assembler the same way as to the compiler (that is, through a @tempfile if @file was passed to gcc) solves the issue, allowing a large number of include paths to be passed to gcc on Windows through a @file.
---
 gcc/gcc.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index becc56051a8..b1fa80cde4f 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -1278,7 +1278,7 @@ static const char *asm_options =
 #if HAVE_GNU_AS
 /* If GNU AS is used, then convert -w (no warnings), -I, and -v
to the assembler equivalents.  */
-"%{v} %{w:-W} %{I*} "
+"%{v} %{w:-W} %@{I*} "
 #endif
 "%(asm_debug_option)"
 ASM_COMPRESS_DEBUG_SPEC
-- 
2.30.2



Re: Enable UTF-8 code page in driver and compiler on 64-bit mingw host [PR108865]

2023-03-09 Thread Costas Argyris via Gcc-patches
Pinging the list and mingw maintainer.

Analysis and pre-approval here:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108865

On Wed, 8 Mar 2023 at 10:52, Costas Argyris 
wrote:

> Added .manifest file to the make rule for utf8rc-mingw32.o, latest patch
> attached.
>
> On Tue, 7 Mar 2023 at 15:27, Costas Argyris 
> wrote:
>
>> Hi Jacek,
>>
>> "but I think it should work just fine if you didn't explicitly limit the
>> patch to x86_64."
>>
>> I would think so too.
>>
>> Actually, even cygwin might benefit from this, assuming it has the same
>> problem, which I don't know if it's the case.
>>
>> But I'm not experienced with that so I would like to explore these hosts
>> separately and just focus on the most common 64-bit Windows host with this
>> change, if possible.
>>
>> "The point that when winnt-utf8.manifest is modified, utf8-mingw32.o
>> should be rebuilt."
>>
>> Right, makes sense.
>>
>> Just noting that winnt-utf8.manifest is really not meant to be modified,
>> because it is copied straight from:
>>
>>
>> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
>>
>> and will probably remain like that, but I do get your point and I am
>> happy to make the change.
>>
>> Thanks,
>> Costas
>>
>> On Tue, 7 Mar 2023 at 14:18, Jacek Caban  wrote:
>>
>>> Hi Costas,
>>>
>>> On 3/7/23 15:00, Costas Argyris wrote:
>>> > Hi Jacek,
>>> >
>>> > "Is there a reason to make it specific to x86_64? It seems to me that
>>> > all mingw hosts could use it."
>>> >
>>> > Are you referring to the 32-bit host?  My concern here is that this
>>> > functionality (embedding the UTF-8
>>> > manifest file into the executable) is only truly supported in recent
>>> > versions of Windows.  From:
>>> >
>>> >
>>> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
>>> >
>>> > It says that Windows Version 1903 (May 2019 Update) enables this, so
>>> > we are looking at the 64-bit
>>> > version of Windows.
>>> >
>>> > I suppose you are referring to the scenario where one has a 32-bit
>>> > gcc + mingw running in a 64-bit
>>> > Windows that is recent enough to support this?  It is not clear to
>>> > me based on the above doc what
>>> > would happen encoding-wise in that situation, and I haven't tried it
>>> > either because I assumed that
>>> > most people would want the 64-bit version of gcc since they are
>>> > probably running a 64-bit OS.
>>> >
>>> > If you think it is useful, I could look into that as a separate task
>>> > to try and keep this one simple, if
>>> > that makes sense.
>>>
>>>
>>> Yes, realistically it's mostly about 32-bit gcc on 64-bit Windows
>>> (perhaps aarch64 as well at some point in the future). It's probably
>>> indeed not very popular configuration those days, but I think it should
>>> work just fine if you didn't explicitly limit the patch to x86_64.
>>>
>>>
>>> > "I think that .manifest file should also be a dependency here."
>>> >
>>> > Why is that?  Windres takes only the .rc file as its input, as per
>>> > its own doc, and it successfully
>>> > compiles it into an object file.  The .manifest file is only
>>> > referenced by the .rc file, and it doesn't
>>> > get passed to windres, so I don't see why it has to be listed as a
>>> > prerequisite in the make rule.
>>>
>>>
>>> The point that when winnt-utf8.manifest is modified, utf8-mingw32.o
>>> should be rebuilt. Anyway, it's probably not a big deal (I should
>>> disclaim that I'm not very familiar with gcc build system; I'm mostly on
>>> this ML due to mingw-w64 contributions).
>>>
>>>
>>> Thanks,
>>>
>>> Jacek
>>>
>>>
From 694d6f4860a08f690070df411f3f72d66a48a981 Mon Sep 17 00:00:00 2001
From: Costas Argyris 
Date: Tue, 28 Feb 2023 17:10:18 +0000
Subject: [PATCH] Enable UTF-8 code page on Windows 64-bit host [PR108865]

Compile a resource object that contains the utf8 manifest.

Then link that object into the driver and compiler proper.

For compiler proper the link has to be forced because the
resource object file gets into a static library (libbackend.a)
and gets eventually dropped because it has no symbols of
its own and nothing is referencing it inside the library.

Therefore, an artificial symbol is planted to force the link.
---
 gcc/config.host |  5 ++-
 gcc/config/i386/sym-mingw32.cc  |  1 +
 gcc/config/i386/utf8-mingw32.rc |  3 ++
 gcc/config/i386/winnt-utf8.manifest |  8 
 gcc/config/i386/x-mingw32   |  3 +-
 gcc/config/i386/x-mingw32-utf8  | 57 +
 6 files changed, 73 insertions(+), 4 deletions(-)
 create mode 100644 gcc/config/i386/sym-mingw32.cc
 create mode 100644 gcc/config/i386/utf8-mingw32.rc
 create mode 100644 gcc/config/i386/winnt-utf8.manifest
 create mode 100644 gcc/config/i386/x-mingw32-utf8

diff --git a/gcc/config.host b/gcc/config.host
index a522c39658e..4abb32ad73d 100644
--- a/gcc/config.host
+++ b/gcc/config.host
@@ -241,10 +241,11 @@ case ${host} in
   x86_64-*-mingw*)
 

Re: [PATCH] middle-end/108995 - avoid folding when sanitizing overflow

2023-03-09 Thread Richard Biener via Gcc-patches
On Thu, 9 Mar 2023, Jakub Jelinek wrote:

> On Wed, Mar 08, 2023 at 09:38:43AM +0000, Richard Biener via Gcc-patches 
> wrote:
> > The following plugs one place in extract_muldiv where it should avoid
> > folding when sanitizing overflow.
> > 
> > I'm unsure about the testcase, I didn't find any that tests for
> > a runtime sanitizer error ...
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > OK?
> > 
> > PR middle-end/108995
> > * fold-const.cc (extract_muldiv_1): Avoid folding
> > (CST * b) / CST2 when sanitizing overflow and we rely on
> > overflow being undefined.
> 
> This is ok.
> 
> > 
> > * gcc.dg/ubsan/pr108995.c: New testcase.
> 
> As for testcase, there are many testcases that test for runtime sanitizer
> errors.  For ubsan, it is more common to test -fsanitize-recover= and
> just dg-output scan the output for expected diagnostics (many examples
> in that directory).
> Another possibility is to test for the no recovery, see e.g.
> gcc.dg/ubsan/bounds-3.c.  In that case there should be
> /* { dg-do run } */
> and
> /* { dg-shouldfail "ubsan" } */
> but dg-output checking for the exact wording is still highly desirable.
> 
> The test also relies on 32-bit ints, so it should be dg-do run { target int32 
> }
> I think.

OK, the following is what I have applied.

Richard.

From ace65db9215882b95e2ead1bb0dc8c54c2ea69be Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Wed, 8 Mar 2023 09:06:44 +0100
Subject: [PATCH] middle-end/108995 - avoid folding when sanitizing overflow
To: gcc-patches@gcc.gnu.org

The following plugs one place in extract_muldiv where it should avoid
folding when sanitizing overflow.

PR middle-end/108995
* fold-const.cc (extract_muldiv_1): Avoid folding
(CST * b) / CST2 when sanitizing overflow and we rely on
overflow being undefined.

* gcc.dg/ubsan/pr108995.c: New testcase.
---
 gcc/fold-const.cc |  7 +++
 gcc/testsuite/gcc.dg/ubsan/pr108995.c | 18 ++
 2 files changed, 21 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/pr108995.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 99882ef820a..02a24c5fe65 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -7093,6 +7093,7 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
tree wide_type,
 If we have an unsigned type, we cannot do this since it will change
 the result if the original computation overflowed.  */
   if (TYPE_OVERFLOW_UNDEFINED (ctype)
+ && !TYPE_OVERFLOW_SANITIZED (ctype)
  && ((code == MULT_EXPR && tcode == EXACT_DIV_EXPR)
  || (tcode == MULT_EXPR
  && code != TRUNC_MOD_EXPR && code != CEIL_MOD_EXPR
@@ -7102,8 +7103,7 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
tree wide_type,
  if (wi::multiple_of_p (wi::to_wide (op1), wi::to_wide (c),
 TYPE_SIGN (type)))
{
- if (TYPE_OVERFLOW_UNDEFINED (ctype))
-   *strict_overflow_p = true;
+ *strict_overflow_p = true;
  return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
  fold_convert (ctype,
const_binop (TRUNC_DIV_EXPR,
@@ -7112,8 +7112,7 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
tree wide_type,
  else if (wi::multiple_of_p (wi::to_wide (c), wi::to_wide (op1),
  TYPE_SIGN (type)))
{
- if (TYPE_OVERFLOW_UNDEFINED (ctype))
-   *strict_overflow_p = true;
+ *strict_overflow_p = true;
  return fold_build2 (code, ctype, fold_convert (ctype, op0),
  fold_convert (ctype,
const_binop (TRUNC_DIV_EXPR,
diff --git a/gcc/testsuite/gcc.dg/ubsan/pr108995.c 
b/gcc/testsuite/gcc.dg/ubsan/pr108995.c
new file mode 100644
index 000..166825b2ef8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/pr108995.c
@@ -0,0 +1,18 @@
+/* { dg-do run { target int32 } } */
+/* { dg-shouldfail "ubsan" } */
+/* With optimization we constant fold and diagnose the overflow and do
+   not sanitize anything.  */
+/* { dg-skip-if "" { *-*-* } { "*" } { ! "-O0" } } */
+/* { dg-options "-fsanitize=undefined -fno-sanitize-recover=undefined" } */
+
+int a;
+const int b = 44514;
+int *c = &a;
+
+int main ()
+{
+  *c = 65526 * b / 6;
+  return 0;
+}
+
+/* { dg-output "signed integer overflow: 44514 \\* 65526 cannot be represented in type 'int'" } */
-- 
2.35.3



Re: [PATCH] middle-end/108995 - avoid folding when sanitizing overflow

2023-03-09 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 08, 2023 at 09:38:43AM +, Richard Biener via Gcc-patches wrote:
> The following plugs one place in extract_muldiv where it should avoid
> folding when sanitizing overflow.
> 
> I'm unsure about the testcase, I didn't find any that tests for
> a runtime sanitizer error ...
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> 
> OK?
> 
>   PR middle-end/108995
>   * fold-const.cc (extract_muldiv_1): Avoid folding
>   (CST * b) / CST2 when sanitizing overflow and we rely on
>   overflow being undefined.

This is ok.

> 
>   * gcc.dg/ubsan/pr108995.c: New testcase.

As for testcase, there are many testcases that test for runtime sanitizer
errors.  For ubsan, it is more common to test -fsanitize-recover= and
just dg-output scan the output for expected diagnostics (many examples
in that directory).
Another possibility is to test for the no recovery, see e.g.
gcc.dg/ubsan/bounds-3.c.  In that case there should be
/* { dg-do run } */
and
/* { dg-shouldfail "ubsan" } */
but dg-output checking for the exact wording is still highly desirable.

The test also relies on 32-bit ints, so it should be dg-do run { target int32 }
I think.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ubsan/pr108995.c
> @@ -0,0 +1,15 @@
> +/* { dg-do run { xfail *-*-* } } */
> +/* With optimization we constant fold and diagnose the overflow and do
> +   not sanitize anything.  */
> +/* { dg-skip-if "" { *-*-* } { "*" } { ! "-O0" } } */
> +/* { dg-options "-fsanitize=undefined -fno-sanitize-recover=undefined" } */
> +
> +int a;
> +const int b = 44514;
> +int *c = &a;
> +
> +int main ()
> +{
> +  *c = 65526 * b / 6;
> +  return 0;
> +}
> -- 
> 2.35.3

Jakub



Re: [v4][PATCH 1/2] Handle component_ref to a structre/union field including C99 FAM [PR101832]

2023-03-09 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Qing Zhao wrote:

> GCC extension accepts the case when a struct with a C99 flexible array member
> is embedded into another struct or union (possibly recursively).
> __builtin_object_size should treat such struct as flexible size.
> 
> gcc/c/ChangeLog:
> 
>   PR tree-optimization/101832
>   * c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for
>   struct/union type.

I can't really comment on the correctness of this part but since
only the C frontend will ever set this and you are using it from
addr_object_size which is also used for other C family languages
(at least), I wonder if you can really test

+   if (!TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (v)))

there.

Originally I was suggesting to set this flag in stor-layout.cc
which eventually all languages funnel their types through and
if there's language specific handling use a langhook (with the
default implementation preserving the status quo).

Some more comments below ...

> gcc/cp/ChangeLog:
> 
>   PR tree-optimization/101832
>   * module.cc (trees_out::core_bools): Stream out new bit
>   type_include_flexarray.
>   (trees_in::core_bools): Stream in new bit type_include_flexarray.
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/101832
>   * print-tree.cc (print_node): Print new bit type_include_flexarray.
>   * tree-core.h (struct tree_type_common): New bit
>   type_include_flexarray.
>   * tree-object-size.cc (addr_object_size): Handle structure/union type
>   when it has flexible size.
>   * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream
>   in new bit type_include_flexarray.
>   * tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream
>   out new bit type_include_flexarray.
>   * tree.h (TYPE_INCLUDE_FLEXARRAY): New macro
>   TYPE_INCLUDE_FLEXARRAY.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/101832
>   * gcc.dg/builtin-object-size-pr101832.c: New test.
> ---
>  gcc/c/c-decl.cc   |  12 ++
>  gcc/cp/module.cc  |   2 +
>  gcc/print-tree.cc |   5 +
>  .../gcc.dg/builtin-object-size-pr101832.c | 134 ++
>  gcc/tree-core.h   |   4 +-
>  gcc/tree-object-size.cc   |  79 +++
>  gcc/tree-streamer-in.cc   |   1 +
>  gcc/tree-streamer-out.cc  |   1 +
>  gcc/tree.h|   6 +
>  9 files changed, 215 insertions(+), 29 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
> 
> diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
> index 08078eadeb8..f589a2f5192 100644
> --- a/gcc/c/c-decl.cc
> +++ b/gcc/c/c-decl.cc
> @@ -9284,6 +9284,18 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
> tree attributes,
>/* Set DECL_NOT_FLEXARRAY flag for FIELD_DECL x.  */
>DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, 
> x);
>  
> +  /* Set TYPE_INCLUDE_FLEXARRAY for the context of x, t
> +   * when x is an array.  */
> +  if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
> + TYPE_INCLUDE_FLEXARRAY (t) = flexible_array_member_type_p (TREE_TYPE 
> (x)) ;
> +  /* Recursively set TYPE_INCLUDE_FLEXARRAY for the context of x, t
> +  when x is the last field.  */
> +  else if ((TREE_CODE (TREE_TYPE (x)) == RECORD_TYPE
> + || TREE_CODE (TREE_TYPE (x)) == UNION_TYPE)
> +&& TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x))
> +&& is_last_field)
> + TYPE_INCLUDE_FLEXARRAY (t) = true;
> +
>if (DECL_NAME (x)
> || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
>   saw_named_field = true;
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index ac2fe66b080..c750361b704 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -5371,6 +5371,7 @@ trees_out::core_bools (tree t)
>WB (t->type_common.lang_flag_5);
>WB (t->type_common.lang_flag_6);
>WB (t->type_common.typeless_storage);
> +  WB (t->type_common.type_include_flexarray);
>  }
>  
>if (CODE_CONTAINS_STRUCT (code, TS_DECL_COMMON))
> @@ -5551,6 +5552,7 @@ trees_in::core_bools (tree t)
>RB (t->type_common.lang_flag_5);
>RB (t->type_common.lang_flag_6);
>RB (t->type_common.typeless_storage);
> +  RB (t->type_common.type_include_flexarray);
>  }
>  
>if (CODE_CONTAINS_STRUCT (code, TS_DECL_COMMON))
> diff --git a/gcc/print-tree.cc b/gcc/print-tree.cc
> index 1f3afcbbc86..efacdb7686f 100644
> --- a/gcc/print-tree.cc
> +++ b/gcc/print-tree.cc
> @@ -631,6 +631,11 @@ print_node (FILE *file, const char *prefix, tree node, 
> int indent,
> && TYPE_CXX_ODR_P (node))
>   fputs (" cxx-odr-p", file);
>  
> +  if ((code == RECORD_TYPE
> +|| code == UNION_TYPE)
> +   && TYPE_INCLUDE_FLEXARRAY (node))
> + fputs (" 

[PATCH v2 2/2] combine: Try harder to form zero_extends [PR106594]

2023-03-09 Thread Richard Sandiford via Gcc-patches
g:c23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f uses nonzero_bits
information to convert sign_extends into zero_extends.
That change is semantically correct in itself, but for the
testcase in the PR, it leads to a series of unfortunate events,
as described below.

We try to combine:

Trying 24 -> 25:
   24: r116:DI=sign_extend(r115:SI)
  REG_DEAD r115:SI
   25: r117:SI=[r116:DI*0x4+r118:DI]
  REG_DEAD r116:DI
  REG_EQUAL [r116:DI*0x4+`constellation_64qam']

which previously succeeded, giving:

(set (reg:SI 117 [ constellation_64qam[_5] ])
(mem/u:SI (plus:DI (mult:DI (sign_extend:DI (reg:SI 115))
(const_int 4 [0x4]))
(reg/f:DI 118)) [1 constellation_64qam[_5]+0 S4 A32]))

However, nonzero_bits can tell that only the low 6 bits of r115
can be nonzero.  The commit above therefore converts the sign_extend
to a zero_extend.  Using the same nonzero_bits information, we then
"expand" the zero_extend to:

  (and:DI (subreg:DI (reg:SI r115) 0)
  (const_int 63))

Substituting into the mult gives the unsimplified expression:

  (mult:DI (and:DI (subreg:DI (reg:SI r115) 0)
   (const_int 63))
   (const_int 4))

The simplification rules for mult convert this to an ashift by 2.
Then, this rule in simplify_shift_const_1:

  /* If we have (shift (logical)), move the logical to the outside
 to allow it to possibly combine with another logical and the
 shift to combine with another shift.  This also canonicalizes to
 what a ZERO_EXTRACT looks like.  Also, some machines have
 (and (shift)) insns.  */

moves the shift inside the "and", so that the expression becomes:

  (and:DI (ashift:DI (subreg:DI (reg:SI r115) 0)
 (const_int 2))
  (const_int 252))

We later recanonicalise to a mult (since this is an address):

  (and:DI (mult:DI (subreg:DI (reg:SI r115) 0)
   (const_int 4))
  (const_int 252))

But we fail to transform this back to the natural substitution:

  (mult:DI (zero_extend:DI (reg:SI r115))
   (const_int 4))

There are several other cases in which make_compound_operation
needs to look more than one level down in order to complete a
compound operation.  For example:

(a) the ashiftrt handling uses extract_left_shift to look through
things like logic ops in order to find a partnering ashift
operation

(b) the "and" handling looks through subregs, xors and iors
to find a partnering lshiftrt

This patch takes the same approach for mult.
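
As a sanity check of the relationship the patch relies on, (and (mult X 2**N1) N2) == (mult (and X (lshiftrt N2 N1)) 2**N1), here is a small sketch in plain C.  It is only an illustration: uint64_t wraparound arithmetic stands in for DImode, N1 is assumed to be less than 64, and the function names are invented for the example.

```c
#include <stdint.h>

/* Left-hand side of the relationship: (and (mult X 2**N1) N2).  */
static uint64_t and_of_mult (uint64_t x, unsigned n1, uint64_t n2)
{
  return (x * (UINT64_C (1) << n1)) & n2;
}

/* Right-hand side: (mult (and X (lshiftrt N2 N1)) 2**N1).  */
static uint64_t mult_of_and (uint64_t x, unsigned n1, uint64_t n2)
{
  return (x & (n2 >> n1)) * (UINT64_C (1) << n1);
}

/* Return 1 if both forms agree for every X in a small sample range.  */
static int forms_agree (unsigned n1, uint64_t n2)
{
  for (uint64_t x = 0; x < 4096; x++)
    if (and_of_mult (x, n1, n2) != mult_of_and (x, n1, n2))
      return 0;
  return 1;
}
```

For the case in the PR, N1 is 2 and N2 is 252, so the inner mask becomes 252 >> 2 = 63.  That is the mask that the nonzero_bits information shows to be redundant for r115.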

gcc/
PR rtl-optimization/106594
* combine.cc (make_compound_operation_and): Look through
multiplications by a power of two.

gcc/testsuite/
* gcc.target/aarch64/pr106594.c: New test.
---
 gcc/combine.cc  | 17 +
 gcc/testsuite/gcc.target/aarch64/pr106594.c | 20 
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr106594.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 7d446d02cb4..36d04ad6703 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -7996,6 +7996,23 @@ make_compound_operation_and (scalar_int_mode mode, rtx x,
break;
   }
 
+case MULT:
+  /* Recurse through a power of 2 multiplication (as can be found
+in an address), using the relationship:
+
+(and (mult X 2**N1) N2) == (mult (and X (lshiftrt N2 N1)) 2**N1).  */
+  if (CONST_INT_P (XEXP (x, 1))
+ && pow2p_hwi (INTVAL (XEXP (x, 1))))
+   {
+ int shift = exact_log2 (INTVAL (XEXP (x, 1)));
+ rtx sub = make_compound_operation_and (mode, XEXP (x, 0),
+mask >> shift, in_code,
+next_code);
+ if (sub)
+   return gen_rtx_MULT (mode, sub, XEXP (x, 1));
+   }
+  break;
+
 default:
   break;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/pr106594.c 
b/gcc/testsuite/gcc.target/aarch64/pr106594.c
new file mode 100644
index 000..beda8e050a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr106594.c
@@ -0,0 +1,20 @@
+/* { dg-options "-O2" } */
+
+extern const int constellation_64qam[64];
+
+void foo(int nbits,
+ const char *p_src,
+ int *p_dst) {
+
+  while (nbits > 0U) {
+char first = *p_src++;
+
+char index1 = ((first & 0x3) << 4) | (first >> 4);
+
+*p_dst++ = constellation_64qam[index1];
+
+nbits--;
+  }
+}
+
+/* { dg-final { scan-assembler {ldr\tw[0-9]+, \[x[0-9]+, w[0-9]+, [su]xtw 
#?2\]} } } */
-- 
2.25.1



[PATCH v2 1/2] combine: Split code out of make_compound_operation_int

2023-03-09 Thread Richard Sandiford via Gcc-patches
This patch just splits some code out of make_compound_operation_int
into a new function called make_compound_operation_and.  It is a
prerequisite for the fix for PR106594.

It might (or might not) make sense to put more of the existing
"and" handling into the new function, so that the subreg+lshiftrt
case can be handled through recursion rather than duplication.
But that's certainly not necessary to fix the bug, so is at
best stage 1 material.

No behavioural change intended.

gcc/
* combine.cc (make_compound_operation_and): New function,
split out from...
(make_compound_operation_int): ...here.
---
 gcc/combine.cc | 84 --
 1 file changed, 54 insertions(+), 30 deletions(-)

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 053879500b7..7d446d02cb4 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -7952,6 +7952,56 @@ extract_left_shift (scalar_int_mode mode, rtx x, int 
count)
   return 0;
 }
 
+/* A subroutine of make_compound_operation_int.  Try to combine an outer
+   AND of X and MASK with a partnering inner operation to form a compound
+   operation.  Return the new X on success, otherwise return null.
+
+   MODE is the mode of X.  IN_CODE is as for make_compound_operation.
+   NEXT_CODE is the value of IN_CODE that should be used for (recursive)
+   calls to make_compound_operation.  */
+
+static rtx
+make_compound_operation_and (scalar_int_mode mode, rtx x,
+unsigned HOST_WIDE_INT mask,
+rtx_code in_code, rtx_code next_code)
+{
+  switch (GET_CODE (x))
+{
+case SUBREG:
+  /* If the operand is a paradoxical subreg of a register or memory
+and MASK (limited to the smaller mode) has only zero bits where
+the sub expression has known zero bits, this can be expressed as
+a zero_extend.  */
+  {
+   rtx sub = XEXP (x, 0);
+   machine_mode sub_mode = GET_MODE (sub);
+   int sub_width;
+   if ((REG_P (sub) || MEM_P (sub))
+   && GET_MODE_PRECISION (sub_mode).is_constant (&sub_width)
+   && sub_width < GET_MODE_PRECISION (mode))
+ {
+   unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode);
+   unsigned HOST_WIDE_INT submask;
+
+   /* The shifted AND constant with all the known zero
+  bits set.  */
+   submask = mask | ~nonzero_bits (sub, sub_mode);
+   if ((submask & mode_mask) == mode_mask)
+ {
+   rtx new_rtx = make_compound_operation (sub, next_code);
+   return make_extraction (mode, new_rtx, 0, 0, sub_width,
+   1, 0, in_code == COMPARE);
+ }
+ }
+   break;
+  }
+
+default:
+  break;
+}
+  return NULL_RTX;
+}
+
 /* Subroutine of make_compound_operation.  *X_PTR is the rtx at the current
level of the expression and MODE is its mode.  IN_CODE is as for
make_compound_operation.  *NEXT_CODE_PTR is the value of IN_CODE
@@ -8184,36 +8234,10 @@ make_compound_operation_int (scalar_int_mode mode, rtx 
*x_ptr,
   make_compound_operation (XEXP (x, 0),
next_code),
   i, NULL_RTX, 1, 1, 0, 1);
-
-  /* If the one operand is a paradoxical subreg of a register or memory and
-the constant (limited to the smaller mode) has only zero bits where
-the sub expression has known zero bits, this can be expressed as
-a zero_extend.  */
-  else if (GET_CODE (XEXP (x, 0)) == SUBREG)
-   {
- rtx sub;
-
- sub = XEXP (XEXP (x, 0), 0);
- machine_mode sub_mode = GET_MODE (sub);
- int sub_width;
- if ((REG_P (sub) || MEM_P (sub))
- && GET_MODE_PRECISION (sub_mode).is_constant (&sub_width)
- && sub_width < mode_width)
-   {
- unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode);
- unsigned HOST_WIDE_INT mask;
-
- /* original AND constant with all the known zero bits set */
- mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode));
- if ((mask & mode_mask) == mode_mask)
-   {
- new_rtx = make_compound_operation (sub, next_code);
- new_rtx = make_extraction (mode, new_rtx, 0, 0, sub_width,
-1, 0, in_code == COMPARE);
-   }
-   }
-   }
-
+  else
+   new_rtx = make_compound_operation_and (mode, XEXP (x, 0),
+  UINTVAL (XEXP (x, 1)),
+  in_code, next_code);
   break;
 
 case LSHIFTRT:
-- 
2.25.1



[PATCH v2 0/2] Series of patch to fix PR106594

2023-03-09 Thread Richard Sandiford via Gcc-patches
This series of patches fixes PR106594, an aarch64 regression in which
we fail to combine an extension into an address.  The first patch just
refactors code.  The second patch contains the actual fix.

The cover note for the second patch describes the problem and the fix.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


Re: [PATCH v3] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-09 Thread Richard Biener via Gcc-patches
On Wed, 8 Mar 2023, Xionghu Luo wrote:

> 
> 
> On 2023/3/7 19:25, Richard Biener wrote:
> >>> It would be nice to avoid creating blocks / preserving labels we'll
> >>> immediately remove again.  For that we do need some analysis
> >>> before creating basic-blocks that determines whether a label is
> >>> possibly reached by a non-fallthru edge.
> >>>
> >>
> >>  :
> >> p = 0;
> >> switch (s) , case 0: , case 1: >
> >>
> >>  :
> >> :   <= prev_stmt
> >> :   <= stmt
> >> p = p + 1;
> >> n = n + -1;
> >> if (n != 0) goto ; else goto ;
> >>
> >> Check if  is a case label and  is a goto target then return
> >> true
> >> in stmt_starts_bb_p to start a new basic block?  This would avoid creating
> >> and
> >> removing blocks, but cleanup_dead_labels has all bbs setup while
> >> stmt_starts_bb_p
> >> doesn't yet to iterate bbs/labels to establish label_for_bb[] map?
> 
> > Yes.  I think we'd need something more pragmatic before make_blocks (),
> > like re-computing TREE_USED of the label decls or computing a bitmap
> > of targeted labels (targeted by goto, switch or any other means).
> > 
> > I'll note that doing a cleanup_dead_labels () like optimization before
> > we create blocks will help keeping LABEL_DECL_UID and thus
> > label_to_block_map dense.  But it does look like a bit of
> > a chicken-and-egg problem and the question is how effective the
> > dead label removal is in practice.
> 
> I tried adding a function compute_target_labels (not sure whether the
> name is suitable) at the start of make_blocks_1.  The Fortran case no longer
> creates and then removes blocks, but I still have several questions:
> 
>  1. I used hash_set to save the target labels instead of bitmap, as
> labels
> are tree type value instead of block index so bitmap is not good for it since
> we don't have LABEL_DECL_UID now?

We don't have LABEL_DECL_UID, we have DECL_UID though, but the choice of
hash_set vs. bitmap is somewhat arbitrary here.  The real cost is
the extra walk over all stmts.

>  2. Is compute_target_labels still only for !optimize?  And if we compute
> the target labels before creating bbs, it is unnecessary to guard the first
> cleanup_dead_labels under !optimize, because the switch-case-do-while
> case already creates a new block for CASE_LABEL.

OK.

>  3. I only added GIMPLE_SWITCH/GIMPLE_COND in compute_target_labels
> so far, is it needed to also handle GIMPLE_ASM/GIMPLE_TRANSACTION and even
> labels_eh?

I'd add GIMPLE_ASM handling, the rest should be OK wrt debugging and
coverage already?

> PS1: The v3 patch will cause one test case fail:
> 
> Number of regressions in total: 1
> > FAIL: gcc.c-torture/compile/limits-caselabels.c   -O0  (test for excess
> > errors)
> 
> because this exhausting case has labels from L0 to L11, and they won't be
> optimized
> to a simple if-else expression like before...

Hmm, that's somewhat unexpected.

> 
> PS2: The GIMPLE_GOTO piece of code would cause some fortran cases run fail due
> to __builtin_unreachable trap generated in .fixup_cfg1, I didn't dig into it
> so
> just skip these label...

Please investigate, we might be missing a corner case here.

> 
> +   case GIMPLE_GOTO:
> +#if 0
> + if (!computed_goto_p (stmt))
> +   {
> + tree dest = gimple_goto_dest (stmt);
> + target_labels->add (dest);
> +   }
> +#endif
> + break;
> 
> Change the #if 0 to #if 1 result in:
> 
> Number of regressions in total: 8
> > FAIL: gcc.c-torture/compile/limits-caselabels.c   -O0  (test for excess errors)
> > FAIL: gcc.dg/analyzer/explode-2a.c (test for excess errors)
> > FAIL: gcc.dg/analyzer/pragma-2.c (test for excess errors)
> > FAIL: gfortran.dg/bound_2.f90   -O0  execution test
> > FAIL: gfortran.dg/bound_7.f90   -O0  execution test
> > FAIL: gfortran.dg/char_result_14.f90   -O0  execution test
> > FAIL: gfortran.dg/pointer_array_1.f90   -O0  execution test
> > FAIL: gfortran.dg/select_type_15.f03   -O0  execution test
> 
> 
> 
> Paste the updated patch v3:

The gcov testcase adjustments look good, does the analyzer testcase
(missing in the changelog) get different CFG input?

Thanks,
Richard.

> 
> v3: Add compute_target_labels and call it in the front of make_blocks_1.
> 
> Start a new basic block if two labels have different location when
> test-coverage.
> 
> Regression tested pass on x86_64-linux-gnu and aarch64-linux-gnu, OK for
> master?
> 
> gcc/ChangeLog:
> 
>   PR gcov/93680
>   * tree-cfg.cc (stmt_starts_bb_p): Check whether the label is in
>   target_labels.
>   (compute_target_labels): New function.
>   (make_blocks_1): Call compute_target_labels.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR gcov/93680
>   * g++.dg/gcov/gcov-1.C: Correct counts.
>   * gcc.misc-tests/gcov-4.c: Likewise.
>   * gcc.misc-tests/gcov-pr85332.c: Likewise.
>   * lib/gcov.exp: Also clean gcda if fail.
>   * gcc.misc-tests/gcov-pr93680.c: New test.
> 
> Signed-off-by: 

Re: [PATCH] Avoid unnecessary epilogues from tree_unroll_loop

2023-03-09 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> The following fixes the condition determining whether we need an
> epilogue.
>
> When r12-2429-g62acc72a957b56 introduced this check I didn't notice
> the odd condition on review.  Richard - do you remember if this
> was on purpose?

Oops, no, looks like a mistake.  Thanks for the fix.

Richard

> I've noticed the mismatch with gcc.dg/tree-ssa/predcom-1.c for example.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.
>
> Richard.
>
>   * tree-ssa-loop-manip.cc (determine_exit_conditions): Fix
>   no epilogue condition.
> ---
>  gcc/tree-ssa-loop-manip.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
> index c804a7353d5..a52277abdbf 100644
> --- a/gcc/tree-ssa-loop-manip.cc
> +++ b/gcc/tree-ssa-loop-manip.cc
> @@ -1010,7 +1010,7 @@ determine_exit_conditions (class loop *loop, class 
> tree_niter_desc *desc,
>/* Convert the latch count to an iteration count.  */
>tree niter = fold_build2 (PLUS_EXPR, type, desc->niter,
>   build_one_cst (type));
> -  if (multiple_of_p (type, niter, bigstep))
> +  if (multiple_of_p (type, niter, build_int_cst (type, factor)))
>   return;
>  }


[PATCH] tree-optimization/44794 - avoid excessive RTL unrolling on epilogues

2023-03-09 Thread Richard Biener via Gcc-patches
The following adjusts tree_[transform_and_]unroll_loop to set an
upper bound on the number of iterations on the epilogue loop it
creates.  For the testcase at hand which involves array prefetching
this avoids applying RTL unrolling to them when -funroll-loops is
specified.

Other users of this API include predictive commoning and
unroll-and-jam.
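
To make the bound concrete, here is a hand-written model of an unroll-by-4 with an unconditionally entered epilogue.  It is only a sketch under stated assumptions: it is ordinary C rather than the CFG that tree_transform_and_unroll_loop builds, the function name is made up, and n is assumed to be at least 1.

```c
/* Sum A[0..N-1] (N >= 1) with the main loop unrolled by FACTOR == 4.
   *EPILOG_LATCH receives the number of times the epilogue's back edge
   is taken.  */
static long sum_unrolled_by_4 (const int *a, unsigned n,
			       unsigned *epilog_latch)
{
  const unsigned factor = 4;
  long sum = 0;
  unsigned i = 0;

  /* Unrolled loop: runs only while more than FACTOR elements remain,
     so at least one element is always left for the epilogue.  */
  while (n - i > factor)
    {
      sum += (long) a[i] + a[i + 1] + a[i + 2] + a[i + 3];
      i += factor;
    }

  /* Epilogue: entered unconditionally; its body runs 1 to FACTOR times,
     so its latch runs at most FACTOR - 1 times.  */
  *epilog_latch = 0;
  for (;;)
    {
      sum += a[i++];
      if (i == n)
	break;
      ++*epilog_latch;
    }
  return sum;
}
```

In this model the epilogue's latch never executes more than factor - 1 = 3 times whatever n is, which is the kind of bound that later unrolling heuristics can use to avoid unrolling the epilogue again.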

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.

PR tree-optimization/44794
* tree-ssa-loop-manip.cc (tree_transform_and_unroll_loop):
If an epilogue loop is required set its iteration upper bound.
---
 gcc/tree-ssa-loop-manip.cc | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
index 09acc1c94cc..c804a7353d5 100644
--- a/gcc/tree-ssa-loop-manip.cc
+++ b/gcc/tree-ssa-loop-manip.cc
@@ -1297,6 +1297,12 @@ tree_transform_and_unroll_loop (class loop *loop, 
unsigned factor,
}
 
   remove_path (exit);
+
+  /* The epilog loop latch executes at most factor - 1 times.
+Since the epilog is entered unconditionally it will need to handle
+up to factor executions of its body.  */
+  new_loop->any_upper_bound = 1;
+  new_loop->nb_iterations_upper_bound = factor - 1;
 }
   else
 new_exit = single_dom_exit (loop);
-- 
2.35.3


[PATCH] Avoid unnecessary epilogues from tree_unroll_loop

2023-03-09 Thread Richard Biener via Gcc-patches
The following fixes the condition determining whether we need an
epilogue.

When r12-2429-g62acc72a957b56 introduced this check I didn't notice
the odd condition on review.  Richard - do you remember if this
was on purpose?

I've noticed the mismatch with gcc.dg/tree-ssa/predcom-1.c for example.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.

Richard.

* tree-ssa-loop-manip.cc (determine_exit_conditions): Fix
no epilogue condition.
---
 gcc/tree-ssa-loop-manip.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
index c804a7353d5..a52277abdbf 100644
--- a/gcc/tree-ssa-loop-manip.cc
+++ b/gcc/tree-ssa-loop-manip.cc
@@ -1010,7 +1010,7 @@ determine_exit_conditions (class loop *loop, class 
tree_niter_desc *desc,
   /* Convert the latch count to an iteration count.  */
   tree niter = fold_build2 (PLUS_EXPR, type, desc->niter,
build_one_cst (type));
-  if (multiple_of_p (type, niter, bigstep))
+  if (multiple_of_p (type, niter, build_int_cst (type, factor)))
return;
 }
 
-- 
2.35.3


Re: [PATCH v2 0/5] A small Texinfo refinement

2023-03-09 Thread Arsen Arsenović via Gcc-patches

Sandra Loosemore  writes:

>> As an example, let's take this link:
>> https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Warning-Options.html#index-Wpedantic
>> This should place you below the item line this index entry refers to,
>> and there aren't any copiable anchors (see equivalent in my render for
>> an example of those), both of which were often named as annoyances with
>> the onlinedocs while the Sphinx experiment was taking place.
>> A similar thing happens in the standalone and Emacs info viewers (but
>> that's less noticeable there since the cursor is placed in the middle of
>> the screen when jumping to an index entry there).  Try, for instance,
>> 'info gcc Wpedantic' (your cursor will be placed just below the item
>> line).
>> The fix for the first of these issues should already be applied by
>> Gerald (in the reordering commits, IIRC at least, save for one that I
>> created later because someone snuck in new "misplaced" indices), and
>> that fix should also fix up previous versions of Texinfo.
>> Even with this change, the copiable anchors will remain missing since
>> released Texinfo versions lack some AST transformations that enable
>> those.
>
> OK, I can see the difference there between the current online docs, the set 
> you
> produced with the unreleased Texinfo support, and what I got building with
> Texinfo 6.7.
>
>> Otherwise, manuals should work fine with older releases, unless I missed
>> something when refactoring @defbuiltin and removing @gols (which I do
>> believe are superfluous with current versions of texinfo.tex, which is
>> why I bumped that too).
>
> I did a few spot-checks here and there of those changes.  I saw a couple of
> line break problems but they turn out to be due to existing errors in the 
> .texi
> files that were not introduced by your (mostly mechanical) changes.

Thanks.  I tried to check all usage sites of @gol in PDF output too, to
make sure its removal didn't have a negative impact, but I only tested
new makeinfo and the texinfo.tex I pushed to my branch (2023-01-17.19).

I expect the version of makeinfo to have no impact for that output,
since it should just offload to texi2dvi.

>> FWIW, I (briefly) tested with Texinfo 6.0, and output seems okay.  On
>> 5.0, I got a few warnings, but I think even 6.0 is apt considering its
>> age.  I haven't given it a proper scrutiny, though (workdays are busy
>> this time of year..).
>
> Texinfo 6.0 was released in 2015, 5.0 in 2013.  FWIW, Trusty Tahr (the current
> oldest Ubuntu LTS release) has 5.2.  4.7 was released in 2004, I don't know 
> why
> anyone would still be trying to use that unless it's needed for building 
> legacy
> code from the same era.

Heh, I hadn't realized how far back LTS releases go..  I don't think
there's any new language constructs that the GCC manual could make use
of currently, so it shouldn't be too difficult to retain at least a
"builds with diagnostics" level of support for those versions.

> I think we could do away with the requirement for a specific minimum version,
> and make install.texi say something similar to what it says for e.g. awk --
> just use a "recent" version, and note that new versions produce better output
> and very old ones may produce diagnostics.  I'll add that to my own todo list.

That seems reasonable, thanks.

> -Sandra

-- 
Arsen Arsenović




Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-09 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool  writes:
> On Wed, Mar 08, 2023 at 11:58:51AM +, Richard Sandiford wrote:
>> Segher Boessenkool  writes:
>> > An #ifdef is a way of making a change that is not finished yet not hurt
>> > the other targets.  It still hurts generic development, which indirectly
>> > hurts all targets.
>> 
>> Seems like this might be moot anyway given that your results
>> suggest no impact on other targets.
>
> Which means the patch does not do what it says it does.  It is a net
> negative on the only target it did change code on, too.
>
> If the patch did do what it promises it would be a (large!) net benefit,
> and also on various other targets.

I'm not sure which promise you're referring to here.  The patch wasn't
supposed to be a big sweeping improvement to combine. :-)  It was just
supposed to fix the regression.

> As it is, either the regression wasn't P1 at all, or the patch doesn't
> fix the problem, or the problem only happens in unusual code (or vector
> or float code).  Please explain what the regression is you want to
> solve?  With a compilable testcase etc., the usual.

The testcase is the one from the patch and the PR, compiled at -O2
on aarch64-linux-gnu:

---
extern const int constellation_64qam[64];

void foo(int nbits,
 const char *p_src,
 int *p_dst) {

  while (nbits > 0U) {
char first = *p_src++;

char index1 = ((first & 0x3) << 4) | (first >> 4);

*p_dst++ = constellation_64qam[index1];

nbits--;
  }
}
---

The regression occurred in c23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f.
Before that patch, the loop body was:

.L3:
ldrb    w3, [x1, x0]
ubfiz   w4, w3, 4, 2
orr w3, w4, w3, lsr 4
ldr w3, [x6, w3, sxtw 2] // <
str w3, [x2, x0, lsl 2]
add x0, x0, 1
cmp x5, x0
bne .L3

After the patch it is:

.L3:
ldrb    w0, [x1, x3]
ubfiz   w4, w0, 4, 2 
orr w0, w4, w0, lsr 4
sxtw    x0, w0   // <
ldr w0, [x6, x0, lsl 2]  // <
str w0, [x2, x3, lsl 2]
add x3, x3, 1
cmp x5, x3
bne .L3

Before the patch:

Trying 24 -> 25:
   24: r116:DI=sign_extend(r115:SI)
  REG_DEAD r115:SI
   25: r117:SI=[r116:DI*0x4+r118:DI]
  REG_DEAD r116:DI
  REG_EQUAL [r116:DI*0x4+`constellation_64qam']
Successfully matched this instruction:
(set (reg:SI 117 [ constellation_64qam[_5] ])
(mem/u:SI (plus:DI (mult:DI (sign_extend:DI (reg:SI 115))
(const_int 4 [0x4]))
(reg/f:DI 118)) [1 constellation_64qam[_5]+0 S4 A32]))
allowing combination of insns 24 and 25
original costs 4 + 16 = 20
replacement cost 16
deferring deletion of insn with uid = 24.
modifying insn i3    25: r117:SI=[sign_extend(r115:SI)*0x4+r118:DI]
  REG_DEAD r115:SI
deferring rescan insn with uid = 25.

After the patch:

Trying 24 -> 25:
   24: r116:DI=sign_extend(r115:SI)
  REG_DEAD r115:SI
   25: r117:SI=[r116:DI*0x4+r118:DI]
  REG_DEAD r116:DI
  REG_EQUAL [r116:DI*0x4+`constellation_64qam']
Failed to match this instruction:
(set (reg:SI 117 [ constellation_64qam[_5] ])
(mem/u:SI (plus:DI (and:DI (mult:DI (subreg:DI (reg:SI 115) 0)
(const_int 4 [0x4]))
(const_int 252 [0xfc]))
(reg/f:DI 118)) [1 constellation_64qam[_5]+0 S4 A32]))

expand_compound_operation has the curious code (that Richard pointed
out in the PR):

  /* Convert sign extension to zero extension, if we know that the high
     bit is not set, as this is easier to optimize.  It will be converted
     back to cheaper alternative in make_extraction.  */
  if (GET_CODE (x) == SIGN_EXTEND
      && HWI_COMPUTABLE_MODE_P (mode)
      && ((nonzero_bits (XEXP (x, 0), inner_mode)
           & ~(((unsigned HOST_WIDE_INT) GET_MODE_MASK (inner_mode)) >> 1))
          == 0))
    {
      rtx temp = gen_rtx_ZERO_EXTEND (mode, XEXP (x, 0));
      rtx temp2 = expand_compound_operation (temp);

      /* Make sure this is a profitable operation.  */
      if (set_src_cost (x, mode, optimize_this_for_speed_p)
          > set_src_cost (temp2, mode, optimize_this_for_speed_p))
        return temp2;
      else if (set_src_cost (x, mode, optimize_this_for_speed_p)
               > set_src_cost (temp, mode, optimize_this_for_speed_p))
        return temp;
      else
        return x;
    }

So we only use the expanded version of zero_extend if it is strictly
cheaper than sign_extend.  Otherwise we make a choice between the
original *unexpanded* sign_extend and the original *unexpanded*
zero_extend, preferring to keep things as they are in the event of a tie.
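The three-way choice described above can be modelled on plain integer costs.
This is only a sketch of the decision order, not the combine code itself:
the enum and the function pick are hypothetical names, and the integers
stand in for the set_src_cost results.

```c
#include <assert.h>

/* Hypothetical labels for the three possible results.  */
enum choice
{
  KEEP_SIGN_EXTEND,          /* return x     */
  USE_ZERO_EXTEND,           /* return temp  */
  USE_EXPANDED_ZERO_EXTEND   /* return temp2 */
};

/* Mirror the order of the comparisons in the quoted code: the expanded
   zero_extend wins only if strictly cheaper than the sign_extend, then
   the unexpanded zero_extend only if strictly cheaper, and a tie keeps
   the original sign_extend.  */
static enum choice pick(int cost_sign, int cost_zero, int cost_zero_expanded)
{
  if (cost_sign > cost_zero_expanded)
    return USE_EXPANDED_ZERO_EXTEND;
  else if (cost_sign > cost_zero)
    return USE_ZERO_EXTEND;
  else
    return KEEP_SIGN_EXTEND;
}
```

With all three costs equal, pick returns KEEP_SIGN_EXTEND, which is the tie
case discussed in the text.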

That is, this code bypasses the normal expansion of sign_extends if
we can prove that the top bit of the input is clear.  If all costs are
equal, it will 

Re: [PATCH v2 1/5] docs: Create Indices appendix

2023-03-09 Thread Arsen Arsenović via Gcc-patches

Sandra Loosemore  writes:

> On 2/23/23 03:27, Arsen Arsenović via Gcc-patches wrote:
>> The GCC manual has multiple indices.  By creating an appendix which
>> lists them, we help makeinfo present a more accessible way for the
>> reader to see all the indices.
>> gcc/ChangeLog:
>>  * doc/gcc.texi: Add the Indices appendix, to make texinfo
>>  generate nice indices overview page.
>>  (@copying): Move "This file documents the use of the GNU
>>  compilers" into @copying.  Add quotations around cover texts.
>
>
> I guess this patch is OK and is necessary to smooth over some misfeatures
> in newer versions of Texinfo.  In particular, comparing your sample output
> https://www.aarsen.me/~arsen/final/gcc.html/index.html
>
> to my own fresh Texinfo 6.7-generated version with your patches applied, and
> the existing online documentation like
>
> https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/index.html
>
> the order of the "Short Table of Contents" and longer "Table of Contents" have
> been switched, so that in the new version you have to scroll all the way down
> to the bottom of the page (ugh) to click on "Option Index".  (Frankly, this
> seems like a misfeature; the point of having a "Short Table of Contents" is
> *not* to have to page through the long one to find a particular chapter.)
>
> I guess that is a Texinfo change?  gcc.texi still has:
>
> @summarycontents
> @contents
>
> in that order.

Hm, I hadn't noticed that..  That is odd.  I'll figure out why this
happens later today.  Thanks for bringing this to my attention.

> OTOH, I see that in your new version there is now a line with links
> [Contents][Index] before the Introduction.  If adding this new appendix makes
> the [Index] link point at the indices, I think it is OK, although I'm still
> worried that the overall effect (even without the new version of Texinfo) is
> making the indices harder to find.
>
> I wonder, could we add something to the Introduction text like
>
> Tip: This manual is very long.  If you're looking for something in particular,
> try searching the @ref{Option Index} or @ref{Concept and Symbol Index}.
>
> ???

I think this is a good idea in either case.  It should certainly help
users unaccustomed to Texinfo in locating things quickly.

> -Sandra


-- 
Arsen Arsenović




Re: [PING, PING] Re: [PATCH 2/2] Corrected pr25521.c target matching.

2023-03-09 Thread Cupertino Miranda via Gcc-patches


[PING]

Cupertino Miranda writes:

> Hi Jeff,
>
> Please, please, give me some feedback on this one.
> I just don't want to have to keep asking you for time on these small
> pending patches that I also have to keep track of.
>
> I realized you committed the other one. Thank you!
>
> Best regards,
> Cupertino
>
>
> Cupertino Miranda writes:
>
>> PING !
>>
>> Cupertino Miranda via Gcc-patches writes:
>>
>>> Hi Jeff,
>>>
>>> Can you please confirm if the patch is Ok?
>>>
>>> Thanks,
>>> Cupertino
>>>
 Cupertino Miranda via Gcc-patches writes:

> Thank you for the comments and suggestions.
> I have changed the patch.
>
> Unfortunately, for the rx target I could not make
> scan-assembler-symbol-section match. I believe it is because the
> order of the .section and .global entries is reversed on this target.
>
> The patch is inlined below. Looking forward to your comments.
>
> Cupertino
>
> diff --git a/gcc/testsuite/gcc.dg/pr25521.c 
> b/gcc/testsuite/gcc.dg/pr25521.c
> index 63363a03b9f..82b4cd88ec0 100644
> --- a/gcc/testsuite/gcc.dg/pr25521.c
> +++ b/gcc/testsuite/gcc.dg/pr25521.c
> @@ -2,9 +2,10 @@
> sections.
>
> { dg-require-effective-target elf }
> -   { dg-do compile } */
> +   { dg-do compile }
> +   { dg-skip-if "" { ! const_volatile_readonly_section } } */
>
>  const volatile int foo = 30;
>
> -
> -/* { dg-final { scan-assembler "\\.s\?rodata" } } */
> +/* { dg-final { scan-assembler {.section C,} { target { rx-*-* } } } } */
> +/* { dg-final { scan-assembler-symbol-section {^_?foo$} 
> {^\.(const|s?rodata)} { target { ! "rx-*-*" } } } } */
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index c0694af2338..91aafd89909 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -12295,3 +12295,13 @@ proc check_is_prog_name_available { prog } {
>
>  return 1
>  }
> +
> +# returns 1 if target does selects a readonly section for const volatile 
> variables.
> +proc check_effective_target_const_volatile_readonly_section { } {
> +
> +if { [istarget powerpc-*-*]
> +   || [check-flags { "" { powerpc64-*-* } { -m32 } }] } {
> + return 0
> +}
> +  return 1
> +}
>
>
> Jeff Law writes:
>
>> On 12/7/22 08:45, Cupertino Miranda wrote:
>>>
 On 12/2/22 10:52, Cupertino Miranda via Gcc-patches wrote:
> This commit is a follow up of bugzilla #107181.
> The commit /a0aafbc/ changed the default implementation of the
> SELECT_SECTION hook in order to match clang/llvm behaviour w.r.t the
> placement of `const volatile' objects.
> However, the following targets use target-specific selection functions
> and they choke on the testcase pr25521.c:
>*rx - target sets its const variables as '.section 
> C,"a",@progbits'.
 That's presumably a constant section.  We should instead twiddle the 
 test to
 recognize that section.
>>> Although @progbits is indeed a constant section, I believe it is
>>> more interesting to detect if the `rx' starts selecting more
>>> standard sections instead of the current @progbits.
>>> That was the reason why I opted to XFAIL instead of PASSing it.
>>> Can I keep it as such ?
>> I'm not aware of any ongoing development for that port, so I would not let
>> concerns about the rx port changing behavior dominate how we approach this
>> problem.
>>
>> The rx port is using a different name for the section.  That's a valid thing
>> to do and to the extent we can, we should support that in the test rather
>> than (incorrectly IMHO) xfailing the test just because the name isn't what
>> we expected.
>>
>> To avoid over-eagerly matching, I would probably search for "C,".  I
>> wouldn't do that for the const or rodata sections as they often have a
>> suffix like 1, 2, 4, 8 for different sized rodata sections.
>>
>> PPC32 is explicitly doing something different and placing those objects into
>> an RW section.  So for PPC32 it makes more sense to skip the test rather
>> than xfail it.
>>
>> Jeff


Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-09 Thread Kewen.Lin via Gcc-patches
Hi Peter,

on 2023/3/9 07:01, Peter Bergner via Gcc-patches wrote:
> PR109073 shows a problem where GCC 11 and GCC 10 do not accept a const
> __vector_pair pointer operand to some MMA builtins, which GCC 12 and later
> correctly accept.  Fixed here by initializing the builtins to accept const
> pointers.
> 
> This patch was tested in both GCC 11 and GCC 10 on powerpc64le-linux and
> showed no regressions.  Ok for backports?
> 
> Peter
> 
> 
> gcc/
> 
>   PR target/109073
>   * config/rs6000/rs6000-call.c (mma_init_builtins): Accept const pointer
>   operands for lxvp, stxvp and disassemble builtins.
> 
> gcc/testsuite/
> 
>   PR target/109073
>   * gcc.target/powerpc/mma-builtin-4.c): New const * test. Update
   ~~ typo.

>   expected instruction counts.
>   * gcc.target/powerpc/mma-builtin-5.c: Likewise.
>   * gcc.target/powerpc/mma-builtin-7.c: Likewise.
> 
> 
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 1be4797e834..3b6d40f0aef 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -14343,22 +14343,30 @@ mma_init_builtins (void)
>   {
> op[nopnds++] = build_pointer_type (void_type_node);
> if (d->code == MMA_BUILTIN_DISASSEMBLE_ACC)
> - op[nopnds++] = build_pointer_type (vector_quad_type_node);
> + op[nopnds++] = build_pointer_type (build_qualified_type
> +  (vector_quad_type_node,
> +   TYPE_QUAL_CONST));

Nit: Maybe we can build them out of the loop once and then just use the
built one in the loop.

> else
> - op[nopnds++] = build_pointer_type (vector_pair_type_node);
> + op[nopnds++] = build_pointer_type (build_qualified_type
> +  (vector_pair_type_node,
> +   TYPE_QUAL_CONST));
>   }
>else if (d->code == VSX_BUILTIN_LXVP)
>   {
> op[nopnds++] = vector_pair_type_node;
> op[nopnds++] = sizetype;
> -   op[nopnds++] = build_pointer_type (vector_pair_type_node);
> +   op[nopnds++] = build_pointer_type (build_qualified_type
> +(vector_pair_type_node,
> + TYPE_QUAL_CONST));
>   }
>else if (d->code == VSX_BUILTIN_STXVP)
>   {
> op[nopnds++] = void_type_node;
> op[nopnds++] = vector_pair_type_node;
> op[nopnds++] = sizetype;
> -   op[nopnds++] = build_pointer_type (vector_pair_type_node);
> +   op[nopnds++] = build_pointer_type (build_qualified_type
> +(vector_pair_type_node,
> + TYPE_QUAL_CONST));

I wonder whether the set of bifs updated here is complete.  The reason I ask
is that on trunk *ptr_vector_pair_type_node* is used for function types
v1poi_ftype_ulg_pv1poi, v_ftype_pv1poi_uv16qi_uv16qi, v_ftype_pv_pv1poi and
v_ftype_v1poi_ulg_pv1poi, and *ptr_vector_quad_type_node* is used for function
types v_ftype_pv1pxi, v_ftype_pv1pxi_uv16qi_uv16qi,
v_ftype_pv1pxi_uv16qi_uv16qi_ci_ci, v_ftype_pv1pxi_uv16qi_uv16qi_ci_ci_ci,
v_ftype_pv1pxi_uv16qi_uv16qi_uv16qi_uv16qi, v_ftype_pv1pxi_v1poi_uv16qi,
v_ftype_pv1pxi_v1poi_uv16qi_ci_ci and v_ftype_pv_pv1pxi.

These function types are further used for bifs as follows:

__builtin_vsx_lxvp
__builtin_mma_assemble_pair
__builtin_vsx_assemble_pair
__builtin_vsx_build_pair
__builtin_mma_disassemble_pair
__builtin_vsx_disassemble_pair
__builtin_vsx_stxvp
__builtin_mma_xxmfacc
__builtin_mma_xxmtacc
__builtin_mma_xxsetaccz
...
... and more ...

Simply testing __builtin_mma_xxmtacc and __builtin_mma_xxmfacc as below:

$ cat test.C
void foo0(const __vector_quad *acc) {
  __builtin_mma_xxmtacc(acc);
  __builtin_mma_xxmfacc(acc);
}

test.C:2:25: error: invalid conversion from ‘const __vector_quad*’ to ‘__vector_quad*’ [-fpermissive]
2 |   __builtin_mma_xxmtacc(acc);

test.C:3:25: error: invalid conversion from ‘const __vector_quad*’ to ‘__vector_quad*’ [-fpermissive]
3 |   __builtin_mma_xxmfacc(acc);

They also suffered the same error on gcc11 branch but not on trunk.

Besides, I'm not sure if the existing bif declarations using
ptr_vector_pair_type_node and ptr_vector_quad_type_node are all intentional.
At least it looks weird to me that we declare const __vector_pair* for
__builtin_vsx_stxvp, which is meant to store 32 bytes into the memory provided
by the pointer biased by the sizetype offset, while the "const" qualifier seems
to tell that this bif doesn't modify the memory pointed to by the given pointer.
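The const-correctness point can be shown with a portable analogue that does
not use the MMA builtins at all.  In this sketch pair32, load_pair and
store_pair are hypothetical stand-ins (not the rs6000 builtins): the load
takes a pointer to const because it only reads, while the store takes a
pointer to non-const because it writes through it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 32-byte stand-in for __vector_pair.  */
typedef struct { unsigned char bytes[32]; } pair32;

/* Load 32 bytes from BASE + OFFSET.  The memory is only read, so the
   pointer parameter is const-qualified.  */
static pair32 load_pair(size_t offset, const pair32 *base)
{
  pair32 out;
  memcpy(&out, (const unsigned char *) base + offset, sizeof out);
  return out;
}

/* Store 32 bytes to BASE + OFFSET.  The memory is modified, so the
   pointer parameter is deliberately not const-qualified.  */
static void store_pair(pair32 value, size_t offset, pair32 *base)
{
  memcpy((unsigned char *) base + offset, &value, sizeof value);
}
```

With these signatures a caller holding only a const pair32 * can still load,
but passing that pointer to store_pair needs a cast, which is the kind of
signal the "const" qualifier is meant to give.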

As a contrast, for bif vec_xl (a, b), b is the address argument and of type
const TYPE *, while for bif vec_xst (a, b, c), c is the address and of type
TYPE *, here we don't add const qualifier for
the 

Re: [PATCH v2] vect: Check that vector factor is a compile-time constant

2023-03-09 Thread Richard Biener via Gcc-patches
On Thu, Mar 9, 2023 at 8:57 AM Michael Collison  wrote:

OK.

Thanks,
Richard.

> 2023-03-05  Michael Collison  
>
> * tree-vect-loop-manip.cc (vect_do_peeling): Use
> result of constant_lower_bound instead of vf in case
> vf is not a compile time constant.
> ---
>  gcc/tree-vect-loop-manip.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index d88edafa018..f60fa50e8f4 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -2921,7 +2921,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, 
> tree nitersm1,
>if (new_var_p)
> {
>   value_range vr (type,
> - wi::to_wide (build_int_cst (type, vf)),
> + wi::to_wide (build_int_cst (type, lowest_vf)),
>   wi::to_wide (TYPE_MAX_VALUE (type)));
>   set_range_info (niters, vr);
> }
> --
> 2.34.1
>


Re: [PATCH] ubsan: missed -fsanitize=bounds for compound ops [PR108060]

2023-03-09 Thread Jakub Jelinek via Gcc-patches
On Thu, Mar 09, 2023 at 08:12:47AM +, Richard Biener wrote:
> I think this is a reasonable way to address the regression, so OK.

It is true that both C and C++ (including c++14_down and c++17 and later
where the latter have different ordering rules) evaluate the lhs of
MODIFY_EXPR after rhs, so conceptually this patch makes sense.
But I wonder why we do in ubsan_maybe_instrument_array_ref:
  if (e != NULL_TREE)
    {
      tree t = copy_node (*expr_p);
      TREE_OPERAND (t, 1) = build2 (COMPOUND_EXPR, TREE_TYPE (op1),
                                    e, op1);
      *expr_p = t;
    }
rather than modification of the ARRAY_REF's operand in place.  If we
did that, we wouldn't really care about the order, shared tree would
be instrumented once, with SAVE_EXPR in there making sure we don't
compute that multiple times.  Is that because the 2 copies could
have side-effects and we do want to evaluate those multiple times?

Jakub



Re: [PATCH] range-op-float: Fix up reverse binary operations [PR109008]

2023-03-09 Thread Richard Biener via Gcc-patches
On Thu, 9 Mar 2023, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is reduced from miscompilation of scipy package.
> If we have say lhs = [1., 1.] - [1., 1.] and want to compute the range
> of lhs from it, we correctly determine it is [0., 0.] (if computations
> are exact, we generally don't try to round them further in
> frange_arithmetic).  In the testcase it is about a reverse operation,
> [1., 1.] = op1 + [1., 1.] and we want to compute range of op1 from that.
> Right now we just perform the inverse operation (there are some corner
> cases about NaN and infinities handling) and so arrive to range
> [0., 0.] as well, and because it is a singleton, optimize return eps;
> to return 0.  That is incorrect though, for the reverse ops we need to
> take into account also rounding, the right exact range is
> [-0x1.0p-54, 0x1.0p-53] in this case when rounding to nearest, i.e.
> all numbers which added to 1. with round to nearest still produce 1.
> 
> The problem isn't solely on singleton ranges, and isn't solely on
> results around zero.  We basically need to consider also values
> where the result is up to 0.5ulp away from the lhs range boundaries
> in each direction.
> 
> The following patch fixes it by extending the lhs range for the
> reverse operations by 1ulp in each direction.  The PR contains
> a pseudo-random test generator I've used to generate 30 tests
> of + and - and then used the same test with * and / instead of + and -
> together with a hack to print the discovered ranges by the patch in
> a form that another test could then verify the range is conservatively
> correct and how far it is from a minimal range.
> 
> I believe the results are good enough for now, though plan to look
> incrementally into trying to do something better on the -XXX_MAX or
> XXX_MAX boundaries (where I think frange_nextafter will use -inf or +inf)
> and also try to increase the range just by 0.5ulp rather than 1ulp
> if !flag_rounding_math.  But dunno if either of those will be doable
> and will pass the testing, so I think it is worth committing this fix
> first.

Sounds good.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2023-03-09  Jakub Jelinek  
>   Richard Biener  
> 
>   PR tree-optimization/109008
>   * range-op-float.cc (float_widen_lhs_range): New function.
>   (foperator_plus::op1_range, foperator_minus::op1_range,
>   foperator_minus::op2_range, foperator_mult::op1_range,
>   foperator_div::op1_range, foperator_div::op2_range): Use it.
> 
>   * gcc.c-torture/execute/ieee/pr109008.c: New test.
> 
> --- gcc/range-op-float.cc.jj  2023-03-08 12:33:44.641043477 +0100
> +++ gcc/range-op-float.cc 2023-03-08 13:13:09.015341002 +0100
> @@ -2199,6 +2199,33 @@ zero_to_inf_range (REAL_VALUE_TYPE ,
>  }
>  }
>  
> +/* Extend the LHS range by 1ulp in each direction.  For op1_range
> +   or op2_range of binary operations just computing the inverse
> +   operation on ranges isn't sufficient.  Consider e.g.
> +   [1., 1.] = op1 + [1., 1.].  op1's range is not [0., 0.], but
> +   [-0x1.0p-54, 0x1.0p-53] (when not -frounding-math), any value for
> +   which adding 1. to it results in 1. after rounding to nearest.
> +   So, for op1_range/op2_range extend the lhs range by 1ulp in each
> +   direction.  See PR109008 for more details.  */
> +
> +static frange
> +float_widen_lhs_range (tree type, const frange &lhs)
> +{
> +  frange ret = lhs;
> +  if (lhs.known_isnan ())
> +    return ret;
> +  REAL_VALUE_TYPE lb = lhs.lower_bound ();
> +  REAL_VALUE_TYPE ub = lhs.upper_bound ();
> +  if (real_isfinite (&lb))
> +    frange_nextafter (TYPE_MODE (type), lb, dconstninf);
> +  if (real_isfinite (&ub))
> +    frange_nextafter (TYPE_MODE (type), ub, dconstinf);
> +  ret.set (type, lb, ub);
> +  ret.clear_nan ();
> +  ret.union_ (lhs);
> +  return ret;
> +}
> +
>  class foperator_plus : public range_operator_float
>  {
>using range_operator_float::op1_range;
> @@ -2214,8 +2241,9 @@ public:
>  range_op_handler minus (MINUS_EXPR, type);
>  if (!minus)
>return false;
> -return float_binary_op_range_finish (minus.fold_range (r, type, lhs, 
> op2),
> -  r, type, lhs);
> +frange wlhs = float_widen_lhs_range (type, lhs);
> +return float_binary_op_range_finish (minus.fold_range (r, type, wlhs, 
> op2),
> +  r, type, wlhs);
>}
>virtual bool op2_range (frange , tree type,
> const frange ,
> @@ -2260,9 +2288,10 @@ public:
>{
>  if (lhs.undefined_p ())
>return false;
> -return float_binary_op_range_finish (fop_plus.fold_range (r, type, lhs,
> +frange wlhs = float_widen_lhs_range (type, lhs);
> +return float_binary_op_range_finish (fop_plus.fold_range (r, type, wlhs,
> op2),
> -  r, type, lhs);
> +  

[PATCH] range-op-float: Fix up reverse binary operations [PR109008]

2023-03-09 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase is reduced from miscompilation of scipy package.
If we have say lhs = [1., 1.] - [1., 1.] and want to compute the range
of lhs from it, we correctly determine it is [0., 0.] (if computations
are exact, we generally don't try to round them further in
frange_arithmetic).  In the testcase it is about a reverse operation,
[1., 1.] = op1 + [1., 1.] and we want to compute range of op1 from that.
Right now we just perform the inverse operation (there are some corner
cases about NaN and infinities handling) and so arrive to range
[0., 0.] as well, and because it is a singleton, optimize return eps;
to return 0.  That is incorrect though, for the reverse ops we need to
take into account also rounding, the right exact range is
[-0x1.0p-54, 0x1.0p-53] in this case when rounding to nearest, i.e.
all numbers which added to 1. with round to nearest still produce 1.
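The boundary values quoted above can be checked directly.  The sketch below
is only an illustration under stated assumptions: double is IEEE 754
binary64, the rounding mode is round-to-nearest-even, and there is no excess
precision (x87 double rounding would change the results).  The helper name
sums_to_one is hypothetical.

```c
#include <assert.h>

/* Return nonzero if OP1 + 1.0 rounds to exactly 1.0.  The volatile
   objects discourage compile-time folding so the addition is done at
   run time (assumption: binary64, round-to-nearest-even, no excess
   precision).  */
static int sums_to_one(double op1)
{
  volatile double one = 1.0;
  volatile double sum = op1 + one;
  return sum == 1.0;
}
```

Under those assumptions sums_to_one accepts 0., 0x1.0p-53 and -0x1.0p-54,
and rejects the next representable value beyond each of the two bounds,
0x1.0000000000001p-53 and -0x1.0000000000001p-54.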

The problem isn't solely on singleton ranges, and isn't solely on
results around zero.  We basically need to consider also values
where the result is up to 0.5ulp away from the lhs range boundaries
in each direction.

The following patch fixes it by extending the lhs range for the
reverse operations by 1ulp in each direction.  The PR contains
a pseudo-random test generator I've used to generate 30 tests
of + and - and then used the same test with * and / instead of + and -
together with a hack to print the discovered ranges by the patch in
a form that another test could then verify the range is conservatively
correct and how far it is from a minimal range.
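Widening an interval by 1ulp in each direction can be sketched on plain
doubles.  This is a model only, not the frange code from the patch: struct
interval, next_up, next_down and widen_by_one_ulp are hypothetical names, and
the bit manipulation assumes IEEE 754 binary64 with the same byte order for
double and uint64_t.  Non-finite bounds are left alone, loosely mirroring the
real_isfinite checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint64_t bits_of(double x)
{
  uint64_t b;
  memcpy(&b, &x, sizeof b);
  return b;
}

static double double_of(uint64_t b)
{
  double x;
  memcpy(&x, &b, sizeof x);
  return x;
}

/* Next representable double above X.  Infinities and NaNs (exponent
   field all ones) are returned unchanged.  */
static double next_up(double x)
{
  uint64_t b = bits_of(x);
  if (((b >> 52) & 0x7ff) == 0x7ff)
    return x;
  if (x == 0.0)
    return double_of(1);        /* smallest positive subnormal */
  if (x > 0.0)
    b++;                        /* larger magnitude */
  else
    b--;                        /* smaller magnitude, i.e. toward zero */
  return double_of(b);
}

/* Next representable double below X, by symmetry.  */
static double next_down(double x)
{
  return -next_up(-x);
}

/* Hypothetical closed interval [lb, ub].  */
struct interval { double lb, ub; };

/* Extend R by one ulp in each direction.  */
static struct interval widen_by_one_ulp(struct interval r)
{
  struct interval w = { next_down(r.lb), next_up(r.ub) };
  return w;
}
```

Widening [1., 1.] this way gives [0x1.fffffffffffffp-1, 0x1.0000000000001p+0].
Subtracting 1. from both ends yields a range that contains the exact
[-0x1.0p-54, 0x1.0p-53] mentioned earlier, so the result is conservative
rather than minimal.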

I believe the results are good enough for now, though plan to look
incrementally into trying to do something better on the -XXX_MAX or
XXX_MAX boundaries (where I think frange_nextafter will use -inf or +inf)
and also try to increase the range just by 0.5ulp rather than 1ulp
if !flag_rounding_math.  But dunno if either of those will be doable
and will pass the testing, so I think it is worth committing this fix
first.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-03-09  Jakub Jelinek  
Richard Biener  

PR tree-optimization/109008
* range-op-float.cc (float_widen_lhs_range): New function.
(foperator_plus::op1_range, foperator_minus::op1_range,
foperator_minus::op2_range, foperator_mult::op1_range,
foperator_div::op1_range, foperator_div::op2_range): Use it.

* gcc.c-torture/execute/ieee/pr109008.c: New test.

--- gcc/range-op-float.cc.jj  2023-03-08 12:33:44.641043477 +0100
+++ gcc/range-op-float.cc   2023-03-08 13:13:09.015341002 +0100
@@ -2199,6 +2199,33 @@ zero_to_inf_range (REAL_VALUE_TYPE ,
 }
 }
 
+/* Extend the LHS range by 1ulp in each direction.  For op1_range
+   or op2_range of binary operations just computing the inverse
+   operation on ranges isn't sufficient.  Consider e.g.
+   [1., 1.] = op1 + [1., 1.].  op1's range is not [0., 0.], but
+   [-0x1.0p-54, 0x1.0p-53] (when not -frounding-math), any value for
+   which adding 1. to it results in 1. after rounding to nearest.
+   So, for op1_range/op2_range extend the lhs range by 1ulp in each
+   direction.  See PR109008 for more details.  */
+
+static frange
+float_widen_lhs_range (tree type, const frange &lhs)
+{
+  frange ret = lhs;
+  if (lhs.known_isnan ())
+    return ret;
+  REAL_VALUE_TYPE lb = lhs.lower_bound ();
+  REAL_VALUE_TYPE ub = lhs.upper_bound ();
+  if (real_isfinite (&lb))
+    frange_nextafter (TYPE_MODE (type), lb, dconstninf);
+  if (real_isfinite (&ub))
+    frange_nextafter (TYPE_MODE (type), ub, dconstinf);
+  ret.set (type, lb, ub);
+  ret.clear_nan ();
+  ret.union_ (lhs);
+  return ret;
+}
+
 class foperator_plus : public range_operator_float
 {
   using range_operator_float::op1_range;
@@ -2214,8 +2241,9 @@ public:
 range_op_handler minus (MINUS_EXPR, type);
 if (!minus)
   return false;
-return float_binary_op_range_finish (minus.fold_range (r, type, lhs, op2),
-r, type, lhs);
+frange wlhs = float_widen_lhs_range (type, lhs);
+return float_binary_op_range_finish (minus.fold_range (r, type, wlhs, op2),
+r, type, wlhs);
   }
   virtual bool op2_range (frange , tree type,
  const frange ,
@@ -2260,9 +2288,10 @@ public:
   {
 if (lhs.undefined_p ())
   return false;
-return float_binary_op_range_finish (fop_plus.fold_range (r, type, lhs,
+frange wlhs = float_widen_lhs_range (type, lhs);
+return float_binary_op_range_finish (fop_plus.fold_range (r, type, wlhs,
  op2),
-r, type, lhs);
+r, type, wlhs);
   }
   virtual bool op2_range (frange , tree type,
  const frange ,
@@ -2271,8 +2300,9 @@ public:
   {
 if (lhs.undefined_p ())
   return false;
-return float_binary_op_range_finish 

Re: [PATCH v2 0/5] A small Texinfo refinement

2023-03-09 Thread Gerald Pfeifer
On Wed, 8 Mar 2023, Sandra Loosemore wrote:
> I personally do not know how the manuals for the GCC web site are built

gcc.gnu.org has texinfo 6.5. (It's a RHEL 8.7 system.)

> If we do update the version, there's a version check in configure.ac and 
> some hack for "makeinfo 4.7 brokenness" in doc/install.texi2html that 
> need to be changed, as well as install.texi.

The note in doc/install.texi2html isn't specific to version 4.7. Rather 
it's about a design decision made then to encode dashes as _002d which
essentially solved a non-issue and isn't practical at all.

Good point - I'll see to updating that comment.

Gerald


Re: [PATCH v2 0/5] A small Texinfo refinement

2023-03-09 Thread Richard Biener via Gcc-patches
On Thu, Mar 9, 2023 at 2:20 AM Andrew Pinski via Gcc-patches
 wrote:
>
> On Wed, Mar 8, 2023 at 5:09 PM Sandra Loosemore  
> wrote:
> >
> > On 3/8/23 14:22, Arsen Arsenović wrote:
> > >
> > > Sandra Loosemore  writes:
> > >
> > >> On 3/8/23 02:11, Arsen Arsenović wrote:
> > >>> Sandra Loosemore  writes:
> > >>>
> >  On 2/23/23 03:27, Arsen Arsenović via Gcc-patches wrote:
> > > I've rerendered the updated documentation with latest development
> > > Texinfo (as some of the changes I made for the purposes of the GCC
> > > manual still aren't in releases) at:
> > >  https://www.aarsen.me/~arsen/final/
> > 
> >  Ummm.  I don't think GCC's documentation should depend on an 
> >  unreleased version
> >  of Texinfo.  Currently install.texi documents that version 4.7 or 
> >  later is
> >  required, 4.8 for "make pdf"; did I miss something in your patch set 
> >  that bumps
> >  this requirement?  Exactly what features do you depend on that are not 
> >  yet
> >  supported by an official Texinfo release?
> > >>> This patch should still build with older Texinfo versions (albeit, I
> > >>> hadn't tested 4.7, I missed that requirement).  The unreleased version
> > >>> should be installed on the server building HTML documentation as it
> > >>> produces better results w.r.t clickable anchors and index-in-table
> > >>> handling.  It should not be a hard dependency, and should only degrade
> > >>> to its current state should in-dev Texinfo be missing.
> > >>
> > >> Hmmm, OK.  We presently have Texinfo version 6.7 installed here, so I'll 
> > >> give
> > >> that a try.  I'm not sure I'd be able to detect problems with incorrect 
> > >> HTML
> > >> anchors or whatever, though.
> > >
> > > As an example, let's take this link:
> > > https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Warning-Options.html#index-Wpedantic
> > >
> > > This should place you below the item line this index entry refers to,
> > > and there aren't any copiable anchors (see equivalent in my render for
> > > an example of those), both of which were often named as annoyances with
> > > the onlinedocs while the Sphinx experiment was taking place.
> > >
> > > A similar thing happens in the standalone and Emacs info viewers (but
> > > that's less noticeable there since the cursor is placed in the middle of
> > > the screen when jumping to an index entry there).  Try, for instance,
> > > 'info gcc Wpedantic' (your cursor will be placed just below the item
> > > line).
> > >
> > > The fix for the first of these issues should already be applied by
> > > Gerald (in the reordering commits, IIRC at least, save for one that I
> > > created later because someone snuck in new "misplaced" indices), and
> > > that fix should also fix up previous versions of Texinfo.
> > >
> > > Even with this change, the copiable anchors will remain missing since
> > > released Texinfo versions lack some AST transformations that enable
> > > those.
> >
> > OK, I can see the difference there between the current online docs, the
> > set you produced with the unreleased Texinfo support, and what I got
> > building with Texinfo 6.7.
> >
> > > Otherwise, manuals should work fine with older releases, unless I missed
> > > something when refactoring @defbuiltin and removing @gols (which I do
> > > believe are superfluous with current versions of texinfo.tex, which is
> > > why I bumped that too).
> >
> > I did a few spot-checks here and there of those changes.  I saw a couple
> > of line break problems but they turn out to be due to existing errors in
> > the .texi files that were not introduced by your (mostly mechanical)
> > changes.
> >
> > >> Most people building GCC from source probably use whatever versions of 
> > >> build
> > >> dependencies are provided by their OS distribution.  In our group we need
> > >> reproducible builds for long-term support so we maintain our own list of
> > >> dependencies and normally update to the latest stable versions only once 
> > >> every
> > >> few years unless there is a hard requirement to upgrade some particular 
> > >> tool
> > >> meanwhile.  I personally do not know how the manuals for the GCC web 
> > >> site are
> > >> built, but it seems kind of important to make sure that works as 
> > >> intended since
> > >> it's the main online resource for ordinary GCC users.
> > >
> > > Yes, I can get behind this sentiment too.  I don't mean to impose a hard
> > > dependency on the bleeding edge of Texinfo.  My target was indeed the
> > > GCC website and ordinary users.
> > >
> > >>> It might be worth bumping the minimum, 4.7 is a version from 2004; in
> > >>> the meanwhile, I'll try a few older versions too.
> > >>
> > >> I agree that it's unlikely anyone is building current GCC with a Texinfo
> > >> version as old as 4.7 any more, and it may be that the manual doesn't 
> > >> even
> > >> build properly with such an old release due to existing unintentional
> > >> dependencies on newer 

Re: [PATCH] -Wdangling-pointer: don't mark SSA lhs sets as stores

2023-03-09 Thread Alexandre Oliva via Gcc-patches
On Mar  8, 2023, Richard Biener  wrote:

> On Wed, Mar 8, 2023 at 2:04 PM Martin Liška  wrote:

>> Is the emitted warning correct?

> For the reduced testcase yes, if !aio_bh_poll_s (or !aio_bh_poll_bh)
> the stored pointer remains local.

*nod*, before the recent patch, it would have failed to issue the
warning in somewhat unpredictable circumstances.

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH] ubsan: missed -fsanitize=bounds for compound ops [PR108060]

2023-03-09 Thread Richard Biener via Gcc-patches
On Wed, 8 Mar 2023, Marek Polacek wrote:

> In this PR we are dealing with a missing .UBSAN_BOUNDS, so the
> out-of-bounds access in the test makes the program crash before
> a UBSan diagnostic was emitted.  In C and C++, c_genericize gets
> 
>   a[b] = a[b] | c;
> 
> but in C, both a[b] are one identical shared tree (not in C++ because
> cp_fold/ARRAY_REF created two same but not identical trees).  Since
> ubsan_walk_array_refs_r keeps a pset, in C we produce
> 
>   a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] = a[b] | c;
> 
> because the LHS is walked before the RHS.
> 
> Since r7-1900, we gimplify the RHS before the LHS.  So the statement above
> gets gimplified into
> 
> _1 = a[b];
> c.0_2 = c;
> b.1 = b;
> .UBSAN_BOUNDS (0B, b.1, 8);
> 
> With this patch we produce:
> 
>   a[b] = a[.UBSAN_BOUNDS (0B, SAVE_EXPR <b>, 8);, SAVE_EXPR <b>;] | c;
> 
> which gets gimplified into:
> 
> b.0 = b;
> .UBSAN_BOUNDS (0B, b.0, 8);
> _1 = a[b.0];
> 
> therefore we emit a runtime error before making the bad array access.
> 
> I think it's OK that only the RHS gets a .UBSAN_BOUNDS, as in few lines
> above: the instrumented array access dominates the array access on the
> LHS, and I've verified that
> 
>   b = 0;
>   a[b] = (a[b], b = -32768, a[0] | c);
> 
> works as expected: the inner a[b] is OK but we do emit an error for the
> a[b] on the LHS.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/12?

I think this is a reasonable way to address the regression, so OK.

Thanks,
Richard.

>   PR sanitizer/108060
>   PR sanitizer/109050
> 
> gcc/c-family/ChangeLog:
> 
>   * c-gimplify.cc (ubsan_walk_array_refs_r): For a MODIFY_EXPR, instrument
>   the RHS before the LHS.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/ubsan/bounds-17.c: New test.
>   * c-c++-common/ubsan/bounds-18.c: New test.
>   * c-c++-common/ubsan/bounds-19.c: New test.
>   * c-c++-common/ubsan/bounds-20.c: New test.
> ---
>  gcc/c-family/c-gimplify.cc   | 12 
>  gcc/testsuite/c-c++-common/ubsan/bounds-17.c | 17 +
>  gcc/testsuite/c-c++-common/ubsan/bounds-18.c | 17 +
>  gcc/testsuite/c-c++-common/ubsan/bounds-19.c | 20 
>  gcc/testsuite/c-c++-common/ubsan/bounds-20.c | 16 
>  5 files changed, 82 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-17.c
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-18.c
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-19.c
>  create mode 100644 gcc/testsuite/c-c++-common/ubsan/bounds-20.c
> 
> diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc
> index 74b276b2b26..ef5c7d919fc 100644
> --- a/gcc/c-family/c-gimplify.cc
> +++ b/gcc/c-family/c-gimplify.cc
> @@ -106,6 +106,18 @@ ubsan_walk_array_refs_r (tree *tp, int *walk_subtrees, 
> void *data)
>  }
>else if (TREE_CODE (*tp) == ARRAY_REF)
>  ubsan_maybe_instrument_array_ref (tp, false);
> +  else if (TREE_CODE (*tp) == MODIFY_EXPR)
> +{
> +  /* Since r7-1900, we gimplify RHS before LHS.  Consider
> +a[b] |= c;
> +  wherein we can have a single shared tree a[b] in both LHS and RHS.
> +  If we only instrument the LHS and the access is invalid, the program
> +  could crash before emitting a UBSan error.  So instrument the RHS
> +  first.  */
> +  *walk_subtrees = 0;
> +  walk_tree (&TREE_OPERAND (*tp, 1), ubsan_walk_array_refs_r, pset, pset);
> +  walk_tree (&TREE_OPERAND (*tp, 0), ubsan_walk_array_refs_r, pset, pset);
> +}
>return NULL_TREE;
>  }
>  
> diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-17.c 
> b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
> new file mode 100644
> index 000..b727e3235b8
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/ubsan/bounds-17.c
> @@ -0,0 +1,17 @@
> +/* PR sanitizer/108060 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds" } */
> +/* { dg-skip-if "" { *-*-* } "-flto" } */
> +/* { dg-shouldfail "ubsan" } */
> +
> +int a[8];
> +int c;
> +
> +int
> +main ()
> +{
> +  int b = -32768;
> +  a[b] |= c;
> +}
> +
> +/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
> diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-18.c 
> b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
> new file mode 100644
> index 000..556abc0e1c0
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/ubsan/bounds-18.c
> @@ -0,0 +1,17 @@
> +/* PR sanitizer/108060 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds" } */
> +/* { dg-skip-if "" { *-*-* } "-flto" } */
> +/* { dg-shouldfail "ubsan" } */
> +
> +int a[8];
> +int c;
> +
> +int
> +main ()
> +{
> +  int b = -32768;
> +  a[b] = a[b] | c;
> +}
> +
> +/* { dg-output "index -32768 out of bounds for type 'int \\\[8\\\]'" } */
> diff --git a/gcc/testsuite/c-c++-common/ubsan/bounds-19.c 
> b/gcc/testsuite/c-c++-common/ubsan/bounds-19.c
> new file mode 

Re: [wwwdocs] gcc-13/porting_to.html: Document C++ -fexcess-precision=standard

2023-03-09 Thread Jakub Jelinek via Gcc-patches
On Thu, Mar 09, 2023 at 08:09:02AM +0100, Gerald Pfeifer wrote:
> I struggled a bit understanding this and so have come up with what I 
> hope is simpler (without changing the meaning).
> 
> What do you think of the change below?

LGTM, thanks.
> 
> diff --git a/htdocs/gcc-13/porting_to.html b/htdocs/gcc-13/porting_to.html
> index 170da096..8a2822ff 100644
> --- a/htdocs/gcc-13/porting_to.html
> +++ b/htdocs/gcc-13/porting_to.html
> @@ -122,12 +122,14 @@ the operand as an lvalue.
>  
>  Excess precision changes
>  
> -GCC 13 implements in C++ excess precision 
> support
> -which has been before implemented just in the C front end.  The new behavior 
> is
> -enabled by default in -std=c++NN modes and e.g. when
> -FLT_EVAL_METHOD is 1 or 2 affects behavior of floating point
> -constants and expressions.  E.g. for FLT_EVAL_METHOD equal
> -to 2 on ia32:
> +GCC 13 implements excess precision
> +support, which was implemented just in the C front end
> +before, in C++. The new behavior is enabled by default in
> +-std=c++NN modes and when
> +FLT_EVAL_METHOD is 1 or 2 and affects the behavior of
> +floating point constants and expressions.
> +
> +E.g. for FLT_EVAL_METHOD equal to 2 on ia32
>  
>  
>  #include <stdlib.h>
> @@ -139,11 +141,11 @@ will not abort with standard excess precision, because 
> constants and expressions
>  in float or double are evaluated in precision of
>  long double and demoted only on casts or assignments, but will
>  abort with fast excess precision, where whether something is evaluated in
> -precision of long double or not depends on what evaluations are
> -done in the i387 floating point stack or are spilled from it.
> +long double precision depends on what evaluations are
> +done in the i387 floating point stack or are spilled from it.
>  
> -The -fexcess-precision=fast option can be used to request the
> -previous behavior.
> +The -fexcess-precision=fast option can be used to
> +request the previous behavior.
>  
>  allocator_traits<A>::rebind_alloc<A::value_type>
>  must be A
>  

Jakub
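
For readers who want to see the "demoted only on casts or assignments" rule in code, here is a minimal sketch. It is not the example from porting_to.html, whose body is elided in the hunk above; demoted_sum and eval_method are hypothetical helpers. It relies only on FLT_EVAL_METHOD from <float.h> and on the fact that storing to a float object drops any excess precision.

```c
#include <float.h>

/* Returns x + y demoted to float.  The store to a volatile float is
   an assignment, so whatever extra precision the sum was computed in
   (double or long double when FLT_EVAL_METHOD is 1 or 2) is dropped
   at this point.  */
static float
demoted_sum (float x, float y)
{
  volatile float s = x + y;
  return s;
}

/* Reports the evaluation method the compiler advertises: 0 means
   expressions are evaluated in their own type, 1 and 2 mean float
   expressions are evaluated in double or long double respectively.  */
static int
eval_method (void)
{
  return (int) FLT_EVAL_METHOD;
}
```

Two calls to demoted_sum with the same arguments compare equal whatever eval_method reports, because both results have been rounded to float by the assignment. Comparing values that were never stored or cast is where the standard and fast excess precision modes can disagree.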