Re: [PATCH V2] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 10:40:01PM +0800, liuhongt wrote:
> gcc/ChangeLog:
> 
>   PR target/115384
>   * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
>   Only do the simplification of (AND (ASHIFTRT A imm) mask)
>   to (LSHIFTRT A imm) when the component of const_vector is
>   CONST_INT_P.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/i386/pr115384.c: New test.

LGTM, except

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr115384.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile { target { ! ia32 } } } */

Maybe just int128 target instead of { ! ia32 } would be more appropriate.

Ok either way.

> +/* { dg-options "-O" } */
> +
> +typedef __attribute__((__vector_size__(sizeof(__int128)))) __int128 W;
> +
> +W w;
> +
> +void
> +foo()
> +{
> +  w = w >> 4 & 18446744073709551600llu;
> +}

Jakub



Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 10:06:49AM +0200, Richard Biener wrote:
> > appropriate #define _POSIX_C_SOURCE or #define _XOPEN_SOURCE before the
> > include in case somebody builds with -std=c99?
> 
> Oh, and the manpage says that <stdio.h> also defines ssize_t which
> is a bit odd since we already include that ...

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/stdio.h.html
shows that indeed POSIX 2018 stdio.h should provide ssize_t, but
e.g. POSIX 2004 stdio.h doesn't have to:
https://pubs.opengroup.org/onlinepubs/007904875/basedefs/stdio.h.html

Jakub



Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 09:27:37AM +0200, Richard Biener wrote:
> On Tue, 11 Jun 2024, FX Coudert wrote:
> 
> > Hi
> > 
> > I can’t seem to get a review of this one-line patch. Could a global 
> > reviewer help?
> 
> While stdio.h can be relied on to exist I do not think you can assume
> the same for sys/types.h without "configury", but libgccjit.h is an
> installed API.  I would assume including stdlib.h gets you ssize_t as 
> well?

If stdlib.h includes sys/types.h like often on Linux, yes, but not
necessarily.  ssize_t is a POSIX type and it might be solely in sys/types.h.

Perhaps libgccjit.h could use
#ifdef __has_include
#if __has_include (<sys/types.h>)
#include <sys/types.h>
#endif
#endif
instead of just #include <sys/types.h>.
When compiled by gcc, one can use hacks like
#define unsigned signed
typedef __SIZE_TYPE__ gcc_jit_ssize_t;
#undef unsigned
but that might not work with other compilers and is perhaps
just too ugly.
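
A minimal sketch of the __has_include approach, with a fallback for
environments that lack sys/types.h.  The names demo_ssize_t,
DEMO_HAVE_SYS_TYPES_H and demo_error_size are hypothetical and not part of
libgccjit.h, and using ptrdiff_t as the fallback is an assumption made for
this illustration, not something proposed in the thread.

```c
#include <stddef.h>

/* Include <sys/types.h> only when the preprocessor can confirm it exists.  */
#ifdef __has_include
# if __has_include (<sys/types.h>)
#  include <sys/types.h>
#  define DEMO_HAVE_SYS_TYPES_H 1
# endif
#endif

#ifdef DEMO_HAVE_SYS_TYPES_H
typedef ssize_t demo_ssize_t;
#else
/* Assumption: ptrdiff_t is an acceptable signed stand-in.  */
typedef ptrdiff_t demo_ssize_t;
#endif

/* Return -1, which only a signed type can represent as a negative value.  */
demo_ssize_t
demo_error_size (void)
{
  return (demo_ssize_t) -1;
}
```

The nested #ifdef/#if form matters: compilers without __has_include would
otherwise fail to parse the #if line itself.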

>  In fact the C11 standard doesn't even mention ssize_t so the
> API should probably avoid using it and instead use size_t for
> 
> /* Given type "T", get its size.
>This API entrypoint was added in LIBGCCJIT_ABI_20; you can test for its
>presence using
>  #ifdef LIBGCCJIT_HAVE_SIZED_INTEGERS  */
> extern ssize_t
> gcc_jit_type_get_size (gcc_jit_type *type);

Jakub



Re: [PATCH] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 01:36:35PM +0800, liuhongt wrote:
> In theory, const_wide_int can also be handled with an extra check for each
> component of the HOST_WIDE_INT array, and the check is needed for both
> shift and bit_and operands.
> I assume the optimization opportunity is rare, so the patch just adds
> an extra check to make sure GET_MODE_INNER (mode) can fit into a
> HOST_WIDE_INT.

I think if you only handle CONST_INT_P, you should check just for that, and
in both places where you check for CONST_VECTOR_DUPLICATE_P (there is one
spot 2 lines above this).
So add
&& CONST_INT_P (XVECEXP (XEXP (op0, 1), 0, 0))
and
&& CONST_INT_P (XVECEXP (op1, 0, 0))
tests right below those && CONST_VECTOR_DUPLICATE_P (something) tests.
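
For readers unfamiliar with the simplification being guarded, here is a
scalar sketch of the underlying identity.  It is an illustration only, not
code from simplify-rtx.cc; the demo_* names are hypothetical, and it assumes
the mask keeps exactly the low (64 - n) bits.

```c
#include <stdint.h>

/* Portable arithmetic right shift; n must be in [0, 63].  */
int64_t
demo_ashiftrt (int64_t a, unsigned n)
{
  return a < 0 ? ~(~a >> n) : a >> n;
}

/* (AND (ASHIFTRT a n) mask), where mask clears the top n bits.  */
uint64_t
demo_and_of_ashiftrt (int64_t a, unsigned n)
{
  uint64_t mask = UINT64_MAX >> n;
  return (uint64_t) demo_ashiftrt (a, n) & mask;
}

/* (LSHIFTRT a n).  */
uint64_t
demo_lshiftrt (int64_t a, unsigned n)
{
  return (uint64_t) a >> n;
}
```

The two functions agree for every input, because the mask discards exactly
the sign copies that the arithmetic shift introduced.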
> 
> gcc/ChangeLog:
> 
>   PR target/115384
>   * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
>   Only do the simplification of (AND (ASHIFTRT A imm) mask)
>   to (LSHIFTRT A imm) when inner mode fits HOST_WIDE_INT.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/i386/pr115384.c: New test.

Jakub



Re: [PATCH] OpenMP: warn about iteration var modifications in loop body

2024-06-07 Thread Jakub Jelinek
On Wed, Mar 06, 2024 at 06:08:47PM +0100, Frederik Harwath wrote:
> Subject: [PATCH] OpenMP: warn about iteration var modifications in loop body

Note, the partially rewritten OpenMP loop transformations changes are now
in.
See below.

> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -235,6 +235,8 @@ struct gimplify_omp_ctx
>bool order_concurrent;
>bool has_depend;
>bool in_for_exprs;
> +  bool in_omp_for_body;
> +  bool is_doacross;
>int defaultmap[5];
>  };
>  
> @@ -456,6 +458,10 @@ new_omp_context (enum omp_region_type region_type)
>c->privatized_types = new hash_set<tree>;
>c->location = input_location;
>c->region_type = region_type;
> +  c->loop_iter_var.create (0);
> +  c->in_omp_for_body = false;
> +  c->is_doacross = false;

I'm not sure it is a good idea to reuse loop_iter_var for this.

>if ((region_type & ORT_TASK) == 0)
>  c->default_kind = OMP_CLAUSE_DEFAULT_SHARED;
>else
> @@ -6312,6 +6318,18 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
> || TREE_CODE (*expr_p) == INIT_EXPR);
>  
> +  if (gimplify_omp_ctxp && gimplify_omp_ctxp->in_omp_for_body)
> +{
> +  size_t num_vars = gimplify_omp_ctxp->loop_iter_var.length () / 2;
> +  for (size_t i = 0; i < num_vars; i++)
> + {
> +   if (*to_p == gimplify_omp_ctxp->loop_iter_var[2 * i + 1])
> + warning_at (input_location, OPT_Wopenmp,
> + "forbidden modification of iteration variable %qE in "
> + "OpenMP loop", *to_p);

I think the forbidden word doesn't belong there, just modification of ...

Note, your patch seems to handle just one gimplify_omp_ctxp, not all.
If I do:
#pragma omp for
for (int i = 0; i < 32; ++i)
{
  ++i; // This is warned about
  #pragma omp parallel shared (i)
  #pragma omp master
  ++i; // This is not
  #pragma omp parallel private (i)
  ++i; // This should not
  #pragma omp target map(tofrom:i)
  ++i; // This is not
  #pragma omp target firstprivate (i)
  ++i; // This should not
  #pragma omp simd
  for (i = 0; i < 32; ++i) // This is not
;
}
The question is if it isn't just too hard to figure out the data sharing
in nested constructs.  But to be useful, perhaps at least loop
transformation constructs which don't have any privatization on the
iterators (pending the resolution of the data sharing loop transformation
issue) should be handled.

> @@ -15380,23 +15398,22 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
>gcc_assert (DECL_P (decl));
>gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl))
> || POINTER_TYPE_P (TREE_TYPE (decl)));
> -  if (is_doacross)
> +
> +  if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))

There is nothing specific about OMP_FOR for the orig decls, the reason
why the check is (probably) there is that simd construct has extra
restriction:
"The only random access iterator types that are allowed for the associated
loops are pointer types."
and so there is no point in looking at the orig decls for, say, for simd
ordered(2) doacross loops.
I was worried your patch wouldn't handle
void bar (int &);

void
foo ()
{
  int i;
  #pragma omp for
  for (i = 0; i < 32; ++i)
    bar (i);
}
where because the IV is addressable we actually choose to use an artificial
IV and assign i = i.0; at the start of the loop body, but apparently that
works right (though maybe it should go into the testsuite), supposedly we
emit it in gimplify_omp_for in GIMPLE before actually gimplifying the actual
OMP_FOR_BODY (but it is an assignment in there).
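
A hand-written C sketch of what that lowering amounts to conceptually.  This
is not GCC output: i_0 stands in for the artificial IV "i.0", a pointer
parameter replaces the C++ reference, and all demo_* names are hypothetical.

```c
/* Stand-in for "void bar (int &)": reads the variable through its address.  */
void
demo_bar (int *p, int *sum)
{
  *sum += *p;
}

int
demo_loop (void)
{
  int i;
  int sum = 0;
  /* i_0 plays the role of the artificial iteration variable.  */
  for (int i_0 = 0; i_0 < 32; ++i_0)
    {
      i = i_0;   /* the assignment placed at the start of the loop body */
      demo_bar (&i, &sum);
    }
  return sum;
}
```

The "i = i_0" statement is itself an assignment to the user's iteration
variable inside the body, which is why a check on assignments has to be
careful not to flag it.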

Anyway, what the patch certainly doesn't handle is the loop transformations.
The tile/unroll partial as done right now have the inter-tile emitted into
the OMP_FOR body, so both the initial assignment and the increment in there
would trigger the warning.  I guess similarly for reverse construct when
implemented.  Furthermore, the generated loops together with associated
ORIG_DECLs move to whatever outer construct loop needs them.

So, I think instead of doing it during gimplification of actual statements,
we should do it through a walk_tree on the bodies, done perhaps from inside
of omp_maybe_apply_loop_xforms or better right before that and mark through some
new flag loops whose bodies were walked for the diagnostics so that we don't
do that again.  Just have one hash map based on say DECL_UID into which we
mark all the loop iterators which should be warned about,
*walk_subtrees = 0; for OpenMP constructs which could privatize stuff
because it would be too difficult to handle but walk using a separate
walk_tree the loop transformation constructs and normally walk say
OMP_CRITICAL, OMP_MASKED and other constructs which never privatize stuff.
So, handle say
#pragma omp for
#pragma omp tile sizes (2, 2)
for (int i = 0; i < 32; ++i)
for (int j = 0; j < 32; ++j)
{
  ++i; // warn here; this is in the end generated loop of for, 

[PATCH] bitint: Fix up lower_addsub_overflow [PR115352]

2024-06-07 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled because of a flawed optimization.
If one changes the 65 in the testcase to e.g. 66, one gets:
...
  _25 = .USUBC (0, _24, _14);
  _12 = IMAGPART_EXPR <_25>;
  _26 = REALPART_EXPR <_25>;
  if (_23 >= 1)
goto ; [80.00%]
  else
goto ; [20.00%]

   :
  if (_23 != 1)
goto ; [80.00%]
  else
goto ; [20.00%]

   :
  _27 = (signed long) _26;
  _28 = _27 >> 1;
  _29 = (unsigned long) _28;
  _31 = _29 + 1;
  _30 = _31 > 1;
  goto ; [100.00%]

   :
  _32 = _26 != _18;
  _33 = _22 | _32;

   :
  # _17 = PHI <_30(9), _22(7), _33(10)>
  # _19 = PHI <_29(9), _18(7), _18(10)>
...
so there is one path for limbs below the boundary (in this case there are
actually no limbs there, maybe we could consider optimizing that further,
say with simply folding that _23 >= 1 condition to 1 == 1 and letting
cfg cleanup handle it), another case where it is exactly the limb on the
boundary (that is the bb 9 handling, where it extracts the interesting
bits (the first 3 statements) and then checks if it is zero or all ones), and
finally the case of limbs above that, where it compares the current result
limb against the previously recorded 0 or all ones and ors differences into
the accumulated result.

Now, the optimization which the first hunk removes was based on the idea
that for that case the extraction of the interesting bits from the limb
doesn't need anything special, so the _27/_28/_29 statements above aren't
needed, the whole limb is interesting bits, so it handled the >= 1
case like the bb 9 above without the first 3 statements and bb 10 wasn't
there at all.  There are 2 problems with that: for the higher limbs it
only checks if the result limb bits are all zeros or all ones, but
doesn't check if they are the same as the other extension bits, and
it forgets the previous flag whether there was an overflow.
First I wanted to fix it just by adding the _33 = _22 | _30; statement
to the end of bb 9 above, which fixed the originally filed huge testcase
and the first 2 foo calls in the testcase included in the patch, it no
longer forgets about previously checked differences from 0/1.
But as the last 2 foo calls show, it still didn't check whether each
even (or each odd depending on the exact position) result limb is
equal to the first one, so every second limb it could choose some other
0 vs. all ones value and as long as it repeated in another limb above it
it would be ok.

So, the optimization just can't work properly and the following patch
removes it.
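
The three-way scheme described above can be modelled in plain C roughly as
follows.  This is a sketch under stated assumptions, not the lowering code:
limbs are 64 bits, least significant first, and every demo_* name is
hypothetical.  The second function is a deliberately simplified model of
the two problems listed above and only makes sense when (prec - 1) % 64 == 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Shift LIMB right by P (0..63), filling vacated bits with its sign bit.  */
uint64_t
demo_sign_fill_shift (uint64_t limb, unsigned p)
{
  uint64_t fill = (limb >> 63) ? ~(UINT64_MAX >> p) : 0;
  return (limb >> p) | fill;
}

/* LIMBS holds a two's complement value.  Return true if it does not fit
   in a signed type of PREC bits.  */
bool
demo_overflows (const uint64_t *limbs, size_t nlimbs, unsigned prec)
{
  size_t boundary = (prec - 1) / 64;
  unsigned p = (prec - 1) % 64;
  /* Boundary limb: the interesting bits must be all zeros or all ones.  */
  uint64_t ext = demo_sign_fill_shift (limbs[boundary], p);
  bool ovf = ext + 1 > 1;
  /* Higher limbs: compare with the recorded value and keep the flag.  */
  for (size_t i = boundary + 1; i < nlimbs; i++)
    ovf |= limbs[i] != ext;
  return ovf;
}

/* Simplified model of the flaw: each limb is only checked for being 0 or
   all ones, and the flag is overwritten instead of accumulated.  */
bool
demo_overflows_flawed (const uint64_t *limbs, size_t nlimbs, unsigned prec)
{
  bool ovf = false;
  for (size_t i = (prec - 1) / 64; i < nlimbs; i++)
    ovf = limbs[i] + 1 > 1;
  return ovf;
}
```

With prec 65 and limbs { 0, 0, all ones, all ones } the first function
reports an overflow while the flawed model does not, because each upper limb
looks like a valid extension when examined in isolation.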

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/14.2?

2024-06-07  Jakub Jelinek  

PR middle-end/115352
* gimple-lower-bitint.cc (lower_addsub_overflow): Don't disable
single_comparison if cmp_code is GE_EXPR.

* gcc.dg/torture/bitint-71.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-04-12 10:59:48.233153262 +0200
+++ gcc/gimple-lower-bitint.cc  2024-06-06 12:06:57.065717651 +0200
@@ -4286,11 +4286,7 @@ bitint_large_huge::lower_addsub_overflow
  bool single_comparison
= (startlimb + 2 >= fin || (startlimb & 1) != (i & 1));
  if (!single_comparison)
-   {
- cmp_code = GE_EXPR;
- if (!check_zero && (start % limb_prec) == 0)
-   single_comparison = true;
-   }
+   cmp_code = GE_EXPR;
  else if ((startlimb & 1) == (i & 1))
cmp_code = EQ_EXPR;
  else
--- gcc/testsuite/gcc.dg/torture/bitint-71.c.jj  2024-06-06 12:20:55.824913276 +0200
+++ gcc/testsuite/gcc.dg/torture/bitint-71.c  2024-06-06 12:20:45.260044338 +0200
@@ -0,0 +1,28 @@
+/* PR middle-end/115352 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c23" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 385
+int
+foo (_BitInt (385) b)
+{
+  return __builtin_sub_overflow_p (0, b, (_BitInt (65)) 0);
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 385
+  if (!foo (-(_BitInt (385)) 0x0c377e8a3fd1881fff035bb487a51c9ed1f7350befa7ec445a3cf8d1ebb723981wb))
+    __builtin_abort ();
+  if (!foo (-0x1c377e8a3fd1881fff035bb487a51c9ed1f7350befa7ec445a3cf8d1ebb723981uwb))
+    __builtin_abort ();
+  if (!foo (-(_BitInt (385)) 0x0a3cf8d1ebb723981wb))
+    __builtin_abort ();
+  if (!foo (-0x1a3cf8d1ebb723981uwb))
+    __builtin_abort ();
+#endif
+}

Jakub



[committed] libgomp: Mark Loop transformation constructs as implemented in the implementation status

2024-06-06 Thread Jakub Jelinek
Hi!

The implementation has been committed in r15-1037.

2024-06-06  Jakub Jelinek  

* libgomp.texi (OpenMP 5.1 status): Mark Loop transformation constructs
as implemented.

--- libgomp/libgomp.texi
+++ libgomp/libgomp.texi
@@ -302,7 +302,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{error} directive @tab Y @tab
 @item @code{masked} construct @tab Y @tab
 @item @code{scope} directive @tab Y @tab
-@item Loop transformation constructs @tab N @tab
+@item Loop transformation constructs @tab Y @tab
 @item @code{strict} modifier in the @code{grainsize} and @code{num_tasks}
   clauses of the @code{taskloop} construct @tab Y @tab
 @item @code{align} clause in @code{allocate} directive @tab P

Jakub



Re: [PATCH] c++: Handle erroneous DECL_LOCAL_DECL_ALIAS in duplicate_decls [PR107575]

2024-06-05 Thread Jakub Jelinek
On Wed, Jun 05, 2024 at 08:13:14AM +, Simon Martin wrote:
> --- a/gcc/cp/decl.cc
> +++ b/gcc/cp/decl.cc
> @@ -2792,10 +2792,13 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
> retrofit_lang_decl (newdecl);
> tree alias = DECL_LOCAL_DECL_ALIAS (newdecl)
>   = DECL_LOCAL_DECL_ALIAS (olddecl);
> -   DECL_ATTRIBUTES (alias)
> - = (*targetm.merge_decl_attributes) (alias, newdecl);
> -   if (TREE_CODE (newdecl) == FUNCTION_DECL)
> - merge_attribute_bits (newdecl, alias);
> +   if (alias != error_mark_node)
> + {
> +   DECL_ATTRIBUTES (alias) =
> + (*targetm.merge_decl_attributes) (alias, newdecl);

Formatting nit, = should be on the next line, not at the end of a line.
See https://gcc.gnu.org/codingconventions.html

Jakub



[PATCH] fold-const: Handle CTZ like CLZ in tree_call_nonnegative_warnv_p [PR115337]

2024-06-04 Thread Jakub Jelinek
Hi!

I think we can handle CTZ exactly like CLZ in tree_call_nonnegative_warnv_p.
Like CLZ, if it is UB at zero, the result range is [0, prec-1] and if it is
well defined at zero, the second argument provides the value at zero.
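
As a plain C model of those semantics (an illustration, not GCC code;
demo_ctz is a hypothetical name): for a non-zero input the result is a bit
position and hence non-negative, so the sign of the result depends only on
the value supplied for zero.

```c
/* Model of two-argument .CTZ: AT_ZERO is returned when X is zero.  */
int
demo_ctz (unsigned x, int at_zero)
{
  if (x == 0)
    return at_zero;
  int n = 0;
  /* Count how many low-order zero bits precede the lowest set bit.  */
  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}
```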

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-06-04  Jakub Jelinek  

PR tree-optimization/115337
* fold-const.cc (tree_call_nonnegative_warnv_p): Handle
CASE_CFN_CTZ like CASE_CFN_CLZ.

--- gcc/fold-const.cc.jj  2024-06-04 12:08:14.671262211 +0200
+++ gcc/fold-const.cc   2024-06-04 10:56:57.575425348 +0200
@@ -15250,6 +15250,7 @@ tree_call_nonnegative_warnv_p (tree type
   return true;
 
 CASE_CFN_CLZ:
+CASE_CFN_CTZ:
   if (arg1)
return RECURSE (arg1);
   return true;

Jakub



[PATCH] ranger: Improve CLZ fold_range [PR115337]

2024-06-04 Thread Jakub Jelinek
Hi!

cfn_ctz::fold_range includes special cases for the case where .CTZ has
two arguments and so is well defined at zero, and the second argument is
equal to prec or -1, but cfn_clz::fold_range does that only for the prec
case.  -1 is fairly common as well though, because the <stdbit.h> builtins
do use it now, so I think it is worth special casing that.
If we don't know anything about the argument, the difference for
.CLZ (arg, -1) is that previously the result was varying, now it will be
[-1, prec-1].  If we knew arg can't be zero, it used to be optimized before
as well into e.g. [0, prec-1] or similar.
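
The ranges described here can be checked by brute force on a small
precision.  The sketch below models two-argument .CLZ on an 8-bit operand
and the range chosen when nothing is known about the argument; it is an
illustration with hypothetical demo_* names, not the ranger implementation.

```c
#include <stdbool.h>

/* Model of two-argument .CLZ on an 8-bit operand.  */
int
demo_clz8 (unsigned char x, int at_zero)
{
  if (x == 0)
    return at_zero;
  int n = 0;
  for (unsigned bit = 0x80; (x & bit) == 0; bit >>= 1)
    n++;
  return n;
}

/* Range for .CLZ (arg, at_zero) when nothing is known about arg.
   Returns false for "varying" (neither special value applies).  */
bool
demo_clz_range (int prec, int at_zero, int *lo, int *hi)
{
  *lo = 0;
  *hi = prec - 1;
  if (at_zero == -1)
    *lo = -1;
  else if (at_zero == prec)
    *hi = prec;
  else
    return false;
  return true;
}

/* Check every 8-bit input against the modelled range.  */
bool
demo_clz8_range_holds (int at_zero)
{
  int lo, hi;
  if (!demo_clz_range (8, at_zero, &lo, &hi))
    return false;
  for (unsigned x = 0; x < 256; x++)
    {
      int r = demo_clz8 ((unsigned char) x, at_zero);
      if (r < lo || r > hi)
        return false;
    }
  return true;
}
```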

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-06-04  Jakub Jelinek  

PR tree-optimization/115337
* gimple-range-op.cc (cfn_clz::fold_range): For
m_gimple_call_internal_p handle as a special case also second argument
of -1 next to prec.

--- gcc/gimple-range-op.cc.jj   2024-05-21 10:19:34.736524824 +0200
+++ gcc/gimple-range-op.cc  2024-06-04 11:53:35.190005093 +0200
@@ -941,8 +941,10 @@ cfn_clz::fold_range (irange &r, tree typ
   int maxi = prec - 1;
   if (m_gimple_call_internal_p)
 {
-  // Only handle the single common value.
-  if (rh.lower_bound () == prec)
+  // Handle only the two common values.
+  if (rh.lower_bound () == -1)
+   mini = -1;
+  else if (rh.lower_bound () == prec)
maxi = prec;
   else
// Magic value to give up, unless we can prove arg is non-zero.
@@ -953,7 +955,7 @@ cfn_clz::fold_range (irange &r, tree typ
   if (wi::gt_p (lh.lower_bound (), 0, TYPE_SIGN (lh.type ())))
 {
   maxi = prec - 1 - wi::floor_log2 (lh.lower_bound ());
-  if (mini == -2)
+  if (mini < 0)
mini = 0;
 }
   else if (!range_includes_zero_p (lh))
@@ -969,11 +971,11 @@ cfn_clz::fold_range (irange &r, tree typ
   if (max == 0)
 {
   // If CLZ_DEFINED_VALUE_AT_ZERO is 2 with VALUE of prec,
-  // return [prec, prec], otherwise ignore the range.
-  if (maxi == prec)
-   mini = prec;
+  // return [prec, prec] or [-1, -1], otherwise ignore the range.
+  if (maxi == prec || mini == -1)
+   mini = maxi;
 }
-  else
+  else if (mini >= 0)
 mini = newmini;
 
   if (mini == -2)

Jakub



[PATCH] fold-const, gimple-fold: Some formatting cleanups

2024-06-04 Thread Jakub Jelinek
Hi!

While looking into PR115337, I've spotted some badly formatted code,
which the following patch fixes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-06-04  Jakub Jelinek  

* fold-const.cc (tree_call_nonnegative_warnv_p): Formatting fixes.
(tree_invalid_nonnegative_warnv_p): Likewise.
* gimple-fold.cc (gimple_call_nonnegative_warnv_p): Likewise.

--- gcc/fold-const.cc.jj  2024-04-04 10:47:46.363287718 +0200
+++ gcc/fold-const.cc   2024-06-04 10:56:57.575425348 +0200
@@ -15331,8 +15331,8 @@ tree_call_nonnegative_warnv_p (tree type
 non-negative if both operands are non-negative.  In the presence
 of qNaNs, we're non-negative if either operand is non-negative
 and can't be a qNaN, or if both operands are non-negative.  */
-  if (tree_expr_maybe_signaling_nan_p (arg0) ||
- tree_expr_maybe_signaling_nan_p (arg1))
+  if (tree_expr_maybe_signaling_nan_p (arg0)
+ || tree_expr_maybe_signaling_nan_p (arg1))
 return RECURSE (arg0) && RECURSE (arg1);
   return RECURSE (arg0) ? (!tree_expr_maybe_nan_p (arg0)
   || RECURSE (arg1))
@@ -15431,8 +15431,8 @@ tree_invalid_nonnegative_warnv_p (tree t
 
 case CALL_EXPR:
   {
-   tree arg0 = call_expr_nargs (t) > 0 ?  CALL_EXPR_ARG (t, 0) : NULL_TREE;
-   tree arg1 = call_expr_nargs (t) > 1 ?  CALL_EXPR_ARG (t, 1) : NULL_TREE;
+   tree arg0 = call_expr_nargs (t) > 0 ? CALL_EXPR_ARG (t, 0) : NULL_TREE;
+   tree arg1 = call_expr_nargs (t) > 1 ? CALL_EXPR_ARG (t, 1) : NULL_TREE;
 
return tree_call_nonnegative_warnv_p (TREE_TYPE (t),
  get_call_combined_fn (t),
--- gcc/gimple-fold.cc.jj   2024-02-28 09:40:09.473563056 +0100
+++ gcc/gimple-fold.cc  2024-06-04 10:38:37.515145399 +0200
@@ -9334,10 +9334,10 @@ static bool
 gimple_call_nonnegative_warnv_p (gimple *stmt, bool *strict_overflow_p,
 int depth)
 {
-  tree arg0 = gimple_call_num_args (stmt) > 0 ?
-gimple_call_arg (stmt, 0) : NULL_TREE;
-  tree arg1 = gimple_call_num_args (stmt) > 1 ?
-gimple_call_arg (stmt, 1) : NULL_TREE;
+  tree arg0
+= gimple_call_num_args (stmt) > 0 ? gimple_call_arg (stmt, 0) : NULL_TREE;
+  tree arg1
+= gimple_call_num_args (stmt) > 1 ? gimple_call_arg (stmt, 1) : NULL_TREE;
   tree lhs = gimple_call_lhs (stmt);
   return (lhs
  && tree_call_nonnegative_warnv_p (TREE_TYPE (lhs),

Jakub



[PATCH] fold-const: Fix up CLZ handling in tree_call_nonnegative_warnv_p [PR115337]

2024-06-04 Thread Jakub Jelinek
Hi!

The function currently incorrectly assumes all the __builtin_clz* and .CLZ
calls have non-negative result.  That is the case of the former which is UB
on zero and has [0, prec-1] return value otherwise, and is the case of the
single argument .CLZ as well (again, UB on zero), but for two argument
.CLZ is the case only if the second argument is also nonnegative (or if we
know the argument can't be zero, but let's do that just in the ranger IMHO).

The following patch does that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and 14?
For 13 and earlier, we can't use the testcase and the fold-const.cc change
would need to differentiate between __builtin_clz* vs. .CLZ and in the
latter case look at CLZ_DEFINED_VALUE_AT_ZERO.

2024-06-04  Jakub Jelinek  

PR tree-optimization/115337
* fold-const.cc (tree_call_nonnegative_warnv_p) <CASE_CFN_CLZ>:
If arg1 is non-NULL, RECURSE on it, otherwise return true.

* gcc.dg/bitint-106.c: New test.

--- gcc/fold-const.cc.jj  2024-04-04 10:47:46.363287718 +0200
+++ gcc/fold-const.cc   2024-06-04 10:56:57.575425348 +0200
@@ -15241,7 +15241,6 @@ tree_call_nonnegative_warnv_p (tree type
 CASE_CFN_FFS:
 CASE_CFN_PARITY:
 CASE_CFN_POPCOUNT:
-CASE_CFN_CLZ:
 CASE_CFN_CLRSB:
 case CFN_BUILT_IN_BSWAP16:
 case CFN_BUILT_IN_BSWAP32:
@@ -15250,6 +15249,11 @@ tree_call_nonnegative_warnv_p (tree type
   /* Always true.  */
   return true;
 
+CASE_CFN_CLZ:
+  if (arg1)
+   return RECURSE (arg1);
+  return true;
+
 CASE_CFN_SQRT:
 CASE_CFN_SQRT_FN:
   /* sqrt(-0.0) is -0.0.  */
--- gcc/testsuite/gcc.dg/bitint-106.c.jj  2024-06-04 12:00:59.017079094 +0200
+++ gcc/testsuite/gcc.dg/bitint-106.c   2024-06-04 12:00:41.975306632 +0200
@@ -0,0 +1,29 @@
+/* PR tree-optimization/115337 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-O2" } */
+
+#if __BITINT_MAXWIDTH__ >= 129
+#define N 128
+#else
+#define N 63
+#endif
+
+_BitInt (N) g;
+int c;
+
+void
+foo (unsigned _BitInt (N + 1) z, _BitInt (N) *ret)
+{
+  c = __builtin_stdc_first_leading_one (z << N);
+  _BitInt (N) y = *(_BitInt (N) *) __builtin_memset (&g, c, 5);
+  *ret = y;
+}
+
+int
+main ()
+{
+  _BitInt (N) x;
+  foo (0, &x);
+  if (c || g || x)
+__builtin_abort ();
+}

Jakub



Re: [PATCH] Implement -fassume-sane-operator-new [PR110137]

2024-06-04 Thread Jakub Jelinek
On Wed, May 29, 2024 at 04:09:08AM +, user202...@protonmail.com wrote:
> This patch implements the flag -fassume-sane-operator-new as suggested in 
> PR110137. When the flag is enabled, it is assumed that operator new does not 
> modify global memory.
> 
> While this patch is not powerful enough to handle the original issue in 
> PR110035, it allows the optimizer to handle some simpler case (e.g. load from 
> global memory with fixed address), as demonstrated in the test 
> sane-operator-new-1.C.
> 
> To handle the original issue in PR110035, some other improvement to the 
> optimizer is needed, which will be sent as subsequent patches.
> 
> Bootstrapped and regression tested on x86_64-pc-linux-gnu.

> From 14a8604907c89838577ff8560df9a3f9dc2d8afb Mon Sep 17 00:00:00 2001
> From: user202729 
> Date: Fri, 24 May 2024 17:40:55 +0800
> Subject: [PATCH] Implement -fassume-sane-operator-new [PR110137]
> 
>   PR c++/110137
> 
> gcc/c-family/ChangeLog:
> 
>   * c.opt: New option.

You need c.opt (fassume-sane-operator-new): New option.

> gcc/ChangeLog:
> 
>   * ira.cc (is_call_operator_new_p): New function.
>   (may_modify_memory_p): Likewise.
>   (validate_equiv_mem): Modify to use may_modify_memory_p.

The patch doesn't update doc/invoke.texi with the description of
what the option does, that is essential.

> +fassume-sane-operator-new
> +C++ Optimization Var(flag_assume_sane_operator_new)
> +Assume operator new does not have any side effect other than the allocation.

Is it just about operator new and not about operator delete as well in
clang?
Is it about all operator new or just the replaceable ones (standard ones in
global scope, those also have DECL_IS_REPLACEABLE_OPERATOR flag on them)?
Depending on this, if the flag is about only replaceable ones, I think it is
a global property, so for LTO it should be merged as if there is a single TU
which uses this flag, it is set for the whole LTO compilation (or should it
be only for TUs with that flag which actually use such operator new calls?).
If it is all operators new, then it is a local property in each function (or
even better a property of the operators actually) and we should track
somewhere in cfun whether a function compiled with that flag calls operator
new and whether a function compiled without that flag calls operator new.
Then e.g. during inlining merge it, such that if both the functions invoke
operator new and they disagree on whether it is sane or not, the non-sane
case wins.
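
The merge rule for the per-function case could look roughly like the sketch
below.  Everything in it is hypothetical: GCC has no such enum or function,
and the three-state encoding is only one way to represent "calls operator
new with the flag", "calls it without the flag" and "does not call it".

```c
/* Hypothetical per-function state; not an existing GCC data structure.  */
enum demo_new_state
{
  DEMO_NO_NEW_CALLS,   /* function does not call operator new */
  DEMO_SANE,           /* calls it, compiled with the flag */
  DEMO_NOT_SANE        /* calls it, compiled without the flag */
};

/* State of the caller after inlining CALLEE into it: on disagreement the
   non-sane case wins.  */
enum demo_new_state
demo_merge_on_inline (enum demo_new_state caller, enum demo_new_state callee)
{
  if (caller == DEMO_NOT_SANE || callee == DEMO_NOT_SANE)
    return DEMO_NOT_SANE;
  if (caller == DEMO_SANE || callee == DEMO_SANE)
    return DEMO_SANE;
  return DEMO_NO_NEW_CALLS;
}
```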

> --- a/gcc/ira.cc
> +++ b/gcc/ira.cc

This surely is much more important to handle in the alias oracle, not just
IRA.

> @@ -3080,6 +3080,27 @@ validate_equiv_mem_from_store (rtx dest, const_rtx set ATTRIBUTE_UNUSED,
>  
>  static bool equiv_init_varies_p (rtx x);
>  
> +static bool is_call_operator_new_p (rtx_insn *insn)

Formatting, static bool on one line, is_call_... on another one.
And needs a function comment.

> +{
> +  if (!CALL_P (insn))
> +return false;
> +  tree fn = get_call_fndecl (insn);
> +  if (fn == NULL_TREE)
> +return false;
> +  return DECL_IS_OPERATOR_NEW_P (fn);
> +}
> +
> +/* Returns true if there is a possibility that INSN may modify memory.
> +   If false is returned, the compiler proved INSN never modify memory.  */
> +static bool may_modify_memory_p (rtx_insn *insn)

Again, missing newline instead of space after bool.
Not sure about the name of this function, even sane replaceable operator new
may modify memory (it actually has to), just shouldn't modify memory
the compiler cares about.

> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/sane-operator-new-1.C
> @@ -0,0 +1,12 @@
> +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
> +/* { dg-options "-O2 -fassume-sane-operator-new" } */

If the tests are x86 specific, they should go to g++.target/i386/ directory.
But as I said earlier, it would be better to handle optimizations like that
on GIMPLE too and then you can test that say on optimized dump on all
targets.

Jakub



Re: RFC: Support for pragma clang loop interleave_count(N)

2024-06-04 Thread Jakub Jelinek
On Tue, Jun 04, 2024 at 11:58:43AM +0100, Andre Vieira (lists) wrote:
>   case annot_expr_unroll_kind:
> + case annot_expr_interleaves_kind:
> {
> - pp_string (pp, ", unroll ");
> + pp_string (pp,
> +annot_expr_unroll_kind

I think annot_expr_unroll_kind is 1 and thus always non-zero.
You want to compare the value of the operand, or just use separate
cases, they aren't that large.

> +? ", unroll "
> +: ", interleaves ");
>   pp_decimal_int (pp,
>   (int) TREE_INT_CST_LOW (TREE_OPERAND (node, 2)));
>   break;
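
The problem with testing the enumerator itself can be reproduced outside of
GCC.  In this sketch the enum and function names are hypothetical stand-ins
(only the "unroll kind has value 1" assumption comes from the comment
above); the buggy variant always picks the first string.

```c
/* Hypothetical stand-in for the annotation kinds; unroll has value 1.  */
enum demo_annot_kind
{
  demo_annot_ivdep_kind,
  demo_annot_unroll_kind,
  demo_annot_interleaves_kind
};

/* Buggy: tests the enumerator constant, which is always non-zero.  */
const char *
demo_label_buggy (enum demo_annot_kind kind)
{
  (void) kind;
  return demo_annot_unroll_kind ? ", unroll " : ", interleaves ";
}

/* Fixed: compares the actual operand value against the enumerator.  */
const char *
demo_label_fixed (enum demo_annot_kind kind)
{
  return kind == demo_annot_unroll_kind ? ", unroll " : ", interleaves ";
}
```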

Jakub



[PATCH] rs6000: Decrease code size of rs6000_init_generated_builtins [PR115324]

2024-06-04 Thread Jakub Jelinek
On Mon, Jun 03, 2024 at 03:40:38PM -0500, Segher Boessenkool wrote:
> > So, either we'd need to add some further GTY extensions, or the following
> > patch instead reworks it such that the fntype members which were the only
> > reason for PCH in those arrays are moved to separate arrays.
> 
> And that just sidesteps the limitation in PCH?

Yes.  But at the same time decreases the sizes of the data sections and
decreases size of the data written to/from PCH files, so I think it is a
win.

> >  void
> >  rs6000_init_generated_builtins ()
> >  {
> > +  bifdata *rs6000_builtin_info_p;
> > +  tree *rs6000_builtin_info_fntype_p;
> > +  ovlddata *rs6000_instance_info_p;
> > +  tree *rs6000_instance_info_fntype_p;
> > +  ovldrecord *rs6000_overload_info_p;
> > +  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
> 
> Bah.
> 
> It should not be called _p of course, it is not a predicate.  And
> relying on the operand tie to not have to do a much more obvious
> assignment, please don't.  Just *do* write assignments, and then use
> a simple "+r"?
> 
> But you call this a hack anyway, you wouldn't propose to actually
> include this patch :-)

It was a quick hack just to see why the size grew that much.
Ideally some optimization would figure out we have a single function which
has
    461   rs6000_overload_info
   1257   rs6000_builtin_info_fntype
   1768   rs6000_builtin_decls
   2548   rs6000_instance_info_fntype
array references and that maybe it might be a good idea to just preload
the addresses of those arrays into some register if it decreases code size
and doesn't slow things down.
The function actually is called just once and is huge, so code size is even
more important than speed, which is dominated by all the GC allocations
anyway.

Until that is done, here is a slightly cleaner version of the hack, which
makes the function noipa (so that LTO doesn't undo it) for GCC 8.1+ and
passes the 4 arrays as arguments to the function from the caller.
This decreases the function size from 228668 bytes to 207572 bytes.
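
In miniature, the shape of that change looks like the sketch below.  The
demo_* names and array contents are hypothetical; only the idea of passing
the global arrays as pointer arguments to a noipa function, with the
attribute guarded on the GCC version, is taken from the description above.
Excluding clang from the guard is an assumption of this sketch.

```c
/* Use noipa only where it is known to exist (GCC 8.1 and later).  */
#if defined (__GNUC__) && !defined (__clang__) \
    && (__GNUC__ > 8 || (__GNUC__ == 8 && __GNUC_MINOR__ >= 1))
# define DEMO_NOIPA __attribute__ ((__noipa__))
#else
# define DEMO_NOIPA
#endif

#define DEMO_MAX 4
int demo_builtin_decls[DEMO_MAX];
int demo_builtin_fntype[DEMO_MAX];

/* All table accesses go through the pointer arguments, so the compiler
   does not need to materialize each array address at every use.  */
DEMO_NOIPA void
demo_init_generated (int *decls, int *fntype)
{
  for (int i = 0; i < DEMO_MAX; i++)
    {
      fntype[i] = 100 + i;
      decls[i] = fntype[i] * 2;
    }
}

/* The caller passes the global arrays explicitly.  */
void
demo_init (void)
{
  demo_init_generated (demo_builtin_decls, demo_builtin_fntype);
}
```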

Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

2024-06-04  Jakub Jelinek  

PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Change
declaration of rs6000_init_generated_builtins from no arguments
to 4 pointer arguments.
(write_init_bif_table): Change rs6000_builtin_info_fntype to
builtin_info_fntype and rs6000_builtin_decls to builtin_decls.
(write_init_ovld_table): Change rs6000_instance_info_fntype to
instance_info_fntype, rs6000_builtin_decls to builtin_decls and
rs6000_overload_info to overload_info.
(write_init_file): Add __noipa__ attribute to
rs6000_init_generated_builtins for GCC 8.1+ and change the function
from no arguments to 4 pointer arguments.  Change rs6000_builtin_decls
to builtin_decls.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Adjust
rs6000_init_generated_builtins caller.

--- gcc/config/rs6000/rs6000-gen-builtins.cc.jj  2024-06-03 23:11:02.662631144 +0200
+++ gcc/config/rs6000/rs6000-gen-builtins.cc  2024-06-03 23:38:31.727620920 +0200
@@ -2376,7 +2376,10 @@ write_decls (void)
   "rs6000_instance_info_fntype[RS6000_INST_MAX];\n");
   fprintf (header_file, "extern ovldrecord rs6000_overload_info[];\n\n");
 
-  fprintf (header_file, "extern void rs6000_init_generated_builtins ();\n\n");
+  fprintf (header_file,
+  "extern void rs6000_init_generated_builtins (tree *, tree *,\n");
+  fprintf (header_file,
+  "\t\t\t\t\tovldrecord *, tree *);\n\n");
   fprintf (header_file,
   "extern bool rs6000_builtin_is_supported (rs6000_gen_builtins);\n");
   fprintf (header_file,
@@ -2651,7 +2654,7 @@ write_init_bif_table (void)
   for (int i = 0; i <= curr_bif; i++)
 {
   fprintf (init_file,
-  "  rs6000_builtin_info_fntype[RS6000_BIF_%s]"
+  "  builtin_info_fntype[RS6000_BIF_%s]"
   "\n= %s;\n",
   bifs[i].idname, bifs[i].fndecl);
 
@@ -2678,7 +2681,7 @@ write_init_bif_table (void)
}
 
   fprintf (init_file,
-  "  rs6000_builtin_decls[(int)RS6000_BIF_%s] = t\n",
+  "  builtin_decls[(int)RS6000_BIF_%s] = t\n",
   bifs[i].idname);
   fprintf (init_file,
   "= add_builtin_function (\"%s\",\n",
@@ -2719,7 +2722,7 @@ write_init_bif_table (void)
  fprintf (init_file, "}\n");
  fprintf (init_file, "  else\n");
  fprintf (init_file, "{\n");
- fprintf (init_file, "  rs6000_b

[PATCH] c: Fix up pointer types to may_alias structures [PR114493]

2024-06-04 Thread Jakub Jelinek
Hi!

The following testcase ICEs in ipa-free-lang, because the
fld_incomplete_type_of
  gcc_assert (TYPE_CANONICAL (t2) != t2
  && TYPE_CANONICAL (t2) == TYPE_CANONICAL (TREE_TYPE (t)));
assertion doesn't hold.
This is because t is a struct S * type which was created while struct S
was still incomplete and without the may_alias attribute (and TYPE_CANONICAL
of a pointer type is a type created with can_alias_all = false argument),
while later on the may_alias attribute was used on the struct definition.
fld_incomplete_type_of then creates an incomplete distinct copy of the
structure (but with the original attributes) but pointers created for it
are because of the "may_alias" attribute TYPE_REF_CAN_ALIAS_ALL, including
their TYPE_CANONICAL, because while that is created with !can_alias_all
argument, we later set it because of the "may_alias" attribute on the
to_type.

This doesn't ICE with C++ since PR70512 fix because the C++ FE sets
TYPE_REF_CAN_ALIAS_ALL on all pointer types to the class type (and its
variants) when the may_alias is added.

The following patch does that in the C FE as well.
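
For illustration only, a minimal sketch of the declaration order that matters
here.  It assumes GCC's (or Clang's) __may_alias__ attribute and an int that
has the same size as float; the names keep, read_first_int and
read_via_memcpy are made up for this example and are not part of the patch.

```c
#include <string.h>

struct S;
/* This prototype creates the 'struct S *' type while S is still
   incomplete and has no may_alias attribute yet.  */
struct S *keep (struct S *);

/* The attribute only appears later, on the definition.  */
struct __attribute__((__may_alias__)) S {
  int s;
};

_Static_assert (sizeof (int) == sizeof (float), "sketch assumes equal sizes");

/* Because S is may_alias, accesses through 'struct S *' may alias
   objects of any other type, so reading a float's bytes is permitted.  */
int
read_first_int (float *f)
{
  return ((struct S *) f)->s;
}

/* Reference implementation which does not rely on the attribute.  */
int
read_via_memcpy (const float *f)
{
  int i;
  memcpy (&i, f, sizeof i);
  return i;
}
```

Both readers should agree for any float value; the point is only that the
pointer type used in read_first_int has to carry the alias-all property even
though it was first created before the attribute was seen.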

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
release branches?

2024-06-04  Jakub Jelinek  

PR c/114493
* c-decl.cc (c_fixup_may_alias): New function.
(finish_struct): Call it if "may_alias" attribute is
specified.

* gcc.dg/pr114493-1.c: New test.
* gcc.dg/pr114493-2.c: New test.

--- gcc/c/c-decl.cc.jj  2024-05-07 08:47:35.974836903 +0200
+++ gcc/c/c-decl.cc 2024-06-03 19:55:53.819586291 +0200
@@ -9446,6 +9446,17 @@ verify_counted_by_attribute (tree struct
   return;
 }
 
+/* TYPE is a struct or union that we're applying may_alias to after the body is
+   parsed.  Fixup any POINTER_TO types.  */
+
+static void
+c_fixup_may_alias (tree type)
+{
+  for (tree t = TYPE_POINTER_TO (type); t; t = TYPE_NEXT_PTR_TO (t))
+for (tree v = TYPE_MAIN_VARIANT (t); v; v = TYPE_NEXT_VARIANT (v))
+  TYPE_REF_CAN_ALIAS_ALL (v) = true;
+}
+
 /* Fill in the fields of a RECORD_TYPE or UNION_TYPE node, T.
LOC is the location of the RECORD_TYPE or UNION_TYPE's definition.
FIELDLIST is a chain of FIELD_DECL nodes for the fields.
@@ -9791,6 +9802,10 @@ finish_struct (location_t loc, tree t, t
 
   C_TYPE_BEING_DEFINED (t) = 0;
 
+  if (lookup_attribute ("may_alias", TYPE_ATTRIBUTES (t)))
+for (x = TYPE_MAIN_VARIANT (t); x; x = TYPE_NEXT_VARIANT (x))
+  c_fixup_may_alias (x);
+
   /* Set type canonical based on equivalence class.  */
   if (flag_isoc23 && !C_TYPE_VARIABLE_SIZE (t))
 {
--- gcc/testsuite/gcc.dg/pr114493-1.c.jj  2024-06-03 19:59:58.774336785 +0200
+++ gcc/testsuite/gcc.dg/pr114493-1.c   2024-06-03 19:59:12.931944923 +0200
@@ -0,0 +1,19 @@
+/* PR c/114493 */
+/* { dg-do compile { target lto } } */
+/* { dg-options "-O2 -flto" } */
+
+void foo (void);
+struct S;
+struct S bar (struct S **);
+struct S qux (const struct S **);
+
+struct __attribute__((__may_alias__)) S {
+  int s;
+};
+
+struct S
+baz (void)
+{
+  foo ();
+  return (struct S) {};
+}
--- gcc/testsuite/gcc.dg/pr114493-2.c.jj  2024-06-03 19:59:58.774336785 +0200
+++ gcc/testsuite/gcc.dg/pr114493-2.c   2024-06-03 20:01:00.886512830 +0200
@@ -0,0 +1,26 @@
+/* PR c/114493 */
+/* { dg-do compile { target lto } } */
+/* { dg-options "-O2 -flto -std=c23" } */
+
+void foo (void);
+struct S;
+struct S bar (struct S **);
+struct S qux (const struct S **);
+
+void
+corge (void)
+{
+  struct S { int s; } s;
+  s.s = 0;
+}
+
+struct __attribute__((__may_alias__)) S {
+  int s;
+};
+
+struct S
+baz (void)
+{
+  foo ();
+  return (struct S) {};
+}

Jakub



[PATCH] builtins: Force SAVE_EXPR for __builtin_{add,sub,mul}_overflow and __builtin_{add,sub}c [PR108789]

2024-06-04 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled, because we use save_expr
on the .{ADD,SUB,MUL}_OVERFLOW call we are creating, but if the first
two operands are not INTEGER_CSTs (in that case we just fold it right away)
but are TREE_READONLY/!TREE_SIDE_EFFECTS, save_expr doesn't actually
create a SAVE_EXPR at all and so we lower it to
*arg2 = REALPART_EXPR (.ADD_OVERFLOW (arg0, arg1)), \
IMAGPART_EXPR (.ADD_OVERFLOW (arg0, arg1))
which evaluates the ifn twice and just hopes it will be CSEd back.
As *arg2 aliases *arg0, that is not the case.
The builtins are really never const/pure as they store into what
the third argument points to, so after handling the INTEGER_CST+INTEGER_CST
case, I think we should just always use SAVE_EXPR.  Just building SAVE_EXPR
by hand and setting TREE_SIDE_EFFECTS on it doesn't work, because
c_fully_fold optimizes it away again, so the following patch marks the
ifn calls as TREE_SIDE_EFFECTS (but doesn't do it for the
__builtin_{add,sub,mul}_overflow_p case which were designed for use
especially in constant expressions and don't really evaluate the
realpart side, so we don't really need a SAVE_EXPR in that case).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-06-04  Jakub Jelinek  

PR middle-end/108789
* builtins.cc (fold_builtin_arith_overflow): For ovf_only,
don't call save_expr and don't build REALPART_EXPR, otherwise
set TREE_SIDE_EFFECTS on call before calling save_expr.
(fold_builtin_addc_subc): Set TREE_SIDE_EFFECTS on call before
calling save_expr.

* gcc.c-torture/execute/pr108789.c: New test.

--- gcc/builtins.cc.jj  2024-04-05 09:19:47.899050410 +0200
+++ gcc/builtins.cc 2024-06-03 17:27:11.071693074 +0200
@@ -10042,7 +10042,21 @@ fold_builtin_arith_overflow (location_t
   tree ctype = build_complex_type (type);
   tree call = build_call_expr_internal_loc (loc, ifn, ctype, 2,
arg0, arg1);
-  tree tgt = save_expr (call);
+  tree tgt;
+  if (ovf_only)
+   {
+ tgt = call;
+ intres = NULL_TREE;
+   }
+  else
+   {
+ /* Force SAVE_EXPR even for calls which satisfy tree_invariant_p_1,
+as while the call itself is const, the REALPART_EXPR store is
+certainly not.  And in any case, we want just one call,
+not multiple and trying to CSE them later.  */
+ TREE_SIDE_EFFECTS (call) = 1;
+ tgt = save_expr (call);
+   }
   intres = build1_loc (loc, REALPART_EXPR, type, tgt);
   ovfres = build1_loc (loc, IMAGPART_EXPR, type, tgt);
   ovfres = fold_convert_loc (loc, boolean_type_node, ovfres);
@@ -10354,11 +10368,17 @@ fold_builtin_addc_subc (location_t loc,
   tree ctype = build_complex_type (type);
   tree call = build_call_expr_internal_loc (loc, ifn, ctype, 2,
args[0], args[1]);
+  /* Force SAVE_EXPR even for calls which satisfy tree_invariant_p_1,
+ as while the call itself is const, the REALPART_EXPR store is
+ certainly not.  And in any case, we want just one call,
+ not multiple and trying to CSE them later.  */
+  TREE_SIDE_EFFECTS (call) = 1;
   tree tgt = save_expr (call);
   tree intres = build1_loc (loc, REALPART_EXPR, type, tgt);
   tree ovfres = build1_loc (loc, IMAGPART_EXPR, type, tgt);
   call = build_call_expr_internal_loc (loc, ifn, ctype, 2,
   intres, args[2]);
+  TREE_SIDE_EFFECTS (call) = 1;
   tgt = save_expr (call);
   intres = build1_loc (loc, REALPART_EXPR, type, tgt);
   tree ovfres2 = build1_loc (loc, IMAGPART_EXPR, type, tgt);
--- gcc/testsuite/gcc.c-torture/execute/pr108789.c.jj  2024-06-03 17:15:01.143366766 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr108789.c  2024-06-03 17:12:55.189036744 +0200
@@ -0,0 +1,39 @@
+/* PR middle-end/108789 */
+
+int
+add (unsigned *r, const unsigned *a, const unsigned *b)
+{
+  return __builtin_add_overflow (*a, *b, r);
+}
+
+int
+mul (unsigned *r, const unsigned *a, const unsigned *b)
+{
+  return __builtin_mul_overflow (*a, *b, r);
+}
+
+int
+main ()
+{
+  unsigned x;
+
+  /* 1073741824U + 1073741824U should not overflow.  */
+  x = (__INT_MAX__ + 1U) / 2;
+  if (add (&x, &x, &x))
+    __builtin_abort ();
+
+  /* 256U * 256U should not overflow */
+  x = 1U << (sizeof (int) * __CHAR_BIT__ / 4);
+  if (mul (&x, &x, &x))
+    __builtin_abort ();
+
+  /* 2147483648U + 2147483648U should overflow */
+  x = __INT_MAX__ + 1U;
+  if (!add (&x, &x, &x))
+    __builtin_abort ();
+
+  /* 65536U * 65536U should overflow */
+  x = 1U << (sizeof (int) * __CHAR_BIT__ / 2);
+  if (!mul (&x, &x, &x))
+__builtin_abort ();
+}

Jakub



Re: [PATCH v6 1/8] Improve must tail in RTL backend

2024-06-03 Thread Jakub Jelinek
On Mon, Jun 03, 2024 at 07:02:00PM +0200, Michael Matz wrote:
> Hello,
> 
> On Fri, 31 May 2024, Andi Kleen wrote:
> 
> > > I think the ultimate knowledge if a call can or cannot be implemented as 
> > > tail-call lies within calls.cc/expand_call: It is inherently 
> > > target and ABI specific how arguments and returns are laid out, how the 
> > > stack frame is generated, if arguments are or aren't removed by callers 
> > > or callees and so on; all of that being knowledge that tree-tailcall 
> > > doesn't have and doesn't want to have.  As such tree-tailcall should 
> > > not be regarded as ultimate truth, and failures of tree-tailcall to 
> > > recognize something as tail-callable shouldn't matter.
> > 
> > It's not the ultimate truth, but some of the checks it does are not 
> > duplicated at expand time nor the backend. So it's one necessary pre 
> > condition with the current code base.
> > 
> > Yes maybe the checks could be all moved, but that's a much larger 
> > project.
> 
> Hmm.  I count six tests in about 25 lines of code in 
> tree-tailcall.cc:suitable_for_tail_opt_p and suitable_for_tail_call_opt_p.
> 
> Are you perhaps worrying about the sibcall discovery itself (i.e. much of 
> find_tail_calls)?  Why would that be needed for musttail?  Is that 
> attribute sometimes applied to calls that aren't in fact sibcall-able?
> 
> One thing I'm worried about is the need for a new sibcall pass at O0 just 
> for sibcall discovery.  find_tail_calls isn't cheap, because it computes 
> live local variables for the whole function, potentially being quadratic.

But the pass could be done only if there is at least one musttail call
in a function (remembered in some cfun flag).  If people use that attribute,
guess they are willing to pay for it.

Jakub



Re: [PATCH v7 4/9] C++: Support clang compatible [[musttail]] (PR83324)

2024-06-03 Thread Jakub Jelinek
On Mon, Jun 03, 2024 at 08:33:52AM -0700, Andi Kleen wrote:
> On Mon, Jun 03, 2024 at 10:42:20AM -0400, Jason Merrill wrote:
> > > @@ -30316,7 +30348,7 @@ cp_parser_std_attribute (cp_parser *parser, tree 
> > > attr_ns)
> > >   /* Maybe we don't expect to see any arguments for this attribute.  
> > > */
> > >   const attribute_spec *as
> > > = lookup_attribute_spec (TREE_PURPOSE (attribute));
> > > -if (as && as->max_length == 0)
> > > +if ((as && as->max_length == 0) || is_attribute_p ("musttail", 
> > > attr_id))
> > 
> > This shouldn't be necessary with the attribute in the c-attribs table,
> > right?  This patch is OK without this hunk and with the comment tweak above.
> 
> Yes I will remove it. Also the hunk above can be simplified, we don't
> need the extra case anymore.
> 
> But unfortunately there's another problem (sorry I missed that earlier
> but the Linaro bot pointed it out again):
> 
> This hunk:
> 
> @@ -21085,12 +21085,14 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
> complain, tree in_decl)
>   bool op = CALL_EXPR_OPERATOR_SYNTAX (t);
>   bool ord = CALL_EXPR_ORDERED_ARGS (t);
>   bool rev = CALL_EXPR_REVERSE_ARGS (t);
> - if (op || ord || rev)
> + bool mtc = CALL_EXPR_MUST_TAIL_CALL (t);
> + if (op || ord || rev || mtc)
> if (tree call = extract_call_expr (ret))
>   {
> CALL_EXPR_OPERATOR_SYNTAX (call) = op;
> CALL_EXPR_ORDERED_ARGS (call) = ord;
> CALL_EXPR_REVERSE_ARGS (call) = rev;
> +   CALL_EXPR_MUST_TAIL_CALL (call) = mtc;
>   }

The difference is that CALL_EXPR_MUST_TAIL_CALL is defined as:
#define CALL_EXPR_MUST_TAIL_CALL(NODE) \
  (CALL_EXPR_CHECK (NODE)->base.static_flag)
while the others like:
#define CALL_EXPR_ORDERED_ARGS(NODE) \
  TREE_LANG_FLAG_3 (CALL_OR_AGGR_INIT_CHECK (NODE))
where
#define CALL_OR_AGGR_INIT_CHECK(NODE) \
  TREE_CHECK2 ((NODE), CALL_EXPR, AGGR_INIT_EXPR)
while
#define CALL_EXPR_CHECK(t)  TREE_CHECK (t, CALL_EXPR)
(this one is defined in generated tree-check.h).
So, while the CALL_EXPR_REVERSE_ARGS etc. can be used on either
CALL_EXPR or AGGR_INIT_EXPR (the latter is a C++ specific tree code),
CALL_EXPR_MUST_TAIL_CALL is allowed only on CALL_EXPR.
AGGR_INIT_EXPR is used for C++ constructor calls, so I think
they really don't need such a flag, so you could do:
bool mtc = (TREE_CODE (t) == CALL_EXPR
? CALL_EXPR_MUST_TAIL_CALL (t) : false);
if (op || ord || rev || mtc)
...
  if (mtc)
CALL_EXPR_MUST_TAIL_CALL (call) = 1;
or something similar.
Or you'd need to define a variant of the CALL_EXPR_MUST_TAIL_CALL
macro for the C++ FE (as CALL_OR_AGGR_INIT_CHECK is C++ FE too)
and use that in the FE and somehow assert it means the same thing
as the middle-end flag except that it can be also used on AGGR_INIT_EXPR.
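
A toy model of the guarded access, in case it helps.  This is a sketch with
made-up types (toy_node, toy_code and the toy_* functions are hypothetical);
it only mirrors the shape of the checks, not GCC's real tree accessors.

```c
#include <stdbool.h>

enum toy_code { TOY_CALL_EXPR, TOY_AGGR_INIT_EXPR };

struct toy_node
{
  enum toy_code code;
  bool static_flag;   /* stands in for base.static_flag */
  bool lang_flag_3;   /* stands in for TREE_LANG_FLAG_3 */
};

/* Accessor valid for both codes, like CALL_EXPR_ORDERED_ARGS.  */
bool
toy_ordered_args (const struct toy_node *t)
{
  return t->lang_flag_3;
}

/* Guarded read, mirroring
     TREE_CODE (t) == CALL_EXPR ? CALL_EXPR_MUST_TAIL_CALL (t) : false  */
bool
toy_must_tail (const struct toy_node *t)
{
  return t->code == TOY_CALL_EXPR ? t->static_flag : false;
}

/* Copy the flags roughly the way the suggested hunk would.  */
void
toy_copy_flags (struct toy_node *call, const struct toy_node *t)
{
  call->lang_flag_3 = toy_ordered_args (t);
  if (toy_must_tail (t))
    call->static_flag = true;
}
```

The guard means a node of the second kind never has its static flag
interpreted as a must-tail request, whatever that bit happens to hold.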

Jakub



[PATCH] rs6000: Fix up PCH in --enable-host-pie builds [PR115324]

2024-06-03 Thread Jakub Jelinek
Hi!

PCH doesn't work properly in --enable-host-pie configurations on
powerpc*-linux*.
The problem is that the rs6000_builtin_info and rs6000_instance_info
arrays mix pointers to .rodata/.data (bifname and attr_string point
to string literals in .rodata section, and the next member is either NULL
or &rs6000_instance_info[XXX]) and GC member (tree fntype).
Now, for normal GC this works just fine, we emit
  {
    &rs6000_instance_info[0].fntype,
    1 * (RS6000_INST_MAX),
    sizeof (rs6000_instance_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
  {
    &rs6000_builtin_info[0].fntype,
    1 * (RS6000_BIF_MAX),
    sizeof (rs6000_builtin_info[0]),
    &gt_ggc_mx_tree_node,
    &gt_pch_nx_tree_node
  },
GC roots which are strided and thus cover only the fntype members of all
the elements of the two arrays.
For PCH though it actually results in saving those huge arrays (one is
130832 bytes, another 81568 bytes) into the .gch files and loading them back
in full.  While the bifname and attr_string and next pointers are marked as
GTY((skip)), they are actually saved to point to the .rodata and .data
sections of the process which writes the PCH, but because cc1/cc1plus etc.
are position independent executables with --enable-host-pie, when the data is
loaded from the PCH file, those pointers can refer to completely different
addresses, where nothing is mapped at all or where something unrelated lives.
While gengtype supports the callback option, that one is meant for
relocatable function pointers and doesn't work in the case of GTY arrays
inside of .data section anyway.

So, either we'd need to add some further GTY extensions, or the following
patch instead reworks it such that the fntype members which were the only
reason for PCH in those arrays are moved to separate arrays.
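
As a rough sketch of why plain indices are safe where saved pointers are not
(the ChangeLog below also turns the next/first_instance links into ints):
data that only contains indices can be copied or loaded at any address and
still be walked.  The struct and function names here are hypothetical and
much simpler than the generated rs6000 tables.

```c
#include <string.h>

struct rec
{
  int value;
  int next;   /* index of the next record, -1 ends the chain */
};

/* Walk a chain starting at FIRST using only indices into TABLE.  */
int
sum_chain (const struct rec *table, int first)
{
  int sum = 0;
  for (int i = first; i >= 0; i = table[i].next)
    sum += table[i].value;
  return sum;
}

/* Simulate "save, then load at another address" with a byte copy.  */
void
relocate (struct rec *dst, const struct rec *src, size_t n)
{
  memcpy (dst, src, n * sizeof *src);
}
```

Had next been a struct rec *, the copy would still point into the original
table, which is the PIE problem in miniature.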

Size-wise in .data sections it is (in bytes):

                             vanilla  patched
rs6000_builtin_info           130832   110704
rs6000_instance_info           81568    40784
rs6000_overload_info            7392     7392
rs6000_builtin_info_fntype         0    10064
rs6000_instance_info_fntype        0    20392
sum                           219792   189336

where previously we saved/restored for PCH those 130832+81568 bytes, now we
save/restore just 10064+20392 bytes, so this change is beneficial for the
data section size.

Unfortunately, it grows the size of the rs6000_init_generated_builtins
function, vanilla had 218328 bytes, patched has 228668.

When I applied
 void
 rs6000_init_generated_builtins ()
 {
+  bifdata *rs6000_builtin_info_p;
+  tree *rs6000_builtin_info_fntype_p;
+  ovlddata *rs6000_instance_info_p;
+  tree *rs6000_instance_info_fntype_p;
+  ovldrecord *rs6000_overload_info_p;
+  __asm ("" : "=r" (rs6000_builtin_info_p) : "0" (rs6000_builtin_info));
+  __asm ("" : "=r" (rs6000_builtin_info_fntype_p) : "0" (rs6000_builtin_info_fntype));
+  __asm ("" : "=r" (rs6000_instance_info_p) : "0" (rs6000_instance_info));
+  __asm ("" : "=r" (rs6000_instance_info_fntype_p) : "0" (rs6000_instance_info_fntype));
+  __asm ("" : "=r" (rs6000_overload_info_p) : "0" (rs6000_overload_info));
+  #define rs6000_builtin_info rs6000_builtin_info_p
+  #define rs6000_builtin_info_fntype rs6000_builtin_info_fntype_p
+  #define rs6000_instance_info rs6000_instance_info_p
+  #define rs6000_instance_info_fntype rs6000_instance_info_fntype_p
+  #define rs6000_overload_info rs6000_overload_info_p
+
hack by hand, the size of the function is 209700 though, so if really
wanted, we could add __attribute__((__noipa__)) to the function when
building with recent enough GCC and pass pointers to the first elements
of the 5 arrays to the function as arguments.  If you want such a change,
could that be done incrementally?

Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (-m32/-m64
testing there), ok for trunk and after a while for release branches?

2024-06-03  Jakub Jelinek  

PR target/115324
* config/rs6000/rs6000-gen-builtins.cc (write_decls): Remove
GTY markup from struct bifdata and struct ovlddata and remove their
fntype members.  Change next member in struct ovlddata and
first_instance member of struct ovldrecord to have int type rather
than struct ovlddata *.  Remove GTY markup from rs6000_builtin_info
and rs6000_instance_info arrays, declare new
rs6000_builtin_info_fntype and rs6000_instance_info_fntype arrays,
which have GTY markup.
(write_bif_static_init): Adjust for the above changes.
(write_ovld_static_init): Likewise.
(write_init_bif_table): Likewise.
(write_init_ovld_table): Likewise.
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-c.cc (find_instance): Likewise.  Make static.
(altivec_resolve_overloaded_builtin): Adjust for

Re: [PATCH 17/52] gcn: Remove macros {FLOAT, DOUBLE, LONG_DOUBLE}_TYPE_SIZE

2024-06-03 Thread Jakub Jelinek
On Mon, Jun 03, 2024 at 05:41:11PM +0800, Kewen.Lin wrote:
> > GCN does have some partially implemented support for HFmode ... do I need 
> > to do something new for that to work?
> 
> For this hook, no, as it's mainly for float, double and long double types (C 
> language supported non decimal floating
> point types).  If you are referring to _Float16, I guess you may be 
> interested in another hook TARGET_FLOATN_MODE
> which is for FloatN types.

You don't need a new hook for that, the current _FloatNN discovery code is all
that is needed.  There should be just one mode for the IEEE compliant
implementations for each size (there is the _Float16 vs. __bf16 but the
latter isn't IEEE compliant, or just IEEE like), so tree.cc should figure
everything out together with the current langhooks.

Jakub



Re: [PATCH v2] [libstdc++] add _GLIBCXX_CLANG to workaround predefined __clang__

2024-06-01 Thread Jakub Jelinek
On Sat, Jun 01, 2024 at 09:21:53AM +0100, Jonathan Wakely wrote:
> On Fri, 31 May 2024 at 18:43, Alexandre Oliva  wrote:
> >
> > On May 31, 2024, Alexandre Oliva  wrote:
> >
> > >> So either don't change this line at all, or just do a simple
> > >> s/__clang__/_GLIBCXX_CLANG/
> >
> > > If c++config can be counted on, I'd be happy to do that, but I couldn't
> > > tell that it could.
> >
> > Here's what I've retested on x86_64-linux-gnu and, slightly adjusted for
> > gcc-13, on arm-vx7r2.  Ok to install?
> 
> OK
> 
> If there's any chance of getting the vxworks system headers fixed to
> work with GCC properly, that would be nice.

Fixincludes?
That seems like the standard way to work around bugs in system headers on
proprietary targets.

Jakub



Re: [PATCH 01/11] OpenMP/PolyInt: Pass poly-int structures by address to OMP libs.

2024-05-31 Thread Jakub Jelinek
On Fri, May 31, 2024 at 08:45:54AM +0100, Richard Sandiford wrote:
> > When you say same way, do you mean the way SVE ABI defines the rules for 
> > SVE types?
> 
> No, sorry, I meant that if the choice isn't purely local to a source
> code function, the condition should be something like sizeless_type_p
> (suitably abstracted) rather than POLY_INT_CST_P.  That way, the "ABI"
> stays the same regardless of -msve-vector-bits.

There is no ABI, it is how the caller and indirect callee communicate,
but both parts are compiled with the same compiler, so it can choose
differently based on different compiler version etc.
It is effectively simplified:
struct whatever { ... };
void callee (void *x) { struct whatever *w = x; use *w; }
void caller (void) { struct whatever w; fill in w; ABI_call (callee, &w); }
(plus in some cases the callee can also update values and propagate that
back to caller).
In any case, it is a similar "ABI" to e.g. tree-nested.cc communication
between caller and nested callee, how exactly are the variables laid out
in a struct depends on compiler version and whatever it decides, same
compiler then emits both sides.
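
Spelled out as compilable C, purely as a sketch: abi_call is a hypothetical
stand-in for the runtime entry point and the two fields are invented for the
example.

```c
struct whatever
{
  int in;    /* value passed from the caller to the callee */
  int out;   /* value the callee propagates back */
};

/* The outlined body: receives an opaque pointer to the caller's struct.  */
static void
callee (void *x)
{
  struct whatever *w = x;
  w->out = w->in * 2;
}

/* Hypothetical stand-in for the runtime call (ABI_call above).  */
static void
abi_call (void (*fn) (void *), void *data)
{
  fn (data);
}

int
caller (int v)
{
  struct whatever w;
  w.in = v;
  w.out = 0;
  abi_call (callee, &w);
  return w.out;
}
```

Only caller and callee know the layout of struct whatever, and both come out
of the same compilation, so the layout can change freely between compiler
versions.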

Jakub



Re: [patch] libgomp: Enable USM for some nvptx devices

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 08:20:01AM +0200, Tobias Burnus wrote:
> +  if (num_devices > 0
> +  && (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY))
> +for (int dev = 0; dev < num_devices; dev++)
> +  {
> + int pi;
> + CUresult r;
> + r = CUDA_CALL_NOCHECK (cuDeviceGetAttribute, &pi,
> +   CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS,
> +   dev);

Formatting nit, the CU_DEVICE_... should be below cuDeviceGetAttribute,
I think it fits like that (if it wouldn't one could use a temporary
variable).

Otherwise LGTM.

Jakub



Re: [patch] libgomp: Enable USM for AMD APUs and MI200 devices

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 02:15:07PM +0200, Tobias Burnus wrote:
> +  bool b;
> +  hsa_status_t status;
> +  status = hsa_fns.hsa_system_get_info_fn (
> +  HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT, &b);
> +  if (status != HSA_STATUS_SUCCESS)
> + GOMP_PLUGIN_error (
> +   "HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT failed");

Formatting, the (s at the end of lines look terrible.
In the first case, perhaps using a temporary would help,
  hsa_system_info_t arg = HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT;
  status = hsa_fns.hsa_system_get_info_fn (arg, &b);
(or use something else instead of arg, as long as it's short), while in the
second
GOMP_PLUGIN_error ("HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT "
   "failed");
will do.

Other than that LGTM.

Jakub



Re: [patch] OpenMP: Add -fopenmp-force-usm mode

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 08:49:01AM +0200, Tobias Burnus wrote:
> Jakub Jelinek wrote:
> > I mean, if we want to add something, maybe better would be an -include like
> > option that instead of including a file includes it directly.
> > gcc --include-inline '#pragma omp requires unified_shared_memory' ...
> 
> Likewise for Fortran, but there the question is whether it should be in the
> use-stmt, import-stmt, implicit-part or declaration-part; I guess having one
> --include-inline-use-stmt and --include-inline-declaration would make sense

Maybe name it slightly differently for Fortran and have the place where it
should be added as one argument, so --whatever=where=what

> And, I guess, multiple flags should be permitted, which can then be
> processed as separate lines.

Obviously.  That was the intent with --include-inline= for C as well,
after all, -include works that way too.
-include a.h -include b.h -include c.h

Jakub



Re: [patch] OpenMP: Add -fopenmp-force-usm mode

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 08:41:04AM +0200, Tobias Burnus wrote:
> Jakub Jelinek wrote:
> > How is that option different from
> > echo '#pragma omp requires unified_shared_memory' > omp-usm.h
> > gcc -include omp-usm.h
> > ?
> > I mean with -include you can add anything you want, not just one particular
> > directive, and adding a separate option for each is just weird.
> 
> For C/C++, -include seems to be indeed sufficient (albeit not widely known).
> For Fortran, there are two issues: One placement/semantic issue: it has to be
> added per "compilation unit", i.e. to the specification part of a module,
> subprogram or main program. And a practical issue, gfortran shows:
> 
> error: command-line option '-include !$omp requires' is valid for
> C/C++/ObjC/ObjC++ but not for Fortran
> 
> Thus, for Fortran it is still intrinsically useful – even if one can argue
> whether that feature is needed at all / whether it should be added as
> command-line argument.

But then shouldn't we have an option that adds something at the start of
the declaration part of each ?
I mean, option to add 'implicit none' everywhere, or this
'!$omp requires unified_shared_memory' etc.?

I could live with a one-off option for clang compatibility, I just fear
that in 2 years we'll need another one etc. and that solving it in some more
versatile way would be better.

Jakub



Re: [patch] OpenMP: Add -fopenmp-force-usm mode

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 08:26:04AM +0200, Jakub Jelinek wrote:
> > *I am especially thinking about a global variable and "#pragma omp declare
> > target". At least with 'omp requires self_maps' of OpenMP 6, it seems as if
> > 'declare target enter(global_var)' should become 'link(global_var)' where
> > the global_var pointer is updated to point to the host version.
> 
> How is that option different from
> echo '#pragma omp requires unified_shared_memory' > omp-usm.h
> gcc -include omp-usm.h
> ?
> I mean with -include you can add anything you want, not just one particular
> directive, and adding a separate option for each is just weird.

I mean, if we want to add something, maybe better would be an -include like
option that instead of including a file includes it directly.
gcc --include-inline '#pragma omp requires unified_shared_memory' ...

Jakub



Re: [patch] OpenMP: Add -fopenmp-force-usm mode

2024-05-29 Thread Jakub Jelinek
On Tue, May 28, 2024 at 09:23:41PM +0200, Tobias Burnus wrote:
> -fopenmp-force-usm can be useful for some badly written code. Explicitly
> using 'omp requires' makes more sense but still. It might also make sense
> for testing purpose.
> 
> Unfortunately, I did not see a simple way of testing it. When trying it
> manually, I looked at the 'a.xamdgcn-amdhsa.c' -save-temps file, where
> gcn_data has the omp_requires_mask as second argument and testing showed
> that an explicit pragma and the -f... argument have the same result.
> 
> Alternative would be to move this code later, e.g. to lto-cgraph.cc's
> omp_requires_mask, which might be safer (as it avoids changing as many
> locations). On the other hand, it might require more special cases
> elsewhere.*
> 
> Comment, suggestions?
> 
> Tobias
> 
> *I am especially thinking about a global variable and "#pragma omp declare
> target". At least with 'omp requires self_maps' of OpenMP 6, it seems as if
> 'declare target enter(global_var)' should become 'link(global_var)' where
> the global_var pointer is updated to point to the host version.

How is that option different from
echo '#pragma omp requires unified_shared_memory' > omp-usm.h
gcc -include omp-usm.h
?
I mean with -include you can add anything you want, not just one particular
directive, and adding a separate option for each is just weird.

Jakub



Re: [Patch] testsuite/*/gomp: Remove 'dg-prune-output "not supported yet"'

2024-05-28 Thread Jakub Jelinek
On Tue, May 28, 2024 at 07:43:00PM +0200, Tobias Burnus wrote:
> Improve test coverage by removing 'prune-output' given that the features are
> implemented in the meanwhile.
> 
> Comments, suggestions? Otherwise I will commit the patch as obvious.
> 
> Tobias

> testsuite/*/gomp: Remove 'dg-prune-output "not supported yet"'
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/gomp/lastprivate-conditional-1.c: Remove
>   '{ dg-prune-output "not supported yet" }'.
>   * c-c++-common/gomp/requires-1.c: Likewise.
>   * c-c++-common/gomp/requires-2.c: Likewise.
>   * c-c++-common/gomp/reverse-offload-1.c: Likewise.
>   * g++.dg/gomp/requires-1.C: Likewise.
>   * gfortran.dg/gomp/requires-1.f90: Likewise.
>   * gfortran.dg/gomp/requires-2.f90: Likewise.
>   * gfortran.dg/gomp/requires-4.f90: Likewise.
>   * gfortran.dg/gomp/requires-5.f90: Likewise.
>   * gfortran.dg/gomp/requires-6.f90: Likewise.
>   * gfortran.dg/gomp/requires-7.f90: Likewise.

LGTM.

Jakub



Re: [PATCH] tree-optimization/115232 - demangle failure during -Waccess

2024-05-27 Thread Jakub Jelinek
On Mon, May 27, 2024 at 11:11:43AM +0200, Richard Biener wrote:
> For the following testcase we fail to demangle
> _ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnernwEm and
> _ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnerdlEPv and in turn end
> up building NULL references.  The following puts in a safeguard for
> failed demangling into -Waccess.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> 
> Thanks,
> Richard.
> 
>   PR tree-optimization/115232
>   * gimple-ssa-warn-access.cc (new_delete_mismatch_p): Handle
>   failure to demangle gracefully.
> 
>   * g++.dg/pr115232.C: New testcase.

LGTM, thanks.

Jakub



[PATCH] libstdc++: Fix up 19_diagnostics/stacktrace/hash.cc on 13 branch

2024-05-27 Thread Jakub Jelinek
Hi!

The r13-8207-g17acf9fbeb10d7adad commit changed some tests to use
-lstdc++exp instead of -lstdc++_libbacktrace, but it didn't change
the 19_diagnostics/stacktrace/hash.cc test, presumably because
when it was added on the trunk, it already had -lstdc++exp and
it was changed to -lstdc++_libbacktrace only in the
r13-8067-g16635b89f36c07b9e0 cherry-pick.

The test fails with
/usr/bin/ld: cannot find -lstdc++_libbacktrace
collect2: error: ld returned 1 exit status
compiler exited with status 1
FAIL: 19_diagnostics/stacktrace/hash.cc (test for excess errors)
without this (while the library is still built, it isn't added in
-L options).

Ok for 13 branch?

I think the r13-8067 cherry-pick hasn't been applied to 12 branch,
so we don't need it there.

2024-05-27  Jakub Jelinek  

* testsuite/19_diagnostics/stacktrace/hash.cc: Adjust
dg-options to use -lstdc++exp.

--- libstdc++-v3/testsuite/19_diagnostics/stacktrace/hash.cc.jj  2023-11-22 11:03:28.812657550 +0100
+++ libstdc++-v3/testsuite/19_diagnostics/stacktrace/hash.cc  2024-05-27 10:18:44.900058884 +0200
@@ -1,4 +1,4 @@
-// { dg-options "-std=gnu++23 -lstdc++_libbacktrace" }
+// { dg-options "-std=gnu++23 -lstdc++exp" }
 // { dg-do run { target c++23 } }
 // { dg-require-effective-target stacktrace }
 


Jakub



Re: [C PATCH, v2]: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-24 Thread Jakub Jelinek
On Fri, May 24, 2024 at 05:39:45PM +0200, Martin Uecker wrote:
> PR 115157
> PR 115177
> 
> gcc/c/
> * c-decl.cc (shadow_tag_warned,parse_xref_tag,start_enum,
> finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / TYPE_CANONICAL.
> * c-objc-common.cc (get_alias_set): Remove special case.
> (get_aka_type): Add special case.
> 
> gcc/c-family/
> * c-attribs.cc (handle_hardbool_attribute): Set TYPE_CANONICAL
> for hardbools.
> 
> gcc/
> * godump.cc (go_output_typedef): use TYPE_MAIN_VARIANT instead
> of TYPE_CANONICAL.

Just a nit:
s/use/Use/

Jakub



Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread Jakub Jelinek
On Wed, May 22, 2024 at 05:23:33PM +0800, YunQiang Su wrote:
> On Wed, May 22, 2024 at 17:14, Jakub Jelinek wrote:
> >
> > On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote:
> > > > --- gcc/gcc.cc.jj   2024-02-09 14:54:09.141489744 +0100
> > > > +++ gcc/gcc.cc  2024-02-09 22:04:37.655678742 +0100
> > > > @@ -2410,8 +2410,7 @@ read_specs (const char *filename, bool m
> > > >   if (*p1++ != '<' || p[-2] != '>')
> > > > fatal_error (input_location,
> > > >  "specs %%include syntax malformed after "
> > > > -"%ld characters",
> > > > -(long) (p1 - buffer + 1));
> > > > +"%td characters", p1 - buffer + 1);
> > > >
> > >
> > > Should we use %td later for gcc itself? Since we may use older
> > > compiler to build gcc.
> > > My major workstation is Debian Bookworm, which has GCC 12, and then I
> > > get some warnings:
> >
> > That is fine and expected.  During stage1 such warnings are intentionally
> > not fatal, only in stage2+ when we know it is the same version of gcc
> > we want those can be fatal.
> 
> It may have only 1 stage in some cases.
> For example we have a full binutils/libc stack, and just build a cross-gcc.
> For all libraries for target, such as libgcc etc, it is OK; while for
> host executables
> it will be a problem.

That is still ok, it is just a warning about unknown gcc format specifiers;
at runtime the code from the compiler being built will be used and that
handles those.  We have added dozens of these over the years, %td/%zd certainly
aren't an exception.  Just try to build with some older gcc version, say
4.8.5, and you'll see far more such warnings.
But also, as recommended, you shouldn't be building a cross-gcc with an old
version of gcc; you should use the same version of the native compiler to
build the cross compiler.

https://gcc.gnu.org/install/build.html

"To build a cross compiler, we recommend first building and installing a native
compiler. You can then use the native GCC compiler to build the cross
compiler."

Jakub



Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread Jakub Jelinek
On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote:
> > --- gcc/gcc.cc.jj   2024-02-09 14:54:09.141489744 +0100
> > +++ gcc/gcc.cc  2024-02-09 22:04:37.655678742 +0100
> > @@ -2410,8 +2410,7 @@ read_specs (const char *filename, bool m
> >   if (*p1++ != '<' || p[-2] != '>')
> > fatal_error (input_location,
> >  "specs %%include syntax malformed after "
> > -"%ld characters",
> > -(long) (p1 - buffer + 1));
> > +"%td characters", p1 - buffer + 1);
> >
> 
> Should we use %td later for gcc itself? Since we may use older
> compiler to build gcc.
> My major workstation is Debian Bookworm, which has GCC 12, and then I
> get some warnings:

That is fine and expected.  During stage1 such warnings are intentionally
not fatal; only in stage2+, when we know the compiler doing the build is
the same version of gcc, do we want them to be fatal.
Otherwise we could never add any new modifiers...

Jakub



Re: [PATCH] Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.

2024-05-22 Thread Jakub Jelinek
On Wed, May 22, 2024 at 09:46:41AM +0200, Richard Biener wrote:
> On Wed, May 22, 2024 at 3:58 AM liuhongt  wrote:
> >
> > According to the IEEE standard, for conversions from floating point to
> > integer: when a NaN or infinite operand cannot be represented in the
> > destination format and this cannot otherwise be indicated, the invalid
> > operation exception shall be signaled. When a numeric operand would
> > convert to an integer outside the range of the destination format, the
> > invalid operation exception shall be signaled if this situation cannot
> > otherwise be indicated.
> >
> > The patch prevents simplification of the conversion from floating point
> > to integer for NAN/INF/out-of-range constants when flag_trapping_math is set.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> > Ok for trunk?
> 
> OK if there are no further comments today.

As I wrote in the PR, I don't think this is the right fix for the PR.
The simplify-rtx.cc change itself is the right thing to do: the C standard
in F.4 says that the out of range conversions to integers should raise
exceptions, but still says that the resulting value in those cases is
unspecified.
So, for the C part we should verify that with -ftrapping-math we don't
constant fold it and cover it both by pure C and perhaps backend specific
testcases which just search asm for the conversion instructions
or even runtime test which tests that the exceptions are triggered,
verify that we don't fold it either during GIMPLE opts or RTL opts
(dunno whether they can be folded in e.g. C constant initializers or not).

And then on the backend side, it should stop using FIX/UNSIGNED_FIX RTLs
in patterns which are used for the intrinsics (and keep using them in
patterns used for C scalar/vector conversions), because even with
-fno-trapping-math the intrinsics should have the documented behavior,
particular result value, while in C they are clearly unspecified and
FIX/UNSIGNED_FIX folding even with the patch chooses some particular values
which don't match those (sure, they are like that because of Java, but am
not sure it is the right time to change what we do in those cases say
by providing a target hook to pick a different value).

The provided testcase tests the values though, so I think it is inappropriate
for this patch.
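
For reference, the folding condition described in the quoted patch summary
can be modelled roughly as follows.  This is a sketch only: can_fold_fix and
its parameters are hypothetical names, it is not the simplify-rtx.cc code,
and it says nothing about which value a fold should produce.

```python
import math


def can_fold_fix(value, bits=32, signed=True, trapping_math=True):
    """Model: may a float -> integer conversion of a constant be folded?

    Hypothetical helper.  With trapping math, NaN, infinities and values
    whose truncation does not fit the destination must stay unfolded so the
    invalid-operation exception can still be raised at runtime.
    """
    if not trapping_math:
        return True
    if math.isnan(value) or math.isinf(value):
        return False
    truncated = math.trunc(value)  # conversion rounds toward zero
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return lo <= truncated <= hi
```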

Jakub



[PATCH] strlen: Fix up !si->full_string_p handling in count_nonzero_bytes_addr [PR115152]

2024-05-21 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled because
strlen_pass::count_nonzero_bytes_addr doesn't handle correctly
the !si->full_string_p case.
If si->full_string_p, it correctly computes minlen and maxlen as
minimum and maximum length of the '\0' terminated string and
clears *nulterm (ie. makes sure !full_string_p in the ultimate
caller) if minlen is equal to or larger than nbytes and so
'\0' isn't guaranteed to be among those bytes.
But in the !si->full_string_p case, all we know is that there
are [minlen,maxlen] non-zero bytes followed by unknown bytes,
so effectively the maxlen is infinite (but caller cares about only
the first nbytes bytes) and furthermore, we never know if there is
any '\0' char among those, so *nulterm needs to be always cleared.
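
As a simplified model of the reasoning above (not the actual
tree-ssa-strlen.cc code; clamp_lengths is a hypothetical name and the
clamping is condensed), the bookkeeping after the fix behaves like this:

```python
def clamp_lengths(minlen, maxlen, nbytes, full_string_p, nulterm=True):
    """Simplified model of the [minlen, maxlen] / nulterm bookkeeping."""
    # A '\0' among the first nbytes bytes is only guaranteed for a full
    # string whose minimum length is smaller than nbytes.
    if nbytes <= minlen or not full_string_p:
        nulterm = False
    # The caller only cares about the first nbytes bytes.
    if nbytes < minlen:
        minlen = nbytes
    if nbytes < maxlen:
        maxlen = nbytes
    # Without a known terminator, the non-zero run may fill all nbytes bytes.
    if not full_string_p:
        maxlen = nbytes
    return minlen, maxlen, nulterm
```

For example, one known non-zero byte followed by unknown bytes, viewed
through a 4 byte access, gives the range [1, 4] with no guaranteed '\0'.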

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk and affected release branches?

2024-05-21  Jakub Jelinek  

PR tree-optimization/115152
* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes_addr): If
!si->full_string_p, clear *nulterm and set maxlen to nbytes.

* gcc.dg/pr115152.c: New test.

--- gcc/tree-ssa-strlen.cc.jj   2024-04-29 11:00:45.0 +0200
+++ gcc/tree-ssa-strlen.cc  2024-05-21 13:43:31.031208000 +0200
@@ -4829,7 +4829,7 @@ strlen_pass::count_nonzero_bytes_addr (t
   if (maxlen + 1 < nbytes)
return false;
 
-  if (nbytes <= minlen)
+  if (nbytes <= minlen || !si->full_string_p)
*nulterm = false;
 
   if (nbytes < minlen)
@@ -4839,6 +4839,9 @@ strlen_pass::count_nonzero_bytes_addr (t
maxlen = nbytes;
}
 
+  if (!si->full_string_p)
+   maxlen = nbytes;
+
   if (minlen < lenrange[0])
lenrange[0] = minlen;
   if (lenrange[1] < maxlen)
--- gcc/testsuite/gcc.dg/pr115152.c.jj  2024-05-21 13:46:02.793214348 +0200
+++ gcc/testsuite/gcc.dg/pr115152.c 2024-05-21 12:49:38.791626073 +0200
@@ -0,0 +1,17 @@
+/* PR tree-optimization/115152 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-tree-fre -fno-tree-dominator-opts -fno-tree-loop-im" } */
+
+int a, b, c, d;
+signed char e[1] = { 1 };
+
+int
+main ()
+{
+  for (a = 0; a < 3; a++)
+for (b = 0; b < 2; b++)
+  c = e[0] = e[0] ^ d;
+  if (!c)
+__builtin_abort ();
+  return 0;
+}

Jakub



[PATCH] ubsan: Use right address space for MEM_REF created for bool/enum sanitization [PR115172]

2024-05-21 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled, because -fsanitize=bool,enum
creates a MEM_REF without propagating the address space qualifiers to it,
so what should be normally loaded using say %gs:/%fs: segment prefix
isn't.  Together with asan it then causes that load to be sanitized.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk and release branches?

2024-05-21  Jakub Jelinek  

PR sanitizer/115172
* ubsan.cc (instrument_bool_enum_load): If rhs is not in generic
address space, use qualified version of utype with the right
address space.  Formatting fix.

* gcc.dg/asan/pr115172.c: New test.

--- gcc/ubsan.cc.jj 2024-03-22 09:23:37.695296775 +0100
+++ gcc/ubsan.cc    2024-05-21 12:10:24.261454107 +0200
@@ -1776,13 +1776,17 @@ instrument_bool_enum_load (gimple_stmt_i
   || TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
 return;
 
+  addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (rhs));
+  if (as != TYPE_ADDR_SPACE (utype))
+utype = build_qualified_type (utype, TYPE_QUALS (utype)
+| ENCODE_QUAL_ADDR_SPACE (as));
   bool ends_bb = stmt_ends_bb_p (stmt);
   location_t loc = gimple_location (stmt);
   tree lhs = gimple_assign_lhs (stmt);
   tree ptype = build_pointer_type (TREE_TYPE (rhs));
   tree atype = reference_alias_ptr_type (rhs);
   gimple *g = gimple_build_assign (make_ssa_name (ptype),
- build_fold_addr_expr (rhs));
+  build_fold_addr_expr (rhs));
   gimple_set_location (g, loc);
   gsi_insert_before (gsi, g, GSI_SAME_STMT);
   tree mem = build2 (MEM_REF, utype, gimple_assign_lhs (g),
--- gcc/testsuite/gcc.dg/asan/pr115172.c.jj 2024-05-21 17:28:18.302815400 +0200
+++ gcc/testsuite/gcc.dg/asan/pr115172.c    2024-05-21 22:50:43.272753785 +0200
@@ -0,0 +1,20 @@
+/* PR sanitizer/115172 */
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O2 -fsanitize=address,bool -ffat-lto-objects -fdump-tree-asan1" } */
+/* { dg-final { scan-tree-dump-not "\.ASAN_CHECK " "asan1" } } */
+
+#ifdef __x86_64__
+#define SEG __seg_gs
+#else
+#define SEG __seg_fs
+#endif
+
+extern struct S { _Bool b; } s;
+void bar (void);
+
+void
+foo (void)
+{
+  if (*(volatile _Bool SEG *) (__UINTPTR_TYPE__) &s)
+bar ();
+}

Jakub



Re: [Patch] contrib/gcc-changelog/git_update_version.py: Improve diagnostic

2024-05-21 Thread Jakub Jelinek
On Tue, May 21, 2024 at 09:36:05AM +0200, Tobias Burnus wrote:
> Jakub Jelinek wrote:
> > On Mon, May 20, 2024 at 08:31:02AM +0200, Tobias Burnus wrote:
> > > Hmm, there were now two daily bumps: [...] I really wonder why.
> > Because I've done it by hand.
> 
> Okay, that explains it.
> 
> I still do not understand why it slipped through in the first place; I tried
> old versions down to r12-709-g772e5e82e3114f and it still FAILs for the
> invalid commit ("ERR: cannot find a ChangeLog location in message").
> 
> Thus, I wonder whether the commit hook is active at all?!?

They are.  But
https://github.com/AdaCore/git-hooks/blob/master/hooks/updates/__init__.py#L836
with
https://github.com/AdaCore/git-hooks/blob/master/hooks/updates/commits.py#L230
bypasses all commits which contain just 3 magic words in a row.
And because that part is owned by the AdaCore hooks, not the GCC customizations,
I'm not sure what to do about that.

> > I have in ~gccadmin a gcc-changelog copy and adjusted update_version_git
> > script which doesn't use contrib/gcc-changelog subdirectory from the
> > checkout it makes but from the ~gccadmin directory,
> [...]
> > I'm already using something similar in
> > my hack (just was doing it for even successful commits, but I think your
> > patch is better).
> > And, I think best would be if update_version_git script simply
> > accepted a list of ignored commits from the command line too,
> > passed it to the git_update_version.py script and that one
> > added those to IGNORED_COMMITS.
> 
> Updated version:
> 
> * Uses my diagnostic
> 
> * Adds an -i/--ignore argument for commits. Permits use of '-i hash1 -i
> hash2' but also '-i hash1,hash2' or '-i "hash1 hash2"'
> 
> * I changed the global variable to lower case as Python's style guide states
> that all-uppercase names are for constants.
> 
> * The '=None' matches one of the current usages (no argument passed); hence,
> it is now explicit and 'pylint' is happy.
> 
> OK for mainline?

Yes, thanks.

> PS: I have not updated the hashes. If needed/wanted, I leave that to you,
> Jakub.

Once some commit is ignored, we won't be processing it anymore, so I think
the -i option is all we need.
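
For illustration only, the three accepted spellings of the -i argument
could be normalized along these lines.  This is a minimal sketch assuming
argparse; the helper name parse_ignored is hypothetical and is not taken
from the committed patch.

```python
import argparse
import re


def parse_ignored(argv):
    """Collect commit hashes from repeated -i options (hypothetical helper).

    Accepts '-i h1 -i h2', '-i h1,h2' and '-i "h1 h2"'.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--ignore', action='append', default=[],
                        help='commit hash(es) to ignore')
    args = parser.parse_args(argv)
    hashes = []
    for value in args.ignore:
        # Split each value on commas and/or whitespace, dropping empty pieces.
        hashes.extend(h for h in re.split(r'[,\s]+', value) if h)
    return hashes
```

The resulting list could then be merged with the built-in list of ignored
commits before the commits are filtered.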

Jakub



Re: [Patch] contrib/gcc-changelog/git_update_version.py: Improve diagnostic (was: [Patch] contrib/gcc-changelog/git_update_version.py: Add ignore commit, improve diagnostic)

2024-05-20 Thread Jakub Jelinek
On Mon, May 20, 2024 at 08:31:02AM +0200, Tobias Burnus wrote:
> Hmm, there were now two daily bumps:
> 
> Date:   Mon May 20 00:16:30 2024 +
> 
> Date:   Sun May 19 18:15:28 2024 +
> 
> I really wonder why.

Because I've done it by hand.
I have in ~gccadmin a gcc-changelog copy and adjusted update_version_git
script which doesn't use contrib/gcc-changelog subdirectory from the
checkout it makes but from the ~gccadmin directory, because I don't want to
constantly try to add some commit number to IGNORED_COMMITS, see that it
either works or doesn't (I think sometimes it needs the hash of the revert
commit, at other times the commit hash referenced in the revert commit)
or that further ones are needed.

> From f56b1764f2b5c2c83c6852607405e5be0a763a2c Mon Sep 17 00:00:00 2001
> From: Tobias Burnus 
> Date: Sun, 19 May 2024 08:17:42 +0200
> Subject: [PATCH] contrib/gcc-changelog/git_update_version.py: Improve diagnostic
> 
> contrib/ChangeLog:
> 
> * gcc-changelog/git_update_version.py (prepend_to_changelog_files): Output

8 spaces rather than tab

>   git hash in case errors occurred.
> 
> diff --git a/contrib/gcc-changelog/git_update_version.py b/contrib/gcc-changelog/git_update_version.py
> index 24f6c43d0b2..ec0151b83fe 100755
> --- a/contrib/gcc-changelog/git_update_version.py
> +++ b/contrib/gcc-changelog/git_update_version.py
> @@ -58,6 +58,7 @@ def read_timestamp(path):
>  
>  def prepend_to_changelog_files(repo, folder, git_commit, add_to_git):
>  if not git_commit.success:
> +logging.info(f"While processing {git_commit.info.hexsha}:")
>  for error in git_commit.errors:
>  logging.info(error)
>  raise AssertionError()

So, your commit is a useful part of it, I'm already using something similar in
my hack (just was doing it for even successful commits, but I think your
patch is better).
And, I think best would be if update_version_git script simply
accepted a list of ignored commits from the command line too,
passed it to the git_update_version.py script and that one
added those to IGNORED_COMMITS.
Because typically if the DATESTAMP/ChangeLog update gets stuck,
one doesn't just adjust IGNORED_COMMITS and wait up to another
day to see if it worked, but runs the script by hand to make sure
it works.

--- gcc-checkout/contrib/gcc-changelog/git_update_version.py    2024-05-13 16:52:57.890151748 +
+++ gcc-changelog/git_update_version.py 2024-05-19 18:13:44.953648834 +
@@ -41,7 +41,21 @@ IGNORED_COMMITS = (
 '040e5b0edbca861196d9e2ea2af5e805769c8d5d',
 '8057f9aa1f7e70490064de796d7a8d42d446caf8',
 '109f1b28fc94c93096506e3df0c25e331cef19d0',
-'39f81924d88e3cc197fc3df74204c9b5e01e12f7')
+'39f81924d88e3cc197fc3df74204c9b5e01e12f7',
+'d7bb8eaade3cd3aa70715c8567b4d7b08098e699',
+'89feb3557a018893cfe50c2e07f91559bd3cde2b',
+'ccf8d3e3d26c6ba3d5e11fffeed8d64018e9c060',
+'e0c52905f666e3d23881f82dbf39466a24f009f4',
+'b38472ffc1e631bd357573b44d956ce16d94e666',
+'a0b13d0860848dd5f2876897ada1e22e4e681e91',
+'b8c772cae97b54386f7853edf0f9897012bfa90b',
+'810d35a7e054bcbb5b66d2e5924428e445f5fba9',
+'0df1ee083434ac00ecb19582b1e5b25e105981b2',
+'2c688f6afce4cbb414f5baab1199cd525f309fca',
+'60dcb710b6b4aa22ea96abc8df6dfe9067f3d7fe',
+'44968a0e00f656e9bb3e504bb2fa1a8282002015',
+'d7bb8eaade3cd3aa70715c8567b4d7b08098e699',
+'da73261ce7731be7f2b164f1db796878cdc23365')
 
 FORMAT = '%(asctime)s:%(levelname)s:%(name)s:%(message)s'
 logging.basicConfig(level=logging.INFO, format=FORMAT,
@@ -125,6 +139,7 @@ def update_current_branch(ref_name):
   % (commit.hexsha, head.hexsha), ref_name)
 commits = [c for c in commits if c.info.hexsha not in IGNORED_COMMITS]
 for git_commit in reversed(commits):
+logging.info('trying %s', git_commit.info.hexsha)
 prepend_to_changelog_files(repo, args.git_path, git_commit,
not args.dry_mode)
 if args.dry_mode:

Jakub



Re: [PATCH] libstdc++: increment *this instead of this

2024-05-18 Thread Jakub Jelinek
On Sat, May 18, 2024 at 02:53:20PM +0800, Kefu Chai wrote:
> libstdc++-v3/ChangeLog:
> 
> * include/bits/unicode.h (enable_borrowed_range): Call ++(*this)
> instead of ++this

This should be already fixed, see https://gcc.gnu.org/PR115119

Jakub



Re: [PATCH] Use DW_TAG_module for Ada

2024-05-17 Thread Jakub Jelinek
On Fri, May 03, 2024 at 11:08:04AM -0600, Tom Tromey wrote:
> DWARF is not especially clear on the distinction between
> DW_TAG_namespace and DW_TAG_module, but I think that DW_TAG_module is
> more appropriate for Ada.  This patch changes the compiler to do this.
> Note that the Ada compiler does not yet create NAMESPACE_DECLs.
> 
> gcc
> 
>   * dwarf2out.cc (gen_namespace_die): Use DW_TAG_module for Ada.

Ok, thanks.

> diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
> index 1b0e8b5a5b2..1e46c27cdf7 100644
> --- a/gcc/dwarf2out.cc
> +++ b/gcc/dwarf2out.cc
> @@ -26992,7 +26992,7 @@ gen_namespace_die (tree decl, dw_die_ref context_die)
>  {
>/* Output a real namespace or module.  */
>context_die = setup_namespace_context (decl, comp_unit_die ());
> -  namespace_die = new_die (is_fortran () || is_dlang ()
> +  namespace_die = new_die (is_fortran () || is_dlang () || is_ada ()
>  ? DW_TAG_module : DW_TAG_namespace,
>  context_die, decl);
>/* For Fortran modules defined in different CU don't add src coords.  */
> -- 
> 2.44.0

Jakub



C++ Patch ping - Re: [PATCH] c++: Fix parsing of abstract-declarator starting with ... followed by [ or ( [PR115012]

2024-05-16 Thread Jakub Jelinek
Hi!

I'd like to ping the 
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651199.html
patch.

Thanks.

On Thu, May 09, 2024 at 08:12:30PM +0200, Jakub Jelinek wrote:
> The C++26 P2662R3 Pack indexing paper mentions that both GCC
> and MSVC don't handle T...[10] parameter declaration when T
> is a pack.  While that will change meaning in C++26, in C++11 .. C++23
> this ought to be valid.  Also, T...(args) as well.
> 
> The following patch handles those in cp_parser_direct_declarator.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2024-05-09  Jakub Jelinek  
> 
>   PR c++/115012
>   * parser.cc (cp_parser_direct_declarator): Handle
>   abstract declarator starting with ... followed by [
>   or (.
> 
>   * g++.dg/cpp0x/variadic185.C: New test.
>   * g++.dg/cpp0x/variadic186.C: New test.

Jakub



Re: [COMMITTED] Revert "Revert: "Enable prange support.""

2024-05-16 Thread Jakub Jelinek
On Thu, May 16, 2024 at 12:14:09PM +0200, Aldy Hernandez wrote:
> Wait, what's the preferred way of reverting a patch?  I followed what I saw in:

To revert a patch (that isn't itself a reversion), just push what git revert creates.
The important part is not to modify the This reverts commit line from what
git revert created.

> commit 04ee1f788ceaa4c7f777ff3b9441ae076191439c
> Author: Jeff Law 
> Date:   Mon May 13 21:42:38 2024 -0600
> 
> Revert "[PATCH v2 1/3] RISC-V: movmem for RISCV with V extension"
> 
> This reverts commit df15eb15b5f820321c81efc75f0af13ff8c0dd5b.

So, this is just fine.

> and here:
> 
> commit 0c6dd4b0973738ce43e76b468a002ab5eb58aaf4
> Author: YunQiang Su 
> Date:   Mon May 13 14:15:38 2024 +0800
> 
> Revert "MIPS: Support constraint 'w' for MSA instruction"
> 
> This reverts commit 9ba01240864ac446052d97692e2199539b7c76d8.

And this too.

What is not fine is to hand-edit the message:
This reverts commit 9ba01240864ac446052d97692e2199539b7c76d8 because
foo and bar.
You can do that separately, so
This reverts commit 9ba01240864ac446052d97692e2199539b7c76d8.
The reversion is because of foo and bar.
Or being further creative:
This reverts commit r13-8390-g9de6ff5ec9a46951d2.
etc.

> commit f6ce85502eb2e4e7bbd9b3c6c1c065a004f8f531
> Author: Hans-Peter Nilsson 
> Date:   Wed May 8 04:11:20 2024 +0200
> 
> Revert "Revert "testsuite/gcc.target/cris/pr93372-2.c: Handle
> xpass from combine improvement""
> 
> This reverts commit 39f81924d88e3cc197fc3df74204c9b5e01e12f7.

This one is not fine.  Our current infrastructure for ChangeLog
generation can't deal with that and there is no agreement what to
write in the ChangeLog for it anyway, whether 2 reversions turn it into
Reapply commit: or 2 Revert: lines?  What happens on 3rd reversion?
So, one needs to manually remove the
This reverts commit 39f81924d88e3cc197fc3df74204c9b5e01e12f7.
line and supply ChangeLog entry.

For cases like this or the amended lines (or, say, if the This reverts
commit or (cherry-picked from ) lines refer to an invalid commit),
the daily DATESTAMP update then fails; I need to manually add
all problematic commits to IGNORED_COMMITS, rerun it by hand and
then write the ChangeLog entries by hand.
See
https://gcc.gnu.org/r13-8764
https://gcc.gnu.org/r15-426
https://gcc.gnu.org/r15-345
https://gcc.gnu.org/r15-344
https://gcc.gnu.org/r15-341
https://gcc.gnu.org/r14-9832
https://gcc.gnu.org/r14-9830
for what I had to do only in April/May for this.

Jakub



Re: [COMMITTED] Revert "Revert: "Enable prange support.""

2024-05-16 Thread Jakub Jelinek
On Thu, May 16, 2024 at 12:01:01PM +0200, Aldy Hernandez wrote:
> This reverts commit d7bb8eaade3cd3aa70715c8567b4d7b08098e699 and enables prange
> support again.

Please don't do this.
This breaks ChangeLog generation; I will need to handle it by hand again tomorrow.
Both amendments to the message lines that git (cherry-pick -x or revert) adds,
This reverts commit COMMITHASH.
and
(cherry picked from commit COMMITHASH)
and reverts of reverts break it.

Jakub



[committed] openmp: Diagnose using grainsize+num_tasks clauses together [PR115103]

2024-05-15 Thread Jakub Jelinek
Hi!

I've noticed that while we diagnose many other OpenMP exclusive clauses,
we don't diagnose grainsize together with num_tasks on taskloop construct
in all of C, C++ and Fortran (the implementation simply ignored grainsize
in that case) and for Fortran also don't diagnose mixing nogroup clause
with reduction clause(s).

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk.

2024-05-15  Jakub Jelinek  

PR c/115103
gcc/c/
* c-typeck.cc (c_finish_omp_clauses): Diagnose grainsize
used together with num_tasks.
gcc/cp/
* semantics.cc (finish_omp_clauses): Diagnose grainsize
used together with num_tasks.
gcc/fortran/
* openmp.cc (resolve_omp_clauses): Diagnose grainsize
used together with num_tasks or nogroup used together with
reduction.
gcc/testsuite/
* c-c++-common/gomp/clause-dups-1.c: Add 2 further expected errors.
* gfortran.dg/gomp/pr115103.f90: New test.

--- gcc/c/c-typeck.cc.jj    2024-04-22 14:46:28.917086705 +0200
+++ gcc/c/c-typeck.cc   2024-05-15 15:43:23.117428045 +0200
@@ -14722,6 +14722,8 @@ c_finish_omp_clauses (tree clauses, enum
   tree *detach_seen = NULL;
   bool linear_variable_step_check = false;
   tree *nowait_clause = NULL;
+  tree *grainsize_seen = NULL;
+  bool num_tasks_seen = false;
   tree ordered_clause = NULL_TREE;
   tree schedule_clause = NULL_TREE;
   bool oacc_async = false;
@@ -16021,8 +16023,6 @@ c_finish_omp_clauses (tree clauses, enum
case OMP_CLAUSE_PROC_BIND:
case OMP_CLAUSE_DEVICE_TYPE:
case OMP_CLAUSE_PRIORITY:
-   case OMP_CLAUSE_GRAINSIZE:
-   case OMP_CLAUSE_NUM_TASKS:
case OMP_CLAUSE_THREADS:
case OMP_CLAUSE_SIMD:
case OMP_CLAUSE_HINT:
@@ -16048,6 +16048,16 @@ c_finish_omp_clauses (tree clauses, enum
   pc = &OMP_CLAUSE_CHAIN (c);
  continue;
 
+   case OMP_CLAUSE_GRAINSIZE:
+ grainsize_seen = pc;
+ pc = &OMP_CLAUSE_CHAIN (c);
+ continue;
+
+   case OMP_CLAUSE_NUM_TASKS:
+ num_tasks_seen = true;
+ pc = &OMP_CLAUSE_CHAIN (c);
+ continue;
+
case OMP_CLAUSE_MERGEABLE:
  mergeable_seen = true;
   pc = &OMP_CLAUSE_CHAIN (c);
@@ -16333,6 +16343,14 @@ c_finish_omp_clauses (tree clauses, enum
   *nogroup_seen = OMP_CLAUSE_CHAIN (*nogroup_seen);
 }
 
+  if (grainsize_seen && num_tasks_seen)
+{
+  error_at (OMP_CLAUSE_LOCATION (*grainsize_seen),
+   "%<grainsize%> clause must not be used together with "
+   "%<num_tasks%> clause");
+  *grainsize_seen = OMP_CLAUSE_CHAIN (*grainsize_seen);
+}
+
   if (detach_seen)
 {
   if (mergeable_seen)
--- gcc/cp/semantics.cc.jj  2024-05-15 15:43:05.823657545 +0200
+++ gcc/cp/semantics.cc 2024-05-15 15:44:07.085844564 +0200
@@ -7098,6 +7098,7 @@ finish_omp_clauses (tree clauses, enum c
   bool mergeable_seen = false;
   bool implicit_moved = false;
   bool target_in_reduction_seen = false;
+  bool num_tasks_seen = false;
 
   bitmap_obstack_initialize (NULL);
   bitmap_initialize (&generic_head, &bitmap_default_obstack);
@@ -7656,6 +7657,10 @@ finish_omp_clauses (tree clauses, enum c
  /* FALLTHRU */
 
case OMP_CLAUSE_NUM_TASKS:
+ if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_NUM_TASKS)
+   num_tasks_seen = true;
+ /* FALLTHRU */
+
case OMP_CLAUSE_NUM_TEAMS:
case OMP_CLAUSE_NUM_THREADS:
case OMP_CLAUSE_NUM_GANGS:
@@ -9244,6 +9249,17 @@ finish_omp_clauses (tree clauses, enum c
  *pc = OMP_CLAUSE_CHAIN (c);
  continue;
}
+ pc = &OMP_CLAUSE_CHAIN (c);
+ continue;
+   case OMP_CLAUSE_GRAINSIZE:
+ if (num_tasks_seen)
+   {
+ error_at (OMP_CLAUSE_LOCATION (c),
+   "%<grainsize%> clause must not be used together with "
+   "%<num_tasks%> clause");
+ *pc = OMP_CLAUSE_CHAIN (c);
+ continue;
+   }
   pc = &OMP_CLAUSE_CHAIN (c);
  continue;
case OMP_CLAUSE_ORDERED:
--- gcc/fortran/openmp.cc.jj    2024-03-14 22:06:58.273669790 +0100
+++ gcc/fortran/openmp.cc   2024-05-15 15:43:23.122427979 +0200
@@ -9175,6 +9175,13 @@ resolve_omp_clauses (gfc_code *code, gfc
 resolve_positive_int_expr (omp_clauses->grainsize, "GRAINSIZE");
   if (omp_clauses->num_tasks)
 resolve_positive_int_expr (omp_clauses->num_tasks, "NUM_TASKS");
+  if (omp_clauses->grainsize && omp_clauses->num_tasks)
+gfc_error ("%<GRAINSIZE%> clause at %L must not be used together with "
+  "%<NUM_TASKS%> clause", &omp_clauses->grainsize->where);
+  if (omp_clauses->lists[OMP_LIST_REDUCTION] && omp_clauses->nogroup)
+gfc_error ("%<REDUCTION%> clause at %L must not be used together with "
+  "%<NOGROUP%> clause",
+  &omp_clauses->lists[OMP_LIST_REDUCTION]->

[committed] combine: Fix up simplify_compare_const [PR115092]

2024-05-15 Thread Jakub Jelinek
Hi!

The following testcases are miscompiled (with tons of GIMPLE
optimization disabled) because combine sees GE comparison of
1-bit sign_extract (i.e. something with [-1, 0] value range)
with (const_int -1) (which is always true) and optimizes it into
NE comparison of 1-bit zero_extract ([0, 1] value range) against
(const_int 0).
The reason is that simplify_compare_const first (correctly)
simplifies the comparison to
GE (ashift:SI something (const_int 31)) (const_int -2147483648)
and then an optimization for when the second operand is power of 2
triggers.  That optimization is fine for power of 2s which aren't
the signed minimum of the mode, or if it is NE, EQ, GEU or LTU
against the signed minimum of the mode, but for GE or LT optimizing
it into NE (or EQ) against const0_rtx is wrong, those cases
are always true or always false (but the function doesn't have
a standardized way to tell callers the comparison is now unconditional).

The following patch just disables the optimization in that case.
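
To see why only the signed cases are a problem, here is a small model of
32-bit comparisons against the signed minimum.  It is a sketch with made-up
helper names, not combine.cc code: the unsigned comparison agrees with the
"test the sign bit" rewrite, while the signed one is simply always true.

```python
BITS = 32
SIGN = 1 << (BITS - 1)   # 0x80000000, the bit pattern of SIGNED_MIN
MASK = (1 << BITS) - 1


def as_signed(bits):
    """Interpret a 32-bit pattern as a two's complement signed value."""
    return bits - (1 << BITS) if bits & SIGN else bits


def ge_signed_min(bits):
    # Signed GE against SIGNED_MIN: true for every possible value.
    return as_signed(bits) >= -SIGN


def geu_signed_min(bits):
    # Unsigned GEU against the same bit pattern.
    return (bits & MASK) >= SIGN


def sign_bit_ne_zero(bits):
    # What the power-of-2 rewrite turns the comparison into.
    return (bits & SIGN) != 0
```

For the value 0, the signed comparison holds but the rewritten sign-bit test
does not, which is the kind of mismatch the testcases hit.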

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by
Segher in the PR, committed to trunk so far.

2024-05-15  Jakub Jelinek  

PR rtl-optimization/114902
PR rtl-optimization/115092
* combine.cc (simplify_compare_const): Don't optimize
GE op0 SIGNED_MIN or LT op0 SIGNED_MIN into NE op0 const0_rtx or
EQ op0 const0_rtx.

* gcc.dg/pr114902.c: New test.
* gcc.dg/pr115092.c: New test.

--- gcc/combine.cc.jj   2024-05-07 18:10:10.415874636 +0200
+++ gcc/combine.cc  2024-05-15 13:33:26.555081215 +0200
@@ -11852,8 +11852,10 @@ simplify_compare_const (enum rtx_code co
  `and'ed with that bit), we can replace this with a comparison
  with zero.  */
   if (const_op
-  && (code == EQ || code == NE || code == GE || code == GEU
- || code == LT || code == LTU)
+  && (code == EQ || code == NE || code == GEU || code == LTU
+ /* This optimization is incorrect for signed >= INT_MIN or
+< INT_MIN, those are always true or always false.  */
+ || ((code == GE || code == LT) && const_op > 0))
   && is_a <scalar_int_mode> (mode, &int_mode)
   && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
   && pow2p_hwi (const_op & GET_MODE_MASK (int_mode))
--- gcc/testsuite/gcc.dg/pr114902.c.jj  2024-05-15 14:01:20.826717914 +0200
+++ gcc/testsuite/gcc.dg/pr114902.c 2024-05-15 14:00:39.603268571 +0200
@@ -0,0 +1,23 @@
+/* PR rtl-optimization/114902 */
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-tree-fre -fno-tree-forwprop -fno-tree-ccp -fno-tree-dominator-opts" } */
+
+__attribute__((noipa))
+int foo (int x)
+{
+  int a = ~x;
+  int t = a & 1;
+  int e = -t;
+  int b = e >= -1;
+  if (b)
+return 0;
+  __builtin_trap ();
+}
+
+int
+main ()
+{
+  foo (-1);
+  foo (0);
+  foo (1);
+}
--- gcc/testsuite/gcc.dg/pr115092.c.jj  2024-05-15 13:46:27.634649150 +0200
+++ gcc/testsuite/gcc.dg/pr115092.c 2024-05-15 13:46:12.052857268 +0200
@@ -0,0 +1,16 @@
+/* PR rtl-optimization/115092 */
+/* { dg-do run } */
+/* { dg-options "-O1 -fgcse -ftree-pre -fno-tree-dominator-opts -fno-tree-fre -fno-guess-branch-probability" } */
+
+int a, b, c = 1, d, e;
+
+int
+main ()
+{
+  int f, g = a;
+  b = -2;
+  f = -(1 >> ((c && b) & ~a));
+  if (f <= b)
+d = g / e;
+  return 0;
+}

Jakub



Re: [PATCH] middle-end/111422 - wrong stack var coalescing, handle PHIs

2024-05-15 Thread Jakub Jelinek
On Wed, May 15, 2024 at 01:41:04PM +0200, Richard Biener wrote:
>   PR middle-end/111422
>   * cfgexpand.cc (add_scope_conflicts_2): Handle PHIs
>   by recursing to their arguments.
> ---
>  gcc/cfgexpand.cc | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index 557cb28733b..e4d763fa998 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -584,10 +584,23 @@ add_scope_conflicts_2 (tree use, bitmap work,
> || INTEGRAL_TYPE_P (TREE_TYPE (use))))
>  {
>gimple *g = SSA_NAME_DEF_STMT (use);
> -  if (is_gimple_assign (g))
> - if (tree op = gimple_assign_rhs1 (g))
> -   if (TREE_CODE (op) == ADDR_EXPR)
> - visit (g, TREE_OPERAND (op, 0), op, work);
> +  if (gassign *a = dyn_cast <gassign *> (g))
> + {
> +   if (tree op = gimple_assign_rhs1 (a))
> + if (TREE_CODE (op) == ADDR_EXPR)
> +   visit (a, TREE_OPERAND (op, 0), op, work);
> + }
> +  else if (gphi *p = dyn_cast <gphi *> (g))
> + {
> +   for (unsigned i = 0; i < gimple_phi_num_args (p); ++i)
> + if (TREE_CODE (use = gimple_phi_arg_def (p, i)) == SSA_NAME)
> +   if (gassign *a = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (use)))
> + {
> +   if (tree op = gimple_assign_rhs1 (a))
> + if (TREE_CODE (op) == ADDR_EXPR)
> +   visit (a, TREE_OPERAND (op, 0), op, work);
> + }
> + }

Why the 2 {} pairs here?  Can't it be done without them (sure, before the
else if it is required)?

Otherwise LGTM.

Jakub



Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread Jakub Jelinek
On Tue, May 07, 2024 at 10:37:55AM +0800, HAO CHEN GUI wrote:
>   The former patch adds isfinite optab for __builtin_isfinite.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html
> 
>   Thus the builtin might not be folded at front end. The range op for
> isfinite is needed for value range analysis. This patch adds them.
> 
>   Compared to last version, this version fixes a typo.
> 
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is it OK for the trunk?
> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> Value Range: Add range op for builtin isfinite
> 
> The former patch adds optab for builtin isfinite. Thus builtin isfinite might
> not be folded at front end.  So the range op for isfinite is needed for value
> range analysis.  This patch adds range op for builtin isfinite.
> 
> gcc/
>   * gimple-range-op.cc (class cfn_isfinite): New.
>   (op_cfn_finite): New variables.
>   (gimple_range_op_handler::maybe_builtin_call): Handle
>   CFN_BUILT_IN_ISFINITE.
> 
> gcc/testsuite/
>   * gcc/testsuite/gcc.dg/tree-ssa/range-isfinite.c: New test.

BUILT_IN_ISFINITE is just one of many BUILT_IN_IS... builtins,
would be nice to handle the others as well.

E.g. isnormal/isnan/isinf, fpclassify etc.

Note, the man page says for e.g. isnormal that it returns nonzero or zero,
but in reality I think we always implement it inline and can check whether
it always returns [0,1].
Some others, like isinf, return [-1,1] though, I think, and fpclassify
returns the union of all the passed int values.
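
As a toy model of what such a range op computes for isfinite (hypothetical
names, not the gimple-range-op.cc implementation), given a floating point
interval and whether NaN is possible:

```python
import math


def isfinite_range(lo, hi, maybe_nan):
    """Result range of isfinite(x) for x in [lo, hi], possibly also NaN."""
    # The interval holds a finite value unless it is a single infinity.
    can_be_finite = not (lo == hi and math.isinf(lo))
    # Non-finite results come from NaN or from an infinite endpoint.
    can_be_nonfinite = maybe_nan or math.isinf(lo) or math.isinf(hi)
    if can_be_finite and not can_be_nonfinite:
        return (1, 1)
    if can_be_nonfinite and not can_be_finite:
        return (0, 0)
    return (0, 1)
```

The same shape would apply to the other classification builtins, with the
result range widened where the builtin can return values outside [0,1].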

Jakub



Re: [PATCH] [testsuite] Fix gcc.dg/pr115066.c fail on aarch64

2024-05-14 Thread Jakub Jelinek
On Tue, May 14, 2024 at 03:47:46PM +0200, Tom de Vries wrote:
> On aarch64, I get this failure:
> ...
> FAIL: gcc.dg/pr115066.c scan-assembler \\.byte\\t0xb\\t# Define macro strx
> ...
> 
> This happens because we expect to match:
> ...
> .byte   0xb # Define macro strx
> ...
> but instead we get:
> ...
> .byte   0xb // Define macro strx
> ...
> 
> Fix this by not explicitly matching the comment marker.
> 
> Tested on aarch64 and x86_64.
> 
> gcc/testsuite/ChangeLog:
> 
> 2024-05-14  Tom de Vries  
> 
> * gcc.dg/pr115066.c: Don't match comment marker.
> ---
>  gcc/testsuite/gcc.dg/pr115066.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/pr115066.c b/gcc/testsuite/gcc.dg/pr115066.c
> index 645757df209..a7e98500160 100644
> --- a/gcc/testsuite/gcc.dg/pr115066.c
> +++ b/gcc/testsuite/gcc.dg/pr115066.c
> @@ -2,7 +2,7 @@
>  /* { dg-skip-if "split DWARF unsupported" { hppa*-*-hpux* powerpc*-ibm-aix* 
> *-*-darwin* } } */
>  /* { dg-options "-gsplit-dwarf -g3 -dA -gdwarf-4" } */
>  /* { dg-final { scan-assembler-times {\.section\t"?\.debug_macro} 1 } } */
> -/* { dg-final { scan-assembler-not {\.byte\t0x5\t# Define macro strp} } } */
> -/* { dg-final { scan-assembler {\.byte\t0xb\t# Define macro strx} } } */
> +/* { dg-final { scan-assembler-not {\.byte\t0x5\t.* Define macro strp} } } */
> +/* { dg-final { scan-assembler {\.byte\t0xb\t.* Define macro strx} } } */

Actually, perhaps better use [^\n\r]* instead of .*
You don't want to match the comment on a different line.
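
A small sketch of the difference, assuming POSIX regex as a stand-in (the testsuite itself uses Tcl regexps, this only illustrates the same "." versus "[^\n\r]" behaviour).  The helper name matches and the sample assembler strings are invented for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if the POSIX extended regex PATTERN matches
   somewhere in TEXT, 0 if it does not, -1 if the pattern fails to compile.
   Without REG_NEWLINE, "." also matches a newline.  */
static int
matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Invented sample: the .byte and the comment are on different lines.  */
static const char split_lines[] =
  "\t.byte\t0xb\t// something else\n\t.uleb128 0\t// Define macro strx\n";
/* Invented sample: the .byte and the comment are on the same line.  */
static const char same_line[] = "\t.byte\t0xb\t// Define macro strx\n";

/* ".*" may run across the newline, "[^\n\r]*" may not.  */
static const char loose[] = "\\.byte\t0xb\t.* Define macro strx";
static const char tight[] = "\\.byte\t0xb\t[^\n\r]* Define macro strx";

static void
check_patterns (void)
{
  assert (matches (loose, same_line) == 1);
  assert (matches (tight, same_line) == 1);
  /* The loose pattern wrongly accepts a comment from a later line.  */
  assert (matches (loose, split_lines) == 1);
  assert (matches (tight, split_lines) == 0);
}
```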

Jakub



Re: [PATCH] [testsuite] Fix gcc.dg/pr115066.c fail on aarch64

2024-05-14 Thread Jakub Jelinek
On Tue, May 14, 2024 at 03:47:46PM +0200, Tom de Vries wrote:
> On aarch64, I get this failure:
> ...
> FAIL: gcc.dg/pr115066.c scan-assembler \\.byte\\t0xb\\t# Define macro strx
> ...
> 
> This happens because we expect to match:
> ...
> .byte   0xb # Define macro strx
> ...
> but instead we get:
> ...
> .byte   0xb // Define macro strx
> ...
> 
> Fix this by not explicitly matching the comment marker.
> 
> Tested on aarch64 and x86_64.
> 
> gcc/testsuite/ChangeLog:
> 
> 2024-05-14  Tom de Vries  
> 
> * gcc.dg/pr115066.c: Don't match comment marker.
> ---
>  gcc/testsuite/gcc.dg/pr115066.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Ok.

Jakub



Re: [PATCH] [debug] Fix dwarf v4 .debug_macro.dwo

2024-05-14 Thread Jakub Jelinek
On Tue, May 14, 2024 at 01:35:30PM +0200, Tom de Vries wrote:
> Consider a hello world, compiled with -gsplit-dwarf and dwarf version 4, and 
> -g3:
> ...
> $ gcc -gdwarf-4 -gsplit-dwarf /data/vries/hello.c -g3 -save-temps -dA
> ...
> 
> In section .debug_macro.dwo, we have:
> ...
> .Ldebug_macro0:
> .value  0x4 # DWARF macro version number
> .byte   0x2 # Flags: 32-bit, lineptr present
> .long   .Lskeleton_debug_line0
> .byte   0x3 # Start new file
> .uleb128 0  # Included from line number 0
> .uleb128 0x1# file /data/vries/hello.c
> .byte   0x5 # Define macro strp
> .uleb128 0  # At line number 0
> .uleb128 0x1d0  # The macro: "__STDC__ 1"
> ...
> 
> Given that we use a DW_MACRO_define_strp, we'd expect 0x1d0 to be an
> offset into a .debug_str.dwo section.
> 
> But in fact, 0x1d0 is an index into the string offset table in
> .debug_str_offsets.dwo:
> ...
> .long   0x34f0  # indexed string 0x1d0: __STDC__ 1
> ...
> 
> Add asserts that catch this inconsistency, and fix this by using
> DW_MACRO_define_strx instead.
> 
> Tested on x86_64.
> 
> PR debug/115066

ChangeLog entry is missing.

Otherwise LGTM.

Jakub



Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]

2024-05-13 Thread Jakub Jelinek
On Fri, May 10, 2024 at 03:59:25PM -0400, Jason Merrill wrote:
> > 2024-05-09  Jakub Jelinek  
> > Jason Merrill  
> > 
> > PR lto/113208
> > * cp-tree.h (maybe_optimize_cdtor): Remove.
> > * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
> > for implicit instantiations of maybe in charge ctors/dtors
> > declared inline.
> > (import_export_decl): Don't call maybe_optimize_cdtor.
> > (c_parse_final_cleanups): Formatting fixes.
> > * optimize.cc (can_alias_cdtor): Adjust condition, for
> > HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
> > if not DECL_INTERFACE_KNOWN.
> 
> > --- gcc/cp/optimize.cc.jj   2024-04-25 20:33:30.771858912 +0200
> > +++ gcc/cp/optimize.cc  2024-05-09 17:10:23.920478922 +0200
> > @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
> > gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
> > /* Don't use aliases for weak/linkonce definitions unless we can put 
> > both
> >symbols in the same COMDAT group.  */
> > -  return (DECL_INTERFACE_KNOWN (fn)
> > - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
> > - && (!DECL_ONE_ONLY (fn)
> > - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> > +  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
> > +: (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));
> 
> Hmm, would
> 
> (DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP
>  : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn)))
> 
> make sense instead?  I don't think DECL_WEAK is necessary for COMDAT.

I think it isn't indeed necessary for COMDAT, although e.g. comdat_linkage
will not call make_decl_one_only if !flag_weak.

But I think it is absolutely required for the alias cdtor optimization
in question, because otherwise it would be an ABI change.
Consider older version of GCC or some other compiler emitting
_ZN6vectorI12QualityValueEC1ERKS1_
and
_ZN6vectorI12QualityValueEC2ERKS1_
symbols not as aliases, each in their own comdat groups, so
.text._ZN6vectorI12QualityValueEC1ERKS1_ in _ZN6vectorI12QualityValueEC1ERKS1_
comdat group and
.text._ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_
comdat group.  And then comes GCC with the above patch without the DECL_WEAK
check in there, and decides to use alias, so
_ZN6vectorI12QualityValueEC1ERKS1_ is an alias to
_ZN6vectorI12QualityValueEC2ERKS1_ and both live in
.text._ZN6vectorI12QualityValueEC2ERKS1_ section in
_ZN6vectorI12QualityValueEC5ERKS1_ comdat group.  If you mix TUs with this,
the linker can keep one of the section sets from the 
_ZN6vectorI12QualityValueEC1ERKS1_
and _ZN6vectorI12QualityValueEC2ERKS1_ and _ZN6vectorI12QualityValueEC5ERKS1_
comdat groups.  If there is no .weak for the symbols, this will fail to
link, one can emit it either the old way or the new way but never both, it
is part of an ABI.
While with .weak, mixing it is possible, worst case one gets some unused
code in the linked binary or shared library.  Of course the desirable case
is that there is no mixing and there is no unused code, but if it happens,
no big deal.  Without .weak it is a big deal.
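
For readers less familiar with the mechanism, the alias idea itself can be sketched from C with the GNU alias attribute.  This is only an illustration under assumptions: the toy_ctor_* names are invented stand-ins for the C2/C1 symbols, it needs a GNU-compatible compiler and an ELF-style target, and it does not model comdat groups at all.

```c
#include <assert.h>

/* Hypothetical stand-in for the base object constructor (the C2 symbol).  */
int
toy_ctor_base (int x)
{
  return x + 1;
}

/* Hypothetical stand-in for the complete object constructor (the C1 symbol),
   emitted as a weak alias rather than as a second copy of the body.  This
   relies on the GNU alias attribute and an ELF-style target.  */
int toy_ctor_complete (int) __attribute__ ((weak, alias ("toy_ctor_base")));

static void
check_alias (void)
{
  /* Go through volatile pointers so the comparison happens at run time.  */
  int (*volatile p1) (int) = toy_ctor_complete;
  int (*volatile p2) (int) = toy_ctor_base;
  assert (p1 == p2);
  assert (toy_ctor_complete (41) == 42);
}
```

Both names resolve to the same code, which is the space saving the optimization is after; the .weak part is what keeps mixing with the non-alias form linkable.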

Jakub



Re: [pushed 00/21] Various backports to gcc 13 (analyzer, jit, diagnostics)

2024-05-13 Thread Jakub Jelinek
On Thu, May 09, 2024 at 01:42:15PM -0400, David Malcolm wrote:
> I've pushed the following changes to releases/gcc-13
> as r13-8741-g89feb3557a0188 through r13-8761-gb7a2697733d19a.

Unfortunately many of the commits contained git commit message wording
that update_git_version can't cope with.
Wording like
(cherry picked from commit r14-1664-gfe9771b59f576f)
is wrong,
(cherry picked from commit .)
is reserved solely for what one gets from git cherry-pick -x
(i.e. the full commit hash without anything extra).

I had to ignore the following commits in the ChangeLog generation
because of this:

89feb3557a018893cfe50c2e07f91559bd3cde2b
ccf8d3e3d26c6ba3d5e11fffeed8d64018e9c060
e0c52905f666e3d23881f82dbf39466a24f009f4
b38472ffc1e631bd357573b44d956ce16d94e666
a0b13d0860848dd5f2876897ada1e22e4e681e91
b8c772cae97b54386f7853edf0f9897012bfa90b
810d35a7e054bcbb5b66d2e5924428e445f5fba9
0df1ee083434ac00ecb19582b1e5b25e105981b2
2c688f6afce4cbb414f5baab1199cd525f309fca
60dcb710b6b4aa22ea96abc8df6dfe9067f3d7fe
44968a0e00f656e9bb3e504bb2fa1a8282002015

Can you please add the ChangeLog entries for these by hand
(commits which only touch ChangeLog files are allowed and shouldn't
contain ChangeLog style entry in the commit message)?

Thanks.

Jakub



[PATCH] tree-ssa-math-opts: Pattern recognize yet another .ADD_OVERFLOW pattern [PR113982]

2024-05-13 Thread Jakub Jelinek
Hi!

We pattern recognize already many different patterns, and closest to the
requested one also
   yc = (type) y;
   zc = (type) z;
   x = yc + zc;
   w = (typeof_y) x;
   if (x > max)
where y/z has the same unsigned type and type is a wider unsigned type
and max is maximum value of the narrower unsigned type.
But apparently people are creative in writing this in different ways,
this requests
   yc = (type) y;
   zc = (type) z;
   x = yc + zc;
   w = (typeof_y) x;
   if (x >> narrower_type_bits)

The following patch implements that.
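
For illustration, the two source forms look roughly like this in C, with uint32_t as the narrower type and uint64_t as the wider one.  The function names are invented for this sketch and __builtin_add_overflow is used only as a reference for the expected result.

```c
#include <assert.h>
#include <stdint.h>

/* The already recognized form: compare the wide sum against the maximum of
   the narrower type.  */
static int
add_ovf_cmp (uint32_t y, uint32_t z, uint32_t *w)
{
  uint64_t x = (uint64_t) y + (uint64_t) z;
  *w = (uint32_t) x;
  return x > UINT32_MAX;
}

/* The newly requested form: shift the wide sum right by the precision of
   the narrower type and test the result.  */
static int
add_ovf_shift (uint32_t y, uint32_t z, uint32_t *w)
{
  uint64_t x = (uint64_t) y + (uint64_t) z;
  *w = (uint32_t) x;
  return (x >> 32) != 0;
}

/* Hypothetical check: both forms agree with the overflow builtin.  */
static void
check_add_ovf (uint32_t y, uint32_t z)
{
  uint32_t w1, w2, w3;
  int o1 = add_ovf_cmp (y, z, &w1);
  int o2 = add_ovf_shift (y, z, &w2);
  int o3 = __builtin_add_overflow (y, z, &w3);
  assert (o1 == o2 && o2 == o3);
  assert (w1 == w2 && w2 == w3);
}
```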

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-13  Jakub Jelinek  

PR middle-end/113982
* tree-ssa-math-opts.cc (arith_overflow_check_p): Also return 1
for RSHIFT_EXPR by precision of maxval if shift result is only
used in a cast or comparison against zero.
(match_arith_overflow): Handle the RSHIFT_EXPR use case.

* gcc.dg/pr113982.c: New test.

--- gcc/tree-ssa-math-opts.cc.jj2024-04-11 09:26:36.318369218 +0200
+++ gcc/tree-ssa-math-opts.cc   2024-05-10 18:17:08.795744811 +0200
@@ -3947,6 +3947,66 @@ arith_overflow_check_p (gimple *stmt, gi
   else
 return 0;
 
+  if (maxval
+  && ccode == RSHIFT_EXPR
+  && crhs1 == lhs
+  && TREE_CODE (crhs2) == INTEGER_CST
+  && wi::to_widest (crhs2) == TYPE_PRECISION (TREE_TYPE (maxval)))
+{
+  tree shiftlhs = gimple_assign_lhs (use_stmt);
+  if (!shiftlhs)
+   return 0;
+  use_operand_p use;
+  if (!single_imm_use (shiftlhs, &use, &cur_use_stmt))
+   return 0;
+  if (gimple_code (cur_use_stmt) == GIMPLE_COND)
+   {
+ ccode = gimple_cond_code (cur_use_stmt);
+ crhs1 = gimple_cond_lhs (cur_use_stmt);
+ crhs2 = gimple_cond_rhs (cur_use_stmt);
+   }
+  else if (is_gimple_assign (cur_use_stmt))
+   {
+ if (gimple_assign_rhs_class (cur_use_stmt) == GIMPLE_BINARY_RHS)
+   {
+ ccode = gimple_assign_rhs_code (cur_use_stmt);
+ crhs1 = gimple_assign_rhs1 (cur_use_stmt);
+ crhs2 = gimple_assign_rhs2 (cur_use_stmt);
+   }
+ else if (gimple_assign_rhs_code (cur_use_stmt) == COND_EXPR)
+   {
+ tree cond = gimple_assign_rhs1 (cur_use_stmt);
+ if (COMPARISON_CLASS_P (cond))
+   {
+ ccode = TREE_CODE (cond);
+ crhs1 = TREE_OPERAND (cond, 0);
+ crhs2 = TREE_OPERAND (cond, 1);
+   }
+ else
+   return 0;
+   }
+ else
+   {
+ enum tree_code sc = gimple_assign_rhs_code (cur_use_stmt);
+ tree castlhs = gimple_assign_lhs (cur_use_stmt);
+ if (!CONVERT_EXPR_CODE_P (sc)
+ || !castlhs
+ || !INTEGRAL_TYPE_P (TREE_TYPE (castlhs))
+ || (TYPE_PRECISION (TREE_TYPE (castlhs))
+ > TYPE_PRECISION (TREE_TYPE (maxval))))
+   return 0;
+ return 1;
+   }
+   }
+  else
+   return 0;
+  if ((ccode != EQ_EXPR && ccode != NE_EXPR)
+ || crhs1 != shiftlhs
+ || !integer_zerop (crhs2))
+   return 0;
+  return 1;
+}
+
   if (TREE_CODE_CLASS (ccode) != tcc_comparison)
 return 0;
 
@@ -4049,6 +4109,7 @@ arith_overflow_check_p (gimple *stmt, gi
_8 = IMAGPART_EXPR <_7>;
if (_8)
and replace (utype) x with _9.
+   Or with x >> popcount (max) instead of x > max.
 
Also recognize:
x = ~z;
@@ -4481,10 +4542,62 @@ match_arith_overflow (gimple_stmt_iterat
  gcc_checking_assert (is_gimple_assign (use_stmt));
  if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
{
- gimple_assign_set_rhs1 (use_stmt, ovf);
- gimple_assign_set_rhs2 (use_stmt, build_int_cst (type, 0));
- gimple_assign_set_rhs_code (use_stmt,
- ovf_use == 1 ? NE_EXPR : EQ_EXPR);
+ if (gimple_assign_rhs_code (use_stmt) == RSHIFT_EXPR)
+   {
+ g2 = gimple_build_assign (make_ssa_name (boolean_type_node),
+   ovf_use == 1 ? NE_EXPR : EQ_EXPR,
+   ovf, build_int_cst (type, 0));
+ gimple_stmt_iterator gsiu = gsi_for_stmt (use_stmt);
+ gsi_insert_before (&gsiu, g2, GSI_SAME_STMT);
+ gimple_assign_set_rhs_with_ops (&gsiu, NOP_EXPR,
+ gimple_assign_lhs (g2));
+ update_stmt (use_stmt);
+ use_operand_p use;
+ single_imm_use (gimple_assign_lhs (use_stmt), &use,
+ &use_stmt);
+ if (gimple_code (use_stmt) == GIMPLE_COND)
+   {
+ gcond *cond_stmt = as_a  (u

Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]

2024-05-09 Thread Jakub Jelinek
On Thu, May 09, 2024 at 02:58:52PM -0400, Marek Polacek wrote:
> On Thu, May 09, 2024 at 08:20:00PM +0200, Jakub Jelinek wrote:
> > --- gcc/cp/decl.cc.jj   2024-05-09 10:30:54.804505130 +0200
> > +++ gcc/cp/decl.cc  2024-05-09 17:07:08.400110018 +0200
> > @@ -19280,6 +19280,14 @@ cxx_comdat_group (tree decl)
> >   else
> > break;
> > }
> > +  /* If a ctor/dtor has already set the comdat group by
> > +maybe_clone_body, don't override it.  */
> > +  if (SUPPORTS_ONE_ONLY
> > + && TREE_CODE (decl) == FUNCTION_DECL
> > + && DECL_CLONED_FUNCTION_P (decl)
> > + && SUPPORTS_ONE_ONLY)
> > +   if (tree comdat = DECL_COMDAT_GROUP (decl))
> > + return comdat;
> 
> This checks SUPPORTS_ONE_ONLY twice.

Oops, you're right, fixed in my copy.

Jakub



[committed] testsuite: Fix up pr84508* tests [PR84508]

2024-05-09 Thread Jakub Jelinek
On Thu, May 09, 2024 at 12:45:42PM +0800, Hongtao Liu wrote:
> > PR target/84508
> > * gcc.target/i386/pr84508-1.c: New test.
> > * gcc.target/i386/pr84508-2.c: Ditto.

The tests FAIL on x86_64-linux with
/usr/bin/ld: cannot find -lubsan
collect2: error: ld returned 1 exit status
compiler exited with status 1
FAIL: gcc.target/i386/pr84508-1.c (test for excess errors)
Excess errors:
/usr/bin/ld: cannot find -lubsan

The problem is that only *.dg/ubsan/ubsan.exp calls ubsan_init
which adds the needed search paths to libubsan library.
So, link/run tests for -fsanitize=undefined need to go into
gcc.dg/ubsan/ or g++.dg/ubsan/, even when they are target specific.

Tested on x86_64-linux with
make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} i386.exp=pr84508* 
ubsan.exp=pr84508*'
and committed to trunk as obvious.

2024-05-09  Jakub Jelinek  

PR target/84508
* gcc.target/i386/pr84508-1.c: Move to ...
* gcc.dg/ubsan/pr84508-1.c: ... here.  Restrict to i?86/x86_64
non-ia32 targets.
* gcc.target/i386/pr84508-2.c: Move to ...
* gcc.dg/ubsan/pr84508-2.c: ... here.  Restrict to i?86/x86_64
non-ia32 targets.

diff --git a/gcc/testsuite/gcc.target/i386/pr84508-1.c 
b/gcc/testsuite/gcc.dg/ubsan/pr84508-1.c
similarity index 74%
rename from gcc/testsuite/gcc.target/i386/pr84508-1.c
rename to gcc/testsuite/gcc.dg/ubsan/pr84508-1.c
index bb3e28d017e..d781e01 100644
--- a/gcc/testsuite/gcc.target/i386/pr84508-1.c
+++ b/gcc/testsuite/gcc.dg/ubsan/pr84508-1.c
@@ -1,5 +1,6 @@
-/* { dg-do run { target { ! ia32 } } } */
+/* { dg-do run { target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } */
 /* { dg-options "-fsanitize=undefined" } */
+
 #include 
 
 int main()
diff --git a/gcc/testsuite/gcc.target/i386/pr84508-2.c 
b/gcc/testsuite/gcc.dg/ubsan/pr84508-2.c
similarity index 73%
rename from gcc/testsuite/gcc.target/i386/pr84508-2.c
rename to gcc/testsuite/gcc.dg/ubsan/pr84508-2.c
index 32a8f20a536..cf9c7db1d15 100644
--- a/gcc/testsuite/gcc.target/i386/pr84508-2.c
+++ b/gcc/testsuite/gcc.dg/ubsan/pr84508-2.c
@@ -1,5 +1,6 @@
-/* { dg-do run { target { ! ia32 } } } */
+/* { dg-do run { target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } */
 /* { dg-options "-fsanitize=undefined" } */
+
 #include 
 
 int main()

Jakub



[PATCH] c++, mingw, v2: Fix up types of dtor hooks to __cxa_{,thread_}atexit/__cxa_throw on mingw ia32 [PR114968]

2024-05-09 Thread Jakub Jelinek
On Thu, May 09, 2024 at 01:05:59PM -0400, Jason Merrill wrote:
> I think I'd rather pass ob_parm to start_cleanup_fn, where it can also
> replace the flag_use_cxa_atexit check in that function.

Good idea, changed in the following patch.

> > @@ -9998,7 +10004,8 @@ register_dtor_fn (tree decl)
> >   {
> > /* We must convert CLEANUP to the type that "__cxa_atexit"
> >  expects.  */
> > -  cleanup = build_nop (get_atexit_fn_ptr_type (), cleanup);
> > +  cleanup = build_nop (ob_parm ? get_cxa_atexit_fn_ptr_type ()
> > +  : get_atexit_fn_ptr_type (), cleanup);
> 
> If we're (now) using the correct type to build the cleanup fn, this
> conversion should be unnecessary.

This is the use_dtor case, where cleanup will have METHOD_TYPE, so
I think we need to cast.  But, we can cast always to
get_cxa_atexit_fn_ptr_type () type, because this is in use_dtor guarded code
and use_dtor is ob_parm && CLASS_TYPE_P (type), so when use_dtor is true,
ob_parm is also true.

Ok for trunk if it passes another bootstrap/regtest?

2024-05-09  Jakub Jelinek  

PR target/114968
gcc/
* target.def (use_atexit_for_cxa_atexit): Remove spurious space
from comment.
(adjust_cdtor_callabi_fntype): New cxx target hook.
* targhooks.h (default_cxx_adjust_cdtor_callabi_fntype): Declare.
* targhooks.cc (default_cxx_adjust_cdtor_callabi_fntype): New
function.
* doc/tm.texi.in (TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Add.
* doc/tm.texi: Regenerate.
* config/i386/i386.cc (ix86_cxx_adjust_cdtor_callabi_fntype): New
function.
(TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Redefine.
gcc/cp/
* cp-tree.h (atexit_fn_ptr_type_node, cleanup_type): Adjust macro
comments.
(get_cxa_atexit_fn_ptr_type): Declare.
* decl.cc (get_atexit_fn_ptr_type): Adjust function comment, only
build type for atexit argument.
(get_cxa_atexit_fn_ptr_type): New function.
(get_atexit_node): Call get_cxa_atexit_fn_ptr_type rather than
get_atexit_fn_ptr_type when using __cxa_atexit.
(get_thread_atexit_node): Call get_cxa_atexit_fn_ptr_type
rather than get_atexit_fn_ptr_type.
(start_cleanup_fn): Add ob_parm argument, call
get_cxa_atexit_fn_ptr_type or get_atexit_fn_ptr_type depending
on it and create PARM_DECL also based on that argument.
(register_dtor_fn): Adjust start_cleanup_fn caller, use
get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type
for use_dtor casts.
* except.cc (build_throw): Use get_cxa_atexit_fn_ptr_type ().

--- gcc/target.def.jj   2024-05-09 10:30:54.926503473 +0200
+++ gcc/target.def  2024-05-09 20:27:16.294780780 +0200
@@ -6498,7 +6498,7 @@ is in effect.  The default is to return
  hook_bool_void_false)
 
 /* Returns true if target may use atexit in the same manner as
-   __cxa_atexit  to register static destructors.  */
+   __cxa_atexit to register static destructors.  */
 DEFHOOK
 (use_atexit_for_cxa_atexit,
  "This hook returns true if the target @code{atexit} function can be used\n\
@@ -6509,6 +6509,17 @@ unloaded. The default is to return false
  bool, (void),
  hook_bool_void_false)
 
+/* Returns modified FUNCTION_TYPE for cdtor callabi.  */
+DEFHOOK
+(adjust_cdtor_callabi_fntype,
+ "This hook returns a possibly modified @code{FUNCTION_TYPE} for arguments\n\
+to @code{__cxa_atexit}, @code{__cxa_thread_atexit} or @code{__cxa_throw}\n\
+function pointers.  ABIs like mingw32 require special attributes to be added\n\
+to function types pointed to by arguments of these functions.\n\
+The default is to return the passed argument unmodified.",
+ tree, (tree fntype),
+ default_cxx_adjust_cdtor_callabi_fntype)
+
 DEFHOOK
 (adjust_class_at_definition,
 "@var{type} is a C++ class (i.e., RECORD_TYPE or UNION_TYPE) that has just\n\
--- gcc/targhooks.h.jj  2024-05-09 10:30:54.941503269 +0200
+++ gcc/targhooks.h 2024-05-09 20:27:16.315780505 +0200
@@ -65,6 +65,7 @@ extern machine_mode default_mode_for_suf
 
 extern tree default_cxx_guard_type (void);
 extern tree default_cxx_get_cookie_size (tree);
+extern tree default_cxx_adjust_cdtor_callabi_fntype (tree);
 
 extern bool hook_pass_by_reference_must_pass_in_stack
   (cumulative_args_t, const function_arg_info &);
--- gcc/targhooks.cc.jj 2024-05-09 10:30:54.927503459 +0200
+++ gcc/targhooks.cc2024-05-09 20:27:16.338780204 +0200
@@ -329,6 +329,14 @@ default_cxx_get_cookie_size (tree type)
   return cookie_size;
 }
 
+/* Returns modified FUNCTION_TYPE for cdtor callabi.  */
+
+tree
+default_cxx_adjust_cdtor_callabi_fntype (tree fntype)
+{
+  return fntype;
+}
+
 /* Return true if a parameter must be passed by reference.  This version
of the TARGET_PASS_BY_REFERENCE hook uses just MUST_PASS_IN_STACK.  */
 
--- gcc/d

[PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]

2024-05-09 Thread Jakub Jelinek
On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote:
> Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but
> rather set it to some stub like void_node?
> 
> Though with all these changes, it's probably better to go with your first
> patch for GCC 14 and delay this approach to 15.  Your v1 patch is OK for 14.

Ok, here is an updated patch, which sets DECL_SAVED_TREE to void_node for
the aliases together with reversion of the earlier committed patch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-09  Jakub Jelinek  
Jason Merrill  

PR lto/113208
* cp-tree.h (maybe_optimize_cdtor): Remove.
* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
for implicit instantiations of maybe in charge ctors/dtors
declared inline.
(import_export_decl): Don't call maybe_optimize_cdtor.
(c_parse_final_cleanups): Formatting fixes.
* optimize.cc (can_alias_cdtor): Adjust condition, for
HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
if not DECL_INTERFACE_KNOWN.
(maybe_clone_body): Don't clear DECL_SAVED_TREE, instead set it
to void_node.
(maybe_clone_body): Remove.
* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
set.

* g++.dg/abi/comdat3.C: New test.
* g++.dg/abi/comdat4.C: New test.

--- gcc/cp/cp-tree.h.jj 2024-05-09 10:30:54.775505524 +0200
+++ gcc/cp/cp-tree.h2024-05-09 17:07:01.246206288 +0200
@@ -7451,7 +7451,6 @@ extern bool handle_module_option (unsign
 /* In optimize.cc */
 extern tree clone_attrs(tree);
 extern bool maybe_clone_body   (tree);
-extern void maybe_optimize_cdtor   (tree);
 
 /* In parser.cc */
 extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool,
--- gcc/cp/decl2.cc.jj  2024-05-02 09:31:17.753298180 +0200
+++ gcc/cp/decl2.cc 2024-05-09 17:11:11.676836268 +0200
@@ -3325,16 +3325,23 @@ tentative_decl_linkage (tree decl)
 linkage of all functions, and as that causes writes to
 the data mapped in from the PCH file, it's advantageous
 to mark the functions at this point.  */
- if (DECL_DECLARED_INLINE_P (decl)
- && (!DECL_IMPLICIT_INSTANTIATION (decl)
- || DECL_DEFAULTED_FN (decl)))
+ if (DECL_DECLARED_INLINE_P (decl))
{
- /* This function must have external linkage, as
-otherwise DECL_INTERFACE_KNOWN would have been
-set.  */
- gcc_assert (TREE_PUBLIC (decl));
- comdat_linkage (decl);
- DECL_INTERFACE_KNOWN (decl) = 1;
+ if (!DECL_IMPLICIT_INSTANTIATION (decl)
+ || DECL_DEFAULTED_FN (decl))
+   {
+ /* This function must have external linkage, as
+otherwise DECL_INTERFACE_KNOWN would have been
+set.  */
+ gcc_assert (TREE_PUBLIC (decl));
+ comdat_linkage (decl);
+ DECL_INTERFACE_KNOWN (decl) = 1;
+   }
+ else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
+   /* For implicit instantiations of cdtors try to make
+  it comdat, so that maybe_clone_body can use aliases.
+  See PR113208.  */
+   maybe_make_one_only (decl);
}
}
   else if (VAR_P (decl))
@@ -3604,9 +3611,6 @@ import_export_decl (tree decl)
 }
 
   DECL_INTERFACE_KNOWN (decl) = 1;
-
-  if (DECL_CLONED_FUNCTION_P (decl))
-maybe_optimize_cdtor (decl);
 }
 
 /* Return an expression that performs the destruction of DECL, which
@@ -5331,7 +5335,7 @@ c_parse_final_cleanups (void)
node = node->get_alias_target ();
 
  node->call_for_symbol_thunks_and_aliases (clear_decl_external,
- NULL, true);
+   NULL, true);
  /* If we mark !DECL_EXTERNAL one of the symbols in some comdat
 group, we need to mark all symbols in the same comdat group
 that way.  */
@@ -5341,7 +5345,7 @@ c_parse_final_cleanups (void)
 next != node;
 next = dyn_cast (next->same_comdat_group))
  next->call_for_symbol_thunks_and_aliases (clear_decl_external,
- NULL, true);
+   NULL, true);
}
 
  /* If we're going to need to write this function out, and
--- gcc/cp/optimize.cc.jj   2024-04-25 20:33:30.771858912 +0200
+++ gcc/cp/optimize.cc  2024-05-09 17:10:23.9

[PATCH] c++: Fix parsing of abstract-declarator starting with ... followed by [ or ( [PR115012]

2024-05-09 Thread Jakub Jelinek
Hi!

The C++26 P2662R3 Pack indexing paper mentions that both GCC
and MSVC don't handle T...[10] parameter declaration when T
is a pack.  While that will change meaning in C++26, in C++11 .. C++23
this ought to be valid.  Also, T...(args) as well.

The following patch handles those in cp_parser_direct_declarator.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-09  Jakub Jelinek  

PR c++/115012
* parser.cc (cp_parser_direct_declarator): Handle
abstract declarator starting with ... followed by [
or (.

* g++.dg/cpp0x/variadic185.C: New test.
* g++.dg/cpp0x/variadic186.C: New test.

--- gcc/cp/parser.cc.jj 2024-05-09 10:30:58.0 +0200
+++ gcc/cp/parser.cc2024-05-09 16:44:01.250777325 +0200
@@ -23916,7 +23916,12 @@ cp_parser_direct_declarator (cp_parser*
 {
   /* Peek at the next token.  */
   token = cp_lexer_peek_token (parser->lexer);
-  if (token->type == CPP_OPEN_PAREN)
+  if (token->type == CPP_OPEN_PAREN
+ || (first
+ && dcl_kind != CP_PARSER_DECLARATOR_NAMED
+ && token->type == CPP_ELLIPSIS
+ && cxx_dialect > cxx98
+ && cp_lexer_nth_token_is (parser->lexer, 2, CPP_OPEN_PAREN)))
{
  /* This is either a parameter-declaration-clause, or a
 parenthesized declarator. When we know we are parsing a
@@ -23955,6 +23960,11 @@ cp_parser_direct_declarator (cp_parser*
 
 Thus again, we try a parameter-declaration-clause, and if
 that fails, we back out and return.  */
+ bool pack_expansion_p = token->type == CPP_ELLIPSIS;
+
+ if (pack_expansion_p)
+   /* Consume the `...' */
+   cp_lexer_consume_token (parser->lexer);
 
  if (!first || dcl_kind != CP_PARSER_DECLARATOR_NAMED)
{
@@ -24098,6 +24108,7 @@ cp_parser_direct_declarator (cp_parser*
 attrs,
 parens_loc);
  declarator->attributes = gnu_attrs;
+ declarator->parameter_pack_p |= pack_expansion_p;
  /* Any subsequent parameter lists are to do with
 return type, so are not those of the declared
 function.  */
@@ -24121,7 +24132,7 @@ cp_parser_direct_declarator (cp_parser*
 
  /* If this is the first, we can try a parenthesized
 declarator.  */
- if (first)
+ if (first && !pack_expansion_p)
{
  bool saved_in_type_id_in_expr_p;
 
@@ -24156,16 +24167,27 @@ cp_parser_direct_declarator (cp_parser*
  else
break;
}
-  else if ((!first || dcl_kind != CP_PARSER_DECLARATOR_NAMED)
-  && token->type == CPP_OPEN_SQUARE
-  && !cp_next_tokens_can_be_attribute_p (parser))
+  else if (((!first || dcl_kind != CP_PARSER_DECLARATOR_NAMED)
+   && token->type == CPP_OPEN_SQUARE
+   && !cp_next_tokens_can_be_attribute_p (parser))
+  || (first
+  && dcl_kind != CP_PARSER_DECLARATOR_NAMED
+  && token->type == CPP_ELLIPSIS
+  && cp_lexer_nth_token_is (parser->lexer, 2, CPP_OPEN_SQUARE)
+  && cxx_dialect > cxx98
+  && !cp_nth_tokens_can_be_std_attribute_p (parser, 2)))
{
  /* Parse an array-declarator.  */
  tree bounds, attrs;
+ bool pack_expansion_p = token->type == CPP_ELLIPSIS;
 
  if (ctor_dtor_or_conv_p)
*ctor_dtor_or_conv_p = 0;
 
+ if (pack_expansion_p)
+   /* Consume the `...' */
+   cp_lexer_consume_token (parser->lexer);
+
  open_paren = NULL;
  first = false;
  parser->default_arg_ok_p = false;
@@ -24220,6 +24242,7 @@ cp_parser_direct_declarator (cp_parser*
  attrs = cp_parser_std_attribute_spec_seq (parser);
  declarator = make_array_declarator (declarator, bounds);
  declarator->std_attributes = attrs;
+ declarator->parameter_pack_p |= pack_expansion_p;
}
   else if (first && dcl_kind != CP_PARSER_DECLARATOR_ABSTRACT)
{
--- gcc/testsuite/g++.dg/cpp0x/variadic185.C.jj 2024-05-09 15:08:49.070651189 
+0200
+++ gcc/testsuite/g++.dg/cpp0x/variadic185.C2024-05-09 15:07:40.045583153 
+0200
@@ -0,0 +1,39 @@
+// PR c++/115012
+// { dg-do compile { target { c++11 && c++23_down } } }
+// { dg-final { scan-assembler "_Z3fooILi10EJidEEvDpAT__T0_" } }
+// { dg-final { scan-assembler "_Z3barILi10EiEvPT0_" } }
+// { dg-final { scan-assembler "_Z3bazILi10EJidEEvDpAT__T0_" } }
+// { dg-final { scan-assembler "_Z3quxILi5EJifEEvDpAT__AT_

Re: gcc/DATESTAMP wasn't updated since 20240507

2024-05-09 Thread Jakub Jelinek
On Thu, May 09, 2024 at 12:14:43PM +0200, Jakub Jelinek wrote:
> On Thu, May 09, 2024 at 12:04:38PM +0200, Rainer Orth wrote:
> > I just noticed that gcc/DATESTAMP wasn't updated yesterday and today,
> > staying at 20240507.
> 
> I think it is because of the r15-268 commit, we do support
> This reverts commit ...
> when the referenced commit contains a ChangeLog message, but here
> it doesn't, as it is a revert commit.

Indeed and also the r15-311 commit.
Please don't Revert Revert, we don't really support that, had to fix it all
by hand.

Jakub



Re: gcc/DATESTAMP wasn't updated since 20240507

2024-05-09 Thread Jakub Jelinek
On Thu, May 09, 2024 at 12:04:38PM +0200, Rainer Orth wrote:
> I just noticed that gcc/DATESTAMP wasn't updated yesterday and today,
> staying at 20240507.

I think it is because of the r15-268 commit, we do support
This reverts commit ...
when the referenced commit contains a ChangeLog message, but here
it doesn't, as it is a revert commit.

Jakub



[committed] testsuite: Fix up vector-subaccess-1.C test for ia32 [PR89224]

2024-05-09 Thread Jakub Jelinek
Hi!

The test FAILs on i686-linux due to
.../gcc/testsuite/g++.dg/torture/vector-subaccess-1.C:16:6: warning: SSE vector 
argument without SSE enabled changes the ABI [-Wpsabi]
excess warnings.

This fixes it by adding -Wno-psabi, like commonly done in other tests.

Committed to trunk as obvious after testing it on x86_64-linux with
make check-g++ RUNTESTFLAGS='--target_board=unix\{-m32/-mno-sse,-m32,-m64\} 
dg-torture.exp=vector-subaccess-1.C'
and backported all the way to 11 branch.

2024-05-09  Jakub Jelinek  

PR c++/89224
* g++.dg/torture/vector-subaccess-1.C: Add -Wno-psabi as additional
options.

--- gcc/testsuite/g++.dg/torture/vector-subaccess-1.C.jj2024-05-08 
10:16:54.045823642 +0200
+++ gcc/testsuite/g++.dg/torture/vector-subaccess-1.C   2024-05-09 
11:16:46.730114871 +0200
@@ -1,4 +1,5 @@
 /* PR c++/89224 */
+/* { dg-additional-options "-Wno-psabi" } */
 
 /* The access of `vector[i]` has the same qualifiers as the original
vector which was missing. */

Jakub



[PATCH] c++, mingw: Fix up types of dtor hooks to __cxa_{,thread_}atexit/__cxa_throw on mingw ia32 [PR114968]

2024-05-09 Thread Jakub Jelinek
Hi!

The __cxa_atexit/__cxa_thread_atexit/__cxa_throw functions accept function
pointers which usually point directly to destructors rather than to wrappers
around them.
Now, mingw ia32 uses implicitly __attribute__((thiscall)) calling
conventions for METHOD_TYPE (where the this pointer is passed in %ecx
register, the rest on the stack), so these functions use:
in config/os/mingw32/os_defines.h:
 #if defined (__i386__)
 #define _GLIBCXX_CDTOR_CALLABI __thiscall
 #endif
in libsupc++/cxxabi.h
__cxa_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void*) 
_GLIBCXX_NOTHROW;
__cxa_thread_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void *) 
_GLIBCXX_NOTHROW;
__cxa_throw(void*, std::type_info*, void (_GLIBCXX_CDTOR_CALLABI *) (void *))
__attribute__((__noreturn__));

Now, mingw for some weird reason uses
 #define TARGET_CXX_USE_ATEXIT_FOR_CXA_ATEXIT hook_bool_void_true
so it never actually uses __cxa_atexit, but does use __cxa_thread_atexit
and __cxa_throw.  Recent changes for modules result in more detailed
__cxa_*atexit/__cxa_throw prototypes precreated by the compiler, and if
that happens and one also includes , the compiler complains about
mismatches in the prototypes.

One thing is the missing thiscall attribute on the FUNCTION_TYPE, the
other problem is that the function pointer types for all of
atexit/__cxa_atexit/__cxa_thread_atexit are created by a single function,
get_atexit_fn_ptr_type (), which creates it depending on if atexit
or __cxa_atexit will be used as either void(*)(void) or void(*)(void *),
but when using atexit and __cxa_thread_atexit it uses the wrong function
type for __cxa_thread_atexit.

The following patch adds a target hook to add the thiscall attribute to the
function pointers, and splits the get_atexit_fn_ptr_type () function into
get_atexit_fn_ptr_type () and get_cxa_atexit_fn_ptr_type (), the former always
creates shared void(*)(void) type, the latter creates either
void(*)(void*) (on most targets) or void(__attribute__((thiscall))*)(void*)
(on mingw ia32).  So that we don't waste another GTY global tree for it,
because cleanup_type used for the same purpose for __cxa_throw should be
the same, the code changes it to use that type too.

In register_dtor_fn then based on the decision whether to use atexit,
__cxa_atexit or __cxa_thread_atexit it picks the right function pointer
type, and also if it decides to emit a __tcf_* wrapper for the cleanup,
uses that type for that wrapper so that it agrees on calling convention.

Bootstrapped/regtested on x86_64-linux and i686-linux and Liu Hao tested
it on mingw32, ok for trunk?

2024-05-09  Jakub Jelinek  

PR target/114968
gcc/
* target.def (use_atexit_for_cxa_atexit): Remove spurious space
from comment.
(adjust_cdtor_callabi_fntype): New cxx target hook.
* targhooks.h (default_cxx_adjust_cdtor_callabi_fntype): Declare.
* targhooks.cc (default_cxx_adjust_cdtor_callabi_fntype): New
function.
* doc/tm.texi.in (TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Add.
* doc/tm.texi: Regenerate.
* config/i386/i386.cc (ix86_cxx_adjust_cdtor_callabi_fntype): New
function.
(TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Redefine.
gcc/cp/
* cp-tree.h (atexit_fn_ptr_type_node, cleanup_type): Adjust macro
comments.
(get_cxa_atexit_fn_ptr_type): Declare.
* decl.cc (get_atexit_fn_ptr_type): Adjust function comment, only
build type for atexit argument.
(get_cxa_atexit_fn_ptr_type): New function.
(get_atexit_node): Call get_cxa_atexit_fn_ptr_type rather than
get_atexit_fn_ptr_type when using __cxa_atexit.
(get_thread_atexit_node): Call get_cxa_atexit_fn_ptr_type
rather than get_atexit_fn_ptr_type.
(start_cleanup_fn): Add fntype argument, don't call
get_atexit_fn_ptr_type for it.
(register_dtor_fn): Adjust start_cleanup_fn caller, use
get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type
when ob_parm is true.
* except.cc (build_throw): Use get_cxa_atexit_fn_ptr_type ().

--- gcc/target.def.jj   2024-05-07 21:28:46.554394913 +0200
+++ gcc/target.def  2024-05-08 11:19:39.290798568 +0200
@@ -6498,7 +6498,7 @@ is in effect.  The default is to return
  hook_bool_void_false)
 
 /* Returns true if target may use atexit in the same manner as
-   __cxa_atexit  to register static destructors.  */
+   __cxa_atexit to register static destructors.  */
 DEFHOOK
 (use_atexit_for_cxa_atexit,
  "This hook returns true if the target @code{atexit} function can be used\n\
@@ -6509,6 +6509,17 @@ unloaded. The default is to return false
  bool, (void),
  hook_bool_void_false)
 
+/* Returns modified FUNCTION_TYPE for cdtor callabi.  */
+DEFHOOK
+(adjust_cdtor_callabi_fntype,
+ "This hook returns a possibly modified @code{FUNCTION_TYPE} for arguments\n\
+to @code{__cxa_atexit}, @code{__cxa_thread_atexit} or @code{__cxa_throw}\n\
+function pointers.  ABIs li

[PATCH] reassoc: Fix up optimize_range_tests_to_bit_test [PR114965]

2024-05-08 Thread Jakub Jelinek
Hi!

The optimize_range_tests_to_bit_test optimization normally emits a range
test first:
  if (entry_test_needed)
{
  tem = build_range_check (loc, optype, unshare_expr (exp),
   false, lowi, high);
  if (tem == NULL_TREE || is_gimple_val (tem))
continue;
}
so during the bit test we already know that exp is in the [lowi, high]
range, but skips it if we have range info which tells us this isn't
necessary.
Also, normally it emits shifts by exp - lowi counter, but has an
optimization to use just exp counter if the mask isn't a more expensive
constant in that case and lowi is > 0 and high is smaller than prec.

The following testcase is miscompiled because the two abnormal cases
are triggered.  The range of exp is [43, 43][48, 48][95, 95], so we on
64-bit arch decide we don't need the entry test, because 95 - 43 < 64.
And we also decide to use just exp as counter, because the range test
tests just for exp == 43 || exp == 48, so high is smaller than 64 too.
Because 95 is in the exp range, we can't do that, we'd either need to
do a range test first, i.e.
if (exp - 43U <= 48U - 43U) if ((1UL << exp) & mask1)
or need to subtract lowi from the shift counter, i.e.
if ((1UL << (exp - 43)) & mask2)
but can't do both unless r.upper_bound () is < prec.

The following patch ensures that.
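
The two valid forms described above can be sketched in plain C as follows.
This is an illustration only, not what the pass emits; the function and
macro names are made up, and the guard in the second function stands in for
the range information ([43, 95]) that the compiler has and a standalone C
function does not.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 43 and 48 set: mask used when shifting by exp itself.  */
#define TOY_MASK_EXP ((UINT64_C (1) << 43) | (UINT64_C (1) << 48))
/* Bits 0 and 5 set: mask used when shifting by exp - 43.  */
#define TOY_MASK_BIASED ((UINT64_C (1) << 0) | (UINT64_C (1) << 5))

/* Form 1: emit the entry range test, then shift by exp directly.
   The range test guarantees the shift count is below 64.  */
int
toy_match_with_entry_test (unsigned exp)
{
  if (exp - 43U <= 48U - 43U)
    return ((UINT64_C (1) << exp) & TOY_MASK_EXP) != 0;
  return 0;
}

/* Form 2: no entry test, but subtract lowi from the shift count.
   With exp known to be in [43, 95] the count is at most 52.  */
int
toy_match_with_biased_shift (unsigned exp)
{
  unsigned biased = exp - 43U;
  /* Defensive guard standing in for the known value range.  */
  if (biased >= 64U)
    return 0;
  return ((UINT64_C (1) << biased) & TOY_MASK_BIASED) != 0;
}
```

Dropping both the entry test and the subtraction would shift by 95 for
exp == 95, which is out of range for a 64-bit shift; that is the
combination the patch rules out.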

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-08  Jakub Jelinek  

PR tree-optimization/114965
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): Don't try to
optimize away exp - lowi subtraction from shift count unless entry
test is emitted or unless r.upper_bound () is smaller than prec.

* gcc.c-torture/execute/pr114965.c: New test.

--- gcc/tree-ssa-reassoc.cc.jj  2024-01-12 10:07:58.384848977 +0100
+++ gcc/tree-ssa-reassoc.cc 2024-05-07 18:18:45.558814991 +0200
@@ -3418,7 +3418,8 @@ optimize_range_tests_to_bit_test (enum t
 We can avoid then subtraction of the minimum value, but the
 mask constant could be perhaps more expensive.  */
  if (compare_tree_int (lowi, 0) > 0
- && compare_tree_int (high, prec) < 0)
+ && compare_tree_int (high, prec) < 0
+ && (entry_test_needed || wi::ltu_p (r.upper_bound (), prec)))
{
  int cost_diff;
  HOST_WIDE_INT m = tree_to_uhwi (lowi);
--- gcc/testsuite/gcc.c-torture/execute/pr114965.c.jj   2024-05-07 18:17:16.767031821 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr114965.c  2024-05-07 18:15:52.332188943 +0200
@@ -0,0 +1,30 @@
+/* PR tree-optimization/114965 */
+
+static void
+foo (const char *x)
+{
+
+  char a = '0';
+  while (1)
+{
+  switch (*x)
+   {
+   case '_':
+   case '+':
+ a = *x;
+ x++;
+ continue;
+   default:
+ break;
+   }
+  break;
+}
+  if (a == '0' || a == '+')
+__builtin_abort ();
+}
+
+int
+main ()
+{
+  foo ("_");
+}

Jakub



Re: [PATCH] expansion: Use __trunchfbf2 calls rather than __extendhfbf2 [PR114907]

2024-05-07 Thread Jakub Jelinek
On Tue, May 07, 2024 at 08:57:00PM +0200, Richard Biener wrote:
> 
> 
> > On 07.05.2024 at 18:02, Jakub Jelinek wrote:
> > 
> > Hi!
> > 
> > The HF and BF modes have the same size/precision and neither is
> > a subset nor superset of the other.
> > So, using either __extendhfbf2 or __trunchfbf2 is weird.
> > The expansion apparently emits __extendhfbf2, but on the libgcc side
> > we apparently have __trunchfbf2 implemented.
> > 
> > I think it is easier to switch to using what is available rather than
> > adding new entrypoints to libgcc, even alias, because this is backportable.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Ok - do we have any target patterns that need adjustments?

I don't think so.
BFmode is i386/aarch64/arm/riscv backend only from what I can see,
I've done make mddump for all of them and none of the tmp-mddump.md
files show any matches for hfbf (nor bfhf).

Jakub



[PATCH] expansion: Use __trunchfbf2 calls rather than __extendhfbf2 [PR114907]

2024-05-07 Thread Jakub Jelinek
Hi!

The HF and BF modes have the same size/precision and neither is
a subset nor superset of the other.
So, using either __extendhfbf2 or __trunchfbf2 is weird.
The expansion apparently emits __extendhfbf2, but on the libgcc side
we apparently have __trunchfbf2 implemented.
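
As a rough illustration of why neither format contains the other, the sketch
below derives the largest finite value and the epsilon from the field widths,
assuming the usual layouts (IEEE binary16: 5 exponent bits, 10 fraction
bits; bfloat16: 8 exponent bits, 7 fraction bits).  The toy_* names are
hypothetical and unrelated to GCC's real_format machinery.

```c
#include <assert.h>

/* Hypothetical description of a binary floating-point format.  */
struct toy_float_format
{
  int exp_bits;   /* width of the exponent field */
  int mant_bits;  /* width of the stored fraction field */
};

static const struct toy_float_format toy_half = { 5, 10 };
static const struct toy_float_format toy_bfloat = { 8, 7 };

/* 2 raised to the power n, computed without libm.  */
static double
toy_pow2 (int n)
{
  double r = 1.0;
  for (; n > 0; --n)
    r *= 2.0;
  for (; n < 0; ++n)
    r /= 2.0;
  return r;
}

/* Largest finite value: (2 - 2^-mant_bits) * 2^bias, where
   bias = 2^(exp_bits - 1) - 1.  */
double
toy_max_finite (struct toy_float_format f)
{
  int bias = (1 << (f.exp_bits - 1)) - 1;
  return (2.0 - toy_pow2 (-f.mant_bits)) * toy_pow2 (bias);
}

/* Gap between 1.0 and the next representable value.  */
double
toy_epsilon (struct toy_float_format f)
{
  return toy_pow2 (-f.mant_bits);
}
```

Under these assumptions the half format has the finer epsilon while the
bfloat format has the larger range, so a conversion in either direction can
lose information and neither "extend" nor "trunc" describes it well.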

I think it is easier to switch to using what is available rather than
adding new entrypoints to libgcc, even alias, because this is backportable.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-07  Jakub Jelinek  

PR middle-end/114907
* expr.cc (convert_mode_scalar): Use trunc_optab rather than
sext_optab for HF->BF conversions.
* optabs-libfuncs.cc (gen_trunc_conv_libfunc): Likewise.

* gcc.dg/pr114907.c: New test.

--- gcc/expr.cc.jj  2024-04-09 09:29:04.0 +0200
+++ gcc/expr.cc 2024-05-06 13:21:33.933798494 +0200
@@ -355,8 +355,16 @@ convert_mode_scalar (rtx to, rtx from, i
  && REAL_MODE_FORMAT (from_mode) == &ieee_half_format));
 
   if (GET_MODE_PRECISION (from_mode) == GET_MODE_PRECISION (to_mode))
-   /* Conversion between decimal float and binary float, same size.  */
-   tab = DECIMAL_FLOAT_MODE_P (from_mode) ? trunc_optab : sext_optab;
+   {
+ if (REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format
+ && REAL_MODE_FORMAT (from_mode) == &ieee_half_format)
+   /* libgcc implements just __trunchfbf2, not __extendhfbf2.  */
+   tab = trunc_optab;
+ else
+   /* Conversion between decimal float and binary float, same
+  size.  */
+   tab = DECIMAL_FLOAT_MODE_P (from_mode) ? trunc_optab : sext_optab;
+   }
   else if (GET_MODE_PRECISION (from_mode) < GET_MODE_PRECISION (to_mode))
tab = sext_optab;
   else
--- gcc/optabs-libfuncs.cc.jj   2024-01-03 11:51:31.739728303 +0100
+++ gcc/optabs-libfuncs.cc  2024-05-06 15:50:21.611027802 +0200
@@ -589,7 +589,9 @@ gen_trunc_conv_libfunc (convert_optab ta
   if (GET_MODE_CLASS (float_tmode) != GET_MODE_CLASS (float_fmode))
 gen_interclass_conv_libfunc (tab, opname, float_tmode, float_fmode);
 
-  if (GET_MODE_PRECISION (float_fmode) <= GET_MODE_PRECISION (float_tmode))
+  if (GET_MODE_PRECISION (float_fmode) <= GET_MODE_PRECISION (float_tmode)
+  && (REAL_MODE_FORMAT (float_tmode) != &arm_bfloat_half_format
+ || REAL_MODE_FORMAT (float_fmode) != &ieee_half_format))
 return;
 
   if (GET_MODE_CLASS (float_tmode) == GET_MODE_CLASS (float_fmode))
--- gcc/testsuite/gcc.dg/pr114907.c.jj  2024-05-06 15:59:08.734958523 +0200
+++ gcc/testsuite/gcc.dg/pr114907.c 2024-05-06 16:02:38.914139829 +0200
@@ -0,0 +1,27 @@
+/* PR middle-end/114907 */
+/* { dg-do run } */
+/* { dg-options "" } */
+/* { dg-add-options float16 } */
+/* { dg-require-effective-target float16_runtime } */
+/* { dg-add-options bfloat16 } */
+/* { dg-require-effective-target bfloat16_runtime } */
+
+__attribute__((noipa)) _Float16
+foo (__bf16 x)
+{
+  return (_Float16) x;
+}
+
+__attribute__((noipa)) __bf16
+bar (_Float16 x)
+{
+  return (__bf16) x;
+}
+
+int
+main ()
+{
+  if (foo (11.125bf16) != 11.125f16
+  || bar (11.125f16) != 11.125bf16)
+__builtin_abort ();
+}

Jakub



[PATCH] tree-inline: Remove .ASAN_MARK calls when inlining functions into no_sanitize callers [PR114956]

2024-05-07 Thread Jakub Jelinek
Hi!

In r9-5742 we started allowing always_inline functions to be inlined into
functions which have e.g. address sanitization disabled, even when the
always_inline function is implicitly sanitized through command line options.

This mostly works fine because most of the asan instrumentation is done only
late after ipa, but as the following testcase shows, the .ASAN_MARK ifn calls
the gimplifier adds can result in ICEs.

Fixed by dropping those during inlining, similarly to how we drop
.TSAN_FUNC_EXIT calls.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-07  Jakub Jelinek  

PR sanitizer/114956
* tree-inline.cc: Include asan.h.
(copy_bb): Remove also .ASAN_MARK calls if id->dst_fn has asan/hwasan
sanitization disabled.

* gcc.dg/asan/pr114956.c: New test.

--- gcc/tree-inline.cc.jj   2024-05-03 09:44:21.199055899 +0200
+++ gcc/tree-inline.cc  2024-05-06 10:45:37.231349328 +0200
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.
 #include "symbol-summary.h"
 #include "symtab-thunks.h"
 #include "symtab-clones.h"
+#include "asan.h"
 
 /* I'm not real happy about this, but we need to handle gimple and
non-gimple trees.  */
@@ -2226,13 +2227,26 @@ copy_bb (copy_body_data *id, basic_block
}
  else if (call_stmt
   && id->call_stmt
-  && gimple_call_internal_p (stmt)
-  && gimple_call_internal_fn (stmt) == IFN_TSAN_FUNC_EXIT)
-   {
- /* Drop TSAN_FUNC_EXIT () internal calls during inlining.  */
- gsi_remove (&copy_gsi, false);
- continue;
-   }
+  && gimple_call_internal_p (stmt))
+   switch (gimple_call_internal_fn (stmt))
+ {
+ case IFN_TSAN_FUNC_EXIT:
+   /* Drop .TSAN_FUNC_EXIT () internal calls during inlining.  */
+   gsi_remove (&copy_gsi, false);
+   continue;
+ case IFN_ASAN_MARK:
+   /* Drop .ASAN_MARK internal calls during inlining into
+  no_sanitize functions.  */
+   if (!sanitize_flags_p (SANITIZE_ADDRESS, id->dst_fn)
+   && !sanitize_flags_p (SANITIZE_HWADDRESS, id->dst_fn))
+ {
+   gsi_remove (&copy_gsi, false);
+   continue;
+ }
+   break;
+ default:
+   break;
+ }
 
  /* Statements produced by inlining can be unfolded, especially
 when we constant propagated some operands.  We can't fold
--- gcc/testsuite/gcc.dg/asan/pr114956.c.jj 2024-05-06 10:54:52.601892840 +0200
+++ gcc/testsuite/gcc.dg/asan/pr114956.c 2024-05-06 10:54:33.920143734 +0200
@@ -0,0 +1,26 @@
+/* PR sanitizer/114956 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsanitize=address,null" } */
+
+int **a;
+void qux (int *);
+
+__attribute__((always_inline)) static inline int *
+foo (void)
+{
+  int b[1];
+  qux (b);
+  return a[1];
+}
+
+__attribute__((no_sanitize_address)) void
+bar (void)
+{
+  *a = foo ();
+}
+
+void
+baz (void)
+{
+  bar ();
+}

Jakub



Re: [wwwdocs] Specify AArch64 BitInt support for little-endian only

2024-05-07 Thread Jakub Jelinek
On Tue, May 07, 2024 at 02:12:07PM +0100, Andre Vieira (lists) wrote:
> Hey Jakub,
> 
> This what ya had in mind?

Yes.

> diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
> index ca5174de991bb088f653468f77485c15a61526e6..924e045a15a78b5702a0d6997953f35c6b47efd1 100644
> --- a/htdocs/gcc-14/changes.html
> +++ b/htdocs/gcc-14/changes.html
> @@ -325,7 +325,7 @@ You may also want to check out our
>Bit-precise integer types (_BitInt (N)
>and unsigned _BitInt (N)): integer types with
>a specified number of bits.  These are only supported on
> -  IA-32, x86-64 and AArch64 at present.
> +  IA-32, x86-64 and AArch64 (little-endian) at present.
>Structure, union and enumeration types may be defined more
>than once in the same scope with the same contents and the same
>tag; if such types are defined with the same contents and the


Jakub



Re: [PATCH] middle-end/114931 - type_hash_canon and structual equality types

2024-05-03 Thread Jakub Jelinek
On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > TYPE_CANONICAL as used by the middle-end cannot express this but
> 
> Hm. so how does it work now for arrays?

Do you have a testcase which doesn't work correctly with the arrays?

E.g. same_type_for_tbaa has
  type1 = TYPE_MAIN_VARIANT (type1);
  type2 = TYPE_MAIN_VARIANT (type2);

  /* Handle the most common case first.  */
  if (type1 == type2)
return 1;

  /* If we would have to do structural comparison bail out.  */
  if (TYPE_STRUCTURAL_EQUALITY_P (type1)
  || TYPE_STRUCTURAL_EQUALITY_P (type2))
return -1;

  /* Compare the canonical types.  */
  if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
return 1;

  /* ??? Array types are not properly unified in all cases as we have
 spurious changes in the index types for example.  Removing this
 causes all sorts of problems with the Fortran frontend.  */
  if (TREE_CODE (type1) == ARRAY_TYPE
  && TREE_CODE (type2) == ARRAY_TYPE)
return -1;
...
and later compares alias sets and the like.
So, even if int[] and int[0] have different TYPE_CANONICAL, they
will be considered maybe the same.  Also, guess get_alias_set
has some ARRAY_TYPE handling...
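
A toy model of the tri-state answer in the excerpt above, for illustration
only.  The toy_* types are hypothetical; a NULL canonical pointer stands in
for TYPE_STRUCTURAL_EQUALITY_P, and the final "return 0" is a simplification
of the part elided with "..." above (the real function goes on to compare
alias sets and the like).

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_RECORD, TOY_ARRAY };

/* Hypothetical stand-in for a type node.  */
struct toy_type
{
  enum toy_kind kind;
  const struct toy_type *canonical;  /* NULL: needs structural comparison */
};

/* Returns 1 for "same", 0 for "different", -1 for "don't know".  */
int
toy_same_type (const struct toy_type *t1, const struct toy_type *t2)
{
  /* Most common case first.  */
  if (t1 == t2)
    return 1;
  /* Structural comparison would be needed: bail out.  */
  if (t1->canonical == NULL || t2->canonical == NULL)
    return -1;
  if (t1->canonical == t2->canonical)
    return 1;
  /* Arrays with different canonical types are still only "maybe".  */
  if (t1->kind == TOY_ARRAY && t2->kind == TOY_ARRAY)
    return -1;
  /* Simplification: the real code continues with further checks.  */
  return 0;
}

/* Models int[] versus int[0] with distinct canonical types.  */
int
toy_compare_int_arrays (void)
{
  struct toy_type unknown_bound = { TOY_ARRAY, NULL };
  struct toy_type zero_length = { TOY_ARRAY, NULL };
  unknown_bound.canonical = &unknown_bound;
  zero_length.canonical = &zero_length;
  return toy_same_type (&unknown_bound, &zero_length);
}

/* Models two unrelated records with distinct canonical types.  */
int
toy_compare_records (void)
{
  struct toy_type r1 = { TOY_RECORD, NULL };
  struct toy_type r2 = { TOY_RECORD, NULL };
  r1.canonical = &r1;
  r2.canonical = &r2;
  return toy_same_type (&r1, &r2);
}
```

In this model the two array types come out as -1 rather than 0, matching the
"considered maybe the same" outcome described above.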

Anyway, I think we should just go with Richi's patch.

Jakub



Re: [PATCH] middle-end/114931 - type_hash_canon and structual equality types

2024-05-03 Thread Jakub Jelinek
On Fri, May 03, 2024 at 08:04:18PM +0200, Martin Uecker wrote:
> A change that is not optimal but would avoid a lot of trouble is to
> only use the tag of the struct for computing a TYPE_CANONICAL, which
> could then be set already for incomplete types and never needs to
> change again. We would not differentiate between different struct
> types with the same tag for aliasing analysis, but in most cases
> I would expect different structs to have a different tag.

Having incompatible types have the same TYPE_CANONICAL would lead to wrong
code IMHO, while for aliasing purposes that might be conservative (though
not sure, the alias set computation is based on what types the elements have
etc., so if the alias set is computed for say struct S { int s; }; and
then the same alias set used for struct S { long long a; double b; union {
short c; float d; } c; };, I think nothing good will come out of that),
but middle-end also uses TYPE_CANONICAL to see if types are the same,
say e.g. useless_type_conversion_p says that conversions from one
RECORD_TYPE to a different RECORD_TYPE are useless if they have the
same TYPE_CANONICAL.
  /* For aggregates we rely on TYPE_CANONICAL exclusively and require
 explicit conversions for types involving to be structurally
 compared types.  */
  else if (AGGREGATE_TYPE_P (inner_type)
   && TREE_CODE (inner_type) == TREE_CODE (outer_type))
return TYPE_CANONICAL (inner_type)
   && TYPE_CANONICAL (inner_type) == TYPE_CANONICAL (outer_type);
So, if you have struct S { int s; } and struct S { short a, b; }; and
VIEW_CONVERT_EXPR between them, that VIEW_CONVERT_EXPR will be removed
as useless, etc.

BTW, the idea of lazily updating TYPE_CANONICAL is basically what I've
described as the option to update all the derived types where it would
pretty much do that for all TYPE_STRUCTURAL_EQUALITY_P types in the
hash table (see if they are derived from the type in question and recompute
the TYPE_CANONICAL after recomputing all the TYPE_CANONICAL of its base
types), except perhaps even more costly (if the trigger would be some
build_array_type/build_function_type/... function is called and found
a cached TYPE_STRUCTURAL_EQUALITY_P type).  Note also that
TYPE_STRUCTURAL_EQUALITY_P isn't the case just for the C23 types which
are marked that way when incomplete and later completed, but by various
other cases for types which will be permanently like that, so doing
expensive checks each time some build*_type* is called that refers
to those would be expensive.

Jakub



Re: [PATCH] middle-end/114931 - type_hash_canon and structual equality types

2024-05-03 Thread Jakub Jelinek
On Fri, May 03, 2024 at 05:32:12PM +0200, Martin Uecker wrote:
> Am Freitag, dem 03.05.2024 um 14:13 +0200 schrieb Richard Biener:
> > TYPE_STRUCTURAL_EQUALITY_P is part of our type system so we have
> > to make sure to include that into the type unification done via
> > type_hash_canon.  This requires the flag to be set before querying
> > the hash which is the biggest part of the patch.
> 
> I assume this does not affect structs / unions because they
> do not make this mechanism of type unification (each tagged type
> is a unique type), but only derived types that end up having
> TYPE_STRUCTURAL_EQUALITY_P because they are constructed from
> incomplete structs / unions before TYPE_CANONICAL is set.
> 
> I do not yet understand why this change is needed. Type
> identity should not be affected by setting TYPE_CANONICAL, so
> why do we need to keep such types separate?  I understand that we
> created some inconsistencies, but I do not see why this change
> is needed to fix it.  But I also haven't understood how we ended
> up with a TYPE_CANONICAL having TYPE_STRUCTURAL_EQUALITY_P in
> PR 114931 ...

So, the C23 situation before the r14-10045 change (not counting the
r14-9763 change that was later reverted) was that sometimes TYPE_CANONICAL
of a RECORD_TYPE/UNION_TYPE could change from self to a different
RECORD_TYPE/UNION_TYPE and we didn't bother to adjust derived types.
That was really dangerous, I think e.g. type alias set wasn't recomputed.

r14-10045 changed it to the non-ideal, but perhaps less wrong model,
where we start with TYPE_STRUCTURAL_EQUALITY_P on incomplete types in C23
and perhaps later on change them to !TYPE_STRUCTURAL_EQUALITY_P when
the type is completed, and adjust TYPE_CANONICAL of some easily discoverable
derived types but certainly not all.

Still, that change introduces something novel to the type system, namely
that TYPE_CANONICAL can change on a type, even when it is just limited to
the TYPE_STRUCTURAL_EQUALITY_P -> !TYPE_STRUCTURAL_EQUALITY_P kind of
change and we never change one non-NULL TYPE_CANONICAL to a different one
(ok, not counting the short lived TYPE_CANONICAL being initially set to
self during make_node and then quickly adjusted in the callers).

One question is, could we for C23 somehow limit this for the most common
case where people just forward declare some aggregate type and then soon
after define it?  But guess the problematic counterexample there is
struct S; // forward declare
struct S *p; // create some derived types from it
void foo (void)
{
  struct S { int s; };  // define the type in a different scope
// (perhaps with a forward declaration as well)
  struct S s;
  use (&s); // create derived types
}
struct S { int s; };// define the type in the global scope to something
// that matches previously defined struct S in
// another scope
So e.g. noting in the hash table that a particular type has been forward
declared so far and using TYPE_STRUCTURAL_EQUALITY_P only if it has been
forward declared in some other scope wouldn't work.

Another question is whether c_update_type_canonical can or should try to
update TYPE_ALIAS_SET if TYPE_ALIAS_SET_KNOWN_P.  Or do we never cache
alias sets for TYPE_STRUCTURAL_EQUALITY_P types?

Anyway, the ICE on the testcase is because alias.cc asserts that
a !TYPE_STRUCTURAL_EQUALITY_P (type) has
!TYPE_STRUCTURAL_EQUALITY_P (TYPE_CANONICAL (type)).

The possibilities to resolve that are either at c_update_type_canonical
time try to find all the derived types rather than just some and recompute
their TYPE_CANONICAL.  Guess we could e.g. just traverse the whole
type_hash_table hash table and for each type see if it is in any way related
to the type that is being changed and then recompute them.  Though,
especially FUNCTION_TYPEs make that really ugly and furthermore it needs
to be recomputed in the right order, basically in the derivation order.
Without doing that, we'll have some TYPE_STRUCTURAL_EQUALITY_P derived
types in the type_hash_table hash table; that is conservatively correct,
but can result in worse code generation because of using alias set 0.

Another possibility is what Richi posted, essentially stop reusing
derived types created from the time when the base type was incomplete
when asking for a new derived type.  We'll get the TYPE_STRUCTURAL_EQUALITY_P
derived types if they were created before the base type was completed
when used directly (e.g. when it is a TREE_TYPE of some decl etc.), but
when we ask for a new type we'd disregard the old type and create a new one.
I'm not sure the patch is complete for that, because it doesn't adjust
check_base_type, build_pointer_type_for_mode, build_reference_type_for_mode
which don't use type_hash_canon but walk TYPE_NEXT_VARIANT list or
TYPE_POINTER_TO or TYPE_REFERENCE_TO chains.  Though, maybe it is ok
as is when c_update_type_canonical adjusts the pointer types
and variant types, those 

[PATCH] c++: Implement C++26 P2893R3 - Variadic friends [PR114459]

2024-05-03 Thread Jakub Jelinek
Hi!

The following patch implements the C++26 P2893R3 - Variadic friends
paper.  The paper allows friend type declarations to specify
more than one friend type specifier and allows ... to be specified at
the end of each.  The patch doesn't introduce tentative parsing of
friend-type-declaration non-terminal, but rather just extends existing
parsing where it is a friend declaration which ends with ; after the
declaration specifiers to the cases where it ends with ...; or , or ...,
In that case it pedwarns for cxx_dialect < cxx26, handles the ... and
if there is , continues in a loop to parse the further friend type
specifiers.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-03  Jakub Jelinek  

PR c++/114459
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_variadic_friend=202403L for C++26.
gcc/cp/
* parser.cc (cp_parser_member_declaration): Implement C++26
P2893R3 - Variadic friends.  Parse friend type declarations
with ... or with more than one friend type specifier.
* friend.cc (make_friend_class): Allow TYPE_PACK_EXPANSION.
* pt.cc (instantiate_class_template): Handle PACK_EXPANSION_P
in friend classes.
gcc/testsuite/
* g++.dg/cpp26/feat-cxx26.C (__cpp_variadic_friend): Add test.
* g++.dg/cpp26/variadic-friend1.C: New test.

--- gcc/c-family/c-cppbuiltin.cc.jj 2024-05-02 09:31:17.746298275 +0200
+++ gcc/c-family/c-cppbuiltin.cc 2024-05-03 14:50:08.008242950 +0200
@@ -1093,6 +1093,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_placeholder_variables=202306L");
  cpp_define (pfile, "__cpp_structured_bindings=202403L");
  cpp_define (pfile, "__cpp_deleted_function=202403L");
+ cpp_define (pfile, "__cpp_variadic_friend=202403L");
}
   if (flag_concepts)
 {
--- gcc/cp/parser.cc.jj 2024-05-03 09:43:47.781511477 +0200
+++ gcc/cp/parser.cc 2024-05-03 13:26:38.208088017 +0200
@@ -28102,7 +28102,14 @@ cp_parser_member_declaration (cp_parser*
 goto out;
   /* If there is no declarator, then the decl-specifier-seq should
  specify a type.  */
-  if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
+  if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)
+  || (cp_parser_friend_p (&decl_specifiers)
+ && cxx_dialect >= cxx11
+ && (cp_lexer_next_token_is (parser->lexer, CPP_COMMA)
+ || (cp_lexer_next_token_is (parser->lexer, CPP_ELLIPSIS)
+ && (cp_lexer_nth_token_is (parser->lexer, 2, CPP_SEMICOLON)
+ || cp_lexer_nth_token_is (parser->lexer, 2,
+   CPP_COMMA))
 {
   /* If there was no decl-specifier-seq, and the next token is a
 `;', then we have something like:
@@ -28137,44 +28144,81 @@ cp_parser_member_declaration (cp_parser*
{
  /* If the `friend' keyword was present, the friend must
 be introduced with a class-key.  */
-  if (!declares_class_or_enum && cxx_dialect < cxx11)
-pedwarn (decl_spec_token_start->location, OPT_Wpedantic,
- "in C++03 a class-key must be used "
- "when declaring a friend");
-  /* In this case:
+ if (!declares_class_or_enum && cxx_dialect < cxx11)
+   pedwarn (decl_spec_token_start->location, OPT_Wpedantic,
+"in C++03 a class-key must be used "
+"when declaring a friend");
+ if (!cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)
+ && cxx_dialect < cxx26)
+   pedwarn (cp_lexer_peek_token (parser->lexer)->location,
+OPT_Wc__26_extensions,
+"variadic friends or friend type declarations with "
+"multiple types only available with "
+"%<-std=c++2c%> or %<-std=gnu++2c%>");
+ location_t friend_loc = decl_specifiers.locations[ds_friend];
+ do
+   {
+ /* In this case:
 
-   template <typename T> struct A {
- friend struct A<T>::B;
-   };
+template <typename T> struct A {
+  friend struct A<T>::B;
+};
 
- A<T>::B will be represented by a TYPENAME_TYPE, and
- therefore not recognized by check_tag_decl.  */
-  if (!type)
-{
-  type = decl_specifiers.type;
-  if (type && TREE_CODE (type) == TYPE_DECL)
-type = TREE_TYPE (type);
-

Re: [PATCH v2] gcc-14: Mention that some warnings are now errors

2024-05-03 Thread Jakub Jelinek
On Fri, May 03, 2024 at 04:06:28PM +0100, Jonathan Wakely wrote:
> I agree it should be mentioned, but I would put it in the caveats
> section at the top, not as the last item of the C section.
> 
> How about this? OK for wwwdocs?

LGTM.

> commit fe5fd75ea5a7a08eee0831cadbdd05689e9408db
> Author: Jonathan Wakely 
> Date:   Fri May 3 16:04:49 2024 +0100
> 
> Add caveat to GCC 14 release notes about C warnings-as-errors change
> 
> diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
> index 46a0266d..82906de1 100644
> --- a/htdocs/gcc-14/changes.html
> +++ b/htdocs/gcc-14/changes.html
> @@ -40,6 +40,11 @@ a work-in-progress.
> href="https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end;>-Wflex-array-member-not-at-end
>  to
>identify all such cases in the source code and modify them.
>
> +  C:
> +  Certain warnings about are now errors, see
> +  Porting to GCC 14
> +  for details.
> +  
> href="https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html;>-fcf-protection=[full|branch|return|none|check]
>is refactored, to override -fcf-protection,
>-fcf-protection=none needs to be added and then

Jakub



Re: [PATCH] testsuite: fix analyzer C++ failures on Solaris [PR111475]

2024-05-03 Thread Jakub Jelinek
On Fri, May 03, 2024 at 09:31:08AM -0400, David Malcolm wrote:
> Jakub, Richi, Rainer: this is a non-trivial change that cleans up
> analyzer C++ testsuite results on Solaris, but has a slight risk of
> affecting analyzer behavior on other targets.  As such, I was thinking
> to hold off on backporting it to GCC 14 until after 14.1 is released.
> Is that a good plan?

Agreed, 14.2 is a better target than 14.1 for this, especially if committed
shortly after 14.1 goes out.

Jakub



Re: [PATCH] Avoid changing type in the type_hash_canon hash

2024-05-03 Thread Jakub Jelinek
On Fri, May 03, 2024 at 12:58:55PM +0200, Richard Biener wrote:
> When building a type and type_hash_canon returns an existing type
> avoid changing it, in particular its TYPE_CANONICAL.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.
> 
> OK for trunk?
> 
> Thanks,
> Richard.
> 
>   PR middle-end/114931
>   * tree.cc (build_array_type_1): Return early when type_hash_canon
>   returned an older existing type.
>   (build_function_type): Likewise.
>   (build_method_type_directly): Likewise.
>   (build_offset_type): Likewise.

LGTM, thanks.

Jakub



[PATCH] tree-inline: Add __builtin_stack_{save,restore} pair about inline calls with calls to alloca [PR113596]

2024-05-03 Thread Jakub Jelinek
Hi!

The following patch adds save_NNN = __builtin_stack_save (); ...
__builtin_stack_restore (save_NNN);
pair around inline calls which call alloca (alloca calls because of
VLA vars are ignored in that decision).
The patch doesn't change anything on whether we try to inline such calls or
not, it just fixes the behavior when we inline them despite those checks.
The stack save/restore restores the behavior that alloca acquired regions
are freed at the end of the containing call.
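
Conceptually, the effect on a caller that inlines an alloca-using function
in a loop can be sketched by hand as below.  This is only an illustration of
the shape, not compiler output; the toy_* names are made up, and calling
__builtin_stack_save/__builtin_stack_restore directly from user code is
assumed to be accepted by the compiler in use (a GCC-style extension).

```c
#include <assert.h>
#include <stddef.h>

/* Touches the first and last byte of the buffer and returns their sum.  */
static unsigned long
toy_fill_and_sum (char *p, size_t n)
{
  p[0] = 1;
  p[n - 1] = 2;
  return (unsigned long) p[0] + (unsigned long) p[n - 1];
}

/* Each iteration models one inlined call: save the stack pointer,
   run the body that calls alloca, then restore the stack pointer so
   the alloca region is released as it would be on a real return.  */
unsigned long
toy_loop_with_save_restore (size_t limit)
{
  unsigned long total = 0;
  for (size_t i = 2; i < limit; ++i)
    {
      void *saved = __builtin_stack_save ();
      char *p = __builtin_alloca (i);
      total += toy_fill_and_sum (p, i);
      __builtin_stack_restore (saved);
    }
  return total;
}
```

Without the save/restore pair, every iteration's allocation would stay live
until the enclosing function returns, so stack usage would grow with the
sum of all requested sizes instead of the largest single one.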

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-03  Jakub Jelinek  

PR middle-end/113596
* tree-inline.cc (expand_call_inline): Emit __builtin_stack_save
and __builtin_stack_restore calls around inlined functions which
call alloca.

* gcc.dg/pr113596.c: New test.
* gcc.dg/tree-ssa/pr113596.c: New test.

--- gcc/tree-inline.cc.jj   2024-04-11 11:09:07.274670922 +0200
+++ gcc/tree-inline.cc  2024-05-02 19:05:06.963750322 +0200
@@ -4794,6 +4794,7 @@ expand_call_inline (basic_block bb, gimp
   use_operand_p use;
   gimple *simtenter_stmt = NULL;
   vec *simtvars_save;
+  tree save_stack = NULL_TREE;
 
   /* The gimplifier uses input_location in too many places, such as
  internal_get_tmp_var ().  */
@@ -5042,6 +5043,28 @@ expand_call_inline (basic_block bb, gimp
GSI_NEW_STMT);
 }
 
+  /* If function to be inlined calls alloca, wrap the inlined function
+ in between save_stack = __builtin_stack_save (); and
+ __builtin_stack_restore (save_stack); calls.  */
+  if (id->src_cfun->calls_alloca && !gimple_call_noreturn_p (stmt))
+/* Don't do this for VLA allocations though, just for user alloca
+   calls.  */
+for (struct cgraph_edge *e = id->src_node->callees; e; e = e->next_callee)
+  if (gimple_maybe_alloca_call_p (e->call_stmt)
+ && !gimple_call_alloca_for_var_p (e->call_stmt))
+   {
+ tree fn = builtin_decl_implicit (BUILT_IN_STACK_SAVE);
+ gcall *call = gimple_build_call (fn, 0);
+ save_stack = make_ssa_name (ptr_type_node);
+ gimple_call_set_lhs (call, save_stack);
+ gimple_stmt_iterator si = gsi_last_bb (bb);
+ gsi_insert_after (&si, call, GSI_NEW_STMT);
+ struct cgraph_node *dest = cgraph_node::get_create (fn);
+ id->dst_node->create_edge (dest, call, bb->count)->inline_failed
+   = CIF_BODY_NOT_AVAILABLE;
+ break;
+   }
+
   if (DECL_INITIAL (fn))
 {
   if (gimple_block (stmt))
@@ -5165,6 +5188,17 @@ expand_call_inline (basic_block bb, gimp
}
}
 
+  if (save_stack)
+{
+  tree fn = builtin_decl_implicit (BUILT_IN_STACK_RESTORE);
+  gcall *call = gimple_build_call (fn, 1, save_stack);
+  gsi_insert_before (&stmt_gsi, call, GSI_SAME_STMT);
+  struct cgraph_node *dest = cgraph_node::get_create (fn);
+  id->dst_node->create_edge (dest, call,
+return_block->count)->inline_failed
+   = CIF_BODY_NOT_AVAILABLE;
+}
+
   /* Reset the escaped solution.  */
   if (cfun->gimple_df)
 {
--- gcc/testsuite/gcc.dg/pr113596.c.jj  2024-05-02 15:05:25.048642302 +0200
+++ gcc/testsuite/gcc.dg/pr113596.c 2024-05-02 15:05:25.048642302 +0200
@@ -0,0 +1,24 @@
+/* PR middle-end/113596 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+__attribute__((noipa)) void
+bar (char *p, int n)
+{
+  p[0] = 1;
+  p[n - 1] = 2;
+}
+
+static inline __attribute__((always_inline)) void
+foo (int n)
+{
+  char *p = __builtin_alloca (n);
+  bar (p, n);
+}
+
+int
+main ()
+{
+  for (int i = 2; i < 8192; ++i)
+foo (i);
+}
--- gcc/testsuite/gcc.dg/tree-ssa/pr113596.c.jj 2024-05-02 19:10:29.218455257 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr113596.c 2024-05-02 19:11:11.211895559 +0200
@@ -0,0 +1,37 @@
+/* PR middle-end/113596 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-einline" } */
+/* { dg-final { scan-tree-dump-times "__builtin_stack_save \\\(" 3 "einline" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_stack_restore \\\(" 3 "einline" } } */
+
+void baz (char *p, int n);
+volatile int v;
+
+static inline __attribute__((always_inline)) void
+foo (int n)
+{
+  ++v;
+  {
+char *p = __builtin_alloca (n);
+baz (p, n);
+  }
+  ++v;
+}
+
+static inline __attribute__((always_inline)) void
+bar (int n)
+{
+  ++v;
+  {
+char p[n];
+baz (p, n);
+  }
+  ++v;
+}
+
+void
+qux (int n)
+{
+  foo (n);
+  bar (n);
+}

Jakub
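
For illustration only (this is not part of the patch): a minimal source-level
sketch of what the wrapping amounts to.  It assumes that
__builtin_stack_save and __builtin_stack_restore may be called directly
from user code, which GCC and Clang accept; touch and run_loop are
hypothetical names, and the hand-written save/restore pair merely imitates
what the inliner would emit around the inlined body.

```cpp
#include <cstddef>

// Hypothetical stand-in for bar () in the run-time test: writes both ends.
__attribute__((noinline)) static void
touch (char *p, std::size_t n)
{
  p[0] = 1;
  p[n - 1] = 2;
}

// Hand-written equivalent of an inlined alloca-calling body wrapped in a
// stack save/restore pair, so every iteration releases its allocation
// instead of letting the stack grow across the whole loop.
long
run_loop (int limit)
{
  long calls = 0;
  for (int i = 2; i < limit; ++i)
    {
      void *saved = __builtin_stack_save ();
      char *p = static_cast<char *> (__builtin_alloca (i));
      touch (p, i);
      __builtin_stack_restore (saved);
      ++calls;
    }
  return calls;
}
```

Without the save/restore pair, all allocations made in the loop would stay
live until the enclosing function returns.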



Re: Trait built-in naming convention

2024-05-02 Thread Jakub Jelinek
On Thu, May 02, 2024 at 12:52:59PM -0700, Ken Matsui wrote:
> > This seems to be the prevailing sentiment, so let's continue that way.
> > Thanks for the input.
> 
> I actually found that we have two built-in type traits prefixed with
> __builtin: __builtin_is_corresponding_member and

That is a FE builtin, not a trait, and is very much different from the
__is_* traits, is varargs with extra processing, I don't think any of
the normal traits accepts pointer to members.

> __builtin_is_pointer_interconvertible_with_class.  Do we want to
> update these to use __ instead for consistency?

No, I think we want to keep them as is.

Jakub



Re: [pushed] Objective-C, NeXT, v2: Correct a regression in code-gen.

2024-05-02 Thread Jakub Jelinek
On Thu, May 02, 2024 at 01:53:21PM +0100, Iain Sandoe wrote:
> My testing of the GCC-14 release branch revealed an Objective-C
> regression in code-gen, the fix has been tested on x86_64, i686
> and powerpc darwin, pushed to trunk.
> 
> I will shortly apply this to the open branches, since they are
> affected too.  Given that this is completely local to Darwin and
> Objective-C (and pretty trivial too) - would it be acceptable for
> GCC-14.1?

Can't this just wait for GCC 14.2?
The code has been like that for years, no, and it is ObjC, not a release
critical language.

Jakub



Re: [PATCH] fix single argument static_assert

2024-05-02 Thread Jakub Jelinek
On Thu, May 02, 2024 at 12:28:29PM +0200, Marc Poulhiès wrote:
> Single argument static_assert is C++17 only.
> 
> gcc/ChangeLog:
> 
> * value-range.h: fix static_assert to use 2 arguments.
> ---
> 
> Ok for master?

Yes.

Jakub
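
A minimal sketch of the difference, for reference (the condition and the
function name char_bits are made up for the example; this is not the
value-range.h code):

```cpp
#include <climits>

// Two-argument form: accepted from C++11 on.
static_assert (CHAR_BIT >= 8, "char must have at least 8 bits");

#if __cplusplus >= 201703L
// Message-less form: only valid from C++17 on.
static_assert (CHAR_BIT >= 8);
#endif

int char_bits () { return CHAR_BIT; }
```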



[committed] libgomp: Add gfx90c, 1036 and 1103 declare variant tests

2024-05-02 Thread Jakub Jelinek
Hi!

Recently -march=gfx{90c,1036,1103} support has been added, but corresponding
changes weren't done in the testsuite.

The following patch adds that.

Tested on x86_64-linux (with fiji and gfx1103 devices; had to use
OMP_DEFAULT_DEVICE=1 there, fiji doesn't really work due to LLVM dropping
support, but we still list those as offloading devices).
Committed to trunk.

2024-05-02  Jakub Jelinek  

* testsuite/libgomp.c/declare-variant-4.h (gfx90c, gfx1036, gfx1103):
New functions.
(f): Add #pragma omp declare variant directives for those.
* testsuite/libgomp.c/declare-variant-4-gfx90c.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1036.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1103.c: New test.

--- libgomp/testsuite/libgomp.c/declare-variant-4.h.jj  2024-01-29 12:11:57.917149306 +0100
+++ libgomp/testsuite/libgomp.c/declare-variant-4.h 2024-05-02 11:41:42.579379273 +0200
@@ -37,6 +37,13 @@ gfx90a (void)
 
 __attribute__ ((noipa))
 int
+gfx90c (void)
+{
+  return 0x90c;
+}
+
+__attribute__ ((noipa))
+int
 gfx1030 (void)
 {
   return 0x1030;
@@ -44,11 +51,25 @@ gfx1030 (void)
 
 __attribute__ ((noipa))
 int
+gfx1036 (void)
+{
+  return 0x1036;
+}
+
+__attribute__ ((noipa))
+int
 gfx1100 (void)
 {
   return 0x1100;
 }
 
+__attribute__ ((noipa))
+int
+gfx1103 (void)
+{
+  return 0x1103;
+}
+
 #ifdef USE_FIJI_FOR_GFX803
 #pragma omp declare variant(gfx803) match(device = {isa("fiji")})
 #else
@@ -58,8 +79,11 @@ gfx1100 (void)
 #pragma omp declare variant(gfx906) match(device = {isa("gfx906")})
 #pragma omp declare variant(gfx908) match(device = {isa("gfx908")})
 #pragma omp declare variant(gfx90a) match(device = {isa("gfx90a")})
+#pragma omp declare variant(gfx90c) match(device = {isa("gfx90c")})
 #pragma omp declare variant(gfx1030) match(device = {isa("gfx1030")})
+#pragma omp declare variant(gfx1036) match(device = {isa("gfx1036")})
 #pragma omp declare variant(gfx1100) match(device = {isa("gfx1100")})
+#pragma omp declare variant(gfx1103) match(device = {isa("gfx1103")})
 __attribute__ ((noipa))
 int
 f (void)
--- libgomp/testsuite/libgomp.c/declare-variant-4-gfx90c.c.jj   2024-05-02 11:38:57.272597106 +0200
+++ libgomp/testsuite/libgomp.c/declare-variant-4-gfx90c.c  2024-05-02 11:39:11.169410657 +0200
@@ -0,0 +1,8 @@
+/* { dg-do link { target { offload_target_amdgcn } } } */
+/* { dg-additional-options -foffload=amdgcn-amdhsa } */
+/* { dg-additional-options -foffload=-march=gfx90c } */
+/* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
+
+#include "declare-variant-4.h"
+
+/* { dg-final { only_for_offload_target amdgcn-amdhsa scan-offload-tree-dump "= gfx90c \\(\\);" "optimized" } } */
--- libgomp/testsuite/libgomp.c/declare-variant-4-gfx1036.c.jj  2024-05-02 11:39:29.393166162 +0200
+++ libgomp/testsuite/libgomp.c/declare-variant-4-gfx1036.c 2024-05-02 11:39:51.834865074 +0200
@@ -0,0 +1,8 @@
+/* { dg-do link { target { offload_target_amdgcn } } } */
+/* { dg-additional-options -foffload=amdgcn-amdhsa } */
+/* { dg-additional-options -foffload=-march=gfx1036 } */
+/* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
+
+#include "declare-variant-4.h"
+
+/* { dg-final { only_for_offload_target amdgcn-amdhsa scan-offload-tree-dump "= gfx1036 \\(\\);" "optimized" } } */
--- libgomp/testsuite/libgomp.c/declare-variant-4-gfx1103.c.jj  2024-05-02 11:39:43.155981513 +0200
+++ libgomp/testsuite/libgomp.c/declare-variant-4-gfx1103.c 2024-05-02 11:40:02.801717936 +0200
@@ -0,0 +1,8 @@
+/* { dg-do link { target { offload_target_amdgcn } } } */
+/* { dg-additional-options -foffload=amdgcn-amdhsa } */
+/* { dg-additional-options -foffload=-march=gfx1103 } */
+/* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
+
+#include "declare-variant-4.h"
+
+/* { dg-final { only_for_offload_target amdgcn-amdhsa scan-offload-tree-dump "= gfx1103 \\(\\);" "optimized" } } */

Jakub



Re: [PATCH] libgcc: Rename __trunchfbf2 to __extendhfbf2

2024-05-01 Thread Jakub Jelinek
On Wed, May 01, 2024 at 12:55:25PM -0700, H.J. Lu wrote:
> Since bfloat16 has the same range as float32, _Float16 to bfloat16
> conversion is an extension, not a truncation.  Rename trunchfbf2.c
> to extendhfbf2.c to provide __extendhfbf2, instead of __trunchfbf2.
> 
> Since _Float16 to bfloat16 conversion never worked from the day one,
> the same libgcc version of __trunchfbf2 is used with __extendhfbf2 so
> that this can be backported to release branches all the way where
> __trunchfbf2 was added.

This is wrong.
First of all, it is an ABI-incompatible change, we can't do that.
And second, neither _Float16 is a subset of __bf16 nor the other way,
so both extend and trunc names are equally wrong.

Jakub
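
To make the "neither is a subset" point concrete, here is a rough sketch
(not libgcc code).  It assumes the usual parameters: IEEE binary16 for
_Float16 (11 bits of precision, normal exponents -14..15) and bfloat16
with 8 bits of precision and the float32 exponent range (-126..127).  The
helper names are hypothetical and only finite inputs are handled.

```cpp
#include <cmath>

// True if finite X is exactly representable in a binary format with the
// given significand precision (implicit bit included) and exponent range.
static bool
representable (double x, int precision, int emin, int emax)
{
  if (x == 0.0)
    return true;
  int e;
  std::frexp (x, &e);		// |x| = m * 2^e with m in [0.5, 1)
  int exponent = e - 1;		// |x| = 1.f * 2^exponent
  if (exponent > emax)
    return false;		// would overflow
  // Below emin the format is subnormal and the ulp stops shrinking.
  int ulp_exp = (exponent < emin ? emin : exponent) - (precision - 1);
  double scaled = std::ldexp (x, -ulp_exp);
  return scaled == std::floor (scaled);
}

bool fits_float16 (double x) { return representable (x, 11, -14, 15); }
bool fits_bfloat16 (double x) { return representable (x, 8, -126, 127); }
```

For example, 1 + 2^-10 fits _Float16 but needs more mantissa bits than
bfloat16 has, while 2^20 fits bfloat16 but is out of range for _Float16.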



[PATCH] c++: Implement C++26 P2573R2 - = delete("should have a reason"); [PR114458]

2024-05-01 Thread Jakub Jelinek
Hi!

The following patch implements the C++26 P2573R2
= delete("should have a reason"); paper.
I've tried to avoid increasing compile time memory for it when it isn't
used (e.g. by adding a new lang_decl tree member), so the reason is
represented as STRING_CST in DECL_INITIAL (which normally is for
DECL_DELETED_FN error_mark_node) and to differentiate this delete("reason")
initializer from some bogus attempt to initialize a function with "reason"
using the RID_DELETE identifier as TREE_TYPE of the STRING_CST, as nothing
needs to care about the type of the reason.  If preferred it could
be say TREE_LIST with the reason STRING_CST and RID_DELETE identifier or
something similar instead, but that would need more compile time memory when
it is used.
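
A small usage sketch of the feature, keyed off the feature-test macro the
patch predefines, so that it also compiles with compilers or -std= modes
lacking P2573R2.  DELETED_BECAUSE and struct Name are hypothetical names
invented for this example.

```cpp
#include <cstddef>
#include <string>

// Use the reason string only where __cpp_deleted_function says it is
// supported; otherwise fall back to a plain deleted definition.
#if defined (__cpp_deleted_function) && __cpp_deleted_function >= 202403L
# define DELETED_BECAUSE(reason) = delete (reason)
#else
# define DELETED_BECAUSE(reason) = delete
#endif

struct Name
{
  explicit Name (const char *s) : value (s) {}
  // Name (nullptr) selects this overload and is rejected at compile time;
  // with the feature enabled the diagnostic can include the reason.
  Name (std::nullptr_t) DELETED_BECAUSE ("a null pointer is not a name");
  std::string value;
};
```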

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-01  Jakub Jelinek  

PR c++/114458
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_deleted_function=202403L for C++26.
gcc/cp/ChangeLog
* parser.cc (cp_parser_pure_specifier): Implement C++26 P2573R2
- = delete("should have a reason");.  Parse deleted-function-body.
* decl.cc (duplicate_decls): Copy DECL_INITIAL from DECL_DELETED_FN
olddecl to newdecl if it is a STRING_CST.
(cp_finish_decl): Handle deleted init with a reason.
* decl2.cc: Include "escaped_string.h".
(grokfield): Handle deleted init with a reason.
(mark_used): Emit DECL_DELETED_FN reason in the message if any.
gcc/testsuite/
* g++.dg/cpp26/feat-cxx26.C (__cpp_deleted_function): Add test.
* g++.dg/cpp26/delete-reason1.C: New test.
* g++.dg/cpp26/delete-reason2.C: New test.
* g++.dg/parse/error65.C (f1): Adjust expected diagnostics.

--- gcc/c-family/c-cppbuiltin.cc.jj 2024-04-30 08:57:07.359039013 +0200
+++ gcc/c-family/c-cppbuiltin.cc 2024-04-30 19:16:45.069542205 +0200
@@ -1092,6 +1092,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_static_assert=202306L");
  cpp_define (pfile, "__cpp_placeholder_variables=202306L");
  cpp_define (pfile, "__cpp_structured_bindings=202403L");
+ cpp_define (pfile, "__cpp_deleted_function=202403L");
}
   if (flag_concepts)
 {
--- gcc/cp/parser.cc.jj 2024-04-30 08:57:07.349039147 +0200
+++ gcc/cp/parser.cc 2024-04-30 16:47:01.427952875 +0200
@@ -28573,6 +28573,27 @@ cp_parser_pure_specifier (cp_parser* par
   || token->keyword == RID_DELETE)
 {
   maybe_warn_cpp0x (CPP0X_DEFAULTED_DELETED);
+  if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_PAREN))
+   {
+ if (cxx_dialect >= cxx11 && cxx_dialect < cxx26)
+   pedwarn (cp_lexer_peek_token (parser->lexer)->location,
+OPT_Wc__26_extensions,
+"%<delete%> reason only available with "
+"%<-std=c++2c%> or %<-std=gnu++2c%>");
+
+ /* Consume the `('.  */
+ matching_parens parens;
+ parens.consume_open (parser);
+ tree reason = cp_parser_unevaluated_string_literal (parser);
+ /* Consume the `)'.  */
+ parens.require_close (parser);
+ if (TREE_CODE (reason) == STRING_CST)
+   {
+ TREE_TYPE (reason) = token->u.value;
+ return reason;
+   }
+   }
+
   return token->u.value;
 }
 
--- gcc/cp/decl.cc.jj   2024-04-30 08:55:26.172389593 +0200
+++ gcc/cp/decl.cc  2024-04-30 19:01:32.316543498 +0200
@@ -2410,6 +2410,10 @@ duplicate_decls (tree newdecl, tree oldd
"previous declaration of %qD", olddecl);
}
  DECL_DELETED_FN (newdecl) |= DECL_DELETED_FN (olddecl);
+ if (DECL_DELETED_FN (olddecl)
+ && DECL_INITIAL (olddecl)
+ && TREE_CODE (DECL_INITIAL (olddecl)) == STRING_CST)
+   DECL_INITIAL (newdecl) = DECL_INITIAL (olddecl);
}
 }
 
@@ -8587,17 +8591,20 @@ cp_finish_decl (tree decl, tree init, bo
   if (init && TREE_CODE (decl) == FUNCTION_DECL)
 {
   tree clone;
-  if (init == ridpointers[(int)RID_DELETE])
+  if (init == ridpointers[(int)RID_DELETE]
+ || (TREE_CODE (init) == STRING_CST
+ && TREE_TYPE (init) == ridpointers[(int)RID_DELETE]))
{
  /* FIXME check this is 1st decl.  */
  DECL_DELETED_FN (decl) = 1;
  DECL_DECLARED_INLINE_P (decl) = 1;
- DECL_INITIAL (decl) = error_mark_node;
+ DECL_INITIAL (decl)
+   = TREE_CODE (init) == STRING_CST ? init : error_mark_node;
  FOR_EACH_CLONE (clone, decl)
{
  DECL_DELETED_FN (clone) = 1;
  DECL_DECLARED_INLINE_P (clone) = 1;
- DECL_INITIAL (clone) = error_mark

Re: [wwwdocs] Porting-to-14: Mention new pragma GCC Target behavior

2024-04-30 Thread Jakub Jelinek
On Tue, Apr 30, 2024 at 11:12:30PM +0200, Martin Jambor wrote:
> Would the following then perhaps describe the situation accurately?
> Note that I have moved the whole thing to C++ section because it seems
> porting issues in C because of this are quite unlikely.
> 
> Michal, I assume that the file where this issue happened was written in
> C++, right?
> 
> Martin
> 
> 
> 
> diff --git a/htdocs/gcc-14/porting_to.html b/htdocs/gcc-14/porting_to.html
> index c825a68e..1e67b0b3 100644
> --- a/htdocs/gcc-14/porting_to.html
> +++ b/htdocs/gcc-14/porting_to.html
> @@ -514,6 +514,51 @@ be included explicitly when compiling with GCC 14:
>  
>  
>  
> +Pragma GCC Target now affects preprocessor symbols

I'd use lowercase Target here
> +
> +
> +The behavior of pragma GCC Target and specifically how it affects ISA

And here as well, perhaps even #pragma GCC target.

Otherwise LGTM.

> +macros has changed in GCC 14.  In GCC 13 and older, the GCC
> +target pragma defined and undefined corresponding ISA macros in
> +C when using integrated preprocessor during compilation but not when
> +preprocessor was invoked as a separate step or when using -save-temps.
> +In C++ the ISA macro definitions were performed in a way which did not
> +have any actual effect.
> +
> +In GCC 14 C++ behaves like C with integrated preprocessing in earlier
> +versions. Moreover, in both languages ISA macros are defined and
> +undefined as expected when preprocessing separately from compilation.
> +
> +
> +This can lead to different behavior, especially in C++.  For example,
> +functions in the C++ snippet below will be (silently) compiled for an
> +incorrect instruction set by GCC 14.
> +
> +
> +  #if ! __AVX2__
> +  #pragma GCC push_options
> +  #pragma GCC target("avx2")
> +  #endif
> +
> +  /* Code to be compiled for AVX2. */
> +
> +  /* With GCC 14, __AVX2__ here will always be defined and pop_options
> +  never called. */
> +  #if ! __AVX2__
> +  #pragma GCC pop_options
> +  #endif
> +
> +  /* With GCC 14, all following functions will be compiled for AVX2
> +  which was not intended. */
> +
> +
> +
> +The fix in this case would be to remember
> +whether pop_options needs to be performed in a new
> +user-defined macro.
> +
> +
> +
>  
>  
>  

Jakub
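
For what it's worth, the suggested fix could look roughly like the sketch
below.  NEED_POP_OPTIONS, AVX2_BEFORE, AVX2_AFTER, sum8 and
target_restored are hypothetical names used only for this illustration,
and the x86 guard is an assumption so that the snippet also compiles on
other targets.

```cpp
// Record the state before the region so it can be compared afterwards.
#ifdef __AVX2__
# define AVX2_BEFORE 1
#else
# define AVX2_BEFORE 0
#endif

#if (defined (__x86_64__) || defined (__i386__)) && !defined (__AVX2__)
# define NEED_POP_OPTIONS 1	/* hypothetical user-defined macro */
# pragma GCC push_options
# pragma GCC target ("avx2")
#endif

/* Code to be compiled for AVX2; only call it on CPUs that support it.  */
int
sum8 (const int *v)
{
  int s = 0;
  for (int i = 0; i < 8; ++i)
    s += v[i];
  return s;
}

/* Decide based on the remembered macro, not on __AVX2__, which GCC 14
   defines inside the region.  */
#ifdef NEED_POP_OPTIONS
# pragma GCC pop_options
# undef NEED_POP_OPTIONS
#endif

#ifdef __AVX2__
# define AVX2_AFTER 1
#else
# define AVX2_AFTER 0
#endif

/* True if the target state after the region matches the one before it.  */
bool target_restored () { return AVX2_BEFORE == AVX2_AFTER; }
```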



Re: [PATCH] Don't assert for IFN_COND_{MIN, MAX} in vect_transform_reduction

2024-04-30 Thread Jakub Jelinek
On Tue, Apr 30, 2024 at 09:30:00AM +0200, Richard Biener wrote:
> On Mon, Apr 29, 2024 at 5:30 PM H.J. Lu  wrote:
> >
> > On Mon, Apr 29, 2024 at 6:47 AM liuhongt  wrote:
> > >
> > > The Fortran standard does not specify what the result of the MAX
> > > and MIN intrinsics are if one of the arguments is a NaN. So it
> > > should be ok to transform reduction for IFN_COND_MIN with vectorized
> > > COND_MIN and REDUC_MIN.
> >
> > The commit subject isn't very clear.   This patch isn't about "Don't assert
> > for IFN_COND_{MIN,MAX}".  It allows IFN_COND_{MIN,MAX} in
> > vect_transform_reduction.
> 
> Well, we allow it elsewhere, we just fail to enumerate all COND_* we allow
> here correctly.
> 
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > Ok for trunk and backport to GCC14?
> 
> OK for trunk and branch.

Oops, I've just sent the same patch, just with a different testcase
(reduced and which tests both the min and max).
I think the reduced testcase is better.

> > > gcc/ChangeLog:
> > >
> > > PR 114883

Missing tree-optimization/

> > > * tree-vect-loop.cc (vect_transform_reduction): Don't assert
> > > for IFN_COND_{MIN, MAX}.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gfortran.dg/pr114883.f90: New test.

Jakub



[PATCH] vect: Adjust vect_transform_reduction assertion [PR114883]

2024-04-30 Thread Jakub Jelinek
Hi!

The assertion doesn't allow IFN_COND_MIN/IFN_COND_MAX, which are
commutative conditional binary operations like ADD/MUL/AND/IOR/XOR,
and can be handled just fine.
In particular, we emit
vminpd  %zmm3, %zmm5, %zmm0{%k2}
vminpd  %zmm0, %zmm3, %zmm5{%k1}
and
vmaxpd  %zmm3, %zmm5, %zmm0{%k2}
vmaxpd  %zmm0, %zmm3, %zmm5{%k1}
in the vectorized loops of the first and second subroutine.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
14.1?

2024-04-30  Jakub Jelinek  
Hongtao Liu  

PR tree-optimization/114883
* tree-vect-loop.cc (vect_transform_reduction): Allow IFN_COND_MIN and
IFN_COND_MAX in the assert.

* gfortran.dg/pr114883.f90: New test.

--- gcc/tree-vect-loop.cc.jj 2024-04-17 11:34:02.465185397 +0200
+++ gcc/tree-vect-loop.cc   2024-04-29 20:41:04.973723992 +0200
@@ -8505,7 +8505,8 @@ vect_transform_reduction (loop_vec_info
 {
   gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB
  || code == IFN_COND_MUL || code == IFN_COND_AND
- || code == IFN_COND_IOR || code == IFN_COND_XOR);
+ || code == IFN_COND_IOR || code == IFN_COND_XOR
+ || code == IFN_COND_MIN || code == IFN_COND_MAX);
   gcc_assert (op.num_ops == 4
  && (op.ops[reduc_index]
  == op.ops[internal_fn_else_index ((internal_fn) code)]));
--- gcc/testsuite/gfortran.dg/pr114883.f90.jj   2024-04-29 20:39:39.000871849 +0200
+++ gcc/testsuite/gfortran.dg/pr114883.f90  2024-04-29 20:39:27.757021972 +0200
@@ -0,0 +1,53 @@
+! PR tree-optimization/114883
+! { dg-do compile }
+! { dg-options "-O2 -fvect-cost-model=cheap" }
+! { dg-additional-options "-march=x86-64-v4" { target i?86-*-* x86_64-*-* } }
+
+subroutine pr114883_1(a, b, c, d, e, f, g, h, o)
+  real(8) :: c(1011), d(1011), e(0:1011)
+  real(8) :: p, q, f, r, g(1011), h(1011), b, bar
+  integer :: o(100), a, t, u
+  p = 0.0_8
+  r = bar()
+  u = 1
+  do i = 1,a
+do k = 1,1011
+  km1 = max0(k-1,1)
+  h(k) = c(k) * e(k-1) * d(km1)
+  f = g(k) + h(k)
+  if(f.gt.1.e-6)then
+p = min(p,r)
+  endif
+end do
+q = 0.9_8 * p
+t = integer(b/q + 1)
+if(t>100)then
+  u = t
+endif
+o(u) = o(u) + 1
+  end do
+end subroutine pr114883_1
+subroutine pr114883_2(a, b, c, d, e, f, g, h, o)
+  real(8) :: c(1011), d(1011), e(0:1011)
+  real(8) :: p, q, f, r, g(1011), h(1011), b, bar
+  integer :: o(100), a, t, u
+  p = 0.0_8
+  r = bar()
+  u = 1
+  do i = 1,a
+do k = 1,1011
+  km1 = max0(k-1,1)
+  h(k) = c(k) * e(k-1) * d(km1)
+  f = g(k) + h(k)
+  if(f.gt.1.e-6)then
+p = max(p,r)
+  endif
+end do
+q = 0.9_8 * p
+t = integer(b/q + 1)
+if(t>100)then
+  u = t
+endif
+o(u) = o(u) + 1
+  end do
+end subroutine pr114883_2

Jakub
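
For readers not fluent in Fortran, the reduction in the inner loops has
roughly the following shape.  This is only a sketch with made-up function
names; whether a particular compiler and option set turns it into
IFN_COND_MIN/IFN_COND_MAX is not something the snippet demonstrates.

```cpp
// Conditional MIN reduction: p is only updated when the guard holds.
double
cond_min (const double *f, int n, double p, double r)
{
  for (int k = 0; k < n; ++k)
    if (f[k] > 1.e-6)
      p = p < r ? p : r;
  return p;
}

// The same shape with MAX, matching the second subroutine.
double
cond_max (const double *f, int n, double p, double r)
{
  for (int k = 0; k < n; ++k)
    if (f[k] > 1.e-6)
      p = p > r ? p : r;
  return p;
}
```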



[PATCH] gimple-ssa-sprintf: Use [0, 1] range for %lc with (wint_t) 0 argument [PR114876]

2024-04-30 Thread Jakub Jelinek
Hi!

Seems when Martin S. implemented this, he coded a strict reading
of the standard, which said that %lc with (wint_t) 0 argument is handled
as wchar_t[2] temp = { arg, 0 }; %ls with temp arg and so shouldn't print
any values.  But, most of the libc implementations actually handled that
case like %c with '\0' argument, adding a single NUL character, the only
known exception is musl.
Recently, C23 changed this in response to GB-141 and POSIX in
https://austingroupbugs.net/view.php?id=1647
so that it should have the same behavior as %c with '\0'.

Because there is implementation divergence, the following patch uses
a range rather than hardcoding it to all 1s (i.e. the %c behavior),
though the likely case is still 1 (forward looking plus most of
implementations).
The res.knownrange = true; assignment removed is redundant due to
the same assignment done unconditionally before the if statement,
rest is formatting fixes.
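
The divergence can be observed with a tiny probe like the one below (a
sketch, not the new testcase; lc_nul_length is a made-up name).  Depending
on the libc it reports either 0 or 1, which is why the range is [0, 1].

```cpp
#include <cstdio>
#include <cwchar>

// Number of bytes snprintf reports for %lc with a (wint_t) 0 argument.
// Implementations following C23/POSIX give 1 (a single NUL, as for %c
// with '\0'); ones following the older strict reading give 0.
int
lc_nul_length ()
{
  char buf[8];
  return std::snprintf (buf, sizeof buf, "%lc", static_cast<std::wint_t> (0));
}
```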

I don't think the min >= 0 && min < 128 case is right either, I'd think
it should be min >= 0 && max < 128, otherwise it is just some possible
inputs are (maybe) ASCII and there can be others, but this code is a total
mess anyway, with the min, max, likely (somewhere in [min, max]?) and then
unlikely possibly larger than max, dunno, perhaps for at least some chars
in the ASCII range the likely case could be for the ascii case; so perhaps
just the one_2_one_ascii shouldn't set max to 1 and mayfail should be true
for max >= 128.  Anyway, didn't feel I should touch that right now.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Shall it go to 14.1, or wait for 14.2?

2024-04-30  Jakub Jelinek  

PR tree-optimization/114876
* gimple-ssa-sprintf.cc (format_character): For min == 0 && max == 0,
set max, likely and unlikely members to 1 rather than 0.  Remove
useless res.knownrange = true;.  Formatting fixes.

* gcc.dg/pr114876.c: New test.
* gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Adjust expected
diagnostics.

--- gcc/gimple-ssa-sprintf.cc.jj 2024-01-03 11:51:22.225860346 +0100
+++ gcc/gimple-ssa-sprintf.cc   2024-04-29 12:52:59.760668894 +0200
@@ -2177,8 +2177,7 @@ format_character (const directive &dir,
 
   res.knownrange = true;
 
-  if (dir.specifier == 'C'
-  || dir.modifier == FMT_LEN_l)
+  if (dir.specifier == 'C' || dir.modifier == FMT_LEN_l)
 {
   /* A wide character can result in as few as zero bytes.  */
   res.range.min = 0;
@@ -2189,10 +2188,13 @@ format_character (const directive &dir,
{
  if (min == 0 && max == 0)
{
- /* The NUL wide character results in no bytes.  */
- res.range.max = 0;
- res.range.likely = 0;
- res.range.unlikely = 0;
+ /* In strict reading of older ISO C or POSIX, this required
+no characters to be emitted.  ISO C23 changes that, so
+does POSIX, to match what has been implemented in most of the
+implementations, namely emitting a single NUL character.
+Let's use 0 for minimum and 1 for all the other values.  */
+ res.range.max = 1;
+ res.range.likely = res.range.unlikely = 1;
}
  else if (min >= 0 && min < 128)
{
@@ -2200,11 +2202,12 @@ format_character (const directive &dir,
 is not a 1-to-1 mapping to the source character set or
 if the source set is not ASCII.  */
  bool one_2_one_ascii
-   = (target_to_host_charmap[0] == 1 && target_to_host ('a') == 97);
+   = (target_to_host_charmap[0] == 1
+  && target_to_host ('a') == 97);
 
  /* A wide character in the ASCII range most likely results
 in a single byte, and only unlikely in up to MB_LEN_MAX.  */
- res.range.max = one_2_one_ascii ? 1 : target_mb_len_max ();;
+ res.range.max = one_2_one_ascii ? 1 : target_mb_len_max ();
  res.range.likely = 1;
  res.range.unlikely = target_mb_len_max ();
  res.mayfail = !one_2_one_ascii;
@@ -2235,7 +2238,6 @@ format_character (const directive &dir,
   /* A plain '%c' directive.  Its output is exactly 1.  */
   res.range.min = res.range.max = 1;
   res.range.likely = res.range.unlikely = 1;
-  res.knownrange = true;
 }
 
   /* Bump up the byte counters if WIDTH is greater.  */
--- gcc/testsuite/gcc.dg/pr114876.c.jj  2024-04-29 12:26:45.774965158 +0200
+++ gcc/testsuite/gcc.dg/pr114876.c 2024-04-29 12:51:37.863777055 +0200
@@ -0,0 +1,34 @@
+/* PR tree-optimization/114876 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "return \[01\];" "optimized" } } */
+/* { dg-final { scan-tree-dump "return 3;" &

[PATCH] libgcc: Do use weakrefs for glibc 2.34 on GNU Hurd

2024-04-29 Thread Jakub Jelinek
On Mon, Apr 29, 2024 at 01:44:24PM +0000, Joseph Myers wrote:
> > glibc 2.34 and later doesn't have separate libpthread (libpthread.so.0 is a
> > dummy shared library with just some symbol versions for compatibility, but
> > all the pthread_* APIs are in libc.so.6).
> 
> I suspect this has caused link failures in the glibc testsuite for Hurd, 
> which still has separate libpthread.
> 
> https://sourceware.org/pipermail/libc-testresults/2024q2/012556.html

So like this then?  I can't really test it on Hurd, but will certainly
test on x86_64-linux/i686-linux.

2024-04-29  Jakub Jelinek  

* gthr.h (GTHREAD_USE_WEAK): Don't redefine to 0 for glibc 2.34+
on GNU Hurd.

--- libgcc/gthr.h.jj 2024-04-25 20:43:10.555694952 +0200
+++ libgcc/gthr.h   2024-04-29 16:57:40.734062691 +0200
@@ -142,7 +142,7 @@ see the files COPYING3 and COPYING.RUNTI
 #endif
 
 #ifdef __GLIBC_PREREQ
-#if __GLIBC_PREREQ(2, 34)
+#if __GLIBC_PREREQ(2, 34) && !defined(__gnu_hurd__)
 /* glibc 2.34 and later has all pthread_* APIs inside of libc,
no need to link separately with -lpthread.  */
 #undef GTHREAD_USE_WEAK


Jakub



Re: [PATCH] arm: [MVE intrinsics] Fix support for predicate constants [PR target/114801]

2024-04-29 Thread Jakub Jelinek
On Fri, Apr 26, 2024 at 11:10:12PM +0000, Christophe Lyon wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/pr114801.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include <arm_mve.h>
> +
> +/*
> +** test_32:
> +**...
> +**   mov r[0-9]+, #65535 @ movhi
> +**...
> +*/
> +uint32x4_t test_32() {
> +  return vdupq_m_n_u32(vdupq_n_u32(0), 0, 0x);

Just a testcase nit.  I think testing 0x isn't that useful,
it tests the same 4 bits 4 times.
Might be more interesting to test 4 different 4 bit elements,
one of them 0 (to verify it doesn't turn that into all ones),
one all 1s (that is the other valid case) and then 2 random
other values in between.

> +}
> +
> +/*
> +** test_16:
> +**...
> +**   mov r[0-9]+, #52428 @ movhi
> +**...
> +*/
> +uint16x8_t test_16() {
> +  return vdupq_m_n_u16(vdupq_n_u16(0), 0, 0x);

And for these it can actually test all 4 possible 2 bit elements,
so say 0x3021

> +}
> +
> +/*
> +** test_8:
> +**...
> +**   mov r[0-9]+, #52428 @ movhi
> +**...
> +*/
> +uint8x16_t test_8() {
> +  return vdupq_m_n_u8(vdupq_n_u8(0), 0, 0x);

and here use some random pattern.

BTW, the patch is ok for 14.1 if it is approved and committed today
(so that it can be cherry-picked tomorrow morning at latest to the branch).

Jakub



Re: [Patch, fortran] PR114859 - [14/15 Regression] Seeing new segmentation fault in same_type_as since r14-9752

2024-04-29 Thread Jakub Jelinek
On Sun, Apr 28, 2024 at 10:37:06PM +0100, Paul Richard Thomas wrote:
> Could this be looked at quickly? The timing of this regression is more than
> a little embarrassing on the eve of the 14.1 release. The testcase and the
> comment in gfc_trans_class_init_assign explain what this problem is all
> about and how the patch fixes it.
> 
> OK for 15-branch and backporting to 14-branch (hopefully to the RC as well)?

The patch is ok for 14.1 if cherry-picked today.

Jakub



Re: [PATCH] libstdc++: Update Solaris baselines for GCC 14.0

2024-04-29 Thread Jakub Jelinek
On Mon, Apr 29, 2024 at 10:07:42AM +0200, Rainer Orth wrote:
> This patch updates the Solaris baselines for the GLIBCXX_3.4.33 version
> added in GCC 14.0.
> 
> Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (32 and 64-bit
> each), together with the GLIBCXX_3.4.32 update, on both gcc-14 branch
> and trunk.
> 
> Ok for gcc-14 branch and trunk?

Ok for both.

> 2024-04-28  Rainer Orth  
> 
>   libstdc++-v3:
>   * config/abi/post/i386-solaris/baseline_symbols.txt: Regenerate.
>   * config/abi/post/i386-solaris/amd64/baseline_symbols.txt:
>   Likewise.
>   * config/abi/post/sparc-solaris/baseline_symbols.txt: Likewise.
>   * config/abi/post/sparc-solaris/sparcv9/baseline_symbols.txt:
>   Likewise.

Jakub



Re: [PATCH] libstdc++: Update Solaris baselines for GCC 13.2

2024-04-29 Thread Jakub Jelinek
On Mon, Apr 29, 2024 at 10:02:56AM +0200, Rainer Orth wrote:
> This patch updates the Solaris baselines for the GLIBCXX_3.4.32 version
> added in GCC 13.2.
> 
> Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (32 and 64-bit
> each) on the gcc-13 branch and (together with the GLIBCXX_3.4.33 update)
> on both gcc-14 branch and trunk.
> 
> Ok for all of gcc-13 and gcc-14 branches and trunk?

I think Solaris shouldn't have the _ZNSt8ios_base4InitC1Ev @@GLIBCXX_3.4.32
export, so this LGTM, for all 3 branches.

> 2024-04-28  Rainer Orth  
> 
>   libstdc++-v3:
>   * config/abi/post/i386-solaris/baseline_symbols.txt: Regenerate.
>   * config/abi/post/i386-solaris/amd64/baseline_symbols.txt:
>   Likewise.
>   * config/abi/post/sparc-solaris/baseline_symbols.txt: Likewise.
>   * config/abi/post/sparc-solaris/sparcv9/baseline_symbols.txt:
>   Likewise.

Jakub



Re: [PATCH v2] RISC-V: Fix ICE for legitimize move on subreg const_poly_int [PR114885]

2024-04-29 Thread Jakub Jelinek
On Mon, Apr 29, 2024 at 03:47:05PM +0800, Kito Cheng wrote:
> Hi Jakub:
> 
> Is this OK for the GCC 14 branch? It fixes an ICE on valid code, thanks :)

Ok.

Jakub



[PATCH] c++: Implement C++26 P0609R3 - Attributes for Structured Bindings [PR114456]

2024-04-29 Thread Jakub Jelinek
Hi!

The following patch implements the P0609R3 paper; we build the
VAR_DECLs for the structured binding identifiers early, so all we need
IMHO is just to parse the attributed identifier list and pass the attributes
to the VAR_DECL creation.

The paper mentions maybe_unused and gnu::nonstring attributes as examples
where they can be useful.  Not sure about either of them.
For maybe_unused, the thing is that both GCC and clang already don't
diagnose maybe unused for the structured binding identifiers, because it
would be a false positive too often; and there is no easy way to find out
if a structured binding has been written with the P0609R3 paper in mind or
not (maybe we could turn it on if in the structured binding is any
attribute, even if just [[]] and record that as a flag on the whole
underlying decl, so that we'd diagnose
  auto [a, b, c[[]]] = d;
  // use a, c but not b
but not
  auto [e, f, g] = d;
  // use e, g but not f
).  For gnu::nonstring, the issue is that we currently don't allow the
attribute on references to char * or references to char[], just on
char */char[].  I've filed a PR for that.

The first testcase in the patch tests it on [[]] and [[maybe_unused]],
just whether it is parsed properly, second on gnu::deprecated, which
works.  Haven't used deprecated attribute because the paper said that
attribute is for further investigation.
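
As a rough usage sketch (not one of the new testcases), an attributed
binding can be guarded by the bumped feature-test macro so that the same
source still compiles as plain C++17.  BINDING_ATTR, Triple and
first_plus_third are hypothetical names for this example only.

```cpp
// Expand to a real attribute only where P0609R3 is advertised.
#if defined (__cpp_structured_bindings) && __cpp_structured_bindings >= 202403L
# define BINDING_ATTR(attr) [[attr]]
#else
# define BINDING_ATTR(attr)
#endif

int
first_plus_third ()
{
  struct Triple { int a, b, c; } d { 1, 2, 3 };
  // b is intentionally unused; the attribute documents that where supported.
  auto [a, b BINDING_ATTR (maybe_unused), c] = d;
  return a + c;
}
```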

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-29  Jakub Jelinek  

PR c++/114456
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_structured_bindings for C++26 to 202403L rather than
201606L.
gcc/cp/
* parser.cc (cp_parser_decomposition_declaration): Implement C++26
P0609R3 - Attributes for Structured Bindings.  Parse attributed
identifier lists for structured binding declarations, pass the
attributes to start_decl.
gcc/testsuite/
* g++.dg/cpp26/decomp1.C: New test.
* g++.dg/cpp26/decomp2.C: New test.
* g++.dg/cpp26/feat-cxx26.C (__cpp_structured_bindings): Expect
202403 rather than 201606.

--- gcc/cp/parser.cc.jj 2024-04-26 11:42:24.653016208 +0200
+++ gcc/cp/parser.cc 2024-04-26 13:59:17.791482874 +0200
@@ -16075,13 +16075,37 @@ cp_parser_decomposition_declaration (cp_
 
   /* Parse the identifier-list.  */
   auto_vec v;
+  bool attr_diagnosed = false;
+  int first_attr = -1;
+  unsigned int cnt = 0;
   if (!cp_lexer_next_token_is (parser->lexer, CPP_CLOSE_SQUARE))
 while (true)
   {
cp_expr e = cp_parser_identifier (parser);
if (e.get_value () == error_mark_node)
  break;
+   tree attr = NULL_TREE;
+   if (cp_next_tokens_can_be_std_attribute_p (parser))
+ {
+   if (cxx_dialect >= cxx17 && cxx_dialect < cxx26 && !attr_diagnosed)
+ {
+   pedwarn (cp_lexer_peek_token (parser->lexer)->location,
+OPT_Wc__26_extensions,
+"structured bindings with attributed identifiers "
+"only available with %<-std=c++2c%> or "
+"%<-std=gnu++2c%>");
+   attr_diagnosed = true;
+ }
+   attr = cp_parser_std_attribute_spec_seq (parser);
+   if (attr == error_mark_node)
+ attr = NULL_TREE;
+   if (attr && first_attr == -1)
+ first_attr = v.length ();
+ }
v.safe_push (e);
+   ++cnt;
+   if (first_attr != -1)
+ v.safe_push (attr);
if (!cp_lexer_next_token_is (parser->lexer, CPP_COMMA))
  break;
cp_lexer_consume_token (parser->lexer);
@@ -16139,8 +16163,11 @@ cp_parser_decomposition_declaration (cp_
  declarator->id_loc = e.get_location ();
}
   tree elt_pushed_scope;
+  tree attr = NULL_TREE;
+  if (first_attr != -1 && i >= (unsigned) first_attr)
+   attr = v[++i].get_value ();
   tree decl2 = start_decl (declarator, &decl_specs, SD_DECOMPOSITION,
-  NULL_TREE, NULL_TREE, &elt_pushed_scope);
+  NULL_TREE, attr, &elt_pushed_scope);
   if (decl2 == error_mark_node)
decl = error_mark_node;
   else if (decl != error_mark_node && DECL_CHAIN (decl2) != prev)
@@ -16183,7 +16210,7 @@ cp_parser_decomposition_declaration (cp_
 
   if (decl != error_mark_node)
{
- cp_decomp decomp = { prev, v.length () };
+ cp_decomp decomp = { prev, cnt };
  cp_finish_decl (decl, initializer, non_constant_p, NULL_TREE,
  (is_direct_init ? LOOKUP_NORMAL : LOOKUP_IMPLICIT),
  &decomp);
@@ -16193,7 +16220,7 @@ cp_parser_decomposition_declaration (cp_
   else if (decl != error_mark_node)
 {
   *maybe_range_for_decl = prev;
-  cp_decomp decomp = { prev, v.length

[PATCH] c++, v5: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]

2024-04-25 Thread Jakub Jelinek
On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote:
> Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but
> rather set it to some stub like void_node?

I'll try that in stage1.

> Though with all these changes, it's probably better to go with your first
> patch for GCC 14 and delay this approach to 15.  Your v1 patch is OK for 14.

Just for the record, the following patch passed bootstrap/regtest on
x86_64-linux and i686-linux.  But I've committed the v1 version instead, with
the addition of the comdat2.C and comdat5.C testcases from this patch, and
will post an incremental diff in stage1.

Thanks.

2024-04-25  Jakub Jelinek  
Jason Merrill  

PR lto/113208
* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
for implicit instantiations of maybe in charge ctors/dtors
declared inline.
(c_parse_final_cleanups): Don't skip used same body aliases which
have non-NULL DECL_SAVED_TREE on the alias target.  Formatting fixes.
* optimize.cc (can_alias_cdtor): Adjust condition, for
HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
if not DECL_INTERFACE_KNOWN.
* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
set.

* g++.dg/abi/comdat2.C: New test.
* g++.dg/abi/comdat3.C: New test.
* g++.dg/abi/comdat4.C: New test.
* g++.dg/abi/comdat5.C: New test.
* g++.dg/lto/pr113208_0.C: New test.
* g++.dg/lto/pr113208_1.C: New file.
* g++.dg/lto/pr113208.h: New file.

--- gcc/cp/decl2.cc.jj  2024-04-24 18:28:22.299513620 +0200
+++ gcc/cp/decl2.cc 2024-04-25 16:19:17.385547357 +0200
@@ -3312,16 +3312,23 @@ tentative_decl_linkage (tree decl)
 linkage of all functions, and as that causes writes to
 the data mapped in from the PCH file, it's advantageous
 to mark the functions at this point.  */
- if (DECL_DECLARED_INLINE_P (decl)
- && (!DECL_IMPLICIT_INSTANTIATION (decl)
- || DECL_DEFAULTED_FN (decl)))
+ if (DECL_DECLARED_INLINE_P (decl))
{
- /* This function must have external linkage, as
-otherwise DECL_INTERFACE_KNOWN would have been
-set.  */
- gcc_assert (TREE_PUBLIC (decl));
- comdat_linkage (decl);
- DECL_INTERFACE_KNOWN (decl) = 1;
+ if (!DECL_IMPLICIT_INSTANTIATION (decl)
+ || DECL_DEFAULTED_FN (decl))
+   {
+ /* This function must have external linkage, as
+otherwise DECL_INTERFACE_KNOWN would have been
+set.  */
+ gcc_assert (TREE_PUBLIC (decl));
+ comdat_linkage (decl);
+ DECL_INTERFACE_KNOWN (decl) = 1;
+   }
+ else if (DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
+   /* For implicit instantiations of cdtors try to make
+  it comdat, so that maybe_clone_body can use aliases.
+  See PR113208.  */
+   maybe_make_one_only (decl);
}
}
   else if (VAR_P (decl))
@@ -5264,7 +5271,19 @@ c_parse_final_cleanups (void)
generate_tls_wrapper (decl);
 
  if (!DECL_SAVED_TREE (decl))
-   continue;
+   {
+ cgraph_node *node;
+ tree tgt;
+ /* Even when maybe_clone_body created same body alias
+has no DECL_SAVED_TREE, if its alias target does,
+don't skip it.  */
+ if (!DECL_CLONED_FUNCTION (decl)
+ || !(node = cgraph_node::get (decl))
+ || !node->cpp_implicit_alias
+ || !(tgt = node->get_alias_target_tree ())
+ || !DECL_SAVED_TREE (tgt))
+   continue;
+   }
 
  cgraph_node *node = cgraph_node::get_create (decl);
 
@@ -5292,7 +5311,7 @@ c_parse_final_cleanups (void)
node = node->get_alias_target ();
 
  node->call_for_symbol_thunks_and_aliases (clear_decl_external,
- NULL, true);
+   NULL, true);
  /* If we mark !DECL_EXTERNAL one of the symbols in some comdat
 group, we need to mark all symbols in the same comdat group
 that way.  */
@@ -5302,7 +5321,7 @@ c_parse_final_cleanups (void)
 next != node;
 next = dyn_cast (next->same_comdat_group))
  next->call_for_symbol_thunks_and_aliases (clear_decl_external,
- NULL, true);
+   NULL, true);
  

[PATCH] libgcc: Don't use weakrefs for glibc 2.34

2024-04-25 Thread Jakub Jelinek
Hi!

glibc 2.34 and later doesn't have a separate libpthread (libpthread.so.0 is a
dummy shared library with just some symbol versions for compatibility, but
all the pthread_* APIs are in libc.so.6).
So, we don't need to do the .weakref dances to check whether a program
has been linked with -lpthread or not; in dynamically linked apps those
checks will always be true anyway.
In -static linking, this fixes various issues people had when only some
parts of libpthread.a got linked in and they saw weird crashes.  A hack for
that was what e.g. some Fedora glibcs used, where libpthread.a was a library
containing just one giant *.o file which had all the normal libpthread.a
*.o files linked with -r together.

libstdc++-v3 actually does something like this already since r10-10928;
the following patch is meant to fix it even for libgfortran, libobjc and
whatever else uses gthr.h.
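
A stand-alone sketch of the same preprocessor pattern, for illustration only.
DEMO_USE_WEAK and demo_uses_weak_refs are hypothetical names standing in for
GTHREAD_USE_WEAK; the real gthr.h has further conditions before this point
that are not modelled here.  It assumes that any standard header pulls in
glibc's <features.h>, which is where __GLIBC_PREREQ comes from.

```cpp
#include <cstdio>  // On glibc this indirectly includes <features.h>.

// __GLIBC_PREREQ is a function-like macro, so it must not appear in an #if
// when it is undefined; hence the separate outer #ifdef.
#ifdef __GLIBC_PREREQ
#if __GLIBC_PREREQ(2, 34)
// All pthread_* APIs live in libc itself, no weak references needed.
#define DEMO_USE_WEAK 0
#endif
#endif

// Fallback for older glibc and for non-glibc systems in this sketch.
#ifndef DEMO_USE_WEAK
#define DEMO_USE_WEAK 1
#endif

// Lets ordinary code query the outcome of the preprocessor decision.
constexpr bool demo_uses_weak_refs () { return DEMO_USE_WEAK != 0; }
```

The nested #ifdef/#if is the important bit: writing
#if defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 34) would still be a syntax
error on systems where the macro doesn't exist.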

Bootstrapped/regtested on x86_64-linux and i686-linux (with glibc 2.35), ok
for trunk?

2024-04-25  Jakub Jelinek  

* gthr.h (GTHREAD_USE_WEAK): Redefine to 0 for GLIBC 2.34 or later.

--- libgcc/gthr.h.jj 2024-01-03 12:07:28.623363560 +0100
+++ libgcc/gthr.h   2024-04-25 12:09:39.708622613 +0200
@@ -141,6 +141,15 @@ see the files COPYING3 and COPYING.RUNTI
 #define GTHREAD_USE_WEAK 0
 #endif
 
+#ifdef __GLIBC_PREREQ
+#if __GLIBC_PREREQ(2, 34)
+/* glibc 2.34 and later has all pthread_* APIs inside of libc,
+   no need to link separately with -lpthread.  */
+#undef GTHREAD_USE_WEAK
+#define GTHREAD_USE_WEAK 0
+#endif
+#endif
+
 #ifndef GTHREAD_USE_WEAK
 #define GTHREAD_USE_WEAK 1
 #endif

Jakub



[committed] openmp: Copy DECL_LANG_SPECIFIC and DECL_LANG_FLAG_? to tree-nested decl copy [PR114825]

2024-04-25 Thread Jakub Jelinek
Hi!

tree-nested.cc creates artificial VAR_DECLs in 2 spots; one of them is used
both for debug info and OpenMP/OpenACC lowering purposes, the other solely for
OpenMP/OpenACC lowering purposes.
When the decls are used in OpenMP/OpenACC lowering, the OMP langhooks (mostly
Fortran, C just a little and C++ doesn't have nested functions) then inspect
the flags on the vars and based on that decide how to lower the corresponding
clauses.

Unfortunately we weren't copying DECL_LANG_SPECIFIC and DECL_LANG_FLAG_?, so
the langhooks made their decisions based on the default flags instead.
As the original decl isn't necessarily a VAR_DECL (it could be e.g. a
PARM_DECL), using copy_node wouldn't work properly, so this patch just copies
those flags in addition to the other flags it was copying already.  And I've
removed code duplication by introducing a helper function which does the
copying common to both uses.
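
To illustrate the token-pasting trick the new helper uses for the flags, here
is a self-contained sketch.  DemoDecl, the DEMO_FLAG_n accessors and
copy_lang_bits are hypothetical stand-ins, not GCC's tree API, and only three
flags are modelled instead of the nine the patch copies.

```cpp
// Hypothetical stand-in for a decl with language-specific bits.
struct DemoDecl {
  bool flag_0;
  bool flag_1;
  bool flag_2;
  int lang_specific;
};

// Accessor macros in the style of DECL_LANG_FLAG_n (NODE).
#define DEMO_FLAG_0(d) ((d).flag_0)
#define DEMO_FLAG_1(d) ((d).flag_1)
#define DEMO_FLAG_2(d) ((d).flag_2)

// Copies the language-specific payload and every flag from SRC into a copy
// of DST and returns that copy.
constexpr DemoDecl copy_lang_bits (DemoDecl dst, const DemoDecl &src)
{
  dst.lang_specific = src.lang_specific;
  // ## pastes the flag number onto the accessor name, so one line per flag
  // is enough and the list is easy to audit.
#define COPY_FLAG(n) DEMO_FLAG_##n (dst) = DEMO_FLAG_##n (src)
  COPY_FLAG (0); COPY_FLAG (1); COPY_FLAG (2);
#undef COPY_FLAG
  return dst;
}
```

Defining the macro right before its uses and undefining it right after, as
the patch does with COPY_DLF, keeps the short name from leaking into the rest
of the file.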

Bootstrapped/regtested on x86_64-linux and i686-linux, committed so far to
trunk.

2024-04-25  Jakub Jelinek  

PR fortran/114825
* tree-nested.cc (get_debug_decl): New function.
(get_nonlocal_debug_decl): Use it.
(get_local_debug_decl): Likewise.

* gfortran.dg/gomp/pr114825.f90: New test.

--- gcc/tree-nested.cc.jj   2024-01-29 09:41:19.804391621 +0100
+++ gcc/tree-nested.cc  2024-04-24 18:02:55.103841888 +0200
@@ -1047,6 +1047,37 @@ get_frame_field (struct nesting_info *in
 
 static void note_nonlocal_vla_type (struct nesting_info *info, tree type);
 
+/* Helper for get_nonlocal_debug_decl and get_local_debug_decl.  */
+
+static tree
+get_debug_decl (tree decl)
+{
+  tree new_decl
+= build_decl (DECL_SOURCE_LOCATION (decl),
+ VAR_DECL, DECL_NAME (decl), TREE_TYPE (decl));
+  DECL_ARTIFICIAL (new_decl) = DECL_ARTIFICIAL (decl);
+  DECL_IGNORED_P (new_decl) = DECL_IGNORED_P (decl);
+  TREE_THIS_VOLATILE (new_decl) = TREE_THIS_VOLATILE (decl);
+  TREE_SIDE_EFFECTS (new_decl) = TREE_SIDE_EFFECTS (decl);
+  TREE_READONLY (new_decl) = TREE_READONLY (decl);
+  TREE_ADDRESSABLE (new_decl) = TREE_ADDRESSABLE (decl);
+  DECL_SEEN_IN_BIND_EXPR_P (new_decl) = 1;
+  if ((TREE_CODE (decl) == PARM_DECL
+   || TREE_CODE (decl) == RESULT_DECL
+   || VAR_P (decl))
+  && DECL_BY_REFERENCE (decl))
+DECL_BY_REFERENCE (new_decl) = 1;
+  /* Copy DECL_LANG_SPECIFIC and DECL_LANG_FLAG_* for OpenMP langhook
+ purposes.  */
+  DECL_LANG_SPECIFIC (new_decl) = DECL_LANG_SPECIFIC (decl);
+#define COPY_DLF(n) DECL_LANG_FLAG_##n (new_decl) = DECL_LANG_FLAG_##n (decl)
+  COPY_DLF (0); COPY_DLF (1); COPY_DLF (2); COPY_DLF (3);
+  COPY_DLF (4); COPY_DLF (5); COPY_DLF (6); COPY_DLF (7);
+  COPY_DLF (8);
+#undef COPY_DLF
+  return new_decl;
+}
+
 /* A subroutine of convert_nonlocal_reference_op.  Create a local variable
in the nested function with DECL_VALUE_EXPR set to reference the true
variable in the parent function.  This is used both for debug info
@@ -1094,21 +1125,8 @@ get_nonlocal_debug_decl (struct nesting_
 x = build_simple_mem_ref_notrap (x);
 
   /* ??? We should be remapping types as well, surely.  */
-  new_decl = build_decl (DECL_SOURCE_LOCATION (decl),
-VAR_DECL, DECL_NAME (decl), TREE_TYPE (decl));
+  new_decl = get_debug_decl (decl);
   DECL_CONTEXT (new_decl) = info->context;
-  DECL_ARTIFICIAL (new_decl) = DECL_ARTIFICIAL (decl);
-  DECL_IGNORED_P (new_decl) = DECL_IGNORED_P (decl);
-  TREE_THIS_VOLATILE (new_decl) = TREE_THIS_VOLATILE (decl);
-  TREE_SIDE_EFFECTS (new_decl) = TREE_SIDE_EFFECTS (decl);
-  TREE_READONLY (new_decl) = TREE_READONLY (decl);
-  TREE_ADDRESSABLE (new_decl) = TREE_ADDRESSABLE (decl);
-  DECL_SEEN_IN_BIND_EXPR_P (new_decl) = 1;
-  if ((TREE_CODE (decl) == PARM_DECL
-   || TREE_CODE (decl) == RESULT_DECL
-   || VAR_P (decl))
-  && DECL_BY_REFERENCE (decl))
-DECL_BY_REFERENCE (new_decl) = 1;
 
   SET_DECL_VALUE_EXPR (new_decl, x);
   DECL_HAS_VALUE_EXPR_P (new_decl) = 1;
@@ -1892,21 +1910,8 @@ get_local_debug_decl (struct nesting_inf
   x = info->frame_decl;
   x = build3 (COMPONENT_REF, TREE_TYPE (field), x, field, NULL_TREE);
 
-  new_decl = build_decl (DECL_SOURCE_LOCATION (decl),
-VAR_DECL, DECL_NAME (decl), TREE_TYPE (decl));
+  new_decl = get_debug_decl (decl);
   DECL_CONTEXT (new_decl) = info->context;
-  DECL_ARTIFICIAL (new_decl) = DECL_ARTIFICIAL (decl);
-  DECL_IGNORED_P (new_decl) = DECL_IGNORED_P (decl);
-  TREE_THIS_VOLATILE (new_decl) = TREE_THIS_VOLATILE (decl);
-  TREE_SIDE_EFFECTS (new_decl) = TREE_SIDE_EFFECTS (decl);
-  TREE_READONLY (new_decl) = TREE_READONLY (decl);
-  TREE_ADDRESSABLE (new_decl) = TREE_ADDRESSABLE (decl);
-  DECL_SEEN_IN_BIND_EXPR_P (new_decl) = 1;
-  if ((TREE_CODE (decl) == PARM_DECL
-   || TREE_CODE (decl) == RESULT_DECL
-   || VAR_P (decl))
-  && DECL_BY_REFERENCE (decl))
-DECL_BY_REFERENCE (new_decl) = 1;
 
   SET_DECL_VALUE_EXPR (new_decl, x);
   DECL_HAS_VAL

Re: [PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]

2024-04-25 Thread Jakub Jelinek
On Thu, Apr 25, 2024 at 02:02:32PM +0200, Jakub Jelinek wrote:
> I've tried the following patch, but unfortunately that led to a large
> number of regressions:
> +FAIL: g++.dg/cpp0x/initlist25.C  -std=c++17 (test for excess errors)

So the reduced testcase for this is
template <typename T, typename U> struct A {
  T a1;
  U a2;
  template <typename V, typename W, bool = true>
  constexpr A(V &&x, W &&y) : a1(x), a2(y) {}
};
template <typename> struct B;
namespace std {
template <typename T> struct initializer_list {
  int *_M_array;
  decltype (sizeof 0) _M_len;
};
}
template <typename T, typename U> struct C {
  void foo (std::initializer_list<A<const T, U>>);
};
template <typename> struct D;
template <typename T, typename = D<T>, typename = B<T>>
struct E { E (const char *); ~E (); };
int
main ()
{
  C<E<char>, E<char>> m;
  m.foo ({{"t", "t"}, {"y", "y"}});
}
Without the patch I've just posted or even with the earlier version
of the patch the
_ZN1AIK1EIc1DIcE1BIcEES5_EC[12]IRA2_KcSB_Lb1EEEOT_OT0_
ctors were emitted, but with this patch they are unresolved externals.

The reason is that the code actually uses (calls) the
_ZN1AIK1EIc1DIcE1BIcEES5_EC1IRA2_KcSB_Lb1EEEOT_OT0_
__ct_comp constructor, that one has TREE_USED, while the
_ZN1AIK1EIc1DIcE1BIcEES5_EC2IRA2_KcSB_Lb1EEEOT_OT0_
__ct_base constructor is not TREE_USED.

But the c_parse_final_cleanups loop over
FOR_EACH_VEC_SAFE_ELT (deferred_fns, i, decl)
will ignore the TREE_USED __ct_comp because it is an alias
and so has !DECL_SAVED_TREE:
5273      if (!DECL_SAVED_TREE (decl))
5274        continue;

With the following incremental patch the tests in make check-g++
(haven't tried the coroutine one) which failed with the earlier patch
now pass.
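
As a rough model of the condition the incremental patch below introduces
(illustration only): DemoFn and skip_in_final_cleanups are hypothetical, and
the cgraph node lookup is collapsed into plain fields here.

```cpp
// Hypothetical, much simplified view of a deferred function.
struct DemoFn {
  bool has_saved_tree;         // models DECL_SAVED_TREE (decl) != NULL
  bool is_clone;               // models DECL_CLONED_FUNCTION (decl)
  bool is_implicit_alias;      // models node->cpp_implicit_alias
  const DemoFn *alias_target;  // models node->get_alias_target_tree ()
};

// True if the loop over deferred functions would skip FN.
constexpr bool skip_in_final_cleanups (const DemoFn &fn)
{
  // A function with a body is never skipped.  One without a body is kept
  // only if it is a same body alias whose target does have a body.
  return !fn.has_saved_tree
         && (!fn.is_clone
             || !fn.is_implicit_alias
             || fn.alias_target == nullptr
             || !fn.alias_target->has_saved_tree);
}

// __ct_base with a body, and __ct_comp as a bodiless alias to it.
constexpr DemoFn demo_base_ctor = { true, true, false, nullptr };
constexpr DemoFn demo_complete_ctor = { false, true, true, &demo_base_ctor };
// A plain declaration that never got a body.
constexpr DemoFn demo_declared_only = { false, false, false, nullptr };
```

With the old unconditional continue, demo_complete_ctor would have been
skipped just like demo_declared_only, which matches the unresolved __ct_comp
symbol described above.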

--- gcc/cp/decl2.cc.jj  2024-04-25 10:52:21.057535959 +0200
+++ gcc/cp/decl2.cc 2024-04-25 16:19:17.385547357 +0200
@@ -5271,7 +5271,19 @@ c_parse_final_cleanups (void)
generate_tls_wrapper (decl);
 
  if (!DECL_SAVED_TREE (decl))
-   continue;
+   {
+ cgraph_node *node;
+ tree tgt;
+ /* Even when maybe_clone_body created same body alias
+has no DECL_SAVED_TREE, if its alias target does,
+don't skip it.  */
+ if (!DECL_CLONED_FUNCTION (decl)
+ || !(node = cgraph_node::get (decl))
+ || !node->cpp_implicit_alias
+ || !(tgt = node->get_alias_target_tree ())
+ || !DECL_SAVED_TREE (tgt))
+   continue;
+   }
 
  cgraph_node *node = cgraph_node::get_create (decl);
 
@@ -5299,7 +5311,7 @@ c_parse_final_cleanups (void)
node = node->get_alias_target ();
 
  node->call_for_symbol_thunks_and_aliases (clear_decl_external,
- NULL, true);
+   NULL, true);
  /* If we mark !DECL_EXTERNAL one of the symbols in some comdat
 group, we need to mark all symbols in the same comdat group
 that way.  */
@@ -5309,7 +5321,7 @@ c_parse_final_cleanups (void)
 next != node;
 next = dyn_cast (next->same_comdat_group))
  next->call_for_symbol_thunks_and_aliases (clear_decl_external,
- NULL, true);
+   NULL, true);
}
 
  /* If we're going to need to write this function out, and


Jakub



Re: [wwwdocs] Porting-to-14: Mention new pragma GCC Target behavior

2024-04-25 Thread Jakub Jelinek
On Thu, Apr 25, 2024 at 02:34:22PM +0200, Martin Jambor wrote:
> when looking at a package build issue with GCC 14, Michal Jireš noted a
> different behavior of pragma GCC Target.  This snippet tries to describe
> the gist of the problem.  I have left it in the C section even though it
> is not really C specific, but could not think of a good name for a new
> section for it.  Ideas (and any other suggestions for improvements)
> welcome, of course.

The change was more subtle.
We used to define/undefine the ISA macros in C in GCC 13 and older as well,
but only when using the integrated preprocessor during compilation,
so it didn't work that way with -save-temps or separate -E and -S/-c
steps.
In C++, on the other hand, it behaved as if the defines/undefines weren't
done at all (they were done, but only after preprocessing/lexing everything,
so they didn't affect anything).
In GCC 14, C++ behaves the same as C did in older versions, and
additionally the macros are defined/undefined also when using separate
preprocessing, in both C and C++.
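
A small probe to make the difference visible, for illustration only.  The
DEMO_* macros and the two helper functions are made-up names, avx2 is just an
example ISA, and everything is guarded so that on non-GCC compilers or non-x86
targets both values are simply 0.

```cpp
#if defined(__GNUC__) && !defined(__clang__) \
    && (defined(__x86_64__) || defined(__i386__))

// Record whether the ISA macro is already defined by the command line.
#ifdef __AVX2__
#define DEMO_AVX2_BEFORE 1
#else
#define DEMO_AVX2_BEFORE 0
#endif

#pragma GCC push_options
#pragma GCC target ("avx2")
// Record whether the preprocessor sees the macro inside the pragma region.
#ifdef __AVX2__
#define DEMO_AVX2_INSIDE 1
#else
#define DEMO_AVX2_INSIDE 0
#endif
#pragma GCC pop_options

#else
#define DEMO_AVX2_BEFORE 0
#define DEMO_AVX2_INSIDE 0
#endif

constexpr bool avx2_macro_before_pragma () { return DEMO_AVX2_BEFORE != 0; }
constexpr bool avx2_macro_inside_pragma () { return DEMO_AVX2_INSIDE != 0; }
```

Going by the description above, the "inside" value should differ between
GCC 13 and GCC 14 when this is compiled as C++ without -mavx2, and between
integrated and separate preprocessing in older C; I haven't listed concrete
outputs because they depend on the compiler version and flags.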

Jakub



[PATCH] c++, v4: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]

2024-04-25 Thread Jakub Jelinek
On Wed, Apr 24, 2024 at 08:43:46PM -0400, Jason Merrill wrote:
> > Then can_alias_cdtor would return false, because it ends with:
> >   /* Don't use aliases for weak/linkonce definitions unless we can put both
> >      symbols in the same COMDAT group.  */
> >   return (DECL_INTERFACE_KNOWN (fn)
> >           && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
> >           && (!DECL_ONE_ONLY (fn)
> >               || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> > Should we change that DECL_INTERFACE_KNOWN (fn) in there to
> > (DECL_INTERFACE_KNOWN (fn) || something) then and what that
> > something should be?  HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn)?
> 
> Yes, I think reorganize to
> 
> ((DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn) && !DECL_ONE_ONLY (fn))
>  || (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
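
To compare the two conditions quoted above side by side, here is a sketch
that treats them as pure boolean functions.  DemoFnFlags, DemoTarget and both
function names are hypothetical; the expressions themselves are transcribed
from the quoted text, not from the final committed code.

```cpp
// Per-function properties, modelling DECL_INTERFACE_KNOWN, DECL_WEAK and
// DECL_ONE_ONLY.
struct DemoFnFlags { bool interface_known; bool weak; bool one_only; };

// Target properties, modelling SUPPORTS_ONE_ONLY and HAVE_COMDAT_GROUP.
struct DemoTarget { bool supports_one_only; bool have_comdat_group; };

// The condition at the end of can_alias_cdtor as quoted.
constexpr bool can_alias_old (DemoFnFlags f, DemoTarget t)
{
  return (f.interface_known
          && (t.supports_one_only || !f.weak)
          && (!f.one_only || (t.have_comdat_group && f.weak)));
}

// The reorganized condition suggested in the review.
constexpr bool can_alias_suggested (DemoFnFlags f, DemoTarget t)
{
  return ((f.interface_known && !f.weak && !f.one_only)
          || (t.have_comdat_group && f.one_only));
}
```

The interesting input is a weak, one-only function whose interface isn't
known yet on a target with COMDAT groups: the old form rejects it, the
suggested form accepts it.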

I've tried the following patch, but unfortunately that led to a large
number of regressions:
+FAIL: g++.dg/coroutines/torture/co-yield-04-complex-local-state.C (test for excess errors)
+FAIL: g++.dg/coroutines/torture/func-params-08.C (test for excess errors)
+FAIL: g++.dg/coroutines/torture/func-params-09-awaitable-parms.C (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++11 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++14 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++17 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp0x/constexpr-initlist.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++11 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++14 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++17 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp0x/initlist25.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp1y/pr95226.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp1y/pr95226.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp1y/pr95226.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp1z/decomp12.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp1z/decomp12.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp1z/eval-order2.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp1z/eval-order2.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp1z/eval-order2.C  -std=c++26 (test for excess errors)
+FAIL: g++.dg/cpp2a/srcloc17.C  -std=c++20 (test for excess errors)
+FAIL: g++.dg/cpp2a/srcloc17.C  -std=c++23 (test for excess errors)
+FAIL: g++.dg/cpp2a/srcloc17.C  -std=c++26 (test for excess errors)
+FAIL: g++.old-deja/g++.jason/template31.C  -std=c++20 (test for excess errors)
+FAIL: g++.old-deja/g++.jason/template31.C  -std=c++23 (test for excess errors)
+FAIL: g++.old-deja/g++.jason/template31.C  -std=c++26 (test for excess errors)
+FAIL: 20_util/unique_ptr/creation/for_overwrite.cc  -std=gnu++26 (test for excess errors)
+FAIL: 23_containers/span/cons_1_assert_neg.cc  -std=gnu++20 (test for excess errors)
+FAIL: 23_containers/span/cons_1_assert_neg.cc  -std=gnu++26 (test for excess errors)
+FAIL: 23_containers/span/cons_2_assert_neg.cc  -std=gnu++20 (test for excess errors)
+FAIL: 23_containers/span/cons_2_assert_neg.cc  -std=gnu++26 (test for excess errors)
+FAIL: std/ranges/repeat/1.cc  -std=gnu++23 (test for excess errors)
+FAIL: std/ranges/repeat/1.cc  -std=gnu++26 (test for excess errors)

Errors are like:
func-params-08.C:(.text._ZNSt12_Vector_baseIiSaIiEEC2Ev[_ZNSt12_Vector_baseIiSaIiEEC5Ev]+0x14): undefined reference to `_ZNSt12_Vector_baseIiSaIiEE12_Vector_implC1EvQ26is_default_constructible_vIN9__gnu_cxx14__alloc_traitsIT0_NS5_10value_typeEE6rebindIT_E5otherEE'
Though, libstdc++.so.6 abilist is the same.
Trying to debug it now.

2024-04-24  Jakub Jelinek  
Jason Merrill  

PR lto/113208
* decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
for implicit instantiations of maybe in charge ctors/dtors
declared inline.
* optimize.cc (can_alias_cdtor): Adjust condition, for
HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
if not DECL_INTERFACE_KNOWN.
* decl.cc (cxx_comdat_group): For DECL_CLONED_FUNCTION_P
functions if SUPPORTS_ONE_ONLY return DECL_COMDAT_GROUP if already
set.

* g++.dg/abi/comdat2.C: New test.
* g++.dg/abi/comdat3.C: New test.
* g++.dg/abi/comdat4.C: New test.
* g++.dg/abi/comdat5.C: New test.
* g++.dg/lto/pr113208_0.C: New test.
* g++.dg/lto/pr113208_1.C: New file.
* g++.dg/lto/pr113208.h: New 
