[PATCH] [PATCH] AArch64: add R30_REGNUM into shrink-wrapping separate

2022-02-24 Thread Dan Li via Gcc-patches
R30_REGNUM could also be used as a component in shrink-wrapping
separate, this patch enables it in aarch64.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_get_separate_components):
Remove bitmap clear of R30_REGNUM.
(aarch64_components_for_bb): Support R30_REGNUM as a component.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shrink_wrap_separate_1.c: New test.
---
 gcc/config/aarch64/aarch64.cc   |  4 ++--
 .../gcc.target/aarch64/shrink_wrap_separate_1.c | 17 +
 2 files changed, 19 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 8bcee8be9eb..6e1589b0312 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -8463,7 +8463,6 @@ aarch64_get_separate_components (void)
   if (reg1 != INVALID_REGNUM)
 bitmap_clear_bit (components, reg1);
 
-  bitmap_clear_bit (components, LR_REGNUM);
   bitmap_clear_bit (components, SP_REGNUM);
 
   return components;
@@ -8500,7 +8499,8 @@ aarch64_components_for_bb (basic_block bb)
   /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets.  */
   for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++)
 if (!fixed_regs[regno]
-   && !crtl->abi->clobbers_full_reg_p (regno)
+   && (regno == R30_REGNUM
+   || !crtl->abi->clobbers_full_reg_p (regno))
&& (TEST_HARD_REG_BIT (extra_caller_saves, regno)
|| bitmap_bit_p (in, regno)
|| bitmap_bit_p (gen, regno)
diff --git a/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c 
b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c
new file mode 100644
index 000..34002705ace
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/shrink_wrap_separate_1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fomit-frame-pointer -fdump-rtl-pro_and_epilogue" } */
+
+void f();
+
+int g(int x)
+{
+  if (x == 0)
+{
+  __asm__ ("":::"x19", "x20");
+  return 1;
+}
+  f();
+  return 2;
+}
+
+/* { dg-final { scan-rtl-dump {The components we wrap separately are \[sep 
30\]} "pro_and_epilogue"  } } */
-- 
2.17.1



Re: [PATCH]middle-end vect: Simplify and extend the complex numbers validation routines. (GCC-11 Backport)

2022-02-24 Thread Richard Biener via Gcc-patches
On Thu, 24 Feb 2022, Tamar Christina wrote:

> Hi All,
> 
> This is a backport of the GCC 12 patch backporting only the correctness part 
> of
> the fix.   This also backports two small helper functions and documentation
> update on the optabs.
> 
> The patch boosts the analysis for complex mul,fma and fms in order to ensure
> that it doesn't create an incorrect output.
> 
> Essentially it adds an extra verification to check that the two nodes it's 
> going
> to combine do the same operations on compatible values.  The reason it needs 
> to
> do this is that if one computation differs from the other then with the 
> current
> implementation we have no way to deal with it since we have to remove the
> permute.
> 
> When we can keep the permute around we can probably handle these by unrolling.
> 
> While implementing this since I have to do the traversal anyway I took 
> advantage
> of it by simplifying the code a bit.  Previously we would determine whether
> something is a conjugate and then try to figure out which conjugate it is and
> then try to see if the permutes match what we expect.
> 
> Now the code that does the traversal will detect this in one go and return to 
> us
> whether the operation is something that can be combined and whether a 
> conjugate
> is present.
> 
> Secondly because it does this I can now simplify the checking code itself to
> essentially just try to apply fixed patterns to each operation.
> 
> The patterns represent the order operations should appear in. For instance a
> complex MUL operation combines :
> 
>   Left 1 + Right 1
>   Left 2 + Right 2
> 
> with a permute on the nodes consisting of:
> 
>   { Even, Even } + { Odd, Odd  }
>   { Even, Odd  } + { Odd, Even }
> 
> By abstracting over these patterns the checking code becomes quite simple.
> 
> As part of this I was checking the order of the operands which was left in
> "slp" order. as in, the same order they showed up in during SLP, which means
> that the accumulator is first.  However it looks like I didn't document this.
> 
> I have this changed the order to match that of FMA and FMS which corrects the
> x86 codegen and will update the Arm targets.  This has now also been
> documented.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu and no regressions.
> 
> Ok for GCC-11?

OK.

Thanks,
Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/102819
>   PR tree-optimization/103169
>   * doc/md.texi: Update docs for cfms, cfma.
>   * tree-data-ref.h (same_data_refs): Accept optional offset.
>   * tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating
>   patterns.
>   (vect_normalize_conj_loc): Remove.
>   (is_eq_or_top): Change to take two nodes.
>   (enum _conj_status, compatible_complex_nodes_p,
>   vect_validate_multiplication): New.
>   (class complex_add_pattern, complex_add_pattern::matches,
>   complex_add_pattern::recognize, class complex_mul_pattern,
>   complex_mul_pattern::recognize, class complex_fms_pattern,
>   complex_fms_pattern::recognize,, class complex_fma_pattern,
>   complex_fma_pattern::recognize, class complex_operations_pattern,
>   complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
>   new cache.
>   (complex_fms_pattern::matches, complex_fma_pattern::matches,
>   complex_mul_pattern::matches): Pass new cache and use new validation
>   code.
>   * tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns,
>   vect_analyze_slp): Pass along cache.
>   (compatible_calls_p): Expose.
>   * tree-vectorizer.h (compatible_calls_p, slp_node_hash,
>   slp_compat_nodes_map_t): New.
>   (class vect_pattern): Update signatures include new cache.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/102819
>   PR tree-optimization/103169
>   * g++.dg/vect/pr99149.cc: xfail for now.
>   * gcc.dg/vect/complex/pr102819-1.c: New test.
>   * gcc.dg/vect/complex/pr102819-2.c: New test.
>   * gcc.dg/vect/complex/pr102819-3.c: New test.
>   * gcc.dg/vect/complex/pr102819-4.c: New test.
>   * gcc.dg/vect/complex/pr102819-5.c: New test.
>   * gcc.dg/vect/complex/pr102819-6.c: New test.
>   * gcc.dg/vect/complex/pr102819-7.c: New test.
>   * gcc.dg/vect/complex/pr102819-8.c: New test.
>   * gcc.dg/vect/complex/pr102819-9.c: New test.
>   * gcc.dg/vect/complex/pr103169.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 
> d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..ac7611008944abca08fe48cd7a74b8463f1573da
>  100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -6234,12 +6234,13 @@ Perform a vector multiply and accumulate that is 
> semantically the same as
>  a multiply and accumulate of complex numbers.
>  
>  @smallexample
> -  complex TYPE c[N];
> -  complex TYPE a[N];
> -  complex TYPE b[N];
> +  complex 

Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Richard Biener via Gcc-patches
On Thu, 24 Feb 2022, Qing Zhao wrote:

> I briefly checked all the usages of suppress_warning within the current gcc, 
> and see that most of them are not guarded by any condition. 
> 
> So, the current change should be fine without introducing new issues. -:)
> 
> Another thing is, if we use “warning_enable_at” to guard, I just checked, 
> this routine is not used by any routine right now, so it might be possible 
> that this 
> routine has some bug itself.  And now it’s the stage 4, we might need to be
> conservative. 
> 
> Based on this, I think that it might be better to put the change in as it 
> right now. 
> If we think that all the “suppress_warning” need to be guarded by a specific
> condition, we can do it in gcc13 earlier stage.
> 
> What’s your opinion?

I would agree here.  Maybe a compromise would be to simply
set gimple_set_no_warning () on the actual stmt.  That shouldn't
pollute hashtables or locations in any way and the loads are
artificial so I'm not sure what we should warn about there - in
the end there's always the store that can be diagnosed for out-of-bound
accesses and the like.

Richard.

> Qing
> 
> 
> > On Feb 24, 2022, at 9:13 AM, Jakub Jelinek  wrote:
> > 
> > On Thu, Feb 24, 2022 at 04:00:33PM +0100, Richard Biener wrote:
>  --- a/gcc/gimple-fold.cc
>  +++ b/gcc/gimple-fold.cc
>  @@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, 
>  bool full)
> else
>   {
> src = make_ssa_name (type);
>  -  g = gimple_build_assign (src, unshare_expr (dst));
>  +  tree tmp_dst = unshare_expr (dst);
>  +  /* The folding introduces a read from the tmp_dst, we 
>  should
>  + prevent uninitialized warning analysis from 
>  issuing warning
>  + for such fake read.  */
>  +  suppress_warning (tmp_dst, OPT_Wuninitialized);
> >>> 
> >>> I wonder if we shouldn't guard the suppress_warning call on
> >>> if (warn_uninitialized || warn_maybe_uninitialized)
> >>> because those warnings aren't on by default and the suppress_warning 
> >>> stuff,
> >>> especially when it could be done for many loads from the builtin means
> >>> populating hash tables with those.
> >> 
> >> Maybe that's something suppress_warning should do then?  OTOH you
> > 
> > Well, OPT_Wuninitialized is an argument why it can't.  The suppression
> > is using a single OPT_W*, but there are multiple different warnings
> > that care about that suppression, and suppress_warning can't know about it.
> > 
> >> don't know whether you're suppressing a warning in a region with
> >> -Wno-uninitialized but that's inlined into a -Wuninitialized
> >> function where then the false diagnostic pops up if we didn't
> >> suppress the warning ...
> > 
> > I think both -Wuninitialized and -Wmaybe-uninitialized aren't
> > Optimization or PerFunction, so they are global options.
> > On the other side, they can be locally changed through pragmas.
> > 
> > Maybe we could use
> >  if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
> >  || warning_enabled_at (buf->loc, OPT_Wmaybe_uninitialized))
> > if uninit pass uses the gimple_location of the read, that shouldn't
> > be really changing...
> > 
> > Jakub
> > 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-02-24 Thread Hongtao Liu via Gcc-patches
On Fri, Feb 25, 2022 at 1:50 PM liuhongt  wrote:
>
> The patch fixes ICE in ix86_gimple_fold_builtin.
>
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for main trunk?

> gcc/ChangeLog:
>
> PR target/104666
> * config/i386/i386-expand.cc
> (ix86_check_builtin_isa_match): New func.
> (ix86_expand_builtin): Move code to
> ix86_check_builtin_isa_match and call it.
> * config/i386/i386-protos.h
> (ix86_check_builtin_isa_match): Declare.
> * config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold
> builtin into gimple when isa mismatches.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr104666.c: New test.
> ---
>  gcc/config/i386/i386-expand.cc   | 97 ++--
>  gcc/config/i386/i386-protos.h|  5 ++
>  gcc/config/i386/i386.cc  |  4 +
>  gcc/testsuite/gcc.target/i386/pr104666.c | 49 
>  4 files changed, 115 insertions(+), 40 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104666.c
>
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index faa0191c6dd..1d132f0181d 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -12232,46 +12232,14 @@ ix86_expand_vec_set_builtin (tree exp)
>return target;
>  }
>
> -/* Expand an expression EXP that calls a built-in function,
> -   with result going to TARGET if that's convenient
> -   (and in mode MODE if that's convenient).
> -   SUBTARGET may be used as the target for computing one of EXP's operands.
> -   IGNORE is nonzero if the value is to be ignored.  */
> -
> -rtx
> -ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
> -machine_mode mode, int ignore)
> +/* Return true if the necessary isa options for this builtin exist,
> +   else false.
> +   fcode = DECL_MD_FUNCTION_CODE (fndecl);  */
> +bool
> +ix86_check_builtin_isa_match (unsigned int fcode,
> + HOST_WIDE_INT* pbisa,
> + HOST_WIDE_INT* pbisa2)
>  {
> -  size_t i;
> -  enum insn_code icode, icode2;
> -  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
> -  tree arg0, arg1, arg2, arg3, arg4;
> -  rtx op0, op1, op2, op3, op4, pat, pat2, insn;
> -  machine_mode mode0, mode1, mode2, mode3, mode4;
> -  unsigned int fcode = DECL_MD_FUNCTION_CODE (fndecl);
> -
> -  /* For CPU builtins that can be folded, fold first and expand the fold.  */
> -  switch (fcode)
> -{
> -case IX86_BUILTIN_CPU_INIT:
> -  {
> -   /* Make it call __cpu_indicator_init in libgcc. */
> -   tree call_expr, fndecl, type;
> -type = build_function_type_list (integer_type_node, NULL_TREE);
> -   fndecl = build_fn_decl ("__cpu_indicator_init", type);
> -   call_expr = build_call_expr (fndecl, 0);
> -   return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
> -  }
> -case IX86_BUILTIN_CPU_IS:
> -case IX86_BUILTIN_CPU_SUPPORTS:
> -  {
> -   tree arg0 = CALL_EXPR_ARG (exp, 0);
> -   tree fold_expr = fold_builtin_cpu (fndecl, );
> -   gcc_assert (fold_expr != NULL_TREE);
> -   return expand_expr (fold_expr, target, mode, EXPAND_NORMAL);
> -  }
> -}
> -
>HOST_WIDE_INT isa = ix86_isa_flags;
>HOST_WIDE_INT isa2 = ix86_isa_flags2;
>HOST_WIDE_INT bisa = ix86_builtins_isa[fcode].isa;
> @@ -12321,7 +12289,56 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> subtarget,
>bisa |= OPTION_MASK_ISA_SSE2;
>  }
>
> -  if ((bisa & isa) != bisa || (bisa2 & isa2) != bisa2)
> +  if (pbisa)
> +*pbisa = bisa;
> +  if (pbisa2)
> +*pbisa2 = bisa2;
> +
> +  return (bisa & isa) == bisa && (bisa2 & isa2) == bisa2;
> +}
> +
> +/* Expand an expression EXP that calls a built-in function,
> +   with result going to TARGET if that's convenient
> +   (and in mode MODE if that's convenient).
> +   SUBTARGET may be used as the target for computing one of EXP's operands.
> +   IGNORE is nonzero if the value is to be ignored.  */
> +
> +rtx
> +ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
> +machine_mode mode, int ignore)
> +{
> +  size_t i;
> +  enum insn_code icode, icode2;
> +  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
> +  tree arg0, arg1, arg2, arg3, arg4;
> +  rtx op0, op1, op2, op3, op4, pat, pat2, insn;
> +  machine_mode mode0, mode1, mode2, mode3, mode4;
> +  unsigned int fcode = DECL_MD_FUNCTION_CODE (fndecl);
> +  HOST_WIDE_INT bisa, bisa2;
> +
> +  /* For CPU builtins that can be folded, fold first and expand the fold.  */
> +  switch (fcode)
> +{
> +case IX86_BUILTIN_CPU_INIT:
> +  {
> +   /* Make it call __cpu_indicator_init in libgcc.  */
> +   tree call_expr, fndecl, type;
> +   type = build_function_type_list (integer_type_node, NULL_TREE);
> +   fndecl = build_fn_decl ("__cpu_indicator_init", type);
> +   call_expr = build_call_expr (fndecl, 0);
> + 

[PATCH] [i386] Don't fold builtin into gimple when isa mismatches.

2022-02-24 Thread liuhongt via Gcc-patches
The patch fixes ICE in ix86_gimple_fold_builtin.

gcc/ChangeLog:

PR target/104666
* config/i386/i386-expand.cc
(ix86_check_builtin_isa_match): New func.
(ix86_expand_builtin): Move code to
ix86_check_builtin_isa_match and call it.
* config/i386/i386-protos.h
(ix86_check_builtin_isa_match): Declare.
* config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold
builtin into gimple when isa mismatches.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104666.c: New test.
---
 gcc/config/i386/i386-expand.cc   | 97 ++--
 gcc/config/i386/i386-protos.h|  5 ++
 gcc/config/i386/i386.cc  |  4 +
 gcc/testsuite/gcc.target/i386/pr104666.c | 49 
 4 files changed, 115 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104666.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index faa0191c6dd..1d132f0181d 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -12232,46 +12232,14 @@ ix86_expand_vec_set_builtin (tree exp)
   return target;
 }
 
-/* Expand an expression EXP that calls a built-in function,
-   with result going to TARGET if that's convenient
-   (and in mode MODE if that's convenient).
-   SUBTARGET may be used as the target for computing one of EXP's operands.
-   IGNORE is nonzero if the value is to be ignored.  */
-
-rtx
-ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
-machine_mode mode, int ignore)
+/* Return true if the necessary isa options for this builtin exist,
+   else false.
+   fcode = DECL_MD_FUNCTION_CODE (fndecl);  */
+bool
+ix86_check_builtin_isa_match (unsigned int fcode,
+ HOST_WIDE_INT* pbisa,
+ HOST_WIDE_INT* pbisa2)
 {
-  size_t i;
-  enum insn_code icode, icode2;
-  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
-  tree arg0, arg1, arg2, arg3, arg4;
-  rtx op0, op1, op2, op3, op4, pat, pat2, insn;
-  machine_mode mode0, mode1, mode2, mode3, mode4;
-  unsigned int fcode = DECL_MD_FUNCTION_CODE (fndecl);
-
-  /* For CPU builtins that can be folded, fold first and expand the fold.  */
-  switch (fcode)
-{
-case IX86_BUILTIN_CPU_INIT:
-  {
-   /* Make it call __cpu_indicator_init in libgcc. */
-   tree call_expr, fndecl, type;
-type = build_function_type_list (integer_type_node, NULL_TREE); 
-   fndecl = build_fn_decl ("__cpu_indicator_init", type);
-   call_expr = build_call_expr (fndecl, 0); 
-   return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
-  }
-case IX86_BUILTIN_CPU_IS:
-case IX86_BUILTIN_CPU_SUPPORTS:
-  {
-   tree arg0 = CALL_EXPR_ARG (exp, 0);
-   tree fold_expr = fold_builtin_cpu (fndecl, );
-   gcc_assert (fold_expr != NULL_TREE);
-   return expand_expr (fold_expr, target, mode, EXPAND_NORMAL);
-  }
-}
-
   HOST_WIDE_INT isa = ix86_isa_flags;
   HOST_WIDE_INT isa2 = ix86_isa_flags2;
   HOST_WIDE_INT bisa = ix86_builtins_isa[fcode].isa;
@@ -12321,7 +12289,56 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
subtarget,
   bisa |= OPTION_MASK_ISA_SSE2;
 }
 
-  if ((bisa & isa) != bisa || (bisa2 & isa2) != bisa2)
+  if (pbisa)
+*pbisa = bisa;
+  if (pbisa2)
+*pbisa2 = bisa2;
+
+  return (bisa & isa) == bisa && (bisa2 & isa2) == bisa2;
+}
+
+/* Expand an expression EXP that calls a built-in function,
+   with result going to TARGET if that's convenient
+   (and in mode MODE if that's convenient).
+   SUBTARGET may be used as the target for computing one of EXP's operands.
+   IGNORE is nonzero if the value is to be ignored.  */
+
+rtx
+ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
+machine_mode mode, int ignore)
+{
+  size_t i;
+  enum insn_code icode, icode2;
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  tree arg0, arg1, arg2, arg3, arg4;
+  rtx op0, op1, op2, op3, op4, pat, pat2, insn;
+  machine_mode mode0, mode1, mode2, mode3, mode4;
+  unsigned int fcode = DECL_MD_FUNCTION_CODE (fndecl);
+  HOST_WIDE_INT bisa, bisa2;
+
+  /* For CPU builtins that can be folded, fold first and expand the fold.  */
+  switch (fcode)
+{
+case IX86_BUILTIN_CPU_INIT:
+  {
+   /* Make it call __cpu_indicator_init in libgcc.  */
+   tree call_expr, fndecl, type;
+   type = build_function_type_list (integer_type_node, NULL_TREE);
+   fndecl = build_fn_decl ("__cpu_indicator_init", type);
+   call_expr = build_call_expr (fndecl, 0);
+   return expand_expr (call_expr, target, mode, EXPAND_NORMAL);
+  }
+case IX86_BUILTIN_CPU_IS:
+case IX86_BUILTIN_CPU_SUPPORTS:
+  {
+   tree arg0 = CALL_EXPR_ARG (exp, 0);
+   tree fold_expr = fold_builtin_cpu (fndecl, );
+   gcc_assert (fold_expr != NULL_TREE);
+   return expand_expr (fold_expr, target, mode, EXPAND_NORMAL);
+  

Re: [PATCH] Don't do int cmoves for IEEE comparisons, PR target/104256.

2022-02-24 Thread Michael Meissner via Gcc-patches
On Thu, Feb 24, 2022 at 08:07:28AM +0100, Robin Dapp wrote:
> Hi,
> 
> > Robin's patch has the effct making rs6000_emit_int_cmove return false for
> > floating point comparisons, so I marked the bug as being a duplicate of PR
> > target/104335.
> 
> Didn't I just return false for MODE_CC?  This should not affect proper
> floating-point comparisons.  It looks like the problem was indeed caused
> by my original patch but then I wonder why I did not catch it in the
> first place despite running a Power9 bootstrap and regtest (with Fortran
> of course) that looked unchanged.

It only showed up with some specific options (-O1 -fnon-call-exceptions and
-mcpu set to power9 or power10).  If you use -O0, -O2, or -O3 it doesn't show
up, and if you don't use -fnon-call-exceptions it doesn't show up.  In
particular, you needed ISEL enabled (power8 doesn't enable it due to
performance reasons).  Bill Seuer reported the bug.  Maybe he has more details.

Returning false is fine in this case.  It just says that we can't do
conditional move (i.e. use the ISEL instruction).  The compiler will fall back
to doing a jump around a move.

However, in doing the patch, I tried to get the compiler to generate cmoves for
integer values with floating point conditions, and I just don't see it normally
being generated.

I suspect if we want to do it, it will be a much larger project for the GCC 13
time frame.  As I recall, there was one Spec test that really wanted to do
something like this, but to re-architect this will be a large undertaking.

At the moment, we allow mixed floating point test and conditional move:

float f1, f2;
double d1, d2, d3;

d1 = (f1 == f2) ? d2 : d3;

When I added the IEEE 128-bit support, it because a combinatorial explosion to
try and handle all possible compares and moves.  Eventually, we decided just to
only do the IEEE 128-bit cmove if the test and values being moved were the same
mode (i.e. KFmode or TFmode), and you couldn't mix float/double and __float128.

Int cmoves have traditionally just not thought of having floating point
comparisons.  They could, it just wasn't thought of.

> Shouldn't this have come up? I vaguely recall seeing maxloc FAILs at
> some point but not in the final runs.  Going to re-check because this
> would have helped not introduce the problem that late.

As I said, it needed some specific cases to get it to fail.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] Check if loading const from mem is faster

2022-02-24 Thread Jiufu Guo via Gcc-patches
Richard Biener  writes:

> On Thu, 24 Feb 2022, Jiufu Guo wrote:
>
>> Jiufu Guo via Gcc-patches  writes:
>> 
>> > Segher Boessenkool  writes:
>> >
>> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote:
>> >>> I'm assuming we're always dealing with
>> >>> 
>> >>>   (set (reg:MODE ..) )
>> >>> 
>> >>> here and CSE is not substituting into random places of an
>> >>> instruction(?).  I don't know what 'rtx_cost' should evaluate
>> >>> to for a constant, if it should implicitely evaluate the cost
>> >>> of putting the result into a register for example.
>> >>
>> >> rtx_cost is no good here (and in most places).  rtx_cost should be 0
>> >> for anything that is used as input in a machine instruction -- but you
>> >> need much more context to determine that.  insn_cost is much simpler and
>> >> much easier to use.
>> >>
>> >>> Using RTX_COST with SET and 1 at least looks no worse than using
>> >>> your proposed new target hook and comparing it with the original
>> >>> unfolded src (again with SET and 1).
>> >>
>> >> It is required to generate valid instructions no matter what, before
>> >> the pass has finished that is.  On all more modern architectures it is
>> >> futile to think you can usefully consider the cost of an RTL expression
>> >> and derive a real-world cost of the generated code from that.
>> >
>> > Thanks Segher for pointing out these!  Here is  another reason that I
>> > did not use rtx_cost: in a few passes, there are codes to check the
>> > constants and store them in constant pool.  I'm thinking to integerate
>> > those codes in a consistent way.
>> 
>> Hi Segher, Richard!
>> 
>> I'm thinking the way like: For a constant,
>> 1. if the constant could be used as an immediate for the
>> instruction, then retreated as an operand;
>> 2. otherwise if the constant can not be stored into a
>> constant pool, then handle through instructions;
>> 3. if it is faster to access constant from pool, then emit
>> constant as data(.rodata);
>> 4. otherwise, handle the constant by instructions.
>> 
>> And to store the constant into a pool, besides force_const_mem,
>> create reference (toc) may be needed on some platforms.
>> 
>> For this particular issue in CSE, there is already code that
>> tries to put constant into a pool (invoke force_const_mem).
>> While the code is too late.  So, we may check the constant
>> earlier and store it into constant pool if profitable.
>> 
>> And another thing as Segher pointed out, CSE is doing too
>> much work.  It may be ok to separate the constant handling
>> logic from CSE.
>
> Not sure - CSE just is value numbering, I don't see that it does
> more than that.  Yes, it might have developed "heuristics" over
> the years what to CSE and to what and where to substitute and
> where not.  But in the end it does just value numbering.
>
>> 
>> I update a new version patch as follow (did not seprate CSE):
>
> How is the new target hook better in any way compared to rtx_cost
> or insn_cost?  It looks like a total hack.
>
> I suppose the actual way of materializing a constant is done
> behind GCCs back and not exposed anywhere?  But instead you
> claim the constants are valid when they actually are not?
> Isn't the problem then that the rs6000 backend lies?

Hi Richard,

Thanks for your comments and sugguestions!

Materializing a constant should be done behind GCC.
On rs6000, in expand pass, during emit_move, the constant is
checked and store into constant pool if necessary.
Some other platforms are doing a similar thing, e.g.
ix86_expand_vector_move, alpha_expand_mov,...
mips_legitimize_const_move.

But, it does not as we expect, force_const_mem is also 
exposed other places (not only ira/reload for stack reference).

CSE is one place, for example, CSE first retrieve the constant
from insn's equal, but it also tries to put constant into
pool for some condition (the condition was introduced at
early age of cse.c, and it is rare to run into in trunk).
In some aspects, IMHO, this seems not a great work of CSE.

And this is how the 'invalid(or say slow)' constant occurs.
e.g.  before cse:
7: r119:DI=[unspec[`*.LC0',%r2:DI] 47]
  REG_EQUAL 0x100803004101001
after cse: 
7: r119:DI=0x100803004101001
  REG_EQUAL 0x100803004101001

As you pointed out: we can also avoid this transformation through
rtx_cost/insn_cost by estimating the COST more accurately for
 "r119:DI=0x100803004101001". (r119:DI=0x100803004101001 will not
be a real final instruction.)

Is it necessary to refine this constant handling for CSE, or even
to eliminate the logic on constant extracting for an insn, beside
updating rtx_cost/insn_cost?
Any sugguestions?

>
> Btw, all of this is of course not appropriate for stage4 and changes
> to CSE need testing on more than one target.
I would do more evaluation, thanks!

Jiufu

>
> Richard.
>
>> Thanks for the comments and suggestions again!
>> 
>> 
>> BR,
>> Jiufu
>> 
>> ---
>>  gcc/config/rs6000/rs6000.cc   | 39 ++-
>>  

[PATCH][V3][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao
Hi, Jakub and Richard:

This is the 3rd version of the patch, the major change compared to the previous 
version are:

1. Add warning_enabled_at guard before “suppress_warning”
2. Add location to the call to __builtin_clear_padding for auto init.

The patch has been bootstrapped and regress tested on both x86 and aarch64.
Okay for trunk?

Thanks.

Qing

==
From 8314ded4ca0f59c5a3ec431c9c3768fcaf2a0949 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Thu, 24 Feb 2022 22:38:38 +
Subject: [PATCH] Suppress uninitialized warnings for new created uses from
 __builtin_clear_padding folding [PR104550]

__builtin_clear_padding() will clear all the padding bits of the object.
actually, it doesn't involve any use of an user variable. Therefore, users do
not expect any uninitialized warning from it. It's reasonable to suppress
uninitialized warnings for all new created uses from __builtin_clear_padding
folding.

PR middle-end/104550

gcc/ChangeLog:

* gimple-fold.cc (clear_padding_flush): Suppress warnings for new
created uses.
* gimplify.cc (gimple_add_padding_init_for_auto_var): Set
location for new created call to __builtin_clear_padding.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr104550-1.c: New test.
* gcc.dg/auto-init-pr104550-2.c: New test.
* gcc.dg/auto-init-pr104550-3.c: New test.
---
 gcc/gimple-fold.cc  | 11 ++-
 gcc/gimplify.cc |  1 +
 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
 5 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 16f02c2d098d..67b4963ffd96 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "asan.h"
 #include "diagnostic-core.h"
+#include "diagnostic.h"
 #include "intl.h"
 #include "calls.h"
 #include "tree-vector-builder.h"
@@ -4379,7 +4380,15 @@ clear_padding_flush (clear_padding_struct *buf, bool 
full)
  else
{
  src = make_ssa_name (type);
- g = gimple_build_assign (src, unshare_expr (dst));
+ tree tmp_dst = unshare_expr (dst);
+ /* The folding introduces a read from the tmp_dst, we should
+prevent uninitialized warning analysis from issuing warning
+for such fake read.  */
+ if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
+ || warning_enabled_at (buf->loc,
+OPT_Wmaybe_uninitialized))
+   suppress_warning (tmp_dst, OPT_Wuninitialized);
+ g = gimple_build_assign (src, tmp_dst);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
  tree mask = native_interpret_expr (type,
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index f570daa015a5..977cf458f858 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -1823,6 +1823,7 @@ gimple_add_padding_init_for_auto_var (tree decl, bool 
is_vla,
 
   gimple *call = gimple_build_call (fn, 2, addr_of_decl,
build_one_cst (TREE_TYPE (addr_of_decl)));
+  gimple_set_location (call, EXPR_LOCATION (decl));
   gimplify_seq_add_stmt (seq_p, call);
 }
 
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
new file mode 100644
index ..a08110c3a170
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
@@ -0,0 +1,10 @@
+/* PR 104550*/
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
new file mode 100644
index ..2c395b32d322
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; 
+ __builtin_clear_padding ();  /* { dg-bogus "info" "is used 
uninitialized" } */ 
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
new file mode 100644

Re: [PATCH v2] configure: Implement --enable-host-pie

2022-02-24 Thread Joseph Myers
On Thu, 24 Feb 2022, Marek Polacek via Gcc-patches wrote:

> gmp/mpfr/mpc/isl are DSOs I believe and therefore always PIC.

They are *not* DSOs when built in-tree (see the use of --disable-shared in 
the relevant parts of Makefile.def).

> intl: I have no idea about this; I don't see any binaries in that directory
> after a bootstrap.

If you use --with-included-gettext, there should be libintl.a or similar 
there and it should be linked into host binaries.

-- 
Joseph S. Myers
jos...@codesourcery.com


[committed] libstdc++: Fix cast in source_location::current() [PR104602]

2022-02-24 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

This fixes a problem for Clang, which is going to return a non-void
pointer from __builtin_source_location(). The current definition of
std::source_location::current() converts that to void* and then has to
cast it back again in the body (which makes it invalid in a constant
expression). By using the actual type of the returned pointer, we avoid
the problematic cast for Clang.

libstdc++-v3/ChangeLog:

PR libstdc++/104602
* include/std/source_location (source_location::current): Use
deduced type of __builtin_source_location().
---
 libstdc++-v3/include/std/source_location | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/source_location 
b/libstdc++-v3/include/std/source_location
index d6c7be567d6..7b091bb91b7 100644
--- a/libstdc++-v3/include/std/source_location
+++ b/libstdc++-v3/include/std/source_location
@@ -43,12 +43,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
   private:
 using uint_least32_t = __UINT_LEAST32_TYPE__;
+using __builtin_ret_type = decltype(__builtin_source_location());
 
   public:
 
 // [support.srcloc.cons], creation
 static consteval source_location
-current(const void* __p = __builtin_source_location()) noexcept
+current(__builtin_ret_type __p = __builtin_source_location()) noexcept
 {
   source_location __ret;
   __ret._M_impl = static_cast (__p);
-- 
2.34.1



[PATCH v2] configure: Implement --enable-host-pie

2022-02-24 Thread Marek Polacek via Gcc-patches
On Thu, Feb 10, 2022 at 09:10:17PM +, Joseph Myers wrote:
> Some general observations:

Thanks for the comment and sorry for the delay (I was on vacation).
 
> * There are various toplevel GCC subdirectories that are built for the 
> host (possibly in addition to the target in some cases) but aren't changed 
> in this patch.  Do they get a PIE or PIC build anyway by default?  Such 
> directories include, I think: fixincludes (as a corner case, for the 
> installed fixincludes), gmp, mpfr, mpc, isl (host libraries whose 
> configure scripts aren't part of GCC, so any changes to ensure they build 
> as PIE when needed would need to be at top level), intl, libbacktrace, 
> libiberty, gnattools, gotools.
> 
> (Using a bootstrap compiler that *doesn't* default to PIE might help 
> detect any such issues, though only for directores that get built for the 
> host in that build - some may not get built by default.)

For fixincludes: the original patch didn't make fixincl in fixincludes PIE, but
the following one does (when --enable-host-pie).

gmp/mpfr/mpc/isl are DSOs I believe and therefore always PIC.

intl: I have no idea about this; I don't see any binaries in that directory
after a bootstrap.

libbacktrace builds with -fPIC already (at least the object files in
libbacktrace/.libs).

libiberty is built twice, once as PIC (in pic/).

gnattools: that directory is empty even when I build Ada, so not sure
what's with that.

gotools: here the binaries aren't PIE/PIC.  I don't really have plans to
change that as Go is not a priority.

> For directories that are only used as host libraries but don't install any 
> executables, even if this patch needs additions the -z now one shouldn't.

Yes, that makes sense.  My goal is to simply build the compilers like cc1
as PIE when requested.  That involved several other changes like using the
pic/ version of libiberty, but I didn't consider every toplevel subdirectory.

> * I don't see anything obvious here (or for the existing 
> --enable-host-shared) that actually causes the configure option to apply 
> only to the host and not to the target, in the case of subdirectories such 
> as libbacktrace that get built for both host and target.  (Though static 
> target libraries may well default to PIC in many cases anyway.)

Good point.  I don't think there's anything insuring that the option affects
only the host binaries.  The configure option name is misleading in that way.
I'm not sure that I would know how to fix that though.

Here's a v2 which additionally builds fixincludes/ as PIE.  That necessitated
a change in libiberty/: I needed a pic/ version even for the host, not just
the target.  That's why I'm setting shared when --enable-host-pie is on.

Thanks again for taking a look.

Bootstrapped/regtested on x86_64-pc-linux-gnu.

-- >8 --
This patch implements the --enable-host-pie configure option which
makes the compiler executables PIE.  This can be used to enhance
protection against ROP attacks, and can be viewed as part of a wider
trend to harden binaries.

It is similar to the option --enable-host-shared, except that --e-h-s
won't add -shared to the linker flags whereas --e-h-p will add -pie.
It is different from --enable-default-pie because that option just
adds an implicit -fPIE/-pie when the compiler is invoked, but the
compiler itself isn't PIE.

Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
regressions.

I plan to add an option to link with -Wl,-z,now.

c++tools/ChangeLog:

* Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
Use pic/libiberty.a if PICFLAG is set.
* configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
(--enable-host-pie): New check.
* configure: Regenerate.

fixincludes/ChangeLog:

* Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
build of libiberty if PICFLAG is set.
* configure.ac:
* configure: Regenerate.

gcc/ChangeLog:

* Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.
* d/Make-lang.in: Remove NO_PIE_CFLAGS.
* doc/install.texi: Document --enable-host-pie.

libcody/ChangeLog:

* Makefile.in: Pass LD_PICFLAG to LDFLAGS.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.

libcpp/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.


[PATCH]middle-end vect: Simplify and extend the complex numbers validation routines. (GCC-11 Backport)

2022-02-24 Thread Tamar Christina via Gcc-patches
Hi All,

This is a backport of the GCC 12 patch backporting only the correctness part of
the fix.   This also backports two small helper functions and documentation
update on the optabs.

The patch boosts the analysis for complex mul,fma and fms in order to ensure
that it doesn't create an incorrect output.

Essentially it adds an extra verification to check that the two nodes it's going
to combine do the same operations on compatible values.  The reason it needs to
do this is that if one computation differs from the other then with the current
implementation we have no way to deal with it since we have to remove the
permute.

When we can keep the permute around we can probably handle these by unrolling.

While implementing this since I have to do the traversal anyway I took advantage
of it by simplifying the code a bit.  Previously we would determine whether
something is a conjugate and then try to figure out which conjugate it is and
then try to see if the permutes match what we expect.

Now the code that does the traversal will detect this in one go and return to us
whether the operation is something that can be combined and whether a conjugate
is present.

Secondly because it does this I can now simplify the checking code itself to
essentially just try to apply fixed patterns to each operation.

The patterns represent the order operations should appear in. For instance a
complex MUL operation combines :

  Left 1 + Right 1
  Left 2 + Right 2

with a permute on the nodes consisting of:

  { Even, Even } + { Odd, Odd  }
  { Even, Odd  } + { Odd, Even }

By abstracting over these patterns the checking code becomes quite simple.

As part of this I was checking the order of the operands which was left in
"slp" order. as in, the same order they showed up in during SLP, which means
that the accumulator is first.  However it looks like I didn't document this.

I have this changed the order to match that of FMA and FMS which corrects the
x86 codegen and will update the Arm targets.  This has now also been
documented.

Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.

Ok for GCC-11?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/102819
PR tree-optimization/103169
* doc/md.texi: Update docs for cfms, cfma.
* tree-data-ref.h (same_data_refs): Accept optional offset.
* tree-vect-slp-patterns.cc (is_linear_load_p): Fix issue with repeating
patterns.
(vect_normalize_conj_loc): Remove.
(is_eq_or_top): Change to take two nodes.
(enum _conj_status, compatible_complex_nodes_p,
vect_validate_multiplication): New.
(class complex_add_pattern, complex_add_pattern::matches,
complex_add_pattern::recognize, class complex_mul_pattern,
complex_mul_pattern::recognize, class complex_fms_pattern,
complex_fms_pattern::recognize,, class complex_fma_pattern,
complex_fma_pattern::recognize, class complex_operations_pattern,
complex_operations_pattern::recognize, addsub_pattern::recognize): Pass
new cache.
(complex_fms_pattern::matches, complex_fma_pattern::matches,
complex_mul_pattern::matches): Pass new cache and use new validation
code.
* tree-vect-slp.cc (vect_match_slp_patterns_2, vect_match_slp_patterns,
vect_analyze_slp): Pass along cache.
(compatible_calls_p): Expose.
* tree-vectorizer.h (compatible_calls_p, slp_node_hash,
slp_compat_nodes_map_t): New.
(class vect_pattern): Update signatures include new cache.

gcc/testsuite/ChangeLog:

PR tree-optimization/102819
PR tree-optimization/103169
* g++.dg/vect/pr99149.cc: xfail for now.
* gcc.dg/vect/complex/pr102819-1.c: New test.
* gcc.dg/vect/complex/pr102819-2.c: New test.
* gcc.dg/vect/complex/pr102819-3.c: New test.
* gcc.dg/vect/complex/pr102819-4.c: New test.
* gcc.dg/vect/complex/pr102819-5.c: New test.
* gcc.dg/vect/complex/pr102819-6.c: New test.
* gcc.dg/vect/complex/pr102819-7.c: New test.
* gcc.dg/vect/complex/pr102819-8.c: New test.
* gcc.dg/vect/complex/pr102819-9.c: New test.
* gcc.dg/vect/complex/pr103169.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..ac7611008944abca08fe48cd7a74b8463f1573da
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -6234,12 +6234,13 @@ Perform a vector multiply and accumulate that is 
semantically the same as
 a multiply and accumulate of complex numbers.
 
 @smallexample
-  complex TYPE c[N];
-  complex TYPE a[N];
-  complex TYPE b[N];
+  complex TYPE op0[N];
+  complex TYPE op1[N];
+  complex TYPE op2[N];
+  complex TYPE op3[N];
   for (int i = 0; i < N; i += 1)
 @{
-  c[i] += a[i] * b[i];
+  op0[i] = op1[i] * op2[i] + op3[i];
 @}
 @end smallexample
 
@@ -6257,12 +6258,13 @@ 

New German PO file for 'gcc' (version 12.1-b20220213)

2022-02-24 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

https://translationproject.org/latest/gcc/de.po

(This file, 'gcc-12.1-b20220213.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] PR fortran/84519 - [F2018] STOP and ERROR STOP statements with QUIET specifier

2022-02-24 Thread Harald Anlauf via Gcc-patches

Dear Jerry, Mikael,

thanks for the feedback!

Am 24.02.22 um 12:50 schrieb Mikael Morin:

Le 23/02/2022 à 23:21, Harald Anlauf via Fortran a écrit :

Dear Fortranners,

Fortran 2018 added a QUIET= specifier to STOP and ERROR STOP statements.
Janne already implemented the library side code four (4!) years ago,
but so far the frontend implementation was missing.

Furthermore, F2018 allows for non-default-integer stopcode expressions
(finally!).

The attached patch provides this implementation.

That was not too much fun for the following reasons:

- fixed format vs. free format
- F95 and F2003 apparently did not require a blank between STOP and
   stopcode, while F2008+ do require it.

This should explain for the three testcases.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

One step closer to F2018!


Please move the error from trans-stmt.cc to resolve.cc.


That is certainly cleaner.  I've done this and rerun the regtest.

As suggested by Jerry a simple run-time testcase with QUIET=.true. has
been added.  However, since I could not find a way to convince dejagnu
that there should be no output, I simply check that the right values
are passed to the runtime library.

If somebody knows how to solve this and feels strongly about this,
please proceed.

Pushed as https://gcc.gnu.org/g:916b809fbfdd2740006270baf549bf22fe9ec3c4


Otherwise looks good, and you have a green light by Jerry, but I would
rather defer this to gcc-13.

Mikael



Thanks,
Harald


Re: [PATCH v7 05/12] LoongArch Port: Machine description C files and .h files.

2022-02-24 Thread Xi Ruoyao via Gcc-patches
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote:
> +  /* Clean up the vars set above.  Note that final_end_function resets
> + the global pointer for us.  */

We don't have a global pointer.  Let's kill this MIPS remenant :).

> +  reload_completed = 0;

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] RISC-V: Document the degree of position independence that medany affords

2022-02-24 Thread Palmer Dabbelt

On Tue, 18 Jan 2022 18:58:00 PST (-0800), Kito Cheng wrote:

LGTM, thanks for adding those comments :)


Committed



On Wed, Jan 19, 2022 at 1:21 AM Palmer Dabbelt  wrote:


The code generated by -mcmodel=medany is defined to be
position-independent, but is not guaranteed to function correctly when
linked into position-independent executables or libraries.  See the
recent discussion at the psABI specification [1] for more details.

It would be better to reject these invalid sequences when linking, but
as pointed out in a recent LD bug [2] there may be some compatibility
issues related to the PCREL_HI20 relocations used to initialize GP.
Given the complexity here it's unlikely we'll be able to reject these
sequences any time soon, so instead just document that these may not
work.

[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/245
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=28789

gcc/ChangeLog:

* doc/invoke.texi: Document the degree of position independence
that -mcmodel=medany affords.

Signed-off-by: Palmer Dabbelt 

---

Changes since v1:

* Fix spelling of "guaranteed", twice.
* Reference the binutils bug on rejecting these sequences, for more
  context.
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5504971ea81..7bca621535f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -27568,6 +27568,10 @@ Generate code for the medium-any code model. The 
program and its statically
 defined symbols must be within any single 2 GiB address range. Programs can be
 statically or dynamically linked.

+The code generated by the medium-any code model is position-independent, but is
+not guaranteed to function correctly when linked into position-independent
+executables or libraries.
+
 @item -mexplicit-relocs
 @itemx -mno-exlicit-relocs
 Use or do not use assembler relocation operators when dealing with symbolic
--
2.32.0



Re: [PATCH v7 11/12] LoongArch Port: gcc/testsuite

2022-02-24 Thread Xi Ruoyao via Gcc-patches
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote:
> From: chenglulu 
> 
> 2022-02-12  Chenghua Xu  
>     Lulu Cheng  
> 
> gcc/testsuite/

spec-barrier tests fail with:

./testsuite/c-c++-common/spec-barrier-1.c:21:3: warning: this target
does not define a speculation barrier; your program will still execute
correctly, but incorrect speculation may not be restricted

I'd seen some news saying your uarch has in-silicon defense for
speculation related vulnerabilities.  If this is true you can just make
__builtin_speculation_safe_value a nop.  Quote from gcc internal doc:

>  If this pattern is not defined then the default expansion of
>  '__builtin_speculation_safe_value' will emit a warning.  You can
>  suppress this warning by defining this pattern with a final
>  condition of '0' (zero), which tells the compiler that a
>  speculation barrier is not needed for this target.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v7 02/12] LoongArch Port: gcc build

2022-02-24 Thread Xi Ruoyao via Gcc-patches
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote:
> +mstrict-align
> +Target Var(TARGET_STRICT_ALIGN) Init(0)
> +Do not generate unaligned memory accesses.

Any update on the rational to make -mno-strict-align the default?

Note that I'm not against this decision: I'm really not a fan of the
"dinosaur" or "teaching tool" uarchs with no unaligned access support
:).  But you should really document this somewhere, for e. g. updating
your arch spec, or cliam "any OS on LoongArch should emulate unaligned
access if it's not supported by hardware".

Maybe this is a little OT though.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] libgcc: fix a warning calling find_fde_tail

2022-02-24 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-02-24 at 19:53 +0100, Jakub Jelinek wrote:
> On Fri, Feb 25, 2022 at 02:35:07AM +0800, Xi Ruoyao via Gcc-patches
> wrote:
> > Bootstrapped on x86_64-linux-gnu.  OK for master?
> > 
> > The third parameter of find_fde_tail is an _Unwind_Ptr (which is an
> > integer type instead of a pointer), but we are passing NULL to it. 
> > This
> > causes a -Wint-conversion warning.
> > 
> > libgcc/
> > 
> > * unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Call
> > find_fde_tail
> >   with 0 instead of NULL.
> 
> Ok (except that the second ChangeLog entry line should be indented
> just
> with a tab, not any spaces after the tab).

Pushed as r12-7375, with the ChangeLog corrected.


Re: [PATCH] libgcc: fix a warning calling find_fde_tail

2022-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 25, 2022 at 02:35:07AM +0800, Xi Ruoyao via Gcc-patches wrote:
> Bootstrapped on x86_64-linux-gnu.  OK for master?
> 
> The third parameter of find_fde_tail is an _Unwind_Ptr (which is an
> integer type instead of a pointer), but we are passing NULL to it.  This
> causes a -Wint-conversion warning.
> 
> libgcc/
> 
>   * unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Call find_fde_tail
> with 0 instead of NULL.

Ok (except that the second ChangeLog entry line should be indented just
with a tab, not any spaces after the tab).

> --- a/libgcc/unwind-dw2-fde-dip.c
> +++ b/libgcc/unwind-dw2-fde-dip.c
> @@ -514,7 +514,7 @@ _Unwind_Find_FDE (void *pc, struct dwarf_eh_bases
> *bases)
>  # if DLFO_STRUCT_HAS_EH_DBASE
>   (_Unwind_Ptr) dlfo.dlfo_eh_dbase,
>  # else
> - NULL,
> + 0,
>  # endif
>   bases);
>  else
> -- 
> 2.35.1
> 

Jakub



[PATCH] libgcc: fix a warning calling find_fde_tail

2022-02-24 Thread Xi Ruoyao via Gcc-patches
Bootstrapped on x86_64-linux-gnu.  OK for master?

The third parameter of find_fde_tail is an _Unwind_Ptr (which is an
integer type instead of a pointer), but we are passing NULL to it.  This
causes a -Wint-conversion warning.

libgcc/

* unwind-dw2-fde-dip.c (_Unwind_Find_FDE): Call find_fde_tail
  with 0 instead of NULL.
---
 libgcc/unwind-dw2-fde-dip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgcc/unwind-dw2-fde-dip.c b/libgcc/unwind-dw2-fde-dip.c
index 3d6f39f5460..7f9be5e6b02 100644
--- a/libgcc/unwind-dw2-fde-dip.c
+++ b/libgcc/unwind-dw2-fde-dip.c
@@ -514,7 +514,7 @@ _Unwind_Find_FDE (void *pc, struct dwarf_eh_bases
*bases)
 # if DLFO_STRUCT_HAS_EH_DBASE
(_Unwind_Ptr) dlfo.dlfo_eh_dbase,
 # else
-   NULL,
+   0,
 # endif
bases);
 else
-- 
2.35.1




Re: [PATCH, testsuite] Fix attr-retain-*.c testcases on 32-bit PowerPC [PR100407]

2022-02-24 Thread Segher Boessenkool
On Thu, Feb 10, 2022 at 04:17:00PM -0600, Pat Haugen wrote:
> Per Alan's comment in the bugzilla, fix attr-retain-* tescases for 32-bit 
> PowerPC.

> --- a/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
> +++ b/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
> @@ -1,4 +1,5 @@
>  /* { dg-do compile { target R_flag_in_section } } */
> +/* { dg-options "-G0" { target { powerpc*-*-* && ilp32 } } } */

This needs a comment exokaining what it is for.  Okay for trunk with
that, thanks!


Segher


Re: [PATCH v7 08/12] LoongArch Port: libgcc

2022-02-24 Thread Xi Ruoyao via Gcc-patches
On Sat, 2022-02-12 at 11:11 +0800, xucheng...@loongson.cn wrote:

> +  sc = _->uc.uc_mcontext;

Get a warning:

In file included from ../../../libgcc/unwind-dw2.c:412:
./md-unwind-support.h: In function ‘loongarch_fallback_frame_state’:
./md-unwind-support.h:55:10: warning: assignment to ‘struct sigcontext *’ from 
incompatible pointer type ‘mcontext_t *’ [-Wincompatible-pointer-types]
   55 |   sc = _->uc.uc_mcontext;
  |  ^

Maybe we should just add a cast here like
`(struct sigcontext *) _->uc.uc_mcontext` ?
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


PING: [PATCH, testsuite] Fix attr-retain-*.c testcases on 32-bit PowerPC [PR100407]

2022-02-24 Thread Pat Haugen via Gcc-patches

Ping.

On 2/10/22 4:17 PM, Pat Haugen via Gcc-patches wrote:

Per Alan's comment in the bugzilla, fix attr-retain-* tescases for 32-bit 
PowerPC.

Bootstrapped and regression tested on powerpc64(32/64) and powerpc64le.
Ok for master?

-Pat


2022-02-10  Pat Haugen  

PR testsuite/100407

gcc/testsuite/
* gcc.c-torture/compile/attr-retain-1.c: Add -G0 for 32-bit PowerPC.
* gcc.c-torture/compile/attr-retain-2.c: Likewise.



diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c 
b/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
index 6cab155..4a366eb 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
@@ -1,4 +1,5 @@
  /* { dg-do compile { target R_flag_in_section } } */
+/* { dg-options "-G0" { target { powerpc*-*-* && ilp32 } } } */
  /* { dg-final { scan-assembler ".text.*,\"axR\"" } } */
  /* { dg-final { scan-assembler ".bss.*,\"awR\"" } } */
  /* { dg-final { scan-assembler ".data.*,\"awR\"" } } */
diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c 
b/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c
index 0208ffe..d9fc150 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c
@@ -11,5 +11,6 @@
  /* { dg-final { scan-assembler ".bss.used_lcomm2,\"awR\"" { target arm-*-* } 
} } */
  /* { dg-final { scan-assembler ".data.used_foo_sec,\"awR\"" } } */
  /* { dg-options "-ffunction-sections -fdata-sections" } */
+/* { dg-options "-ffunction-sections -fdata-sections -G0" { target { powerpc*-*-* 
&& ilp32 } } } */
  
  #include "attr-retain-1.c"




Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao
I briefly checked all the usages of suppress_warning within the current gcc, 
and see that most of them are not guarded by any condition. 

So, the current change should be fine without introducing new issues. -:)

Another thing is, if we use “warning_enable_at” to guard, I just checked, 
this routine is not used by any routine right now, so it might be possible that 
this 
routine has some bug itself.  And now it’s the stage 4, we might need to be
conservative. 

Based on this, I think that it might be better to put the change in as it right 
now. 
If we think that all the “suppress_warning” need to be guarded by a specific
condition, we can do it in gcc13 earlier stage.

What’s your opinion?

Qing


> On Feb 24, 2022, at 9:13 AM, Jakub Jelinek  wrote:
> 
> On Thu, Feb 24, 2022 at 04:00:33PM +0100, Richard Biener wrote:
 --- a/gcc/gimple-fold.cc
 +++ b/gcc/gimple-fold.cc
 @@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, 
 bool full)
  else
{
  src = make_ssa_name (type);
 -g = gimple_build_assign (src, unshare_expr (dst));
 +tree tmp_dst = unshare_expr (dst);
 +/* The folding introduces a read from the tmp_dst, we should
 +   prevent uninitialized warning analysis from issuing warning
 +   for such fake read.  */
 +suppress_warning (tmp_dst, OPT_Wuninitialized);
>>> 
>>> I wonder if we shouldn't guard the suppress_warning call on
>>>   if (warn_uninitialized || warn_maybe_uninitialized)
>>> because those warnings aren't on by default and the suppress_warning stuff,
>>> especially when it could be done for many loads from the builtin means
>>> populating hash tables with those.
>> 
>> Maybe that's something suppress_warning should do then?  OTOH you
> 
> Well, OPT_Wuninitialized is an argument why it can't.  The suppression
> is using a single OPT_W*, but there are multiple different warnings
> that care about that suppression, and suppress_warning can't know about it.
> 
>> don't know whether you're suppressing a warning in a region with
>> -Wno-uninitialized but that's inlined into a -Wuninitialized
>> function where then the false diagnostic pops up if we didn't
>> suppress the warning ...
> 
> I think both -Wuninitialized and -Wmaybe-uninitialized aren't
> Optimization or PerFunction, so they are global options.
> On the other side, they can be locally changed through pragmas.
> 
> Maybe we could use
>  if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
>  || warning_enabled_at (buf->loc, OPT_Wmaybe_uninitialized))
> if uninit pass uses the gimple_location of the read, that shouldn't
> be really changing...
> 
>   Jakub
> 



Re: [pushed] LRA, rs6000, Darwin: Amend lo_sum use for forced constants [PR104117].

2022-02-24 Thread Iain Sandoe
Folks,

> On 22 Feb 2022, at 14:44, Vladimir Makarov  wrote:
> 
> 
> On 2022-02-20 12:34, Iain Sandoe wrote:
>> 
>> ^^^ this is mostly for my education - the stuff below is a potential 
>> solution to leaving lra-constraints unchanged and fixing the Darwin bug….
>> 
> I'd be really glad if you do manage to fix this w/o changing LRA. Richard has 
> a legitimate point that my proposed change in LRA prohibiting 
> `...;reg=low_sum; ...mem[reg]` might force LRA to generate less optimized 
> code or even might make LRA to generate unrecognized insns `reg = orginal 
> addr` for some ports requiring further fixes in machine-dependent code of the 
> ports.

I think this is within my remit to push without further review - however I’d 
very much welcome any comment you folks have: I’d like to push this before my 
weekly Darwin test run - which is usually started just after the daily bump on 
Saturday morning.

The other RS6000 changes remain, as Vlad pointed out we were not being picky 
enough there - despite getting away with it for longer than I’ve been on the 
project ;)

I tested that the patch fixes the problem on 11.2 (for the testcases provided, 
the bug is latent on master) and causes no regressions on powerpc-darwin9 
(master).

cheers
Iain


[PATCH] LRA, rs6000, Darwin: Revise lo_sum use for forced constants  [PR104117].

Follow up discussion to the initial patch for this PR identified that it is
preferable to avoid the LRA change, and arrange for the target to reject the
hi and lo_sum selections when presented with an invalid address.

We split the Darwin high/low selectors into two:
 1. One that handles non-PIC addresses (kernel mode, mdynamic-no-pic).
 2. One that handles PIC addresses and rejects SYMBOL_REFs unless they are
suitably wrapped in the MACHOPIC_OFFSET unspec.

The second case is handled by providing a new predicate (macho_pic_address)
that checks the requirements.

Signed-off-by: Iain Sandoe 

PR target/PR104117

gcc/ChangeLog:

* config/rs6000/darwin.md (@machopic_high_): New.
(@machopic_low_): New.
* config/rs6000/predicates.md (macho_pic_address): New.
* config/rs6000/rs6000.cc (rs6000_legitimize_address): Do not
apply the TLS processing to Darwin.
* lra-constraints.cc (process_address_1): Revert the changes
in r12-7209.
---
 gcc/config/rs6000/darwin.md | 19 +++
 gcc/config/rs6000/predicates.md | 14 ++
 gcc/config/rs6000/rs6000.cc |  2 +-
 gcc/lra-constraints.cc  | 17 +++--
 4 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/darwin.md b/gcc/config/rs6000/darwin.md
index 8443585df00..e73d59e8066 100644
--- a/gcc/config/rs6000/darwin.md
+++ b/gcc/config/rs6000/darwin.md
@@ -121,21 +121,32 @@ You should have received a copy of the GNU General Public 
License
stw %0,lo16(%2)(%1)"
   [(set_attr "type" "store")])
 
-;; 64-bit MachO load/store support
-
 ;; Mach-O PIC.
 
 (define_insn "@macho_high_"
   [(set (match_operand:P 0 "gpc_reg_operand" "=b*r")
(high:P (match_operand 1 "" "")))]
-  "TARGET_MACHO && (DEFAULT_ABI == ABI_DARWIN)"
+  "TARGET_MACHO && (DEFAULT_ABI == ABI_DARWIN) && !flag_pic"
   "lis %0,ha16(%1)")
 
 (define_insn "@macho_low_"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
(lo_sum:P (match_operand:P 1 "gpc_reg_operand" "b")
   (match_operand 2 "" "")))]
-   "TARGET_MACHO && (DEFAULT_ABI == ABI_DARWIN)"
+   "TARGET_MACHO && (DEFAULT_ABI == ABI_DARWIN) && !flag_pic"
+   "la %0,lo16(%2)(%1)")
+
+(define_insn "@machopic_high_"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b*r")
+   (high:P (match_operand 1 "macho_pic_address" "")))]
+  "TARGET_MACHO && flag_pic"
+  "lis %0,ha16(%1)")
+
+(define_insn "@machopic_low_"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
+   (lo_sum:P (match_operand:P 1 "gpc_reg_operand" "b")
+  (match_operand 2 "macho_pic_address" "")))]
+   "TARGET_MACHO && flag_pic"
"la %0,lo16(%2)(%1)")
 
 (define_split
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index c65dfb91f3d..28f6e9883cb 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -2045,3 +2045,17 @@
  (if_then_else (match_test "TARGET_VSX")
   (match_operand 0 "reg_or_cint_operand")
   (match_operand 0 "const_int_operand")))
+
+;; Return true if the operand is a valid Mach-O pic address.
+;;
+(define_predicate "macho_pic_address"
+  (match_code "const,unspec")
+{
+  if (GET_CODE (op) == CONST)
+op = XEXP (op, 0);
+
+  if (GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_MACHOPIC_OFFSET)
+return CONSTANT_P (XVECEXP (op, 0, 0));
+  else
+return false;
+})
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a855e8c4c72..9dbab1fc644 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -9028,7 +9028,7 @@ rs6000_legitimize_address (rtx x, rtx oldx 

Re: [PATCH] Fix clang warning in pt.cc

2022-02-24 Thread Martin Liška

On 2/21/22 17:48, Martin Liška wrote:

Ready to be installed?


I'm going to install this as obvious.

Martin


Re: [PATCH V2] bpf: do not --enable-gcov for bpf-*-* targets

2022-02-24 Thread Jose E. Marchesi


> On Thu, Feb 24, 2022 at 2:49 PM Jose E. Marchesi
>  wrote:
>>
>>
>> > On Wed, Feb 23, 2022 at 8:56 PM Jose E. Marchesi
>> >  wrote:
>> >>
>> >> This patch changes the build machinery in order to disable the build
>> >> of GCOV (both compiler and libgcc) in bpf-*-* targets.  The reason for
>> >> this change is that BPF is (currently) too restricted in order to
>> >> support the coverage instrumentalization.
>> >>
>> >> Tested in bpf-unknown-none and x86_64-linux-gnu targets.
>> >
>> > LGTM.
>>
>> Thanks.
>>
>> Is this OK for gcc 12 as well?  If yes, what is the proper branch other
>> than master?
>
> gcc 12 is still master.

Committed to master.
Thanks!

>> >> 2022-02-23  Jose E. Marchesi  
>> >>
>> >> gcc/ChangeLog
>> >>
>> >> PR target/104656
>> >> * configure.ac: --disable-gcov if targetting bpf-*.
>> >> * configure: Regenerate.
>> >>
>> >> libgcc/ChangeLog
>> >>
>> >> PR target/104656
>> >> * configure.ac: --disable-gcov if targetting bpf-*.
>> >> * configure: Regenerate.
>> >> ---
>> >>  gcc/configure   | 14 +++---
>> >>  gcc/configure.ac| 10 +-
>> >>  libgcc/configure| 31 +++
>> >>  libgcc/configure.ac | 17 -
>> >>  4 files changed, 51 insertions(+), 21 deletions(-)
>> >>
>> >> diff --git a/gcc/configure b/gcc/configure
>> >> index 258b17a226e..22eb3451e3d 100755
>> >> --- a/gcc/configure
>> >> +++ b/gcc/configure
>> >> @@ -8085,12 +8085,20 @@ fi
>> >>  if test "${enable_gcov+set}" = set; then :
>> >>enableval=$enable_gcov;
>> >>  else
>> >> -  enable_gcov=yes
>> >> +  case $target in
>> >> +   bpf-*-*)
>> >> + enable_gcov=no
>> >> +   ;;
>> >> +   *)
>> >> + enable_gcov=yes
>> >> +   ;;
>> >> + esac
>> >>  fi
>> >>
>> >>
>> >>
>> >>
>> >> +
>> >>  # Check whether --with-specs was given.
>> >>  if test "${with_specs+set}" = set; then :
>> >>withval=$with_specs; CONFIGURE_SPECS=$withval
>> >> @@ -19659,7 +19667,7 @@ else
>> >>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>> >>lt_status=$lt_dlunknown
>> >>cat > conftest.$ac_ext <<_LT_EOF
>> >> -#line 19662 "configure"
>> >> +#line 19670 "configure"
>> >>  #include "confdefs.h"
>> >>
>> >>  #if HAVE_DLFCN_H
>> >> @@ -19765,7 +19773,7 @@ else
>> >>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>> >>lt_status=$lt_dlunknown
>> >>cat > conftest.$ac_ext <<_LT_EOF
>> >> -#line 19768 "configure"
>> >> +#line 19776 "configure"
>> >>  #include "confdefs.h"
>> >>
>> >>  #if HAVE_DLFCN_H
>> >> diff --git a/gcc/configure.ac b/gcc/configure.ac
>> >> index 06750cee977..20da90901f8 100644
>> >> --- a/gcc/configure.ac
>> >> +++ b/gcc/configure.ac
>> >> @@ -1041,7 +1041,15 @@ AC_SUBST(enable_shared)
>> >>
>> >>  AC_ARG_ENABLE(gcov,
>> >>  [  --disable-gcov  don't provide libgcov and related host tools],
>> >> -[], [enable_gcov=yes])
>> >> +[], [case $target in
>> >> +   bpf-*-*)
>> >> + enable_gcov=no
>> >> +   ;;
>> >> +   *)
>> >> + enable_gcov=yes
>> >> +   ;;
>> >> + esac])
>> >> +
>> >>  AC_SUBST(enable_gcov)
>> >>
>> >>  AC_ARG_WITH(specs,
>> >> diff --git a/libgcc/configure b/libgcc/configure
>> >> index 4919a56f518..52bf25d4e94 100755
>> >> --- a/libgcc/configure
>> >> +++ b/libgcc/configure
>> >> @@ -630,6 +630,7 @@ LIPO
>> >>  AR
>> >>  toolexeclibdir
>> >>  toolexecdir
>> >> +enable_gcov
>> >>  target_subdir
>> >>  host_subdir
>> >>  build_subdir
>> >> @@ -653,7 +654,6 @@ build_cpu
>> >>  build
>> >>  with_aix_soname
>> >>  enable_vtable_verify
>> >> -enable_gcov
>> >>  enable_shared
>> >>  libgcc_topdir
>> >>  target_alias
>> >> @@ -701,7 +701,6 @@ with_target_subdir
>> >>  with_cross_host
>> >>  with_ld
>> >>  enable_shared
>> >> -enable_gcov
>> >>  enable_vtable_verify
>> >>  with_aix_soname
>> >>  enable_version_specific_runtime_libs
>> >> @@ -709,6 +708,7 @@ with_toolexeclibdir
>> >>  with_slibdir
>> >>  enable_maintainer_mode
>> >>  with_build_libsubdir
>> >> +enable_gcov
>> >>  enable_largefile
>> >>  enable_decimal_float
>> >>  with_system_libunwind
>> >> @@ -1342,12 +1342,12 @@ Optional Features:
>> >>--disable-FEATURE   do not include FEATURE (same as 
>> >> --enable-FEATURE=no)
>> >>--enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
>> >>--disable-shareddon't provide a shared libgcc
>> >> -  --disable-gcov  don't provide libgcov and related host tools
>> >>--enable-vtable-verifyEnable vtable verification feature
>> >>--enable-version-specific-runtime-libsSpecify that runtime 
>> >> libraries should be installed in a compiler-specific directory
>> >>--enable-maintainer-mode
>> >>enable make rules and dependencies not useful 
>> >> (and
>> >>sometimes confusing) to the casual installer
>> >> +  --disable-gcov  don't provide libgcov and related host tools
>> >>--disable-largefile omit 

Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.

2022-02-24 Thread Qing Zhao



> On Feb 24, 2022, at 4:16 AM, Richard Biener  wrote:
> 
> On Sat, 19 Feb 2022, Qing Zhao wrote:
> 
>> Hi,
>> 
>> This is the 2nd patch for fixing pr102276.
>> 
>> Adding -Wtrivial-auto-var-init and update documentation.
>> 
>> Adding a new warning option -Wtrivial-auto-var-init to report cases when
>> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
>> time, update documentation for -ftrivial-auto-var-init to connect it with
>> the new warning option -Wtrivial-auto-var-init,  and add documentation
>> for -Wtrivial-auto-var-init.
>> 
>> Bootstraped and regression tested on both x86 and aarch64.
>> 
>> Okay for committing?
>> 
>> thanks.
>> 
>> Qing.
>> 
>> ==
>> From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Fri, 18 Feb 2022 15:53:15 +
>> Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update documentation.
>> 
>> Adding a new warning option -Wtrivial-auto-var-init to report cases when
>> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
>> time, update documentation for -ftrivial-auto-var-init to connect it with
>> the new warning option -Wtrivial-auto-var-init,  and add documentation
>> for -Wtrivial-auto-var-init.
>> 
>> 2022-02-18 Qing Zhao  
>> gcc/ChangeLog:
>> 
>>  * common.opt (-Wtrivial-auto-var-init): New option.
>>  * doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
>>  (-ftrivial-auto-var-init): Update option;
>>  * gimplify.cc (maybe_warn_switch_unreachable): Rename...
>>  (maybe_warn_switch_unreachable_and_auto_init): ...to this.
>>  (gimplify_switch_expr): Call new function.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/auto-init-pr102276-3.c: New test.
>>  * gcc.dg/auto-init-pr102276-4.c: New test.
>> ---
>> gcc/common.opt  |   4 +
>> gcc/doc/invoke.texi |  14 ++-
>> gcc/gimplify.cc | 100 +++-
>> gcc/testsuite/gcc.dg/auto-init-pr102276-3.c |  40 
>> gcc/testsuite/gcc.dg/auto-init-pr102276-4.c |  40 
>> 5 files changed, 175 insertions(+), 23 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c
>> 
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index c21e5273ae3..22c95dbfa49 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -801,6 +801,10 @@ Wtrampolines
>> Common Var(warn_trampolines) Warning
>> Warn whenever a trampoline is generated.
>> 
>> +Wtrivial-auto-var-init
>> +Common Var(warn_trivial_auto_var_init) Warning Init(0)
>> +Warn about where -ftrivial-auto-var-init cannot initialize the auto 
>> variable.
>> +
> 
> Warn about cases where ... initialize a variable.

Okay. 

> 
>> Wtype-limits
>> Common Var(warn_type_limits) Warning EnabledBy(Wextra)
>> Warn if a comparison is always true or always false due to the limited range 
>> of the data type.
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index e1a00c80307..c61a5b4b4a5 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}.
>> -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
>> -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
>> -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
>> --Wtsan -Wtype-limits  -Wundef @gol
>> +-Wtrivial-auto-var-init -Wtsan -Wtype-limits  -Wundef @gol
>> -Wuninitialized  -Wunknown-pragmas @gol
>> -Wunsuffixed-float-constants  -Wunused @gol
>> -Wunused-but-set-parameter  -Wunused-but-set-variable @gol
>> @@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ 
>> programs.
>> Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch}
>> built-in functions are used.  These functions changed semantics in GCC 4.4.
>> 
>> +@item -Wtrivial-auto-var-init
>> +@opindex Wtrivial-auto-var-init
>> +@opindex Wno-trivial-auto-var-init
>> +Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic
>> +variable.  A common situation is an automatic variable that is declared
>> +between the controlling expression and the first case lable of a 
>> @code{switch}
>> +statement.
>> +
>> @item -Wunused-but-set-parameter
>> @opindex Wunused-but-set-parameter
>> @opindex Wno-unused-but-set-parameter
>> @@ -12314,6 +12322,10 @@ initializer as uninitialized, 
>> @option{-Wuninitialized} and
>> warning messages on such automatic variables.
>> With this option, GCC will also initialize any padding of automatic variables
>> that have structure or union types to zeroes.
>> +However, the current implementation cannot initialize automatic variables 
>> that
>> +are declared between the controlling expression and the first case of a
>> +@code{switch} statement.  Using @option{-Wtrivial-auto-var-init} to report 
>> all
>> +such cases.
>> 
>> The 

Re: [PATCH 1/2][middle-end/102276] Don't emit switch-unreachable warnings for -ftrivial-auto-var-init (PR102276)

2022-02-24 Thread Qing Zhao



> On Feb 24, 2022, at 4:10 AM, Richard Biener  wrote:
> 
> On Sat, 19 Feb 2022, Qing Zhao wrote:
> 
>> Hi,
>> 
>> Per our discussion in the bug report 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102276
>> 
>> We decided to go with the following solution:
>> 
>> 1. avoid emitting switch-unreachable warnings for -ftrivial-auto-var-init;
>> 2. adding a new option -Wtrivial-auto-var-init to emit warnings for the 
>> switch-unreadable cases to suggest the user modify the source code;
>> 3. update documentation of -ftrivial-auto-var-init for the limitation on 
>> switch-unreachable cases and introduce the new option -Wtrivial-auto-var-init
>> 
>> with the above 1, we can resolve the current immediate issue of spurious 
>> warnings of using -ftrivial-auto-var-init to make kernel build succeed;
>> with the above 2, we provide the user a way to know that 
>> -ftrivial-auto-var-init has limitation on the switch-unreachable cases, and 
>> user should modify the source code to avoid this problem;
>> with the above 3, we will provide the user a clear documentation of the 
>> -ftrivial-auto-var-init and also provide suggestions how to resolve this 
>> issue. 
>> 
>> There are two patches included for this bug.  This is the first one.
>> 
>> The patches has been bootstrapped and regression tested on both x86 and 
>> aarch64.
>> 
>> Okay for commit?
>> 
>> Thanks.
>> 
>> Qing.
>> 
>> ===
>> 
>> From 65bc9607ff35ad49e5501ec5c392293c5b6358d0 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Fri, 18 Feb 2022 15:35:53 +
>> Subject: [PATCH 1/2] Don't emit switch-unreachable warnings for
>> -ftrivial-auto-var-init (PR102276)
>> 
>> for the following testing case:
>>  1 int g(int *);
>>  2 int f1()
>>  3 {
>>  4 switch (0) {
>>  5 int x;
>>  6 default:
>>  7 return g();
>>  8 }
>>  9 }
>> compiling with -O -ftrivial-auto-var-init causes spurious warning:
>> warning: statement will never be executed [-Wswitch-unreachable]
>>5 | int x;
>>  | ^
>> This is due to the compiler-generated initialization at the point of
>> the declaration.
>> 
>> We could avoid the warning by adjusting the routine
>> "maybe_warn_switch_unreachable" to exclude the following cases:
>> 
>> when
>> flag_auto_var_init > AUTO_INIT_UNINITIALIZED
>> And
>> call to .DEFERRED_INIT
>> 
>> 2022-02-18 Qing Zhao  
>> gcc/ChangeLog:
>> 
>>  * gimplify.cc (maybe_warn_switch_unreachable): Don't warn for compiler
>>  -generated initializations for -ftrivial-auto-var-init.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/auto-init-pr102276-1.c: New test.
>>  * gcc.dg/auto-init-pr102276-2.c: New test.
>> ---
>> gcc/gimplify.cc |  8 -
>> gcc/testsuite/gcc.dg/auto-init-pr102276-1.c | 38 +
>> gcc/testsuite/gcc.dg/auto-init-pr102276-2.c | 38 +
>> 3 files changed, 83 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-2.c
>> 
>> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
>> index f570daa015a..4e3bbf5314d 100644
>> --- a/gcc/gimplify.cc
>> +++ b/gcc/gimplify.cc
>> @@ -2103,7 +2103,13 @@ maybe_warn_switch_unreachable (gimple_seq seq)
>>&& TREE_CODE (gimple_goto_dest (stmt)) == LABEL_DECL
>>&& DECL_ARTIFICIAL (gimple_goto_dest (stmt)))
>>  /* Don't warn for compiler-generated gotos.  These occur
>> -   in Duff's devices, for example.  */;
>> +   in Duff's devices, for example.  */
>> +;
>> +  else if ((flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
>> +&& (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)))
>> +/* Don't warn for compiler-generated initializations for
>> +  -ftrivial-auto-var-init.  */
>> +;
> 
> I think you want to instead skip these in warn_switch_unreachable_r
> since otherwise a .DEFERRED_INIT can silence the warning for a real
> stmt following it that is not reachable.

Oh, yeah, you are right.
Will fix this.

Thanks.

Qing



Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 24, 2022 at 04:00:33PM +0100, Richard Biener wrote:
> > > --- a/gcc/gimple-fold.cc
> > > +++ b/gcc/gimple-fold.cc
> > > @@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, 
> > > bool full)
> > > else
> > >   {
> > > src = make_ssa_name (type);
> > > -   g = gimple_build_assign (src, unshare_expr (dst));
> > > +   tree tmp_dst = unshare_expr (dst);
> > > +   /* The folding introduces a read from the tmp_dst, we should
> > > +  prevent uninitialized warning analysis from issuing warning
> > > +  for such fake read.  */
> > > +   suppress_warning (tmp_dst, OPT_Wuninitialized);
> > 
> > I wonder if we shouldn't guard the suppress_warning call on
> >   if (warn_uninitialized || warn_maybe_uninitialized)
> > because those warnings aren't on by default and the suppress_warning stuff,
> > especially when it could be done for many loads from the builtin means
> > populating hash tables with those.
> 
> Maybe that's something suppress_warning should do then?  OTOH you

Well, OPT_Wuninitialized is an argument why it can't.  The suppression
is using a single OPT_W*, but there are multiple different warnings
that care about that suppression, and suppress_warning can't know about it.

> don't know whether you're suppressing a warning in a region with
> -Wno-uninitialized but that's inlined into a -Wuninitialized
> function where then the false diagnostic pops up if we didn't
> suppress the warning ...

I think both -Wuninitialized and -Wmaybe-uninitialized aren't
Optimization or PerFunction, so they are global options.
On the other side, they can be locally changed through pragmas.

Maybe we could use
  if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
  || warning_enabled_at (buf->loc, OPT_Wmaybe_uninitialized))
if uninit pass uses the gimple_location of the read, that shouldn't
be really changing...

Jakub



Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Richard Biener via Gcc-patches
On Thu, 24 Feb 2022, Jakub Jelinek wrote:

> On Thu, Feb 24, 2022 at 02:30:05PM +, Qing Zhao wrote:
> > PR middle-end/104550
> > 
> > gcc/ChangeLog:
> > 
> > * gimple-fold.cc (clear_padding_flush): Suppress warnings for new
> > created uses.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.dg/auto-init-pr104550-1.c: New test.
> > * gcc.dg/auto-init-pr104550-2.c: New test.
> > * gcc.dg/auto-init-pr104550-3.c: New test.
> > ---
> >  gcc/gimple-fold.cc  |  7 ++-
> >  gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
> >  gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
> >  gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
> >  4 files changed, 38 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
> >  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
> > 
> > diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> > index 16f02c2d098..e11a775ad9f 100644
> > --- a/gcc/gimple-fold.cc
> > +++ b/gcc/gimple-fold.cc
> > @@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, bool 
> > full)
> >   else
> > {
> >   src = make_ssa_name (type);
> > - g = gimple_build_assign (src, unshare_expr (dst));
> > + tree tmp_dst = unshare_expr (dst);
> > + /* The folding introduces a read from the tmp_dst, we should
> > +prevent uninitialized warning analysis from issuing warning
> > +for such fake read.  */
> > + suppress_warning (tmp_dst, OPT_Wuninitialized);
> 
> I wonder if we shouldn't guard the suppress_warning call on
> if (warn_uninitialized || warn_maybe_uninitialized)
> because those warnings aren't on by default and the suppress_warning stuff,
> especially when it could be done for many loads from the builtin means
> populating hash tables with those.

Maybe that's something suppress_warning should do then?  OTOH you
don't know whether you're suppressing a warning in a region with
-Wno-uninitialized but that's inlined into a -Wuninitialized
function where then the false diagnostic pops up if we didn't
suppress the warning ...

So yeah, I think LTGM.

Richard.


Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 24, 2022 at 02:30:05PM +, Qing Zhao wrote:
>   PR middle-end/104550
> 
> gcc/ChangeLog:
> 
>   * gimple-fold.cc (clear_padding_flush): Suppress warnings for new
>   created uses.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/auto-init-pr104550-1.c: New test.
>   * gcc.dg/auto-init-pr104550-2.c: New test.
>   * gcc.dg/auto-init-pr104550-3.c: New test.
> ---
>  gcc/gimple-fold.cc  |  7 ++-
>  gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
>  gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
>  gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
>  4 files changed, 38 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
> 
> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> index 16f02c2d098..e11a775ad9f 100644
> --- a/gcc/gimple-fold.cc
> +++ b/gcc/gimple-fold.cc
> @@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, bool 
> full)
> else
>   {
> src = make_ssa_name (type);
> -   g = gimple_build_assign (src, unshare_expr (dst));
> +   tree tmp_dst = unshare_expr (dst);
> +   /* The folding introduces a read from the tmp_dst, we should
> +  prevent uninitialized warning analysis from issuing warning
> +  for such fake read.  */
> +   suppress_warning (tmp_dst, OPT_Wuninitialized);

I wonder if we shouldn't guard the suppress_warning call on
  if (warn_uninitialized || warn_maybe_uninitialized)
because those warnings aren't on by default and the suppress_warning stuff,
especially when it could be done for many loads from the builtin means
populating hash tables with those.

Otherwise LGTM.

Jakub



[PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao
Hi, 

This is the 2nd version for this bug per our discussion.

Compared to the previous patch, this patch ONLY suppresses warnings for the 
fake read that was introduced with folding. 
The patch has been bootstrapped and regress tested on both x86 and aarch64.
Okay for trunk?

Thanks.

Qing

==
>From a858be0fd979e05a6f54ac340e34bf85ddbc7067 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Wed, 23 Feb 2022 23:45:10 +
Subject: [PATCH] Suppress uninitialized warnings for new created uses from 
 __builtin_clear_padding folding [PR104550]

__builtin_clear_padding() will clear all the padding bits of the object.
actually, it doesn't involve any use of an user variable. Therefore, users do
not expect any uninitialized warning from it. It's reasonable to suppress
uninitialized warnings for all new created uses from __builtin_clear_padding
folding.

PR middle-end/104550

gcc/ChangeLog:

* gimple-fold.cc (clear_padding_flush): Suppress warnings for new
created uses.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr104550-1.c: New test.
* gcc.dg/auto-init-pr104550-2.c: New test.
* gcc.dg/auto-init-pr104550-3.c: New test.
---
 gcc/gimple-fold.cc  |  7 ++-
 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
 4 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 16f02c2d098..e11a775ad9f 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -4379,7 +4379,12 @@ clear_padding_flush (clear_padding_struct *buf, bool 
full)
  else
{
  src = make_ssa_name (type);
- g = gimple_build_assign (src, unshare_expr (dst));
+ tree tmp_dst = unshare_expr (dst);
+ /* The folding introduces a read from the tmp_dst, we should
+prevent uninitialized warning analysis from issuing warning
+for such fake read.  */
+ suppress_warning (tmp_dst, OPT_Wuninitialized);
+ g = gimple_build_assign (src, tmp_dst);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
  tree mask = native_interpret_expr (type,
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
new file mode 100644
index 000..a08110c3a17
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
@@ -0,0 +1,10 @@
+/* PR 104550*/
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
new file mode 100644
index 000..2c395b32d32
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; 
+ __builtin_clear_padding ();  /* { dg-bogus "info" "is used 
uninitialized" } */ 
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
new file mode 100644
index 000..9893e37f12d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info;   /* { dg-bogus "info" "is used uninitialized" } 
*/
+ __builtin_clear_padding ();  /* { dg-bogus "info" "is used 
uninitialized" } */ 
+}
-- 
2.27.0



Re: [PATCH][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Qing Zhao


> On Feb 24, 2022, at 2:46 AM, Richard Biener  wrote:
> 
> On Wed, 23 Feb 2022, Qing Zhao wrote:
> 
>> 
>> 
>>> On Feb 23, 2022, at 11:49 AM, Jakub Jelinek  wrote:
>>> 
>>> On Wed, Feb 23, 2022 at 05:33:57PM +, Qing Zhao wrote:
 From my understanding, __builtin_clear_padding (), does not _use_ 
 any variable,
 therefore, no uninitialized usage warning should be emitted for it. 
>>> 
>>> __builtin_clear_padding ()
>>> sometimes expands to roughly:
>>> *(int *)((char *) + 32) = 0;
>>> etc., in that case it shouldn't be suppressed in any way, it doesn't read
>>> anything, only stores.
>>> Or at other times it is:
>>> *(int *)((char *) + 32) &= 0xfec7dab1;
>>> etc., in that case it reads bytes from the object which can be
>>> uninitialized, we mask some bits off and store.
>> 
>> Okay, I see. 
>> So, only the MEM_REF that will be used to read first should be suppressed 
>> warning. Then there is only one (out of 4) MEM_REF
>> should be suppressed warning, that’s the following one (line 4371 and then 
>> line 4382):
>> 
>> 4371   tree dst = build2_loc (buf->loc, MEM_REF, atype, 
>> buf->base,
>> 4372  build_int_cst (buf->alias_type, 
>> off));
>> 4373   tree src;
>> 4374   gimple *g;
>> 4375   if (all_ones
>> 4376   && nonzero_first == start
>> 4377   && nonzero_last == start + eltsz)
>> 4378 src = build_zero_cst (type);
>> 4379   else
>> 4380 {
>> 4381   src = make_ssa_name (type);
>> 4382   g = gimple_build_assign (src, unshare_expr (dst));
>> 4383   gimple_set_location (g, buf->loc);
>> 4384   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>> 4385   tree mask = native_interpret_expr (type,
>> 4386  buf->buf + i + 
>> start,
>> 4387  eltsz);
>> 4388   gcc_assert (mask && TREE_CODE (mask) == INTEGER_CST);
>> 4389   mask = fold_build1 (BIT_NOT_EXPR, type, mask);
>> 4390   tree src_masked = make_ssa_name (type);
>> 4391   g = gimple_build_assign (src_masked, BIT_AND_EXPR,
>> 4392src, mask);
>> 4393   gimple_set_location (g, buf->loc);
>> 4394   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>> 4395   src = src_masked;
>> 4396 }
>> 4397   g = gimple_build_assign (dst, src);
>> 
>> 
>> All the other 3 MEM_REFs are not read. So, we can just exclude them from 
>> suppressing warning, right?
>> Another question, for the above MEM_REF, should I suppress warning for line 
>> 4371 “dst”? Or shall I 
>> Suppress warning for line 4382 (for the “unshared_expr(dst)”)?
>> 
>> I think that we should suppress warning for the latter, i.e 
>> “unshared_expr(dst)” at line 4382, right?
> 
> Yes, the one that's put into the GIMPLE stmt.

Okay.
> 
>>> 
>>> It is similar to what object.bitfld = 3; expands to,
>>> but usually only after the uninit pass.  Though, we have the
>>> optimize_bit_field_compare optimization, that is done very early
>>> and I wonder what uninit does about that.  Perhaps it ignores
>>> BIT_FIELD_REFs, I'd need to check that.
>> 
>> Yes, I see that uninitialized warning specially handles BIT_INSERT_EXPR as: 
>> (tree-ssa-uninit.cc)
>> 
>> 573   /* Do not warn if the result of the access is then used for
>> 574  a BIT_INSERT_EXPR. */
>> 575   if (lhs && TREE_CODE (lhs) == SSA_NAME)
>> 576 FOR_EACH_IMM_USE_FAST (luse_p, liter, lhs)
>> 577   {
>> 578 gimple *use_stmt = USE_STMT (luse_p);
>> 579 /* BIT_INSERT_EXPR first operand should not be considered
>> 580a use for the purpose of uninit warnings.  */
> 
> That follows the COMPLEX_EXPR handling I think.
> 
>>> 
>>> Anyway, if we want to disable uninit warnings for __builtin_clear_padding,
>>> we should do that with suppress_warning on the read stmts that load
>>> a byte (or more adjacent ones) before they are masked off and stored again,
>>> so that we don't warn about that.
>> 
>> IN addition to this read stmts, shall we suppress warnings for the following:
>> 
>> /* Emit a runtime loop:
>>   for (; buf.base != end; buf.base += sz)
>> __builtin_clear_padding (buf.base);  */
>> 
>> static void
>> clear_padding_emit_loop (clear_padding_struct *buf, tree type,
>> tree end, bool for_auto_init)
>> {
>> 
>> i.e, should we suppress warnings for the above “buf.base != end”, “buf.base 
>> += sz”?
>> 
>> No need to suppress warning for them since they just read the address of the 
>> object, not the object itself?
> 
> No need to supporess those indeed.

agreed.

thanks.

Will send out the new version soon.

Qing
> 
> Richard.



[PATCH] tree-optimization/104676 - free nb_iterations after loop distribution

2022-02-24 Thread Richard Biener via Gcc-patches
Loop distribution can release SSA names used in nb_iterations, make
sure to release those.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

2022-02-24  Richard Biener  

PR tree-optimization/104676
* tree-loop-distribution.cc (loop_distribution::execute):
Do a full scev_reset.

* gcc.dg/torture/pr104676.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr104676.c | 35 +
 gcc/tree-loop-distribution.cc   |  2 +-
 2 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr104676.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr104676.c 
b/gcc/testsuite/gcc.dg/torture/pr104676.c
new file mode 100644
index 000..50845bb9e15
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr104676.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-loop-distribution 
-ftree-parallelize-loops=2" } */
+
+struct S {
+  int f;
+};
+
+int n;
+
+int
+foo (struct S *s)
+{
+  int arr[3];
+  int v = 0;
+
+  for (n = 0; n < 2; ++n)
+{
+  int i;
+
+  for (i = 0; i < 2; ++i)
+{
+  int j;
+
+  for (j = 0; j < s->f; ++j)
+++v;
+}
+
+  if (v)
+arr[0] = 0;
+
+  arr[n + 1] = 0;
+}
+
+  return arr[0];
+}
diff --git a/gcc/tree-loop-distribution.cc b/gcc/tree-loop-distribution.cc
index c7b42857263..8ee40d88816 100644
--- a/gcc/tree-loop-distribution.cc
+++ b/gcc/tree-loop-distribution.cc
@@ -3853,7 +3853,7 @@ loop_distribution::execute (function *fun)
 
   /* Cached scalar evolutions now may refer to wrong or non-existing
 loops.  */
-  scev_reset_htab ();
+  scev_reset ();
   mark_virtual_operands_for_renaming (fun);
   rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
 }
-- 
2.34.1


Re: [PATCH V2] bpf: do not --enable-gcov for bpf-*-* targets

2022-02-24 Thread Richard Biener via Gcc-patches
On Thu, Feb 24, 2022 at 2:49 PM Jose E. Marchesi
 wrote:
>
>
> > On Wed, Feb 23, 2022 at 8:56 PM Jose E. Marchesi
> >  wrote:
> >>
> >> This patch changes the build machinery in order to disable the build
> >> of GCOV (both compiler and libgcc) in bpf-*-* targets.  The reason for
> >> this change is that BPF is (currently) too restricted in order to
> >> support the coverage instrumentalization.
> >>
> >> Tested in bpf-unknown-none and x86_64-linux-gnu targets.
> >
> > LGTM.
>
> Thanks.
>
> Is this OK for gcc 12 as well?  If yes, what is the proper branch other
> than master?

gcc 12 is still master.

Richard.

>
> >> 2022-02-23  Jose E. Marchesi  
> >>
> >> gcc/ChangeLog
> >>
> >> PR target/104656
> >> * configure.ac: --disable-gcov if targetting bpf-*.
> >> * configure: Regenerate.
> >>
> >> libgcc/ChangeLog
> >>
> >> PR target/104656
> >> * configure.ac: --disable-gcov if targetting bpf-*.
> >> * configure: Regenerate.
> >> ---
> >>  gcc/configure   | 14 +++---
> >>  gcc/configure.ac| 10 +-
> >>  libgcc/configure| 31 +++
> >>  libgcc/configure.ac | 17 -
> >>  4 files changed, 51 insertions(+), 21 deletions(-)
> >>
> >> diff --git a/gcc/configure b/gcc/configure
> >> index 258b17a226e..22eb3451e3d 100755
> >> --- a/gcc/configure
> >> +++ b/gcc/configure
> >> @@ -8085,12 +8085,20 @@ fi
> >>  if test "${enable_gcov+set}" = set; then :
> >>enableval=$enable_gcov;
> >>  else
> >> -  enable_gcov=yes
> >> +  case $target in
> >> +   bpf-*-*)
> >> + enable_gcov=no
> >> +   ;;
> >> +   *)
> >> + enable_gcov=yes
> >> +   ;;
> >> + esac
> >>  fi
> >>
> >>
> >>
> >>
> >> +
> >>  # Check whether --with-specs was given.
> >>  if test "${with_specs+set}" = set; then :
> >>withval=$with_specs; CONFIGURE_SPECS=$withval
> >> @@ -19659,7 +19667,7 @@ else
> >>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
> >>lt_status=$lt_dlunknown
> >>cat > conftest.$ac_ext <<_LT_EOF
> >> -#line 19662 "configure"
> >> +#line 19670 "configure"
> >>  #include "confdefs.h"
> >>
> >>  #if HAVE_DLFCN_H
> >> @@ -19765,7 +19773,7 @@ else
> >>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
> >>lt_status=$lt_dlunknown
> >>cat > conftest.$ac_ext <<_LT_EOF
> >> -#line 19768 "configure"
> >> +#line 19776 "configure"
> >>  #include "confdefs.h"
> >>
> >>  #if HAVE_DLFCN_H
> >> diff --git a/gcc/configure.ac b/gcc/configure.ac
> >> index 06750cee977..20da90901f8 100644
> >> --- a/gcc/configure.ac
> >> +++ b/gcc/configure.ac
> >> @@ -1041,7 +1041,15 @@ AC_SUBST(enable_shared)
> >>
> >>  AC_ARG_ENABLE(gcov,
> >>  [  --disable-gcov  don't provide libgcov and related host tools],
> >> -[], [enable_gcov=yes])
> >> +[], [case $target in
> >> +   bpf-*-*)
> >> + enable_gcov=no
> >> +   ;;
> >> +   *)
> >> + enable_gcov=yes
> >> +   ;;
> >> + esac])
> >> +
> >>  AC_SUBST(enable_gcov)
> >>
> >>  AC_ARG_WITH(specs,
> >> diff --git a/libgcc/configure b/libgcc/configure
> >> index 4919a56f518..52bf25d4e94 100755
> >> --- a/libgcc/configure
> >> +++ b/libgcc/configure
> >> @@ -630,6 +630,7 @@ LIPO
> >>  AR
> >>  toolexeclibdir
> >>  toolexecdir
> >> +enable_gcov
> >>  target_subdir
> >>  host_subdir
> >>  build_subdir
> >> @@ -653,7 +654,6 @@ build_cpu
> >>  build
> >>  with_aix_soname
> >>  enable_vtable_verify
> >> -enable_gcov
> >>  enable_shared
> >>  libgcc_topdir
> >>  target_alias
> >> @@ -701,7 +701,6 @@ with_target_subdir
> >>  with_cross_host
> >>  with_ld
> >>  enable_shared
> >> -enable_gcov
> >>  enable_vtable_verify
> >>  with_aix_soname
> >>  enable_version_specific_runtime_libs
> >> @@ -709,6 +708,7 @@ with_toolexeclibdir
> >>  with_slibdir
> >>  enable_maintainer_mode
> >>  with_build_libsubdir
> >> +enable_gcov
> >>  enable_largefile
> >>  enable_decimal_float
> >>  with_system_libunwind
> >> @@ -1342,12 +1342,12 @@ Optional Features:
> >>--disable-FEATURE   do not include FEATURE (same as 
> >> --enable-FEATURE=no)
> >>--enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
> >>--disable-shareddon't provide a shared libgcc
> >> -  --disable-gcov  don't provide libgcov and related host tools
> >>--enable-vtable-verifyEnable vtable verification feature
> >>--enable-version-specific-runtime-libsSpecify that runtime 
> >> libraries should be installed in a compiler-specific directory
> >>--enable-maintainer-mode
> >>enable make rules and dependencies not useful 
> >> (and
> >>sometimes confusing) to the casual installer
> >> +  --disable-gcov  don't provide libgcov and related host tools
> >>--disable-largefile omit support for large files
> >>--enable-decimal-float={no,yes,bid,dpd}
> >> enable decimal float extension to C.  Selecting 
> >> 'bid'
> >> @@ -2252,15 

Re: [PATCH V2] bpf: do not --enable-gcov for bpf-*-* targets

2022-02-24 Thread Jose E. Marchesi


> On Wed, Feb 23, 2022 at 8:56 PM Jose E. Marchesi
>  wrote:
>>
>> This patch changes the build machinery in order to disable the build
>> of GCOV (both compiler and libgcc) in bpf-*-* targets.  The reason for
>> this change is that BPF is (currently) too restricted in order to
>> support the coverage instrumentalization.
>>
>> Tested in bpf-unknown-none and x86_64-linux-gnu targets.
>
> LGTM.

Thanks.

Is this OK for gcc 12 as well?  If yes, what is the proper branch other
than master?

>> 2022-02-23  Jose E. Marchesi  
>>
>> gcc/ChangeLog
>>
>> PR target/104656
>> * configure.ac: --disable-gcov if targetting bpf-*.
>> * configure: Regenerate.
>>
>> libgcc/ChangeLog
>>
>> PR target/104656
>> * configure.ac: --disable-gcov if targetting bpf-*.
>> * configure: Regenerate.
>> ---
>>  gcc/configure   | 14 +++---
>>  gcc/configure.ac| 10 +-
>>  libgcc/configure| 31 +++
>>  libgcc/configure.ac | 17 -
>>  4 files changed, 51 insertions(+), 21 deletions(-)
>>
>> diff --git a/gcc/configure b/gcc/configure
>> index 258b17a226e..22eb3451e3d 100755
>> --- a/gcc/configure
>> +++ b/gcc/configure
>> @@ -8085,12 +8085,20 @@ fi
>>  if test "${enable_gcov+set}" = set; then :
>>enableval=$enable_gcov;
>>  else
>> -  enable_gcov=yes
>> +  case $target in
>> +   bpf-*-*)
>> + enable_gcov=no
>> +   ;;
>> +   *)
>> + enable_gcov=yes
>> +   ;;
>> + esac
>>  fi
>>
>>
>>
>>
>> +
>>  # Check whether --with-specs was given.
>>  if test "${with_specs+set}" = set; then :
>>withval=$with_specs; CONFIGURE_SPECS=$withval
>> @@ -19659,7 +19667,7 @@ else
>>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>>lt_status=$lt_dlunknown
>>cat > conftest.$ac_ext <<_LT_EOF
>> -#line 19662 "configure"
>> +#line 19670 "configure"
>>  #include "confdefs.h"
>>
>>  #if HAVE_DLFCN_H
>> @@ -19765,7 +19773,7 @@ else
>>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>>lt_status=$lt_dlunknown
>>cat > conftest.$ac_ext <<_LT_EOF
>> -#line 19768 "configure"
>> +#line 19776 "configure"
>>  #include "confdefs.h"
>>
>>  #if HAVE_DLFCN_H
>> diff --git a/gcc/configure.ac b/gcc/configure.ac
>> index 06750cee977..20da90901f8 100644
>> --- a/gcc/configure.ac
>> +++ b/gcc/configure.ac
>> @@ -1041,7 +1041,15 @@ AC_SUBST(enable_shared)
>>
>>  AC_ARG_ENABLE(gcov,
>>  [  --disable-gcov  don't provide libgcov and related host tools],
>> -[], [enable_gcov=yes])
>> +[], [case $target in
>> +   bpf-*-*)
>> + enable_gcov=no
>> +   ;;
>> +   *)
>> + enable_gcov=yes
>> +   ;;
>> + esac])
>> +
>>  AC_SUBST(enable_gcov)
>>
>>  AC_ARG_WITH(specs,
>> diff --git a/libgcc/configure b/libgcc/configure
>> index 4919a56f518..52bf25d4e94 100755
>> --- a/libgcc/configure
>> +++ b/libgcc/configure
>> @@ -630,6 +630,7 @@ LIPO
>>  AR
>>  toolexeclibdir
>>  toolexecdir
>> +enable_gcov
>>  target_subdir
>>  host_subdir
>>  build_subdir
>> @@ -653,7 +654,6 @@ build_cpu
>>  build
>>  with_aix_soname
>>  enable_vtable_verify
>> -enable_gcov
>>  enable_shared
>>  libgcc_topdir
>>  target_alias
>> @@ -701,7 +701,6 @@ with_target_subdir
>>  with_cross_host
>>  with_ld
>>  enable_shared
>> -enable_gcov
>>  enable_vtable_verify
>>  with_aix_soname
>>  enable_version_specific_runtime_libs
>> @@ -709,6 +708,7 @@ with_toolexeclibdir
>>  with_slibdir
>>  enable_maintainer_mode
>>  with_build_libsubdir
>> +enable_gcov
>>  enable_largefile
>>  enable_decimal_float
>>  with_system_libunwind
>> @@ -1342,12 +1342,12 @@ Optional Features:
>>--disable-FEATURE   do not include FEATURE (same as 
>> --enable-FEATURE=no)
>>--enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
>>--disable-shareddon't provide a shared libgcc
>> -  --disable-gcov  don't provide libgcov and related host tools
>>--enable-vtable-verifyEnable vtable verification feature
>>--enable-version-specific-runtime-libsSpecify that runtime libraries 
>> should be installed in a compiler-specific directory
>>--enable-maintainer-mode
>>enable make rules and dependencies not useful (and
>>sometimes confusing) to the casual installer
>> +  --disable-gcov  don't provide libgcov and related host tools
>>--disable-largefile omit support for large files
>>--enable-decimal-float={no,yes,bid,dpd}
>> enable decimal float extension to C.  Selecting 'bid'
>> @@ -2252,15 +2252,6 @@ fi
>>
>>
>>
>> -# Check whether --enable-gcov was given.
>> -if test "${enable_gcov+set}" = set; then :
>> -  enableval=$enable_gcov;
>> -else
>> -  enable_gcov=yes
>> -fi
>> -
>> -
>> -
>>  # Check whether --enable-vtable-verify was given.
>>  if test "${enable_vtable_verify+set}" = set; then :
>>enableval=$enable_vtable_verify; case "$enableval" in
>> @@ -2713,6 +2704,22 @@ fi
>>  

Re: [PATCH][GCC] aarch64: fix: ls64 tests fail on aarch64-linux-gnu_ilp32 [PR103729]

2022-02-24 Thread Richard Sandiford via Gcc-patches
Przemyslaw Wirkus  writes:
> Ping :)

Sorry, guess I missed this.

>> This patch is sorting issue with LS64 intrinsics tests failing with
>> aarch64-linux-gnu_ilp32 target.
>>
>> Regtested on aarch64-linux-gnu_ilp32, aarch64-elf and aarch64_be-elf
>> and no issues.
>>
>> OK to install?
>>
>> gcc/ChangeLog:
>>
>>    PR target/103729
>>    * config/aarch64/aarch64-builtins.c 
>>(aarch64_expand_builtin_ls64):
>>    Handle SImode for ILP32.
>
> --- 
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index 
> 0d09fe9dd6dd65c655f5bd0b9a622e7550b61a4b..58bcd99d25b79191589cf9bf8a99db4f4b6a6ba1
>  100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -2216,7 +2216,8 @@ aarch64_expand_builtin_ls64 (int fcode, tree exp, rtx 
> target)
>{
>  rtx op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
>  create_output_operand ([0], target, V8DImode);
> -create_input_operand ([1], op0, DImode);
> +create_input_operand ([1],
> +GET_MODE (op0) == SImode ? gen_reg_rtx (DImode) : op0, DImode);
>  expand_insn (CODE_FOR_ld64b, 2, ops);
>  return ops[0].value;
>}
> @@ -2234,7 +2235,8 @@ aarch64_expand_builtin_ls64 (int fcode, tree exp, rtx 
> target)
>  rtx op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
>  rtx op1 = expand_normal (CALL_EXPR_ARG (exp, 1));
>  create_output_operand ([0], target, DImode);
> -create_input_operand ([1], op0, DImode);
> +create_input_operand ([1],
> +GET_MODE (op0) == SImode ? gen_reg_rtx (DImode) : op0, DImode);
>  create_input_operand ([2], op1, V8DImode);
>  expand_insn (CODE_FOR_st64bv, 3, ops);
>  return ops[0].value;
> @@ -2244,7 +2246,8 @@ aarch64_expand_builtin_ls64 (int fcode, tree exp, rtx 
> target)
>  rtx op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
>  rtx op1 = expand_normal (CALL_EXPR_ARG (exp, 1));
>  create_output_operand ([0], target, DImode);
> -create_input_operand ([1], op0, DImode);
> +create_input_operand ([1],
> +GET_MODE (op0) == SImode ? gen_reg_rtx (DImode) : op0, DImode);
>  create_input_operand ([2], op1, V8DImode);
>  expand_insn (CODE_FOR_st64bv0, 3, ops);
>  return ops[0].value;

In the GET_MODE (op0) == SImode case, it looks like this will just set
the operand to a new DImode register, whose value is never initialised.

I think instead we need to use create_convert_operand_from
to convert from Pmode (with unsigned_p true).

Thanks,
Richard


Re: [PATCH] PR fortran/84519 - [F2018] STOP and ERROR STOP statements with QUIET specifier

2022-02-24 Thread Mikael Morin

Le 23/02/2022 à 23:21, Harald Anlauf via Fortran a écrit :

Dear Fortranners,

Fortran 2018 added a QUIET= specifier to STOP and ERROR STOP statements.
Janne already implemented the library side code four (4!) years ago,
but so far the frontend implementation was missing.

Furthermore, F2018 allows for non-default-integer stopcode expressions
(finally!).

The attached patch provides this implementation.

That was not too much fun for the following reasons:

- fixed format vs. free format
- F95 and F2003 apparently did not require a blank between STOP and
   stopcode, while F2008+ do require it.

This should explain for the three testcases.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

One step closer to F2018!


Please move the error from trans-stmt.cc to resolve.cc.
Otherwise looks good, and you have a green light by Jerry, but I would 
rather defer this to gcc-13.


Mikael


Re: PR64454: (x % y) % y

2022-02-24 Thread Andrew Pinski via Gcc-patches
On Thu, May 14, 2015 at 9:33 AM Marc Glisse  wrote:
>
> Hello,
>
> after this patch I think I'll close the PR. This was regtested on
> ppc64le-redhat-linux.
>
> Apparently I wrote this patch in a file that already had a trivial hunk:
> -1-A -> ~A is rejected for complex while -A-1 isn't, there is no reason
> for this difference (maybe there was before integer_all_onesp /
> integer_minus_onep was introduced), I hope you don't mind.


So this hunk actually was wrong in the end. PR 104675 is opened for
the reason why it was wrong.
-A-1 should also be rejected for complex types too.

I will also notice the gimple verifiers should be catching BIT_* on
the complex types as they don't make sense really but currently is
not. Though that would be something for GCC 13.

Thanks,
Andrew Pinski

>
> I am wondering if we want some helper (like :c for commutative operations)
> to avoid duplicating patterns for xx. We could also, when a
> comparison x<=y doesn't simplify, see if !!(x<=y) simplifies better, but
> that's becoming a bit complicated.
>
> 2015-05-15  Marc Glisse  
>
> PR tree-optimization/64454
> gcc/
> * match.pd ((X % Y) % Y, (X % Y) < Y): New patterns.
> (-1 - A -> ~A): Remove unnecessary condition.
> gcc/testsuite/
> * gcc.dg/modmod.c: New testcase.
>
> --
> Marc Glisse


Re: [PATCH][libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

2022-02-24 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 24, 2022 at 11:32:53AM +0100, Tom de Vries wrote:
> libgomp/ChangeLog:
> 
> 2022-02-24  Tom de Vries  
> 
>   * testsuite/libgomp.c/declare-variant-3-sm30.c: New test.
>   * testsuite/libgomp.c/declare-variant-3-sm35.c: New test.
>   * testsuite/libgomp.c/declare-variant-3-sm53.c: New test.
>   * testsuite/libgomp.c/declare-variant-3-sm70.c: New test.
>   * testsuite/libgomp.c/declare-variant-3-sm75.c: New test.
>   * testsuite/libgomp.c/declare-variant-3-sm80.c: New test.
>   * testsuite/libgomp.c/declare-variant-3.h: New header file.

LGTM, thanks.

Jakub



Re: Add testcase from PR103845

2022-02-24 Thread Richard Biener via Gcc-patches
On Wed, Feb 23, 2022 at 11:47 PM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> This problem was already fixed as part of PR104263: the abnormal edge
> that remained from before inlining didn't make sense after inlining.
> So this patch adds only the testcase.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

>
> for  gcc/testsuite/ChangeLog
>
> PR tree-optimization/103845
> PR tree-optimization/104263
> * gcc.dg/pr103845.c: New.
> ---
>  gcc/testsuite/gcc.dg/pr103845.c |   29 +
>  1 file changed, 29 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr103845.c
>
> diff --git a/gcc/testsuite/gcc.dg/pr103845.c b/gcc/testsuite/gcc.dg/pr103845.c
> new file mode 100644
> index 0..45ab518d07c9a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr103845.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fharden-compares -fno-ipa-pure-const" } */
> +
> +int
> +baz (void);
> +
> +__attribute__ ((returns_twice)) void
> +bar (void)
> +{
> +}
> +
> +int
> +quux (int y, int z)
> +{
> +  return (y || z >= 0) ? y : z;
> +}
> +
> +int
> +foo (int x)
> +{
> +  int a = 0, b = x == a;
> +
> +  bar ();
> +
> +  if (!!baz () < quux (b, a))
> +++x;
> +
> +  return x;
> +}
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


Re: Cope with NULL dw_cfi_cfa_loc

2022-02-24 Thread Richard Biener via Gcc-patches
On Wed, Feb 23, 2022 at 11:46 PM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> In def_cfa_0, we may set the 2nd operand's dw_cfi_cfa_loc to NULL, but
> then cfi_oprnd_equal_p calls cfa_equal_p with a NULL dw_cfa_location*.
> This patch aranges for us to tolerate NULL dw_cfi_cfa_loc.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

Richard.

>
> for  gcc/ChangeLog
>
> PR middle-end/104540
> * dwarf2cfi.cc (cfi_oprnd_equal_p): Cope with NULL
> dw_cfi_cfa_loc.
>
> for  gcc/testsuite/ChangeLog
>
> PR middle-end/104540
> * g++.dg/PR104540.C: New.
> ---
>  gcc/dwarf2cfi.cc|3 +++
>  gcc/testsuite/g++.dg/pr104540.C |   21 +
>  2 files changed, 24 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/pr104540.C
>
> diff --git a/gcc/dwarf2cfi.cc b/gcc/dwarf2cfi.cc
> index 9ca97d7a3bf56..ab7c5cc5b27b5 100644
> --- a/gcc/dwarf2cfi.cc
> +++ b/gcc/dwarf2cfi.cc
> @@ -788,6 +788,9 @@ cfi_oprnd_equal_p (enum dw_cfi_oprnd_type t, dw_cfi_oprnd 
> *a, dw_cfi_oprnd *b)
>  case dw_cfi_oprnd_loc:
>return loc_descr_equal_p (a->dw_cfi_loc, b->dw_cfi_loc);
>  case dw_cfi_oprnd_cfa_loc:
> +  /* If any of them is NULL, don't dereference either.  */
> +  if (!a->dw_cfi_cfa_loc || !b->dw_cfi_cfa_loc)
> +   return a->dw_cfi_cfa_loc == b->dw_cfi_cfa_loc;
>return cfa_equal_p (a->dw_cfi_cfa_loc, b->dw_cfi_cfa_loc);
>  }
>gcc_unreachable ();
> diff --git a/gcc/testsuite/g++.dg/pr104540.C b/gcc/testsuite/g++.dg/pr104540.C
> new file mode 100644
> index 0..a86ecbfd088c3
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr104540.C
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fharden-conditional-branches -mforce-drap 
> -mstackrealign --param=max-grow-copy-bb-insns=125" } */
> +
> +char c;
> +int i;
> +
> +void bar(int);
> +
> +struct S {
> +  int mi;
> +  long ml;
> +  S(int);
> +};
> +
> +
> +void foo() {
> +  int s = c == 0 ? 1 : 2;
> +  bar(s);
> +  if (i)
> +S s(0);
> +}
>
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


Re: [PATCH][libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

2022-02-24 Thread Tom de Vries via Gcc-patches

On 2/24/22 11:09, Jakub Jelinek wrote:

On Thu, Feb 24, 2022 at 11:01:22AM +0100, Tom de Vries wrote:

[ was: Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70 ]

On 2/24/22 09:29, Tom de Vries wrote:

I'll try to submit a patch with one or more test-cases.


Hi,

These test-cases exercise the omp declare variant construct using the
available nvptx isas.

OK for trunk?

Thanks,
- Tom



[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

Add openmp test-cases that test the omp declare variant construct:
...
   #pragma omp declare variant (f30) match (device={isa("sm_30")})
...
using the available nvptx isas.

On a Pascal board GT 1030 with sm_61, we have these unsupported:
...
UNSUPPORTED: libgomp.c/declare-variant-3-sm70.c
UNSUPPORTED: libgomp.c/declare-variant-3-sm75.c
UNSUPPORTED: libgomp.c/declare-variant-3-sm80.c
...
and on a Turing board T400 with sm_75, we have this only this one:
...
UNSUPPORTED: libgomp.c/declare-variant-3-sm80.c
...

Tested on x86_64 with nvptx accelerator.


I think testing it through dg-do link tests with -fdump-tree-optimized
or so would be better, you wouldn't need access to actual hardware level
and checking in the dump what function is actually called for each case is
easy.



Done, expect for the sm_30 test which is still dg-do run (although I've 
added the compile time test) which should pass on all boards (since we 
don't support below sm_30).


OK for trunk?

Thanks,
- Tom
[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

Add openmp test-cases that test the omp declare variant construct:
...
  #pragma omp declare variant (f30) match (device={isa("sm_30")})
...
using the available nvptx isas.

Only the one for sm_30 is a dg-do run test-case, the other ones are dg-do
link.

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-02-24  Tom de Vries  

	* testsuite/libgomp.c/declare-variant-3-sm30.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm35.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm53.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm70.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm75.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm80.c: New test.
	* testsuite/libgomp.c/declare-variant-3.h: New header file.

---
 .../testsuite/libgomp.c/declare-variant-3-sm30.c   |  7 +++
 .../testsuite/libgomp.c/declare-variant-3-sm35.c   |  7 +++
 .../testsuite/libgomp.c/declare-variant-3-sm53.c   |  7 +++
 .../testsuite/libgomp.c/declare-variant-3-sm70.c   |  7 +++
 .../testsuite/libgomp.c/declare-variant-3-sm75.c   |  7 +++
 .../testsuite/libgomp.c/declare-variant-3-sm80.c   |  7 +++
 libgomp/testsuite/libgomp.c/declare-variant-3.h| 66 ++
 7 files changed, 108 insertions(+)

diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c b/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c
new file mode 100644
index 000..ad1602c13cd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c
@@ -0,0 +1,7 @@
+/* { dg-do run { target { offload_target_nvptx } } } */
+/* { dg-additional-options "-foffload=-misa=sm_30" } */
+/* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
+
+#include "declare-variant-3.h"
+
+/* { dg-final { scan-offload-tree-dump "= f30 \\(\\);" "optimized" } } */
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c b/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c
new file mode 100644
index 000..1a7cda2456b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c
@@ -0,0 +1,7 @@
+/* { dg-do link { target { offload_target_nvptx } } } */
+/* { dg-additional-options "-foffload=-misa=sm_35" } */
+/* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
+
+#include "declare-variant-3.h"
+
+/* { dg-final { scan-offload-tree-dump "= f35 \\(\\);" "optimized" } } */
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c b/libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c
new file mode 100644
index 000..a37b5fdaa28
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm53.c
@@ -0,0 +1,7 @@
+/* { dg-do link { target { offload_target_nvptx } } } */
+/* { dg-additional-options "-foffload=-misa=sm_53" } */
+/* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
+
+#include "declare-variant-3.h"
+
+/* { dg-final { scan-offload-tree-dump "= f53 \\(\\);" "optimized" } } */
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c b/libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c
new file mode 100644
index 000..ab022cd79f9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm70.c
@@ -0,0 +1,7 @@
+/* { dg-do link { target { offload_target_nvptx } } } */
+/* { dg-additional-options "-foffload=-misa=sm_70" } */
+/* { dg-additional-options "-foffload=-fdump-tree-optimized" } */
+
+#include "declare-variant-3.h"
+
+/* { dg-final { scan-offload-tree-dump "= f70 \\(\\);" "optimized" } } */
diff --git 

Re: Copy EH phi args for throwing hardened compares

2022-02-24 Thread Richard Biener via Gcc-patches
On Wed, Feb 23, 2022 at 11:44 PM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> When we duplicate a throwing compare for hardening, the EH edge from
> the original compare gets duplicated for the inverted compare, but we
> failed to adjust any PHI nodes in the EH block.  This patch adds the
> needed adjustment, copying the PHI args from those of the preexisting
> edge.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

>
> for  gcc/ChangeLog
>
> PR tree-optimization/103856
> * gimple-harden-conditionals.cc (non_eh_succ_edge): Enable the
> eh edge to be requested through an extra parameter.
> (pass_harden_compares::execute): Copy PHI args in the EH dest
> block for the new EH edge added for the inverted compare.
>
> for  gcc/testsuite/ChangeLog
>
> PR tree-optimization/103856
> * g++.dg/pr103856.C: New.
> ---
>  gcc/gimple-harden-conditionals.cc |   31 ---
>  gcc/testsuite/g++.dg/pr103856.C   |   17 +
>  2 files changed, 45 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/pr103856.C
>
> diff --git a/gcc/gimple-harden-conditionals.cc 
> b/gcc/gimple-harden-conditionals.cc
> index 9418194cb20c2..6a5fc3fb9e1a2 100644
> --- a/gcc/gimple-harden-conditionals.cc
> +++ b/gcc/gimple-harden-conditionals.cc
> @@ -361,9 +361,9 @@ make_pass_harden_conditional_branches (gcc::context *ctxt)
>  }
>
>  /* Return the fallthru edge of a block whose other edge is an EH
> -   edge.  */
> +   edge.  If EHP is not NULL, store the EH edge in it.  */
>  static inline edge
> -non_eh_succ_edge (basic_block bb)
> +non_eh_succ_edge (basic_block bb, edge *ehp = NULL)
>  {
>gcc_checking_assert (EDGE_COUNT (bb->succs) == 2);
>
> @@ -375,6 +375,9 @@ non_eh_succ_edge (basic_block bb)
>gcc_checking_assert (!(ret->flags & EDGE_EH)
>&& (eh->flags & EDGE_EH));
>
> +  if (ehp)
> +*ehp = eh;
> +
>return ret;
>  }
>
> @@ -538,8 +541,9 @@ pass_harden_compares::execute (function *fun)
> add_stmt_to_eh_lp (asgnck, lookup_stmt_eh_lp (asgn));
> make_eh_edges (asgnck);
>
> +   edge ckeh;
> basic_block nbb = split_edge (non_eh_succ_edge
> - (gimple_bb (asgnck)));
> + (gimple_bb (asgnck), ));
> gsi_split = gsi_start_bb (nbb);
>
> if (dump_file)
> @@ -547,6 +551,27 @@ pass_harden_compares::execute (function *fun)
>"Splitting non-EH edge from block %i into %i after"
>" the newly-inserted reversed throwing compare\n",
>gimple_bb (asgnck)->index, nbb->index);
> +
> +   if (!gimple_seq_empty_p (phi_nodes (ckeh->dest)))
> + {
> +   edge aseh;
> +   non_eh_succ_edge (gimple_bb (asgn), );
> +
> +   gcc_checking_assert (aseh->dest == ckeh->dest);
> +
> +   for (gphi_iterator psi = gsi_start_phis (ckeh->dest);
> +!gsi_end_p (psi); gsi_next ())
> + {
> +   gphi *phi = psi.phi ();
> +   add_phi_arg (phi, PHI_ARG_DEF_FROM_EDGE (phi, aseh), ckeh,
> +gimple_phi_arg_location_from_edge (phi, 
> aseh));
> + }
> +
> +   if (dump_file)
> + fprintf (dump_file,
> +  "Copying PHI args in EH block %i from %i to %i\n",
> +  aseh->dest->index, aseh->src->index, 
> ckeh->src->index);
> + }
>   }
>
> gcc_checking_assert (single_succ_p (gsi_bb (gsi_split)));
> diff --git a/gcc/testsuite/g++.dg/pr103856.C b/gcc/testsuite/g++.dg/pr103856.C
> new file mode 100644
> index 0..26c7d8750255a
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr103856.C
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Og -fnon-call-exceptions -fsignaling-nans 
> -fharden-compares" } */
> +
> +struct S {
> +  S(float);
> +  S();
> +  operator float();
> +  ~S() {}
> +};
> +
> +int
> +main() {
> +  S s_arr[] = {2};
> +  S var1;
> +  if (var1)
> +;
> +}
>
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


Re: [PATCH V2] bpf: do not --enable-gcov for bpf-*-* targets

2022-02-24 Thread Richard Biener via Gcc-patches
On Wed, Feb 23, 2022 at 8:56 PM Jose E. Marchesi
 wrote:
>
> This patch changes the build machinery in order to disable the build
> of GCOV (both compiler and libgcc) in bpf-*-* targets.  The reason for
> this change is that BPF is (currently) too restricted in order to
> support the coverage instrumentalization.
>
> Tested in bpf-unknown-none and x86_64-linux-gnu targets.

LGTM.

Richard.

> 2022-02-23  Jose E. Marchesi  
>
> gcc/ChangeLog
>
> PR target/104656
> * configure.ac: --disable-gcov if targetting bpf-*.
> * configure: Regenerate.
>
> libgcc/ChangeLog
>
> PR target/104656
> * configure.ac: --disable-gcov if targetting bpf-*.
> * configure: Regenerate.
> ---
>  gcc/configure   | 14 +++---
>  gcc/configure.ac| 10 +-
>  libgcc/configure| 31 +++
>  libgcc/configure.ac | 17 -
>  4 files changed, 51 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/configure b/gcc/configure
> index 258b17a226e..22eb3451e3d 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -8085,12 +8085,20 @@ fi
>  if test "${enable_gcov+set}" = set; then :
>enableval=$enable_gcov;
>  else
> -  enable_gcov=yes
> +  case $target in
> +   bpf-*-*)
> + enable_gcov=no
> +   ;;
> +   *)
> + enable_gcov=yes
> +   ;;
> + esac
>  fi
>
>
>
>
> +
>  # Check whether --with-specs was given.
>  if test "${with_specs+set}" = set; then :
>withval=$with_specs; CONFIGURE_SPECS=$withval
> @@ -19659,7 +19667,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 19662 "configure"
> +#line 19670 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> @@ -19765,7 +19773,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 19768 "configure"
> +#line 19776 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> diff --git a/gcc/configure.ac b/gcc/configure.ac
> index 06750cee977..20da90901f8 100644
> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -1041,7 +1041,15 @@ AC_SUBST(enable_shared)
>
>  AC_ARG_ENABLE(gcov,
>  [  --disable-gcov  don't provide libgcov and related host tools],
> -[], [enable_gcov=yes])
> +[], [case $target in
> +   bpf-*-*)
> + enable_gcov=no
> +   ;;
> +   *)
> + enable_gcov=yes
> +   ;;
> + esac])
> +
>  AC_SUBST(enable_gcov)
>
>  AC_ARG_WITH(specs,
> diff --git a/libgcc/configure b/libgcc/configure
> index 4919a56f518..52bf25d4e94 100755
> --- a/libgcc/configure
> +++ b/libgcc/configure
> @@ -630,6 +630,7 @@ LIPO
>  AR
>  toolexeclibdir
>  toolexecdir
> +enable_gcov
>  target_subdir
>  host_subdir
>  build_subdir
> @@ -653,7 +654,6 @@ build_cpu
>  build
>  with_aix_soname
>  enable_vtable_verify
> -enable_gcov
>  enable_shared
>  libgcc_topdir
>  target_alias
> @@ -701,7 +701,6 @@ with_target_subdir
>  with_cross_host
>  with_ld
>  enable_shared
> -enable_gcov
>  enable_vtable_verify
>  with_aix_soname
>  enable_version_specific_runtime_libs
> @@ -709,6 +708,7 @@ with_toolexeclibdir
>  with_slibdir
>  enable_maintainer_mode
>  with_build_libsubdir
> +enable_gcov
>  enable_largefile
>  enable_decimal_float
>  with_system_libunwind
> @@ -1342,12 +1342,12 @@ Optional Features:
>--disable-FEATURE   do not include FEATURE (same as 
> --enable-FEATURE=no)
>--enable-FEATURE[=ARG]  include FEATURE [ARG=yes]
>--disable-shareddon't provide a shared libgcc
> -  --disable-gcov  don't provide libgcov and related host tools
>--enable-vtable-verifyEnable vtable verification feature
>--enable-version-specific-runtime-libsSpecify that runtime libraries 
> should be installed in a compiler-specific directory
>--enable-maintainer-mode
>enable make rules and dependencies not useful (and
>sometimes confusing) to the casual installer
> +  --disable-gcov  don't provide libgcov and related host tools
>--disable-largefile omit support for large files
>--enable-decimal-float={no,yes,bid,dpd}
> enable decimal float extension to C.  Selecting 'bid'
> @@ -2252,15 +2252,6 @@ fi
>
>
>
> -# Check whether --enable-gcov was given.
> -if test "${enable_gcov+set}" = set; then :
> -  enableval=$enable_gcov;
> -else
> -  enable_gcov=yes
> -fi
> -
> -
> -
>  # Check whether --enable-vtable-verify was given.
>  if test "${enable_vtable_verify+set}" = set; then :
>enableval=$enable_vtable_verify; case "$enableval" in
> @@ -2713,6 +2704,22 @@ fi
>  target_subdir=${target_noncanonical}
>
>
> +# Check whether --enable-gcov was given.
> +if test "${enable_gcov+set}" = set; then :
> +  enableval=$enable_gcov;
> +else
> +  case $target in
> +   bpf-*-*)
> + enable_gcov=no
> +   ;;
> +   *)
> + 

Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.

2022-02-24 Thread Richard Biener via Gcc-patches
On Sat, 19 Feb 2022, Qing Zhao wrote:

> Hi,
> 
> This is the 2nd patch for fixing pr102276.
> 
> Adding -Wtrivial-auto-var-init and update documentation.
> 
> Adding a new warning option -Wtrivial-auto-var-init to report cases when
> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
> time, update documentation for -ftrivial-auto-var-init to connect it with
> the new warning option -Wtrivial-auto-var-init,  and add documentation
> for -Wtrivial-auto-var-init.
> 
> Bootstraped and regression tested on both x86 and aarch64.
> 
> Okay for committing?
> 
> thanks.
> 
> Qing.
> 
> ==
> From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001
> From: Qing Zhao 
> Date: Fri, 18 Feb 2022 15:53:15 +
> Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update documentation.
> 
> Adding a new warning option -Wtrivial-auto-var-init to report cases when
> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
> time, update documentation for -ftrivial-auto-var-init to connect it with
> the new warning option -Wtrivial-auto-var-init,  and add documentation
> for -Wtrivial-auto-var-init.
> 
> 2022-02-18 Qing Zhao  
> gcc/ChangeLog:
> 
>   * common.opt (-Wtrivial-auto-var-init): New option.
>   * doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
>   (-ftrivial-auto-var-init): Update option;
>   * gimplify.cc (maybe_warn_switch_unreachable): Rename...
>   (maybe_warn_switch_unreachable_and_auto_init): ...to this.
>   (gimplify_switch_expr): Call new function.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/auto-init-pr102276-3.c: New test.
>   * gcc.dg/auto-init-pr102276-4.c: New test.
> ---
>  gcc/common.opt  |   4 +
>  gcc/doc/invoke.texi |  14 ++-
>  gcc/gimplify.cc | 100 +++-
>  gcc/testsuite/gcc.dg/auto-init-pr102276-3.c |  40 
>  gcc/testsuite/gcc.dg/auto-init-pr102276-4.c |  40 
>  5 files changed, 175 insertions(+), 23 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index c21e5273ae3..22c95dbfa49 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -801,6 +801,10 @@ Wtrampolines
>  Common Var(warn_trampolines) Warning
>  Warn whenever a trampoline is generated.
>  
> +Wtrivial-auto-var-init
> +Common Var(warn_trivial_auto_var_init) Warning Init(0)
> +Warn about where -ftrivial-auto-var-init cannot initialize the auto variable.
> +

Warn about cases where ... initialize a variable.

>  Wtype-limits
>  Common Var(warn_type_limits) Warning EnabledBy(Wextra)
>  Warn if a comparison is always true or always false due to the limited range 
> of the data type.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e1a00c80307..c61a5b4b4a5 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}.
>  -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
>  -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
>  -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
> --Wtsan -Wtype-limits  -Wundef @gol
> +-Wtrivial-auto-var-init -Wtsan -Wtype-limits  -Wundef @gol
>  -Wuninitialized  -Wunknown-pragmas @gol
>  -Wunsuffixed-float-constants  -Wunused @gol
>  -Wunused-but-set-parameter  -Wunused-but-set-variable @gol
> @@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ 
> programs.
>  Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch}
>  built-in functions are used.  These functions changed semantics in GCC 4.4.
>  
> +@item -Wtrivial-auto-var-init
> +@opindex Wtrivial-auto-var-init
> +@opindex Wno-trivial-auto-var-init
> +Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic
> +variable.  A common situation is an automatic variable that is declared
> +between the controlling expression and the first case lable of a 
> @code{switch}
> +statement.
> +
>  @item -Wunused-but-set-parameter
>  @opindex Wunused-but-set-parameter
>  @opindex Wno-unused-but-set-parameter
> @@ -12314,6 +12322,10 @@ initializer as uninitialized, 
> @option{-Wuninitialized} and
>  warning messages on such automatic variables.
>  With this option, GCC will also initialize any padding of automatic variables
>  that have structure or union types to zeroes.
> +However, the current implementation cannot initialize automatic variables 
> that
> +are declared between the controlling expression and the first case of a
> +@code{switch} statement.  Using @option{-Wtrivial-auto-var-init} to report 
> all
> +such cases.
>  
>  The three values of @var{choice} are:
>  
> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
> index 4e3bbf5314d..7e52794691f 100644
> --- a/gcc/gimplify.cc
> +++ 

Re: [PATCH 1/2][middle-end/102276] Don't emit switch-unreachable warnings for -ftrivial-auto-var-init (PR102276)

2022-02-24 Thread Richard Biener via Gcc-patches
On Sat, 19 Feb 2022, Qing Zhao wrote:

> Hi,
> 
> Per our discussion in the bug report 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102276
> 
> We decided to go with the following solution:
> 
> 1. avoid emitting switch-unreachable warnings for -ftrivial-auto-var-init;
> 2. adding a new option -Wtrivial-auto-var-init to emit warnings for the 
> switch-unreadable cases to suggest the user modify the source code;
> 3. update documentation of -ftrivial-auto-var-init for the limitation on 
> switch-unreachable cases and introduce the new option -Wtrivial-auto-var-init
> 
> with the above 1, we can resolve the current immediate issue of spurious 
> warnings of using -ftrivial-auto-var-init to make kernel build succeed;
> with the above 2, we provide the user a way to know that 
> -ftrivial-auto-var-init has limitation on the switch-unreachable cases, and 
> user should modify the source code to avoid this problem;
> with the above 3, we will provide the user a clear documentation of the 
> -ftrivial-auto-var-init and also provide suggestions how to resolve this 
> issue. 
> 
> There are two patches included for this bug.  This is the first one.
> 
> The patches has been bootstrapped and regression tested on both x86 and 
> aarch64.
> 
> Okay for commit?
> 
> Thanks.
> 
> Qing.
> 
> ===
> 
> From 65bc9607ff35ad49e5501ec5c392293c5b6358d0 Mon Sep 17 00:00:00 2001
> From: Qing Zhao 
> Date: Fri, 18 Feb 2022 15:35:53 +
> Subject: [PATCH 1/2] Don't emit switch-unreachable warnings for
>  -ftrivial-auto-var-init (PR102276)
> 
> for the following testing case:
>   1 int g(int *);
>   2 int f1()
>   3 {
>   4 switch (0) {
>   5 int x;
>   6 default:
>   7 return g();
>   8 }
>   9 }
> compiling with -O -ftrivial-auto-var-init causes spurious warning:
> warning: statement will never be executed [-Wswitch-unreachable]
> 5 | int x;
>   | ^
> This is due to the compiler-generated initialization at the point of
> the declaration.
> 
> We could avoid the warning by adjusting the routine
> "maybe_warn_switch_unreachable" to exclude the following cases:
> 
> when
> flag_auto_var_init > AUTO_INIT_UNINITIALIZED
> And
> call to .DEFERRED_INIT
> 
> 2022-02-18 Qing Zhao  
> gcc/ChangeLog:
> 
>   * gimplify.cc (maybe_warn_switch_unreachable): Don't warn for compiler
>   -generated initializations for -ftrivial-auto-var-init.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/auto-init-pr102276-1.c: New test.
>   * gcc.dg/auto-init-pr102276-2.c: New test.
> ---
>  gcc/gimplify.cc |  8 -
>  gcc/testsuite/gcc.dg/auto-init-pr102276-1.c | 38 +
>  gcc/testsuite/gcc.dg/auto-init-pr102276-2.c | 38 +
>  3 files changed, 83 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-2.c
> 
> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
> index f570daa015a..4e3bbf5314d 100644
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -2103,7 +2103,13 @@ maybe_warn_switch_unreachable (gimple_seq seq)
> && TREE_CODE (gimple_goto_dest (stmt)) == LABEL_DECL
> && DECL_ARTIFICIAL (gimple_goto_dest (stmt)))
>   /* Don't warn for compiler-generated gotos.  These occur
> -in Duff's devices, for example.  */;
> +in Duff's devices, for example.  */
> + ;
> +  else if ((flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
> + && (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)))
> + /* Don't warn for compiler-generated initializations for
> +   -ftrivial-auto-var-init.  */
> + ;

I think you want to instead skip these in warn_switch_unreachable_r
since otherwise a .DEFERRED_INIT can silence the warning for a real
stmt following it that is not reachable.

Richard.

>else
>   warning_at (gimple_location (stmt), OPT_Wswitch_unreachable,
>   "statement will never be executed");
> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr102276-1.c 
> b/gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
> new file mode 100644
> index 000..d574926e0c8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/auto-init-pr102276-1.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -Wall -ftrivial-auto-var-init=zero" } */
> +
> +int g(int *);
> +int f()
> +{
> +switch (0) { 
> +int x;  /* { dg-bogus "statement will never be executed" } */
> +default:
> +return g();
> +}
> +}
> +
> +int g1(int);
> +int f1()
> +{
> +switch (0) {
> +int x; /* { dg-bogus "statement will never be executed" } */
> +default:
> +return g1(x);  /* { dg-warning "is used uninitialized" } */
> +}
> +}
> +
> +struct S
> +{
> +  char a;
> +  int b;
> +};
> +int g2(int);
> +int f2(int input)
> +{
> +switch (0) {
> +struct S x; /* { dg-bogus "statement will never be 

Re: [PATCH][libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

2022-02-24 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 24, 2022 at 11:01:22AM +0100, Tom de Vries wrote:
> [ was: Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70 ]
> 
> On 2/24/22 09:29, Tom de Vries wrote:
> > I'll try to submit a patch with one or more test-cases.
> 
> Hi,
> 
> These test-cases exercise the omp declare variant construct using the
> available nvptx isas.
> 
> OK for trunk?
> 
> Thanks,
> - Tom

> [libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c
> 
> Add openmp test-cases that test the omp declare variant construct:
> ...
>   #pragma omp declare variant (f30) match (device={isa("sm_30")})
> ...
> using the available nvptx isas.
> 
> On a Pascal board GT 1030 with sm_61, we have these unsupported:
> ...
> UNSUPPORTED: libgomp.c/declare-variant-3-sm70.c
> UNSUPPORTED: libgomp.c/declare-variant-3-sm75.c
> UNSUPPORTED: libgomp.c/declare-variant-3-sm80.c
> ...
> and on a Turing board T400 with sm_75, we have this only this one:
> ...
> UNSUPPORTED: libgomp.c/declare-variant-3-sm80.c
> ...
> 
> Tested on x86_64 with nvptx accelerator.

I think testing it through dg-do link tests with -fdump-tree-optimized
or so would be better, you wouldn't need access to actual hardware level
and checking in the dump what function is actually called for each case is
easy.

Jakub



Re: [PATCH] sccvn: Fix visit_reference_op_call value numbering of vdefs [PR104601]

2022-02-24 Thread Richard Biener via Gcc-patches
On Thu, 24 Feb 2022, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled, because -fipa-pure-const discovers
> that bar is const, but when sccvn during fre3 sees
>   # .MEM_140 = VDEF <.MEM_96>
>   *__pred$__d_43 = _50 (_49);
> where _50 value numbers to , it value numbers .MEM_140 to
> vuse_ssa_val (gimple_vuse (stmt)).  For const/pure calls that return
> a SSA_NAME (or don't have lhs) that is fine, those calls don't store
> anything, but if the lhs is present and not an SSA_NAME, value numbering
> the vdef to anything but itself means that e.g. walk_non_aliased_vuses
> won't consider the call, but the call acts as a store to its lhs.
> When it is ignored, sccvn will return whatever has been stored to the
> lhs earlier.
> 
> I've bootstrapped/regtested an earlier version of this patch, which did the
> if (!lhs && gimple_call_lhs (stmt))
>   changed |= set_ssa_val_to (vdef, vdef);
> part before else if (vnresult->result_vdef), and that regressed
> +FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo (" 1
> +FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo2 (" 1
> so this updated patch uses result_vdef there as before and only otherwise
> (which I think must be the const/pure case) decides based on whether the
> lhs is non-SSA_NAME.
> 
> Ok for trunk if it passes another bootstrap/regtest?

OK.

Thanks,
Richard.

> 2022-02-24  Jakub Jelinek  
> 
>   PR tree-optimization/104601
>   * tree-ssa-sccvn.cc (visit_reference_op_call): For calls with
>   non-SSA_NAME lhs value number vdef to itself instead of e.g. the
>   vuse value number.
> 
>   * g++.dg/torture/pr104601.C: New test.
> 
> --- gcc/tree-ssa-sccvn.cc.jj  2022-02-11 00:19:22.432063254 +0100
> +++ gcc/tree-ssa-sccvn.cc 2022-02-24 09:43:39.715051959 +0100
> @@ -5218,12 +5218,20 @@ visit_reference_op_call (tree lhs, gcall
>  
>if (vnresult)
>  {
> -  if (vnresult->result_vdef && vdef)
> - changed |= set_ssa_val_to (vdef, vnresult->result_vdef);
> -  else if (vdef)
> - /* If the call was discovered to be pure or const reflect
> -that as far as possible.  */
> - changed |= set_ssa_val_to (vdef, vuse_ssa_val (gimple_vuse (stmt)));
> +  if (vdef)
> + {
> +   if (vnresult->result_vdef)
> + changed |= set_ssa_val_to (vdef, vnresult->result_vdef);
> +   else if (!lhs && gimple_call_lhs (stmt))
> + /* If stmt has non-SSA_NAME lhs, value number the vdef to itself,
> +as the call still acts as a lhs store.  */
> + changed |= set_ssa_val_to (vdef, vdef);
> +   else
> + /* If the call was discovered to be pure or const reflect
> +that as far as possible.  */
> + changed |= set_ssa_val_to (vdef,
> +vuse_ssa_val (gimple_vuse (stmt)));
> + }
>  
>if (!vnresult->result && lhs)
>   vnresult->result = lhs;
> @@ -5248,7 +5256,11 @@ visit_reference_op_call (tree lhs, gcall
> if (TREE_CODE (fn) == ADDR_EXPR
> && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL
> && (flags_from_decl_or_type (TREE_OPERAND (fn, 0))
> -   & (ECF_CONST | ECF_PURE)))
> +   & (ECF_CONST | ECF_PURE))
> +   /* If stmt has non-SSA_NAME lhs, value number the
> +  vdef to itself, as the call still acts as a lhs
> +  store.  */
> +   && (lhs || gimple_call_lhs (stmt) == NULL_TREE))
>   vdef_val = vuse_ssa_val (gimple_vuse (stmt));
>   }
> changed |= set_ssa_val_to (vdef, vdef_val);
> --- gcc/testsuite/g++.dg/torture/pr104601.C.jj2022-02-23 
> 16:23:52.437366019 +0100
> +++ gcc/testsuite/g++.dg/torture/pr104601.C   2022-02-23 16:23:37.080578941 
> +0100
> @@ -0,0 +1,32 @@
> +// PR tree-optimization/104601
> +// { dg-do run }
> +// { dg-options "-std=c++17" }
> +
> +#include 
> +#include 
> +
> +inline std::optional
> +foo (std::vector::iterator b, std::vector::iterator c,
> + std::optional h (int))
> +{
> +  std::optional d;
> +  find_if (b, c, [&](auto e) { d = h(e); return d; });
> +  return d;
> +}
> +
> +std::optional
> +bar (int)
> +{
> +  return 1;
> +}
> +
> +int
> +main ()
> +{
> +  std::vector g(10);
> +  auto b = g.begin ();
> +  auto c = g.end ();
> +  auto e = foo (b, c, bar);
> +  if (!e)
> +__builtin_abort ();
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


[PATCH][libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

2022-02-24 Thread Tom de Vries via Gcc-patches

[ was: Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70 ]

On 2/24/22 09:29, Tom de Vries wrote:

I'll try to submit a patch with one or more test-cases.


Hi,

These test-cases exercise the omp declare variant construct using the 
available nvptx isas.


OK for trunk?

Thanks,
- Tom[libgomp, testsuite, nvptx] Add libgomp.c/declare-variant-3-sm*.c

Add openmp test-cases that test the omp declare variant construct:
...
  #pragma omp declare variant (f30) match (device={isa("sm_30")})
...
using the available nvptx isas.

On a Pascal board GT 1030 with sm_61, we have these unsupported:
...
UNSUPPORTED: libgomp.c/declare-variant-3-sm70.c
UNSUPPORTED: libgomp.c/declare-variant-3-sm75.c
UNSUPPORTED: libgomp.c/declare-variant-3-sm80.c
...
and on a Turing board T400 with sm_75, we have this only this one:
...
UNSUPPORTED: libgomp.c/declare-variant-3-sm80.c
...

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-02-24  Tom de Vries  

	* testsuite/lib/libgomp.exp
	(check_effective_target_offload_device_nvptx_sm_xx)
	(check_effective_target_offload_device_nvptx_sm_30)
	(check_effective_target_offload_device_nvptx_sm_35)
	(check_effective_target_offload_device_nvptx_sm_53)
	(check_effective_target_offload_device_nvptx_sm_70)
	(check_effective_target_offload_device_nvptx_sm_75)
	(check_effective_target_offload_device_nvptx_sm_80): New proc.
	* testsuite/libgomp.c/declare-variant-3-sm30.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm35.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm53.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm70.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm75.c: New test.
	* testsuite/libgomp.c/declare-variant-3-sm80.c: New test.
	* testsuite/libgomp.c/declare-variant-3.h: New header file.

---
 libgomp/testsuite/lib/libgomp.exp  | 46 +++
 .../testsuite/libgomp.c/declare-variant-3-sm30.c   |  5 ++
 .../testsuite/libgomp.c/declare-variant-3-sm35.c   |  5 ++
 .../testsuite/libgomp.c/declare-variant-3-sm53.c   |  5 ++
 .../testsuite/libgomp.c/declare-variant-3-sm70.c   |  5 ++
 .../testsuite/libgomp.c/declare-variant-3-sm75.c   |  5 ++
 .../testsuite/libgomp.c/declare-variant-3-sm80.c   |  5 ++
 libgomp/testsuite/libgomp.c/declare-variant-3.h| 66 ++
 8 files changed, 142 insertions(+)

diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 8c5ecfff0ac..d664863b15c 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -426,6 +426,52 @@ proc check_effective_target_offload_device_nvptx { } {
 } ]
 }
 
+# Return 1 if using nvptx offload device which supports -misa=sm_$SM.
+proc check_effective_target_offload_device_nvptx_sm_xx { sm } {
+if { ![check_effective_target_offload_device_nvptx] } {
+	return 0
+}
+return [check_runtime_nocache offload_device_nvptx_sm_$sm {
+  int main ()
+	{
+	  int x = 1;
+	  #pragma omp target map(tofrom: x)
+	x--;
+	  return x;
+	}
+} "-foffload=-misa=sm_$sm" ]
+}
+
+# See check_effective_target_offload_device_nvptx_sm_xx.
+proc check_effective_target_offload_device_nvptx_sm_30 { } {
+return [check_effective_target_offload_device_nvptx_sm_xx 30]
+}
+
+# See check_effective_target_offload_device_nvptx_sm_xx.
+proc check_effective_target_offload_device_nvptx_sm_35 { } {
+return [check_effective_target_offload_device_nvptx_sm_xx 35]
+}
+
+# See check_effective_target_offload_device_nvptx_sm_xx.
+proc check_effective_target_offload_device_nvptx_sm_53 { } {
+return [check_effective_target_offload_device_nvptx_sm_xx 53]
+}
+
+# See check_effective_target_offload_device_nvptx_sm_xx.
+proc check_effective_target_offload_device_nvptx_sm_70 { } {
+return [check_effective_target_offload_device_nvptx_sm_xx 70]
+}
+
+# See check_effective_target_offload_device_nvptx_sm_xx.
+proc check_effective_target_offload_device_nvptx_sm_75 { } {
+return [check_effective_target_offload_device_nvptx_sm_xx 75]
+}
+
+# See check_effective_target_offload_device_nvptx_sm_xx.
+proc check_effective_target_offload_device_nvptx_sm_80 { } {
+return [check_effective_target_offload_device_nvptx_sm_xx 80]
+}
+
 # Return 1 if at least one Nvidia GPU is accessible.
 
 proc check_effective_target_openacc_nvidia_accel_present { } {
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c b/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c
new file mode 100644
index 000..7c680b07a94
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3-sm30.c
@@ -0,0 +1,5 @@
+/* { dg-do run { target { offload_target_nvptx } } } */
+/* { dg-require-effective-target offload_device_nvptx_sm_30 } */
+/* { dg-additional-options "-foffload=-misa=sm_30" } */
+
+#include "declare-variant-3.h"
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c b/libgomp/testsuite/libgomp.c/declare-variant-3-sm35.c
new file mode 100644
index 000..b8b2a714248
--- /dev/null
+++ 

Re: [PATCH 5/5 V1] RISC-V:Implement architecture extension test macros for Crypto extension

2022-02-24 Thread Kito Cheng via Gcc-patches
I would suggest implementing that in riscv_subset_list::parse so that
it also affect the ELF attribute emission.

On Wed, Feb 23, 2022 at 5:44 PM  wrote:
>
> From: LiaoShihua 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):Add __riscv_zks, 
> __riscv_zk, __riscv_zkn
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/predef-17.c: New test.
>
> ---
>  gcc/config/riscv/riscv-c.cc|  9 
>  gcc/testsuite/gcc.target/riscv/predef-17.c | 59 ++
>  2 files changed, 68 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/predef-17.c
>
> diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> index 73c62f41274..d6c153e8d7c 100644
> --- a/gcc/config/riscv/riscv-c.cc
> +++ b/gcc/config/riscv/riscv-c.cc
> @@ -63,6 +63,15 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
>builtin_define ("__riscv_fdiv");
>builtin_define ("__riscv_fsqrt");
>  }
> +
> +  if (TARGET_ZBKB && TARGET_ZBKC && TARGET_ZBKX && TARGET_ZKNE && 
> TARGET_ZKND && TARGET_ZKNH)
> +{
> +  builtin_define ("__riscv_zk");
> +  builtin_define ("__riscv_zkn");
> +}
> +
> +  if (TARGET_ZBKB && TARGET_ZBKC && TARGET_ZBKX && TARGET_ZKSED && 
> TARGET_ZKSH)
> +  builtin_define ("__riscv_zks");
>
>switch (riscv_abi)
>  {
> diff --git a/gcc/testsuite/gcc.target/riscv/predef-17.c 
> b/gcc/testsuite/gcc.target/riscv/predef-17.c
> new file mode 100644
> index 000..4366dee1016
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/predef-17.c
> @@ -0,0 +1,59 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64i_zbkb_zbkc_zbkx_zknd_zkne_zknh_zksed_zksh 
> -mabi=lp64 -mcmodel=medlow -misa-spec=2.2" } */
> +
> +int main () {
> +
> +#ifndef __riscv_arch_test
> +#error "__riscv_arch_test"
> +#endif
> +
> +#if __riscv_xlen != 64
> +#error "__riscv_xlen"
> +#endif
> +
> +#if !defined(__riscv_i)
> +#error "__riscv_i"
> +#endif
> +
> +#if !defined(__riscv_zk)
> +#error "__riscv_zk"
> +#endif
> +
> +#if !defined(__riscv_zkn)
> +#error "__riscv_zkn"
> +#endif
> +
> +#if !defined(__riscv_zks)
> +#error "__riscv_zks"
> +#endif
> +
> +#if !defined(__riscv_zbkb)
> +#error "__riscv_zbkb"
> +#endif
> +
> +#if !defined(__riscv_zbkc)
> +#error "__riscv_zbkc"
> +#endif
> +
> +#if !defined(__riscv_zbkx)
> +#error "__riscv_zbkx"
> +#endif
> +
> +#if !defined(__riscv_zknd)
> +#error "__riscv_zknd"
> +#endif
> +
> +#if !defined(__riscv_zkne)
> +#error "__riscv_zkne"
> +#endif
> +
> +#if !defined(__riscv_zknh)
> +#error "__riscv_zknh"
> +#endif
> +
> +#if !defined(__riscv_zksh)
> +#error "__riscv_zksh"
> +#endif
> +
> +  return 0;
> +}
> \ No newline at end of file
> --
> 2.31.1.windows.1
>


[PATCH] sccvn: Fix visit_reference_op_call value numbering of vdefs [PR104601]

2022-02-24 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase is miscompiled, because -fipa-pure-const discovers
that bar is const, but when sccvn during fre3 sees
  # .MEM_140 = VDEF <.MEM_96>
  *__pred$__d_43 = _50 (_49);
where _50 value numbers to , it value numbers .MEM_140 to
vuse_ssa_val (gimple_vuse (stmt)).  For const/pure calls that return
a SSA_NAME (or don't have lhs) that is fine, those calls don't store
anything, but if the lhs is present and not an SSA_NAME, value numbering
the vdef to anything but itself means that e.g. walk_non_aliased_vuses
won't consider the call, but the call acts as a store to its lhs.
When it is ignored, sccvn will return whatever has been stored to the
lhs earlier.

I've bootstrapped/regtested an earlier version of this patch, which did the
if (!lhs && gimple_call_lhs (stmt))
  changed |= set_ssa_val_to (vdef, vdef);
part before else if (vnresult->result_vdef), and that regressed
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo (" 1
+FAIL: gcc.dg/pr51879-16.c scan-tree-dump-times pre "foo2 (" 1
so this updated patch uses result_vdef there as before and only otherwise
(which I think must be the const/pure case) decides based on whether the
lhs is non-SSA_NAME.

Ok for trunk if it passes another bootstrap/regtest?

2022-02-24  Jakub Jelinek  

PR tree-optimization/104601
* tree-ssa-sccvn.cc (visit_reference_op_call): For calls with
non-SSA_NAME lhs value number vdef to itself instead of e.g. the
vuse value number.

* g++.dg/torture/pr104601.C: New test.

--- gcc/tree-ssa-sccvn.cc.jj2022-02-11 00:19:22.432063254 +0100
+++ gcc/tree-ssa-sccvn.cc   2022-02-24 09:43:39.715051959 +0100
@@ -5218,12 +5218,20 @@ visit_reference_op_call (tree lhs, gcall
 
   if (vnresult)
 {
-  if (vnresult->result_vdef && vdef)
-   changed |= set_ssa_val_to (vdef, vnresult->result_vdef);
-  else if (vdef)
-   /* If the call was discovered to be pure or const reflect
-  that as far as possible.  */
-   changed |= set_ssa_val_to (vdef, vuse_ssa_val (gimple_vuse (stmt)));
+  if (vdef)
+   {
+ if (vnresult->result_vdef)
+   changed |= set_ssa_val_to (vdef, vnresult->result_vdef);
+ else if (!lhs && gimple_call_lhs (stmt))
+   /* If stmt has non-SSA_NAME lhs, value number the vdef to itself,
+  as the call still acts as a lhs store.  */
+   changed |= set_ssa_val_to (vdef, vdef);
+ else
+   /* If the call was discovered to be pure or const reflect
+  that as far as possible.  */
+   changed |= set_ssa_val_to (vdef,
+  vuse_ssa_val (gimple_vuse (stmt)));
+   }
 
   if (!vnresult->result && lhs)
vnresult->result = lhs;
@@ -5248,7 +5256,11 @@ visit_reference_op_call (tree lhs, gcall
  if (TREE_CODE (fn) == ADDR_EXPR
  && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL
  && (flags_from_decl_or_type (TREE_OPERAND (fn, 0))
- & (ECF_CONST | ECF_PURE)))
+ & (ECF_CONST | ECF_PURE))
+ /* If stmt has non-SSA_NAME lhs, value number the
+vdef to itself, as the call still acts as a lhs
+store.  */
+ && (lhs || gimple_call_lhs (stmt) == NULL_TREE))
vdef_val = vuse_ssa_val (gimple_vuse (stmt));
}
  changed |= set_ssa_val_to (vdef, vdef_val);
--- gcc/testsuite/g++.dg/torture/pr104601.C.jj  2022-02-23 16:23:52.437366019 
+0100
+++ gcc/testsuite/g++.dg/torture/pr104601.C 2022-02-23 16:23:37.080578941 
+0100
@@ -0,0 +1,32 @@
+// PR tree-optimization/104601
+// { dg-do run }
+// { dg-options "-std=c++17" }
+
+#include 
+#include 
+
+inline std::optional
+foo (std::vector::iterator b, std::vector::iterator c,
+ std::optional h (int))
+{
+  std::optional d;
+  find_if (b, c, [&](auto e) { d = h(e); return d; });
+  return d;
+}
+
+std::optional
+bar (int)
+{
+  return 1;
+}
+
+int
+main ()
+{
+  std::vector g(10);
+  auto b = g.begin ();
+  auto c = g.end ();
+  auto e = foo (b, c, bar);
+  if (!e)
+__builtin_abort ();
+}

Jakub



Re: [PATCH] Check if loading const from mem is faster

2022-02-24 Thread Richard Biener via Gcc-patches
On Thu, 24 Feb 2022, Jiufu Guo wrote:

> Jiufu Guo via Gcc-patches  writes:
> 
> > Segher Boessenkool  writes:
> >
> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote:
> >>> I'm assuming we're always dealing with
> >>> 
> >>>   (set (reg:MODE ..) )
> >>> 
> >>> here and CSE is not substituting into random places of an
> >>> instruction(?).  I don't know what 'rtx_cost' should evaluate
> >>> to for a constant, if it should implicitely evaluate the cost
> >>> of putting the result into a register for example.
> >>
> >> rtx_cost is no good here (and in most places).  rtx_cost should be 0
> >> for anything that is used as input in a machine instruction -- but you
> >> need much more context to determine that.  insn_cost is much simpler and
> >> much easier to use.
> >>
> >>> Using RTX_COST with SET and 1 at least looks no worse than using
> >>> your proposed new target hook and comparing it with the original
> >>> unfolded src (again with SET and 1).
> >>
> >> It is required to generate valid instructions no matter what, before
> >> the pass has finished that is.  On all more modern architectures it is
> >> futile to think you can usefully consider the cost of an RTL expression
> >> and derive a real-world cost of the generated code from that.
> >
> > Thanks Segher for pointing out these!  Here is  another reason that I
> > did not use rtx_cost: in a few passes, there are codes to check the
> > constants and store them in constant pool.  I'm thinking to integerate
> > those codes in a consistent way.
> 
> Hi Segher, Richard!
> 
> I'm thinking the way like: For a constant,
> 1. if the constant could be used as an immediate for the
> instruction, then retreated as an operand;
> 2. otherwise if the constant can not be stored into a
> constant pool, then handle through instructions;
> 3. if it is faster to access constant from pool, then emit
> constant as data(.rodata);
> 4. otherwise, handle the constant by instructions.
> 
> And to store the constant into a pool, besides force_const_mem,
> create reference (toc) may be needed on some platforms.
> 
> For this particular issue in CSE, there is already code that
> tries to put constant into a pool (invoke force_const_mem).
> While the code is too late.  So, we may check the constant
> earlier and store it into constant pool if profitable.
> 
> And another thing as Segher pointed out, CSE is doing too
> much work.  It may be ok to separate the constant handling
> logic from CSE.

Not sure - CSE just is value numbering, I don't see that it does
more than that.  Yes, it might have developed "heuristics" over
the years what to CSE and to what and where to substitute and
where not.  But in the end it does just value numbering.

> 
> I update a new version patch as follow (did not seprate CSE):

How is the new target hook better in any way compared to rtx_cost
or insn_cost?  It looks like a total hack.

I suppose the actual way of materializing a constant is done
behind GCCs back and not exposed anywhere?  But instead you
claim the constants are valid when they actually are not?
Isn't the problem then that the rs6000 backend lies?

Btw, all of this is of course not appropriate for stage4 and changes
to CSE need testing on more than one target.

Richard.

> Thanks for the comments and suggestions again!
> 
> 
> BR,
> Jiufu
> 
> ---
>  gcc/config/rs6000/rs6000.cc   | 39 ++-
>  gcc/cse.cc| 36 -
>  gcc/doc/tm.texi   |  5 +++
>  gcc/doc/tm.texi.in|  2 +
>  gcc/target.def|  8 
>  gcc/targhooks.cc  |  6 +++
>  gcc/targhooks.h   |  1 +
>  .../gcc.target/powerpc/medium_offset.c|  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr63281.c| 14 +++
>  gcc/testsuite/gcc.target/powerpc/pr93012.c|  2 +-
>  10 files changed, 84 insertions(+), 31 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr63281.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index d7a7cfe860f..0a8f487d516 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1361,6 +1361,9 @@ static const struct attribute_spec 
> rs6000_attribute_table[] =
>  #undef TARGET_CANNOT_FORCE_CONST_MEM
>  #define TARGET_CANNOT_FORCE_CONST_MEM rs6000_cannot_force_const_mem
>  
> +#undef TARGET_FASTER_LOADING_CONSTANT
> +#define TARGET_FASTER_LOADING_CONSTANT rs6000_faster_loading_const
> +
>  #undef TARGET_DELEGITIMIZE_ADDRESS
>  #define TARGET_DELEGITIMIZE_ADDRESS rs6000_delegitimize_address
>  
> @@ -9684,8 +9687,8 @@ rs6000_init_stack_protect_guard (void)
>  static bool
>  rs6000_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
>  {
> -  if (GET_CODE (x) == HIGH
> -  && GET_CODE (XEXP (x, 0)) == UNSPEC)
> +  /* Exclude CONSTANT HIGH part.  */
> +  if (GET_CODE (x) == HIGH)
>  

Re: [PATCH][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-24 Thread Richard Biener via Gcc-patches
On Wed, 23 Feb 2022, Qing Zhao wrote:

> 
> 
> > On Feb 23, 2022, at 11:49 AM, Jakub Jelinek  wrote:
> > 
> > On Wed, Feb 23, 2022 at 05:33:57PM +, Qing Zhao wrote:
> >> From my understanding, __builtin_clear_padding (), does not _use_ 
> >> any variable,
> >> therefore, no uninitialized usage warning should be emitted for it. 
> > 
> > __builtin_clear_padding ()
> > sometimes expands to roughly:
> > *(int *)((char *) + 32) = 0;
> > etc., in that case it shouldn't be suppressed in any way, it doesn't read
> > anything, only stores.
> > Or at other times it is:
> > *(int *)((char *) + 32) &= 0xfec7dab1;
> > etc., in that case it reads bytes from the object which can be
> > uninitialized, we mask some bits off and store.
> 
> Okay, I see. 
> So, only the MEM_REF that will be used to read first should be suppressed 
> warning. Then there is only one (out of 4) MEM_REF
> should be suppressed warning, that’s the following one (line 4371 and then 
> line 4382):
> 
> 4371   tree dst = build2_loc (buf->loc, MEM_REF, atype, buf->base,
> 4372  build_int_cst (buf->alias_type, 
> off));
> 4373   tree src;
> 4374   gimple *g;
> 4375   if (all_ones
> 4376   && nonzero_first == start
> 4377   && nonzero_last == start + eltsz)
> 4378 src = build_zero_cst (type);
> 4379   else
> 4380 {
> 4381   src = make_ssa_name (type);
> 4382   g = gimple_build_assign (src, unshare_expr (dst));
> 4383   gimple_set_location (g, buf->loc);
> 4384   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
> 4385   tree mask = native_interpret_expr (type,
> 4386  buf->buf + i + 
> start,
> 4387  eltsz);
> 4388   gcc_assert (mask && TREE_CODE (mask) == INTEGER_CST);
> 4389   mask = fold_build1 (BIT_NOT_EXPR, type, mask);
> 4390   tree src_masked = make_ssa_name (type);
> 4391   g = gimple_build_assign (src_masked, BIT_AND_EXPR,
> 4392src, mask);
> 4393   gimple_set_location (g, buf->loc);
> 4394   gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
> 4395   src = src_masked;
> 4396 }
> 4397   g = gimple_build_assign (dst, src);
> 
> 
> All the other 3 MEM_REFs are not read. So, we can just exclude them from 
> suppressing warning, right?
> Another question, for the above MEM_REF, should I suppress warning for line 
> 4371 “dst”? Or shall I 
> Suppress warning for line 4382 (for the “unshared_expr(dst)”)?
> 
> I think that we should suppress warning for the latter, i.e 
> “unshared_expr(dst)” at line 4382, right?

Yes, the one that's put into the GIMPLE stmt.

> > 
> > It is similar to what object.bitfld = 3; expands to,
> > but usually only after the uninit pass.  Though, we have the
> > optimize_bit_field_compare optimization, that is done very early
> > and I wonder what uninit does about that.  Perhaps it ignores
> > BIT_FIELD_REFs, I'd need to check that.
> 
> Yes, I see that uninitialized warning specially handles BIT_INSERT_EXPR as: 
> (tree-ssa-uninit.cc)
> 
>  573   /* Do not warn if the result of the access is then used for
>  574  a BIT_INSERT_EXPR. */
>  575   if (lhs && TREE_CODE (lhs) == SSA_NAME)
>  576 FOR_EACH_IMM_USE_FAST (luse_p, liter, lhs)
>  577   {
>  578 gimple *use_stmt = USE_STMT (luse_p);
>  579 /* BIT_INSERT_EXPR first operand should not be considered
>  580a use for the purpose of uninit warnings.  */

That follows the COMPLEX_EXPR handling I think.

> > 
> > Anyway, if we want to disable uninit warnings for __builtin_clear_padding,
> > we should do that with suppress_warning on the read stmts that load
> > a byte (or more adjacent ones) before they are masked off and stored again,
> > so that we don't warn about that.
> 
> IN addition to this read stmts, shall we suppress warnings for the following:
> 
> /* Emit a runtime loop:
>for (; buf.base != end; buf.base += sz)
>  __builtin_clear_padding (buf.base);  */
> 
> static void
> clear_padding_emit_loop (clear_padding_struct *buf, tree type,
>  tree end, bool for_auto_init)
> {
> 
> i.e, should we suppress warnings for the above “buf.base != end”, “buf.base 
> += sz”?
> 
> No need to suppress warning for them since they just read the address of the 
> object, not the object itself?

No need to supporess those indeed.

Richard.


Re: [Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-24 Thread Tom de Vries via Gcc-patches

On 2/22/22 17:03, Tobias Burnus wrote:

Hi Tom,

On 22.02.22 15:43, Tom de Vries wrote:

On 2/17/22 18:24, Tobias Burnus wrote:

--- a/gcc/config/nvptx/t-omp-device
+++ b/gcc/config/nvptx/t-omp-device
@@ -1,4 +1,4 @@
 echo kind: gpu > $@
 echo arch: nvptx >> $@
-    echo isa: sm_30 sm_35 >> $@
+    echo isa: sm_30 sm_35 sm_53 sm_70 sm_75 sm_80 >> $@


I'm not sure I understand how this is used.  Is this user-visible?  Is
there a libgomp test-case where we can observe a difference?


That's used for OpenMP context selectors like; that way, one can generate,
e.g. one code used with nvptx and one with gcn as with:

#pragma omp declare variant (on_nvptx) 
match(construct={target},device={arch(nvptx)})
#pragma omp declare variant (on_gcn) 
match(construct={target},device={arch(gcn)})

...
   #pragma omp target map(from:v)
   v = on ();
which then either calls 'on' or 'on_nvptx' or 'on_gcn'
(from libgomp/testsuite/libgomp.c/target-42.c)


The following testcases use 'arch(nvptx)':

libgomp/testsuite/libgomp.c-c++-common/on_device_arch.h
libgomp/testsuite/libgomp.c/target-42.c
libgomp/testsuite/libgomp.c/usleep.h
libgomp/testsuite/libgomp.fortran/declare-variant-1.f90

For ISA, there is only one run-time test:

libgomp/testsuite/libgomp.c/declare-variant-1.c

but only for x86-64: match (device={isa("avx512f")})

The sm_35 also appears, but only in the compile-time tests:
gcc/testsuite/{c-c++-common,gfortran.dg}/gomp/declare-variant-{9,10}.*



Thanks for the explanation.

I've updated the patch to include changes to 
nvptx_omp_device_kind_arch_isa, and committed.


I'll try to submit a patch with one or more test-cases.

Thanks,
- Tom

[nvptx] Add missing t-omp-device isas

In t-omp-device we list isas that can be used in omp declare variant like so:
...
  #pragma omp declare variant (f30) match (device={isa("sm_30")})
...
and in nvptx_omp_device_kind_arch_isa we handle them.

Update both to reflect the current list of isas.

Tested on x86_64-linux with nvptx accelerator.

gcc/ChangeLog:

2022-02-23  Tom de Vries  

	* config/nvptx/nvptx.cc (nvptx_omp_device_kind_arch_isa): Handle
	sm_70, sm_75 and sm_80.
	* config/nvptx/t-omp-device: Add sm_53, sm_70, sm_75 and sm_80.

Co-Authored-By: Tobias Burnus 

---
 gcc/config/nvptx/nvptx.cc | 8 +++-
 gcc/config/nvptx/t-omp-device | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 6f6d592e462..b9451c2ed09 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -6181,7 +6181,13 @@ nvptx_omp_device_kind_arch_isa (enum omp_device_kind_arch_isa trait,
   if (strcmp (name, "sm_35") == 0)
 	return TARGET_SM35 && !TARGET_SM53;
   if (strcmp (name, "sm_53") == 0)
-	return TARGET_SM53;
+	return TARGET_SM53 && !TARGET_SM70;
+  if (strcmp (name, "sm_70") == 0)
+	return TARGET_SM70 && !TARGET_SM75;
+  if (strcmp (name, "sm_75") == 0)
+	return TARGET_SM75 && !TARGET_SM80;
+  if (strcmp (name, "sm_80") == 0)
+	return TARGET_SM80;
   return 0;
 default:
   gcc_unreachable ();
diff --git a/gcc/config/nvptx/t-omp-device b/gcc/config/nvptx/t-omp-device
index 8765d9f1881..4228218a424 100644
--- a/gcc/config/nvptx/t-omp-device
+++ b/gcc/config/nvptx/t-omp-device
@@ -1,4 +1,4 @@
 omp-device-properties-nvptx: $(srcdir)/config/nvptx/nvptx.cc
 	echo kind: gpu > $@
 	echo arch: nvptx >> $@
-	echo isa: sm_30 sm_35 >> $@
+	echo isa: sm_30 sm_35 sm_53 sm_70 sm_75 sm_80 >> $@


[committed][nvptx] Add shf.{l,r}.wrap insn

2022-02-24 Thread Tom de Vries via Gcc-patches

On 2/23/22 12:40, Tom de Vries wrote:

Hi,

Ptx contains funnel shift operations shf.l.wrap and shf.r.wrap that can be
used to implement 32-bit left or right rotate.

Add define_insns rotlsi3 and rotrsi3.

Currently testing.



And committed.

Thanks,
- Tom


[nvptx] Add shf.{l,r}.wrap insn

gcc/ChangeLog:

2022-02-23  Tom de Vries  

* config/nvptx/nvptx.md (define_insn "rotlsi3", define_insn
"rotrsi3"): New define_insn.

gcc/testsuite/ChangeLog:

2022-02-23  Tom de Vries  

* gcc.target/nvptx/rotate-run.c: New test.
* gcc.target/nvptx/rotate.c: New test.

---
  gcc/config/nvptx/nvptx.md   | 16 
  gcc/testsuite/gcc.target/nvptx/rotate-run.c | 23 +++
  gcc/testsuite/gcc.target/nvptx/rotate.c | 20 
  3 files changed, 59 insertions(+)

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 216e89f230ac..4989b5642e29 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -808,6 +808,22 @@
""
"%.\\tshr.u%T0\\t%0, %1, %2;")
  
+(define_insn "rotlsi3"

+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+   (rotate:SI (match_operand:SI 1 "nvptx_register_operand" "R")
+  (and:SI (match_operand:SI 2 "nvptx_nonmemory_operand" "Ri")
+  (const_int 31]
+  "TARGET_SM35"
+  "%.\\tshf.l.wrap.b32\\t%0, %1, %1, %2;")
+
+(define_insn "rotrsi3"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "=R")
+   (rotatert:SI (match_operand:SI 1 "nvptx_register_operand" "R")
+(and:SI (match_operand:SI 2 "nvptx_nonmemory_operand" "Ri")
+(const_int 31]
+  "TARGET_SM35"
+  "%.\\tshf.r.wrap.b32\\t%0, %1, %1, %2;")
+
  ;; Logical operations
  
  (define_code_iterator any_logic [and ior xor])

diff --git a/gcc/testsuite/gcc.target/nvptx/rotate-run.c 
b/gcc/testsuite/gcc.target/nvptx/rotate-run.c
new file mode 100644
index ..14cb6f8b0b3f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/rotate-run.c
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "rotate.c"
+
+#define ASSERT(EXPR)   \
+  do   \
+{  \
+  if (!(EXPR)) \
+   __builtin_abort (); \
+} while (0)
+
+int
+main (void)
+{
+  ASSERT (rotl (0x12345678, 8) == 0x34567812);
+  ASSERT (rotl (0x12345678, 8 + 32) == 0x34567812);
+
+  ASSERT (rotr (0x12345678, 8) == 0x78123456);
+  ASSERT (rotr (0x12345678, 8 + 32) == 0x78123456);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/nvptx/rotate.c 
b/gcc/testsuite/gcc.target/nvptx/rotate.c
new file mode 100644
index ..1c9b83b4809d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/rotate.c
@@ -0,0 +1,20 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -save-temps" } */
+
+#define MASK 0x1f
+
+unsigned int
+rotl (unsigned int val, unsigned int cnt) {
+  cnt &= MASK;
+  return (val << cnt) | (val >> (-cnt & MASK));
+}
+
+unsigned int
+rotr (unsigned int val, unsigned int cnt) {
+  cnt &= MASK;
+  return (val >> cnt) | (val << (-cnt & MASK));
+}
+
+/* { dg-final { scan-assembler-times "shf.l.wrap.b32" 1 } } */
+/* { dg-final { scan-assembler-times "shf.r.wrap.b32" 1 } } */
+/* { dg-final { scan-assembler-not "and.b32" } } */


[committed][nvptx] Fix dummy location in gen_comment

2022-02-24 Thread Tom de Vries via Gcc-patches

On 2/23/22 12:58, Thomas Schwinge wrote:

Hi!

On 2022-02-23T12:14:57+0100, Tom de Vries via Gcc-patches 
 wrote:

[ Re: [committed][nvptx] Add -mptx-comment ]

On 2/22/22 14:53, Tom de Vries wrote:

Add functionality that indicates which insns are added by -minit-regs, such
that for instance we have for pr53465.s:
...
  // #APP
// 9 "gcc/testsuite/gcc.c-torture/execute/pr53465.c" 1
  // Start: Added by -minit-regs=3:
  // #NO_APP
  mov.u32 %r26, 0;
  // #APP
// 9 "gcc/testsuite/gcc.c-torture/execute/pr53465.c" 1
  // End: Added by -minit-regs=3:
  // #NO_APP
...

Can be switched off using -mno-ptx-comment.

Tested on nvptx.


But tested in combination with another patch, which is still waiting for
review.

This patch by itself caused some regressions


I'd just begun analyzing and determined that it was
commit c2b23aaaf4457278403c01cd145cd3936683384e
"[nvptx] Add -mptx-comment" that causes a load of FAILs in nvptx
offloading testing:

 Program received signal SIGSEGV, Segmentation fault.
 0x0084abad in final_scan_insn_1 (insn=insn@entry=0x77380940, 
file=file@entry=0x1f50c40, optimize_p=optimize_p@entry=0, 
nopeepholes=nopeepholes@entry=0, seen=seen@entry=0x7fffd07c) at 
[...]/source-gcc/gcc/final.cc:2650
 2650if (*loc.file && loc.line)
 (gdb) print loc
 $1 = {file = 0x0, line = 0, column = 0, data = 0x0, sysp = false}
 (gdb) bt
 #0  0x0084abad in final_scan_insn_1 
(insn=insn@entry=0x77380940, file=file@entry=0x1f50c40, 
optimize_p=optimize_p@entry=0, nopeepholes=nopeepholes@entry=0, 
seen=seen@entry=0x7fffd07c) at [...]/source-gcc/gcc/final.cc:2650
 #1  0x0084b86a in final_scan_insn (insn=insn@entry=0x77380940, 
file=file@entry=0x1f50c40, optimize_p=optimize_p@entry=0, 
nopeepholes=nopeepholes@entry=0, seen=seen@entry=0x7fffd07c) at 
[...]/source-gcc/gcc/final.cc:2942
 #2  0x0084823a in final_1 (first=0x774631c0, file=0x1f50c40, 
seen=1, optimize_p=0) at [...]/source-gcc/gcc/final.cc:1999
 #3  0x0085091a in rest_of_handle_final () at 
[...]/source-gcc/gcc/final.cc:4287
 #4  0x00850de4 in (anonymous namespace)::pass_final::execute 
(this=0x1f4bd00) at [...]/source-gcc/gcc/final.cc:4365
 #5  0x00b781b1 in execute_one_pass (pass=pass@entry=0x1f4bd00) at 
[...]/source-gcc/gcc/passes.cc:2639
 #6  0x00b7855a in execute_pass_list_1 (pass=0x1f4bd00) at 
[...]/source-gcc/gcc/passes.cc:2739
 #7  0x00b7858d in execute_pass_list_1 (pass=0x1f4b820) at 
[...]/source-gcc/gcc/passes.cc:2740
 #8  0x00b7858d in execute_pass_list_1 (pass=0x1f49d20, 
pass@entry=0x1f45780) at [...]/source-gcc/gcc/passes.cc:2740
 #9  0x00b785e9 in execute_pass_list (fn=0x772e1e40, 
pass=0x1f45780) at [...]/source-gcc/gcc/passes.cc:2750
 #10 0x00732a66 in cgraph_node::expand (this=0x772efbb0) at 
[...]/source-gcc/gcc/cgraphunit.cc:1836
 #11 0x0073336a in cgraph_order_sort::process (this=0x20730f8) at 
[...]/source-gcc/gcc/cgraphunit.cc:2075
 #12 0x007336f4 in output_in_order () at 
[...]/source-gcc/gcc/cgraphunit.cc:2143
 #13 0x00733dbe in symbol_table::compile (this=0x77542000) at 
[...]/source-gcc/gcc/cgraphunit.cc:2347
 #14 0x0065d79b in lto_main () at 
[...]/source-gcc/gcc/lto/lto.cc:655
 #15 0x00c709e6 in compile_file () at 
[...]/source-gcc/gcc/toplev.cc:454
 #16 0x00c73abb in do_compile (no_backend=no_backend@entry=false) 
at [...]/source-gcc/gcc/toplev.cc:2160
 #17 0x00c73ea6 in toplev::main (this=this@entry=0x7fffd4b0, 
argc=argc@entry=16, argv=0x1f1db40, argv@entry=0x7fffd5b8) at 
[...]/source-gcc/gcc/toplev.cc:2312
 #18 0x0174fe5f in main (argc=16, argv=0x7fffd5b8) at 
[...]/source-gcc/gcc/main.cc:41


currently testing attached
fix.


Per the test results that I've got so far (but is still running), your
proposed fix does resolve the SIGSEGVs, thanks.


Thanks for testing this, and sorry for the fall-out.

Now committed.

Thanks,
- Tom