[committed][testsuite] Require non_strict_prototype in a few tests

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

Require effective target non_strict_prototype in a few test-cases.

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[testsuite] Require non_strict_prototype in a few tests

gcc/testsuite/ChangeLog:

2022-02-10  Tom de Vries  

* gcc.c-torture/compile/pr100576.c: Require effective target
non_strict_prototype.
* gcc.c-torture/compile/pr97576.c: Same.

---
 gcc/testsuite/gcc.c-torture/compile/pr100576.c | 2 ++
 gcc/testsuite/gcc.c-torture/compile/pr97576.c  | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr100576.c b/gcc/testsuite/gcc.c-torture/compile/pr100576.c
index f2f40ec4512..f16a8224c6e 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr100576.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr100576.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target non_strict_prototype } */
+
 /* PR middle-end/100576 */
 
 const char v[] = {0x12};
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr97576.c b/gcc/testsuite/gcc.c-torture/compile/pr97576.c
index 28294c8597a..d2816132cc0 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr97576.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr97576.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target non_strict_prototype } */
+
 void
 pc (void);
 


[committed][testsuite] Require alloca support in a few tests

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

Require effective target alloca in a few test-cases.

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[testsuite] Require alloca support in a few tests

gcc/testsuite/ChangeLog:

2022-02-10  Tom de Vries  

* c-c++-common/Walloca-larger-than.c: Require effective target alloca.
* c-c++-common/Warray-bounds-9.c: Same.
* c-c++-common/Wdangling-pointer-2.c: Same.
* c-c++-common/Wdangling-pointer-4.c: Same.
* c-c++-common/Wdangling-pointer-5.c: Same.
* c-c++-common/Wdangling-pointer.c: Same.
* c-c++-common/auto-init-11.c: Same.
* c-c++-common/auto-init-12.c: Same.
* c-c++-common/auto-init-15.c: Same.
* c-c++-common/auto-init-16.c: Same.
* c-c++-common/torture/builtin-clear-padding-4.c: Same.
* gcc.c-torture/compile/pr99787-1.c: Same.
* gcc.dg/Walloca-larger-than-4.c: Same.
* gcc.dg/Wdangling-pointer.c: Same.
* gcc.dg/Wfree-nonheap-object-2.c: Same.
* gcc.dg/Wfree-nonheap-object.c: Same.
* gcc.dg/Wstringop-overflow-56.c: Same.
* gcc.dg/Wstringop-overflow-57.c: Same.
* gcc.dg/Wstringop-overflow-67.c: Same.
* gcc.dg/Wstringop-overflow-71.c: Same.
* gcc.dg/Wvla-larger-than-5.c: Same.
* gcc.dg/analyzer/taint-alloc-1.c: Same.
* gcc.dg/analyzer/torture/ubsan-1.c: Same.
* gcc.dg/graphite/pr99085.c: Same.
* gcc.dg/pr100225.c: Same.
* gcc.dg/pr98721-1.c: Same.
* gcc.dg/pr99122-2.c: Same.
* gcc.dg/sso-14.c: Same.
* gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: Same.
* gcc.dg/uninit-38.c: Same.
* gcc.dg/uninit-39.c: Same.
* gcc.dg/uninit-41.c: Same.
* gcc.dg/uninit-pr100250.c: Same.
* gcc.dg/uninit-pr101300.c: Same.
* gcc.dg/uninit-pr101494.c: Same.
* gcc.dg/uninit-pr98578.c: Same.
* gcc.dg/uninit-pr98583.c: Same.
* gcc.dg/vla-stexp-1.c: Same.
* gcc.dg/vla-stexp-2.c: Same.
* gcc.dg/vla-stexp-4.c: Same.
* gcc.dg/vla-stexp-5.c: Same.

---
 gcc/testsuite/c-c++-common/Walloca-larger-than.c | 3 ++-
 gcc/testsuite/c-c++-common/Warray-bounds-9.c | 3 ++-
 gcc/testsuite/c-c++-common/Wdangling-pointer-2.c | 3 ++-
 gcc/testsuite/c-c++-common/Wdangling-pointer-4.c | 3 ++-
 gcc/testsuite/c-c++-common/Wdangling-pointer-5.c | 3 ++-
 gcc/testsuite/c-c++-common/Wdangling-pointer.c   | 3 ++-
 gcc/testsuite/c-c++-common/auto-init-11.c| 1 +
 gcc/testsuite/c-c++-common/auto-init-12.c| 1 +
 gcc/testsuite/c-c++-common/auto-init-15.c| 1 +
 gcc/testsuite/c-c++-common/auto-init-16.c| 1 +
 gcc/testsuite/c-c++-common/torture/builtin-clear-padding-4.c | 2 ++
 gcc/testsuite/gcc.c-torture/compile/pr99787-1.c  | 1 +
 gcc/testsuite/gcc.dg/Walloca-larger-than-4.c | 3 ++-
 gcc/testsuite/gcc.dg/Wdangling-pointer.c | 3 ++-
 gcc/testsuite/gcc.dg/Wfree-nonheap-object-2.c| 3 ++-
 gcc/testsuite/gcc.dg/Wfree-nonheap-object.c  | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-56.c | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-57.c | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-67.c | 3 ++-
 gcc/testsuite/gcc.dg/Wstringop-overflow-71.c | 3 ++-
 gcc/testsuite/gcc.dg/Wvla-larger-than-5.c| 3 ++-
 gcc/testsuite/gcc.dg/analyzer/taint-alloc-1.c| 1 +
 gcc/testsuite/gcc.dg/analyzer/torture/ubsan-1.c  | 1 +
 gcc/testsuite/gcc.dg/graphite/pr99085.c  | 1 +
 gcc/testsuite/gcc.dg/pr100225.c  | 1 +
 gcc/testsuite/gcc.dg/pr98721-1.c | 1 +
 gcc/testsuite/gcc.dg/pr99122-2.c | 1 +
 gcc/testsuite/gcc.dg/sso-14.c| 1 +
 gcc/testsuite/gcc.dg/tree-ssa/builtin-sprintf-warn-25.c  | 3 ++-
 gcc/testsuite/gcc.dg/uninit-38.c | 3 ++-
 gcc/testsuite/gcc.dg/uninit-39.c | 3 ++-
 gcc/testsuite/gcc.dg/uninit-41.c | 3 ++-
 gcc/testsuite/gcc.dg/uninit-pr100250.c   | 3 ++-
 gcc/testsuite/gcc.dg/uninit-pr101300.c   | 3 ++-
 gcc/testsuite/gcc.dg/uninit-pr101494.c   | 3 ++-
 gcc/testsuite/gcc.dg/uninit-pr98578.c| 3 ++-
 gcc/testsuite/gcc.dg/uninit-pr98583.c| 3 ++-
 gcc/testsuite/gcc.dg/vla-stexp-1.c   | 1 +
 gcc/testsuite/gcc.dg/vla-stexp-2.c   | 1 +
 gcc/testsuite/gcc.dg/vla-stexp-4.c   | 1 +
 gcc/testsuite/gcc.dg/vla-stexp-5.c   | 1 +
 41 files changed, 66 insertions(+), 24 deletions(-)

diff --git 

[committed][nvptx] Handle asm insn in prevent_branch_around_nothing

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

With GOMP_NVPTX_JIT=-O0 and -mptx=3.1, I run into:
...
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_prof-version-1.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2 \
  execution test
...

The problem is that we're generating a diverging branch around nothing:
...
{
.reg.u32%x;
mov.u32 %x, %tid.x;
setp.ne.u32 %r23, %x, 0;
}
@%r23   bra $L2;
$L2:
...
which the driver JIT has problems with at -O0, so consequently we run into the
nvptx_uniform_warp_check.

Fix this by handling asm ("") and the like in prevent_branch_around_nothing.
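
The scan described above can be sketched with a toy model. This is only an illustration under stated assumptions: the Kind enum, the Insn struct and find_branches_around_nothing are hypothetical names, not the RTL types or the code in nvptx.cc. It shows why an empty asm between a conditional jump and its target label has to be skipped for the pattern to be detected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction kinds; illustrative only, not GCC's RTL codes.
enum class Kind { CondJump, Label, Nop, EmptyAsm, Other };

struct Insn {
  Kind kind;
  int label;  // jump target for CondJump, own id for Label, unused otherwise
};

// Return the indices of labels that are the target of a preceding
// conditional jump with only code-less instructions in between.  A real
// pass would emit a dummy instruction before each such label.
std::vector<std::size_t>
find_branches_around_nothing(const std::vector<Insn> &insns,
                             bool asm_is_nothing)
{
  std::vector<std::size_t> hits;
  int seen_label = -1;  // -1 means no conditional jump is pending
  for (std::size_t i = 0; i < insns.size(); ++i) {
    const Insn &insn = insns[i];
    if (insn.kind == Kind::CondJump) {
      assert(insn.label >= 0);  // -1 is reserved as the sentinel
      seen_label = insn.label;
      continue;
    }
    if (seen_label < 0)
      continue;
    switch (insn.kind) {
      case Kind::Nop:
        continue;  // emits no code, keep looking for the label
      case Kind::EmptyAsm:
        if (asm_is_nothing)
          continue;  // models the behaviour the patch adds
        seen_label = -1;
        continue;
      case Kind::Label:
        if (insn.label == seen_label)
          hits.push_back(i);
        seen_label = -1;
        continue;
      default:
        seen_label = -1;  // real code sits between jump and label
        continue;
    }
  }
  return hits;
}
```

With asm_is_nothing set to false, the sequence "conditional jump to L2, empty asm, L2" goes undetected, which corresponds to the failure reported here; with it set to true, the label is reported.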

Tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[nvptx] Handle asm insn in prevent_branch_around_nothing

gcc/ChangeLog:

2022-02-10  Tom de Vries  

PR target/104456
* config/nvptx/nvptx.cc (prevent_branch_around_nothing): Handle asm
insn.

---
 gcc/config/nvptx/nvptx.cc | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 5b26c0f4c7d..afbad5bdde6 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -5257,6 +5257,14 @@ prevent_branch_around_nothing (void)
case CODE_FOR_nvptx_join:
case CODE_FOR_nop:
  continue;
+   case -1:
+ /* Handle asm ("") and similar.  */
+ if (GET_CODE (PATTERN (insn)) == ASM_INPUT
+ || GET_CODE (PATTERN (insn)) == ASM_OPERANDS
+ || (GET_CODE (PATTERN (insn)) == PARALLEL
+ && asm_noperands (PATTERN (insn)) >= 0))
+   continue;
+ /* FALLTHROUGH.  */
default:
  seen_label = NULL;
  continue;


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
Hi!

On Thu, Feb 10, 2022 at 04:28:02PM -0600, Bill Schmidt wrote:
> On 2/10/22 4:11 PM, Segher Boessenkool wrote:
> >> No, trunk has this, for example:
> >>
> >>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
> >>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
> > I see this on trunk:
> >
> >   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
> > VCLZLSBB_V16QI vclzlsbb_v16qi {}
> >
> > Oh, you changed it?  Please fix it, then.
> 
> In a patch you approved, yes.

Yes, I missed it.  That is not an argument that it would be good or
should not be changed.

> I don't really understand why you want
> it changed now.

Because it is wrong.

> You must not be looking at the most recent trunk revision.

Indeed I haven't been able to update master for a week or so, it does
not bootstrap, as we have talked about.

> >> Throughout the new builtin infrastructure, the defaults are set for
> >> little-endian, and the "endian" flag changes behavior for big-endian.
> > That is a big mistake.  There are many machine instructions that are
> > *always* big-endian (most even!), and none that are always
> > little-endian.  So this should be fixed, sooner rather than later :-(
> 
> That does not seem like a good idea in stage 4 to me.  That requires
> yet another patch to reverse a bunch of other things unnecessarily.

Things that were added in stage 4, a few days ago even.  Things that are
broken and wrong.  Things I do not want to have to release with and deal
with all the pain of having broken released versions.

> This is a purely arbitrary choice.

No, it is not.  It flies in the face of consistency.

> The endian flag is only used when
> a built-in function must have one behavior for big-endian, and another
> behavior for little-endian.  Which one is chosen as the default is
> absolutely arbitrary.

The one that corresponds to the name should be the default.  I don't see
how you can argue otherwise.

> When we expand the built-in we will either
> accept the default or change to the other.  The existence of machine
> instructions that are only big-endian has nothing to do with the case;
> what matters is the existence of built-in functions that have two
> behaviors.

Everything in our backend is BE by default, just like everything in the
architecture is.  Yes, LE works almost as well (or just as well) in most
places, but everything is named assuming BE.  This consistency is hugely
important, without it the reader will not understand things as well and
as easily.
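
To make the naming argument concrete, here is a small software model. It is a sketch under assumptions, not the rs6000 backend code: Reg, load_elements and the function bodies are hypothetical, and the register is modelled as 16 bytes in big-endian numbering. The instruction named vclzlsbb is the default, and the little-endian case is the flagged swap to vctzlsbb.

```cpp
#include <array>
#include <cstdint>

// 16 register bytes, index 0 being byte 0 in big-endian numbering.
using Reg = std::array<std::uint8_t, 16>;

// Model of vclzlsbb: count bytes from byte 0 upward whose
// least-significant bit is zero, stopping at the first set one.
int vclzlsbb(const Reg &r) {
  int n = 0;
  while (n < 16 && (r[n] & 1) == 0) ++n;
  return n;
}

// Model of vctzlsbb: the same count, starting from byte 15 downward.
int vctzlsbb(const Reg &r) {
  int n = 0;
  while (n < 16 && (r[15 - n] & 1) == 0) ++n;
  return n;
}

// Place array elements into the register.  On big-endian, element i is
// byte i; on little-endian, element i ends up in byte 15 - i.
Reg load_elements(const Reg &elems, bool little_endian) {
  Reg r{};
  for (int i = 0; i < 16; ++i)
    r[little_endian ? 15 - i : i] = elems[i];
  return r;
}

// vec_cntlz_lsbb counts in element order.  The pattern matching the
// built-in's name is the default; little-endian is the special case.
int vec_cntlz_lsbb(const Reg &elems, bool little_endian) {
  Reg r = load_elements(elems, little_endian);
  return little_endian ? vctzlsbb(r) : vclzlsbb(r);
}
```

In this model, using vclzlsbb unconditionally gives the wrong count on little-endian, which is the kind of bug the PR95082 backport addresses.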

> >> That's something that should be fixed, I guess, but it's orthogonal
> >> to this patch.
> > Fixing it later is more work :-(
> >
> > Please at least open a bug report for it.
> 
> I can do that.

Thanks!

> > The other things need fixing before the patch is okay.
> 
> I'd ask you to reconsider, as explained above.

It is purely an implementation thing, and it is completely trivial to
do.  If you truly are afraid of breaking things (you should not be), it
is marginally acceptable to do this as the very first thing in stage 1.

Consistency matters.  Naming matters.  These shape how we think about
things.


Segher


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
Hi!

On Thu, Feb 10, 2022 at 05:43:26PM -0500, David Edelsohn wrote:
> -mbig/-mlittle only applies to Linux, not AIX and not Darwin.
> 
> I changed the BE testcases to add "-mbig" for little endian default
> targets because the compiler implicitly should be operating in big
> endian mode for other targets and the testcases should succeed.
> 
> For the LE testcases, I changed the target selector to
> "powerpc*-*-linux*" because that is the only PowerPC target that can
> operate as little endian.  I could not find a generic "le" target
> selector.

It is fully generic.  I added it in
  commit 89453706e0032f9a9c2107631873d9dad38dc14c
  Author: Segher Boessenkool 
  Date:   Wed May 23 19:31:05 2018 +0200

  testsuite: Introduce be/le selectors

It is very useful, just like ilp32 / lp64 :-)

> powerpc*-*-linux* understands "-mlittle", so I left the
> dg-options clause because there is no need to separate out "-mlittle"
> for that subset of PowerPC targets.

Yes.  powerpc64le-linux is the only problematic thing (you can make
powerpc64-linux be LE as well, and in principle you can make
powerpc64le-linux be BE, that just doesn't work right now -- this all
*does* work for powerpc-linux and powerpcle-linux).

Implicit assumptions make everything harder.


Segher


Re: [PATCH] Add single_use to simplification (uncond_op + vec_cond -> cond_op).

2022-02-10 Thread Richard Biener via Gcc-patches
On Fri, Feb 11, 2022 at 2:38 AM liuhongt  wrote:
>
> >>> Confirmed.  When uncond_op is expensive (there's *div amongst them) that's
> >>> definitely unwanted.  OTOH when it is cheap then combining will reduce
> >>> latency.
> >>>
> >>> GIMPLE wise it's a neutral transform if uncond_op is not single-use unless
> >>> we need two v_c_es.
> >>
> >> We can leave it to rtl combine/fwprop which will consider rtx_cost for them.
> >>
> >
> >That certainly makes sense for the !single_use case.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} and
> aarch64-unknown-linux-gnu.
> Also Bootstrapped and regtested on CLX with gcc configure --with-arch=native
> --with-cpu=native.
>
> Ok for trunk?

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR tree-optimization/104479
> * match.pd (uncond_op + vec_cond -> cond_op): Add single_use
> for the dest of uncond_op.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr104479.c: New test.
> * gcc.target/i386/cond_op_shift_w-1.c: Adjust testcase.
> ---
>  gcc/match.pd  | 12 ---
>  .../gcc.target/i386/cond_op_shift_w-1.c   |  3 +-
>  gcc/testsuite/gcc.target/i386/pr104479.c  | 33 +++
>  3 files changed, 42 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104479.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7bbb80172fc..c195c8cc882 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7385,13 +7385,15 @@ and,
>(vec_cond @0 (view_convert? (uncond_op@4 @1 @2)) @3)
>(with { tree op_type = TREE_TYPE (@4); }
> (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> -   && is_truth_type_for (op_type, TREE_TYPE (@0)))
> +   && is_truth_type_for (op_type, TREE_TYPE (@0))
> +   && single_use (@4))
>  (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3))
>   (simplify
>(vec_cond @0 @1 (view_convert? (uncond_op@4 @2 @3)))
>(with { tree op_type = TREE_TYPE (@4); }
> (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> -   && is_truth_type_for (op_type, TREE_TYPE (@0)))
> +   && is_truth_type_for (op_type, TREE_TYPE (@0))
> +   && single_use (@4))
>  (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1)))
>
>  /* Same for ternary operations.  */
> @@ -7401,13 +7403,15 @@ and,
>(vec_cond @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4)
>(with { tree op_type = TREE_TYPE (@5); }
> (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> -   && is_truth_type_for (op_type, TREE_TYPE (@0)))
> +   && is_truth_type_for (op_type, TREE_TYPE (@0))
> +   && single_use (@5))
>  (view_convert (cond_op @0 @1 @2 @3 (view_convert:op_type @4))
>   (simplify
>(vec_cond @0 @1 (view_convert? (uncond_op@5 @2 @3 @4)))
>(with { tree op_type = TREE_TYPE (@5); }
> (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
> -   && is_truth_type_for (op_type, TREE_TYPE (@0)))
> +   && is_truth_type_for (op_type, TREE_TYPE (@0))
> +   && single_use (@5))
>  (view_convert (cond_op (bit_not @0) @2 @3 @4
>   (view_convert:op_type @1)))
>  #endif
> diff --git a/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c b/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c
> index 54c854f2f37..23ab8fa166f 100644
> --- a/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c
> +++ b/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c
> @@ -1,7 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -march=skylake-avx512 -fdump-tree-optimized 
> -DTYPE=int16" } */
> -/* { dg-final { scan-tree-dump-times ".COND_SHR" 1 "optimized" } } */
> -/* { dg-final { scan-tree-dump-times ".COND_SHL" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "\.COND_" 4 "optimized" } } */
>  /* { dg-final { scan-assembler-times "vpsraw"  1 } } */
>  /* { dg-final { scan-assembler-times "vpsllw"  1 } } */
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr104479.c b/gcc/testsuite/gcc.target/i386/pr104479.c
> new file mode 100644
> index 000..4ca4c482542
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr104479.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=icelake-server -Ofast -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "\.COND_SHR" "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "\.COND_FMA" "optimized" } } */
> +
> +void
> +cond_shr (unsigned int* __restrict dst,
> + unsigned int* __restrict src,
> + unsigned int* __restrict y,
> + int i_width)
> +{
> +  for(int x = 0; x < i_width; x++)
> +{
> +  unsigned int temp = src[x] >> 3;
> +  dst[x] =  temp > 255 ? temp : y[x];
> +}
> +}
> +
> +
> +void
> +cond_fma (float* __restrict dst,
> + float* __restrict src1,
> + float* __restrict src2,
> + float* __restrict 

Re: Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'

2022-02-10 Thread Richard Biener via Gcc-patches
On Thu, Feb 10, 2022 at 11:20 PM Thomas Schwinge wrote:
>
> Hi!
>
> On 2022-02-10T16:36:51+, Michael Matz via Gcc-patches wrote:
> > On Thu, 10 Feb 2022, Richard Biener via Gcc-patches wrote:
> >> On Wed, Feb 9, 2022 at 2:21 PM Thomas Schwinge wrote:
> >> > OK to push (now, or in next development stage 1?) the attached
> >> > "Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'",
> >> > or should that be done differently -- or, per the current state (why?)
> >> > not at all?
>
> First, thanks for (indirectly) having confirmed that my confusion is not
> completely off, why this is currently missing.  ;-)
>
> >> Hmm, I wonder if we shouldn't simply dump DECL_UID as
> >>
> >>  'uid NNN'
> >
> > Yes, much better in line with the normal dump_tree output.
>
> >> somewhere.  For example after or before DECL_NAME?
>
> Heh -- that's what I wanted to do initially, but then I saw that we've
> currently got in 'print_node_brief' (and very similar in 'print_node'):
>
> [...]
>   fprintf (file, "%s <%s", prefix, get_tree_code_name (TREE_CODE (node)));
>   dump_addr (file, " ", node);
>
>   if (tclass == tcc_declaration)
> {
>   if (DECL_NAME (node))
> fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
>   else if (TREE_CODE (node) == LABEL_DECL
>&& LABEL_DECL_UID (node) != -1)
> {
>   if (dump_flags & TDF_NOUID)
> fprintf (file, " L.");
>   else
> fprintf (file, " L.%d", (int) LABEL_DECL_UID (node));
> }
>   else
> {
>   if (dump_flags & TDF_NOUID)
> fprintf (file, " %c.",
>  TREE_CODE (node) == CONST_DECL ? 'C' : 'D');
>   else
> fprintf (file, " %c.%u",
>  TREE_CODE (node) == CONST_DECL ? 'C' : 'D',
>  DECL_UID (node));
> }
> }
> [...]
>
> That is, if there's no 'DECL_NAME', we print 'L.[UID]', 'C.[UID]',
> 'D.[UID]'.  The same we do in 'gcc/tree-pretty-print.cc:dump_decl_name',
> I found.  But in the latter function, we also do it that same way if
> there is a 'DECL_NAME' ('i' -> 'iD.4249', for example), so that's why I
> copied that style back to my proposed 'print_node_brief'/'print_node'
> change.
>
> Are you now suggesting to only print 'DECL_NAME' as '[NAME] uid [UID]',
> but keep 'L.[UID]', 'C.[UID]', 'D.[UID]' in the "dot" form, or change
> these to 'L uid [UID]', 'C uid [UID]', 'D uid [UID]' correspondingly?

I'd say these should then be 'D.[UID] uid [UID]' even if that's
somewhat redundant.

> And also do the similar changes in
> 'gcc/tree-pretty-print.cc:dump_decl_name' (as well as another dozen or so
> places where such things are printed...), or don't change those?

Don't change those - you were targeting the tree dumper, not the
pretty printers.
The tree dumpers generally dump attributes separately.
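
One possible reading of this suggestion, sketched as a standalone formatter. Decl and brief_decl are hypothetical stand-ins rather than GCC's tree type or print_node_brief, label declarations are left out, and dropping the separate 'uid' attribute when UIDs are suppressed is an assumption, since the thread does not settle that detail.

```cpp
#include <string>

// Hypothetical stand-in for a declaration; not GCC's tree type.
struct Decl {
  std::string name;    // empty when the decl has no DECL_NAME
  bool is_const_decl;  // selects the 'C' prefix instead of 'D'
  unsigned uid;
};

// Format the name part of a brief dump.  `nouid` plays the role of the
// TDF_NOUID flag; the assumption here is that it also drops the
// separate 'uid NNN' attribute.
std::string brief_decl(const Decl &d, bool nouid) {
  std::string out;
  if (!d.name.empty()) {
    out = d.name;
  } else {
    out = d.is_const_decl ? "C." : "D.";
    if (!nouid)
      out += std::to_string(d.uid);
  }
  if (!nouid)
    out += " uid " + std::to_string(d.uid);
  return out;
}
```

A named decl comes out as 'i uid 4249' and an unnamed one as 'D.4249 uid 4249', which is the somewhat redundant form mentioned above.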


>
> I don't care very much which way, just have some slight preference to
> keep things similar.
>
>
> Regards,
>  Thomas


Re: [PATCH] Update -Warray-bounds documentation [PR104355]

2022-02-10 Thread Richard Sandiford via Gcc-patches
Martin Sebor via Gcc-patches  writes:
> The -Warray-bounds description in the manual is out of date in
> a couple of ways.  First it claims that the option is only active
> with optimization, which isn't entirely correct since at least one
> instance is issued even without it.  Second, the description of
> its level 2 suggests it controls the warning for all trailing
> array members, when it only controls it for trailing one-element
> arrays (this was made tighter in GCC 10 but we neglected to update
> the manual).
>
> In addition, the word "always" in the description of the option
> is also interpreted by some as implying that every instance of
> the warning is necessarily a true positive.  I've reworded
> the description to hopefully avoid this misreading(*).
>
> Finally, the generic text that talks about the interaction with
> optimizations says that -Wmaybe-uninitialized is not issued unless
> optimization is enabled.  That's also not accurate anymore since
> at least one instance of the warning is independent of optimization
> (passing uninitialized objects by reference to const arguments).
>
> The attached changes correct these oversights.
>
> Martin
>
> [*] It should probably be made clearer in the generic text that
> no instance of any warning, not just -Warray-bounds, should be
> taken to be a definitive indication of a bug in the code.  I've
> left that for later.

Yeah, maybe, but I guess it's unlikely to be useful in practice.
The chances of users happening to read a given bit of generic text
seem pretty low.

Doesn't mean we shouldn't do it of course (provided that we don't then
castigate users for having failed to notice it).

> Update -Warray-bounds documentation [PR104355].
>
> Resolves:
> PR middle-end/104355 - Misleading and outdated -Warray-bounds documentation
>
> gcc/ChangeLog:
>   * doc/invoke.texi (-Warray-bounds): Update documentation.
>
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b49ba22df89..b7b1f47a5ce 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -5641,8 +5641,10 @@ warns that an unrecognized option is present.
>  
>  The effectiveness of some warnings depends on optimizations also being
>  enabled. For example @option{-Wsuggest-final-types} is more effective
> -with link-time optimization and @option{-Wmaybe-uninitialized} does not
> -warn at all unless optimization is enabled.
> +with link-time optimization and some instances of other warnings may
> +not be issued at all unless optimization is enabled.  While optimization
> +in general improves the efficacy of control and data flow sensitive
> +warnings, in some cases it may also cause false positives.
>  
>  @table @gcctabopt
>  @item -Wpedantic
> @@ -7691,20 +7693,22 @@ void f (char c, int i)
>  @itemx -Warray-bounds=@var{n}
>  @opindex Wno-array-bounds
>  @opindex Warray-bounds
> -This option is only active when @option{-ftree-vrp} is active
> -(default for @option{-O2} and above). It warns about subscripts to arrays
> -that are always out of bounds. This warning is enabled by @option{-Wall}.
> +Warn about out of bounds subscripts or offsets into arrays. This warning
> +level is enabled by @option{-Wall}.  It is the most effective when

It's not clear to me which level is “this level” here.  How about
something like “This warning is enabled at level 1…”?

“It is more effective when” seems more natural to me than “It is the most
effective when”, but maybe that's just me.

Looks good to me otherwise FWIW.  OK with those changes if you agree and
if no-one has further comments by Monday.

Thanks,
Richard

> +@option{-ftree-vrp} is active (the default for @option{-O2} and above)
> +but a subset of instances are issued even without optimization.
>  
>  @table @gcctabopt
>  @item -Warray-bounds=1
> -This is the warning level of @option{-Warray-bounds} and is enabled
> +This is the default warning level of @option{-Warray-bounds} and is enabled
>  by @option{-Wall}; higher levels are not, and must be explicitly requested.
>  
>  @item -Warray-bounds=2
> -This warning level also warns about out of bounds access for
> -arrays at the end of a struct and for arrays accessed through
> -pointers. This warning level may give a larger number of
> -false positives and is deactivated by default.
> +This warning level also warns out of bounds accesses to trailing struct
> +members of one-element array types (@pxref{Zero Length}) and about
> +the intermediate results of pointer arithmetic that may yield out of
> +bounds values. This warning level may give a larger number of false
> +positives and is deactivated by default.
>  @end table
>  
>  @item -Warray-compare


Re: [PATCH] c: Add diagnostic when operator= is used as truth cond [PR25689]

2022-02-10 Thread Zhao Wei Liew via Gcc-patches
On Fri, 11 Feb 2022 at 00:14, Jason Merrill  wrote:
>
> On 2/9/22 21:18, Zhao Wei Liew via Gcc-patches wrote:
> > Hi!
> >
> > I wrote a patch for PR 25689, but I feel like it may not be the ideal
> > fix. Furthermore, there are some standing issues with the patch for
> > which I would like tips on how to fix them.
> > Specifically, there are 2 issues:
> > 1. GCC warns about  if (a.operator=(0)). That said, this may not be a
> > major issue as I don't think such code is widely written.
>
> Can you avoid this by checking CALL_EXPR_OPERATOR_SYNTAX?

Thanks! It worked. There is no longer a warning for `if (a.operator=(0))`.
The updated patch is at the bottom of this email.

>
> > 2. GCC does not warn for `if (a = b)` where the default copy/move
> > assignment operator is used.
>
> The code for trivial copy-assignment should be pretty recognizable, as a
> MODIFY_EXPR of two MEM_REFs; it's built in build_over_call after the
> comment "We must only copy the non-tail padding parts."

Ah, I see what you mean. Thanks! However, it seems like that's the case only
for non-empty classes. GCC already warns for MODIFY_EXPR, so there's
nothing we need to do there.

On the other hand, for empty classes, it seems that a COMPOUND_EXPR
is built in build_over_call under the is_really_empty_class guard (line 9791).
I don't understand the tree structure that I should identify though.
Could you give me any further explanations on that?

> > -  if (TREE_CODE (cond) == MODIFY_EXPR
> > +  /* Also check if this is a call to operator=().
> > + Example: if (my_struct = 5) {...}
> > +  */
> > +  tree fndecl = NULL_TREE;
> > +  if (TREE_OPERAND_LENGTH(cond) >= 1) {
> > +fndecl = cp_get_callee_fndecl(TREE_OPERAND(cond, 0));
>
> Let's use cp_get_callee_fndecl_nofold.
>
> Please add a space before all (

Got it. May I know why it's better to use *_nofold here?

On an unrelated note, I adjusted the if condition to use INDIRECT_REF_P (cond)
instead of TREE_OPERAND_LENGTH (cond) >= 1.
I hope that's better for semantics.

----- Everything below is the updated patch -----

When compiling the following code with g++ -Wparentheses, GCC does not
warn on the if statement:

struct A {
A& operator=(int);
operator bool();
};

void f(A a) {
if (a = 0); // no warning
}

This is because a = 0 is a call to operator=, which GCC does not check
for.

This patch fixes that by checking for calls to operator= when deciding
to warn.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590158.html
Changes since v1:
1. Use CALL_EXPR_OPERATOR_SYNTAX to avoid warnings for explicit
   operator=() calls.
2. Use INDIRECT_REF_P to filter implicit operator=() calls.
3. Use cp_get_callee_fndecl_nofold.
4. Add spaces before (.

PR c/25689

gcc/cp/ChangeLog:

* semantics.cc (maybe_convert_cond): Handle the operator=() case
  as well.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wparentheses-31.C: New test.
---
 gcc/cp/semantics.cc | 18 +-
 gcc/testsuite/g++.dg/warn/Wparentheses-31.C | 11 +++
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wparentheses-31.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 466d6b56871f4..b45903a6a6fde 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -836,7 +836,23 @@ maybe_convert_cond (tree cond)
   /* Do the conversion.  */
   cond = convert_from_reference (cond);

-  if (TREE_CODE (cond) == MODIFY_EXPR
+  /* Check for operator syntax calls to operator=().
+ Example: if (my_struct = 5) {...}
+  */
+  tree fndecl = NULL_TREE;
+  if (INDIRECT_REF_P (cond)) {
+tree fn = TREE_OPERAND (cond, 0);
+if (TREE_CODE (fn) == CALL_EXPR
+&& CALL_EXPR_OPERATOR_SYNTAX (fn)) {
+  fndecl = cp_get_callee_fndecl_nofold (fn);
+}
+  }
+
+  if ((TREE_CODE (cond) == MODIFY_EXPR
+|| (fndecl != NULL_TREE
+&& DECL_OVERLOADED_OPERATOR_P (fndecl)
+&& DECL_OVERLOADED_OPERATOR_IS (fndecl, NOP_EXPR)
+&& DECL_ASSIGNMENT_OPERATOR_P (fndecl)))
   && warn_parentheses
   && !warning_suppressed_p (cond, OPT_Wparentheses)
   && warning_at (cp_expr_loc_or_input_loc (cond),
diff --git a/gcc/testsuite/g++.dg/warn/Wparentheses-31.C b/gcc/testsuite/g++.dg/warn/Wparentheses-31.C
new file mode 100644
index 0..abd7476ccb461
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wparentheses-31.C
@@ -0,0 +1,11 @@
+/* PR c/25689 */
+/* { dg-options "-Wparentheses" }  */
+
+struct A {
+   A& operator=(int);
+   operator bool();
+};
+
+void f(A a) {
+   if (a = 0); /* { dg-warning "suggest parentheses" } */
+}
--
2.17.1


Re: [Patch, fortran] PR37336 (Finalization) - [F03] Finish derived-type finalization

2022-02-10 Thread Jerry D via Gcc-patches

For what it is worth.

On 2/10/22 11:49 AM, Harald Anlauf via Fortran wrote:

Hi Paul,

Am 10.02.22 um 13:25 schrieb Paul Richard Thomas via Fortran:

Conclusions on ifort:
(i) The agreement between gfortran, with the patch applied, and ifort is
strongest of all the other brands;
(ii) The disagreements are all down to the treatment of the parent
component of arrays of extended types: gfortran finalizes the parent
component as an array, whereas ifort does a scalarization. I have a patch
ready to do likewise.

Overall conclusions:
(i) Sort out whether or not derived type constructors are considered to be
functions;
(ii) Come to a conclusion about scalarization of parent components of
extended type arrays;
(iii) Check and, if necessary, correct the ordering of finalization in
intrinsic assignment of class arrays.
(iv) Finalization is difficult to graft on to existing pre-F2003 compilers,
as witnessed by the range of implementations.

I would be really grateful for thoughts on (i) and (ii). My gut feeling, as
remarked in the submission, is that we should aim to be as close as
possible, if not identical to, ifort. Happily, that is already the case.


I am really sorry to be such a bother, but before we think we should
do the same as Intel, we need to understand what Intel does and whether
that is actually correct.  Or not inconsistent with the standard.
And I would really like to understand even the most simple, stupid case.

I did reduce testcase finalize_38.f90 to an almost bare minimum,
see attached, and changed the main to

  type(simple), parameter   :: ThyType   = simple(21)
  type(simple)  :: ThyType2  = simple(22)
  type(simple), allocatable :: MyType, MyType2

  print *, "At start of program: ", final_count

  MyType = ThyType
  print *, "After 1st allocation:", final_count

  MyType2 = ThyType2
  print *, "After 2nd allocation:", final_count

Note that "ThyType" is now a parameter.


--- snip ---
Ignore whether ThyType is a parameter.  Regardless, MyType and MyType2
are allocated upon the assignment.  Now if these are never used
anywhere, it seems to me the deallocation can be done by the compiler
anywhere after the last time they are used.  So it can be either after
the PRINT statement before the end of the program, or right after the
assignment, before your PRINT statements that examine the value of
final_count.  I think the result is arbitrary/undefined in your reduced
test case.


I do not have the Intel compiler yet, so I was going to suggest seeing
what it does if your test program prints something from within MyType and
MyType2 after all your current print statements at the end.  Try this
variation of the main program.


program test_final
  use testmode
  implicit none
  type(simple), parameter   :: ThyType   = simple(21)
  type(simple)  :: ThyType2  = simple(22)
  type(simple), allocatable :: MyType, MyType2

  print *, "At start of program: ", final_count

  MyType = ThyType
  print *, "After 1st allocation:", final_count

  MyType2 = ThyType2
  print *, "After 2nd allocation:", final_count

  print  *, MyType%ind, MyType2%ind, final_count
  deallocate(Mytype)
  print  *, MyType%ind, MyType2%ind, final_count
  deallocate(Mytype2)
  print  *, MyType%ind, MyType2%ind, final_count

end program test_final

I get with trunk:

$ ./a.out
 At start of program:    0
 After 1st allocation:    0
 After 2nd allocation:   0
  21 22   0
   0  22   1
   0  0 2

Which makes sense to me.

Regards,

Jerry


[PATCH] Add single_use to simplification (uncond_op + vec_cond -> cond_op).

2022-02-10 Thread liuhongt via Gcc-patches
>>> Confirmed.  When uncond_op is expensive (there's *div amongst them) that's
>>> definitely unwanted.  OTOH when it is cheap then combining will reduce
>>> latency.
>>> 
>>> GIMPLE wise it's a neutral transform if uncond_op is not single-use unless
>>> we need two v_c_es.
>> 
>> We can leave it to rtl combine/fwprop which will consider rtx_cost for them.
>>
>
>That certainly makes sense for the !single_use case.
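
To make the single-use concern concrete, here is a scalar C sketch of the
loop body from the PR testcase.  The function names are made up for
illustration and are not part of the patch.  The shift result temp feeds
both the comparison and the select, so the unconditional shift has to be
computed anyway; folding it into a conditional operation as well would only
add a second shift.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of the cond_shr loop.  'temp' has two uses: the
   comparison and the selected value, so it is not single-use.  */
void
cond_shr_model (unsigned int *dst, const unsigned int *src,
                const unsigned int *y, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      unsigned int temp = src[i] >> 3;      /* the unconditional op */
      dst[i] = temp > 255 ? temp : y[i];    /* the select, reusing temp */
    }
}

/* Small self-check of the model; returns 1 on success.  */
int
cond_shr_selfcheck (void)
{
  unsigned int src[4] = { 8, 4096, 2047, 2048 };
  unsigned int y[4] = { 1, 2, 3, 4 };
  unsigned int dst[4] = { 0, 0, 0, 0 };

  cond_shr_model (dst, src, y, 4);

  assert (dst[0] == 1);     /* 8 >> 3 is 1, not above 255: take y[0] */
  assert (dst[1] == 512);   /* 4096 >> 3 is 512: take temp */
  assert (dst[2] == 3);     /* 2047 >> 3 is 255, not above 255: take y[2] */
  assert (dst[3] == 256);   /* 2048 >> 3 is 256: take temp */
  return 1;
}
```

Calling cond_shr_selfcheck () from a small driver exercises both arms of the
select.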

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} and
aarch64-unknown-linux-gnu.
Also Bootstrapped and regtested on CLX with gcc configure --with-arch=native
--with-cpu=native.

Ok for trunk?

gcc/ChangeLog:

PR tree-optimization/104479
* match.pd (uncond_op + vec_cond -> cond_op): Add single_use
for the dest of uncond_op.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104479.c: New test.
* gcc.target/i386/cond_op_shift_w-1.c: Adjust testcase.
---
 gcc/match.pd  | 12 ---
 .../gcc.target/i386/cond_op_shift_w-1.c   |  3 +-
 gcc/testsuite/gcc.target/i386/pr104479.c  | 33 +++
 3 files changed, 42 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104479.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7bbb80172fc..c195c8cc882 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7385,13 +7385,15 @@ and,
   (vec_cond @0 (view_convert? (uncond_op@4 @1 @2)) @3)
   (with { tree op_type = TREE_TYPE (@4); }
(if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
-   && is_truth_type_for (op_type, TREE_TYPE (@0)))
+   && is_truth_type_for (op_type, TREE_TYPE (@0))
+   && single_use (@4))
 (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3))
  (simplify
   (vec_cond @0 @1 (view_convert? (uncond_op@4 @2 @3)))
   (with { tree op_type = TREE_TYPE (@4); }
(if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
-   && is_truth_type_for (op_type, TREE_TYPE (@0)))
+   && is_truth_type_for (op_type, TREE_TYPE (@0))
+   && single_use (@4))
 (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1)))
 
 /* Same for ternary operations.  */
@@ -7401,13 +7403,15 @@ and,
   (vec_cond @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4)
   (with { tree op_type = TREE_TYPE (@5); }
(if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
-   && is_truth_type_for (op_type, TREE_TYPE (@0)))
+   && is_truth_type_for (op_type, TREE_TYPE (@0))
+   && single_use (@5))
 (view_convert (cond_op @0 @1 @2 @3 (view_convert:op_type @4))
  (simplify
   (vec_cond @0 @1 (view_convert? (uncond_op@5 @2 @3 @4)))
   (with { tree op_type = TREE_TYPE (@5); }
(if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
-   && is_truth_type_for (op_type, TREE_TYPE (@0)))
+   && is_truth_type_for (op_type, TREE_TYPE (@0))
+   && single_use (@5))
 (view_convert (cond_op (bit_not @0) @2 @3 @4
  (view_convert:op_type @1)))
 #endif
diff --git a/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c 
b/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c
index 54c854f2f37..23ab8fa166f 100644
--- a/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c
+++ b/gcc/testsuite/gcc.target/i386/cond_op_shift_w-1.c
@@ -1,7 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -march=skylake-avx512 -fdump-tree-optimized -DTYPE=int16" 
} */
-/* { dg-final { scan-tree-dump-times ".COND_SHR" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times ".COND_SHL" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\.COND_" 4 "optimized" } } */
 /* { dg-final { scan-assembler-times "vpsraw"  1 } } */
 /* { dg-final { scan-assembler-times "vpsllw"  1 } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr104479.c 
b/gcc/testsuite/gcc.target/i386/pr104479.c
new file mode 100644
index 000..4ca4c482542
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104479.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-march=icelake-server -Ofast -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "\.COND_SHR" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "\.COND_FMA" "optimized" } } */
+
+void
+cond_shr (unsigned int* __restrict dst,
+ unsigned int* __restrict src,
+ unsigned int* __restrict y,
+ int i_width)
+{
+  for(int x = 0; x < i_width; x++)
+{
+  unsigned int temp = src[x] >> 3;
+  dst[x] =  temp > 255 ? temp : y[x];
+}
+}
+
+
+void
+cond_fma (float* __restrict dst,
+ float* __restrict src1,
+ float* __restrict src2,
+ float* __restrict src3,
+ unsigned int* __restrict y,
+ int i_width)
+{
+  for(int x = 0; x < i_width; x++)
+{
+  float temp = __builtin_fmaf (src1[x], src2[x], src3[x]);
+  dst[x] = temp > 0.0f ? temp : y[x];
+}
+}
-- 
2.18.1



[PATCH] Update -Warray-bounds documentation [PR104355]

2022-02-10 Thread Martin Sebor via Gcc-patches

The -Warray-bounds description in the manual is out of date in
a couple of ways.  First it claims that the option is only active
with optimization, which isn't entirely correct since at least one
instance is issued even without it.  Second, the description of
its level 2 suggests it controls the warning for all trailing
array members, when it only controls it for trailing one-element
arrays (this was made tighter in GCC 10 but we neglected to update
the manual).

In addition, the word "always" in the description of the option
is also interpreted by some as implying that every instance of
the warning is necessarily a true positive.  I've reworded
the description to hopefully avoid this misreading(*).

Finally, the generic text that talks about the interaction with
optimizations says that -Wmaybe-uninitialized is not issued unless
optimization is enabled.  That's also not accurate anymore since
at least one instance of the warning is independent of optimization
(passing uninitialized objects by reference to const arguments).

The attached changes correct these oversights.
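
For readers unfamiliar with the idiom that level 2 is about, the following
is a minimal sketch of a trailing one-element array used as if it were
a flexible array member.  The struct and function names are invented for
this example.  Whether a given GCC release diagnoses the subscript below
depends on the version and options in use, so no particular diagnostic is
claimed here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pre-C99 "struct hack": a trailing one-element array standing in
   for a flexible array member.  */
struct msg
{
  size_t len;
  char text[1];
};

/* Allocate a msg holding a copy of S, including its terminating NUL.
   Returns NULL if the allocation fails.  */
struct msg *
msg_new (const char *s)
{
  size_t len = strlen (s);
  struct msg *m = malloc (sizeof *m + len);
  if (!m)
    return NULL;

  m->len = len;
  /* For i > 0 the subscript exceeds the declared bound of one element,
     although the access stays inside the allocation.  */
  for (size_t i = 0; i <= len; i++)
    m->text[i] = s[i];
  return m;
}
```

A caller would use msg_new ("abc") and release the result with free ().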

Martin

[*] It should probably be made clearer in the generic text that
no instance of any warning, not just -Warray-bounds, should be
taken to be a definitive indication of a bug in the code.  I've
left that for later.

Update -Warray-bounds documentation [PR104355].

Resolves:
PR middle-end/104355 - Misleading and outdated -Warray-bounds documentation

gcc/ChangeLog:
	* doc/invoke.texi (-Warray-bounds): Update documentation.


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b49ba22df89..b7b1f47a5ce 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5641,8 +5641,10 @@ warns that an unrecognized option is present.
 
 The effectiveness of some warnings depends on optimizations also being
 enabled. For example @option{-Wsuggest-final-types} is more effective
-with link-time optimization and @option{-Wmaybe-uninitialized} does not
-warn at all unless optimization is enabled.
+with link-time optimization and some instances of other warnings may
+not be issued at all unless optimization is enabled.  While optimization
+in general improves the efficacy of control and data flow sensitive
+warnings, in some cases it may also cause false positives.
 
 @table @gcctabopt
 @item -Wpedantic
@@ -7691,20 +7693,22 @@ void f (char c, int i)
 @itemx -Warray-bounds=@var{n}
 @opindex Wno-array-bounds
 @opindex Warray-bounds
-This option is only active when @option{-ftree-vrp} is active
-(default for @option{-O2} and above). It warns about subscripts to arrays
-that are always out of bounds. This warning is enabled by @option{-Wall}.
+Warn about out of bounds subscripts or offsets into arrays. This warning
+level is enabled by @option{-Wall}.  It is the most effective when
+@option{-ftree-vrp} is active (the default for @option{-O2} and above)
+but a subset of instances are issued even without optimization.
 
 @table @gcctabopt
 @item -Warray-bounds=1
-This is the warning level of @option{-Warray-bounds} and is enabled
+This is the default warning level of @option{-Warray-bounds} and is enabled
 by @option{-Wall}; higher levels are not, and must be explicitly requested.
 
 @item -Warray-bounds=2
-This warning level also warns about out of bounds access for
-arrays at the end of a struct and for arrays accessed through
-pointers. This warning level may give a larger number of
-false positives and is deactivated by default.
+This warning level also warns out of bounds accesses to trailing struct
+members of one-element array types (@pxref{Zero Length}) and about
+the intermediate results of pointer arithmetic that may yield out of
+bounds values. This warning level may give a larger number of false
+positives and is deactivated by default.
 @end table
 
 @item -Warray-compare


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 10, 2022 at 05:43:26PM -0500, David Edelsohn via Gcc-patches wrote:
> For the LE testcases, I changed the target selector to
> "powerpc*-*-linux*" because that is the only PowerPC target that can
> operate as little endian.  I could not find a generic "le" target
> selector.  powerpc*-*-linux* understands "-mlittle", so I left the

There is
proc check_effective_target_be { } {
return [check_no_compiler_messages be object {
int dummy[__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : -1];
}]
}
and
proc check_effective_target_le { } {
return [check_no_compiler_messages le object {
int dummy[__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ ? 1 : -1];
}]
}
so you can just use { dg-do compile { target { le } } } etc.
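
As a rough illustration of what those selectors key on, here is a standalone
C sketch that reads the same predefined macros and cross-checks them against
a run-time probe.  It assumes a compiler that predefines __BYTE_ORDER__ and
__ORDER_LITTLE_ENDIAN__ (GCC and Clang do); the function names are invented
for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* In a testcase the selector would appear as:  */
/* { dg-do compile { target { le } } } */

/* Compile-time answer, from the macros the 'le' selector tests.  */
int
little_endian_by_macro (void)
{
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return 1;
#else
  return 0;
#endif
}

/* Run-time answer: look at the lowest-addressed byte of a known
   32-bit value.  */
int
little_endian_by_probe (void)
{
  uint32_t v = 0x01020304u;
  unsigned char first;

  memcpy (&first, &v, 1);
  return first == 0x04;
}
```

On a big- or little-endian target both functions are expected to agree.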

Jakub



testsuite: Fix up g++.dg/warn/Wuninitialized-32.C test for ilp32 [PR104373]

2022-02-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 10, 2022 at 10:57:02AM +0100, Richard Biener via Gcc-patches wrote:
> > >>>   * g++.dg/warn/Wuninitialized-32.C: New testcase.

The testcase FAILs whenever size_t is not unsigned long:
FAIL: g++.dg/warn/Wuninitialized-32.C  -std=c++98 (test for excess errors)
Excess errors:
.../gcc/testsuite/g++.dg/warn/Wuninitialized-32.C:4:7: error: 'operator new' 
takes type 'size_t' ('unsigned int') as first parameter [-fpermissive]

Fixed by using __SIZE_TYPE__ instead of unsigned long.
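
A short sketch of why __SIZE_TYPE__ is the portable spelling: it is
predefined by the compiler to whatever type size_t is on the target, so it
matches without including any header.  The typedef and function names below
are made up for illustration, and the check assumes GCC or Clang with C11
_Generic available.

```c
#include <assert.h>
#include <stddef.h>

/* __SIZE_TYPE__ is predefined by the compiler; no header is needed
   for this typedef itself.  */
typedef __SIZE_TYPE__ my_size_t;

/* Fails to compile if the two types ever differed in width.  */
_Static_assert (sizeof (my_size_t) == sizeof (size_t),
                "__SIZE_TYPE__ and size_t differ in size");

/* _Generic selects on the exact type, not only the width.
   Returns 1 when my_size_t is the same type as size_t.  */
int
size_type_matches (void)
{
  return _Generic ((my_size_t) 0, size_t: 1, default: 0);
}
```

Hard-coding unsigned long would pass the same check only on targets where
size_t happens to be unsigned long.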

Regtested on x86_64-linux -m32/-m64, committed to trunk as obvious.

2022-02-11  Jakub Jelinek  

PR tree-optimization/104373
* g++.dg/warn/Wuninitialized-32.C (operator new[]): Use __SIZE_TYPE__
as type of the first argument instead of unsigned long.

--- gcc/testsuite/g++.dg/warn/Wuninitialized-32.C.jj	2022-02-11 00:19:22.376064016 +0100
+++ gcc/testsuite/g++.dg/warn/Wuninitialized-32.C	2022-02-11 00:25:45.194857715 +0100
@@ -1,7 +1,7 @@
 // { dg-do compile }
 // { dg-additional-options "-Wall" }
 
-void* operator new[](unsigned long, void* __p);
+void* operator new[](__SIZE_TYPE__, void* __p);
 
 struct allocator
 {


Jakub



Re: [PATCH] handle "invisible" reference in -Wdangling-pointer (PR104436)

2022-02-10 Thread Martin Sebor via Gcc-patches

On 2/8/22 15:37, Jason Merrill wrote:

On 2/8/22 16:59, Martin Sebor wrote:

Transforming by-value arguments to by-reference, as GCC does for some
class types, can trigger -Wdangling-pointer when the argument is used
to store the address of a local variable.  Since the stored value is
not accessible in the caller the warning is a false positive.

The attached patch handles this case by excluding PARM_DECLs with
the DECL_BY_REFERENCE bit set from consideration.

While testing the patch I noticed some instances of the warning are
unintentionally duplicated as the pass runs more than once.  To avoid
that, I also introduce warning suppression into the handler for this
instance of the warning.  (There might still be others.)


The second test should verify that we do warn about returning 't' from a 
function; we don't want to ignore the DECL_BY_REFERENCE RESULT_DECL.


The indirect aggregate case isn't handled and needs more work but
since you brought it up I thought I should look into finishing it.
The attached patch #2 adds support for it.  It also incorporates
Richard's suggestion to test PARM_DECL.  Patch #2 applies on top
of patch #1 which is unchanged from the first revision.

I have retested it on x86_64-linux and by building Glibc and
Binutils + GDB.

If now is too late for the aggregate enhancement I'm okay with
deferring it until stage 1.




+  tree var = SSA_NAME_VAR (lhs_ref.ref);
+  if (DECL_BY_REFERENCE (var))
+    /* Avoid by-value arguments transformed into by-reference.  */
+    continue;


I wonder if we can express this property of invisiref parms somewhere 
more general?  I imagine optimizations would find it useful as well. 
Could pointer_query somehow treat the reference as pointing to a 
function-local object?


I don't quite see where in the pointer_query class this would be
useful (the class also isn't used for optimization).  I could add
a helper to the access_ref class to query this property in warning
code but as this is the only potential client I'm also not sure
that's quite what you had in mind.  I'd need some guidance as to
what you're thinking of here.

Martin




I previously tried to express this by marking the reference as 
'restrict', but that was wrong 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97474).


Jason

Subject: [PATCH 1/2] Avoid -Wdangling-pointer for by-transparent-reference
 arguments [PR104436].

This change avoids -Wdangling-pointer for by-value arguments transformed
into by-transparent-reference.

Resolves:
PR middle-end/104436 - spurious -Wdangling-pointer assigning local address to a class passed by value

gcc/ChangeLog:

	PR middle-end/104436
	* gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores):
	Check for warning suppression.  Avoid by-value arguments transformed
	into by-transparent-reference.

gcc/testsuite/ChangeLog:

	PR middle-end/104436
	* c-c++-common/Wdangling-pointer-7.c: New test.
	* g++.dg/warn/Wdangling-pointer-4.C: New test.
---
 gcc/gimple-ssa-warn-access.cc | 13 ++-
 .../c-c++-common/Wdangling-pointer-7.c| 20 +++
 .../g++.dg/warn/Wdangling-pointer-4.C | 34 +++
 3 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/Wdangling-pointer-7.c
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-pointer-4.C

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 80d41ea4383..0c319a32b70 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -4517,6 +4517,9 @@ pass_waccess::check_dangling_stores (basic_block bb,
   if (!stmt)
 	break;
 
+  if (warning_suppressed_p (stmt, OPT_Wdangling_pointer_))
+	continue;
+
   if (is_gimple_call (stmt)
 	  && !(gimple_call_flags (stmt) & (ECF_CONST | ECF_PURE)))
 	/* Avoid looking before nonconst, nonpure calls since those might
@@ -4542,10 +4545,16 @@ pass_waccess::check_dangling_stores (basic_block bb,
 	}
   else if (TREE_CODE (lhs_ref.ref) == SSA_NAME)
 	{
-	  /* Avoid looking at or before stores into unknown objects.  */
 	  gimple *def_stmt = SSA_NAME_DEF_STMT (lhs_ref.ref);
 	  if (!gimple_nop_p (def_stmt))
+	/* Avoid looking at or before stores into unknown objects.  */
 	return;
+
+	  tree var = SSA_NAME_VAR (lhs_ref.ref);
+	  if (DECL_BY_REFERENCE (var))
+	/* Avoid by-value arguments transformed into by-reference.  */
+	continue;
+
 	}
   else if (TREE_CODE (lhs_ref.ref) == MEM_REF)
 	{
@@ -4578,6 +4587,8 @@ pass_waccess::check_dangling_stores (basic_block bb,
 		  "storing the address of local variable %qD in %qE",
 		  rhs_ref.ref, lhs))
 	{
+	  suppress_warning (stmt, OPT_Wdangling_pointer_);
+
 	  location_t loc = DECL_SOURCE_LOCATION (rhs_ref.ref);
 	  inform (loc, "%qD declared here", rhs_ref.ref);
 
diff --git a/gcc/testsuite/c-c++-common/Wdangling-pointer-7.c b/gcc/testsuite/c-c++-common/Wdangling-pointer-7.c
new file mode 100644
index 

Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread David Edelsohn via Gcc-patches
On Thu, Feb 10, 2022 at 4:17 PM Bill Schmidt  wrote:
>
> Hi!
>
> On 2/10/22 2:50 PM, Segher Boessenkool wrote:
> > On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
> >> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
> >> These built-ins were misimplemented as always having big-endian semantics.
> >>
> >> Because the built-in infrastructure has changed, the modifications to the
> >> source are different but achieve the same purpose.  The modifications to
> >> the test suite are identical (after fixing the issue with -mbig that David
> >> pointed out with the original patch).
> >>  /* 1 argument vector functions added in ISA 3.0 (power9). */
> >> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",  CONST,  
> >> vclzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",CONST,  vclzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",CONST,  vclzlsbb_v4si)
> >> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",  CONST,  
> >> vctzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",CONST,  vctzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",CONST,  vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",  CONST,  
> >> vctzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",CONST,  vctzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",CONST,  vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",  CONST,  
> >> vclzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",CONST,  vclzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",CONST,  vclzlsbb_v4si)
> > Please change the default to be equal to the builtin name, so, the BE
> > version.  We do that everywhere else as well, and it makes a lot more
> > sense (since everything in Power has BE numbering).
> >
> > The trunk version has this correct afaics?
>
> No, trunk has this, for example:
>
>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
> VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
>
> So the backport matches what is on trunk.
>
> Throughout the new builtin infrastructure, the defaults are set for
> little-endian, and the "endian" flag changes behavior for big-endian.
>
> >
> >> --- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> >> @@ -1,6 +1,7 @@
> >>  /* { dg-do compile { target { powerpc*-*-* } } } */
> > (Delete the redundant target clause when modifying any testcase, please).
>
> Okay.
> >
> >>  /* { dg-require-effective-target powerpc_p9vector_ok } */
> >>  /* { dg-options "-mdejagnu-cpu=power9" } */
> >> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
> > You don't need the target clause, if it already is BE by default it does
> > not do anything to add it redundantly.
> >
> > But this is wrong anyway: the name of the target triple does not say
> > whether we are BE or LE.  Instead you should use the be or le selectors.
> > But again, just add -mbig always.
>
> This was added by David Edelsohn to the trunk version of the patch, because
> -mbig actually is not supported on all subtargets.  (I found that quite
> surprising also.)  Apparently this doesn't work on AIX, for example.  But
> -mlittle works everywhere.  Go figure.

-mbig/-mlittle only applies to Linux, not AIX and not Darwin.

I changed the BE testcases to add "-mbig" for little endian default
targets because the compiler implicitly should be operating in big
endian mode for other targets and the testcases should succeed.

For the LE testcases, I changed the target selector to
"powerpc*-*-linux*" because that is the only PowerPC target that can
operate as little endian.  I could not find a generic "le" target
selector.  powerpc*-*-linux* understands "-mlittle", so I left the
dg-options clause because there is no need to separate out "-mlittle"
for that subset of PowerPC targets.

Thanks, David

>
> That's something that should be fixed, I guess, but it's orthogonal
> to this patch.
>
> Thanks!
> Bill
>
> >
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
> >> @@ -0,0 +1,15 @@
> >> +/* { dg-do compile { target { powerpc*-*-* } } } */
> >> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> >> +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
> > And here you do it correctly :-)
> >
> > Okay with those fixes (all happen a few times).  Thanks!
> >
> >
> > Segher


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/10/22 4:11 PM, Segher Boessenkool wrote:
> On Thu, Feb 10, 2022 at 03:17:05PM -0600, Bill Schmidt wrote:
  /* 1 argument vector functions added in ISA 3.0 (power9). */
 -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",CONST,  vclzlsbb_v16qi)
 -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",  CONST,  vclzlsbb_v8hi)
 -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",  CONST,  vclzlsbb_v4si)
 -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",CONST,  vctzlsbb_v16qi)
 -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",  CONST,  vctzlsbb_v8hi)
 -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",  CONST,  vctzlsbb_v4si)
 +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",CONST,  vctzlsbb_v16qi)
 +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",  CONST,  vctzlsbb_v8hi)
 +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",  CONST,  vctzlsbb_v4si)
 +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",CONST,  vclzlsbb_v16qi)
 +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",  CONST,  vclzlsbb_v8hi)
 +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",  CONST,  vclzlsbb_v4si)
>>> Please change the default to be equal to the builtin name, so, the BE
>>> version.  We do that everywhere else as well, and it makes a lot more
>>> sense (since everything in Power has BE numbering).
>>>
>>> The trunk version has this correct afaics?
>> No, trunk has this, for example:
>>
>>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
> I see this on trunk:
>
>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
> VCLZLSBB_V16QI vclzlsbb_v16qi {}
>
> Oh, you changed it?  Please fix it, then.

In a patch you approved, yes.  I don't really understand why you want
it changed now.  You must not be looking at the most recent trunk
revision.

>
>> Throughout the new builtin infrastructure, the defaults are set for
>> little-endian, and the "endian" flag changes behavior for big-endian.
> That is a big mistake.  There are many machine instructions  that are
> *always* big-endian (most even!), and none that are always
> little-endian.  So this should be fixed, sooner rather than later :-(

That does not seem like a good idea in stage 4 to me.  That requires
yet another patch to reverse a bunch of other things unnecessarily.

This is a purely arbitrary choice.  The endian flag is only used when
a built-in function must have one behavior for big-endian, and another
behavior for little-endian.  Which one is chosen as the default is
absolutely arbitrary.  When we expand the built-in we will either
accept the default or change to the other.  The existence of machine
instructions that are only big-endian has nothing to do with the case;
what matters is the existence of built-in functions that have two
behaviors.
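
One way to picture a built-in that has two behaviors is the simplified model
below.  It is not the rs6000 implementation, and every function name in it
is invented for illustration.  It assumes the instructions count bytes whose
least-significant bit is clear, starting from register byte 0 (big-endian
numbering) for the "leading" form and from byte 15 for the "trailing" form,
and that on little-endian element 0 lives in register byte 15.

```c
#include <assert.h>

enum { NBYTES = 16 };

/* Model of the "leading" instruction: count bytes with a clear low
   bit starting at register byte 0 (big-endian numbering).  */
int
hw_clzlsbb (const unsigned char reg[NBYTES])
{
  int n = 0;
  while (n < NBYTES && (reg[n] & 1) == 0)
    n++;
  return n;
}

/* Model of the "trailing" instruction: same count, starting at
   register byte 15.  */
int
hw_ctzlsbb (const unsigned char reg[NBYTES])
{
  int n = 0;
  while (n < NBYTES && (reg[NBYTES - 1 - n] & 1) == 0)
    n++;
  return n;
}

/* Place vector elements into a register image.  With LITTLE_ENDIAN_P
   nonzero, element 0 goes to register byte 15.  */
void
load_elements (unsigned char reg[NBYTES],
               const unsigned char elems[NBYTES], int little_endian_p)
{
  for (int i = 0; i < NBYTES; i++)
    reg[little_endian_p ? NBYTES - 1 - i : i] = elems[i];
}
```

In this model a "count from element 0" built-in is served by hw_clzlsbb on
a big-endian register image and by hw_ctzlsbb on a little-endian one, which
is the swap the patch expresses; which of the two is written down as the
default is a separate question.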

  /* { dg-require-effective-target powerpc_p9vector_ok } */
  /* { dg-options "-mdejagnu-cpu=power9" } */
 +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
>>> You don't need the target clause, if it already is BE by default it does
>>> not do anything to add it redundantly.
>>>
>>> But this is wrong anyway: the name of the target triple does not say
>>> whether we are BE or LE.  Instead you should use the be or le selectors.
>>> But again, just add -mbig always.
>> This was added by David Edelsohn to the trunk version of the patch, because
>> -mbig actually is not supported on all subtargets.  (I found that quite
>> surprising also.)
> Huh.  Yeah I think I encountered that before.
>
> So this is because these options are in sysv4.opt .
>
>> Apparently this doesn't work on AIX, for example.  But 
>> -mlittle works everywhere.  Go figure.
> ... and -mlittle is exactly the same?  Wtw.
>
> I only looked at the .opt files, maybe one of them is handled directly,
> or more likely in specs?  And not symmetrically?
>
>> That's something that should be fixed, I guess, but it's orthogonal
>> to this patch.
> Fixing it later is more work :-(
>
> Please at least open a bug report for it.

I can do that.

>
>
> The other things need fixing before the patch is okay.

I'd ask you to reconsider, as explained above.

Thanks,
Bill

>
>
> Segher


Re: Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'

2022-02-10 Thread Thomas Schwinge
Hi!

On 2022-02-10T16:36:51+, Michael Matz via Gcc-patches 
 wrote:
> On Thu, 10 Feb 2022, Richard Biener via Gcc-patches wrote:
>> On Wed, Feb 9, 2022 at 2:21 PM Thomas Schwinge  
>> wrote:
>> > OK to push (now, or in next development stage 1?) the attached
>> > "Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'",
>> > or should that be done differently -- or, per the current state (why?)
>> > not at all?

First, thanks for (indirectly) having confirmed that my confusion is not
completely off, why this is currently missing.  ;-)

>> Hmm, I wonder if we shouldn't simply dump DECL_UID as
>>
>>  'uid NNN'
>
> Yes, much better in line with the normal dump_tree output.

>> somewhere.  For example after or before DECL_NAME?

Heh -- that's what I wanted to do initially, but then I saw that we've
currently got in 'print_node_brief' (and very similar in 'print_node'):

[...]
  fprintf (file, "%s <%s", prefix, get_tree_code_name (TREE_CODE (node)));
  dump_addr (file, " ", node);

  if (tclass == tcc_declaration)
{
  if (DECL_NAME (node))
fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
  else if (TREE_CODE (node) == LABEL_DECL
   && LABEL_DECL_UID (node) != -1)
{
  if (dump_flags & TDF_NOUID)
fprintf (file, " L.");
  else
fprintf (file, " L.%d", (int) LABEL_DECL_UID (node));
}
  else
{
  if (dump_flags & TDF_NOUID)
fprintf (file, " %c.",
 TREE_CODE (node) == CONST_DECL ? 'C' : 'D');
  else
fprintf (file, " %c.%u",
 TREE_CODE (node) == CONST_DECL ? 'C' : 'D',
 DECL_UID (node));
}
}
[...]

That is, if there's no 'DECL_NAME', we print 'L.[UID]', 'C.[UID]',
'D.[UID]'.  The same we do in 'gcc/tree-pretty-print.cc:dump_decl_name',
I found.  But in the latter function, we also do it that same way if
there is a 'DECL_NAME' ('i' -> 'iD.4249', for example), so that's why I
copied that style back to my proposed 'print_node_brief'/'print_node'
change.
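
To keep the two spellings apart, here is a small standalone sketch of the
"dot" form described above.  It is not GCC code: the flag values and the
function name are stand-ins invented for this example, and it assumes that
a named declaration gets the UID suffix only when the UID flag is set.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for TDF_UID and TDF_NOUID; not GCC's actual values.  */
enum { SKETCH_UID = 1, SKETCH_NOUID = 2 };

/* Format a declaration in the "dot" style:
     named, no UID flag     ->  "i"
     named, UID flag        ->  "iD.4249"
     unnamed                ->  "D.4249"
     unnamed, NOUID flag    ->  "D."
   KIND is 'D', 'C' or 'L'.  Returns what snprintf returns.  */
int
format_decl (char *buf, size_t size, const char *name, char kind,
             unsigned int uid, int flags)
{
  if (name && !(flags & SKETCH_UID))
    return snprintf (buf, size, "%s", name);
  if (flags & SKETCH_NOUID)
    return snprintf (buf, size, "%s%c.", name ? name : "", kind);
  return snprintf (buf, size, "%s%c.%u", name ? name : "", kind, uid);
}
```

The alternative '[NAME] uid [UID]' spelling would only change the last
format string.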

Are you now suggesting to only print 'DECL_NAME' as '[NAME] uid [UID]',
but keep 'L.[UID]', 'C.[UID]', 'D.[UID]' in the "dot" form, or change
these to 'L uid [UID]', 'C uid [UID]', 'D uid [UID]' correspondingly?
And also do the similar changes in
'gcc/tree-pretty-print.cc:dump_decl_name' (as well as another dozen or so
places where such things are printed...), or don't change those?

I don't care very much which way, just have some slight preference to
keep things similar.


Grüße
 Thomas


[PATCH, testsuite] Fix attr-retain-*.c testcases on 32-bit PowerPC [PR100407]

2022-02-10 Thread Pat Haugen via Gcc-patches
Per Alan's comment in the bugzilla, fix attr-retain-* testcases for 32-bit 
PowerPC.

Bootstrapped and regression tested on powerpc64(32/64) and powerpc64le.
Ok for master?

-Pat


2022-02-10  Pat Haugen  

PR testsuite/100407

gcc/testsuite/
* gcc.c-torture/compile/attr-retain-1.c: Add -G0 for 32-bit PowerPC.
* gcc.c-torture/compile/attr-retain-2.c: Likewise.



diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c 
b/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
index 6cab155..4a366eb 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-retain-1.c
@@ -1,4 +1,5 @@
 /* { dg-do compile { target R_flag_in_section } } */
+/* { dg-options "-G0" { target { powerpc*-*-* && ilp32 } } } */
 /* { dg-final { scan-assembler ".text.*,\"axR\"" } } */
 /* { dg-final { scan-assembler ".bss.*,\"awR\"" } } */
 /* { dg-final { scan-assembler ".data.*,\"awR\"" } } */
diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c 
b/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c
index 0208ffe..d9fc150 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-retain-2.c
@@ -11,5 +11,6 @@
 /* { dg-final { scan-assembler ".bss.used_lcomm2,\"awR\"" { target arm-*-* } } 
} */
 /* { dg-final { scan-assembler ".data.used_foo_sec,\"awR\"" } } */
 /* { dg-options "-ffunction-sections -fdata-sections" } */
+/* { dg-options "-ffunction-sections -fdata-sections -G0" { target { 
powerpc*-*-* && ilp32 } } } */
 
 #include "attr-retain-1.c"


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
On Thu, Feb 10, 2022 at 03:17:05PM -0600, Bill Schmidt wrote:
> >>  /* 1 argument vector functions added in ISA 3.0 (power9). */
> >> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",CONST,  vclzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",  CONST,  vclzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",  CONST,  vclzlsbb_v4si)
> >> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",CONST,  vctzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",  CONST,  vctzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",  CONST,  vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",CONST,  vctzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",  CONST,  vctzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",  CONST,  vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",CONST,  vclzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",  CONST,  vclzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",  CONST,  vclzlsbb_v4si)
> > Please change the default to be equal to the builtin name, so, the BE
> > version.  We do that everywhere else as well, and it makes a lot more
> > sense (since everything in Power has BE numbering).
> >
> > The trunk version has this correct afaics?
> 
> No, trunk has this, for example:
> 
>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}

I see this on trunk:

  const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
VCLZLSBB_V16QI vclzlsbb_v16qi {}

Oh, you changed it?  Please fix it, then.

> Throughout the new builtin infrastructure, the defaults are set for
> little-endian, and the "endian" flag changes behavior for big-endian.

That is a big mistake.  There are many machine instructions  that are
*always* big-endian (most even!), and none that are always
little-endian.  So this should be fixed, sooner rather than later :-(

> >>  /* { dg-require-effective-target powerpc_p9vector_ok } */
> >>  /* { dg-options "-mdejagnu-cpu=power9" } */
> >> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
> > You don't need the target clause, if it already is BE by default it does
> > not do anything to add it redundantly.
> >
> > But this is wrong anyway: the name of the target triple does not say
> > whether we are BE or LE.  Instead you should use the be or le selectors.
> > But again, just add -mbig always.
> 
> This was added by David Edelsohn to the trunk version of the patch, because
> -mbig actually is not supported on all subtargets.  (I found that quite
> surprising also.)

Huh.  Yeah I think I encountered that before.

So this is because these options are in sysv4.opt .

> Apparently this doesn't work on AIX, for example.  But 
> -mlittle works everywhere.  Go figure.

... and -mlittle is exactly the same?  Wtw.

I only looked at the .opt files, maybe one of them is handled directly,
or more likely in specs?  And not symmetrically?

> That's something that should be fixed, I guess, but it's orthogonal
> to this patch.

Fixing it later is more work :-(

Please at least open a bug report for it.


The other things need fixing before the patch is okay.


Segher


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/10/22 2:50 PM, Segher Boessenkool wrote:
> On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
>> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
>> These built-ins were misimplemented as always having big-endian semantics.
>>
>> Because the built-in infrastructure has changed, the modifications to the
>> source are different but achieve the same purpose.  The modifications to
>> the test suite are identical (after fixing the issue with -mbig that David
>> pointed out with the original patch).
>>  /* 1 argument vector functions added in ISA 3.0 (power9). */
>> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",  CONST,  vclzlsbb_v16qi)
>> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",CONST,  vclzlsbb_v8hi)
>> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",CONST,  vclzlsbb_v4si)
>> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",  CONST,  vctzlsbb_v16qi)
>> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",CONST,  vctzlsbb_v8hi)
>> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",CONST,  vctzlsbb_v4si)
>> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",  CONST,  vctzlsbb_v16qi)
>> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",CONST,  vctzlsbb_v8hi)
>> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",CONST,  vctzlsbb_v4si)
>> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",  CONST,  vclzlsbb_v16qi)
>> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",CONST,  vclzlsbb_v8hi)
>> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",CONST,  vclzlsbb_v4si)
> Please change the default to be equal to the builtin name, so, the BE
> version.  We do that everywhere else as well, and it makes a lot more
> sense (since everything in Power has BE numbering).
>
> The trunk version has this correct afaics?

No, trunk has this, for example:

  const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
    VCLZLSBB_V16QI vctzlsbb_v16qi {endian}

So the backport matches what is on trunk.  

Throughout the new builtin infrastructure, the defaults are set for
little-endian, and the "endian" flag changes behavior for big-endian.
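
A minimal sketch of that convention, assuming a much simplified table
(struct builtin_desc, select_pattern, find_builtin and the PAT_* names
are made up for illustration and are not GCC's): each entry carries a
little-endian default plus a flag that switches patterns for big-endian.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for insn patterns; not GCC's real codes.  */
enum pattern { PAT_VCLZLSBB_V16QI, PAT_VCTZLSBB_V16QI };

/* Hypothetical, simplified built-in description.  */
struct builtin_desc
{
  const char *name;         /* built-in name */
  enum pattern le_pattern;  /* default, used for little-endian */
  enum pattern be_pattern;  /* used for big-endian when ENDIAN is set */
  bool endian;              /* true: big-endian selects be_pattern */
};

static const struct builtin_desc table[] = {
  { "vclzlsbb_v16qi", PAT_VCTZLSBB_V16QI, PAT_VCLZLSBB_V16QI, true },
  { "vctzlsbb_v16qi", PAT_VCLZLSBB_V16QI, PAT_VCTZLSBB_V16QI, true },
};

/* Pick the pattern for one built-in given the target byte order.  */
enum pattern
select_pattern (const struct builtin_desc *d, bool bytes_big_endian)
{
  if (d->endian && bytes_big_endian)
    return d->be_pattern;
  return d->le_pattern;
}

/* Linear lookup by name; returns NULL for an unknown built-in.  */
const struct builtin_desc *
find_builtin (const char *name)
{
  for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strcmp (table[i].name, name) == 0)
      return &table[i];
  return NULL;
}
```

With this layout the little-endian path needs no special casing, and
only entries flagged "endian" pay for the big-endian check.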

>
>> --- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
>> @@ -1,6 +1,7 @@
>>  /* { dg-do compile { target { powerpc*-*-* } } } */
> (Delete the redundant target clause when modifying any testcase, please).

Okay.
>
>>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>>  /* { dg-options "-mdejagnu-cpu=power9" } */
>> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
> You don't need the target clause, if it already is BE by default it does
> not do anything to add it redundantly.
>
> But this is wrong anyway: the name of the target triple does not say
> whether we are BE or LE.  Instead you should use the be or le selectors.
> But again, just add -mbig always.

This was added by David Edelsohn to the trunk version of the patch, because
-mbig actually is not supported on all subtargets.  (I found that quite
surprising also.)  Apparently this doesn't work on AIX, for example.  But 
-mlittle works everywhere.  Go figure.

That's something that should be fixed, I guess, but it's orthogonal
to this patch.

Thanks!
Bill

>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
>> @@ -0,0 +1,15 @@
>> +/* { dg-do compile { target { powerpc*-*-* } } } */
>> +/* { dg-require-effective-target powerpc_p9vector_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
> And here you do it correctly :-)
>
> Okay with those fixes (all happen a few times).  Thanks!
>
>
> Segher


Re: [PATCH] configure: Implement --enable-host-pie

2022-02-10 Thread Joseph Myers
Some general observations:

* There are various toplevel GCC subdirectories that are built for the 
host (possibly in addition to the target in some cases) but aren't changed 
in this patch.  Do they get a PIE or PIC build anyway by default?  Such 
directories include, I think: fixincludes (as a corner case, for the 
installed fixincludes), gmp, mpfr, mpc, isl (host libraries whose 
configure scripts aren't part of GCC, so any changes to ensure they build 
as PIE when needed would need to be at top level), intl, libbacktrace, 
libiberty, gnattools, gotools.

(Using a bootstrap compiler that *doesn't* default to PIE might help 
detect any such issues, though only for directories that get built for the 
host in that build - some may not get built by default.)

For directories that are only used as host libraries but don't install any 
executables, even if this patch needs additions the -z now one shouldn't.

* I don't see anything obvious here (or for the existing 
--enable-host-shared) that actually causes the configure option to apply 
only to the host and not to the target, in the case of subdirectories such 
as libbacktrace that get built for both host and target.  (Though static 
target libraries may well default to PIC in many cases anyway.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
> These built-ins were misimplemented as always having big-endian semantics.
> 
> Because the built-in infrastructure has changed, the modifications to the
> source are different but achieve the same purpose.  The modifications to
> the test suite are identical (after fixing the issue with -mbig that David
> pointed out with the original patch).

>  /* 1 argument vector functions added in ISA 3.0 (power9). */
> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",   CONST,  vclzlsbb_v16qi)
> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi", CONST,  vclzlsbb_v8hi)
> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si", CONST,  vclzlsbb_v4si)
> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",   CONST,  vctzlsbb_v16qi)
> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi", CONST,  vctzlsbb_v8hi)
> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si", CONST,  vctzlsbb_v4si)
> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",   CONST,  vctzlsbb_v16qi)
> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi", CONST,  vctzlsbb_v8hi)
> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si", CONST,  vctzlsbb_v4si)
> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",   CONST,  vclzlsbb_v16qi)
> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi", CONST,  vclzlsbb_v8hi)
> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si", CONST,  vclzlsbb_v4si)

Please change the default to be equal to the builtin name, so, the BE
version.  We do that everywhere else as well, and it makes a lot more
sense (since everything in Power has BE numbering).

The trunk version has this correct afaics?

> --- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile { target { powerpc*-*-* } } } */

(Delete the redundant target clause when modifying any testcase, please).

>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>  /* { dg-options "-mdejagnu-cpu=power9" } */
> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */

You don't need the target clause, if it already is BE by default it does
not do anything to add it redundantly.

But this is wrong anyway: the name of the target triple does not say
whether we are BE or LE.  Instead you should use the be or le selectors.
But again, just add -mbig always.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */

And here you do it correctly :-)

Okay with those fixes (all happen a few times).  Thanks!


Segher


Re: [PATCH] df: Don't set bbs dirty because of debug insn moves [PR104459]

2022-02-10 Thread Alexandre Oliva via Gcc-patches
On Feb 10, 2022, Jakub Jelinek  wrote:

>   PR rtl-optimization/104459
>   * df-scan.cc (df_insn_change_bb): Don't call df_set_bb_dirty when
>   moving DEBUG_INSNs between bbs.

Thanks, that looks quite reasonable to me.  I suppose we can
reconsider a variant that distinguishes debug insns if we find this to
break something.

-- 
Alexandre Oliva, happy hacker    https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


[pushed] c++: ICE on xtreme-header_a.H

2022-02-10 Thread Jason Merrill via Gcc-patches
This test regressed after my PR103752 patch with -march=cascadelake.  I
don't understand why that flag makes a difference, but this patch is correct
in any case.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* module.cc (depset::hash::add_specializations): Use
STRIP_TEMPLATE.
---
 gcc/cp/module.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 3cf0af10bc0..6e6b008b3a5 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -12976,7 +12976,7 @@ depset::hash::add_specializations (bool decl_p)
/* Implicit instantiations only walked if we reach them.  */
needs_reaching = true;
   else if (!DECL_LANG_SPECIFIC (spec)
-  || !DECL_MODULE_PURVIEW_P (spec))
+  || !DECL_MODULE_PURVIEW_P (STRIP_TEMPLATE (spec)))
/* Likewise, GMF explicit or partial specializations.  */
needs_reaching = true;
 

base-commit: 4a8083285c3edf50088a095870b217ab0881dff0
-- 
2.27.0



[PATCH] PR fortran/104211 - ICE in find_array_section, at fortran/expr.cc:1720

2022-02-10 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

when referencing a bad array section after an erroneous previous
declaration we might hit an assert.  The assert can be replaced
by a more gracious error recovery.  Reported by Gerhard.
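
As a rough illustration of the pattern only (not the gfortran code; the
names lookup and copy_section are hypothetical), a failed lookup can
report an error and leave through a cleanup label instead of asserting:

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical lookup: returns NULL when IDX is past the end,
   standing in for a constructor that has too few elements.  */
static const int *
lookup (const int *base, size_t len, size_t idx)
{
  return idx < len ? &base[idx] : NULL;
}

/* Copy the section [LO, HI) of BASE into freshly allocated storage.
   A missing element is reported and handled on the cleanup path
   rather than tripping an assertion.  */
bool
copy_section (const int *base, size_t len, size_t lo, size_t hi,
              int **result, size_t *result_len)
{
  bool ok = true;
  size_t n = hi > lo ? hi - lo : 0;
  int *buf = malloc ((n ? n : 1) * sizeof *buf);
  if (buf == NULL)
    return false;

  for (size_t i = 0; i < n; i++)
    {
      const int *elt = lookup (base, len, lo + i);
      if (elt == NULL)
        {
          fprintf (stderr, "error: bad element %zu in section\n", lo + i);
          ok = false;
          goto cleanup;
        }
      buf[i] = *elt;
    }

cleanup:
  if (ok)
    {
      *result = buf;
      *result_len = n;
    }
  else
    free (buf);
  return ok;
}
```

Calling copy_section on a one-element array with the section [1, 3)
returns false and leaves the result pointer untouched, which is the
analogue of referencing a(2:3) after a bad declaration of a.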

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From d0250b563eb51f5f5fba5a73a40451cedeb5900d Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 10 Feb 2022 21:22:48 +0100
Subject: [PATCH] Fortran: improve error recovery on bad array section

gcc/fortran/ChangeLog:

	PR fortran/104211
	* expr.cc (find_array_section): Replace assertion by error
	recovery when encountering bad array constructor.

gcc/testsuite/ChangeLog:

	PR fortran/104211
	* gfortran.dg/pr104211.f90: New test.
---
 gcc/fortran/expr.cc|  8 +++-
 gcc/testsuite/gfortran.dg/pr104211.f90 | 11 +++
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr104211.f90

diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc
index ed82a94022f..c9c0ba4cc2e 100644
--- a/gcc/fortran/expr.cc
+++ b/gcc/fortran/expr.cc
@@ -1718,7 +1718,13 @@ find_array_section (gfc_expr *expr, gfc_ref *ref)
 	}

   cons = gfc_constructor_lookup (base, limit);
-  gcc_assert (cons);
+  if (cons == NULL)
+	{
+	  gfc_error ("Error in array constructor referenced at %L",
+		 &ref->u.ar.where);
+	  t = false;
+	  goto cleanup;
+	}
   gfc_constructor_append_expr (&expr->value.constructor,
    gfc_copy_expr (cons->expr), NULL);
 }
diff --git a/gcc/testsuite/gfortran.dg/pr104211.f90 b/gcc/testsuite/gfortran.dg/pr104211.f90
new file mode 100644
index 00000000000..21b0a26a17f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr104211.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/104211 - ICE in find_array_section
+! Contributed by G.Steinmetz
+
+program p
+  type t
+ real :: n
+  end type
+  type(t), parameter :: a(3) = [t(2)] ! { dg-error "Different shape" }
+  type(t), parameter :: b(2) = a(2:3) ! { dg-error "Error in array constructor" }
+end
--
2.34.1



Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

On 2/10/22 2:06 PM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
>> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
>> These built-ins were misimplemented as always having big-endian semantics.
> What is different compared to the trunk version?

The infrastructure changed, so:

(1) Instead of changing the default pattern in rs6000-builtins.def, I have
to change it in rs6000-builtin.def.  (Note the missing "s".)

(2) Instead of having the endian change driven by an "endian" flag in the
built-in description in rs6000-builtins.def, I have to add some more ad-hoc
code in rs6000_expand_builtin to handle the change to the big-endian
pattern.

That's all.
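
To make the endianness point concrete, here is a rough model in portable
C, not the rs6000 code.  It assumes that vclzlsbb counts from register
byte 0 in big-endian numbering, that vctzlsbb counts from byte 15, and
that on little-endian element 0 of the vector sits in register byte 15.
All model_* names are made up for illustration.

```c
#include <stddef.h>

#define VEC_BYTES 16

/* Model of vclzlsbb: count bytes whose least-significant bit is zero,
   starting at register byte 0 (big-endian numbering).  */
static int
model_vclzlsbb (const unsigned char reg[VEC_BYTES])
{
  int n = 0;
  while (n < VEC_BYTES && (reg[n] & 1) == 0)
    n++;
  return n;
}

/* Model of vctzlsbb: the same count, starting at register byte 15.  */
static int
model_vctzlsbb (const unsigned char reg[VEC_BYTES])
{
  int n = 0;
  while (n < VEC_BYTES && (reg[VEC_BYTES - 1 - n] & 1) == 0)
    n++;
  return n;
}

/* Hypothetical helper: place array elements into the modelled register.
   Assumption: on little-endian, element 0 lands in register byte 15.  */
static void
model_load (unsigned char reg[VEC_BYTES],
            const unsigned char elems[VEC_BYTES], int big_endian)
{
  for (size_t i = 0; i < VEC_BYTES; i++)
    reg[big_endian ? i : VEC_BYTES - 1 - i] = elems[i];
}

/* vec_cntlz_lsbb in element order: the instruction depends on endianness.  */
int
model_vec_cntlz_lsbb (const unsigned char elems[VEC_BYTES], int big_endian)
{
  unsigned char reg[VEC_BYTES];
  model_load (reg, elems, big_endian);
  return big_endian ? model_vclzlsbb (reg) : model_vctzlsbb (reg);
}

/* vec_cnttz_lsbb in element order: the mirror image of the above.  */
int
model_vec_cnttz_lsbb (const unsigned char elems[VEC_BYTES], int big_endian)
{
  unsigned char reg[VEC_BYTES];
  model_load (reg, elems, big_endian);
  return big_endian ? model_vctzlsbb (reg) : model_vclzlsbb (reg);
}
```

Under those assumptions both built-ins give the same answer for the same
element array on either endianness, which is the behavior the fix is
after.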

Thanks!
Bill

>
>
> Segher


Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
Hi!

On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
> These built-ins were misimplemented as always having big-endian semantics.

What is different compared to the trunk version?


Segher


Re: [PATCH] combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446]

2022-02-10 Thread Segher Boessenkool
On Thu, Feb 10, 2022 at 06:23:58PM +0100, Jakub Jelinek wrote:
> On Thu, Feb 10, 2022 at 10:42:03AM -0600, Segher Boessenkool wrote:
> > > Not on x86, that isn't a general auto-inc-dec target, but uses PRE_DEC
> > > etc. only for the sp hard register.
> > 
> > Ugh.  Does it have any benefit from using autoinc at all then?  (Actual
> > benefit, not notational convenience).
> 
> That is the most accurate description of what the push/pop instructions
> actually do, and the backend has been doing it for decades.

It is exactly as accurate as the simpler direct representation as a
parallel (which we have used since 1992).


Segher


Re: [Patch, fortran] PR37336 (Finalization) - [F03] Finish derived-type finalization

2022-02-10 Thread Harald Anlauf via Gcc-patches

Hi Paul,

On 10.02.22 at 13:25, Paul Richard Thomas via Fortran wrote:

Conclusions on ifort:
(i) The agreement between gfortran, with the patch applied, and ifort is
strongest of all the other brands;
(ii) The disagreements are all down to the treatment of the parent
component of arrays of extended types: gfortran finalizes the parent
component as an array, whereas ifort does a scalarization. I have a patch
ready to do likewise.

Overall conclusions:
(i) Sort out whether or not derived type constructors are considered to be
functions;
(ii) Come to a conclusion about scalarization of parent components of
extended type arrays;
(iii) Check and, if necessary, correct the ordering of finalization in
intrinsic assignment of class arrays.
(iv) Finalization is difficult to graft on to existing pre-F2003 compilers,
as witnessed by the range of implementations.

I would be really grateful for thoughts on (i) and (ii). My gut feeling, as
remarked in the submission, is that we should aim to be as close as
possible, if not identical to, ifort. Happily, that is already the case.


I am really sorry to be such a bother, but before we think we should
do the same as Intel, we need to understand what Intel does and whether
that is actually correct.  Or not inconsistent with the standard.
And I would really like to understand even the most simple, stupid case.

I did reduce testcase finalize_38.f90 to an almost bare minimum,
see attached, and changed the main to

  type(simple), parameter   :: ThyType   = simple(21)
  type(simple)  :: ThyType2  = simple(22)
  type(simple), allocatable :: MyType, MyType2

  print *, "At start of program: ", final_count

  MyType = ThyType
  print *, "After 1st allocation:", final_count

  MyType2 = ThyType2
  print *, "After 2nd allocation:", final_count

Note that "ThyType" is now a parameter.

I tested the above and found:

Intel:
 At start of program:0
 After 1st allocation:   1
 After 2nd allocation:   2

NAG 7.0:
 At start of program:  0
 After 1st allocation: 0
 After 2nd allocation: 0

Crayftn 12.0.2:
 At start of program:  2
 After 1st allocation: 2
 After 2nd allocation: 2

Nvidia 22.1:
 At start of program: 0
 After 1st allocation:0
 After 2nd allocation:0

So my stupid questions are:

- is ThyType invoking a constructor?  It is a parameter, after all.
  Should using it in an assignment invoke a destructor?  If so why?

  And why does Intel then increment the final_count?

- is the initialization of ThyType2 invoking a constructor?
  It might, if that is the implementation in the compiler, but
  should there be a finalization?

  Then ThyType2 is used in an intrinsic assignment, basically the
  same as the other one before.  Now what is the difference?

Are all compilers correct, but I do not see it?

Someone please help!


Best regards

Paul



Cheers,
Harald
module testmode
  implicit none

  type :: simple
 integer :: ind
  contains
final :: destructor1
  end type simple

  integer :: final_count = 0

contains

  subroutine destructor1(self)
type(simple), intent(inout) :: self
final_count = final_count + 1
  end subroutine destructor1

end module testmode

program test_final
  use testmode
  implicit none
  type(simple), parameter   :: ThyType   = simple(21)
  type(simple)  :: ThyType2  = simple(22)
  type(simple), allocatable :: MyType, MyType2

  print *, "At start of program: ", final_count

  MyType = ThyType
  print *, "After 1st allocation:", final_count

  MyType2 = ThyType2
  print *, "After 2nd allocation:", final_count

end program test_final


[r12-7175 Regression] FAIL: g++.dg/warn/Wuninitialized-32.C -std=c++98 (test for excess errors) on Linux/x86_64

2022-02-10 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

0f58ba4dd6b25b16d25494ae18d15dfa681f9b65 is the first bad commit
commit 0f58ba4dd6b25b16d25494ae18d15dfa681f9b65
Author: Richard Biener 
Date:   Fri Feb 4 09:46:43 2022 +0100

tree-optimization/104373 - early diagnostic on unreachable code

caused

FAIL: g++.dg/warn/Wuninitialized-32.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/warn/Wuninitialized-32.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/warn/Wuninitialized-32.C  -std=c++20 (test for excess errors)
FAIL: g++.dg/warn/Wuninitialized-32.C  -std=c++98 (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7175/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/warn/Wuninitialized-32.C --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/warn/Wuninitialized-32.C --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH] libstdc++: Strengthen memory order for atomic::wait/notify

2022-02-10 Thread Thomas Rodgers via Gcc-patches
Committed to trunk, backported to gcc-11.

On Wed, Feb 9, 2022 at 12:37 PM Jonathan Wakely  wrote:

> On Wed, 9 Feb 2022 at 17:35, Thomas Rodgers via Libstdc++
>  wrote:
> >
> > This patch changes the memory order used in the spin wait code to match
> > that of libc++.
>
> OK for trunk (and gcc-11 if needed).
>
>


[PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Bill Schmidt via Gcc-patches
Hi!

This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
These built-ins were misimplemented as always having big-endian semantics.

Because the built-in infrastructure has changed, the modifications to the
source are different but achieve the same purpose.  The modifications to
the test suite are identical (after fixing the issue with -mbig that David
pointed out with the original patch).

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for releases/gcc-11?

Thanks!
Bill


2022-02-10  Bill Schmidt  

gcc/
PR target/95082
* config/rs6000/rs6000-builtin.def (VCLZLSBB_V16QI): Change default
pattern.
(VCLZLSBB_V8HI): Likewise.
(VCLZLSBB_V4SI): Likewise.
(VCTZLSBB_V16QI): Likewise.
(VCTZLSBB_V8HI): Likewise.
(VCTZLSBB_V4SI): Likewise.
* config/rs6000/rs6000-call.c (rs6000_expand_builtin): Make big-endian
adjustments to P9V_BUILTIN_VC[LT]ZLSBB_* built-in expansions.

gcc/testsuite/
PR target/95082
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to big-endian.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to big-endian.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
---
 gcc/config/rs6000/rs6000-builtin.def  | 12 
 gcc/config/rs6000/rs6000-call.c   | 30 +++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c | 15 ++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c | 15 ++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c | 15 ++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c | 15 ++
 10 files changed, 100 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 6270444ef70..b28ee02070a 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2678,12 +2678,12 @@ BU_P9V_64BIT_AV_X (STXVL,   "stxvl",MISC)
 BU_P9V_64BIT_AV_X (XST_LEN_R,  "xst_len_r",MISC)
 
 /* 1 argument vector functions added in ISA 3.0 (power9). */
-BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi", CONST,  vclzlsbb_v16qi)
-BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",   CONST,  vclzlsbb_v8hi)
-BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",   CONST,  vclzlsbb_v4si)
-BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi", CONST,  vctzlsbb_v16qi)
-BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",   CONST,  vctzlsbb_v8hi)
-BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",   CONST,  vctzlsbb_v4si)
+BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi", CONST,  vctzlsbb_v16qi)
+BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",   CONST,  vctzlsbb_v8hi)
+BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",   CONST,  vctzlsbb_v4si)
+BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi", CONST,  vclzlsbb_v16qi)
+BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",   CONST,  vclzlsbb_v8hi)
+BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",   CONST,  vclzlsbb_v4si)
 
 /* Built-in support for Power9 "VSU option" string operations includes
new awareness of the "vector compare not equal" (vcmpneb, vcmpneb.,
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index ef20cb30388..27bb25fa4d8 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13221,6 +13221,36 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
}
   break;
 
+case P9V_BUILTIN_VCLZLSBB_V16QI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vclzlsbb_v16qi;
+  break;
+
+case P9V_BUILTIN_VCLZLSBB_V8HI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vclzlsbb_v8hi;
+  break;
+
+case P9V_BUILTIN_VCLZLSBB_V4SI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vclzlsbb_v4si;
+  break;
+
+case P9V_BUILTIN_VCTZLSBB_V16QI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vctzlsbb_v16qi;
+  break;
+
+case P9V_BUILTIN_VCTZLSBB_V8HI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vctzlsbb_v8hi;
+  break;
+
+case P9V_BUILTIN_VCTZLSBB_V4SI:
+  if (BYTES_BIG_ENDIAN)
+   icode = CODE_FOR_vctzlsbb_v4si;
+  break;
+
  

[committed] analyzer: handle more casts of string literals [PR98797]

2022-02-10 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-7184-g2ac7b19f1e9219f46ccf55f25d8acb3e02e9a2d4.

gcc/analyzer/ChangeLog:
PR analyzer/98797
* region-model-manager.cc
(region_model_manager::maybe_fold_sub_svalue): Generalize getting
individual chars of a STRING_CST from element_region to any
subregion which is a concrete access of a single byte from its
parent region.
* region.cc (region::get_relative_concrete_byte_range): New.
* region.h (region::get_relative_concrete_byte_range): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/98797
* gcc.dg/analyzer/casts-1.c: Mark xfails as fixed; add further
test coverage for casts of string literals.
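
A simplified sketch of the idea, outside the analyzer's own types (the
names byte_range_sketch, relative_byte_range and fold_char_from_literal
are hypothetical): turn a bit offset relative to the parent into a byte
range, and fold only when that range is exactly one byte inside the
string literal.

```c
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_BYTE 8

/* Hypothetical stand-in for a concrete byte range.  */
struct byte_range_sketch
{
  long start;  /* offset in bytes from the start of the parent */
  long size;   /* size in bytes */
};

/* Convert a bit offset and bit size relative to the parent into a byte
   range.  Fails unless both are whole numbers of bytes.  */
bool
relative_byte_range (long rel_bit_offset, long size_in_bits,
                     struct byte_range_sketch *out)
{
  if (rel_bit_offset < 0 || size_in_bits <= 0)
    return false;
  if (rel_bit_offset % BITS_PER_BYTE != 0)
    return false;
  if (size_in_bits % BITS_PER_BYTE != 0)
    return false;
  out->start = rel_bit_offset / BITS_PER_BYTE;
  out->size = size_in_bits / BITS_PER_BYTE;
  return true;
}

/* Fold a read from the literal LIT (LIT_SIZE bytes, including the
   terminating NUL) to a single char, if the access is a concrete
   1-byte access within the literal.  */
bool
fold_char_from_literal (const char *lit, size_t lit_size,
                        long rel_bit_offset, long size_in_bits, char *out)
{
  struct byte_range_sketch r;
  if (!relative_byte_range (rel_bit_offset, size_in_bits, &r))
    return false;
  if (r.size != 1 || (size_t) r.start >= lit_size)
    return false;
  *out = lit[r.start];
  return true;
}
```

Because the check looks only at the byte range, it does not matter
whether the access came from an array element, a struct field, or a
cast, which is the generalization described above.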

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-manager.cc| 19 +++
 gcc/analyzer/region.cc  | 28 +++
 gcc/analyzer/region.h   |  6 
 gcc/testsuite/gcc.dg/analyzer/casts-1.c | 45 -
 4 files changed, 84 insertions(+), 14 deletions(-)

diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
index 010ad078849..d7156c5499f 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -782,15 +782,22 @@ region_model_manager::maybe_fold_sub_svalue (tree type,
   /* Handle getting individual chars from a STRING_CST.  */
   if (tree cst = parent_svalue->maybe_get_constant ())
 if (TREE_CODE (cst) == STRING_CST)
-  if (const element_region *element_reg
-   = subregion->dyn_cast_element_region ())
-   {
- const svalue *idx_sval = element_reg->get_index ();
- if (tree cst_idx = idx_sval->maybe_get_constant ())
+  {
+   /* If we have a concrete 1-byte access within the parent region... */
+   byte_range subregion_bytes (0, 0);
+   if (subregion->get_relative_concrete_byte_range (&subregion_bytes)
+   && subregion_bytes.m_size_in_bytes == 1)
+ {
+   /* ...then attempt to get that char from the STRING_CST.  */
+   HOST_WIDE_INT hwi_start_byte
+ = subregion_bytes.m_start_byte_offset.to_shwi ();
+   tree cst_idx
+ = build_int_cst_type (size_type_node, hwi_start_byte);
if (const svalue *char_sval
= maybe_get_char_from_string_cst (cst, cst_idx))
  return get_or_create_cast (type, char_sval);
-   }
+ }
+  }
 
   if (const initial_svalue *init_sval
= parent_svalue->dyn_cast_initial_svalue ())
diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index 0adc75e577d..5ac24fb9f9b 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -539,6 +539,34 @@ region::get_relative_concrete_offset (bit_offset_t *) const
   return false;
 }
 
+/* Attempt to get the position and size of this region expressed as a
+   concrete range of bytes relative to its parent.
+   If successful, return true and write to *OUT.
+   Otherwise return false.  */
+
+bool
+region::get_relative_concrete_byte_range (byte_range *out) const
+{
+  /* We must have a concrete offset relative to the parent.  */
+  bit_offset_t rel_bit_offset;
+  if (!get_relative_concrete_offset (&rel_bit_offset))
+return false;
+  /* ...which must be a whole number of bytes.  */
+  if (rel_bit_offset % BITS_PER_UNIT != 0)
+return false;
+  byte_offset_t start_byte_offset = rel_bit_offset / BITS_PER_UNIT;
+
+  /* We must have a concrete size, which must be a whole number
+ of bytes.  */
+  byte_size_t num_bytes;
+  if (!get_byte_size (&num_bytes))
+return false;
+
+  /* Success.  */
+  *out = byte_range (start_byte_offset, num_bytes);
+  return true;
+}
+
 /* Dump a description of this region to stderr.  */
 
 DEBUG_FUNCTION void
diff --git a/gcc/analyzer/region.h b/gcc/analyzer/region.h
index 53112175266..2f987e49fa8 100644
--- a/gcc/analyzer/region.h
+++ b/gcc/analyzer/region.h
@@ -182,6 +182,12 @@ public:
  Otherwise return false.  */
   virtual bool get_relative_concrete_offset (bit_offset_t *out) const;
 
+  /* Attempt to get the position and size of this region expressed as a
+ concrete range of bytes relative to its parent.
+ If successful, return true and write to *OUT.
+ Otherwise return false.  */
+  bool get_relative_concrete_byte_range (byte_range *out) const;
+
   void
   get_subregions_for_binding (region_model_manager *mgr,
  bit_offset_t start_bit_offset,
diff --git a/gcc/testsuite/gcc.dg/analyzer/casts-1.c b/gcc/testsuite/gcc.dg/analyzer/casts-1.c
index 15cd85f77cf..7e4af384971 100644
--- a/gcc/testsuite/gcc.dg/analyzer/casts-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/casts-1.c
@@ -13,6 +13,14 @@ struct s2
   char arr[4];
 };
 
+struct s3
+{
+  struct inner {
+char a;
+char b;
+  } arr[2];
+};
+
 void test_1 ()
 {
   struct s1 x = {'A', 'B', 'C', 'D'};
@@ -24,10 +32,16 @@ void test_1 ()
   __analyzer_eval 

Re: [Patch] OpenMP/C++: Permit mapping classes with virtual members [PR102204]

2022-02-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 10, 2022 at 06:35:05PM +0100, Tobias Burnus wrote:
>   PR C++/102204
> gcc/cp/ChangeLog:
> 
>   * decl2.cc (cp_omp_mappable_type_1):

Description of the change is missing.

> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.c++/target-virtual-1.C: New test.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/gomp/unmappable-1.C: Remove previously expected dg-message.

Let's do it for GCC 12 already.  Ok with the ChangeLog fixed up.

Jakub



[Patch] OpenMP/C++: Permit mapping classes with virtual members [PR102204]

2022-02-10 Thread Tobias Burnus

This patch removes for C++ the OpenMP 4.5 requirement that
a class may not be mapped if there are virtual members.

It does not do anything beyond that and, as RTTI is not accessible
(→ OpenMP 5.2) and restrictions exist on using virtual
functions (5.0/5.2), that seems to be fine for now.

OK? (For GCC 13, I assume)

Tobias

PS:

OpenMP 4.5 had:
"A mappable type cannot contain virtual members."

OpenMP 5.0:
"The effect of invoking a virtual member function of an object
 on a device other than the device on which the object was
 constructed is implementation defined."

OpenMP 5.2:
"• The run-time type information (RTTI) of an object can
   only be accessed from the device on which it was constructed.
 • Invoking a virtual member function of an object
   on a device other than the device on which the object was
   constructed results in unspecified behavior,
   unless the object is accessible
   and was constructed on the host device.
• If an object of polymorphic class type is destructed,
  virtual member functions of any previously existing
  corresponding objects in other device data environments
  must not be invoked."
-
OpenMP/C++: Permit mapping classes with virtual members [PR102204]

	PR C++/102204
gcc/cp/ChangeLog:

	* decl2.cc (cp_omp_mappable_type_1):

libgomp/ChangeLog:

	* testsuite/libgomp.c++/target-virtual-1.C: New test.

gcc/testsuite/ChangeLog:

	* g++.dg/gomp/unmappable-1.C: Remove previously expected dg-message.

 gcc/cp/decl2.cc  |  8 
 gcc/testsuite/g++.dg/gomp/unmappable-1.C |  2 +-
 libgomp/testsuite/libgomp.c++/target-virtual-1.C | 50 
 3 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 78908339989..c6bfcfe631a 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -1540,14 +1540,6 @@ cp_omp_mappable_type_1 (tree type, bool notes)
   /* Arrays have mappable type if the elements have mappable type.  */
   while (TREE_CODE (type) == ARRAY_TYPE)
 type = TREE_TYPE (type);
-  /* A mappable type cannot contain virtual members.  */
-  if (CLASS_TYPE_P (type) && CLASSTYPE_VTABLES (type))
-{
-  if (notes)
-	inform (DECL_SOURCE_LOCATION (TYPE_MAIN_DECL (type)),
-		"type %qT with virtual members is not mappable", type);
-  result = false;
-}
   /* All data members must be non-static.  */
   if (CLASS_TYPE_P (type))
 {
diff --git a/gcc/testsuite/g++.dg/gomp/unmappable-1.C b/gcc/testsuite/g++.dg/gomp/unmappable-1.C
index d00ccb5ad79..364f884500c 100644
--- a/gcc/testsuite/g++.dg/gomp/unmappable-1.C
+++ b/gcc/testsuite/g++.dg/gomp/unmappable-1.C
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fopenmp" } */
 
-class C /* { dg-message "type .C. with virtual members is not mappable" } */
+class C
 {
 public:
   static int static_member; /* { dg-message "static field .C::static_member. is not mappable" } */
diff --git a/libgomp/testsuite/libgomp.c++/target-virtual-1.C b/libgomp/testsuite/libgomp.c++/target-virtual-1.C
new file mode 100644
index 000..a6ac30e7cf0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-virtual-1.C
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* Check that classes with virtual member functions works,
+   when using it as declared type. */
+struct base {
+float data [100];
+
+base() = default;
+virtual ~base() = default;
+};
+
+struct derived : public base {
+int scalar, array[5];
+
+derived() = default;
+void do_work ()
+{
+  int error = 0;
+  #pragma omp target map (tofrom: this[:1], error)
+  {
+	if (scalar != 42 || this->array[0] != 123 || array[4] != 555)
+	  error = 1;
+	if (data[0] != 333 || data[99] != -3)
+	  error = 1;
+	this->scalar = 99;
+	array[0] = 5;
+	array[4] = -4;
+	this->data[0] = 11;
+	this->data[99] = 99;
+  }
+  if (error)
+	__builtin_abort ();
+  if (data[0] != 11 || data[99] != 99)
+	__builtin_abort ();
+  if (scalar != 99 || array[0] != 5 || array[4] != -4)
+	__builtin_abort ();
+}   
+};
+
+int
+main ()
+{
+  struct derived x;
+  x.data[0] = 333;
+  x.data[99] = -3;
+  x.scalar = 42;
+  x.array[0] = 123;
+  x.array[4] = 555;
+  x.do_work ();
+  return 0;
+}


Re: [PATCH 1/3, 11 backport] libstdc++: Implement P2325 changes to default-constructibility of views

2022-02-10 Thread Patrick Palka via Gcc-patches
On Thu, 10 Feb 2022, Patrick Palka wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for the 11 branch?
> The backport to the 10 branch hasn't been started yet, I figured it'd
> be good to first get the 11 backport right then base the 10 backport
> on the 11 one.
> 
> NB: This backport of r12-1606 to the 11 branch deliberately omits parts
> of P2325R3 so as to maximize backward compatibility with pre-P2325R3 code.
> In particular, we don't remove the default ctors for back_insert_iterator,
> front_insert_iterator, ostream_iterator, ref_view and basic_istream_view.
> 
> This implements the wording changes of P2325R3 "Views should not be
> required to be default constructible".  Changes are relatively
> straightforward, besides perhaps those to __box (which now stands
> for copyable-box instead of semiregular-box) and __non_propagating_cache.
> 
> For __box, this patch implements the recommended practice to also avoid
> std::optional when the boxed type is nothrow_move/copy_constructible.
> 
> For __non_propagating_cache, now that it's used by split_view::_M_current,
> we need to add assignment from a value of the underlying type to the
> subset of the std::optional API implemented for the cache (needed by
> split_view::begin()).  Hence the new __non_propagating_cache::operator=
> overload.
> 
> In passing, this fixes the undesirable list-init in the constructors of
> the partial specialization of __box as reported in PR100475 comment #7.
> 
>   PR libstdc++/103904
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/bits/iterator_concepts.h (weakly_incrementable): Remove
>   default_initializable requirement.
>   * include/bits/ranges_base.h (ranges::view): Likewise.
>   * include/bits/ranges_util.h (subrange): Constrain the default
>   ctor.
>   * include/bits/stl_iterator.h (common_iterator): Constrain the
>   default ctor.
>   (counted_iterator): Likewise.
>   * include/std/ranges (__detail::__box::operator=): Handle
>   self-assignment in the primary template.
>   (__detail::__box): In the partial specialization: adjust
>   constraints as per P2325.  Add specialized operator= for the
>   case when the wrapped type is not copyable.  Constrain the
>   default ctor.  Avoid list-initialization.
>   (single_view): Constrain the default ctor.
>   (iota_view): Relax semiregular constraint to copyable.
>   Constrain the default ctor.
>   (iota_view::_Iterator): Constrain the default ctor.
>   (basic_istream_view): Remove the default ctor.  Remove NSDMIs.
>   Remove redundant checks for empty _M_stream.
>   (basic_istream_view::_Iterator): Likewise.
>   (ref_view): Remove the default ctor.  Remove NSDMIs.
>   (ref_view::_Iterator): Constrain the default ctor.
>   (__detail::__non_propagating_cache::operator=): Define overload
>   for assigning from a value of the underlying type.
>   (filter_view): Likewise.
>   (filter_view::_Iterator): Likewise.
>   (transform_view): Likewise.
>   (transform_view::_Iterator): Likewise.
>   (take_view): Likewise.
>   (take_view::_Iterator): Likewise.
>   (take_while_view): Likewise.
>   (take_while_view::_Iterator): Likewise.
>   (drop_while_view): Likewise.
>   (drop_while_view::_Iterator): Likewise.
>   (join_view): Likewise.
>   (split_view::_OuterIter::__current): Adjust after changing the
>   type of _M_current.
>   (split_view::_M_current): Wrap it in a __non_propagating_cache.
>   (split_view::split_view): Constrain the default ctor.
>   (common_view): Constrain the default ctor.
>   (reverse_view): Likewise.
>   (elements_view): Likewise.
>   * include/std/span (enable_view<span<_ElementType, _Extent>>):
>   Define this partial specialization to true unconditionally.
>   * include/std/version (__cpp_lib_ranges): Adjust value.
>   * testsuite/std/ranges/adaptors/detail/semiregular_box.cc:
>   Rename to ...
>   * testsuite/std/ranges/adaptors/detail/copyable_box.cc: ... this.
>   (test02): Adjust now that __box is copyable-box not
>   semiregular-box.
>   (test03): New test.
>   * testsuite/std/ranges/p2325.cc: New test.
>   * testsuite/std/ranges/single_view.cc (test06): New test.
>   * testsuite/std/ranges/view.cc: Adjust now that view doesn't
>   require default_initializable.
> 
> (cherry picked from commit 4b4f5666b4c2f3aab2a9f3d53d394e390b9b682d)
> ---
>  libstdc++-v3/include/bits/iterator_concepts.h |   3 +-
>  libstdc++-v3/include/bits/ranges_base.h   |   3 +-
>  libstdc++-v3/include/bits/ranges_util.h   |   2 +-
>  libstdc++-v3/include/bits/stl_iterator.h  |   3 +-
>  libstdc++-v3/include/std/ranges   | 136 +
>  libstdc++-v3/include/std/span |   3 +-
>  libstdc++-v3/include/std/version  |   2 +-
>  .../{semiregular_box.cc => copyable_box.cc}   |  51 -
>  

Re: [PATCH] combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446]

2022-02-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 10, 2022 at 10:42:03AM -0600, Segher Boessenkool wrote:
> > Not on x86, that isn't a general auto-inc-dec target, but uses PRE_DEC
> > etc. only for the sp hard register.
> 
> Ugh.  Does it have any benefit from using autoinc at all then?  (Actual
> benefit, not notational convenience).

That is the most accurate description of what the push/pop instructions
actually do, and the backend has been doing it for decades.
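
As an illustration of why PRE_DEC and POST_INC describe push and pop well, here is a toy model in C++. It is only a sketch under stated assumptions: `ToyStack`, its slot count, and `push_pop_round_trip` are hypothetical names invented for this example, and nothing here is GCC code or RTL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy machine: a downward-growing stack of 64-bit slots.
struct ToyStack
{
  static constexpr std::size_t slots = 8;
  std::uint64_t mem[slots] = {};
  std::size_t sp = slots;   // one past the last slot: the stack is empty

  // push: decrement sp first, then store (the PRE_DEC pattern).
  void push(std::uint64_t value)
  {
    assert(sp > 0);
    mem[--sp] = value;
  }

  // pop: load first, then increment sp (the POST_INC pattern).
  std::uint64_t pop()
  {
    assert(sp < slots);
    return mem[sp++];
  }
};

// Pushes two values and pops them again; returns true if the values come
// back in LIFO order and sp ends up where it started.
bool push_pop_round_trip()
{
  ToyStack s;
  s.push(1);
  s.push(2);
  bool ok = (s.sp == ToyStack::slots - 2);
  ok = ok && (s.pop() == 2);
  ok = ok && (s.pop() == 1);
  return ok && (s.sp == ToyStack::slots);
}
```

The address update and the memory access happen as one step in each operation, which is the side effect the auto-increment notation expresses.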

Jakub



Re: [PATCH] doc: invoke: RISC-V: Clean up the -mstrict-align wording

2022-02-10 Thread Palmer Dabbelt

On Tue, 08 Feb 2022 01:25:01 PST (-0800), sch...@linux-m68k.org wrote:

> On Feb 07 2022, Palmer Dabbelt wrote:
>
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 0ebe538ccdc..5e8af05e359 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -27702,7 +27702,7 @@ integer load/stores only.
> >  @item -mstrict-align
> >  @itemx -mno-strict-align
> >  @opindex mstrict-align
> > -Do not or do generate unaligned memory accesses.  The default is set depending
>
> I think the logic is that -mstrict-align is "do not" and
> -mno-strict-align is "do".


Ya, sorry, looks like I wasn't really paying attention in either of 
these.
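
For context on what the option is about, here is a minimal sketch of source code that reads from a possibly misaligned address in a portable way. The names `load_u32_unaligned` and `unaligned_load_round_trip` are hypothetical; how a given compiler expands the `memcpy` (a single unaligned load, or several narrower accesses when unaligned accesses must not be generated) is target- and option-dependent and is not something this example asserts.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from an address that may not be 4-byte aligned.
// memcpy leaves the choice of actual memory accesses to the compiler.
std::uint32_t load_u32_unaligned(const unsigned char* p)
{
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Stores a value at a deliberately misaligned offset and reads it back.
bool unaligned_load_round_trip()
{
  unsigned char buf[8] = {};
  const std::uint32_t in = 0x12345678u;
  std::memcpy(buf + 1, &in, sizeof in);
  return load_u32_unaligned(buf + 1) == in;
}
```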


Re: [PATCH 1/3, 11 backport] libstdc++: Implement P2325 changes to default-constructibility of views

2022-02-10 Thread Patrick Palka via Gcc-patches
On Thu, 10 Feb 2022, Patrick Palka wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for the 11 branch?
> The backport to the 10 branch hasn't been started yet, I figured it'd
> be good to first get the 11 backport right then base the 10 backport
> on the 11 one.
> 
> NB: This backport of r12-1606 to the 11 branch deliberately omits parts
> of P2325R3 so as to maximize backward compatibility with pre-P2325R3 code.
> In particular, we don't remove the default ctors for back_insert_iterator,
> front_insert_iterator, ostream_iterator, ref_view and basic_istream_view.

FWIW here's a diff of the changes in this backport relative to r12-1606:

 libstdc++-v3/include/bits/stl_iterator.h   | 13 ++-
 libstdc++-v3/include/bits/stream_iterator.h|  5 +
 libstdc++-v3/include/std/ranges| 24 +++--
 .../24_iterators/back_insert_iterator/constexpr.cc |  3 ++-
 .../front_insert_iterator/constexpr.cc |  3 ++-
 .../ostream_iterator/requirements/constexpr.cc | 24 +
 libstdc++-v3/testsuite/std/ranges/97600.cc |  3 ++-
 libstdc++-v3/testsuite/std/ranges/p2325.cc | 25 ++
 8 files changed, 89 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index 7fe727d8093..549bc26dee5 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -639,6 +639,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef _Container  container_type;
 #if __cplusplus > 201703L
   using difference_type = ptrdiff_t;
+
+  constexpr back_insert_iterator() noexcept : container(nullptr) { }
 #endif
 
   /// The only way to create this %iterator is with a container.
@@ -740,6 +742,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef _Container  container_type;
 #if __cplusplus > 201703L
   using difference_type = ptrdiff_t;
+
+  constexpr front_insert_iterator() noexcept : container(nullptr) { }
 #endif
 
   /// The only way to create this %iterator is with a container.
@@ -839,12 +843,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
 #if __cplusplus > 201703L && defined __cpp_lib_concepts
   using _Iter = std::__detail::__range_iter_t<_Container>;
+
+protected:
+  _Container* container = nullptr;
+  _Iter iter = _Iter();
 #else
   typedef typename _Container::iterator _Iter;
-#endif
+
 protected:
   _Container* container;
   _Iter iter;
+#endif
 
 public:
   /// A nested typedef for the type of whatever container you used.
@@ -852,6 +861,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus > 201703L && defined __cpp_lib_concepts
   using difference_type = ptrdiff_t;
+
+  insert_iterator() = default;
 #endif
 
   /**
diff --git a/libstdc++-v3/include/bits/stream_iterator.h 
b/libstdc++-v3/include/bits/stream_iterator.h
index d07474d4996..fd8920b8d01 100644
--- a/libstdc++-v3/include/bits/stream_iterator.h
+++ b/libstdc++-v3/include/bits/stream_iterator.h
@@ -192,6 +192,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const _CharT* _M_string;
 
 public:
+#if __cplusplus > 201703L
+  constexpr ostream_iterator() noexcept
+  : _M_stream(nullptr), _M_string(nullptr) { }
+#endif
+
   /// Construct from an ostream.
   ostream_iterator(ostream_type& __s)
   : _M_stream(std::__addressof(__s)), _M_string(0) {}
diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index a01b5e79f1f..bf6cfae2a6e 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -680,6 +680,8 @@ namespace views
 : public view_interface<basic_istream_view<_Val, _CharT, _Traits>>
 {
 public:
+  basic_istream_view() = default;
+
   constexpr explicit
   basic_istream_view(basic_istream<_CharT, _Traits>& __stream)
: _M_stream(std::__addressof(__stream))
@@ -688,7 +690,8 @@ namespace views
   constexpr auto
   begin()
   {
-   *_M_stream >> _M_object;
+   if (_M_stream != nullptr)
+ *_M_stream >> _M_object;
return _Iterator{this};
   }
 
@@ -697,8 +700,8 @@ namespace views
   { return default_sentinel; }
 
 private:
-  basic_istream<_CharT, _Traits>* _M_stream;
-  _Val _M_object;
+  basic_istream<_CharT, _Traits>* _M_stream = nullptr;
+  _Val _M_object = _Val();
 
   struct _Iterator
   {
@@ -720,6 +723,7 @@ namespace views
_Iterator&
operator++()
{
+ __glibcxx_assert(_M_parent->_M_stream != nullptr);
  *_M_parent->_M_stream >> _M_parent->_M_object;
  return *this;
}
@@ -730,18 +734,21 @@ namespace views
 
_Val&
operator*() const
-   { return _M_parent->_M_object; }
+   {
+ __glibcxx_assert(_M_parent->_M_stream != nullptr);
+ return _M_parent->_M_object;
+   }
 
friend bool
operator==(const 

[PATCH 1/3, 11 backport] libstdc++: Implement P2325 changes to default-constructibility of views

2022-02-10 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, does this look OK for the 11 branch?
The backport to the 10 branch hasn't been started yet; I figured it'd
be good to first get the 11 backport right and then base the 10 backport
on the 11 one.

NB: This backport of r12-1606 to the 11 branch deliberately omits parts
of P2325R3 so as to maximize backward compatibility with pre-P2325R3 code.
In particular, we don't remove the default ctors for back_insert_iterator,
front_insert_iterator, ostream_iterator, ref_view and basic_istream_view.

This implements the wording changes of P2325R3 "Views should not be
required to be default constructible".  Changes are relatively
straightforward, besides perhaps those to __box (which now stands
for copyable-box instead of semiregular-box) and __non_propagating_cache.

For __box, this patch implements the recommended practice to also avoid
std::optional when the boxed type is nothrow_move/copy_constructible.
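
The idea of storing the object directly rather than in a std::optional can be sketched as follows. This is a simplified illustration, not the libstdc++ `__box` code: `copyable_box_sketch`, `NoAssign` and `assign_demo` are hypothetical names, and the sketch assumes the wrapped type is nothrow copy constructible and has no const or reference members.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <type_traits>

// A type that can be copy-constructed but not copy-assigned.
struct NoAssign
{
  int v;
  explicit NoAssign(int x) noexcept : v(x) {}
  NoAssign(const NoAssign&) = default;
  NoAssign& operator=(const NoAssign&) = delete;
};

// Stores T directly.  Assignment is emulated by destroying the old object
// and copy-constructing a new one in place; this is only safe because the
// copy construction cannot throw, so the box never ends up empty.
template<typename T>
class copyable_box_sketch
{
  static_assert(std::is_nothrow_copy_constructible_v<T>,
                "sketch only covers the nothrow-copy-constructible case");
  T value_;

public:
  explicit copyable_box_sketch(const T& t) noexcept : value_(t) {}
  copyable_box_sketch(const copyable_box_sketch&) = default;

  copyable_box_sketch& operator=(const copyable_box_sketch& other) noexcept
  {
    if (this != std::addressof(other))
      {
        std::destroy_at(std::addressof(value_));
        ::new (static_cast<void*>(std::addressof(value_))) T(other.value_);
      }
    return *this;
  }

  const T& operator*() const noexcept { return value_; }
};

static_assert(!std::is_copy_assignable_v<NoAssign>);
static_assert(std::is_copy_assignable_v<copyable_box_sketch<NoAssign>>);

// Assigns one box to another and returns the value now held by the target.
int assign_demo()
{
  copyable_box_sketch<NoAssign> x(NoAssign(1));
  copyable_box_sketch<NoAssign> y(NoAssign(2));
  x = y;
  return (*x).v;
}
```

The self-assignment check matters here: without it, the object would be destroyed before being used as the source of the copy.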

For __non_propagating_cache, now that it's used by split_view::_M_current,
we need to add assignment from a value of the underlying type to the
subset of the std::optional API implemented for the cache (needed by
split_view::begin()).  Hence the new __non_propagating_cache::operator=
overload.
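
A rough sketch of such a cache, with the extra assignment from a value of the underlying type, could look like this. It is an assumption-laden illustration built on std::optional, not the libstdc++ implementation; `non_propagating_cache_sketch` and `cache_demo` are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// A cache whose contents are never carried over by copy or move.
template<typename T>
class non_propagating_cache_sketch
{
  std::optional<T> value_;

public:
  non_propagating_cache_sketch() = default;

  // Copying or moving yields an empty cache; moving also empties the source.
  non_propagating_cache_sketch(const non_propagating_cache_sketch&) noexcept {}
  non_propagating_cache_sketch(non_propagating_cache_sketch&& other) noexcept
  { other.value_.reset(); }

  non_propagating_cache_sketch&
  operator=(const non_propagating_cache_sketch& other) noexcept
  {
    if (this != &other)
      value_.reset();
    return *this;
  }

  non_propagating_cache_sketch&
  operator=(non_propagating_cache_sketch&& other) noexcept
  {
    value_.reset();
    other.value_.reset();
    return *this;
  }

  // Assignment from a value of the underlying type.  emplace is used so
  // that T itself does not need to be assignable.
  non_propagating_cache_sketch& operator=(T val)
  {
    value_.emplace(std::move(val));
    return *this;
  }

  explicit operator bool() const noexcept { return value_.has_value(); }
  T& operator*() noexcept { return *value_; }
};

// Fills a cache, copies it, and checks that only the original holds a value.
bool cache_demo()
{
  non_propagating_cache_sketch<int> a;
  a = 42;
  non_propagating_cache_sketch<int> b = a;   // the copy starts out empty
  return bool(a) && *a == 42 && !bool(b);
}
```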

In passing, this fixes the undesirable list-init in the constructors of
the partial specialization of __box as reported in PR100475 comment #7.

PR libstdc++/103904

libstdc++-v3/ChangeLog:

* include/bits/iterator_concepts.h (weakly_incrementable): Remove
default_initializable requirement.
* include/bits/ranges_base.h (ranges::view): Likewise.
* include/bits/ranges_util.h (subrange): Constrain the default
ctor.
* include/bits/stl_iterator.h (common_iterator): Constrain the
default ctor.
(counted_iterator): Likewise.
* include/std/ranges (__detail::__box::operator=): Handle
self-assignment in the primary template.
(__detail::__box): In the partial specialization: adjust
constraints as per P2325.  Add specialized operator= for the
case when the wrapped type is not copyable.  Constrain the
default ctor.  Avoid list-initialization.
(single_view): Constrain the default ctor.
(iota_view): Relax semiregular constraint to copyable.
Constrain the default ctor.
(iota_view::_Iterator): Constrain the default ctor.
(basic_istream_view): Remove the default ctor.  Remove NSDMIs.
Remove redundant checks for empty _M_stream.
(basic_istream_view::_Iterator): Likewise.
(ref_view): Remove the default ctor.  Remove NSDMIs.
(ref_view::_Iterator): Constrain the default ctor.
(__detail::__non_propagating_cache::operator=): Define overload
for assigning from a value of the underlying type.
(filter_view): Likewise.
(filter_view::_Iterator): Likewise.
(transform_view): Likewise.
(transform_view::_Iterator): Likewise.
(take_view): Likewise.
(take_view::_Iterator): Likewise.
(take_while_view): Likewise.
(take_while_view::_Iterator): Likewise.
(drop_while_view): Likewise.
(drop_while_view::_Iterator): Likewise.
(join_view): Likewise.
(split_view::_OuterIter::__current): Adjust after changing the
type of _M_current.
(split_view::_M_current): Wrap it in a __non_propagating_cache.
(split_view::split_view): Constrain the default ctor.
(common_view): Constrain the default ctor.
(reverse_view): Likewise.
(elements_view): Likewise.
* include/std/span (enable_view<span<_ElementType, _Extent>>):
Define this partial specialization to true unconditionally.
* include/std/version (__cpp_lib_ranges): Adjust value.
* testsuite/std/ranges/adaptors/detail/semiregular_box.cc:
Rename to ...
* testsuite/std/ranges/adaptors/detail/copyable_box.cc: ... this.
(test02): Adjust now that __box is copyable-box not
semiregular-box.
(test03): New test.
* testsuite/std/ranges/p2325.cc: New test.
* testsuite/std/ranges/single_view.cc (test06): New test.
* testsuite/std/ranges/view.cc: Adjust now that view doesn't
require default_initializable.

(cherry picked from commit 4b4f5666b4c2f3aab2a9f3d53d394e390b9b682d)
---
 libstdc++-v3/include/bits/iterator_concepts.h |   3 +-
 libstdc++-v3/include/bits/ranges_base.h   |   3 +-
 libstdc++-v3/include/bits/ranges_util.h   |   2 +-
 libstdc++-v3/include/bits/stl_iterator.h  |   3 +-
 libstdc++-v3/include/std/ranges   | 136 +
 libstdc++-v3/include/std/span |   3 +-
 libstdc++-v3/include/std/version  |   2 +-
 .../{semiregular_box.cc => copyable_box.cc}   |  51 -
 libstdc++-v3/testsuite/std/ranges/p2325.cc| 180 ++
 .../testsuite/std/ranges/single_view.cc   |  15 ++
 

[PATCH 3/3, 11 backport] libstdc++: invalid default init in _CachedPosition [PR101231]

2022-02-10 Thread Patrick Palka via Gcc-patches
The primary template for _CachedPosition is a dummy implementation for
non-forward ranges, the iterators for which generally can't be cached.
Because this implementation doesn't actually cache anything, _M_has_value
is defined to be false and so calls to _M_get (which are always guarded
by _M_has_value) are unreachable.

Still, to suppress a "control reaches end of non-void function" warning
I made _M_get return {}, but after P2325 input iterators are no longer
necessarily default constructible so this workaround now breaks valid
programs.

This patch fixes this by instead using __builtin_unreachable to squelch
the warning.
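
The problem and the fix can be reproduced in miniature. The sketch below uses hypothetical names (`NoDefault`, `DummyCache`, `value_or`) and is not the libstdc++ code; it relies on `__builtin_unreachable`, which is a GCC and Clang builtin.

```cpp
#include <cassert>

// Stand-in for an input iterator that is not default constructible.
struct NoDefault
{
  explicit NoDefault(int v) : value(v) {}
  int value;
};

// Dummy cache that never holds anything, like the primary template
// described above.
struct DummyCache
{
  constexpr bool has_value() const { return false; }

  NoDefault get() const
  {
    // `return {};` would not compile here, because NoDefault has no
    // default constructor.  Marking the path unreachable avoids the
    // "control reaches end of non-void function" warning instead.
    __builtin_unreachable();
  }
};

// Every call to get() is guarded by has_value(), so get() is never reached.
int value_or(const DummyCache& c, int fallback)
{
  if (c.has_value())
    return c.get().value;
  return fallback;
}
```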

PR libstdc++/103904
PR libstdc++/101231

libstdc++-v3/ChangeLog:

* include/std/ranges (_CachedPosition::_M_get): For non-forward
ranges, just call __builtin_unreachable.
* testsuite/std/ranges/istream_view.cc (test05): New test.

(cherry picked from commit 1af937eb6246ad7f63ebff03590e9eede33aca81)
---
 libstdc++-v3/include/std/ranges   |  2 +-
 libstdc++-v3/testsuite/std/ranges/istream_view.cc | 12 
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index bf6cfae2a6e..a4228ba9aa0 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -1221,7 +1221,7 @@ namespace views::__adaptor
_M_get(const _Range&) const
{
  __glibcxx_assert(false);
- return {};
+ __builtin_unreachable();
}
 
constexpr void
diff --git a/libstdc++-v3/testsuite/std/ranges/istream_view.cc 
b/libstdc++-v3/testsuite/std/ranges/istream_view.cc
index af76a1ab39e..f5c0c2a6bb0 100644
--- a/libstdc++-v3/testsuite/std/ranges/istream_view.cc
+++ b/libstdc++-v3/testsuite/std/ranges/istream_view.cc
@@ -83,6 +83,17 @@ test04()
   static_assert(!std::forward_iterator);
 }
 
+void
+test05()
+{
+  // PR libstdc++/101231
+  auto words = std::istringstream{"42"};
+  auto is = ranges::istream_view(words);
+  auto r = is | views::filter([](auto) { return true; });
+  for (auto x : r)
+;
+}
+
 void
 test06()
 {
@@ -99,5 +110,6 @@ main()
   test02();
   test03();
   test04();
+  test05();
   test06();
 }
-- 
2.35.1.102.g2b9c120970



[PATCH 2/3, 11 backport] libstdc++: Sync __cpp_lib_ranges macro defined in ranges_cmp.h

2022-02-10 Thread Patrick Palka via Gcc-patches
r12-1606 bumped the value of __cpp_lib_ranges defined in <version>,
but this macro is also defined in <bits/ranges_cmp.h>, so it needs to
be updated there as well.

PR libstdc++/103904

libstdc++-v3/ChangeLog:

* include/bits/ranges_cmp.h (__cpp_lib_ranges): Adjust value.

(cherry picked from commit 12bdd39755a25d237b7776153cbe03e171396fc5)
---
 libstdc++-v3/include/bits/ranges_cmp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/ranges_cmp.h 
b/libstdc++-v3/include/bits/ranges_cmp.h
index f859a33b2c1..1d7da30dddf 100644
--- a/libstdc++-v3/include/bits/ranges_cmp.h
+++ b/libstdc++-v3/include/bits/ranges_cmp.h
@@ -57,7 +57,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #ifdef __cpp_lib_concepts
 // Define this here, included by all the headers that need to define it.
-#define __cpp_lib_ranges 201911L
+#define __cpp_lib_ranges 202106L
 
 namespace ranges
 {
-- 
2.35.1.102.g2b9c120970



[PATCH] configure: Implement --enable-host-bind-now

2022-02-10 Thread Marek Polacek via Gcc-patches
As promised in the --enable-host-pie patch, this patch adds another
configure option, --enable-host-bind-now, which adds -z now when linking
the compiler executables in order to extend hardening.  BIND_NOW with RELRO
allows the GOT to be marked RO; this prevents GOT modification attacks.

This option does not affect linking of target libraries; you can use
LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW.

Bootstrapped/regtested on x86_64-pc-linux-gnu (with the option enabled vs
not enabled).  I suppose this is GCC 13 material, but maybe I'll get some
comments anyway.

c++tools/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.
* configure: Regenerate.

gcc/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Add
-Wl,-z,now to LD_PICFLAG if --enable-host-bind-now.
* configure: Regenerate.
* doc/install.texi: Document --enable-host-bind-now.

lto-plugin/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Link with
-z,now.
* configure: Regenerate.
---
 c++tools/configure  | 11 +++
 c++tools/configure.ac   |  7 +++
 gcc/configure   | 20 ++--
 gcc/configure.ac| 13 -
 gcc/doc/install.texi|  6 ++
 lto-plugin/configure| 20 ++--
 lto-plugin/configure.ac | 11 +++
 7 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/c++tools/configure b/c++tools/configure
index 88087009383..006efe07b35 100755
--- a/c++tools/configure
+++ b/c++tools/configure
@@ -628,6 +628,7 @@ EGREP
 GREP
 CXXCPP
 LD_PICFLAG
+enable_host_bind_now
 PICFLAG
 MAINTAINER
 CXX_AUX_TOOLS
@@ -702,6 +703,7 @@ enable_maintainer_mode
 enable_checking
 enable_default_pie
 enable_host_pie
+enable_host_bind_now
 with_gcc_major_version_only
 '
   ac_precious_vars='build_alias
@@ -1336,6 +1338,7 @@ Optional Features:
   yes,no,all,none,release.
   --enable-default-pieenable Position Independent Executable as default
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
 
 Optional Packages:
   --with-PACKAGE[=ARG]use PACKAGE [ARG=yes]
@@ -3007,6 +3010,14 @@ fi
 
 
 
+# Enable --enable-host-bind-now
+# Check whether --enable-host-bind-now was given.
+if test "${enable_host_bind_now+set}" = set; then :
+  enableval=$enable_host_bind_now; LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"
+fi
+
+
+
 
 # Check if O_CLOEXEC is defined by fcntl
 
diff --git a/c++tools/configure.ac b/c++tools/configure.ac
index 1e42689f2eb..d3f23f66f00 100644
--- a/c++tools/configure.ac
+++ b/c++tools/configure.ac
@@ -110,6 +110,13 @@ AC_ARG_ENABLE(host-pie,
[build host code as PIE])],
 [PICFLAG=-fPIE; LD_PICFLAG=-pie], [])
 AC_SUBST(PICFLAG)
+
+# Enable --enable-host-bind-now
+AC_ARG_ENABLE(host-bind-now,
+[AS_HELP_STRING([--enable-host-bind-now],
+   [link host code as BIND_NOW])],
+[LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"], [])
+AC_SUBST(enable_host_bind_now)
 AC_SUBST(LD_PICFLAG)
 
 # Check if O_CLOEXEC is defined by fcntl
diff --git a/gcc/configure b/gcc/configure
index bd4fe1fd6ca..70156b17a40 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -635,6 +635,7 @@ CET_HOST_FLAGS
 LD_PICFLAG
 PICFLAG
 enable_default_pie
+enable_host_bind_now
 enable_host_pie
 enable_host_shared
 enable_plugin
@@ -1026,6 +1027,7 @@ enable_version_specific_runtime_libs
 enable_plugin
 enable_host_shared
 enable_host_pie
+enable_host_bind_now
 enable_libquadmath_support
 with_linker_hash_style
 with_diagnostics_color
@@ -1789,6 +1791,7 @@ Optional Features:
   --enable-plugin enable plugin support
   --enable-host-sharedbuild host code as shared libraries
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
   --disable-libquadmath-support
   disable libquadmath support for Fortran
   --enable-default-pieenable Position Independent Executable as default
@@ -19661,7 +19664,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19676 "configure"
+#line 19679 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19767,7 +19770,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19782 "configure"
+#line 19785 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -32236,6 +32239,14 @@ fi
 
 
 
+# Enable --enable-host-bind-now
+# Check whether --enable-host-bind-now was given.
+if test "${enable_host_bind_now+set}" = set; then :
+  enableval=$enable_host_bind_now;
+fi
+
+
+
 # Check whether --enable-libquadmath-support was given.
 if test "${enable_libquadmath_support+set}" = set; then :
   enableval=$enable_libquadmath_support; ENABLE_LIBQUADMATH_SUPPORT=$enableval
@@ -32422,6 +32433,8 @@ else
   PICFLAG=
 fi
 
+
+
 if test 

[PATCH] configure: Implement --enable-host-pie

2022-02-10 Thread Marek Polacek via Gcc-patches
This patch implements the --enable-host-pie configure option which
makes the compiler executables PIE.  This can be used to enhance
protection against ROP attacks, and can be viewed as part of a wider
trend to harden binaries.

It is similar to the option --enable-host-shared, except that
--enable-host-shared won't add -shared to the linker flags whereas
--enable-host-pie will add -pie.
It is different from --enable-default-pie because that option just
adds an implicit -fPIE/-pie when the compiler is invoked, but the
compiler itself isn't PIE.
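
To see the effect of an implicit -fPIE from user code, one option is the predefined `__PIE__` macro, which GCC documents as 1 for -fpie and 2 for -fPIE. The helper below is a hypothetical sketch (the name `pie_level` is made up for this example); it only reports how the translation unit containing it was compiled and says nothing about how the compiler executables themselves were built.

```cpp
#include <cassert>

// Returns 0 if this translation unit was not compiled as PIE,
// otherwise the value of __PIE__ (1 for -fpie, 2 for -fPIE).
constexpr int pie_level()
{
#ifdef __PIE__
  return __PIE__;
#else
  return 0;
#endif
}
```

With a compiler configured with --enable-default-pie, the function would be expected to return a nonzero value even without any explicit -fPIE on the command line.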

Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
regressions.

I plan to add an option to link with -Wl,-z,now.

Bootstrapped/regtested on x86_64-pc-linux-gnu (with the option enabled vs
not enabled).  I suppose this is GCC 13 material, but maybe I'll get some
comments anyway.

c++tools/ChangeLog:

* Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
Use pic/libiberty.a if PICFLAG is set.
* configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
(--enable-host-pie): New check.
* configure: Regenerate.

gcc/ChangeLog:

* Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.
* d/Make-lang.in: Remove NO_PIE_CFLAGS.
* doc/install.texi: Document --enable-host-pie.

libcody/ChangeLog:

* Makefile.in: Pass LD_PICFLAG to LDFLAGS.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.

libcpp/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libdecnumber/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

zlib/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.
---
 c++tools/Makefile.in  | 11 ++---
 c++tools/configure| 17 +++---
 c++tools/configure.ac | 11 +++--
 gcc/Makefile.in   | 29 ++--
 gcc/configure | 47 +++
 gcc/configure.ac  | 36 +-
 gcc/d/Make-lang.in|  2 +-
 gcc/doc/install.texi  | 16 +++--
 libcody/Makefile.in   |  2 +-
 libcody/configure | 30 -
 libcody/configure.ac  | 26 --
 libcpp/configure  | 22 +-
 libcpp/configure.ac   | 19 ++--
 libdecnumber/configure| 22 +-
 libdecnumber/configure.ac | 19 ++--
 zlib/configure| 30 -
 zlib/configure.ac | 21 ++---
 17 files changed, 295 insertions(+), 65 deletions(-)

diff --git a/c++tools/Makefile.in b/c++tools/Makefile.in
index d6a33613732..4d5a5b0522b 100644
--- a/c++tools/Makefile.in
+++ b/c++tools/Makefile.in
@@ -28,8 +28,9 @@ AUTOCONF := @AUTOCONF@
 AUTOHEADER := @AUTOHEADER@
 CXX := @CXX@
 CXXFLAGS := @CXXFLAGS@
-PIEFLAG := @PIEFLAG@
-CXXOPTS := $(CXXFLAGS) $(PIEFLAG) -fno-exceptions -fno-rtti
+PICFLAG := @PICFLAG@
+LD_PICFLAG := @LD_PICFLAG@
+CXXOPTS := $(CXXFLAGS) $(PICFLAG) -fno-exceptions -fno-rtti
 LDFLAGS := @LDFLAGS@
 exeext := @EXEEXT@
 LIBIBERTY := ../libiberty/libiberty.a
@@ -88,11 +89,15 @@ ifeq (@CXX_AUX_TOOLS@,yes)
 
 all::g++-mapper-server$(exeext)
 
+ifneq ($(PICFLAG),)
+override LIBIBERTY := ../libiberty/pic/libiberty.a
+endif
+
 MAPPER.O := server.o resolver.o
 CODYLIB = ../libcody/libcody.a
 CXXINC += -I$(srcdir)/../libcody -I$(srcdir)/../include -I$(srcdir)/../gcc -I. 
-I../gcc
 g++-mapper-server$(exeext): $(MAPPER.O) $(CODYLIB)
-   +$(CXX) $(LDFLAGS) $(PIEFLAG) -o $@ $^ $(LIBIBERTY) $(NETLIBS)
+   +$(CXX) $(LDFLAGS) $(PICFLAG) $(LD_PICFLAG) -o $@ $^ $(LIBIBERTY) 
$(NETLIBS)
 
 # copy to gcc dir so tests there can run
 all::../gcc/g++-mapper-server$(exeext)
diff --git a/c++tools/configure b/c++tools/configure
index 742816e4253..88087009383 100755
--- a/c++tools/configure
+++ b/c++tools/configure
@@ -627,7 +627,8 @@ get_gcc_base_ver
 EGREP
 GREP
 CXXCPP
-PIEFLAG
+LD_PICFLAG
+PICFLAG
 MAINTAINER
 CXX_AUX_TOOLS
 AUTOHEADER
@@ -700,6 +701,7 @@ enable_c___tools
 enable_maintainer_mode
 enable_checking
 enable_default_pie
+enable_host_pie
 with_gcc_major_version_only
 '

Re: [PATCH] combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446]

2022-02-10 Thread Segher Boessenkool
On Thu, Feb 10, 2022 at 05:21:07PM +0100, Jakub Jelinek wrote:
> On Thu, Feb 10, 2022 at 10:10:13AM -0600, Segher Boessenkool wrote:
> > But we do have that in other cases, and not just for combine.  IMO it
> > is a good idea to robustify for_each_inc_dec (simply have it skip if the
> > address is not MODE_INT or such).  It also is a good idea to robustify
> > combine subst, just as you do.  It is best to do both!
> 
> Well, skipping would mean the callback isn't called on it so the autoinc
> isn't detected.

Which is fine, because it isn't valid in the first place!  The only
thing we have to do is not ICE, this RTL is not long for this world.

> > So does it not fail if you make this valid code (by using another
> > register)?  bp, si, or di maybe?
> 
> Not on x86, that isn't a general auto-inc-dec target, but uses PRE_DEC
> etc. only for the sp hard register.

Ugh.  Does it have any benefit from using autoinc at all then?  (Actual
benefit, not notational convenience).

> For other targets we'd need to somehow convince all the earlier passes
> (gimple and RTL) not to try to propagate the constant value into the
> addition inside of a memory address.

I wonder if there is any target for which autoinc is more convenient
than inconvenient (other than in inline asm, a whole separate
challenge) :-(


Segher


[PATCH] i386: Fix vec_unpacks_float_lo_v4si operand constraint [PR104469]

2022-02-10 Thread Uros Bizjak via Gcc-patches
2022-02-10  Uroš Bizjak  

gcc/ChangeLog:

PR target/104469
* config/i386/sse.md (vec_unpacks_float_lo_v4si):
Change operand 1 constraint to register_operand.

gcc/testsuite/ChangeLog:

PR target/104469
* gcc.target/i386/pr104469.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master and release branches.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 36b35f68349..b2f56345c65 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -9223,7 +9223,7 @@ (define_expand "vec_unpacks_float_lo_v4si"
 (define_expand "vec_unpacks_float_hi_v8si"
   [(set (match_dup 2)
(vec_select:V4SI
- (match_operand:V8SI 1 "vector_operand")
+ (match_operand:V8SI 1 "register_operand")
  (parallel [(const_int 4) (const_int 5)
 (const_int 6) (const_int 7)])))
(set (match_operand:V4DF 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/i386/pr104469.c b/gcc/testsuite/gcc.target/i386/pr104469.c
new file mode 100644
index 000..39cc31fde1f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104469.c
@@ -0,0 +1,12 @@
+/* PR target/104469 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512f" } */
+
+typedef double __attribute__((__vector_size__ (64))) F;
+typedef int __attribute__((__vector_size__ (32))) V;
+
+F
+foo (V v)
+{
+  return __builtin_convertvector (v, F);
+}


Re: Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'

2022-02-10 Thread Michael Matz via Gcc-patches
Hi,

On Thu, 10 Feb 2022, Richard Biener via Gcc-patches wrote:

> On Wed, Feb 9, 2022 at 2:21 PM Thomas Schwinge  
> wrote:
> >
> > Hi!
> >
> > OK to push (now, or in next development stage 1?) the attached
> > "Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'",
> > or should that be done differently -- or, per the current state (why?)
> > not at all?
> >
> > This does work for my current debugging task, but I've not yet run
> > 'make check' in case anything needs to be adjusted there.
> 
> Hmm, I wonder if we shouldn't simply dump DECL_UID as
> 
>  'uid NNN'

Yes, much better in line with the normal dump_tree output.


Ciao,
Michael.

> 
> somewhere.  For example after or before DECL_NAME?
> 
> >
> > Grüße
> >  Thomas
> >
> >
> 


Re: [PATCH] combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446]

2022-02-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 10, 2022 at 10:10:13AM -0600, Segher Boessenkool wrote:
> > case POST_DEC:
> >   {
> > poly_int64 size = GET_MODE_SIZE (GET_MODE (mem));
> > rtx r1 = XEXP (x, 0);
> > rtx c = gen_int_mode (-size, GET_MODE (r1));
> > return fn (mem, x, r1, r1, c, data);
> >   }
> > and that code rightfully expects that the PRE_DEC operand has non-VOIDmode
> > (as it needs to be a REG) - gen_int_mode for VOIDmode results in ICE.
> > I think it is better not to emit the clearly invalid RTL during substitution
> > like we do for other cases, than to add workarounds for invalid IL
> > created by combine to rtlanal.cc and perhaps elsewhere.
> 
> But we do have that in other cases, and not just for combine.  IMO it
> is a good idea to robustify for_each_inc_dec (simply have it skip if the
> address is not MODE_INT or such).  It also is a good idea to robustify
> combine subst, just as you do.  It is best to do both!

Well, skipping would mean the callback isn't called on it so the autoinc
isn't detected.
But we could do:
 rtx c = (GET_MODE (r1) == VOIDmode
  ? GEN_INT (-size) : gen_int_mode (-size, GET_MODE (r1)));
with a comment explaining why we do that.
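
To make that alternative concrete, here is a minimal standalone sketch of the fallback. It does not use GCC's real types: Mode, toy_gen_int, toy_gen_int_mode and toy_autoinc_step are hypothetical stand-ins. The only behaviour modelled is that a mode-aware constant needs a mode with a known precision, while the mode-less form does not.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for machine modes; not GCC's real machine_mode.
enum class Mode { Void, SI, DI };

// Precision in bits; the toy VOIDmode has none.
inline int precision_bits(Mode m) {
  switch (m) {
    case Mode::SI: return 32;
    case Mode::DI: return 64;
    default:       return 0;
  }
}

// Toy analogue of GEN_INT: a constant with no mode attached.
inline std::int64_t toy_gen_int(std::int64_t v) { return v; }

// Toy analogue of gen_int_mode: sign-extend V from the precision of M.
// Asking for a mode without precision is a caller bug (the ICE in the thread).
inline std::int64_t toy_gen_int_mode(std::int64_t v, Mode m) {
  const int bits = precision_bits(m);
  assert(bits != 0 && "mode has no precision");
  if (bits == 64) return v;
  const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  const std::uint64_t low = static_cast<std::uint64_t>(v) & mask;
  return static_cast<std::int64_t>((low ^ sign) - sign);
}

// The suggested guard: fall back to the mode-less constant when the
// "register" operand turns out to have no mode (i.e. it is a constant).
inline std::int64_t toy_autoinc_step(std::int64_t size, Mode reg_mode) {
  return reg_mode == Mode::Void ? toy_gen_int(-size)
                                : toy_gen_int_mode(-size, reg_mode);
}
```

Calling toy_autoinc_step with Mode::Void takes the mode-less branch instead of tripping the assertion, which is the whole point of the guard.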

> > PR middle-end/104446
> > * combine.cc (subst): Don't substitute CONST_INTs into RTX_AUTOINC
> > operands.
> > 
> > * gcc.target/i386/pr104446.c: New test.
> 
> > +/* PR middle-end/104446 */
> > +/* { dg-do compile { target ia32 } } */
> > +/* { dg-options "-O2 -mrtd" } */
> > +
> > +register volatile int a __asm__("%esp");
> > +void foo (void *);
> > +void bar (void *);
> > +
> > +void
> > +baz (void)
> > +{
> > +  foo (__builtin_return_address (0));
> > +  a = 0;
> > +  bar (__builtin_return_address (0));
> > +}
> 
> So does it not fail if you make this valid code (by using another
> register)?  bp, si, or di maybe?

Not on x86, that isn't a general auto-inc-dec target, but uses PRE_DEC
etc. only for the sp hard register.
For other targets we'd need to somehow convince all the earlier passes
(gimple and RTL) not to try to propagate the constant value into the
addition inside of a memory address.

Jakub



Re: [PATCH] c: Add diagnostic when operator= is used as truth cond [PR25689]

2022-02-10 Thread Jason Merrill via Gcc-patches

On 2/9/22 21:18, Zhao Wei Liew via Gcc-patches wrote:

Hi!

I wrote a patch for PR 25689, but I feel like it may not be the ideal
fix. Furthermore, there are some standing issues with the patch for
which I would like tips on how to fix them.
Specifically, there are 2 issues:
1. GCC warns about  if (a.operator=(0)). That said, this may not be a
major issue as I don't think such code is widely written.


Can you avoid this by checking CALL_EXPR_OPERATOR_SYNTAX?


2. GCC does not warn for `if (a = b)` where the default copy/move
assignment operator is used.


The code for trivial copy-assignment should be pretty recognizable, as a 
MODIFY_EXPR of two MEM_REFs; it's built in build_over_call after the 
comment "We must only copy the non-tail padding parts."



I've included a code snippet in PR25689 that shows the 2 issues I
mentioned. I appreciate any feedback, thanks!

Everything below is the actual patch

When compiling the following code with g++ -Wparentheses, GCC does not
warn on the if statement:

struct A {
A& operator=(int);
operator bool();
};

void f(A a) {
if (a = 0); // no warning
}

This is because a = 0 is a call to operator=, which GCC does not check
for.

This patch fixes that by checking for calls to operator= when deciding
to warn.
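
For illustration only, a runnable variant of the example above with member bodies filled in. The function names are hypothetical, and the assumption that an extra pair of parentheses silences the diagnostic for the operator= case is just the usual -Wparentheses convention, not something this patch is confirmed to implement.

```cpp
// Same shape as the struct in the example, with bodies added so it runs.
struct A {
  int value = 1;
  A& operator=(int v) { value = v; return *this; }
  operator bool() const { return value != 0; }
};

// Looks like a typo for a comparison: the condition assigns 0 through
// operator= and then tests the result via operator bool.
inline bool suspicious(A a) {
  if (a = 0)
    return true;
  return false;
}

// Same semantics, but the extra parentheses signal that the assignment
// is intentional (assumed to suppress the warning, as for built-in types).
inline bool deliberate(A a) {
  if ((a = 0))
    return true;
  return false;
}
```

Both functions return false, because the assignment stores 0 before the conversion to bool happens; only the diagnostic is expected to differ.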

PR c/25689

gcc/cp/ChangeLog:

* semantics.cc (maybe_convert_cond): Handle the operator=() case
  as well.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wparentheses-31.C: New test.
---
  gcc/cp/semantics.cc | 14 +-
  gcc/testsuite/g++.dg/warn/Wparentheses-31.C | 11 +++
  2 files changed, 24 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wparentheses-31.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 466d6b56871f4..6a25d039585f2 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -836,7 +836,19 @@ maybe_convert_cond (tree cond)
/* Do the conversion.  */
cond = convert_from_reference (cond);

-  if (TREE_CODE (cond) == MODIFY_EXPR
+  /* Also check if this is a call to operator=().
+ Example: if (my_struct = 5) {...}
+  */
+  tree fndecl = NULL_TREE;
+  if (TREE_OPERAND_LENGTH(cond) >= 1) {
+fndecl = cp_get_callee_fndecl(TREE_OPERAND(cond, 0));


Let's use cp_get_callee_fndecl_nofold.

Please add a space before all (


+  }
+
+  if ((TREE_CODE (cond) == MODIFY_EXPR
+|| (fndecl != NULL_TREE
+&& DECL_OVERLOADED_OPERATOR_P(fndecl)
+&& DECL_OVERLOADED_OPERATOR_IS(fndecl, NOP_EXPR)
+&& DECL_ASSIGNMENT_OPERATOR_P(fndecl)))
&& warn_parentheses
&& !warning_suppressed_p (cond, OPT_Wparentheses)
&& warning_at (cp_expr_loc_or_input_loc (cond),
diff --git a/gcc/testsuite/g++.dg/warn/Wparentheses-31.C b/gcc/testsuite/g++.dg/warn/Wparentheses-31.C
new file mode 100644
index 0..abd7476ccb461
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wparentheses-31.C
@@ -0,0 +1,11 @@
+/* PR c/25689 */
+/* { dg-options "-Wparentheses" }  */
+
+struct A {
+   A& operator=(int);
+   operator bool();
+};
+
+void f(A a) {
+   if (a = 0); /* { dg-warning "suggest parentheses" } */
+}
--
2.17.1





Re: [PATCH] combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446]

2022-02-10 Thread Segher Boessenkool
Hi!

On Thu, Feb 10, 2022 at 10:55:03AM +0100, Jakub Jelinek wrote:
> The following testcase ICEs, because combine substitutes
> (insn 10 9 11 2 (set (reg/v:SI 7 sp [ a ])
> (const_int 0 [0])) "pr104446.c":9:5 81 {*movsi_internal}
>  (nil))
> (insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [0  S4 A32])
> (reg:SI 85)) "pr104446.c":10:3 56 {*pushsi2}
>  (expr_list:REG_DEAD (reg:SI 85)
> (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> (nil))))
> forming
> (insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (const_int 0 [0])) [0  S4 A32])
> (reg:SI 85)) "pr104446.c":10:3 56 {*pushsi2}
>  (expr_list:REG_DEAD (reg:SI 85)
> (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
> (nil))))
> which is invalid RTL (pre_dec's argument must be a REG).
> I know substitution creates various forms of invalid RTL and hopes that
> invalid RTL just won't recog.

This is not a "hope"; it is a requirement.  If the backend accepts
invalid insns, this is a bug in the backend.

> But unfortunately in this case we ICE before we get to recog, as
> try_combine does:
>   if (n_auto_inc)
> {
>   int new_n_auto_inc = 0;
>   for_each_inc_dec (newpat, count_auto_inc, &new_n_auto_inc);
> 
>   if (n_auto_inc != new_n_auto_inc)
> {
>   if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, "Number of auto_inc expressions changed\n");
>   undo_all ();
>   return 0;
> }
> }
> and for_each_inc_dec under the hood will do e.g. for the PRE_DEC case:
> case PRE_DEC:
> case POST_DEC:
>   {
> poly_int64 size = GET_MODE_SIZE (GET_MODE (mem));
> rtx r1 = XEXP (x, 0);
> rtx c = gen_int_mode (-size, GET_MODE (r1));
> return fn (mem, x, r1, r1, c, data);
>   }
> and that code rightfully expects that the PRE_DEC operand has non-VOIDmode
> (as it needs to be a REG) - gen_int_mode for VOIDmode results in ICE.
> I think it is better not to emit the clearly invalid RTL during substitution
> like we do for other cases, than to add workarounds for invalid IL
> created by combine to rtlanal.cc and perhaps elsewhere.

But we do have that in other cases, and not just for combine.  IMO it
is a good idea to robustify for_each_inc_dec (simply have it skip if the
address is not MODE_INT or such).  It also is a good idea to robustify
combine subst, just as you do.  It is best to do both!

> As for the testcase, of course it is UB at runtime to modify sp that way,
> but if such code is never reached, we must compile it, not to ICE on it.

It is an error at compile time already.  The stack pointer is a fixed
register.  The generic parts of the compiler use it, it is not just a
backend thing.

There are many more ways to ICE the compiler with register vars, btw.

And yes, ICEs are bad of course :-)  QoI thing...

> And I don't see why on other targets which use the autoinc rtxes much more
> it couldn't happen with other registers.

Yes.  My point (in the PR) was that it is easy enough to make this valid
code instead!

>   PR middle-end/104446
>   * combine.cc (subst): Don't substitute CONST_INTs into RTX_AUTOINC
>   operands.
> 
>   * gcc.target/i386/pr104446.c: New test.

> +/* PR middle-end/104446 */
> +/* { dg-do compile { target ia32 } } */
> +/* { dg-options "-O2 -mrtd" } */
> +
> +register volatile int a __asm__("%esp");
> +void foo (void *);
> +void bar (void *);
> +
> +void
> +baz (void)
> +{
> +  foo (__builtin_return_address (0));
> +  a = 0;
> +  bar (__builtin_return_address (0));
> +}

So does it not fail if you make this valid code (by using another
register)?  bp, si, or di maybe?

Okay with that fixed.  If fixing it is too hard, okay like this (I don't
have to maintain other peoples' backends' testsuites after all...)

Thanks!


Segher


[PATCH] pr104458.c: Replace long with long long for -mx32

2022-02-10 Thread H.J. Lu via Gcc-patches
PR target/104458
* gcc.target/i386/pr104458.c: Replace long with long long.
---
 gcc/testsuite/gcc.target/i386/pr104458.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr104458.c b/gcc/testsuite/gcc.target/i386/pr104458.c
index d1d28c13118..db5999df7dd 100644
--- a/gcc/testsuite/gcc.target/i386/pr104458.c
+++ b/gcc/testsuite/gcc.target/i386/pr104458.c
@@ -9,5 +9,5 @@ int i;
 void
 foo (F f)
 {
-  i += i % (long) f;
+  i += i % (long long) f;
 }
-- 
2.34.1



[committed] analyzer: fix testsuite issues seen with mingw [PR102052]

2022-02-10 Thread David Malcolm via Gcc-patches
Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-7180-g8383d41d704571d7ca234c7d2f551b7b69255194.

gcc/testsuite/ChangeLog:
PR analyzer/102052
* gcc.dg/analyzer/fields.c (size_t): Use __SIZE_TYPE__ rather than
hardcoding long unsigned int.
* gcc.dg/analyzer/gzio-3.c (size_t): Likewise.
* gcc.dg/analyzer/gzio-3a.c (size_t): Likewise.
* gcc.dg/analyzer/pr98969.c (test_1): Use __UINTPTR_TYPE__ rather
than long int.
(test_2): Likewise.
* gcc.dg/analyzer/pr99716-2.c (test_mountpoint): Use "rand" rather
than "random".
* gcc.dg/analyzer/pr99774-1.c (size_t): Use __SIZE_TYPE__ rather
than hardcoding long unsigned int.
* gcc.dg/analyzer/strndup-1.c: Add MinGW to targets that don't
implement strndup.
* gcc.dg/analyzer/zlib-5.c (size_t): Use __SIZE_TYPE__ rather
than hardcoding long unsigned int.
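
A small sketch of why the predefined macros are the portable spelling. On LLP64 targets such as 64-bit MinGW, long is 32 bits wide while size_t and pointers are 64 bits, so a hardcoded "long unsigned int" no longer matches. The names demo_size_t, demo_uintptr_t and round_trips are hypothetical; __SIZE_TYPE__ and __UINTPTR_TYPE__ are the compiler-predefined macros used in the patch.

```cpp
#include <type_traits>

// Let the compiler name the types instead of hardcoding "long".
typedef __SIZE_TYPE__ demo_size_t;
typedef __UINTPTR_TYPE__ demo_uintptr_t;

// demo_size_t is exactly the type that sizeof yields on this target.
static_assert(std::is_same<demo_size_t, decltype(sizeof(0))>::value,
              "__SIZE_TYPE__ should be the type of sizeof");

// demo_uintptr_t is wide enough to hold a pointer on this target.
static_assert(sizeof(demo_uintptr_t) == sizeof(void*),
              "__UINTPTR_TYPE__ should be pointer-sized");

// A pointer converted to the integer type and back compares equal.
inline bool round_trips(void* p) {
  demo_uintptr_t bits = reinterpret_cast<demo_uintptr_t>(p);
  return reinterpret_cast<void*>(bits) == p;
}
```

The static assertions hold on both LP64 and LLP64 targets, whereas the same checks written against "long" would only hold on the former.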

Signed-off-by: David Malcolm 
---
 gcc/testsuite/gcc.dg/analyzer/fields.c| 2 +-
 gcc/testsuite/gcc.dg/analyzer/gzio-3.c| 2 +-
 gcc/testsuite/gcc.dg/analyzer/gzio-3a.c   | 2 +-
 gcc/testsuite/gcc.dg/analyzer/pr98969.c   | 4 ++--
 gcc/testsuite/gcc.dg/analyzer/pr99716-2.c | 2 +-
 gcc/testsuite/gcc.dg/analyzer/pr99774-1.c | 2 +-
 gcc/testsuite/gcc.dg/analyzer/strndup-1.c | 2 +-
 gcc/testsuite/gcc.dg/analyzer/zlib-5.c| 2 +-
 8 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/fields.c b/gcc/testsuite/gcc.dg/analyzer/fields.c
index de55208070a..0bf877fcf1e 100644
--- a/gcc/testsuite/gcc.dg/analyzer/fields.c
+++ b/gcc/testsuite/gcc.dg/analyzer/fields.c
@@ -1,4 +1,4 @@
-typedef long unsigned int size_t;
+typedef __SIZE_TYPE__ size_t;
 
 extern size_t strlen (const char *__s)
   __attribute__ ((__nothrow__ , __leaf__))
diff --git a/gcc/testsuite/gcc.dg/analyzer/gzio-3.c b/gcc/testsuite/gcc.dg/analyzer/gzio-3.c
index 0a11f65fdca..426683244ff 100644
--- a/gcc/testsuite/gcc.dg/analyzer/gzio-3.c
+++ b/gcc/testsuite/gcc.dg/analyzer/gzio-3.c
@@ -1,4 +1,4 @@
-typedef long unsigned int size_t;
+typedef __SIZE_TYPE__ size_t;
 typedef struct _IO_FILE FILE;
 extern size_t fread(void *__restrict __ptr, size_t __size, size_t __n,
 FILE *__restrict __stream);
diff --git a/gcc/testsuite/gcc.dg/analyzer/gzio-3a.c b/gcc/testsuite/gcc.dg/analyzer/gzio-3a.c
index 15ed0103fe0..faf86fa3877 100644
--- a/gcc/testsuite/gcc.dg/analyzer/gzio-3a.c
+++ b/gcc/testsuite/gcc.dg/analyzer/gzio-3a.c
@@ -1,4 +1,4 @@
-typedef long unsigned int size_t;
+typedef __SIZE_TYPE__ size_t;
 typedef struct _IO_FILE FILE;
 extern size_t fread(void *__restrict __ptr, size_t __size, size_t __n,
 FILE *__restrict __stream);
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr98969.c b/gcc/testsuite/gcc.dg/analyzer/pr98969.c
index 7e1587d7094..e4e4f059197 100644
--- a/gcc/testsuite/gcc.dg/analyzer/pr98969.c
+++ b/gcc/testsuite/gcc.dg/analyzer/pr98969.c
@@ -4,14 +4,14 @@ struct foo
 };
 
 void
-test_1 (long int i)
+test_1 (__UINTPTR_TYPE__ i)
 {
   struct foo *f = (struct foo *)i;
   f->expr = __builtin_malloc (1024);
 } /* { dg-bogus "leak" } */
 
 void
-test_2 (long int i)
+test_2 (__UINTPTR_TYPE__ i)
 {
   __builtin_free (((struct foo *)i)->expr);
   __builtin_free (((struct foo *)i)->expr); /* { dg-warning "double-'free' of '\\*\\(\\(struct foo \\*\\)i\\)\\.expr'" } */
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr99716-2.c b/gcc/testsuite/gcc.dg/analyzer/pr99716-2.c
index 7c9881c61ff..adc9819643a 100644
--- a/gcc/testsuite/gcc.dg/analyzer/pr99716-2.c
+++ b/gcc/testsuite/gcc.dg/analyzer/pr99716-2.c
@@ -10,7 +10,7 @@ extern int foo (void);
 void
 test_mountpoint (const char *mp)
 {
-  const int nr_passes = 5 + (random () & 31);
+  const int nr_passes = 5 + (rand () & 31);
   int pass;
   int ret = 1;
   FILE *fp;
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr99774-1.c b/gcc/testsuite/gcc.dg/analyzer/pr99774-1.c
index 620cf6571ed..a0bca8b1fe2 100644
--- a/gcc/testsuite/gcc.dg/analyzer/pr99774-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/pr99774-1.c
@@ -7,7 +7,7 @@ typedef unsigned char uint8_t;
 typedef unsigned short uint16_t;
 typedef unsigned long uint64_t;
 typedef unsigned long uint64_t;
-typedef long unsigned int size_t;
+typedef __SIZE_TYPE__ size_t;
 
 extern void *calloc(size_t __nmemb, size_t __size)
   __attribute__((__nothrow__, __leaf__))
diff --git a/gcc/testsuite/gcc.dg/analyzer/strndup-1.c b/gcc/testsuite/gcc.dg/analyzer/strndup-1.c
index 58223533b5d..edf494ac284 100644
--- a/gcc/testsuite/gcc.dg/analyzer/strndup-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/strndup-1.c
@@ -1,4 +1,4 @@
-/* { dg-skip-if "no strndup in libc" { *-*-darwin[789]* *-*-darwin10* } } */
+/* { dg-skip-if "no strndup in libc" { *-*-darwin[789]* *-*-darwin10* *-*-mingw* } } */
 #include 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/zlib-5.c b/gcc/testsuite/gcc.dg/analyzer/zlib-5.c
index afb61023330..1e3746d91fc 100644
--- 

[PATCH] x86: Update PR 35513 tests

2022-02-10 Thread H.J. Lu via Gcc-patches
1. Require linker with GNU_PROPERTY_1_NEEDED support for PR 35513
run-time tests.
2. Compile pr35513-8.c to scan assembly code.

PR testsuite/104481
* g++.target/i386/pr35513-1.C: Require property_1_needed target.
* g++.target/i386/pr35513-2.C: Likewise.
* gcc.target/i386/pr35513-8.c: Change to compile.
* lib/target-supports.exp (check_compile): Support assembly code.
(check_effective_target_property_1_needed): New proc.
---
 gcc/testsuite/g++.target/i386/pr35513-1.C |  2 +-
 gcc/testsuite/g++.target/i386/pr35513-2.C |  2 +-
 gcc/testsuite/gcc.target/i386/pr35513-8.c |  2 +-
 gcc/testsuite/lib/target-supports.exp | 37 +++
 4 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/g++.target/i386/pr35513-1.C b/gcc/testsuite/g++.target/i386/pr35513-1.C
index 6f8db37fb7c..daa615662c5 100644
--- a/gcc/testsuite/g++.target/i386/pr35513-1.C
+++ b/gcc/testsuite/g++.target/i386/pr35513-1.C
@@ -1,4 +1,4 @@
-// { dg-do run }
+// { dg-do run { target property_1_needed } }
 // { dg-options "-O2 -mno-direct-extern-access" }
 
 #include 
diff --git a/gcc/testsuite/g++.target/i386/pr35513-2.C b/gcc/testsuite/g++.target/i386/pr35513-2.C
index 9143ff3f0a5..ecccdaeb666 100644
--- a/gcc/testsuite/g++.target/i386/pr35513-2.C
+++ b/gcc/testsuite/g++.target/i386/pr35513-2.C
@@ -1,4 +1,4 @@
-// { dg-do run  }
+// { dg-do run { target property_1_needed } }
 // { dg-options "-O2 -mno-direct-extern-access" }
 
 class Foo 
diff --git a/gcc/testsuite/gcc.target/i386/pr35513-8.c b/gcc/testsuite/gcc.target/i386/pr35513-8.c
index 7ba67de2156..d51f7efb353 100644
--- a/gcc/testsuite/gcc.target/i386/pr35513-8.c
+++ b/gcc/testsuite/gcc.target/i386/pr35513-8.c
@@ -1,4 +1,4 @@
-/* { dg-do assemble { target { *-*-linux* && { ! ia32 } } } } */
+/* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
 /* { dg-require-effective-target maybe_x32 } */
 /* { dg-options "-mx32 -O2 -fno-pic -fexceptions -fasynchronous-unwind-tables -mno-direct-extern-access" } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 4463cc8d7ed..0d8a7df5026 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -30,6 +30,7 @@
 #
 # Assume by default that CONTENTS is C code.  
 # Otherwise, code should contain:
+# "/* Assembly" for assembly code,
 # "// C++" for c++,
 # "// D" for D,
 # "! Fortran" for Fortran code,
@@ -57,6 +58,7 @@ proc check_compile {basename type contents args} {
set options ""
 }
 switch -glob -- $contents {
+   "*/* Assembly*" { set src ${basename}[pid].S }
"*! Fortran*" { set src ${basename}[pid].f90 }
"*// C++*" { set src ${basename}[pid].cc }
"*// D*" { set src ${basename}[pid].d }
@@ -11758,3 +11760,38 @@ proc check_effective_target_pytest3 { } {
 return 0;
 }
 }
+
+proc check_effective_target_property_1_needed { } {
+  return [check_no_compiler_messages_nocache property_1_needed executable {
+/* Assembly code */
+#ifdef __LP64__
+# define __PROPERTY_ALIGN 3
+#else
+# define __PROPERTY_ALIGN 2
+#endif
+
+   .section ".note.gnu.property", "a"
+   .p2align __PROPERTY_ALIGN
+   .long 1f - 0f   /* name length.  */
+   .long 4f - 1f   /* data length.  */
+   /* NT_GNU_PROPERTY_TYPE_0.   */
+   .long 5 /* note type.  */
+0:
+   .asciz "GNU"/* vendor name.  */
+1:
+   .p2align __PROPERTY_ALIGN
+   /* GNU_PROPERTY_1_NEEDED.  */
+   .long 0xb0008000/* pr_type.  */
+   .long 3f - 2f   /* pr_datasz.  */
+2:
+   /* GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS.  */
+   .long 1
+3:
+   .p2align __PROPERTY_ALIGN
+4:
+   .text
+   .globl main
+main:
+   .byte 0
+  } ""]
+}
-- 
2.34.1



[committed] libstdc++: Add atomic_fetch_xor to <stdatomic.h>

2022-02-10 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

This function (and the explicit memory order version) are present in both
C++ <atomic> and C <stdatomic.h>, so should be in C++ <stdatomic.h> too.
There is a library issue incoming for this, but the resolution is
obvious.
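
As a quick reminder of what the two functions do, here is a small sketch using the std-qualified versions from <atomic>, which are the ones the new using-declarations re-export. The helper names and the XorResult struct are hypothetical.

```cpp
#include <atomic>

// Hypothetical helper type: the value before and after the operation.
struct XorResult {
  int before;
  int after;
};

// atomic_fetch_xor returns the previous value and stores (old ^ mask),
// using sequentially consistent ordering.
inline XorResult fetch_xor_demo(int initial, int mask) {
  std::atomic_int v{initial};
  int before = std::atomic_fetch_xor(&v, mask);
  return {before, v.load()};
}

// The _explicit form does the same with a caller-chosen memory order.
inline XorResult fetch_xor_explicit_demo(int initial, int mask) {
  std::atomic_int v{initial};
  int before =
      std::atomic_fetch_xor_explicit(&v, mask, std::memory_order_relaxed);
  return {before, v.load(std::memory_order_relaxed)};
}
```

For example, starting from 5 with mask 3 the call returns 5 and leaves 6 in the atomic object.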

libstdc++-v3/ChangeLog:

* include/c_compatibility/stdatomic.h (atomic_fetch_xor): Add
using-declaration.
(atomic_fetch_xor_explicit): Likewise.
* testsuite/29_atomics/headers/stdatomic.h/c_compat.cc: Check
arithmetic and logical operations for atomic_int.
---
 libstdc++-v3/include/c_compatibility/stdatomic.h  |  2 ++
 .../29_atomics/headers/stdatomic.h/c_compat.cc| 11 +++
 2 files changed, 13 insertions(+)

diff --git a/libstdc++-v3/include/c_compatibility/stdatomic.h b/libstdc++-v3/include/c_compatibility/stdatomic.h
index 95c72615b4e..c97cbac984e 100644
--- a/libstdc++-v3/include/c_compatibility/stdatomic.h
+++ b/libstdc++-v3/include/c_compatibility/stdatomic.h
@@ -111,6 +111,8 @@ using std::atomic_fetch_sub;
 using std::atomic_fetch_sub_explicit;
 using std::atomic_fetch_or;
 using std::atomic_fetch_or_explicit;
+using std::atomic_fetch_xor;
+using std::atomic_fetch_xor_explicit;
 using std::atomic_fetch_and;
 using std::atomic_fetch_and_explicit;
 using std::atomic_flag_test_and_set;
diff --git a/libstdc++-v3/testsuite/29_atomics/headers/stdatomic.h/c_compat.cc b/libstdc++-v3/testsuite/29_atomics/headers/stdatomic.h/c_compat.cc
index 80d2e150647..6dd4f5b00ca 100644
--- a/libstdc++-v3/testsuite/29_atomics/headers/stdatomic.h/c_compat.cc
+++ b/libstdc++-v3/testsuite/29_atomics/headers/stdatomic.h/c_compat.cc
@@ -116,6 +116,17 @@ static_assert( requires (::atomic_int* i, int* e) {
   ::atomic_compare_exchange_weak_explicit(i, e, 3,
  memory_order_acq_rel,
  memory_order_relaxed);
+
+  ::atomic_fetch_add(i, 1);
+  ::atomic_fetch_add_explicit(i, 1, memory_order_relaxed);
+  ::atomic_fetch_sub(i, 1);
+  ::atomic_fetch_sub_explicit(i, 1, memory_order_relaxed);
+  ::atomic_fetch_and(i, 1);
+  ::atomic_fetch_and_explicit(i, 1, memory_order_relaxed);
+  ::atomic_fetch_or(i, 1);
+  ::atomic_fetch_or_explicit(i, 1, memory_order_relaxed);
+  ::atomic_fetch_xor(i, 1);
+  ::atomic_fetch_xor_explicit(i, 1, memory_order_relaxed);
 } );
 
 static_assert( requires (::atomic_flag* f) {
-- 
2.34.1



[committed] libstdc++: Decouple HAVE_FCNTL_H from HAVE_DIRENT_H check

2022-02-10 Thread Jonathan Wakely via Gcc-patches
On Tue, 8 Feb 2022 at 21:18, Jonathan Wakely wrote:

>
>
> On Tue, 8 Feb 2022 at 21:02, Dimitar Dimitrov  wrote:
>
>> On Mon, Feb 07, 2022 at 09:05:45PM +, Jonathan Wakely wrote:
>> > On Mon, 7 Feb 2022 at 21:01, Jonathan Wakely 
>> wrote:
>> >
>> > >
>> > >
>> > > On Mon, 7 Feb 2022 at 20:12, Dimitar Dimitrov 
>> wrote:
>> > >
>> > >> On PRU target with newlib, we have the following combination in
>> config.h:
>> > >>   /* #undef HAVE_DIRENT_H */
>> > >>   #define HAVE_FCNTL_H 1
>> > >>   #define HAVE_UNLINKAT 1
>> > >>
>> > >> In newlib, targets which do not define dirent.h, get a build error
>> when
>> > >> including :
>> > >>
>> > >>
>> https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/include/sys/dirent.h;hb=HEAD
>> > >>
>> > >> While fs_dir.cc correctly checks for HAVE_FCNTL_H, dir-common.h
>> doesn't,
>> > >> and instead uses HAVE_DIRENT_H. This results in unlinkat() function
>> call
>> > >> in fs_dir.cc without the needed  include in dir-common.h.
>> Thus
>> > >> a build failure:
>> > >>   .../gcc/libstdc++-v3/src/c++17/fs_dir.cc:151:11: error:
>> ‘::unlinkat’
>> > >> has not been declared; did you mean ‘unlink’?
>> > >>
>> > >> Fix by encapsulating  include with the correct check.
>> > >>
>> > >
>> > > But there's no point doing anything in that file if we don't have
>> > > <dirent.h>, the whole thing is unusable. There's no point making the
>> > > members using unlinkat compile if you can't ever construct the type.
>> > >
>> > > So I think we want a different fix.
>> > >
>> >
>> >
>> > Maybe something like:
>> >
>> > --- a/libstdc++-v3/src/filesystem/dir-common.h
>> > +++ b/libstdc++-v3/src/filesystem/dir-common.h
>> > @@ -70,6 +70,8 @@ struct DIR { };
>> > inline DIR* opendir(const char*) { return nullptr; }
>> > inline dirent* readdir(DIR*) { return nullptr; }
>> > inline int closedir(DIR*) { return -1; }
>> > +#undef _GLIBCXX_HAVE_DIRFD
>> > +#undef _GLIBCXX_HAVE_UNLINKAT
>> > #endif
>> > } // namespace __gnu_posix
>> Yes, this fixes the PRU target, and does not regress
>> x86_64-pc-linux-gnu.
>>
>
> Thanks for checking it. I'm just testing it myself on
> powerpc64le-linux-gnu and will push when it finishes.
>
>
Sorry for the delay, that's pushed to trunk now.
commit 3d5f4f76e6db0895181ebca538748379bfe6058f
Author: Jonathan Wakely 
Date:   Tue Feb 8 21:05:30 2022

libstdc++: Fix directory iterator build for newlib

When building for newlib HAVE_OPENAT and HAVE_UNLINKAT are (sometimes?)
defined, but  is only included when HAVE_DIRENT_H is defined.
Since directory iterators are completely useless without <dirent.h>,
just override the HAVE_OPENAT and HAVE_UNLINKAT detection when we don't
have <dirent.h>.

libstdc++-v3/ChangeLog:

* src/filesystem/dir-common.h (_GLIBCXX_HAVE_DIRFD): Undefine
when <dirent.h> is not available.
(_GLIBCXX_HAVE_UNLINKAT):  Likewise.
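
The pattern can be sketched in isolation as follows. This is not the libstdc++ source: the DEMO_ macros, the demo_posix namespace and can_use_unlinkat are hypothetical names. The sketch only shows the idea of pairing the do-nothing stubs with an #undef of the feature macros that later code would otherwise trust.

```cpp
// Hypothetical configure results: the target claims unlinkat support
// but has no usable <dirent.h> (DEMO_HAVE_DIRENT_H is left undefined).
#define DEMO_HAVE_UNLINKAT 1

#ifdef DEMO_HAVE_DIRENT_H
# include <dirent.h>
#endif

namespace demo_posix {
#ifdef DEMO_HAVE_DIRENT_H
using ::DIR;
using ::dirent;
using ::opendir;
using ::readdir;
using ::closedir;
#else
// Stubs that always fail, so dependent code still compiles.
struct DIR { };
struct dirent { };
inline DIR* opendir(const char*) { return nullptr; }
inline dirent* readdir(DIR*) { return nullptr; }
inline int closedir(DIR*) { return -1; }
// Without real directory streams the *at() paths are pointless,
// so withdraw the feature macro here, next to the stubs.
# undef DEMO_HAVE_UNLINKAT
#endif
} // namespace demo_posix

// Later code keys off the (possibly withdrawn) feature macro.
inline bool can_use_unlinkat() {
#ifdef DEMO_HAVE_UNLINKAT
  return true;
#else
  return false;
#endif
}
```

With DEMO_HAVE_DIRENT_H undefined, can_use_unlinkat() reports false even though the configure-style macro was initially set, so no call to an undeclared function is ever compiled.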

diff --git a/libstdc++-v3/src/filesystem/dir-common.h b/libstdc++-v3/src/filesystem/dir-common.h
index 511b988f1c7..365fd527f4d 100644
--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -70,6 +70,8 @@ struct DIR { };
 inline DIR* opendir(const char*) { return nullptr; }
 inline dirent* readdir(DIR*) { return nullptr; }
 inline int closedir(DIR*) { return -1; }
+#undef _GLIBCXX_HAVE_DIRFD
+#undef _GLIBCXX_HAVE_UNLINKAT
 #endif
 } // namespace __gnu_posix
 


Re: [Patch, fortran] PR37336 (Finalization) - [F03] Finish derived-type finalization

2022-02-10 Thread Paul Richard Thomas via Gcc-patches
Hi Harald,


I have run your modified version of finalize_38.f90, and now I see
> that you can get a bloody head just from scratching too much...
>
> crayftn 12.0.2:
>
>   1,  3,  1
>
 It appears that Cray interpret a derived type constructor as being a
function call and so "6 If a specification expression in a scoping unit
references a function, the result is finalized before execution of the
executable constructs in the scoping unit."
A call to 'test' as the first statement might be useful to diagnose: call
test(2, 0, [0,0], -10)

>   2,  21,  0
>
21 is presumably the value left over from simple(21) but quite why it
should happen in this order is not apparent to me.

>   11,  3,  2
>
I am mystified as to why the finalization of 'var' is not occurring because
"1 When an intrinsic assignment statement is executed (10.2.1.3), if the
variable is not an unallocated allocatable variable, it is finalized after
evaluation of expr and before the definition of the variable." Note the
double negative! 'var' has been allocated and should return 1 to 'scalar'

>   12,  21,  1
>   21,  4,  3
>
This is a residue of earlier differences in the final count.

>   23,  21,  22 | 42,  43
>
The value is inexplicable to me.

  31,  6,  4
>   41,  7,  5
>   51,  9,  7
>   61,  10,  8
>   71,  13,  10
>   101,  2,  1
>
Once again, a function 'expr' finalization has been added after intrinsic
assignment; ie. derived type constructor == function.

>   102,  4,  3
>


>   111,  3,  2
>   121,  4,  2
>   122,  0,  4
>   123,  5,  6 | 2*0
>
From the value of 'array', I would divine that the source in the allocation
is being finalized as an array, whereas I would expect each invocation of
'simple' to generate a scalar final call.

>   131,  5,  2
>   132,  0,  4
>   133,  7,  8 | 2*0
>
The final count has increased by 1, as expected.  The value of 'scalar' is
stuck at 0, so the second line is explicable. The array value is explicable
if the finalization is of 'expr' and that 'var' is not finalized or the
finalization of 'var' is occurring after assignment; ie. wrong order.
***I notice from the code that even with the patch, gfortran is finalizing
before evaluation of 'expr', which is incorrect. It should be after
evaluation of 'expr' and before the assignment.***

  141,  6,  3
>
Final count offset - OK

  151,  10,  5
>
The two extra calls come, I presume from the source in the allocation.
Since the type is extended, we see two finalizations each for the
allocation and the deallocation.

  161,  16,  9
>
 I think that the extra two finalizations come from the evaluation of 'src'
in 'constructor2'.

  171,  18,  11
>
Final count offset - OK

  175,  0.,  20. | 10.,  20.
>
The value of 'rarray' is mystifying.

Conclusions from Cray:
(i) Determine if derived type constructors should be interpreted as
function calls.
(ii) The order of finalization in class array assignment needs to be
checked and fixed if necessary.

>
> nagfor 7.0:
>
>   1 0 1
>
"1 When an intrinsic assignment statement is executed (10.2.1.3), if the
variable is not an unallocated allocatable variable, it is finalized after
evaluation of expr and before the definition of the variable."   So I think
that NAG has this wrong, either because the timing is right and an
unallocated allocatable is being finalized or because the timing is wrong.

  11 1 2
>   23 21 22 | 42 43
>
It seems that the finalization is occurring after assignment.

  71 9 10
>   72 11 99
>
It seems that the finalization of the function 'expr' after assignment is
not happening.

  131 3 2
>   132 5 4
>
I am not sure that I know where the extra final call is nor where the
scalar value of 5 comes from.

  141 4 3
>   151 6 5
>   161 10 9
>   171 12 11
>
 The above are OK since there is an offset in the final count, starting at
131.

Conclusions from NAG:
(i) Some minor nits but pretty close to my interpretation.


Intel 2021.5.0:
>
>   131   3   2
>   132   0   4
>   133   5   6 |   0   0
>   141   4   3
>   151   7   5
>   152   3   0
>   153   0   0 |   1   3
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> [...]
>

ifort (IFORT) 2021.1 Beta 20201112 manages to carry on to the end.
 161  13   9
 162  20   0
 163   0   0 |  10  20
 171  14  11

Conclusions on ifort:
(i) The agreement between gfortran, with the patch applied, and ifort is
strongest of all the other brands;
(ii) The disagreements are all down to the treatment of the parent
component of arrays of extended types: gfortran finalizes the parent
component as an array, whereas ifort does a scalarization. I have a patch
ready to do likewise.

Overall conclusions:
(i) Sort out whether or not derived type constructors are 

[PATCH] df: Don't set bbs dirty because of debug insn moves [PR104459]

2022-02-10 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, we get -fcompare-debug failure, which is caused by
cfg_layout_merge_blocks successfully merging two bbs where both bbs
contained just CODE_LABEL, NOTE_INSN_BASIC_BLOCK and in the -g case both
some debug insns at the end.  cfg_layout_merge_blocks calls
update_bb_for_insn_chain which for the post-label insns in the second block
(except for BARRIERs) calls df_insn_change_bb.  This function changes
the bb of the insns and for notes just punts, but for other insns calls
df_set_bb_dirty.  Now the problem is that because there were only debug
insns and notes in the second block, df_set_bb_dirty is called on both
only in the -g case and not with -g0.  df_set_bb_dirty these days
sets both the BB_MODIFIED flag and marks the bb as dirty, and the former
is what 6 spots in cfgcleanup.cc use in code-generation decisions,
in this case
  may_thread |= (target->flags & BB_MODIFIED) != 0;
in particular.  So, with -g may_thread is true while with -g0 it is not
and we diverge from that point onwards.
I've thought about introducing df_set_bb_dirty_nondebug that wouldn't
set BB_MODIFIED but would mark the bb dirty, but then I went through
history and found changes like:
https://gcc.gnu.org/legacy-ml/gcc-patches/2010-10/msg00059.html
so I've also tried just not calling df_set_bb_dirty for debug insns
at all and it passed x86_64-linux and i686-linux
--enable-checking=yes,rtl,extra,df bootstraps/regtests, so perhaps
that works too.
Now that I look at it again, if we don't need those from %d to %d messages
for debug insns in the dump files, another way to fix it would be just to
change the very first line in the hunk from
  if (!INSN_P (insn))
to
  if (!NONDEBUG_INSN_P (insn))
Though, df_set_bb_dirty_nondebug which will do everything but
set bb->flags |= BB_MODIFIED is yet another option I can test.
Perhaps even that PR42889 was solely about those 6 decisions in cfgcleanup
(at that point it used df_get_bb_dirty) and not about actually the
recomputation of some of the problems causing different code generations.
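
To make the divergence described above concrete, here is a minimal standalone sketch. It does not use GCC's real data structures: every toy_* name is hypothetical, and the two booleans merely stand in for BB_MODIFIED and the dataflow dirty bit. It only shows why marking a block for a debug insn makes a may_thread-style decision differ between -g and -g0, and why skipping debug insns removes the difference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT GCC's real types: a block carries a "modified" flag
   (playing the role of BB_MODIFIED) and a separate dataflow-dirty bit.  */
enum toy_insn_kind { TOY_NOTE, TOY_DEBUG_INSN, TOY_REAL_INSN };

struct toy_bb
{
  bool modified;   /* consulted by code-generation decisions */
  bool df_dirty;   /* only triggers dataflow recomputation */
};

/* Models df_set_bb_dirty as described in the mail: sets both bits.  */
static void
toy_set_bb_dirty (struct toy_bb *bb)
{
  assert (bb);
  bb->modified = true;
  bb->df_dirty = true;
}

/* Models moving INSN into BB.  With SKIP_DEBUG false this is the old
   behaviour; with SKIP_DEBUG true, debug insns leave BB untouched.  */
static void
toy_insn_change_bb (enum toy_insn_kind insn, struct toy_bb *bb,
                    bool skip_debug)
{
  if (insn == TOY_NOTE)
    return;
  if (skip_debug && insn == TOY_DEBUG_INSN)
    return;
  toy_set_bb_dirty (bb);
}

/* Returns the "may_thread"-style decision after merging a block that
   holds only a note plus, when WITH_DEBUG, one debug insn.  */
static bool
toy_may_thread (bool with_debug, bool skip_debug)
{
  struct toy_bb bb = { false, false };
  toy_insn_change_bb (TOY_NOTE, &bb, skip_debug);
  if (with_debug)
    toy_insn_change_bb (TOY_DEBUG_INSN, &bb, skip_debug);
  return bb.modified;
}
```

Calling toy_may_thread (true, false) and toy_may_thread (false, false) gives different answers, which is the -fcompare-debug failure in miniature; with skip_debug set, both calls agree.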

2022-02-10  Jakub Jelinek  

PR rtl-optimization/104459
* df-scan.cc (df_insn_change_bb): Don't call df_set_bb_dirty when
moving DEBUG_INSNs between bbs.

* gcc.dg/pr104459.c: New test.

--- gcc/df-scan.cc.jj   2022-01-18 00:18:02.720744815 +0100
+++ gcc/df-scan.cc  2022-02-10 11:10:54.039135547 +0100
@@ -1769,13 +1769,15 @@ df_insn_change_bb (rtx_insn *insn, basic
   if (!INSN_P (insn))
     return;
 
-  df_set_bb_dirty (new_bb);
+  if (!DEBUG_INSN_P (insn))
+    df_set_bb_dirty (new_bb);
   if (old_bb)
     {
       if (dump_file)
         fprintf (dump_file, "  from %d to %d\n",
                  old_bb->index, new_bb->index);
-      df_set_bb_dirty (old_bb);
+      if (!DEBUG_INSN_P (insn))
+        df_set_bb_dirty (old_bb);
     }
   else
     if (dump_file)
--- gcc/testsuite/gcc.dg/pr104459.c.jj  2022-02-10 11:09:38.397181836 +0100
+++ gcc/testsuite/gcc.dg/pr104459.c 2022-02-10 11:09:16.049490953 +0100
@@ -0,0 +1,38 @@
+/* PR rtl-optimization/104459 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -funswitch-loops -fno-tree-dce -fcompare-debug -w" } */
+
+void
+foo (int x, int y)
+{
+  unsigned int a;
+
+  for (;;)
+{
+  short int *p = (short int *) 
+  unsigned int q = 0;
+
+  a /= 2;
+  if (a)
+   {
+ q -= y;
+ while (q)
+   ;
+   }
+
+  if (x)
+   {
+ for (q = 0; q != 1; q += 2)
+   {
+ unsigned int n;
+
+ n = *p ? 0 : q;
+ y += n < 1;
+
+ n = a || *p;
+ if (n % x == 0)
+   y /= x;
+   }
+   }
+}
+}

Jakub



Re: [PATCH] gfortran: Respect target's NO_DOT_IN_LABEL in trans-common.cc

2022-02-10 Thread Tobias Burnus

On 10.02.22 11:07, Roger Sayle wrote:


The fix is to tweak trans-common.cc to respect the target's NO_DOT_IN_LABEL
(and NO_DOLLAR_IN_LABEL) when generating internal equiv.%d symbols.


In general, I think the patch is okay – but as '_' is a valid identifier
and with -fdollar-ok '$' is valid as well, I wonder whether there should
be a prefix and/or a capital letter. Namely:

+#if !defined (NO_DOT_IN_LABEL)
+#define GFC_EQUIV_FMT "equiv.%d"
+#elif !defined (NO_DOLLAR_IN_LABEL)
+#define GFC_EQUIV_FMT "equiv$%d"
+#else
+#define GFC_EQUIV_FMT "equiv_%d"
+#endif

The first one we want to keep as is for backwards compatibility. And
the '.' is fine in this regard. But I think for the other two, we need
to do more, e.g., '_Equiv' + '$%d' / '_%d'. The '_' and capital letter
should place everything into the compiler namespace (in terms of C/C++)
and Fortran itself requires for a name (case insensitive, lower cased by
gfortran): [A-Z][A-Z0-9_]* – thus, '_E' contains two characters not
accessible in a normal Fortran program (ignoring bind(C,name="...").

If the _Equiv[_$]%d sounds sensible to you: Please change; if not,
what concerns do you have?
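
For illustration, the suggested naming could look roughly like the sketch below. This is only my reading of the proposal, not the committed code: the "_Equiv" spellings follow the suggestion above, and equiv_name is a hypothetical helper that does not exist in trans-common.cc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* NO_DOT_IN_LABEL / NO_DOLLAR_IN_LABEL are normally provided by the target
   headers; define them on the command line to try the other variants.  */
#if !defined (NO_DOT_IN_LABEL)
#define GFC_EQUIV_FMT "equiv.%d"      /* unchanged, for compatibility */
#elif !defined (NO_DOLLAR_IN_LABEL)
#define GFC_EQUIV_FMT "_Equiv$%d"     /* leading '_' and capital letter */
#else
#define GFC_EQUIV_FMT "_Equiv_%d"
#endif

/* Hypothetical helper (not in trans-common.cc): format the name for
   equivalence number SERIAL into BUF and return its length.  */
static int
equiv_name (char *buf, size_t size, int serial)
{
  int len = snprintf (buf, size, GFC_EQUIV_FMT, serial);
  assert (len > 0 && (size_t) len < size);
  return len;
}
```

On a target that allows dots, equiv_name (buf, sizeof buf, 0) still yields "equiv.0"; only the fallback spellings move out of the namespace a Fortran program can reach.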


Ok for mainline?

OK with changing the prefix for the non-dot variant.

Tobias


2022-02-10  Roger Sayle  

gcc/fortran/ChangeLog
  * trans-common.cc (GFC_EQUIV_FMT): New macro respecting the
  target's NO_DOT_IN_LABEL and NO_DOLLAR_IN_LABEL preferences.
  (build_equiv_decl): Use GFC_EQUIV_FMT here.


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Court of registration: 
München, HRB 106955


Re: [PATCH] ubsan: Separate -f{,no-}delete-null-pointer-checks from -fsanitize={null,{,returns-}nonnull-attribute} [PR104426]

2022-02-10 Thread Richard Biener via Gcc-patches
On Wed, 9 Feb 2022, Jakub Jelinek wrote:

> On Wed, Feb 09, 2022 at 03:41:23PM +0100, Richard Biener wrote:
> > On Wed, 9 Feb 2022, Jakub Jelinek wrote:
> > 
> > > On Wed, Feb 09, 2022 at 11:19:25AM +0100, Richard Biener wrote:
> > > > That does look like bogus abstraction though - I'd rather have
> > > > the target be specific w/o option checks and replace 
> > > > targetm.zero_address_valid uses with a wrapper (like you do for
> > > > flag_delete_null_pointer_checks), if we think that the specific
> > > > query should be adjusted by sanitize flags (why?) or
> > > > folding_initializer (why?).
> > > 
> > > Based on discussions on IRC, here is a WIP patch.
> > > 
> > > Unfortunately, there are 3 unresolved issues:
> > > 1) ipa-icf.cc uses
> > >   && opt_for_fn (decl, flag_delete_null_pointer_checks))
> > >there is a pointer type, but I guess we'd need to adjust the
> > >target hook to take a defaulted fndecl argument and use that
> > >for the options
> > 
> > Yeah, I'd use a struct function arg tho, not a decl.
> 
> But both opt_for_fn and sanitize_flags_p take a fndecl tree, not cfun.

Hmm, ok - then go for decl.

> > > 2) rtlanal.cc has:
> > > case SYMBOL_REF:
> > >   return flag_delete_null_pointer_checks && !SYMBOL_REF_WEAK (x);
> > >Is there any way how to find out address space of a SYMBOL_REF?
> > 
> > TYPE_ADDR_SPACE (TREE_TYPE (SYMBOL_REF_DECL ())) I guess.
> 
> And default to ADDR_SPACE_GENERIC if there is no SYMBOL_REF_DECL?
> That can work.

Yeah, alternatively the caller would need to pass down the MEM
since the address-space is only in the MEM attrs :/

> > >Or shall it hardcode ADDR_SPACE_GENERIC?
> > > 3) tree-ssa-structalias.cc has:
> > >   if ((TREE_CODE (t) == INTEGER_CST
> > >&& integer_zerop (t))
> > >   /* The only valid CONSTRUCTORs in gimple with pointer typed
> > >  elements are zero-initializer.  But in IPA mode we also
> > >  process global initializers, so verify at least.  */
> > >   || (TREE_CODE (t) == CONSTRUCTOR
> > >   && CONSTRUCTOR_NELTS (t) == 0))
> > > {
> > >   if (flag_delete_null_pointer_checks)
> > > temp.var = nothing_id;
> > >   else
> > > temp.var = nonlocal_id;
> > >   temp.type = ADDRESSOF;
> > >   temp.offset = 0;
> > >   results->safe_push (temp);
> > >   return;
> > > }
> > >not really sure where to get the address space from in that case
> > > 
> > > And perhaps I didn't do it right in some other spots too.
> > 
> > This case is really difficult since we track pointers through integers
> > (mind the missing POINTER_TYPE_P check above).  Of course we have
> > no idea what address-space the integer was converted from or will
> > be converted to so what the above wants to check is whether
> > there is _any_ address-space that could have a zero pointer pointing
> > to a valid object ...
> 
> Ugh.  So that would be ADDR_SPACE_ANY ((unsigned char) -1) and use that
> in the hook?
> But we'd penalize x86 through it because for the __seg_?s address spaces
> we allow 0 address...

Yes :/  Alternatively we can have PTA give up on non-default
address-space to/from non-pointer conversions which means not
to track points-to across such transitions.

Might be worth filing a tracking bug for this and leave things
in the current slightly broken state?  In this case it would
mean using ADDR_SPACE_GENERIC.  Also note that the specific
place can cobble up multiple fields and thus fields with
pointers to _different_ address-spaces ...

> > > --- gcc/targhooks.cc.jj   2022-01-18 11:58:59.919977242 +0100
> > > +++ gcc/targhooks.cc  2022-02-09 13:21:08.958835833 +0100
> > > @@ -1598,7 +1598,7 @@ default_addr_space_subset_p (addr_space_
> > >  bool
> > >  default_addr_space_zero_address_valid (addr_space_t as ATTRIBUTE_UNUSED)
> > >  {
> > > -  return false;
> > > +  return !flag_delete_null_pointer_checks_;
> > 
> > As said, I'd not do that, but check it in zero_address_valid only.
> > Otherwise all targets overriding the hook have to remember to check
> > this flag.  I suppose we'd then do
> > 
> >   if (option_set (flag_delete_null_pointer_check))
> > use flag_delete_null_pointer_check;
> >   else
> > use targetm.zero_address_valid;
> > 
> > possibly only for the default address-space.
> 
> The advantage of checking the option in the hook is that it can precisely
> decide what exactly it wants for each address space.  It can e.g. decide
> to ignore the flag and say that in some address space 0 is always valid or 0
> is never valid, or honor it under some conditions etc.
> Doing it outside of the hook means we do the decision globally, and either
> we hardcode targetm.addr_space.zero_address_valid || 
> !flag_delete_null_pointer_check_, or
> targetm.addr_space.zero_address_valid && !flag_delete_null_pointer_check_

As said I'm leaning towards documenting that 
-f[no-]delete-null-pointer-checks only has effects on the generic
address-space.  It's 

[PATCH] gfortran: Respect target's NO_DOT_IN_LABEL in trans-common.cc

2022-02-10 Thread Roger Sayle

This patch fixes 9 unexpected failures in the gfortran testsuite on
nvptx-none.  The issue is that gfortran's EQUIVALENCE internally uses
symbols such as "equiv.0" even on platforms that define NO_DOT_IN_LABEL.
On nvptx-none, this then results in the following error message(s):
ptxas application ptx input, fatal: Parsing error near '.0': syntax error
ptxas fatal   : Ptx assembly aborted due to errors

The fix is to tweak trans-common.cc to respect the target's NO_DOT_IN_LABEL
(and NO_DOLLAR_IN_LABEL) when generating internal equiv.%d symbols.
Only the nvptx, mmix and xtensa backends define NO_DOT_IN_LABEL which
explains why no-one has spotted/fixed this issue since the problematic
code was last changed back in 2005(!).

This patch has been tested on nvptx-none hosted by x86_64-pc-linux-gnu
with make and make -k check with no new failures, and the nine fewer
failures in the gfortran testsuite described above.  Ok for mainline?


2022-02-10  Roger Sayle  

gcc/fortran/ChangeLog
* trans-common.cc (GFC_EQUIV_FMT): New macro respecting the
target's NO_DOT_IN_LABEL and NO_DOLLAR_IN_LABEL preferences.
(build_equiv_decl): Use GFC_EQUIV_FMT here.


Thanks in advance,
Roger
--

diff --git a/gcc/fortran/trans-common.cc b/gcc/fortran/trans-common.cc
index 7b4d198..184a976 100644
--- a/gcc/fortran/trans-common.cc
+++ b/gcc/fortran/trans-common.cc
@@ -338,6 +338,13 @@ build_field (segment_info *h, tree union_type, record_layout_info rli)
   h->field = field;
 }
 
+#if !defined (NO_DOT_IN_LABEL)
+#define GFC_EQUIV_FMT "equiv.%d"
+#elif !defined (NO_DOLLAR_IN_LABEL)
+#define GFC_EQUIV_FMT "equiv$%d"
+#else
+#define GFC_EQUIV_FMT "equiv_%d"
+#endif
 
 /* Get storage for local equivalence.  */
 
@@ -356,7 +363,7 @@ build_equiv_decl (tree union_type, bool is_init, bool is_saved, bool is_auto)
   return decl;
 }
 
-  snprintf (name, sizeof (name), "equiv.%d", serial++);
+  snprintf (name, sizeof (name), GFC_EQUIV_FMT, serial++);
   decl = build_decl (input_location,
 VAR_DECL, get_identifier (name), union_type);
   DECL_ARTIFICIAL (decl) = 1;


Re: [PATCH] tree-optimization/104373 - early uninit diagnostic on unreachable code

2022-02-10 Thread Richard Biener via Gcc-patches
On Wed, 9 Feb 2022, Martin Sebor wrote:

> On 2/9/22 00:12, Richard Biener wrote:
> > On Tue, 8 Feb 2022, Jeff Law wrote:
> > 
> >>
> >>
> >> On 2/8/2022 12:03 AM, Richard Biener via Gcc-patches wrote:
> >>> The following improves early uninit diagnostics by computing edge
> >>> reachability using our value-numbering framework in its cheapest
> >>> mode and ignoring unreachable blocks when looking
> >>> for uninitialized uses.  To not ICE with -fdump-tree-all the
> >>> early uninit pass needs a dumpfile since VN tries to dump statistics.
> >>>
> >>> For gimple-match.c at -O0 -g this causes a 2% increase in compile-time.
> >>>
> >>> In theory all early diagnostic passes could benefit from a VN run but
> >>> that would require more refactoring that's not appropriate at this stage.
> >>> This patch addresses a GCC 12 diagnostic regression and also happens to
> >>> fix one XFAIL in gcc.dg/uninit-pr20644-O0.c
> >>>
> >>> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk?
> >>>
> >>> Thanks,
> >>> Richard.
> >>>
> >>> 2022-02-04  Richard Biener  
> >>>
> >>>   PR tree-optimization/104373
> >>>   * tree-ssa-sccvn.h (do_rpo_vn): New export exposing the
> >>>   walk kind.
> >>>   * tree-ssa-sccvn.cc (do_rpo_vn): Export, get the default
> >>>   walk kind as argument.
> >>>   (run_rpo_vn): Adjust.
> >>>   (pass_fre::execute): Likewise.
> >>>   * tree-ssa-uninit.cc (warn_uninitialized_vars): Skip
> >>>   blocks not reachable.
> >>>   (execute_late_warn_uninitialized): Mark all edges as
> >>>   executable.
> >>>   (execute_early_warn_uninitialized): Use VN to compute
> >>>   executable edges.
> >>>   (pass_data_early_warn_uninitialized): Enable a dump file,
> >>>   change dump name to warn_uninit.
> >>>
> >>>   * g++.dg/warn/Wuninitialized-32.C: New testcase.
> >>>   * gcc.dg/uninit-pr20644-O0.c: Remove XFAIL.
> >> I'm conflicted on this ;-)
> >>
> >> I generally lean on the side of eliminating false positives in these
> >> diagnostics.  So identifying unreachable blocks and using that to prune the
> >> set of warnings we generate, even at -O0 is good from that point of view.
> >>
> >> But doing something like this has many of the problems that relying on
> >> optimizations does, even if we don't optimize away the unreachable code.
> >> Right now the warning should be fairly stable at -O0 -- the set of
> >> diagnostics
> >> you get isn't going to change a lot release to release which is important
> >> to
> >> some users.   Second, at -O0 whether or not you get a warning isn't a
> >> function
> >> of how good our unreachable code analysis might be.
> >>
> >> This was quite a contentious topic many years ago.  So much that I dropped
> >> some work on Wuninit on the floor as I just couldn't take the arguing.  So
> >> be
> >> aware that you might be opening a can of worms.
> >>
> >> So the question comes down to a design decision.   What's more important to
> >> the end users?  Fewer false positives or better stability in the warning? 
> >> I
> >> think the former, but there's certainly been a vocal group that prefers the
> >> latter.
> > 
> > I see - I didn't think of this aspect at all but that means I have no
> > idea on whether it is important or not ...
> > 
> > In our own setup we're running into "instabilities" with optimization
> > when building packages that enable -Werror, so I can see shops doing
> > dev builds at -O0 with warnings and -Werror but drop -Werror for
> > optimized builds.
> > 
> >> On the implementation side I have zero concerns.    Looking further out,
> >> ISTM
> >> we could mark the blocks as unreachable (rather than deducing it from edge
> >> flags).  That would make it pretty easy to mark those blocks relatively
> >> early
> >> and allow us to suppress any middle end diagnostic occurring in an
> >> unreachable
> >> block.
> > 
> > So what I had in mind is that for the set of early diagnostic passes
> > 
> >PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
> >NEXT_PASS (pass_fixup_cfg);
> >NEXT_PASS (pass_build_ssa);
> >NEXT_PASS (pass_warn_printf);
> >NEXT_PASS (pass_warn_nonnull_compare);
> >NEXT_PASS (pass_early_warn_uninitialized);
> >NEXT_PASS (pass_warn_access, /*early=*/true);
> > 
> > we'd run VN and keep its lattice around (and not just the
> > EDGE_EXECUTABLE flags).  That would for example allow
> > pass_warn_nonnull_compare to see that in
> > 
> > void foo (void *p __attribute__((nonnull)))
> > {
> >void *q = p;
> >if (q)
> >  bar (q);
> > }
> > 
> > we are comparing a never NULL pointer.  Currently the q = p copy
> > makes it not realize this.  Likewise some constants can be
> > propagated this way.
> > 
> > Of course using the VN lattice means quite some changes in those
> > passes.  Even without the VN lattice having unreachable edges
> > marked could improve diagnostics for, say PHI nodes, if only
> > a single executable edge remains.
> > 
> > Martin, do you have any thoughts here?  Any opinion on 

Re: [PATCH] [PATCH, v4, 1/1, AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-02-10 Thread Richard Sandiford via Gcc-patches
Dan Li  writes:
> On 2/9/22 08:08, Richard Sandiford wrote:
>> Dan Li  writes:
>>> +
>>> +  /* When shadow call stack is enabled, the scs_pop in the epilogue will
>>> + restore x30, and we don't need to pop x30 again in the traditional
>>> + way.  Pop candidates record the registers that need to be popped
>>> + eventually.  */
>>> +  if (frame.is_scs_enabled)
>>> +{
>>> +  if (frame.wb_push_candidate2 == R30_REGNUM)
>>> +   frame.wb_pop_candidate2 = INVALID_REGNUM;
>>> +  else if (frame.wb_push_candidate1 == R30_REGNUM)
>>> +   frame.wb_pop_candidate1 = INVALID_REGNUM;
>> 
>> Although it makes no difference to the behaviour, I think it would be
>> clearer to use pop rather than push in the checks here.
>> 
>
> Got it.
>>> @@ -7885,8 +7914,8 @@ aarch64_save_callee_saves (poly_int64 start_offset,
>>> bool frame_related_p = aarch64_emit_cfi_for_reg_p (regno);
>>>   
>>> if (skip_wb
>>> - && (regno == cfun->machine->frame.wb_candidate1
>>> - || regno == cfun->machine->frame.wb_candidate2))
>>> + && (regno == cfun->machine->frame.wb_push_candidate1
>>> + || regno == cfun->machine->frame.wb_push_candidate2))
>>> continue;
>>>   
>>> if (cfun->machine->reg_is_wrapped_separately[regno])
>>> @@ -7996,8 +8025,8 @@ aarch64_restore_callee_saves (poly_int64 start_offset, unsigned start,
>>> rtx reg, mem;
>>>   
>>> if (skip_wb
>>> - && (regno == cfun->machine->frame.wb_candidate1
>>> - || regno == cfun->machine->frame.wb_candidate2))
>>> + && (regno == cfun->machine->frame.wb_push_candidate1
>>> + || regno == cfun->machine->frame.wb_push_candidate2))
>> 
>> Shouldn't this be using pop rather than push?
>> 
>
> There might be a little difference:
>
> - Using push candidates means that a register to be ignored in pop
> candidates will not be emitted again during the "restore" (pop_candidates
> should always be a subset of push_candidates, since popping a register
> without a push might not make sense).

The push candidates are simply a subset of the saved registers though.
Similarly, the pop candidates are simply a subset of the restored registers.
So I think the requirement operates at that level: the restored registers
must be a subset of the saved registers.

In other circumstances it could have been the other way around:
there might have been a change that stopped us from saving two
registers during the allocation, but we wanted to carry on restoring
two registers during the deallocation.  I don't think there's a
reason that the push candidates *have* to be a superset of the
pop candidates (even though they are with the current change).

> - Using pop candidates means that a register to be ignored in pop
> candidates will be re-emitted during the "restore". For example,
> if we specify to ignore the x20 register in pop:
>
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -7502,6 +7502,8 @@ aarch64_layout_frame (void)
>  frame.wb_pop_candidate1 = INVALID_REGNUM;
>   }
>   
> +  if (frame.wb_pop_candidate2 == R20_REGNUM)
> +   frame.wb_pop_candidate2 = INVALID_REGNUM;
> /* If candidate2 is INVALID_REGNUM, we need to adjust max_push_offset to
>256 to ensure that the offset meets the requirements of emit_move_insn.
>Similarly, if candidate1 is INVALID_REGNUM, we need to set
>
> With the test case:
>
> int main(void)
> {
>  __asm__ ("":::"x19", "x20");
>  return 0;
> }
>
> When we use "pop_candidate[12]", one more insn is emitted:
>
> 00400604 :
> 400604:   a9bf53f3    stp x19, x20, [sp, #-16]!
> 400608:   5280        mov w0, #0x0
> +  40060c:   f94007f4    ldr x20, [sp, #8]
> 400610:   f84107f3    ldr x19, [sp], #16
> 400614:   d65f03c0    ret
>
> But in the case of ignoring a specific register (like scs ignores x30),
> there is no difference between the two (because we always need
> to explicitly specify which registers to ignore in the parameter of
> aarch64_restore_callee_saves).

I think this is the correct behaviour.  If we don't want to restore
a register at all then it should be excluded from the restore list
somehow.  In your case you're doing that by using a limit of
X29_REGNUM instead of X30_REGNUM.
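
A toy model of that point, as a sketch only: the struct and the toy_* functions below are hypothetical and far simpler than the real aarch64 frame code. It counts the separate loads a restore loop would emit, skipping the writeback pop candidates and honouring an upper register limit.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_INVALID_REGNUM (-1)

/* Toy frame description, NOT aarch64's real struct: which registers were
   saved, and which two are restored by the writeback pop.  */
struct toy_frame
{
  bool saved[31];          /* x0..x30 */
  int wb_pop_candidate1;
  int wb_pop_candidate2;
};

/* Count the separate loads needed to restore the saved registers in
   [START, LIMIT], skipping those the writeback pop already handles.  */
static int
toy_count_restores (const struct toy_frame *frame, int start, int limit)
{
  int count = 0;
  assert (start >= 0 && limit <= 30);
  for (int regno = start; regno <= limit; regno++)
    {
      if (!frame->saved[regno])
        continue;
      if (regno == frame->wb_pop_candidate1
          || regno == frame->wb_pop_candidate2)
        continue;
      count++;
    }
  return count;
}

/* Build the x19/x20 frame from the quoted example; DROP_X20 clears the
   second pop candidate the way the experimental hunk does.  */
static struct toy_frame
toy_make_frame (bool drop_x20)
{
  struct toy_frame frame = { .wb_pop_candidate1 = 19,
                             .wb_pop_candidate2 = drop_x20
                                                  ? TOY_INVALID_REGNUM : 20 };
  frame.saved[19] = true;
  frame.saved[20] = true;
  return frame;
}
```

With x20 dropped from the pop candidates the loop emits one extra load, matching the extra ldr in the quoted disassembly; a register that should not be restored at all (x30 under scs) has to be kept out of the [start, limit] range instead.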

FWIW, I did wonder whether aarch64_restore_callee_saves should be
doing the scs pop, rather than aarch64_expand_epilogue, and in an
earlier draft of the previous review I'd asked for that.  It does
seem conceptually cleaner, but in practice, it would probably have
been awkward to implement.  E.g. we'd need to explicitly stop an
LDP being formed with X30 as the second register.

But treating scs push and scs pop as part of the register save and
restore sequences would have one advantage: it would allow the
scs push and scs pop to be shrink-wrapped.

Thanks,
Richard

> If pop looks better here, I'd like to change it to pop in the
> next 

[PATCH] combine: Fix ICE with substitution of CONST_INT into PRE_DEC argument [PR104446]

2022-02-10 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs, because combine substitutes
(insn 10 9 11 2 (set (reg/v:SI 7 sp [ a ])
(const_int 0 [0])) "pr104446.c":9:5 81 {*movsi_internal}
 (nil))
(insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [0  S4 A32])
(reg:SI 85)) "pr104446.c":10:3 56 {*pushsi2}
 (expr_list:REG_DEAD (reg:SI 85)
(expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil))))
forming
(insn 13 11 14 2 (set (mem/f:SI (pre_dec:SI (const_int 0 [0])) [0  S4 A32])
(reg:SI 85)) "pr104446.c":10:3 56 {*pushsi2}
 (expr_list:REG_DEAD (reg:SI 85)
(expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil))))
which is invalid RTL (pre_dec's argument must be a REG).
I know substitution creates various forms of invalid RTL and hopes that
invalid RTL just won't recog.
But unfortunately in this case we ICE before we get to recog, as
try_combine does:
  if (n_auto_inc)
{
  int new_n_auto_inc = 0;
  for_each_inc_dec (newpat, count_auto_inc, &new_n_auto_inc);

  if (n_auto_inc != new_n_auto_inc)
{
  if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Number of auto_inc expressions changed\n");
  undo_all ();
  return 0;
}
}
and for_each_inc_dec under the hood will do e.g. for the PRE_DEC case:
case PRE_DEC:
case POST_DEC:
  {
poly_int64 size = GET_MODE_SIZE (GET_MODE (mem));
rtx r1 = XEXP (x, 0);
rtx c = gen_int_mode (-size, GET_MODE (r1));
return fn (mem, x, r1, r1, c, data);
  }
and that code rightfully expects that the PRE_DEC operand has non-VOIDmode
(as it needs to be a REG) - gen_int_mode for VOIDmode results in ICE.
I think it is better not to emit the clearly invalid RTL during substitution
like we do for other cases, than to add workarounds for invalid IL
created by combine to rtlanal.cc and perhaps elsewhere.
As for the testcase, of course it is UB at runtime to modify sp that way,
but if such code is never reached, we must compile it, not ICE on it.
And I don't see why on other targets which use the autoinc rtxes much more
it couldn't happen with other registers.
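
The idea of refusing the substitution up front can be sketched with a toy expression tree. None of this is combine.cc code: the toy_* types and functions are hypothetical, operands are matched by pointer identity, and a false return stands in for returning a CLOBBER.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes, NOT GCC's rtx codes.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PRE_DEC, TOY_MEM };

struct toy_rtx
{
  enum toy_code code;
  struct toy_rtx *op;   /* single operand, NULL for leaves */
  int value;            /* register number or constant */
};

static bool
toy_autoinc_p (enum toy_code code)
{
  return code == TOY_PRE_DEC;
}

/* Replace FROM with TO inside X.  Returns false (leaving the failing node
   unchanged) when that would put a constant under an auto-inc code, which
   stands in for returning a CLOBBER in the real patch.  */
static bool
toy_subst (struct toy_rtx *x, struct toy_rtx *from, struct toy_rtx *to)
{
  assert (from != NULL && to != NULL);
  if (x == NULL || x->op == NULL)
    return true;
  if (x->op == from)
    {
      if (to->code == TOY_CONST_INT && toy_autoinc_p (x->code))
        return false;
      x->op = to;
      return true;
    }
  return toy_subst (x->op, from, to);
}
```

Substituting a constant for sp under a plain MEM succeeds, while the same substitution under PRE_DEC is rejected before any later walker gets to see a mode-less operand.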

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-10  Jakub Jelinek  

PR middle-end/104446
* combine.cc (subst): Don't substitute CONST_INTs into RTX_AUTOINC
operands.

* gcc.target/i386/pr104446.c: New test.

--- gcc/combine.cc.jj   2022-02-08 20:08:13.821404850 +0100
+++ gcc/combine.cc  2022-02-09 12:19:56.768294280 +0100
@@ -5534,6 +5534,12 @@ subst (rtx x, rtx from, rtx to, int in_d
  if (!x)
return gen_rtx_CLOBBER (VOIDmode, const0_rtx);
}
+ /* CONST_INTs shouldn't be substituted into PRE_DEC, PRE_MODIFY
+etc. arguments, otherwise we can ICE before trying to recog
+it.  See PR104446.  */
+ else if (CONST_SCALAR_INT_P (new_rtx)
+  && GET_RTX_CLASS (GET_CODE (x)) == RTX_AUTOINC)
+   return gen_rtx_CLOBBER (VOIDmode, const0_rtx);
  else
SUBST (XEXP (x, i), new_rtx);
}
--- gcc/testsuite/gcc.target/i386/pr104446.c.jj 2022-02-09 12:29:14.311505584 +0100
+++ gcc/testsuite/gcc.target/i386/pr104446.c 2022-02-09 12:28:35.329050754 +0100
@@ -0,0 +1,15 @@
+/* PR middle-end/104446 */
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -mrtd" } */
+
+register volatile int a __asm__("%esp");
+void foo (void *);
+void bar (void *);
+
+void
+baz (void)
+{
+  foo (__builtin_return_address (0));
+  a = 0;
+  bar (__builtin_return_address (0));
+}

Jakub



[PATCH] middle-end/104467 - fix vector extract simplification

2022-02-10 Thread Richard Biener via Gcc-patches
This fixes a bogus vector type used for a CTOR build as part of
vector extract simplification.  The code failed to consider a
CTOR of vector elements.
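
As a rough illustration of the arithmetic (a hypothetical helper, with count and k borrowed from the hunk below and their meaning taken from the ChangeLog wording): when each CTOR element is itself a vector with k lanes, the result needs count * k lanes rather than count.

```c
#include <assert.h>

/* Lanes in a vector built from COUNT constructor elements when each
   element itself supplies K scalar lanes (K is 1 for scalar elements).
   Both names are borrowed from the hunk; the helper is hypothetical.  */
static unsigned int
toy_result_lanes (unsigned int count, unsigned int k)
{
  assert (k >= 1);
  return count * k;
}
```

For example, two CTOR elements that are each a 4-lane vector need an 8-lane result type, which a bare count would undersize.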

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

2022-02-10  Richard Biener  

PR middle-end/104467
* match.pd (vector extract simplification): Multiply the
number of CTOR elements with the number of element elements.

* gcc.dg/torture/pr104467.c: New testcase.
---
 gcc/match.pd|  2 +-
 gcc/testsuite/gcc.dg/torture/pr104467.c | 11 +++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr104467.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 4fe590983f3..d9d83591045 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6943,7 +6943,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  TREE_TYPE (TREE_TYPE (ctor)))
 ? type
 : build_vector_type (TREE_TYPE (TREE_TYPE (ctor)),
- count));
+ count * k));
  res = (constant_p ? build_vector_from_ctor (evtype, vals)
 : build_constructor (evtype, vals));
}
diff --git a/gcc/testsuite/gcc.dg/torture/pr104467.c b/gcc/testsuite/gcc.dg/torture/pr104467.c
new file mode 100644
index 000..c3bfb60698a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr104467.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-mavx" { target x86_64-*-* i?86-*-* } } */
+
+unsigned long __attribute__((__vector_size__ (8 * sizeof (long)))) u;
+signed long __attribute__((__vector_size__ (8 * sizeof (long)))) s;
+
+void
+foo (void)
+{
+  s &= u + (0, 0);
+}
-- 
2.34.1


Re: Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'

2022-02-10 Thread Richard Biener via Gcc-patches
On Wed, Feb 9, 2022 at 2:21 PM Thomas Schwinge  wrote:
>
> Hi!
>
> OK to push (now, or in next development stage 1?) the attached
> "Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'",
> or should that be done differently -- or, per the current state (why?)
> not at all?
>
> This does work for my current debugging task, but I've not yet run
> 'make check' in case anything needs to be adjusted there.

Hmm, I wonder if we shouldn't simply dump DECL_UID as

 'uid NNN'

somewhere.  For example after or before DECL_NAME?

>
> Regards
>  Thomas
>
>


[PATCH][libgomp, nvptx] Add spinlock test-cases

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

Add spinlock test-cases for nvptx.

Strictly speaking, these are invalid OpenACC, because they're not guaranteed
to terminate.

But I've tested these without problems on cards from nvidia architectures
Kepler, Maxwell, Pascal and Turing (though on Turing, that's what you expect
given that it's explicitly supported).

These have been submitted separately, to make reverting easy in case of
problems.

Tested on x86_64 with nvptx accelerator.

Thomas, OK for the unusual OpenACC aspect of it?

Thanks,
- Tom

[libgomp, nvptx] Add spinlock test-cases

libgomp/ChangeLog:

2022-02-02  Tom de Vries  

* testsuite/libgomp.oacc-c/spin-lock-global-2.c: New test.
* testsuite/libgomp.oacc-c/spin-lock-global-3.c: New test.
* testsuite/libgomp.oacc-c/spin-lock-shared-2.c: New test.
* testsuite/libgomp.oacc-c/spin-lock-shared-3.c: New test.

---
 libgomp/testsuite/libgomp.oacc-c/spin-lock-global-2.c | 8 
 libgomp/testsuite/libgomp.oacc-c/spin-lock-global-3.c | 7 +++
 libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-2.c | 8 
 libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-3.c | 7 +++
 4 files changed, 30 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-c/spin-lock-global-2.c b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global-2.c
new file mode 100644
index 000..b6a8728cb42
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global-2.c
@@ -0,0 +1,8 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+
+/* Define to 0 to have a regular spinlock.
+   Makes the test-case invalid OpenACC: there's nothing that guarantees that
+   the program will terminate.  So, we only do this for nvptx.  */
+#define SPIN_CNT_MAX 0
+
+#include "spin-lock-global.c"
diff --git a/libgomp/testsuite/libgomp.oacc-c/spin-lock-global-3.c b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global-3.c
new file mode 100644
index 000..157384e4cb4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global-3.c
@@ -0,0 +1,7 @@
+/* As in spin-lock-global-2.c.  */
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+
+/* Also test without JIT optimization.  */
+/* { dg-set-target-env-var GOMP_NVPTX_JIT "-O0" } */
+
+#include "spin-lock-global-2.c"
diff --git a/libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-2.c b/libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-2.c
new file mode 100644
index 000..43e4686b841
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-2.c
@@ -0,0 +1,8 @@
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+
+/* Define to 0 to have a regular spinlock.
+   Makes the test-case invalid OpenACC: there's nothing that guarantees that
+   the program will terminate.  So, we only do this for nvptx.  */
+#define SPIN_CNT_MAX 0
+
+#include "spin-lock-shared.c"
diff --git a/libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-3.c b/libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-3.c
new file mode 100644
index 000..79f22f7ec4e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/spin-lock-shared-3.c
@@ -0,0 +1,7 @@
+/* As in spin-lock-global-2.c.  */
+/* { dg-do run { target openacc_nvidia_accel_selected } } */
+
+/* Also test without JIT optimization.  */
+/* { dg-set-target-env-var GOMP_NVPTX_JIT "-O0" } */
+
+#include "spin-lock-shared-2.c"


[PATCH][libgomp, openacc] Add terminating spinlock test-cases

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

The OpenACC execution model states that implementing a critical
section across workers using atomic operations and a busy-wait loop may never
succeed, since the scheduler may suspend the worker that owns the lock, in
which case the worker waiting on the lock can never complete.

Add a test-case that implements the next best thing: a spinlock using a
busy-wait loop that gives up after a certain number of tries.

This ensures termination, and makes the test-case a valid one, while still
exercising atomic exchange and atomic store.
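
As a rough host-side illustration (not taken from the patch), a limited-spin
lock along these lines could look like the sketch below.  The helper names
bounded_lock and unlock and the cap value are hypothetical; only the use of
atomic exchange for acquiring and atomic store for releasing follows the
description above.

```c
#include <stdbool.h>

/* Illustrative cap on acquisition attempts.  The macro name echoes the one
   used by the test-cases, but the value here is an assumption.  */
#define SPIN_CNT_MAX 1000u

/* Try to take the lock (0 = unlocked, ~0u = locked), giving up after
   SPIN_CNT_MAX attempts.  Returns true if the lock was acquired.  */
bool
bounded_lock (unsigned int *lock)
{
  for (unsigned int spin_cnt = 0; spin_cnt < SPIN_CNT_MAX; spin_cnt++)
    {
      unsigned int prev = __atomic_exchange_n (lock, ~0u, __ATOMIC_ACQUIRE);
      if (prev == 0)
        return true;
    }

  /* Gave up: the caller must cope with not holding the lock.  */
  return false;
}

/* Release the lock with an atomic store.  */
void
unlock (unsigned int *lock)
{
  __atomic_store_n (lock, 0u, __ATOMIC_RELEASE);
}
```

Because the loop is bounded, a waiter always returns, even if the owner of the
lock is never scheduled again.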

OK for trunk?

Thanks,
- Tom

[libgomp, openacc] Add terminating spinlock test-cases

libgomp/ChangeLog:

2022-02-02  Tom de Vries  

* testsuite/libgomp.oacc-c/spin-lock-global.c: New test.
* testsuite/libgomp.oacc-c/spin-lock-global.h: New test.
* testsuite/libgomp.oacc-c/spin-lock-shared.c: New test.
* testsuite/libgomp.oacc-c/spin-lock-shared.h: New test.

---
 .../testsuite/libgomp.oacc-c/spin-lock-global.c|  43 ++
 .../testsuite/libgomp.oacc-c/spin-lock-global.h| 169 +
 .../testsuite/libgomp.oacc-c/spin-lock-shared.c|  35 +
 .../testsuite/libgomp.oacc-c/spin-lock-shared.h| 135 
 4 files changed, 382 insertions(+)

diff --git a/libgomp/testsuite/libgomp.oacc-c/spin-lock-global.c 
b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global.c
new file mode 100644
index 000..0c1da9e842f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global.c
@@ -0,0 +1,43 @@
+#include 
+#include 
+#include 
+#include 
+
+enum memmodel
+  {
+MEMMODEL_RELAXED = 0,
+MEMMODEL_ACQUIRE = 2,
+MEMMODEL_RELEASE = 3,
+MEMMODEL_SEQ_CST = 5,
+  };
+
+#define TYPE unsigned int
+#define LOCKVAR1 lock_32_1
+#define LOCKVAR2 lock_32_2
+#define TESTS tests_32
+#include "spin-lock-global.h"
+#undef TYPE
+#undef LOCKVAR1
+#undef LOCKVAR2
+#undef TESTS
+
+#define TYPE unsigned long long int
+#define LOCKVAR1 lock_64_1
+#define LOCKVAR2 lock_64_2
+#define TESTS tests_64
+#include "spin-lock-global.h"
+#undef TYPE
+#undef LOCKVAR1
+#undef LOCKVAR2
+#undef TESTS
+
+#define N (7 * 1000)
+
+int
+main (void)
+{
+  tests_32 (N);
+  tests_64 (N);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c/spin-lock-global.h 
b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global.h
new file mode 100644
index 000..ea63fafccb9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c/spin-lock-global.h
@@ -0,0 +1,169 @@
+#define XSTR(S) STR (S)
+#define STR(S) #S
+
+#define PRINTF(...)\
+  {\
+printf (__VA_ARGS__);  \
+fflush (NULL); \
+  }
+
+#define DO_PRAGMA(x) _Pragma (#x)
+
+#ifndef SPIN_CNT_MAX
+/* Define to have limited-spin spinlock.
+   Ensures that the program will terminate.  */
+#define SPIN_CNT_MAX 0x8000U
+#endif
+
+#define TEST_1(N, LOCKVAR, VERIFY, N_GANGS, N_WORKERS) \
+  assert (N % N_GANGS == 0);   \
+   \
+  DO_PRAGMA (acc parallel  \
+num_gangs(N_GANGS) \
+num_workers(N_WORKERS) \
+copy (lock_cnt)\
+copy (spin_cnt_max_hit)\
+present (LOCKVAR)) \
+  {\
+TYPE unlocked = (TYPE)0;   \
+TYPE locked = ~unlocked;   \
+   \
+LOCKVAR = unlocked;
\
+   \
+unsigned int n_gangs   \
+  = __builtin_goacc_parlevel_size (GOMP_DIM_GANG); \
+   \
+DO_PRAGMA (acc loop worker)
\
+  for (unsigned int i = 0; i < N / n_gangs; i++)   \
+   {   \
+ TYPE res; \
+   \
+ unsigned int spin_cnt = 0;\
+ while (1) \
+   {   \
+ res = __atomic_exchange_n (, locked,  \
+MEMMODEL_ACQUIRE); 

[committed][nvptx] Handle sm_7x shared atomic store more optimal

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

For sm_7x atomic stores we fall back on expand_atomic_store, but this
results in using membar.sys for shared stores.

Fix this by adding an nvptx_atomic_store insn that adds a membar.cta for a
shared store.

Tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[nvptx] Handle sm_7x shared atomic store more optimal

gcc/ChangeLog:

2022-02-02  Tom de Vries  

* config/nvptx/nvptx.md (define_insn "nvptx_atomic_store"): New
define_insn.
(define_expand "atomic_store"): Use nvptx_atomic_store for
TARGET_SM70.
(define_c_enum "unspecv"): Add UNSPECV_ST.

gcc/testsuite/ChangeLog:

2022-02-02  Tom de Vries  

* gcc.target/nvptx/atomic-store-2.c: New test.

---
 gcc/config/nvptx/nvptx.md   | 22 +++--
 gcc/testsuite/gcc.target/nvptx/atomic-store-2.c | 26 +
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 1a283b41922..4c378ec6ecb 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -57,6 +57,7 @@ (define_c_enum "unspecv" [
UNSPECV_CAS
UNSPECV_CAS_LOCAL
UNSPECV_XCHG
+   UNSPECV_ST
UNSPECV_BARSYNC
UNSPECV_WARPSYNC
UNSPECV_UNIFORM_WARP_CHECK
@@ -2067,8 +2068,11 @@ (define_expand "atomic_store"
 }
 
   if (TARGET_SM70)
-/* Fall back to expand_atomic_store.  */
-FAIL;
+{
+   emit_insn (gen_nvptx_atomic_store (operands[0], operands[1],
+   operands[2]));
+   DONE;
+}
 
   bool maybe_shared_p = nvptx_mem_maybe_shared_p (operands[0]);
   if (!maybe_shared_p)
@@ -2081,6 +2085,20 @@ (define_expand "atomic_store"
   DONE;
 })
 
+(define_insn "nvptx_atomic_store"
+  [(set (match_operand:SDIM 0 "memory_operand" "+m") ;; memory
+   (unspec_volatile:SDIM
+[(match_operand:SDIM 1 "nvptx_nonmemory_operand" "Ri") ;; input
+ (match_operand:SI 2 "const_int_operand")] ;; model
+  UNSPECV_ST))]
+  "TARGET_SM70"
+  {
+const char *t
+  = "%.\tst%A0.b%T0\t%0, %1;";
+return nvptx_output_atomic_insn (t, operands, 0, 2);
+  }
+  [(set_attr "atomic" "true")])
+
 (define_insn "atomic_fetch_add"
   [(set (match_operand:SDIM 1 "memory_operand" "+m")
(unspec_volatile:SDIM
diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c 
b/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
new file mode 100644
index 000..cd5e4c38267
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-store-2.c
@@ -0,0 +1,26 @@
+/* Test the atomic store expansion for sm > sm_6x targets,
+   shared state space.  */
+
+/* { dg-do compile } */
+/* { dg-options "-misa=sm_75" } */
+
+enum memmodel
+{
+  MEMMODEL_SEQ_CST = 5
+};
+
+unsigned int u32 __attribute__((shared));
+unsigned long long int u64 __attribute__((shared));
+
+int
+main()
+{
+  __atomic_store_n (&u32, 0, MEMMODEL_SEQ_CST);
+  __atomic_store_n (&u64, 0, MEMMODEL_SEQ_CST);
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "st.shared.b32" 1 } } */
+/* { dg-final { scan-assembler-times "st.shared.b64" 1 } } */
+/* { dg-final { scan-assembler-times "membar.cta" 4 } } */


[committed][nvptx] Handle pre-sm_7x shared atomic store using atomic exchange

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

The ptx isa specifies (for pre-sm_7x) that atomic operations on shared memory
locations do not guarantee atomicity with respect to normal store instructions
to the same address.

This can be fixed by:
- inserting barriers between normal stores and atomic operations to a common
  address
- using atom.exch to store to locations accessed by other atomic operations.

It's not clearly spelled out which barriers are needed, and a barrier seems more
expensive than an atomic exchange.

Implement the pre-sm_7x shared atomic store using atomic exchange.

That includes stores using generic addressing, since those may also point to
shared memory.
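
For illustration only, the idea of storing by way of an exchange can be
sketched in portable C as below.  The function name store_via_exchange is
hypothetical, and this is a host-side model of the technique rather than the
nvptx expansion itself.

```c
/* Store VAL to *P using an atomic exchange and discard the previous value.
   Since the write is itself an atomic read-modify-write operation, it is
   ordered with respect to other atomic operations on the same location.  */
void
store_via_exchange (unsigned int *p, unsigned int val)
{
  (void) __atomic_exchange_n (p, val, __ATOMIC_SEQ_CST);
}
```

The old value is simply thrown away, which corresponds to the unused
temporary register in the expander.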

Tested on x86-64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[nvptx] Handle pre-sm_7x shared atomic store using atomic exchange

gcc/ChangeLog:

2022-02-02  Tom de Vries  

* config/nvptx/nvptx-protos.h (nvptx_mem_maybe_shared_p): Declare.
* config/nvptx/nvptx.cc (nvptx_mem_data_area): New static function.
(nvptx_mem_maybe_shared_p): New function.
* config/nvptx/nvptx.md (define_expand "atomic_store"): New
define_expand.

gcc/testsuite/ChangeLog:

2022-02-02  Tom de Vries  

* gcc.target/nvptx/atomic-store-1.c: New test.
* gcc.target/nvptx/atomic-store-3.c: New test.
* gcc.target/nvptx/stack-atomics-run.c: Update.

---
 gcc/config/nvptx/nvptx-protos.h|  1 +
 gcc/config/nvptx/nvptx.cc  | 22 
 gcc/config/nvptx/nvptx.md  | 30 ++
 gcc/testsuite/gcc.target/nvptx/atomic-store-1.c| 26 +++
 gcc/testsuite/gcc.target/nvptx/atomic-store-3.c| 25 ++
 gcc/testsuite/gcc.target/nvptx/stack-atomics-run.c |  6 -
 6 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx-protos.h b/gcc/config/nvptx/nvptx-protos.h
index a846e341917..0bf9af406a2 100644
--- a/gcc/config/nvptx/nvptx-protos.h
+++ b/gcc/config/nvptx/nvptx-protos.h
@@ -60,5 +60,6 @@ extern const char *nvptx_output_simt_exit (rtx);
 extern const char *nvptx_output_red_partition (rtx, rtx);
 extern const char *nvptx_output_atomic_insn (const char *, rtx *, int, int);
 extern bool nvptx_mem_local_p (rtx);
+extern bool nvptx_mem_maybe_shared_p (const_rtx);
 #endif
 #endif
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 1b0227a2c31..5b26c0f4c7d 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -76,6 +76,7 @@
 #include "intl.h"
 #include "opts.h"
 #include "tree-pretty-print.h"
+#include "rtl-iter.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -2787,6 +2788,27 @@ nvptx_print_operand_address (FILE *file, machine_mode 
mode, rtx addr)
   nvptx_print_address_operand (file, addr, mode);
 }
 
+static nvptx_data_area
+nvptx_mem_data_area (const_rtx x)
+{
+  gcc_assert (GET_CODE (x) == MEM);
+
+  const_rtx addr = XEXP (x, 0);
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, addr, ALL)
+if (SYMBOL_REF_P (*iter))
+  return SYMBOL_DATA_AREA (*iter);
+
+  return DATA_AREA_GENERIC;
+}
+
+bool
+nvptx_mem_maybe_shared_p (const_rtx x)
+{
+  nvptx_data_area area = nvptx_mem_data_area (x);
+  return area == DATA_AREA_SHARED || area == DATA_AREA_GENERIC;
+}
+
 /* Print an operand, X, to FILE, with an optional modifier in CODE.
 
Meaning of CODE:
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index cced68e0d4a..1a283b41922 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -2051,6 +2051,36 @@ (define_insn "atomic_exchange"
   }
   [(set_attr "atomic" "true")])
 
+(define_expand "atomic_store"
+  [(match_operand:SDIM 0 "memory_operand" "=m")  ;; memory
+   (match_operand:SDIM 1 "nvptx_nonmemory_operand" "Ri")  ;; input
+   (match_operand:SI 2 "const_int_operand")] ;; model
+  ""
+{
+  struct address_info info;
+  decompose_mem_address (&info, operands[0]);
+  if (info.base != NULL && REG_P (*info.base)
+  && REGNO_PTR_FRAME_P (REGNO (*info.base)))
+{
+  emit_insn (gen_mov (operands[0], operands[1]));
+  DONE;
+}
+
+  if (TARGET_SM70)
+/* Fall back to expand_atomic_store.  */
+FAIL;
+
+  bool maybe_shared_p = nvptx_mem_maybe_shared_p (operands[0]);
+  if (!maybe_shared_p)
+/* Fall back to expand_atomic_store.  */
+FAIL;
+
+  rtx tmpreg = gen_reg_rtx (mode);
+  emit_insn (gen_atomic_exchange (tmpreg, operands[0], operands[1],
+   operands[2]));
+  DONE;
+})
+
 (define_insn "atomic_fetch_add"
   [(set (match_operand:SDIM 1 "memory_operand" "+m")
(unspec_volatile:SDIM
diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-store-1.c 
b/gcc/testsuite/gcc.target/nvptx/atomic-store-1.c
new file mode 100644
index 000..cee3815eda5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-store-1.c
@@ -0,0 +1,26 @@
+/* Test the atomic store expansion for sm 

[PATCH] tree-optimization/104466 - fix cut error preventing alias disambiguation

2022-02-10 Thread Richard Biener via Gcc-patches
The following fixes a cut error in disambiguation using restrict
info.  Instead of using rbase1/rbase2, which are computed for this
purpose and preserve MEM_REF bases even when they are based on a decl,
the code performs the check on the bases that drop info for those ...

Bootstrapped and tested on x86_64-unknown-linux-gnu.

The patch helps address regressions in SPEC, so I am planning to
push it even though the bug itself is not a regression on the
specific testcase (a testcase which would "regress" in inlining
behavior could be viewed as a regression though).  I am also considering
pushing it for GCC 11.3 but likely not further.

Richard.

2022-02-10  Richard Biener  

PR tree-optimization/104466
* tree-ssa-alias.cc (refs_may_alias_p_2): Use rbase1/rbase2
for the MR_DEPENDENCE checks as intended.

* gfortran.dg/pr104466.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/pr104466.f90 | 116 +
 gcc/tree-ssa-alias.cc  |   8 +-
 2 files changed, 120 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr104466.f90

diff --git a/gcc/testsuite/gfortran.dg/pr104466.f90 
b/gcc/testsuite/gfortran.dg/pr104466.f90
new file mode 100644
index 000..ec0e45866be
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr104466.f90
@@ -0,0 +1,116 @@
+! { dg-do compile }
+! { dg-options "-std=legacy -O2 --param max-inline-insns-auto=0 --param 
max-inline-insns-single=0 -fdump-tree-lim2-details" }
+
+  MODULE mod_param
+integer, parameter :: Ngrids = 1
+integer, dimension(Ngrids) :: N
+END  
+  MODULE mod_forces
+TYPE T_FORCES
+  real, pointer :: sustr(:,:)
+  real, pointer :: svstr(:,:)
+  real, pointer :: bustr(:,:)
+  real, pointer :: bvstr(:,:)
+  real, pointer :: srflx(:,:)
+  real, pointer :: stflx(:,:,:)
+END TYPE 
+TYPE (T_FORCES), allocatable :: FORCES(:)
+  END  
+  MODULE mod_grid
+TYPE T_GRID
+  real, pointer :: f(:,:)
+  real, pointer :: Hz(:,:,:)
+  real, pointer :: z_r(:,:,:)
+  real, pointer :: z_w(:,:,:)
+END TYPE 
+TYPE (T_GRID), allocatable :: GRID(:)
+  END  
+  MODULE mod_scalars
+USE mod_param
+  END  
+  MODULE mod_mixing
+TYPE T_MIXING
+  integer,  pointer :: ksbl(:,:)
+  real, pointer :: Akv(:,:,:)
+  real, pointer :: Akt(:,:,:,:)
+  real, pointer :: alpha(:,:)
+  real, pointer :: beta(:,:)
+  real, pointer :: bvf(:,:,:)
+  real, pointer :: hsbl(:,:)
+  real, pointer :: ghats(:,:,:,:)
+END TYPE 
+TYPE (T_MIXING), allocatable :: MIXING(:)
+  END  
+  MODULE mod_ocean
+TYPE T_OCEAN
+  real, pointer :: pden(:,:,:)
+  real, pointer :: u(:,:,:,:)
+  real, pointer :: v(:,:,:,:)
+END TYPE 
+TYPE (T_OCEAN), allocatable :: OCEAN(:)
+  END  
+  MODULE lmd_skpp_mod
+  PRIVATE
+  PUBLIC  lmd_skpp
+  CONTAINS
+  SUBROUTINE lmd_skpp 
+  USE mod_forces
+  USE mod_grid
+  USE mod_mixing
+  USE mod_ocean
+  integer tile
+  integer UBi, UBj 
+  CALL lmd_skpp_tile (ng, tile, LBi, 
UBi, LBj, UBj,   &
+ IminS, ImaxS, JminS, JmaxS,   nstp0,  
   &
+ GRID(ng) % f, GRID(ng) % Hz,  
  &
+ GRID(ng) % z_r,   GRID(ng) % z_w, 
  &
+ OCEAN(ng) % u,OCEAN(ng) % v,  
  &
+ OCEAN(ng) % pden, FORCES(ng) % srflx, 
  &
+ FORCES(ng) % stflx,   FORCES(ng) % bustr, 
  &
+ FORCES(ng) % bvstr,   FORCES(ng) % sustr, 
  &
+ FORCES(ng) % svstr,   MIXING(ng) % alpha, 
  &
+ MIXING(ng) % beta,MIXING(ng) % bvf,   
  &
+ MIXING(ng) % ghats,   MIXING(ng) % Akt,   
  &
+ MIXING(ng) % Akv, MIXING(ng) % hsbl,  
  MIXING(ng) % ksbl)
+  END  
+  SUBROUTINE lmd_skpp_tile (ng, tile,   LBi, 
UBi, LBj, UBj, &
+ IminS, ImaxS, JminS, JmaxS, nstp, 
  f, Hz, z_r, z_w,&
+ u, v, pden, srflx, stflx, 
  bustr, bvstr, sustr, svstr, &
+ alpha,  beta, 
  bvf,  

[committed][nvptx] Workaround sub.u16 driver JIT bug

2022-02-10 Thread Tom de Vries via Gcc-patches
Hi,

There's an nvidia driver JIT bug that mishandles this code (minimized from
builtin-arith-overflow-15.c):
...
int main (void) {
  signed char r;
  unsigned char y = (unsigned char) 0x80;
  if (__builtin_sub_overflow ((unsigned char)0, (unsigned char)y, &r))
__builtin_abort ();
  return 0;
}
...
which at ptx level minimizes to:
...
  mov.u16 r22, 0x0080;
  st.local.u16 [frame_var],r22;
  ld.local.u16 r32,[frame_var];
  sub.u16 r33,0x0000,r32;
  cvt.u32.u16 r35,r33;
...
where we expect r35 == 0x0000ff80 but instead get 0xffffff80, and where using
nvptx-none-run -O0 fixes the problem.  [ See also
https://github.com/vries/nvidia-bugs/tree/master/builtin-arith-overflow-15 . ]

Try to work around the bug by using sub.s16 instead of sub.u16.
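
To make the expected arithmetic concrete, here is a small host-side model of
the sequence (illustration only).  The function names are hypothetical, and
the sign-extending variant is just one way a value with the upper bits set
could arise; it is not a claim about what the JIT actually does.

```c
#include <stdint.h>

/* Model of sub.u16 followed by cvt.u32.u16: a 16-bit wrap-around subtract,
   then zero-extension to 32 bits.  */
uint32_t
sub_u16_then_zext (uint16_t a, uint16_t b)
{
  uint16_t diff = (uint16_t) (a - b);
  return (uint32_t) diff;
}

/* Same subtract, but with the 16-bit result sign-extended instead.  The
   conversion to int16_t relies on GCC's modulo behaviour for out-of-range
   values.  */
uint32_t
sub_u16_then_sext (uint16_t a, uint16_t b)
{
  int16_t diff = (int16_t) (uint16_t) (a - b);
  return (uint32_t) (int32_t) diff;
}
```

With a == 0 and b == 0x80, the zero-extending model yields 0x0000ff80 and the
sign-extending one yields 0xffffff80.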

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx] Workaround sub.u16 driver JIT bug

gcc/ChangeLog:

2022-02-07  Tom de Vries  

PR target/97005
* config/nvptx/nvptx.md (define_insn "sub3"): Workaround
driver JIT bug by using sub.s16 instead of sub.u16.

---
 gcc/config/nvptx/nvptx.md | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index bb0c0b3b9a5..cced68e0d4a 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -506,7 +506,14 @@ (define_insn "sub3"
(minus:HSDIM (match_operand:HSDIM 1 "nvptx_register_operand" "R")
 (match_operand:HSDIM 2 "nvptx_register_operand" "R")))]
   ""
-  "%.\\tsub%t0\\t%0, %1, %2;")
+  {
+if (GET_MODE (operands[0]) == HImode)
+  /* Workaround https://developer.nvidia.com/nvidia_bug/3527713.
+See PR97005.  */
+  return "%.\\tsub.s16\\t%0, %1, %2;";
+
+return "%.\\tsub%t0\\t%0, %1, %2;";
+  })
 
 (define_insn "mul3"
   [(set (match_operand:HSDIM 0 "nvptx_register_operand" "=R")


Re: [PATCH] nvptx: Tweak constraints on copysign instructions.

2022-02-10 Thread Tom de Vries via Gcc-patches

On 2/8/22 14:09, Roger Sayle wrote:


Many thanks to Thomas Schwinge for confirming my hypothesis that the register
usage regression, PR target/104345, is solely due to libgcc's _muldc3 function.
In addition to the isinf functionality in the previously proposed nvptx patch at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588453.html which
significantly reduces the number of instructions in _muldc3, the patch below
further reduces both the number of instructions and the number of explicitly
declared registers, by permitting floating point constant immediate operands
in nvptx's copysign instruction.

Fingers crossed, the combination with all of the previously proposed nvptx
patches improves things.  Ultimately, increasing register usage from 50 to
51 registers, reducing the number of concurrent threads by ~2%, can easily
be countered if we're now executing significantly fewer instructions in each
kernel, for a net performance win.

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
with a "make" and "make -k check" with no new failures.  Ok for mainline?




LGTM, applied.

Thanks,
- Tom


2022-02-08  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (copysign3): Allow immediate
floating point constants as operands 1 and/or 2.

Thanks in advance,
Roger
--



Re: [PATCH] PR target/104345: Use nvptx "set" instruction for cond ? -1 : 0.

2022-02-10 Thread Tom de Vries via Gcc-patches

On 2/3/22 22:00, Roger Sayle wrote:


This patch addresses the "increased register pressure" regression on
nvptx-none caused by my change to transition the backend to a
STORE_FLAG_VALUE = 1 target.  This improved code generation for the
more common case of producing 0/1 Boolean values, but unfortunately
made things marginally worse when a 0/-1 mask value is desired.
Unfortunately, nvptx kernels are extremely sensitive to changes in
register usage, which was observable in the reported PR.

This patch provides optimizations for -(cond ? 1 : 0), effectively
simplifying this into cond ? -1 : 0, where these ternary operators are
provided by nvptx's selp instruction, and for the specific case of
SImode, using (restoring) nvptx's "set" instruction (which avoids
the need for a predicate register).
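
As a source-level illustration of the identity being exploited (the function
names are hypothetical, and this is not the RTL pattern from the patch):

```c
/* Mask built the STORE_FLAG_VALUE = 1 way: produce 0/1, then negate.  */
int
mask_via_negation (int cond)
{
  return -(cond ? 1 : 0);
}

/* The simplified form: a single select between -1 and 0.  */
int
mask_via_select (int cond)
{
  return cond ? -1 : 0;
}
```

Both functions return -1 for any nonzero cond and 0 otherwise, so the
negation can be folded into the select.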

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
with a "make" and "make -k check" with no new failures.  Unfortunately,
the exact register usage of a nvptx kernel depends upon the version of
the Cuda drivers being used (and the hardware), but I believe this
change should resolve the PR (for Thomas) by improving code generation
for the cases that regressed.  Ok for mainline?




LGTM, applied.

Thanks,
- Tom


2022-02-03  Roger Sayle  

gcc/ChangeLog
PR target/104345
* config/nvptx/nvptx.md (sel_true): Fix indentation.
(sel_false): Likewise.
(define_code_iterator eqne): New code iterator for EQ and NE.
(*selp_neg_): New define_insn_and_split to optimize
the negation of a selp instruction.
(*selp_not_): New define_insn_and_split to optimize
the bitwise not of a selp instruction.
(*setcc_int): Use set instruction for neg:SI of a selp.

gcc/testsuite/ChangeLog
PR target/104345
* gcc.target/nvptx/neg-selp.c: New test case.


Thanks in advance,
Roger
--



Re: [PATCH] nvptx: Fix and use BI mode logic instructions (e.g. and.pred).

2022-02-10 Thread Tom de Vries via Gcc-patches

On 1/16/22 12:49, Roger Sayle wrote:


This patch adds support for nvptx's BImode and.pred, or.pred and
xor.pred instructions.  Technically, nvptx.md previously defined
andbi3, iorbi3 and xorbi3 instructions, but the assembly language
mnemonic output for these was incorrect (e.g. and.b1) and would be
rejected by the ptxas assembler.


Thanks for catching and fixing that :) !


The most significant part of this
patch is the new define_split which teaches the compiler to actually
use these instructions when appropriate (exposing the latent bug above).

After https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587999.html
(still awaiting review/approval), the function:

int foo(int x, int y) { return (x==21) && (y==69); }

when compiled with -O2 produces:

 mov.u32 %r26, %ar0;
 mov.u32 %r27, %ar1;
 setp.eq.u32 %r31, %r26, 21;
 setp.eq.u32 %r34, %r27, 69;
 selp.u32%r37, 1, 0, %r31;
 selp.u32%r38, 1, 0, %r34;
 and.b32 %value, %r37, %r38;

with this patch we now save an extra instruction and generate:

 mov.u32 %r26, %ar0;
 mov.u32 %r27, %ar1;
 setp.eq.u32 %r31, %r26, 21;
 setp.eq.u32 %r34, %r27, 69;
 and.pred%r39, %r34, %r31;
 selp.u32%value, 1, 0, %r39;
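
The difference between the two sequences can be modelled in C roughly as
follows.  This is an illustration only; the function names are hypothetical
and the comments map each statement to the instruction it stands for.

```c
#include <stdbool.h>

/* Before: materialise each comparison as a 0/1 integer, then AND the
   integers.  */
int
foo_int_and (int x, int y)
{
  int a = (x == 21) ? 1 : 0;	/* setp + selp */
  int b = (y == 69) ? 1 : 0;	/* setp + selp */
  return a & b;			/* and.b32 */
}

/* After: AND the two predicates, then materialise the result once.  */
int
foo_pred_and (int x, int y)
{
  bool p = (x == 21);		/* setp */
  bool q = (y == 69);		/* setp */
  bool r = p & q;		/* and.pred */
  return r ? 1 : 0;		/* selp */
}
```

Both forms compute the same value; the second needs one select instead of
two.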

This patch has been tested (on top of the patch mentioned above) on
nvptx-none hosted on x86_64-pc-linux-gnu (including newlib) with a
make and make -k check with no new failures.  Ok for mainline?




LGTM, applied.

Thanks,
- Tom


2022-01-16  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (any_logic): Move code iterator earlier
in machine description.
(logic): Move code attribute earlier in machine description.
(ilogic): New code attribute, like logic but "ior" for IOR.
(and3, ior3, xor3): Delete. Replace with...
(3): New define_insn for HSDIM logic operations.
(bi3): New define_insn for BI mode logic operations.
(define_split): Lower logic operations from integer modes to
BI mode predicate operations.

gcc/testsuite/ChangeLog
* gcc.target/nvptx/bool-2.c: New test case for and.pred.
* gcc.target/nvptx/bool-3.c: New test case for or.pred.
* gcc.target/nvptx/bool-4.c: New test case for xor.pred.


Many thanks in advance.

Roger
--



Re: [PATCH] nvptx: Add support for 64-bit mul.hi (and other) instructions.

2022-02-10 Thread Tom de Vries via Gcc-patches

On 1/14/22 10:54, Roger Sayle wrote:


Now that the middle-end MULT_HIGHPART_EXPR pieces are in place, this
patch adds support for nvptx's mul.hi.s64 and mul.hi.u64 instructions,
as previously reviewed (provisionally pre-approved) back in August 2020:
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551373.html
Since then a few things have changed, so this patch uses the new
SMUL_HIGHPART and UMUL_HIGHPART RTX expressions, but the test cases
remain the same.  Like the x86_64 backend, this patch retains the
"trunc" forms of these instructions (while the RTL optimizers/combine
may still generate them).

Given that we're rapidly approaching stage 4, I also took the liberty
of including support in nvptx.md for a few other instructions.  With
the new 64-bit highpart multiplication instructions added above, we
can now provide a define_expand for efficient 64-bit (to 128-bit)
widening multiplications.  This patch also adds support for nvptx's
testp.infinite instruction (for implementing __builtin_isinf) and
the not.pred instruction.
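
For readers less familiar with highpart multiplication, a host-side sketch in
C follows.  It assumes a compiler that provides the __int128 extension (as GCC
does on x86_64) and arithmetic right shift for signed values; the function
names are hypothetical.

```c
#include <stdint.h>

/* Upper 64 bits of the unsigned 128-bit product.  */
uint64_t
umul_hi64 (uint64_t a, uint64_t b)
{
  return (uint64_t) (((unsigned __int128) a * b) >> 64);
}

/* Upper 64 bits of the signed 128-bit product.  */
int64_t
smul_hi64 (int64_t a, int64_t b)
{
  return (int64_t) (((__int128) a * b) >> 64);
}

/* A 64-bit to 128-bit widening multiply composed from the low part (an
   ordinary 64-bit multiply) and the high part.  */
unsigned __int128
umul_wide64 (uint64_t a, uint64_t b)
{
  uint64_t lo = a * b;
  uint64_t hi = umul_hi64 (a, b);
  return ((unsigned __int128) hi << 64) | lo;
}
```

The last function shows why the highpart instructions make an efficient
widening multiply possible: the result is one low multiply plus one high
multiply.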

As an example of the code generation improvements, the function
int foo(double x) { return __builtin_isinf(x); }
previously generated with -O2:

 mov.f64 %r26, %ar0;
 abs.f64 %r28, %r26;
 setp.leu.f64%r31, %r28, 0d7fef;
 selp.u32%r30, 1, 0, %r31;
 mov.u32 %r29, %r30;
 cvt.u16.u8  %r35, %r29;
 mov.u16 %r33, %r35;
 xor.b16 %r32, %r33, 1;
 cvt.u32.u16 %r34, %r32;
 cvt.u32.u8  %value, %r34;

and with this patch now generates:

 mov.f64 %r23, %ar0;
 testp.infinite.f64  %r24, %r23;
 selp.u32%value, 1, 0, %r24;
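
At the source level the two sequences correspond roughly to the following
(illustration only, with hypothetical function names; it assumes default
floating-point semantics, i.e. no -ffast-math):

```c
#include <float.h>

/* Old sequence: take the absolute value and compare it against the largest
   finite double.  The comparison is false for NaN, so NaN is not reported
   as infinite.  */
int
isinf_via_compare (double x)
{
  return __builtin_fabs (x) > DBL_MAX;
}

/* New sequence: a direct infinity test, normalised to 0/1.  */
int
isinf_via_builtin (double x)
{
  return __builtin_isinf (x) != 0;
}
```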

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
(including newlib) with a make and make -k check with no new failures.
Ok for mainline?




LGTM, applied.

Thanks,
- Tom


2022-01-14  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (UNSPEC_ISINF): New UNSPEC.
(one_cmplbi2): New define_insn for not.pred.
(mulditi3): New define_expand for signed widening multiply.
(umulditi3): New define_expand for unsigned widening multiply.
(smul3_highpart): New define_insn for signed highpart mult.
(umul3_highpart): New define_insn for unsigned highpart mult.
(*smulhi3_highpart_2): Renamed from smulhi3_highpart.
(*smulsi3_highpart_2): Renamed from smulsi3_highpart.
(*umulhi3_highpart_2): Renamed from umulhi3_highpart.
(*umulsi3_highpart_2): Renamed from umulsi3_highpart.
(*setcc_from_not_bi): New define_insn.
(*setcc_isinf): New define_insn for testp.infinite.
(isinf2): New define_expand.

gcc/testsuite/ChangeLog
* gcc.target/nvptx/mul-hi64.c: New test case.
* gcc.target/nvptx/umul-hi64.c: New test case.
* gcc.target/nvptx/mul-wide64.c: New test case.
* gcc.target/nvptx/umul-wide64.c: New test case.
* gcc.target/nvptx/isinf.c: New test case.


Thanks in advance,
Roger
--



Re: [PATCH] nvptx: Expand QI mode operations using SI mode instructions.

2022-02-10 Thread Tom de Vries via Gcc-patches

On 1/10/22 11:58, Roger Sayle wrote:


One of the unusual target features of the Nvidia PTX ISA is that it
doesn't provide QI mode (byte sized) operations or registers. 


[ FWIW: I recently happened to check this, and it actually supports 
.u8/.s8/.b8 regs, but indeed just for very few operations: ld/st/cvt. ]



Somewhat
conventionally, 8-bit quantities are read from/written to memory using
special instructions, but stored internally using SImode (32-bit) registers.
GCC's middle-end accomodates targets without QImode optabs, by widening
operations until suitable support is found, and with the current nvptx
backend this means 16-bit HImode operations.  The inconvenience is that
nvptx is also a TARGET_TRULY_NOOP_TRUNCATION=false target, meaning that
additional instructions are required to convert between the SImode
registers used to hold QImode values, and the HImode registers used to
operate on them (and back again).  This results in a large amount of
shuffling and type conversion in code dealing with bytes, i.e. using
char or Boolean types.

This patch improves the situation by providing expanders in the nvptx
machine description to perform QImode operations natively in SImode
instead of HImode.  An alternate implementation might be to provide
some form of target hook to specify which fallback modes to use during
RTL expansion, but I think this requirement is unusual, and a solution
entirely in the nvptx backend doesn't disturb/affect other targets.
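
The reason this is safe can be illustrated in C (a sketch with hypothetical
function names, not code from the patch): the low 8 bits of a sum do not
depend on how wide the intermediate arithmetic is.

```c
#include <stdint.h>

/* Byte addition carried out in a 16-bit intermediate (the HImode route).  */
uint8_t
add_qi_via_hi (uint8_t a, uint8_t b)
{
  uint16_t wide = (uint16_t) ((uint16_t) a + (uint16_t) b);
  return (uint8_t) wide;
}

/* Byte addition carried out in a 32-bit intermediate (the SImode route).  */
uint8_t
add_qi_via_si (uint8_t a, uint8_t b)
{
  uint32_t wide = (uint32_t) a + (uint32_t) b;
  return (uint8_t) wide;
}
```

Since the value already lives in a 32-bit register, the second form avoids
the conversions to and from the 16-bit register class.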

The improvements can be quite dramatic, as shown in the example below:

int foo(int x, int y) { return (x==21) && (y==69); }

previously with -O2 required 15 instructions:

 mov.u32 %r26, %ar0;
 mov.u32 %r27, %ar1;
 setp.eq.u32 %r31, %r26, 21;
 selp.u32%r30, 1, 0, %r31;
 mov.u32 %r29, %r30;
 setp.eq.u32 %r34, %r27, 69;
 selp.u32%r33, 1, 0, %r34;
 mov.u32 %r32, %r33;
 cvt.u16.u8  %r39, %r29;
 mov.u16 %r36, %r39;
 cvt.u16.u8  %r39, %r32;
 mov.u16 %r37, %r39;
 and.b16 %r35, %r36, %r37;
 cvt.u32.u16 %r38, %r35;
 cvt.u32.u8  %value, %r38;

with this patch, now requires only 7 instructions:

 mov.u32 %r26, %ar0;
 mov.u32 %r27, %ar1;
 setp.eq.u32 %r31, %r26, 21;
 setp.eq.u32 %r34, %r27, 69;
 selp.u32%r37, 1, 0, %r31;
 selp.u32%r38, 1, 0, %r34;
 and.b32 %value, %r37, %r38;


This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
(including newlib) with a make and make -k check with no new failures.
Ok for mainline?




LGTM, applied.

Thanks,
- Tom


2022-01-10  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (cmp): Renamed from *cmp.
(setcc_from_bi): Additionally support QImode.
(extendbi2): Additionally support QImode.
(zero_extendbi2): Additionally support QImode.
(any_sbinary, any_ubinary, any_sunary, any_uunary): New code
iterators for signed and unsigned, binary and unary operations.
(qi3, qi3, qi2, qi2): New
expanders to perform QImode operations using SImode instructions.
(cstoreqi4): New define_expand.
(*ext_truncsi2_qi): New define_insn.
(*zext_truncsi2_qi): New define_insn.

gcc/testsuite/ChangeLog
* gcc.target/nvptx/bool-1.c: New test case.


Thanks in advance,
Roger
--



Re: [PATCH] RISC-V: Add target machine headers as a dependency for riscv-sr.o

2022-02-10 Thread Kito Cheng via Gcc-patches
Hi Maciej:

OK for release branches, thanks!

On Tue, Feb 8, 2022 at 8:24 PM Maciej W. Rozycki  wrote:
>
> On Mon, 7 Feb 2022, Kito Cheng wrote:
>
> > OK to trunk, thanks for fixing this issue, I hit that issue before but
> > I didn't figure out what happened...since that issue will disappear
> > when I clean build :p
>
>  Committed to trunk now, thanks for your review.
>
>  How about release branches?
>
>   Maciej


Re: [PATCH] nvptx: Improved support for HFMode including neghf2 and abshf2.

2022-02-10 Thread Tom de Vries via Gcc-patches

On 1/8/22 13:21, Roger Sayle wrote:


This patch adds more support for _Float16 (HFmode) to the nvptx backend.
Currently negation, absolute value and floating point comparisons are
implemented by promoting to float (SFmode).  This patch adds suitable
define_insns to nvptx.md, most conditional on TARGET_SM53 (-misa=sm_53).
This patch also adds support for HFmode fused multiply-add.

One subtlety is that neghf2 and abshf2 are implemented by (HImode)
bit manipulation operations to update the sign bit.  The NVidia PTX
ISA documentation for neg.f16 and abs.f16 contains the caution
"Future implementations may comply with the IEEE 754 standard by preserving
the (NaN) payload and modifying only the sign bit".  Given the availability
of suitable replacements, I thought it best to provide IEEE 754 compliant
implementations.  If anyone observes a performance penalty from this
choice I'm happy to provide a -ffast-math variant (or revisit this
decision).
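
The bit manipulation can be pictured in C as below.  This is an illustration
on the raw 16-bit encoding of an IEEE 754 binary16 value, where the sign is
the top bit; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Sign bit of an IEEE 754 binary16 value held in a 16-bit integer.  */
#define HF_SIGN_BIT 0x8000u

/* Negate by flipping only the sign bit.  */
uint16_t
neg_hf_bits (uint16_t bits)
{
  return (uint16_t) (bits ^ HF_SIGN_BIT);
}

/* Absolute value by clearing only the sign bit.  */
uint16_t
abs_hf_bits (uint16_t bits)
{
  return (uint16_t) (bits & ~HF_SIGN_BIT);
}
```

Because only the top bit changes, the exponent and mantissa (and hence any
NaN payload) are preserved.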

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
(including newlib) with a make and make -k check with no new failures.
Ok for mainline?



LGTM, applied.

Thanks,
- Tom



2022-01-08  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (*cmpf): New define_insn.
(cstorehf4): New define_expand.
(fmahf4): New define_insn.
(neghf2): New define_insn.
(abshf2): New define_insn.

gcc/testsuite/ChangeLog
* gcc.target/nvptx/float16-3.c: New test case for neghf2.
* gcc.target/nvptx/float16-4.c: New test case for abshf2.
* gcc.target/nvptx/float16-5.c: New test case for fmahf4.
* gcc.target/nvptx/float16-6.c: New test case.


Thanks in advance,
Roger
--



[committed] wwwdocs: frontends: Adjust the Mercury reference

2022-02-10 Thread Gerald Pfeifer
---
 htdocs/frontends.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/frontends.html b/htdocs/frontends.html
index cd67b089..748ca182 100644
--- a/htdocs/frontends.html
+++ b/htdocs/frontends.html
@@ -32,7 +32,7 @@ are very mature.
 http://www.gnu-pascal.de/gpc/h-index.html;>GNU Pascal 
Compiler (GPC).
 
 http://www.mercurylang.org/download/gcc-backend.html;>Mercury,
+href="https://mercurylang.org/download/gcc-backend.html;>Mercury,
 a declarative logic/functional language. The University of Melbourne Mercury
 compiler is written in Mercury; originally it compiled via C but now it also
 has a back end that generates assembler directly, using the GCC back end.
-- 
2.35.1


[committed] doc: Tweak the www.bitwizard.nl reference

2022-02-10 Thread Gerald Pfeifer
gcc:
* doc/install.texi (Specific): Change the www.bitwizard.nl
reference to use https.
---
 gcc/doc/install.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 93eae1f2582..7258f9def6c 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -4189,7 +4189,7 @@ See @uref{https://gcc.gnu.org/PR10877,,bug 10877} for 
more information.
 
 If you receive Signal 11 errors when building on GNU/Linux, then it is
 possible you have a hardware problem.  Further information on this can be
-found on @uref{http://www.bitwizard.nl/sig11/,,www.bitwizard.nl}.
+found on @uref{https://www.bitwizard.nl/sig11/,,www.bitwizard.nl}.
 
 @html
 
-- 
2.35.1