date:20230602

Re: [Patch, fortran] PR37336 finalization

2023-06-02 Thread Thomas Koenig via Gcc-patches


Hi Paul,


I propose to backport
r13-6747-gd7caf313525a46f200d7f5db1ba893f853774aee to 12-branch very
soon.


Is this something that we usually do?

While finalization was basically broken before, some people still used
working subsets (or subsets that were broken, and they adapted or
wrote their code accordingly).

What is the general opinion on that?  I'm undecided.


Before that, I propose to remove the F2003/2008 finalization of
structure and array constructors in 13- and 14-branches. I can see why
it was removed from the standard in a correction to F2008 and think
that it is likely to cause endless confusion and maintenance
complications. However, finalization of function results within
constructors will be retained.


That, I agree with.  Should it be noted somewhere as an intentional
deviation from the standard?

Best regards

Thomas

Re: [PATCH, committed] Fortran: fix diagnostics for SELECT RANK [PR100607]

2023-06-02 Thread Paul Richard Thomas via Gcc-patches

Hi Harald,

It looks good to me. Thanks to you and Steve for the fix. I suggest
that it is such an obvious one that it deserves back-porting.

Cheers

Paul

On Fri, 2 Jun 2023 at 19:06, Harald Anlauf via Fortran
 wrote:
>
> Dear all,
>
> I've committed the attached simple patch on behalf of Steve
> after discussion in the PR and regtesting on x86_64-pc-linux-gnu.
>
> It fixes a duplicate error message and an ICE.
>
> Pushed as r14-1505-gfae09dfc0e6bf4cfe35d817558827aea78c6426f .
>
> Thanks,
> Harald
>


-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

Re: [pushed] analyzer: implement various atomic builtins [PR109015]

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

Hi David,

The new test ICEs the compiler on aarch64-linux-gnu [1].  Would you please 
investigate?

Running gcc:gcc.dg/analyzer/analyzer.exp ...
FAIL: gcc.dg/analyzer/atomic-builtins-qemu-sockets.c (internal compiler error: 
in validate, at analyzer/store.cc:1329)
FAIL: gcc.dg/analyzer/atomic-builtins-qemu-sockets.c (test for excess errors)

This is a simple native build on aarch64-linux-gnu.  Please let me know if you 
need any help in reproducing this.

[1] 
https://ci.linaro.org/job/tcwg_gcc_check--master-aarch64-build/82/artifact/artifacts/artifacts.precommit/06-check_regression/results.compare/*view*/

Thanks!

--
Maxim Kuvyrkov
https://www.linaro.org




> On Jun 2, 2023, at 17:32, David Malcolm via Gcc-patches 
>  wrote:
> 
> This patch implements many of the __atomic_* builtins from
> sync-builtins.def as known_function subclasses within the analyzer.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> Pushed to trunk as r14-1497-gef768035ae8090.
> 
> gcc/analyzer/ChangeLog:
> PR analyzer/109015
> * kf.cc (class kf_atomic_exchange): New.
> (class kf_atomic_exchange_n): New.
> (class kf_atomic_fetch_op): New.
> (class kf_atomic_op_fetch): New.
> (class kf_atomic_load): New.
> (class kf_atomic_load_n): New.
> (class kf_atomic_store_n): New.
> (register_atomic_builtins): New function.
> (register_known_functions): Call register_atomic_builtins.
> 
> gcc/testsuite/ChangeLog:
> PR analyzer/109015
> * gcc.dg/analyzer/atomic-builtins-1.c: New test.
> * gcc.dg/analyzer/atomic-builtins-haproxy-proxy.c: New test.
> * gcc.dg/analyzer/atomic-builtins-qemu-sockets.c: New test.
> * gcc.dg/analyzer/atomic-types-1.c: New test.
> ---
> gcc/analyzer/kf.cc| 355 
> .../gcc.dg/analyzer/atomic-builtins-1.c   | 544 ++
> .../analyzer/atomic-builtins-haproxy-proxy.c  |  55 ++
> .../analyzer/atomic-builtins-qemu-sockets.c   |  18 +
> .../gcc.dg/analyzer/atomic-types-1.c  |  11 +
> 5 files changed, 983 insertions(+)
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/atomic-builtins-1.c
> create mode 100644 
> gcc/testsuite/gcc.dg/analyzer/atomic-builtins-haproxy-proxy.c
> create mode 100644 
> gcc/testsuite/gcc.dg/analyzer/atomic-builtins-qemu-sockets.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/atomic-types-1.c
> 
> diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
> index 93c46630f36..104499e 100644
> --- a/gcc/analyzer/kf.cc
> +++ b/gcc/analyzer/kf.cc
> @@ -69,6 +69,235 @@ kf_alloca::impl_call_pre (const call_details &cd) const
>   cd.maybe_set_lhs (ptr_sval);
> }
> 
> +/* Handler for:
> +   void __atomic_exchange (type *ptr, type *val, type *ret, int memorder).  
> */
> +
> +class kf_atomic_exchange : public internal_known_function
> +{
> +public:
> +  /* This is effectively:
> +   *RET = *PTR;
> +   *PTR = *VAL;
> +  */
> +  void impl_call_pre (const call_details &cd) const final override
> +  {
> +const svalue *ptr_ptr_sval = cd.get_arg_svalue (0);
> +tree ptr_ptr_tree = cd.get_arg_tree (0);
> +const svalue *val_ptr_sval = cd.get_arg_svalue (1);
> +tree val_ptr_tree = cd.get_arg_tree (1);
> +const svalue *ret_ptr_sval = cd.get_arg_svalue (2);
> +tree ret_ptr_tree = cd.get_arg_tree (2);
> +/* Ignore the memorder param.  */
> +
> +region_model *model = cd.get_model ();
> +region_model_context *ctxt = cd.get_ctxt ();
> +
> +const region *val_region
> +  = model->deref_rvalue (val_ptr_sval, val_ptr_tree, ctxt);
> +const svalue *star_val_sval = model->get_store_value (val_region, ctxt);
> +const region *ptr_region
> +  = model->deref_rvalue (ptr_ptr_sval, ptr_ptr_tree, ctxt);
> +const svalue *star_ptr_sval = model->get_store_value (ptr_region, ctxt);
> +const region *ret_region
> +  = model->deref_rvalue (ret_ptr_sval, ret_ptr_tree, ctxt);
> +model->set_value (ptr_region, star_val_sval, ctxt);
> +model->set_value (ret_region, star_ptr_sval, ctxt);
> +  }
> +};
> +
> +/* Handler for:
> +   __atomic_exchange_n (type *ptr, type val, int memorder).  */
> +
> +class kf_atomic_exchange_n : public internal_known_function
> +{
> +public:
> +  /* This is effectively:
> +   RET = *PTR;
> +   *PTR = VAL;
> +   return RET;
> +  */
> +  void impl_call_pre (const call_details &cd) const final override
> +  {
> +const svalue *ptr_sval = cd.get_arg_svalue (0);
> +tree ptr_tree = cd.get_arg_tree (0);
> +const svalue *set_sval = cd.get_arg_svalue (1);
> +/* Ignore the memorder param.  */
> +
> +region_model *model = cd.get_model ();
> +region_model_context *ctxt = cd.get_ctxt ();
> +
> +const region *dst_region = model->deref_rvalue (ptr_sval, ptr_tree, 
> ctxt);
> +const svalue *ret_sval = model->get_store_value (dst_region, ctxt);
> +model->set_value (dst_region, set_sval, ctxt);
> +cd.maybe_set_lhs (ret_sval);
> +  }
> +};
> +
> +/* Handler for:
> +   type __atomic_fetch_add (type *ptr, type val, int memorder);
>
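
For readers following the quoted handlers, here is a minimal sketch of the memory effect they describe, checked against the real `__atomic_exchange` and `__atomic_exchange_n` builtins. The helper names `model_exchange` and `check_exchange_model` are hypothetical and are not part of the patch; the model ignores the memory order, as the quoted handler does.

```c
#include <assert.h>

/* Plain-C model of the effect described in the quoted handler comment:
     *RET = *PTR;  *PTR = *VAL;
   The memorder argument is ignored.  Hypothetical helper, not GCC code.  */
static void
model_exchange (int *ptr, int *val, int *ret)
{
  int old = *ptr;      /* Read both inputs before writing anything.  */
  int new_val = *val;
  *ptr = new_val;
  *ret = old;
}

/* Run the real builtins and the model side by side and check that
   they leave memory in the same state.  */
static void
check_exchange_model (void)
{
  int a = 1, b = 2, r = 0;
  int ma = 1, mb = 2, mr = 0;

  __atomic_exchange (&a, &b, &r, __ATOMIC_SEQ_CST);
  model_exchange (&ma, &mb, &mr);
  assert (a == ma && b == mb && r == mr);

  /* __atomic_exchange_n returns the old value instead of storing it
     through a third pointer.  */
  int old = __atomic_exchange_n (&a, 7, __ATOMIC_RELAXED);
  assert (old == 2 && a == 7);
}
```

Calling `check_exchange_model ()` from a test driver exercises both forms of the builtin.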

Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-02 Thread Jeff Law via Gcc-patches





On 6/2/23 01:07, yanzhang.wang--- via Gcc-patches wrote:

From: Yanzhang Wang 

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf
  when enabling -mno-omit-leaf-frame-pointer
(riscv_option_override): Override omit-frame-pointer.
(riscv_frame_pointer_required): Save s0 for non-leaf function
(TARGET_FRAME_POINTER_REQUIRED): Override definition
* config/riscv/riscv.opt: Add option support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/omit-frame-pointer-1.c: New test.
* gcc.target/riscv/omit-frame-pointer-2.c: New test.
* gcc.target/riscv/omit-frame-pointer-3.c: New test.
* gcc.target/riscv/omit-frame-pointer-4.c: New test.
* gcc.target/riscv/omit-frame-pointer-test.c: New test.

Not ACKing or NAKing at this time.

Why do you want this feature?

jeff

RE: [PATCH V2] RISC-V: Fix warning in predicated.md

2023-06-02 Thread Li, Pan2 via Gcc-patches

Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Saturday, June 3, 2023 4:57 AM
To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com; pal...@dabbelt.com; 
pal...@rivosinc.com; rdapp@gmail.com; sch...@linux-m68k.org
Subject: Re: [PATCH V2] RISC-V: Fix warning in predicated.md



On 6/2/23 03:33, juzhe.zh...@rivai.ai wrote:
> From: Juzhe-Zhong 
> 
> Notice there is a warning in predicates.md:
> ../../../riscv-gcc/gcc/config/riscv/predicates.md: In function ‘bool 
> arith_operand_or_mode_mask(rtx, machine_mode)’:
> ../../../riscv-gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison 
> between signed and unsigned integer expressions [-Wsign-compare]
>   (match_test "INTVAL (op) == GET_MODE_MASK (HImode)
> ../../../riscv-gcc/gcc/config/riscv/predicates.md:34:20: warning: comparison 
> between signed and unsigned integer expressions [-Wsign-compare]
>   || INTVAL (op) == GET_MODE_MASK (SImode)"
> 
> gcc/ChangeLog:
> 
>  * config/riscv/predicates.md: Change INTVAL into UINTVAL.
> 
> ---
>   gcc/config/riscv/predicates.md | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/riscv/predicates.md 
> b/gcc/config/riscv/predicates.md index d14b1ca30bb..04ca6ceabc7 100644
> --- a/gcc/config/riscv/predicates.md
> +++ b/gcc/config/riscv/predicates.md
> @@ -30,7 +30,7 @@
>   (define_predicate "arith_operand_or_mode_mask"
> (ior (match_operand 0 "arith_operand")
>  (and (match_code "const_int")
> -(match_test "INTVAL (op) == GET_MODE_MASK (HImode)
> +(match_test "UINTVAL (op) == GET_MODE_MASK (HImode)
>|| UINTVAL (op) == GET_MODE_MASK (SImode)"
OK.

jeff

RE: Re: [PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv instruction optimizations

2023-06-02 Thread Li, Pan2 via Gcc-patches

Committed, with Robin's suggestion for commit log, thanks Robin and Kito.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Kito Cheng via Gcc-patches
Sent: Saturday, June 3, 2023 9:10 AM
To: 钟居哲 
Cc: Robin Dapp ; gcc-patches ; 
kito.cheng ; palmer ; palmer 
; jeffreyalaw 
Subject: Re: Re: [PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to enhance 
vwmul.vv instruction optimizations

Lgtm, thanks:)

juzhe.zh...@rivai.ai wrote on Fri, 2 Jun 2023 at 15:20:

> Thanks. I am gonna wait for Jeff or Kito final approve.
>
> --
> juzhe.zh...@rivai.ai
>
>
> *From:* Robin Dapp 
> *Date:* 2023-06-02 15:18
> *To:* juzhe.zh...@rivai.ai; gcc-patches 
> *CC:* rdapp.gcc ; kito.cheng 
> ; Kito.cheng ; palmer 
> ; palmer ; jeffreyalaw 
> 
> *Subject:* Re: [PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to 
> enhance vwmul.vv instruction optimizations
> >>> I like the code examples in general but find them hard to read at 
> >>> lengths > 5-10 or so.  Could we condense this a bit?
> > Ok, do I need to send V2? Or condense the commit log when merging
> > the patch?
>
> Sure, just condense a bit. No need for V2.
>
> Regards
> Robin
>
>
>

Re: Re: [PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv instruction optimizations

2023-06-02 Thread Kito Cheng via Gcc-patches

Lgtm, thanks:)

juzhe.zh...@rivai.ai wrote on Fri, 2 Jun 2023 at 15:20:

> Thanks. I am gonna wait for Jeff or Kito final approve.
>
> --
> juzhe.zh...@rivai.ai
>
>
> *From:* Robin Dapp 
> *Date:* 2023-06-02 15:18
> *To:* juzhe.zh...@rivai.ai; gcc-patches 
> *CC:* rdapp.gcc ; kito.cheng ;
> Kito.cheng ; palmer ; palmer
> ; jeffreyalaw 
> *Subject:* Re: [PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to enhance
> vwmul.vv instruction optimizations
> >>> I like the code examples in general but find them hard to read
> >>> at lengths > 5-10 or so.  Could we condense this a bit?
> > Ok, do I need to send V2? Or condense the commit log when merging
> > the patch?
>
> Sure, just condense a bit. No need for V2.
>
> Regards
> Robin
>
>
>

Re: PING^2: [PATCH] release the sorted FDE array when deregistering a frame [PR109685]

2023-06-02 Thread Jeff Law via Gcc-patches





On 6/2/23 08:54, Thomas Neumann wrote:
Summary: The old linear scan logic called free while searching the list 
of frames. The atomic fast path finds the frame quickly, but forgot the 
free call. This patch adds the missing free. Bugzilla #109685.


See:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619026.html

Also OK.

jeff

Re: PING: [PATCH] fix radix sort on 32bit platforms [PR109670]

2023-06-02 Thread Jeff Law via Gcc-patches





On 6/2/23 09:00, Thomas Neumann via Gcc-patches wrote:
Summary: The radix sort did not handle the uppermost byte correctly, 
which sometimes broke win32 exceptions. Bugzilla #109670. The reporter 
confirmed that the patch fixes the bug.


See:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618000.html

OK.  Sorry for the delay.
jeff
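
As background for the summary above, a generic byte-wise LSD radix sort might look like the sketch below. It is not the libgcc unwinder code; `radix_sort_u32` and `radix_sort_handles_top_byte` are hypothetical names. The points it illustrates are that every byte of the key, including the uppermost one, gets its own pass, and that the sorted result is copied back if it happens to end up in the scratch buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sort keys[0..n) in ascending order, one byte per pass, least
   significant byte first.  scratch must have room for n elements.
   Generic illustration only; not the libgcc implementation.  */
static void
radix_sort_u32 (uint32_t *keys, uint32_t *scratch, size_t n)
{
  uint32_t *src = keys, *dst = scratch;

  for (unsigned shift = 0; shift < 32; shift += 8)
    {
      size_t count[256] = { 0 };

      /* Histogram of the current byte.  */
      for (size_t i = 0; i < n; i++)
        count[(src[i] >> shift) & 0xff]++;

      /* Turn counts into starting offsets.  */
      size_t pos = 0;
      for (unsigned b = 0; b < 256; b++)
        {
          size_t c = count[b];
          count[b] = pos;
          pos += c;
        }

      /* Stable scatter into the other buffer, then swap roles.  */
      for (size_t i = 0; i < n; i++)
        dst[count[(src[i] >> shift) & 0xff]++] = src[i];

      uint32_t *tmp = src;
      src = dst;
      dst = tmp;
    }

  /* Make sure the caller's array holds the result, whichever buffer
     the last pass wrote to.  */
  if (src != keys)
    memcpy (keys, src, n * sizeof *keys);
}

/* Returns 1 if keys that differ only in the top byte sort correctly.  */
static int
radix_sort_handles_top_byte (void)
{
  uint32_t keys[] = { 0x80000001u, 0x00000002u, 0xff000000u,
                      0x7fffffffu, 0x00000000u };
  uint32_t scratch[5];

  radix_sort_u32 (keys, scratch, 5);
  return keys[0] == 0x00000000u && keys[1] == 0x00000002u
         && keys[2] == 0x7fffffffu && keys[3] == 0x80000001u
         && keys[4] == 0xff000000u;
}
```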

Re: [PATCH V2] RISC-V: Fix warning in predicated.md

2023-06-02 Thread Jeff Law via Gcc-patches





On 6/2/23 03:33, juzhe.zh...@rivai.ai wrote:

From: Juzhe-Zhong 

Notice there is a warning in predicates.md:
../../../riscv-gcc/gcc/config/riscv/predicates.md: In function ‘bool 
arith_operand_or_mode_mask(rtx, machine_mode)’:
../../../riscv-gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison 
between signed and unsigned integer expressions [-Wsign-compare]
  (match_test "INTVAL (op) == GET_MODE_MASK (HImode)
../../../riscv-gcc/gcc/config/riscv/predicates.md:34:20: warning: comparison 
between signed and unsigned integer expressions [-Wsign-compare]
  || INTVAL (op) == GET_MODE_MASK (SImode)"

gcc/ChangeLog:

 * config/riscv/predicates.md: Change INTVAL into UINTVAL.

---
  gcc/config/riscv/predicates.md | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index d14b1ca30bb..04ca6ceabc7 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -30,7 +30,7 @@
  (define_predicate "arith_operand_or_mode_mask"
(ior (match_operand 0 "arith_operand")
 (and (match_code "const_int")
-(match_test "INTVAL (op) == GET_MODE_MASK (HImode)
+(match_test "UINTVAL (op) == GET_MODE_MASK (HImode)
 || UINTVAL (op) == GET_MODE_MASK (SImode)"

OK.

jeff
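
To illustrate the warning the patch silences, here is a minimal sketch outside of GCC. The typedefs `hwi` and `uhwi` and the `MASK_HI`/`MASK_SI` macros are simplified stand-ins (assumptions) for HOST_WIDE_INT, its unsigned counterpart, and GET_MODE_MASK; they are not the real definitions. Comparing a signed value directly against an unsigned mask is what -Wsign-compare flags; reading the value as unsigned first, which is the role UINTVAL plays in the patch, makes both operands the same type.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for GCC's host-wide integer types (assumed).  */
typedef int64_t hwi;
typedef uint64_t uhwi;

/* Simplified stand-ins for GET_MODE_MASK (HImode) and (SImode).  */
#define MASK_HI ((uhwi) 0xffff)
#define MASK_SI ((uhwi) 0xffffffff)

/* Writing "value == MASK_HI" with a signed value compiles, but the
   signed operand is implicitly converted to unsigned, which is what
   -Wsign-compare reports.  Converting explicitly first keeps the
   comparison unsigned on both sides.  */
static int
is_hi_or_si_mask (hwi value)
{
  uhwi u = (uhwi) value;
  return u == MASK_HI || u == MASK_SI;
}
```

With this helper, `is_hi_or_si_mask (0xffff)` is true, while `is_hi_or_si_mask (-1)` is false because -1 converts to an all-ones 64-bit value.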

[pushed] Darwin, PPC: Fix struct layout with pragma pack [PR110044].

2023-06-02 Thread Iain Sandoe via Gcc-patches

@David: I am not sure what sets the ABI on AIX (for Darwin, it is effectively
"whatever the system compiler [Apple gcc-4] does") but from an inspection of
the code, it seems that (if the platform should honour #pragma pack) a similar
effect could be present there too.

Tested on powerpc-apple-darwin9, powerpc64-linux-gnu and on i686 and x86_64
Darwin.  Checked that the testcases also pass for Apple gcc-4.2.1.
pushed to trunk, thanks
Iain

--- 8< ---

This bug was essentially that darwin_rs6000_special_round_type_align()
was ignoring externally-imposed capping of field alignment.
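
As an illustration of what "externally-imposed capping of field alignment" means, the sketch below shows a struct under `#pragma pack`. The struct layout (a char followed by a double) is an assumption, since the contents of darwin-structs-0.h are not shown in this message; the expected sizes match the values the new tests check for `cd`.

```c
#include <assert.h>
#include <stddef.h>

/* A char followed by a double, with the target's default layout.  */
struct natural_cd { char c; double d; };

/* The same members with field alignment capped at 1 byte.  */
#pragma pack(push, 1)
struct packed1_cd { char c; double d; };
#pragma pack(pop)

/* The same members with field alignment capped at 2 bytes.  */
#pragma pack(push, 2)
struct packed2_cd { char c; double d; };
#pragma pack(pop)

/* With the cap honoured, the double sits right after the char
   (pack 1) or after one byte of padding (pack 2), and the struct
   alignment is limited to the cap.  */
_Static_assert (sizeof (struct packed1_cd) == 9, "pack(1) size");
_Static_assert (_Alignof (struct packed1_cd) == 1, "pack(1) alignment");
_Static_assert (sizeof (struct packed2_cd) == 10, "pack(2) size");
_Static_assert (_Alignof (struct packed2_cd) == 2, "pack(2) alignment");
```

The static assertions play the same role as the negative-array-size declarations in the new test files: a wrong layout fails at compile time.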

Signed-off-by: Iain Sandoe 

PR target/110044

gcc/ChangeLog:

* config/rs6000/rs6000.cc (darwin_rs6000_special_round_type_align):
Make sure that we do not have a cap on field alignment before altering
the struct layout based on the type alignment of the first entry.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/darwin-abi-13-0.c: New test.
* gcc.target/powerpc/darwin-abi-13-1.c: New test.
* gcc.target/powerpc/darwin-abi-13-2.c: New test.
* gcc.target/powerpc/darwin-structs-0.h: New test.
---
 gcc/config/rs6000/rs6000.cc   |  3 +-
 .../gcc.target/powerpc/darwin-abi-13-0.c  | 23 +++
 .../gcc.target/powerpc/darwin-abi-13-1.c  | 27 +
 .../gcc.target/powerpc/darwin-abi-13-2.c  | 27 +
 .../gcc.target/powerpc/darwin-structs-0.h | 29 +++
 5 files changed, 108 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/darwin-abi-13-0.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/darwin-abi-13-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/darwin-abi-13-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/darwin-structs-0.h

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5b3b8b52e7e..42f49e4a56b 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -8209,7 +8209,8 @@ darwin_rs6000_special_round_type_align (tree type, 
unsigned int computed,
   type = TREE_TYPE (type);
   } while (AGGREGATE_TYPE_P (type));
 
-  if (! AGGREGATE_TYPE_P (type) && type != error_mark_node)
+  if (type != error_mark_node && ! AGGREGATE_TYPE_P (type)
+  && ! TYPE_PACKED (type) && maximum_field_alignment == 0)
 align = MAX (align, TYPE_ALIGN (type));
 
   return align;
diff --git a/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-0.c 
b/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-0.c
new file mode 100644
index 000..d8d3c63a083
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-0.c
@@ -0,0 +1,23 @@
+/* { dg-do compile { target powerpc*-*-darwin* } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-Wno-long-long" } */
+
+#include "darwin-structs-0.h"
+
+int tcd[sizeof(cd) != 12 ? -1 : 1];
+int acd[__alignof__(cd) != 4 ? -1 : 1];
+
+int sdc[sizeof(dc) != 16 ? -1 : 1];
+int adc[__alignof__(dc) != 8 ? -1 : 1];
+
+int scL[sizeof(cL) != 12 ? -1 : 1];
+int acL[__alignof__(cL) != 4 ? -1 : 1];
+
+int sLc[sizeof(Lc) != 16 ? -1 : 1];
+int aLc[__alignof__(Lc) != 8 ? -1 : 1];
+
+int scD[sizeof(cD) != 32 ? -1 : 1];
+int acD[__alignof__(cD) != 16 ? -1 : 1];
+
+int sDc[sizeof(Dc) != 32 ? -1 : 1];
+int aDc[__alignof__(Dc) != 16 ? -1 : 1];
diff --git a/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-1.c 
b/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-1.c
new file mode 100644
index 000..4d888d383fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target powerpc*-*-darwin* } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-Wno-long-long" } */
+
+#pragma pack(push, 1)
+
+#include "darwin-structs-0.h"
+
+int tcd[sizeof(cd) != 9 ? -1 : 1];
+int acd[__alignof__(cd) != 1 ? -1 : 1];
+
+int sdc[sizeof(dc) != 9 ? -1 : 1];
+int adc[__alignof__(dc) != 1 ? -1 : 1];
+
+int scL[sizeof(cL) != 9 ? -1 : 1];
+int acL[__alignof__(cL) != 1 ? -1 : 1];
+
+int sLc[sizeof(Lc) != 9 ? -1 : 1];
+int aLc[__alignof__(Lc) != 1 ? -1 : 1];
+
+int scD[sizeof(cD) != 17 ? -1 : 1];
+int acD[__alignof__(cD) != 1 ? -1 : 1];
+
+int sDc[sizeof(Dc) != 17 ? -1 : 1];
+int aDc[__alignof__(Dc) != 1 ? -1 : 1];
+
+#pragma pack(pop)
diff --git a/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-2.c 
b/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-2.c
new file mode 100644
index 000..3bd52c0a8f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/darwin-abi-13-2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target powerpc*-*-darwin* } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-Wno-long-long" } */
+
+#pragma pack(push, 2)
+
+#include "darwin-structs-0.h"
+
+int tcd[sizeof(cd) != 10 ? -1 : 1];
+int acd[__alignof__(cd) != 2 ? -1 : 1];
+
+int sdc[sizeof(dc) != 10 ? -1 : 1];
+int adc[__alignof__(dc) != 2 ? -1 : 1];
+
+int scL[sizeof(cL) != 10 ? -1 : 1];
+int acL[__alignof__(cL) != 2 ? -1 : 1];
+
+int sLc[sizeof(Lc)

[PATCH, committed] Fortran: fix diagnostics for SELECT RANK [PR100607]

2023-06-02 Thread Harald Anlauf via Gcc-patches

Dear all,

I've committed the attached simple patch on behalf of Steve
after discussion in the PR and regtesting on x86_64-pc-linux-gnu.

It fixes a duplicate error message and an ICE.

Pushed as r14-1505-gfae09dfc0e6bf4cfe35d817558827aea78c6426f .

Thanks,
Harald

From fae09dfc0e6bf4cfe35d817558827aea78c6426f Mon Sep 17 00:00:00 2001
From: Steve Kargl 
Date: Fri, 2 Jun 2023 19:44:11 +0200
Subject: [PATCH] Fortran: fix diagnostics for SELECT RANK [PR100607]

gcc/fortran/ChangeLog:

	PR fortran/100607
	* resolve.cc (resolve_select_rank): Remove duplicate error.
	(resolve_fl_var_and_proc): Prevent NULL pointer dereference and
	suppress error message for temporary.

gcc/testsuite/ChangeLog:

	PR fortran/100607
	* gfortran.dg/select_rank_6.f90: New test.
---
 gcc/fortran/resolve.cc  | 10 ++---
 gcc/testsuite/gfortran.dg/select_rank_6.f90 | 48 +
 2 files changed, 52 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/select_rank_6.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 2ba3101f1fe..fd059dddf05 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -10020,11 +10020,6 @@ resolve_select_rank (gfc_code *code, gfc_namespace *old_ns)
 			   || gfc_expr_attr (code->expr1).pointer))
 	gfc_error ("RANK (*) at %L cannot be used with the pointer or "
 		   "allocatable selector at %L", >where, >expr1->where);
-
-  if (case_value == -1 && (gfc_expr_attr (code->expr1).allocatable
-			   || gfc_expr_attr (code->expr1).pointer))
-	gfc_error ("RANK (*) at %L cannot be used with the pointer or "
-		   "allocatable selector at %L", >where, >expr1->where);
 }

   /* Add EXEC_SELECT to switch on rank.  */
@@ -13262,7 +13257,10 @@ resolve_fl_var_and_proc (gfc_symbol *sym, int mp_flag)

   if (allocatable)
 	{
-	  if (dimension && as->type != AS_ASSUMED_RANK)
+	  if (dimension
+	  && as
+	  && as->type != AS_ASSUMED_RANK
+	  && !sym->attr.select_rank_temporary)
 	{
 	  gfc_error ("Allocatable array %qs at %L must have a deferred "
 			 "shape or assumed rank", sym->name, >declared_at);
diff --git a/gcc/testsuite/gfortran.dg/select_rank_6.f90 b/gcc/testsuite/gfortran.dg/select_rank_6.f90
new file mode 100644
index 000..d0121777bb5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/select_rank_6.f90
@@ -0,0 +1,48 @@
+! { dg-do compile }
+! PR fortran/100607 - fix diagnostics for SELECT RANK
+! Contributed by T.Burnus
+
+program p
+  implicit none
+  integer, allocatable :: A(:,:,:)
+
+  allocate(a(5:6,-2:2, 99:100))
+  call foo(a)
+  call bar(a)
+
+contains
+
+  subroutine foo(x)
+integer, allocatable :: x(..)
+if (rank(x) /= 3) stop 1
+if (any (lbound(x) /= [5, -2, 99])) stop 2
+
+select rank (x)
+rank(3)
+  if (any (lbound(x) /= [5, -2, 99])) stop 3
+end select
+
+select rank (x) ! { dg-error "pointer or allocatable selector at .2." }
+rank(*) ! { dg-error "pointer or allocatable selector at .2." }
+  if (rank(x) /= 1) stop 4
+  if (lbound(x, 1) /= 1) stop 5
+end select
+  end
+
+  subroutine bar(x)
+integer :: x(..)
+if (rank(x) /= 3) stop 6
+if (any (lbound(x) /= 1)) stop 7
+
+select rank (x)
+rank(3)
+  if (any (lbound(x) /= 1)) stop 8
+end select
+
+select rank (x)
+rank(*)
+  if (rank(x) /= 1) stop 9
+  if (lbound(x, 1) /= 1) stop 10
+end select
+  end
+end
--
2.35.3

[PATCH] libatomic: Enable lock-free 128-bit atomics on AArch64 [PR110061]

2023-06-02 Thread Wilco Dijkstra via Gcc-patches


Enable lock-free 128-bit atomics on AArch64.  This is backwards compatible with
existing binaries, gives better performance than locking atomics and is what
most users expect.

Note 128-bit atomic loads use a load/store exclusive loop if LSE2 is not
supported.  This results in an implicit store which is invisible to software
as long as the given address is writeable (which will be true when using
atomics in actual code).
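
The idea of a load that carries an implicit store can be sketched in portable C with a 64-bit analogy. This is not the AArch64 LDXP/STXP sequence from the patch; `load_via_cas` is a hypothetical helper that obtains a value through a compare-and-exchange, a read-modify-write operation that requires the address to be writeable.

```c
#include <assert.h>
#include <stdint.h>

/* Load *ptr through a compare-and-exchange instead of a plain load.
   If *ptr is 0 the exchange succeeds and writes 0 back; otherwise it
   fails and the builtin copies the current value into expected.
   Either way expected ends up holding the value that was observed,
   and the stored contents are unchanged.  Hypothetical helper, an
   analogy only.  */
static uint64_t
load_via_cas (uint64_t *ptr)
{
  uint64_t expected = 0;
  __atomic_compare_exchange_n (ptr, &expected, expected,
                               0 /* strong */,
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
}
```

Because the operation may write, calling such a helper on read-only memory would fault, which is the caveat the note above makes about the address being writeable.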

A simple test on an old Cortex-A72 showed 2.7x speedup of 128-bit atomics.

Passes regress, OK for commit?

libatomic/
PR target/110061
config/linux/aarch64/atomic_16.S: Implement lock-free ARMv8.0 atomics.
config/linux/aarch64/host-config.h: Use atomic_16.S for baseline v8.0.
State we have lock-free atomics.

---

diff --git a/libatomic/config/linux/aarch64/atomic_16.S 
b/libatomic/config/linux/aarch64/atomic_16.S
index 
05439ce394b9653c9bcb582761ff7aaa7c8f9643..0485c284117edf54f41959d2fab9341a9567b1cf
 100644
--- a/libatomic/config/linux/aarch64/atomic_16.S
+++ b/libatomic/config/linux/aarch64/atomic_16.S
@@ -22,6 +22,21 @@
.  */
 
 
+/* AArch64 128-bit lock-free atomic implementation.
+
+   128-bit atomics are now lock-free for all AArch64 architecture versions.
+   This is backwards compatible with existing binaries and gives better
+   performance than locking atomics.
+
+   128-bit atomic loads use a exclusive loop if LSE2 is not supported.
+   This results in an implicit store which is invisible to software as long
+   as the given address is writeable.  Since all other atomics have explicit
+   writes, this will be true when using atomics in actual code.
+
+   The libat__16 entry points are ARMv8.0.
+   The libat__16_i1 entry points are used when LSE2 is available.  */
+
+
.arch   armv8-a+lse
 
 #define ENTRY(name)\
@@ -37,6 +52,10 @@ name:\
.cfi_endproc;   \
.size name, .-name;
 
+#define ALIAS(alias,name)  \
+   .global alias;  \
+   .set alias, name;
+
 #define res0 x0
 #define res1 x1
 #define in0  x2
@@ -70,6 +89,24 @@ name:\
 #define SEQ_CST 5
 
 
+ENTRY (libat_load_16)
+   mov     x5, x0
+   cbnz    w1, 2f
+
+   /* RELAXED.  */
+1: ldxp    res0, res1, [x5]
+   stxp    w4, res0, res1, [x5]
+   cbnz    w4, 1b
+   ret
+
+   /* ACQUIRE/CONSUME/SEQ_CST.  */
+2: ldaxp   res0, res1, [x5]
+   stxp    w4, res0, res1, [x5]
+   cbnz    w4, 2b
+   ret
+END (libat_load_16)
+
+
 ENTRY (libat_load_16_i1)
    cbnz    w1, 1f
 
@@ -93,6 +130,23 @@ ENTRY (libat_load_16_i1)
 END (libat_load_16_i1)
 
 
+ENTRY (libat_store_16)
+   cbnz    w4, 2f
+
+   /* RELAXED.  */
+1: ldxp    xzr, tmp0, [x0]
+   stxp    w4, in0, in1, [x0]
+   cbnz    w4, 1b
+   ret
+
+   /* RELEASE/SEQ_CST.  */
+2: ldxp    xzr, tmp0, [x0]
+   stlxp   w4, in0, in1, [x0]
+   cbnz    w4, 2b
+   ret
+END (libat_store_16)
+
+
 ENTRY (libat_store_16_i1)
    cbnz    w4, 1f
 
@@ -101,14 +155,14 @@ ENTRY (libat_store_16_i1)
    ret
 
    /* RELEASE/SEQ_CST.  */
-1: ldaxp   xzr, tmp0, [x0]
+1: ldxp    xzr, tmp0, [x0]
    stlxp   w4, in0, in1, [x0]
    cbnz    w4, 1b
    ret
 END (libat_store_16_i1)
 
 
-ENTRY (libat_exchange_16_i1)
+ENTRY (libat_exchange_16)
    mov     x5, x0
    cbnz    w4, 2f
 
@@ -126,22 +180,55 @@ ENTRY (libat_exchange_16_i1)
    stxp    w4, in0, in1, [x5]
    cbnz    w4, 3b
    ret
-4:
-   cmp     w4, RELEASE
-   b.ne    6f
 
-   /* RELEASE.  */
-5: ldxp    res0, res1, [x5]
+   /* RELEASE/ACQ_REL/SEQ_CST.  */
+4: ldaxp   res0, res1, [x5]
    stlxp   w4, in0, in1, [x5]
-   cbnz    w4, 5b
+   cbnz    w4, 4b
    ret
+END (libat_exchange_16)
 
-   /* ACQ_REL/SEQ_CST.  */
-6: ldaxp   res0, res1, [x5]
-   stlxp   w4, in0, in1, [x5]
-   cbnz    w4, 6b
+
+ENTRY (libat_compare_exchange_16)
+   ldp     exp0, exp1, [x1]
+   cbz     w4, 3f
+   cmp     w4, RELEASE
+   b.hs    4f
+
+   /* ACQUIRE/CONSUME.  */
+1: ldaxp   tmp0, tmp1, [x0]
+   cmp     tmp0, exp0
+   ccmp    tmp1, exp1, 0, eq
+   bne     2f
+   stxp    w4, in0, in1, [x0]
+   cbnz    w4, 1b
+   mov     x0, 1
    ret
-END (libat_exchange_16_i1)
+
+2: stp     tmp0, tmp1, [x1]
+   mov     x0, 0
+   ret
+
+   /* RELAXED.  */
+3: ldxp    tmp0, tmp1, [x0]
+   cmp     tmp0, exp0
+   ccmp    tmp1, exp1, 0, eq
+   bne     2b
+   stxp    w4, in0, in1, [x0]
+   cbnz    w4, 3b
+   mov     x0, 1
+   ret
+
+   /* RELEASE/ACQ_REL/SEQ_CST.  */
+4: ldaxp   tmp0, tmp1, [x0]
+   cmp     tmp0, exp0
+   ccmp    tmp1, exp1, 0, eq
+   bne     2b
+   stlxp   w4, in0, in1, [x0]
+   cbnz    w4, 4b
+   mov     x0, 1
+   ret
+END (libat_compare_exchange_16)

PING Re: [PATCH RFA (tree-eh)] c++: use __cxa_call_terminate for MUST_NOT_THROW [PR97720]

2023-06-02 Thread Jason Merrill via Gcc-patches

Since Jonathan approved the library change, I'm looking for middle-end 
approval for the tree-eh change, even without advice on the potential 
follow-up.


On 5/24/23 14:55, Jason Merrill wrote:

Middle-end folks: any thoughts about how best to make the change described in
the last paragraph below?

Library folks: any thoughts on the changes to __cxa_call_terminate?

-- 8< --

[except.handle]/7 says that when we enter std::terminate due to a throw,
that is considered an active handler.  We already implemented that properly
for the case of not finding a handler (__cxa_throw calls __cxa_begin_catch
before std::terminate) and the case of finding a callsite with no landing
pad (the personality function calls __cxa_call_terminate which calls
__cxa_begin_catch), but for the case of a throw in a try/catch in a noexcept
function, we were emitting a cleanup that calls std::terminate directly
without ever calling __cxa_begin_catch to handle the exception.

A straightforward way to fix this seems to be calling __cxa_call_terminate
instead.  However, that requires exporting it from libstdc++, which we have
not previously done.  Despite the name, it isn't actually part of the ABI
standard.  Nor is __cxa_call_unexpected, as far as I can tell, but that one
is also used by clang.  For this case they use __clang_call_terminate; it
seems reasonable to me for us to stick with __cxa_call_terminate.

I also change __cxa_call_terminate to take void* for simplicity in the front
end (and consistency with __cxa_call_unexpected) but that isn't necessary if
it's undesirable for some reason.

This patch does not fix the issue that representing the noexcept as a
cleanup is wrong, and confuses the handler search; since it looks like a
cleanup in the EH tables, the unwinder keeps looking until it finds the
catch in main(), which it should never have gotten to.  Without the
try/catch in main, the unwinder would reach the end of the stack and say no
handler was found.  The noexcept is a handler, and should be treated as one,
as it is when the landing pad is omitted.

The best fix for that issue seems to me to be to represent an
ERT_MUST_NOT_THROW after an ERT_TRY in an action list as though it were an
ERT_ALLOWED_EXCEPTIONS (since indeed it is an exception-specification).  The
actual code generation shouldn't need to change (apart from the change made
by this patch), only the action table entry.

PR c++/97720

gcc/cp/ChangeLog:

* cp-tree.h (enum cp_tree_index): Add CPTI_CALL_TERMINATE_FN.
(call_terminate_fn): New macro.
* cp-gimplify.cc (gimplify_must_not_throw_expr): Use it.
* except.cc (init_exception_processing): Set it.
(cp_protect_cleanup_actions): Return it.

gcc/ChangeLog:

* tree-eh.cc (lower_resx): Pass the exception pointer to the
failure_decl.
* except.h: Tweak comment.

libstdc++-v3/ChangeLog:

* libsupc++/eh_call.cc (__cxa_call_terminate): Take void*.
* config/abi/pre/gnu.ver: Add it.

gcc/testsuite/ChangeLog:

* g++.dg/eh/terminate2.C: New test.
---
  gcc/cp/cp-tree.h |  2 ++
  gcc/except.h |  2 +-
  gcc/cp/cp-gimplify.cc|  2 +-
  gcc/cp/except.cc |  5 -
  gcc/testsuite/g++.dg/eh/terminate2.C | 30 
  gcc/tree-eh.cc   | 16 ++-
  libstdc++-v3/libsupc++/eh_call.cc|  4 +++-
  libstdc++-v3/config/abi/pre/gnu.ver  |  7 +++
  8 files changed, 63 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/eh/terminate2.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a1b882f11fe..a8465a988b5 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -217,6 +217,7 @@ enum cp_tree_index
 definitions.  */
  CPTI_ALIGN_TYPE,
  CPTI_TERMINATE_FN,
+CPTI_CALL_TERMINATE_FN,
  CPTI_CALL_UNEXPECTED_FN,
  
  /* These are lazily inited.  */

@@ -358,6 +359,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
  /* Exception handling function declarations.  */
  #define terminate_fn  cp_global_trees[CPTI_TERMINATE_FN]
  #define call_unexpected_fncp_global_trees[CPTI_CALL_UNEXPECTED_FN]
+#define call_terminate_fn  cp_global_trees[CPTI_CALL_TERMINATE_FN]
  #define get_exception_ptr_fn  
cp_global_trees[CPTI_GET_EXCEPTION_PTR_FN]
  #define begin_catch_fn
cp_global_trees[CPTI_BEGIN_CATCH_FN]
  #define end_catch_fn  cp_global_trees[CPTI_END_CATCH_FN]
diff --git a/gcc/except.h b/gcc/except.h
index 5ecdbc0d1dc..378a9e4cb77 100644
--- a/gcc/except.h
+++ b/gcc/except.h
@@ -155,7 +155,7 @@ struct GTY(()) eh_region_d
  struct eh_region_u_must_not_throw {
/* A function decl to be invoked if this region is actually reachable
 from within the function, rather than implementable from the runtime.
-The normal way for this to happen is for there to be a

Re: [PATCH] libstdc++: Do not assume existence of char8_t codecvt facet

2023-06-02 Thread Jonathan Wakely via Gcc-patches

On Fri, 2 Jun 2023 at 16:45, Joseph Faulls wrote:

> It is not required that codecvt facet be
> supported by
>
> the locale, nor is it added as part of the default locale. This can lead to
>
> dangerous behaviour when static_cast.
>

Ouch, yes indeed. I don't know why I added that there. Thanks for the
patch, I'll apply it to trunk and gcc-13 ASAP.




> libstdc++-v3/ChangeLog:
>
> * include/bits/locale_classes.tcc: Remove check.
> ---
> libstdc++-v3/include/bits/locale_classes.tcc | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/locale_classes.tcc b/libstdc++-v3/include/bits/locale_classes.tcc
> index 94838cd7796..2351dd5bcfb 100644
> --- a/libstdc++-v3/include/bits/locale_classes.tcc
> +++ b/libstdc++-v3/include/bits/locale_classes.tcc
> @@ -129,9 +129,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    _GLIBCXX_STD_FACET(time_put);
>    _GLIBCXX_STD_FACET(messages);
> #endif
> -#ifdef _GLIBCXX_USE_CHAR8_T
> -  _GLIBCXX_STD_FACET(codecvt);
> -#endif
> #if __cplusplus >= 201103L
>    _GLIBCXX_STD_FACET(codecvt);
>    _GLIBCXX_STD_FACET(codecvt);
> --
> 2.34.1
>

Re: [PATCH][committed] btf: Fix -Wformat errors

2023-06-02 Thread Alex Coplan via Gcc-patches

On 02/06/2023 18:45, Jakub Jelinek wrote:
> On Fri, Jun 02, 2023 at 06:18:38PM +0200, Rainer Orth wrote:
> > Hi Alex,
> > 
> > > g:7aae58b04b92303ccda3ead600be98f0d4b7f462 introduced -Wformat errors
> > > breaking bootstrap on some targets. This patch fixes that.
> > >
> > > Committed as obvious.
> > >
> > > Thanks,
> > > Alex
> > >
> > > gcc/ChangeLog:
> > >
> > >   * btfout.cc (btf_asm_type): Use PRIu64 instead of %lu for uint64_t.
> > >   (btf_asm_datasec_type): Likewise.
> > 
> > This is PR libstdc++/110077.  Btw., your fix is incomplete: it needs
> > another change (%lu -> %zu) in btf_asm_func_type.
> 
> Can we rely on %zu working?  Sure, it is in C99 and so in C++11 as well,
> but I don't see it being used inside of gcc/ at all and not sure if all host
> C libraries support it.

Looks like the follow-up patch from David fixes this without relying on %zu:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620524.html

Alex

> 
>   Jakub
>

Re: [PATCH][committed] btf: Fix -Wformat errors

2023-06-02 Thread Jakub Jelinek via Gcc-patches

On Fri, Jun 02, 2023 at 06:18:38PM +0200, Rainer Orth wrote:
> Hi Alex,
> 
> > g:7aae58b04b92303ccda3ead600be98f0d4b7f462 introduced -Wformat errors
> > breaking bootstrap on some targets. This patch fixes that.
> >
> > Committed as obvious.
> >
> > Thanks,
> > Alex
> >
> > gcc/ChangeLog:
> >
> > * btfout.cc (btf_asm_type): Use PRIu64 instead of %lu for uint64_t.
> > (btf_asm_datasec_type): Likewise.
> 
> This is PR libstdc++/110077.  Btw., your fix is incomplete: it needs
> another change (%lu -> %zu) in btf_asm_func_type.

Can we rely on %zu working?  Sure, it is in C99 and so in C++11 as well,
but I don't see it being used inside of gcc/ at all and not sure if all host
C libraries support it.

Jakub

[PATCH][committed] btf: fix bootstrap -Wformat errors [PR110073]

2023-06-02 Thread David Faust via Gcc-patches

Commit 7aae58b04b9 "btf: improve -dA comments for testsuite" broke
bootstrap on a number of architectures because it introduced some
new -Wformat errors.

Fix those errors by properly using PRIu64 and a small refactor to
the offending code.

Based on the suggested patch from Rainer Orth.
Committed as obvious.

PR debug/110073

gcc/ChangeLog:

* btfout.cc (btf_absolute_func_id): New function.
(btf_asm_func_type): Call it here.  Change index parameter from
size_t to ctf_id_t.  Use PRIu64 formatter.
---
 gcc/btfout.cc | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 1ea68b9e8ba..e07fed302c2 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -192,6 +192,14 @@ btf_relative_var_id (ctf_id_t abs)
   return abs - (num_types_added + 1);
 }
 
+/* Return the final BTF ID of the func record at relative index REL.  */
+
+static ctf_id_t
+btf_absolute_func_id (ctf_id_t rel)
+{
+  return rel + (num_types_added + 1) + num_vars_added;
+}
+
 /* Return the relative index of the func record with final BTF ID ABS.  */
 
 static ctf_id_t
@@ -937,13 +945,12 @@ btf_asm_func_arg (ctf_container_ref ctfc, ctf_func_arg_t * farg,
 /* Asm'out a BTF_KIND_FUNC type.  */
 
 static void
-btf_asm_func_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd, size_t i)
+btf_asm_func_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd, ctf_id_t id)
 {
   ctf_id_t ref_id = dtd->dtd_data.ctti_type;
   dw2_asm_output_data (4, dtd->dtd_data.ctti_name,
-  "TYPE %lu BTF_KIND_FUNC '%s'",
-  num_types_added + num_vars_added + 1 + i,
-  dtd->dtd_name);
+  "TYPE %" PRIu64 " BTF_KIND_FUNC '%s'",
+  btf_absolute_func_id (id), dtd->dtd_name);
   dw2_asm_output_data (4, BTF_TYPE_INFO (BTF_KIND_FUNC, 0, dtd->linkage),
   "btt_info: kind=%u, kflag=%u, linkage=%u",
   BTF_KIND_FUNC, 0, dtd->linkage);
-- 
2.40.1
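
As a standalone illustration of the two changes in the patch above, here is a minimal sketch. It makes some assumptions: the file-scope counters num_types_added and num_vars_added from btfout.cc are replaced by plain parameters, demo_id_t stands in for ctf_id_t (assumed here to be a 64-bit unsigned type, as the ChangeLog entries suggest), and format_func_comment is a hypothetical helper rather than a GCC function.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for ctf_id_t; assumed to be a 64-bit unsigned integer.
using demo_id_t = std::uint64_t;

// Same arithmetic as btf_absolute_func_id in the patch, but with the
// counters passed in explicitly instead of read from file-scope state.
demo_id_t absolute_func_id(demo_id_t rel, demo_id_t num_types_added,
                           demo_id_t num_vars_added)
{
  return rel + (num_types_added + 1) + num_vars_added;
}

// Hypothetical helper: formats the assembler comment with PRIu64, which
// expands to the correct conversion for uint64_t on every host, unlike a
// hard-coded %lu.
std::string format_func_comment(demo_id_t id, const char *name)
{
  char buf[128];
  std::snprintf(buf, sizeof buf, "TYPE %" PRIu64 " BTF_KIND_FUNC '%s'",
                id, name);
  return buf;
}
```

Computing the ID in one helper that returns the ID type, rather than mixing size_t and ID arithmetic at the call site, is what lets a single PRIu64 conversion cover the argument.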

Re: [PATCH] c++: is_specialization_of_friend confusion [PR109923]

2023-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/23 10:29, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


-- >8 --

The check for a non-template member function of a class template in
is_specialization_of_friend is overbroad, and accidentally holds for a
non-template hidden friend too, which causes the predicate to return
true for

   decl = void non_templ_friend(A, A)
   friend_decl = void non_templ_friend(A, A)

This patch refines the check appropriately.

PR c++/109923

gcc/cp/ChangeLog:

* pt.cc (is_specialization_of_friend): Fix overbroad check for
a non-template member function of a class template.

gcc/testsuite/ChangeLog:

* g++.dg/template/friend79.C: New test.
---
  gcc/cp/pt.cc |  1 +
  gcc/testsuite/g++.dg/template/friend79.C | 20 
  2 files changed, 21 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/template/friend79.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 7c2a5647665..a15d1d062c6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -1319,6 +1319,7 @@ is_specialization_of_friend (tree decl, tree friend_decl)
   of a template class, we want to check if DECL is a specialization
   if this.  */
if (TREE_CODE (friend_decl) == FUNCTION_DECL
+  && DECL_CLASS_SCOPE_P (friend_decl)
&& DECL_TEMPLATE_INFO (friend_decl)
&& !DECL_USE_TEMPLATE (friend_decl))
  {
diff --git a/gcc/testsuite/g++.dg/template/friend79.C b/gcc/testsuite/g++.dg/template/friend79.C
new file mode 100644
index 000..cd2030df019
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/friend79.C
@@ -0,0 +1,20 @@
+// PR c++/109923
+
+template
+struct A {
+private:
+  int x;
+
+public:
+  A() : x(0) { }
+
+  friend void non_templ_friend(A val, A weird) {
+val.x++;   // always works
+weird.x++; // { dg-error "private" } should only work when T=void
+  }
+};
+
+int main() {
+  non_templ_friend(A(), A()); // { dg-bogus "" }
+  non_templ_friend(A(), A());  // { dg-message "required from here" }
+}

Re: [PATCH] c++: simplify TEMPLATE_TEMPLATE_PARM hashing

2023-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/23 10:29, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?


OK.


-- >8 --

r10-7815-gaa576f2a860c82 added special hashing for TEMPLATE_TEMPLATE_PARM
since non-lowered ttps had TYPE_CANONICAL but level lowered ttps did not.
But this is no longer the case ever since r13-737-gd0ef9e06197d14 made
us set TYPE_CANONICAL for level lowered ttps as well.  So this special
hashing is now unnecessary, and we can fall back to using TYPE_CANONICAL.

gcc/cp/ChangeLog:

* pt.cc (iterative_hash_template_arg): Don't hash
TEMPLATE_TEMPLATE_PARM specially.
---
  gcc/cp/pt.cc | 13 +
  1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 688a87a4bd3..7c2a5647665 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -1879,19 +1879,8 @@ iterative_hash_template_arg (tree arg, hashval_t val)
  return hash_tmpl_and_args (TI_TEMPLATE (ti), TI_ARGS (ti));
}
  
-  switch (TREE_CODE (arg))

+  switch (code)
{
-   case TEMPLATE_TEMPLATE_PARM:
- {
-   tree tpi = TEMPLATE_TYPE_PARM_INDEX (arg);
-
-   /* Do not recurse with TPI directly, as that is unbounded
-  recursion.  */
-   val = iterative_hash_object (TEMPLATE_PARM_LEVEL (tpi), val);
-   val = iterative_hash_object (TEMPLATE_PARM_IDX (tpi), val);
- }
- break;
-
case  DECLTYPE_TYPE:
  val = iterative_hash_template_arg (DECLTYPE_TYPE_EXPR (arg), val);
  break;

Re: [PATCH] c++: replace in_template_function

2023-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/23 11:15, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?


OK.


-- >8 --

All uses of in_template_function besides the one in cp_make_fname_decl
seem like they could be generalized to apply to all template contexts,
not just function templates.  To that end this patch replaces the
predicate with a cheaper and more general in_template_context predicate
that returns true for all template contexts.  If we legitimately need
to consider only function template contexts, as in cp_make_fname_decl,
we can just additionally check e.g. current_function_decl.

One concrete benefit of this is that we no longer instantiate/odr-use
entities based on uses within a non-function template such as in the
adjusted testcase below.

gcc/cp/ChangeLog:

* class.cc (build_base_path): Check in_template_context instead
of in_template_function.
(resolves_to_fixed_type_p): Likewise.
* cp-tree.h (in_template_context): Define.
(in_template_function): Remove.
* decl.cc (cp_make_fname_decl): Check current_function_decl
and in_template_context instead of in_template_function.
* decl2.cc (mark_used): Check in_template_context instead of
in_template_function.
* pt.cc (in_template_function): Remove.
* semantics.cc (enforce_access): Check in_template_context
instead of current_template_parms directly.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Waddress-of-packed-member2.C: No longer expect a()
to be marked as odr-used.
---
  gcc/cp/class.cc   |  4 ++--
  gcc/cp/cp-tree.h  |  2 +-
  gcc/cp/decl.cc|  2 +-
  gcc/cp/decl2.cc   |  2 +-
  gcc/cp/pt.cc  | 19 ---
  gcc/cp/semantics.cc   |  2 +-
  .../g++.dg/warn/Waddress-of-packed-member2.C  |  2 +-
  7 files changed, 7 insertions(+), 26 deletions(-)

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index bc84f4f731a..778759237dc 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -344,7 +344,7 @@ build_base_path (enum tree_code code,
  
bool uneval = (cp_unevaluated_operand != 0

 || processing_template_decl
-|| in_template_function ());
+|| in_template_context);
  
/* For a non-pointer simple base reference, express it as a COMPONENT_REF

   without taking its address (and so causing lambda capture, 91933).  */
@@ -8055,7 +8055,7 @@ resolves_to_fixed_type_p (tree instance, int* nonnull)
/* processing_template_decl can be false in a template if we're in
   instantiate_non_dependent_expr, but we still want to suppress
   this check.  */
-  if (in_template_function ())
+  if (in_template_context)
  {
/* In a template we only care about the type of the result.  */
if (nonnull)
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a1b882f11fe..ce2095c7aaa 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -1924,6 +1924,7 @@ extern GTY(()) struct saved_scope *scope_chain;
  #define current_template_parms scope_chain->template_parms
  #define current_template_depth \
(current_template_parms ? TMPL_PARMS_DEPTH (current_template_parms) : 0)
+#define in_template_context (current_template_parms != NULL_TREE)
  
  #define processing_template_decl scope_chain->x_processing_template_decl

  #define processing_specialization scope_chain->x_processing_specialization
@@ -7353,7 +7354,6 @@ extern tree lookup_template_variable  (tree, tree);
  extern bool uses_template_parms   (tree);
  extern bool uses_template_parms_level (tree, int);
  extern bool uses_outer_template_parms_in_constraints (tree);
-extern bool in_template_function   (void);
  extern bool need_generic_capture  (void);
  extern tree instantiate_class_template(tree);
  extern tree instantiate_template  (tree, tree, tsubst_flags_t);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index a672e4844f1..3985c6d2d1f 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -5021,7 +5021,7 @@ cp_make_fname_decl (location_t loc, tree id, int type_dep)
tree domain = NULL_TREE;
tree init = NULL_TREE;
  
-  if (!(type_dep && in_template_function ()))

+  if (!(type_dep && current_function_decl && in_template_context))
  {
const char *name = NULL;
bool release_name = false;
diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b510cdac554..b402befba6d 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -5782,7 +5782,7 @@ mark_used (tree decl, tsubst_flags_t complain /* = tf_warning_or_error */)
  && DECL_OMP_DECLARE_REDUCTION_P (decl)))
  maybe_instantiate_decl (decl);
  
-  if (processing_template_decl || in_template_function ())

+  if (processing_template_decl || in_template_context)

Re: [PATCH] c++: bad 'this' conversion for nullary memfn [PR106760]

2023-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/23 11:55, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

Here we notice the 'this' conversion for the call f() is bad, so
we correctly defer instantiating it, but we end up never adding it to
'bad_fns' since missing_conversion_p for it returns false (its only
argument is the already computed 'this' argument).  This is not a huge
deal, but it causes us to no longer accept the call with -fpermissive.

So if we have a non-strictly viable template candidate that has not been
instantiated, then we need to add it to 'bad_fns' even if it doesn't
have any missing conversions.


I wonder about using ck_deferred_bad in add_template_candidate for this 
case rather than changing the test here?  OK either way.



PR c++/106760

gcc/cp/ChangeLog:

* call.cc (add_candidates): Relax test for adding a candidate
to 'bad_fns' to accept an uninstantiated template candidate that
has no missing conversions.

gcc/testsuite/ChangeLog:

* g++.dg/ext/conv3.C: New test.
---
  gcc/cp/call.cc   |  3 ++-
  gcc/testsuite/g++.dg/ext/conv3.C | 16 
  2 files changed, 18 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/ext/conv3.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 2736f55f229..dbf42567cc9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -6632,7 +6632,8 @@ add_candidates (tree fns, tree first_arg, const vec *args,
  
if (cand->viable == -1

  && shortcut_bad_convs
- && missing_conversion_p (cand))
+ && (TREE_CODE (cand->fn) == TEMPLATE_DECL
+ || missing_conversion_p (cand)))
{
  /* This candidate has been tentatively marked non-strictly viable,
 and we didn't compute all argument conversions for it (having
diff --git a/gcc/testsuite/g++.dg/ext/conv3.C b/gcc/testsuite/g++.dg/ext/conv3.C
new file mode 100644
index 000..5f4b4d4cc50
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/conv3.C
@@ -0,0 +1,16 @@
+// PR c++/106760
+// { dg-additional-options "-fpermissive" }
+
+struct S {
+  template
+  int f(...);
+
+  int g() const {
+return f(); // { dg-warning "discards qualifiers" }
+  }
+};
+
+int main() {
+  S s;
+  s.g();
+}

Re: [PATCH][committed] btf: Fix -Wformat errors

2023-06-02 Thread Iain Sandoe




> On 2 Jun 2023, at 17:18, Rainer Orth  wrote:
> 
> Hi Alex,
> 
>> g:7aae58b04b92303ccda3ead600be98f0d4b7f462 introduced -Wformat errors
>> breaking bootstrap on some targets. This patch fixes that.
>> 
>> Committed as obvious.
>> 
>> Thanks,
>> Alex
>> 
>> gcc/ChangeLog:
>> 
>>  * btfout.cc (btf_asm_type): Use PRIu64 instead of %lu for uint64_t.
>>  (btf_asm_datasec_type): Likewise.
> 
> This is PR libstdc++/110077.  Btw., your fix is incomplete: it needs
> another change (%lu -> %zu) in btf_asm_func_type.

I think there’s a typo in the PR ref?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110073

Iain

Re: [PATCH 1/2] c++: refine dependent_alias_template_spec_p [PR90679]

2023-06-02 Thread Patrick Palka via Gcc-patches

On Thu, 1 Jun 2023, Patrick Palka wrote:

> For a complex alias template-id, dependent_alias_template_spec_p returns
> true if any template argument of the template-id is dependent.  This
> predicate indicates that substitution into the template-id may behave
> differently with respect to SFINAE than substitution into the expanded
> alias, and so the alias is in a way non-transparent.  For example
> 'first_t' in
> 
>   template using first_t = T;
>   template first_t f();
> 
> is such an alias template-id since first_t doesn't use its second
> template parameter and so the substitution into the expanded alias would
> discard the SFINAE effects of the corresponding (dependent) argument 'T&'.
> 
> But this predicate is overly conservative since what really matters for
> sake of SFINAE equivalence is whether a template argument corresponding
> to an _unused_ template parameter is dependent.  So the predicate should
> return false for e.g. 'first_t' or 'first_t'.
> 
> This patch refines the predicate appropriately.  We need to be able to
> efficiently determine which template parameters of a complex alias
> template are unused, so to that end we add a new out parameter to
> complex_alias_template_p and cache its result in an on-the-side
> hash_map that replaces the existing TEMPLATE_DECL_COMPLEX_ALIAS_P
> flag.  And in doing so, we fix a latent bug that this flag wasn't
> being propagated during partial instantiation, and so we were treating
> all partially instantiated member alias templates as non-complex.

Whoops, this last sentence is wrong I think.  The flag propagation would
have happened via the call to copy_decl from tsubst_template_decl, so
there was no latent bug.

> 
>   PR c++/90679
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (TEMPLATE_DECL_COMPLEX_ALIAS_P): Remove.
>   (most_general_template): Constify parameter.
>   * pt.cc (push_template_decl): Adjust after removing
>   TEMPLATE_DECL_COMPLEX_ALIAS_P.
>   (complex_alias_tmpl_info): New hash_map.
>   (uses_all_template_parms_data::seen): Change type to
>   tree* from bool*.
>   (complex_alias_template_r): Adjust accordingly.
>   (complex_alias_template_p): Add 'seen_out' out parameter.
>   Call most_general_template and check PRIMARY_TEMPLATE_P.
>   Use complex_alias_tmpl_info to cache the result and set
>   '*seen_out' accordingly.
>   (dependent_alias_template_spec_p): Add !processing_template_decl
>   early exit test.  Consider dependence of only template arguments
>   corresponding to seen template parameters as per
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp0x/alias-decl-75.C: New test.
> ---
>  gcc/cp/cp-tree.h   |   7 +-
>  gcc/cp/pt.cc   | 101 +++--
>  gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C |  24 +
>  3 files changed, 100 insertions(+), 32 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C
> 
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index a1b882f11fe..5330d1e1f62 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -543,7 +543,6 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
> 2: DECL_THIS_EXTERN (in VAR_DECL, FUNCTION_DECL or PARM_DECL)
>DECL_IMPLICIT_TYPEDEF_P (in a TYPE_DECL)
>DECL_CONSTRAINT_VAR_P (in a PARM_DECL)
> -  TEMPLATE_DECL_COMPLEX_ALIAS_P (in TEMPLATE_DECL)
>DECL_INSTANTIATING_NSDMI_P (in a FIELD_DECL)
>USING_DECL_UNRELATED_P (in USING_DECL)
> 3: DECL_IN_AGGR_P.
> @@ -3655,10 +3654,6 @@ struct GTY(()) lang_decl {
>  #define TYPE_DECL_ALIAS_P(NODE) \
>DECL_LANG_FLAG_6 (TYPE_DECL_CHECK (NODE))
>  
> -/* Nonzero for TEMPLATE_DECL means that it is a 'complex' alias template.  */
> -#define TEMPLATE_DECL_COMPLEX_ALIAS_P(NODE) \
> -  DECL_LANG_FLAG_2 (TEMPLATE_DECL_CHECK (NODE))
> -
>  /* Nonzero for a type which is an alias for another type; i.e, a type
> which declaration was written 'using name-of-type =
> another-type'.  */
> @@ -7403,7 +7398,7 @@ extern tree tsubst_argument_pack(tree, tree, tsubst_flags_t, tree);
>  extern tree tsubst_template_args (tree, tree, tsubst_flags_t, tree);
>  extern tree tsubst_template_arg  (tree, tree, tsubst_flags_t, tree);
>  extern tree tsubst_function_parms(tree, tree, tsubst_flags_t, tree);
> -extern tree most_general_template(tree);
> +extern tree most_general_template(const_tree);
>  extern tree get_mostly_instantiated_function_type (tree);
>  extern bool problematic_instantiation_changed(void);
>  extern void record_last_problematic_instantiation (void);
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 7fb3e75bceb..1b28195e10d 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -211,7 +211,6 @@ static tree listify (tree);
>  static tree listify_autos (tree, tree);
>  static tree tsubst_template_parm (tree, tree, tsubst_flags_t);
>

Re: [PATCH][committed] btf: Fix -Wformat errors

2023-06-02 Thread Rainer Orth

Hi Alex,

> g:7aae58b04b92303ccda3ead600be98f0d4b7f462 introduced -Wformat errors
> breaking bootstrap on some targets. This patch fixes that.
>
> Committed as obvious.
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>   * btfout.cc (btf_asm_type): Use PRIu64 instead of %lu for uint64_t.
>   (btf_asm_datasec_type): Likewise.

This is PR libstdc++/110077.  Btw., your fix is incomplete: it needs
another change (%lu -> %zu) in btf_asm_func_type.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH 2/2] btf: improve -dA comments for testsuite

2023-06-02 Thread Alex Coplan via Gcc-patches

Hi Iain,

On 02/06/2023 09:32, Iain Sandoe wrote:
> Hi David,
> 
> > On 31 May 2023, at 07:13, Indu Bhagat via Gcc-patches 
> >  wrote:
> > 
> > On 5/30/23 11:27, David Faust wrote:
> >> [Changes from v1:
> >>  - Fix typos.
> >>  - Split unrelated change into separate commit.
> >>  - Improve asm comment for enum constants, update btf-enum-1 test.
> >>  - Improve asm comment for DATASEC records, update btf-datasec-2 test.]
> >> Many BTF type kinds refer to other types via index to the final types
> >> list. However, the order of the final types list is not guaranteed to
> >> remain the same for the same source program between different runs of
> >> the compiler, making it difficult to test inter-type references.
> >> This patch updates the assembler comments output when writing a
> >> given BTF record to include minimal information about the referenced
> >> type, if any. This allows for the regular expressions used in the gcc
> >> testsuite to do some basic integrity checks on inter-type references.
> >> For example, for the type
> >>unsigned int *
> >> Assembly comments like the following are written with -dA:
> >>.4byte  0   ; TYPE 2 BTF_KIND_PTR ''
> >>.4byte  0x200   ; btt_info: kind=2, kflag=0, vlen=0
> >>.4byte  0x1 ; btt_type: (BTF_KIND_INT 'unsigned int')
> >> Several BTF tests which can immediately be made more robust with this
> >> change are updated. It will also be useful in new tests for the upcoming
> >> btf_type_tag support.
> >> Re-tested on BPF and x86_64, no known regressions.
> >> Thanks.
> > 
> > LGTM.
> 
> This seems to break bootstrap on x86_64 darwin with two instances of :
> 
> gcc/btfout.cc:802:32: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘ctf_id_t’ {aka ‘long long unsigned int’} [-Werror=format=]
> 802 |"TYPE %lu BTF_KIND_%s '%s'"
> 
> And another on line 970.
> 
> could you suggest where the change should be?

I've pushed a fix for this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620515.html
as g:f2e60a00c7c017bd87ba9afb189cbb77d8c92925.

Thanks,
Alex

> thanks
> Iain
>

[PATCH][committed] btf: Fix -Wformat errors

2023-06-02 Thread Alex Coplan via Gcc-patches

Hi,

g:7aae58b04b92303ccda3ead600be98f0d4b7f462 introduced -Wformat errors
breaking bootstrap on some targets. This patch fixes that.

Committed as obvious.

Thanks,
Alex

gcc/ChangeLog:

* btfout.cc (btf_asm_type): Use PRIu64 instead of %lu for uint64_t.
(btf_asm_datasec_type): Likewise.
commit e2dc586ecadd2399d5ebb14094d78fff1e6caf55
Author: Alex Coplan 
Date:   Fri Jun 2 16:50:45 2023

btf: Fix -Wformat errors in btfout.cc

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index f51ccf73242..1ea68b9e8ba 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -799,7 +799,7 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
}
 
   dw2_asm_output_data (4, dtd->dtd_data.ctti_name,
-  "TYPE %lu BTF_KIND_%s '%s'",
+  "TYPE %" PRIu64 " BTF_KIND_%s '%s'",
   get_btf_id (dtd->dtd_type), btf_kind_name (btf_kind),
   dtd->dtd_name);
   dw2_asm_output_data (4, BTF_TYPE_INFO (btf_kind, btf_kflag, btf_vlen),
@@ -967,7 +967,7 @@ btf_asm_datasec_type (ctf_container_ref ctfc, btf_datasec_t ds, ctf_id_t id,
  size_t stroffset)
 {
   dw2_asm_output_data (4, ds.name_offset + stroffset,
-  "TYPE %lu BTF_KIND_DATASEC '%s'",
+  "TYPE %" PRIu64 " BTF_KIND_DATASEC '%s'",
   btf_absolute_datasec_id (id), ds.name);
   dw2_asm_output_data (4, BTF_TYPE_INFO (BTF_KIND_DATASEC, 0,
 ds.entries.length ()),

[PATCH] c++: bad 'this' conversion for nullary memfn [PR106760]

2023-06-02 Thread Patrick Palka via Gcc-patches

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

Here we notice the 'this' conversion for the call f() is bad, so
we correctly defer instantiating it, but we end up never adding it to
'bad_fns' since missing_conversion_p for it returns false (its only
argument is the already computed 'this' argument).  This is not a huge
deal, but it causes us to no longer accept the call with -fpermissive.

So if we have a non-strictly viable template candidate that has not been
instantiated, then we need to add it to 'bad_fns' even if it doesn't
have any missing conversions.

PR c++/106760

gcc/cp/ChangeLog:

* call.cc (add_candidates): Relax test for adding a candidate
to 'bad_fns' to accept an uninstantiated template candidate that
has no missing conversions.

gcc/testsuite/ChangeLog:

* g++.dg/ext/conv3.C: New test.
---
 gcc/cp/call.cc   |  3 ++-
 gcc/testsuite/g++.dg/ext/conv3.C | 16 
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/conv3.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 2736f55f229..dbf42567cc9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -6632,7 +6632,8 @@ add_candidates (tree fns, tree first_arg, const vec *args,
 
   if (cand->viable == -1
  && shortcut_bad_convs
- && missing_conversion_p (cand))
+ && (TREE_CODE (cand->fn) == TEMPLATE_DECL
+ || missing_conversion_p (cand)))
{
  /* This candidate has been tentatively marked non-strictly viable,
 and we didn't compute all argument conversions for it (having
diff --git a/gcc/testsuite/g++.dg/ext/conv3.C b/gcc/testsuite/g++.dg/ext/conv3.C
new file mode 100644
index 000..5f4b4d4cc50
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/conv3.C
@@ -0,0 +1,16 @@
+// PR c++/106760
+// { dg-additional-options "-fpermissive" }
+
+struct S {
+  template
+  int f(...);
+
+  int g() const {
+return f(); // { dg-warning "discards qualifiers" }
+  }
+};
+
+int main() {
+  S s;
+  s.g();
+}
-- 
2.41.0.rc1.10.g9e49351c30

[pushed] c++: fix explicit/copy problem [PR109247]

2023-06-02 Thread Jason Merrill via Gcc-patches

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

In the testcase, the user wants the assignment to use the operator= declared
in the class, but because [over.match.list] says that explicit constructors
are also considered for list-initialization, as affirmed in CWG1228, we end
up choosing the implicitly-declared copy assignment operator, using the
explicit constructor template for the argument, which is ill-formed.  Other
implementations haven't implemented CWG1228, so we keep getting bug reports.

Discussion in CWG led to the idea for this targeted relaxation: if we use an
explicit constructor for the conversion to the argument of a copy or move
special member function, that makes the candidate worse than another.

DR 2735
PR c++/109247

gcc/cp/ChangeLog:

* call.cc (sfk_copy_or_move): New.
(joust): Add tiebreaker for explicit conv and copy ctor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-explicit3.C: New test.
---
 gcc/cp/call.cc| 31 +++
 .../g++.dg/cpp0x/initlist-explicit3.C | 15 +
 2 files changed, 46 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-explicit3.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 5d504e5c696..d6154f1a319 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -12611,6 +12611,17 @@ cand_parms_match (z_candidate *c1, z_candidate *c2)
   return compparms (parms1, parms2);
 }
 
+/* True iff FN is a copy or move constructor or assignment operator.  */
+
+static bool
+sfk_copy_or_move (tree fn)
+{
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+return false;
+  special_function_kind sfk = special_function_p (fn);
+  return sfk >= sfk_copy_constructor && sfk <= sfk_move_assignment;
+}
+
 /* Compare two candidates for overloading as described in
[over.match.best].  Return values:
 
@@ -12910,6 +12921,26 @@ joust (struct z_candidate *cand1, struct z_candidate *cand2, bool warn,
return winner;
 }
 
+  /* CWG2735 (PR109247): A copy/move ctor/op= for which its operand uses an
+ explicit conversion (due to list-initialization) is worse.  */
+  {
+z_candidate *sp = nullptr;
+if (sfk_copy_or_move (cand1->fn))
+  sp = cand1;
+if (sfk_copy_or_move (cand2->fn))
+  sp = sp ? nullptr : cand2;
+if (sp)
+  {
+   conversion *conv = sp->convs[!DECL_CONSTRUCTOR_P (sp->fn)];
+   if (conv->user_conv_p)
+ for (; conv; conv = next_conversion (conv))
+   if (conv->kind == ck_user
+   && DECL_P (conv->cand->fn)
+   && DECL_NONCONVERTING_P (conv->cand->fn))
+ return (sp == cand1) ? -1 : 1;
+  }
+  }
+
   /* or, if not that,
  F1 is a non-template function and F2 is a template function
  specialization.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-explicit3.C b/gcc/testsuite/g++.dg/cpp0x/initlist-explicit3.C
new file mode 100644
index 000..b0c92784566
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-explicit3.C
@@ -0,0 +1,15 @@
+// PR c++/109247
+// { dg-do compile { target c++11 } }
+
+template  struct optional {
+  template  explicit optional(_Up);
+  template  void operator=(_Up);
+};
+int setPattern_pattern;
+struct SourceBrush {
+  struct Brush {
+int brush;
+  };
+  void setPattern() { m_brush = {setPattern_pattern}; }
+  optional m_brush;
+};

base-commit: 957798e44e7194f7b6a67b19f85ff72eab9a0d0e
-- 
2.31.1
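
The two small idioms used by the patch above (an enum range test in sfk_copy_or_move, and the "exactly one of the two candidates" selection in joust) can be sketched in isolation. This is only an illustration under stated assumptions: demo_sfk is a hypothetical stand-in for GCC's special_function_kind, and the sketch assumes, as the range test in the patch implies, that the four copy/move kinds are declared contiguously.

```cpp
// Hypothetical stand-in for special_function_kind.  The range test below
// is only valid because the four copy/move kinds are adjacent.
enum demo_sfk {
  demo_none = 0,
  demo_constructor,
  demo_copy_constructor,
  demo_move_constructor,
  demo_copy_assignment,
  demo_move_assignment,
  demo_destructor
};

// True iff K is a copy or move constructor or assignment operator.
bool is_copy_or_move(demo_sfk k)
{
  return k >= demo_copy_constructor && k <= demo_move_assignment;
}

// Returns 1 if only FIRST is a copy/move special member, 2 if only SECOND
// is, and 0 if neither or both are.  This mirrors the
// "sp = sp ? nullptr : cand2" step in the patch, which resets the
// selection when both candidates qualify.
int sole_copy_or_move(demo_sfk first, demo_sfk second)
{
  int sp = 0;
  if (is_copy_or_move(first))
    sp = 1;
  if (is_copy_or_move(second))
    sp = sp ? 0 : 2;
  return sp;
}
```

The tiebreaker only applies when exactly one candidate is a copy/move special member, so the reset-on-second-match step keeps the rule from firing when both are.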

[PATCH] libstdc++: Do not assume existence of char8_t codecvt facet

2023-06-02 Thread Joseph Faulls

It is not required that codecvt facet be supported by
the locale, nor is it added as part of the default locale. This can lead to
dangerous behaviour when static_cast.

libstdc++-v3/ChangeLog:

* include/bits/locale_classes.tcc: Remove check.
---
libstdc++-v3/include/bits/locale_classes.tcc | 3 ---
1 file changed, 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/locale_classes.tcc b/libstdc++-v3/include/bits/locale_classes.tcc
index 94838cd7796..2351dd5bcfb 100644
--- a/libstdc++-v3/include/bits/locale_classes.tcc
+++ b/libstdc++-v3/include/bits/locale_classes.tcc
@@ -129,9 +129,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _GLIBCXX_STD_FACET(time_put);
   _GLIBCXX_STD_FACET(messages);
#endif
-#ifdef _GLIBCXX_USE_CHAR8_T
-  _GLIBCXX_STD_FACET(codecvt);
-#endif
#if __cplusplus >= 201103L
   _GLIBCXX_STD_FACET(codecvt);
   _GLIBCXX_STD_FACET(codecvt);
--
2.34.1
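
To illustrate the general point that a locale need not contain every facet, here is a minimal user-level sketch. It does not reproduce the libstdc++ internals touched by the patch: demo_facet and value_or_default are hypothetical names, and the example only shows the standard has_facet check that avoids assuming a facet is present.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical user-defined facet; it is not installed in any locale
// unless a locale is explicitly constructed with it.
struct demo_facet : std::locale::facet {
  static std::locale::id id;
  explicit demo_facet(std::size_t refs = 0) : std::locale::facet(refs) {}
  int value() const { return 42; }
};
std::locale::id demo_facet::id;

// Query the facet only after checking that the locale actually holds it;
// std::use_facet would throw std::bad_cast otherwise.
int value_or_default(const std::locale &loc, int fallback)
{
  if (std::has_facet<demo_facet>(loc))
    return std::use_facet<demo_facet>(loc).value();
  return fallback;
}
```

With the classic locale the function returns the fallback; with std::locale(std::locale::classic(), new demo_facet) it returns the facet's value.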

[PATCH 12/12] [contrib] validate_failures.py: Ignore stray filesystem paths in results

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

This patch simplifies comparison of results that have filesystem
paths.  E.g., (assuming different values of ):

Running /home/user/gcc-N/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ...
ERROR: tcl error sourcing /home/user/gcc-N/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp.


We add "--srcpath ", option, and set it by default to
"[^ ]+/testsuite/", which works well for all components of the GNU
Toolchain.  We then remove substrings matching  from paths of
.exp files and from occasional "ERROR:" results.
---
 contrib/testsuite-management/validate_failures.py | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/contrib/testsuite-management/validate_failures.py b/contrib/testsuite-management/validate_failures.py
index a77aabe0bdd..4dfd9cda4e2 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -135,6 +135,9 @@ class TestResult(object):
 (self.state,
  self.name,
  self.description) = _VALID_TEST_RESULTS_REX.match(summary_line).groups()
+if _OPTIONS.srcpath_regex and _OPTIONS.srcpath_regex != '':
+  self.description = re.sub(_OPTIONS.srcpath_regex, '',
+self.description)
   except:
 print('Failed to parse summary line: "%s"' % summary_line,
   file=sys.stderr)
@@ -361,6 +364,9 @@ def ParseManifestWorker(result_set, manifest_path):
   result_set.add(result)
 elif IsExpLine(orig_line):
   result_set.current_exp = _EXP_LINE_REX.match(orig_line).groups()[0]
+  if _OPTIONS.srcpath_regex and _OPTIONS.srcpath_regex != '':
+result_set.current_exp = re.sub(_OPTIONS.srcpath_regex, '',
+result_set.current_exp)
 elif IsToolLine(orig_line):
   result_set.current_tool = _TOOL_LINE_REX.match(orig_line).groups()[0]
 elif IsSummaryLine(orig_line):
@@ -400,6 +406,9 @@ def ParseSummary(sum_fname):
   result_set.add(result)
 elif IsExpLine(line):
   result_set.current_exp = _EXP_LINE_REX.match(line).groups()[0]
+  if _OPTIONS.srcpath_regex and _OPTIONS.srcpath_regex != '':
+result_set.current_exp = re.sub(_OPTIONS.srcpath_regex, '',
+result_set.current_exp)
   result_set.testsuites.add((result_set.current_tool,
  result_set.current_exp))
 elif IsToolLine(line):
@@ -640,6 +649,12 @@ def Main(argv):
 help='Use provided date YYYYMMDD to decide whether '
 'manifest entries with expiry settings have expired '
 'or not. (default = Use today date)')
+  parser.add_option('--srcpath', action='store', type='string',
+dest='srcpath_regex', default='[^ ]+/testsuite/',
+help='Remove provided path (can be a regex) from '
+'the result entries.  This is useful to remove '
+'occasional filesystem path from the results. '
+'(default = "[^ ]+/testsuite/")')
   parser.add_option('--inverse_match', action='store_true',
 dest='inverse_match', default=False,
 help='Inverse result sets in comparison. '
-- 
2.34.1

[PATCH 08/12] [contrib] validate_failures.py: Support "$tool:" prefix in exp names

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

From: Christophe Lyon 

This makes it easier to extract the $tool:$exp pair when iterating
over failures/flaky tests, which, in turn, simplifies re-running
testsuite parts that have unexpected failures or passes.
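
A short sketch of how the new regex behaves, using the format string and pattern from this patch. The helper exp_name and the sample lines are assumptions made for illustration.

```python
import re

# Format and regex as introduced by the patch.
_EXP_LINE_FORMAT = '\nRunning %s:%s ...\n'
_EXP_LINE_REX = re.compile(r'^Running (?:.*:)?(.*) \.\.\.\n')


def exp_name(line):
    """Return the .exp name from a "Running ..." line, or None.

    Hypothetical helper; the script calls _EXP_LINE_REX.match() directly.
    """
    match = _EXP_LINE_REX.match(line)
    return match.group(1) if match else None


# A line with the "tool:" prefix and a plain DejaGnu-style line.
with_prefix = exp_name('Running gcc:gcc.dg/analyzer/analyzer.exp ...\n')
without_prefix = exp_name('Running gcc.dg/analyzer/analyzer.exp ...\n')

# Round trip: what the script writes is parsed back to the same name.
written = _EXP_LINE_FORMAT % ('gcc', 'gcc.dg/analyzer/analyzer.exp')
round_trip = exp_name(written.lstrip('\n'))

print(with_prefix, without_prefix, round_trip)
```

Because the optional prefix group is greedy up to the last colon, an .exp path that itself contains a colon would lose everything before that colon; this is an observation about the regex, not a case the patch discusses.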
---
 contrib/testsuite-management/validate_failures.py | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index c4b9fc377ce..6dcdcf5c69b 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -67,12 +67,14 @@ _VALID_TEST_RESULTS_REX = re.compile('(%s):\s*(\S+)\s*(.*)'
 
 # Formats of .sum file sections
 _TOOL_LINE_FORMAT = '\t\t=== %s tests ===\n'
-_EXP_LINE_FORMAT = '\nRunning %s ...\n'
+_EXP_LINE_FORMAT = '\nRunning %s:%s ...\n'
 _SUMMARY_LINE_FORMAT = '\n\t\t=== %s Summary ===\n'
 
 # ... and their compiled regexs.
 _TOOL_LINE_REX = re.compile('^\t\t=== (.*) tests ===\n')
-_EXP_LINE_REX = re.compile('^Running (.*\.exp) \.\.\.\n')
+# Match .exp file name, optionally prefixed by a "tool:" name and a
+# path ending with "testsuite/"
+_EXP_LINE_REX = re.compile('^Running (?:.*:)?(.*) \.\.\.\n')
 _SUMMARY_LINE_REX = re.compile('^\t\t=== (.*) Summary ===\n')
 
 # Subdirectory of srcdir in which to find the manifest file.
@@ -236,7 +238,7 @@ class ResultSet(set):
 outfile.write(_TOOL_LINE_FORMAT % current_tool)
   if current_exp != result.exp:
 current_exp = result.exp
-outfile.write(_EXP_LINE_FORMAT % current_exp)
+outfile.write(_EXP_LINE_FORMAT % (current_tool, current_exp))
   outfile.write('%s\n' % result)
 
 outfile.write(_SUMMARY_LINE_FORMAT % 'Results')
-- 
2.34.1

[PATCH 11/12] [contrib] validate_failures.py: Add "--expiry_date YYYYMMDD" option

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

This option sets the "today" date to compare expiration entries against.
Setting this date in the future allows re-detection of flaky tests and
creation of fresh entries for them before the current flaky entries
expire.
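
A minimal sketch of the date handling, assuming the YYYYMMDD form named in the subject line. The functions parse_today and has_expired are hypothetical stand-ins: the patch stores the parsed date in _OPTIONS.expiry_today_date and reports bad input through Error() rather than raising.

```python
import datetime
import re


def parse_today(value):
    """Return the date to treat as "today".

    value is a YYYYMMDD string, or None/empty to use the real current date.
    """
    if not value:
        return datetime.date.today()
    match = re.search(r'(\d\d\d\d)(\d\d)(\d\d)', value)
    if not match:
        raise ValueError('Invalid date "%s"; expected YYYYMMDD' % value)
    return datetime.date(int(match.group(1)),
                         int(match.group(2)),
                         int(match.group(3)))


def has_expired(expiration_date, today):
    """An entry has expired once "today" is past its expiration date."""
    return today > expiration_date


# Hypothetical manifest entry that expires on 2023-06-30.
expires = datetime.date(2023, 6, 30)
still_valid = has_expired(expires, parse_today('20230602'))
already_expired = has_expired(expires, parse_today('20230701'))
print(still_valid, already_expired)
```

Passing a date a few weeks ahead makes entries that are about to expire count as expired already, so the corresponding flaky tests show up again and can be re-recorded.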
---
 .../testsuite-management/validate_failures.py | 24 +--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 6eb1acd473f..a77aabe0bdd 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -206,8 +206,7 @@ class TestResult(object):
 # Return True if the expiration date of this result has passed.
 expiration_date = self.ExpirationDate()
 if expiration_date:
-  now = datetime.date.today()
-  return now > expiration_date
+  return _OPTIONS.expiry_today_date > expiration_date
 
 
 class ResultSet(set):
@@ -636,6 +635,11 @@ def Main(argv):
 default=False, help='When used with --produce_manifest, '
 'it will overwrite an existing manifest file '
 '(default = False)')
+  parser.add_option('--expiry_date', action='store',
+dest='expiry_today_date', default=None,
+help='Use provided date YYYYMMDD to decide whether '
+'manifest entries with expiry settings have expired '
+'or not. (default = Use today date)')
   parser.add_option('--inverse_match', action='store_true',
 dest='inverse_match', default=False,
 help='Inverse result sets in comparison. '
@@ -670,6 +674,22 @@ def Main(argv):
   global _OPTIONS
   (_OPTIONS, _) = parser.parse_args(argv[1:])
 
+  # Set "today" date to compare expiration entries against.
+  # Setting expiration date into the future allows re-detection of flaky
+  # tests and creating fresh entries for them before the current flaky entries
+  # expire.
+  if _OPTIONS.expiry_today_date:
+today_date = re.search(r'(\d\d\d\d)(\d\d)(\d\d)',
+   _OPTIONS.expiry_today_date)
+if not today_date:
+Error('Invalid --expiry_today_date format "%s".  Must be of the form '
+  '"expire=YYYYMMDD"' % _OPTIONS.expiry_today_date)
+_OPTIONS.expiry_today_date=datetime.date(int(today_date.group(1)),
+ int(today_date.group(2)),
+ int(today_date.group(3)))
+  else:
+_OPTIONS.expiry_today_date = datetime.date.today()
+
   if _OPTIONS.produce_manifest:
 retval = ProduceManifest()
   elif _OPTIONS.clean_build:
-- 
2.34.1

[PATCH 09/12] [contrib] validate_failures.py: Improve error output

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

From: Thiago Bauermann 

- Print message in case of broken sum file error.
- Print error messages to stderr.  The script's stdout is, usually,
  redirected to a file, and error messages shouldn't go there.
---
 contrib/testsuite-management/validate_failures.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 6dcdcf5c69b..1919935cf53 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -136,12 +136,15 @@ class TestResult(object):
  self.name,
  self.description) = 
_VALID_TEST_RESULTS_REX.match(summary_line).groups()
   except:
-print('Failed to parse summary line: "%s"' % summary_line)
+print('Failed to parse summary line: "%s"' % summary_line,
+  file=sys.stderr)
 raise
   self.ordinal = ordinal
   if tool == None or exp == None:
 # .sum file seem to be broken.  There was no "tool" and/or "exp"
 # lines preceding this result.
+print(f'.sum file seems to be broken: tool="{tool}", exp="{exp}", 
summary_line="{summary_line}"',
+  file=sys.stderr)
 raise
   self.tool = tool
   self.exp = exp
-- 
2.34.1

[PATCH 07/12] [contrib] validate_failures.py: Use exit code "2" to indicate regression

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

... in the results.  Python exits with code "1" on exceptions and
internal errors, which we use to detect failure to parse results.
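
A sketch of how a CI wrapper might tell the three outcomes apart. The function names and the outcome labels are assumptions for illustration; only the exit codes themselves come from the patch description.

```python
import subprocess


def classify(returncode):
    """Map the script's exit code to a coarse outcome label."""
    if returncode == 0:
        return 'ok'               # no unexpected results
    if returncode == 2:
        return 'regression'       # comparison ran and found differences
    return 'internal-error'       # e.g. 1: uncaught exception, parse failure


def run_and_classify(cmd):
    """Run cmd (a list of arguments) and classify its exit status.

    Hypothetical wrapper; cmd would be the validate_failures.py invocation.
    """
    return classify(subprocess.run(cmd).returncode)


print(classify(0), classify(1), classify(2))
```

Keeping code 1 for internal errors means a wrapper never mistakes a crashed comparison for a real regression.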
---
 contrib/testsuite-management/validate_failures.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index f2d7b099d78..c4b9fc377ce 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -645,7 +645,7 @@ def Main(argv):
   if retval:
 return 0
   else:
-return 1
+return 2
 
 
 if __name__ == '__main__':
-- 
2.34.1

[PATCH 05/12] [contrib] validate_failures.py: Add more verbosity levels

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

... to control validate_failures.py output
---
 .../testsuite-management/validate_failures.py | 82 +++
 1 file changed, 46 insertions(+), 36 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 1bd09e0c20c..26ea1d6f53b 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -324,7 +324,7 @@ def GetNegativeResult(line):
 
 def ParseManifestWorker(result_set, manifest_path):
   """Read manifest_path, adding the contents to result_set."""
-  if _OPTIONS.verbosity >= 1:
+  if _OPTIONS.verbosity >= 5:
 print('Parsing manifest file %s.' % manifest_path)
   manifest_file = open(manifest_path, encoding='latin-1', mode='r')
   for orig_line in manifest_file:
@@ -380,7 +380,8 @@ def ParseSummary(sum_fname):
 # Tests that have expired are not added to the set of expected
 # results. If they are still present in the set of actual results,
 # they will cause an error to be reported.
-print('WARNING: Expected failure "%s" has expired.' % line.strip())
+if _OPTIONS.verbosity >= 4:
+  print('WARNING: Expected failure "%s" has expired.' % line.strip())
 continue
   result_set.add(result)
 elif IsExpLine(line):
@@ -425,7 +426,8 @@ def GetResults(sum_files, build_results = None):
   if build_results == None:
 build_results = ResultSet()
   for sum_fname in sum_files:
-print('\t%s' % sum_fname)
+if _OPTIONS.verbosity >= 3:
+  print('\t%s' % sum_fname)
 build_results |= ParseSummary(sum_fname)
   return build_results
 
@@ -489,42 +491,46 @@ def GetBuildData():
   return None, None
   srcdir = GetMakefileValue('%s/Makefile' % _OPTIONS.build_dir, 'srcdir =')
   target = GetMakefileValue('%s/Makefile' % _OPTIONS.build_dir, 
'target_alias=')
-  print('Source directory: %s' % srcdir)
-  print('Build target: %s' % target)
+  if _OPTIONS.verbosity >= 3:
+print('Source directory: %s' % srcdir)
+print('Build target: %s' % target)
   return srcdir, target
 
 
-def PrintSummary(msg, summary):
-  print('\n\n%s' % msg)
+def PrintSummary(summary):
   summary.Print()
 
 def GetSumFiles(results, build_dir):
   if not results:
-print('Getting actual results from build directory %s' % build_dir)
+if _OPTIONS.verbosity >= 3:
+  print('Getting actual results from build directory %s' % build_dir)
 sum_files = CollectSumFiles(build_dir)
   else:
-print('Getting actual results from user-provided results')
+if _OPTIONS.verbosity >= 3:
+  print('Getting actual results from user-provided results')
 sum_files = results.split()
   return sum_files
 
 
-def PerformComparison(expected, actual, ignore_missing_failures):
+def PerformComparison(expected, actual):
   actual_vs_expected, expected_vs_actual = CompareResults(expected, actual)
 
   tests_ok = True
   if len(actual_vs_expected) > 0:
-PrintSummary('Unexpected results in this build (new failures)',
- actual_vs_expected)
+if _OPTIONS.verbosity >= 3:
+  print('\n\nUnexpected results in this build (new failures)')
+if _OPTIONS.verbosity >= 1:
+  PrintSummary(actual_vs_expected)
 tests_ok = False
 
-  if not ignore_missing_failures and len(expected_vs_actual) > 0:
-PrintSummary('Expected results not present in this build (fixed tests)'
- '\n\nNOTE: This is not a failure.  It just means that these '
- 'tests were expected\nto fail, but either they worked in '
- 'this configuration or they were not\npresent at all.\n',
- expected_vs_actual)
+  if _OPTIONS.verbosity >= 2 and len(expected_vs_actual) > 0:
+print('\n\nExpected results not present in this build (fixed tests)'
+  '\n\nNOTE: This is not a failure.  It just means that these '
+  'tests were expected\nto fail, but either they worked in '
+  'this configuration or they were not\npresent at all.\n')
+PrintSummary(expected_vs_actual)
 
-  if tests_ok:
+  if tests_ok and _OPTIONS.verbosity >= 3:
 print('\nSUCCESS: No unexpected failures.')
 
   return tests_ok
@@ -532,21 +538,25 @@ def PerformComparison(expected, actual, 
ignore_missing_failures):
 
 def CheckExpectedResults():
   manifest_path = GetManifestPath(True)
-  print('Manifest: %s' % manifest_path)
+  if _OPTIONS.verbosity >= 3:
+print('Manifest: %s' % manifest_path)
   manifest = GetManifest(manifest_path)
   sum_files = GetSumFiles(_OPTIONS.results, _OPTIONS.build_dir)
   actual = GetResults(sum_files)
 
-  if _OPTIONS.verbosity >= 1:
-PrintSummary('Tests expected to fail', manifest)
-PrintSummary('\nActual test results', actual)
+  if _OPTIONS.verbosity >= 5:
+print('\n\nTests expected to fail')
+PrintSummary(manifest)
+print('\n\nActual test results')
+PrintSummary(actual)
 
-  return

[PATCH 06/12] [contrib] validate_failures.py: Be more stringent in parsing result lines

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

Before this patch we would identify the malformed line
"UNRESOLVEDTest run by tcwg-buildslave on Mon Aug 23 10:17:50 2021"
as an interesting result, only to fail in TestResult:__init__ due
to the missing ":" after UNRESOLVED.

This patch makes all places that parse result lines use a single
compiled regex.
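
A small sketch contrasting the old and the stricter check. The old pattern is the one removed by this patch. The stricter pattern follows the shape visible in a later hunk header of this series; how it is built with the "%" join, and the use of match() for the old check, are assumptions.

```python
import re

_VALID_TEST_RESULTS = ['FAIL', 'UNRESOLVED', 'XPASS', 'ERROR']

# Old check: any line that merely starts with a state name matches.
old_rex = re.compile('%s' % '|'.join(_VALID_TEST_RESULTS))

# Stricter check: "<state>:" followed by a test name and an optional
# description.  Assumed construction, see the note above.
new_rex = re.compile(r'(%s):\s*(\S+)\s*(.*)' % '|'.join(_VALID_TEST_RESULTS))

malformed = 'UNRESOLVEDTest run by tcwg-buildslave on Mon Aug 23 10:17:50 2021'
wellformed = 'FAIL: gcc.dg/foo.c (test for excess errors)'  # made-up test name

old_accepts_malformed = old_rex.match(malformed) is not None
new_accepts_malformed = new_rex.match(malformed) is not None
parsed = new_rex.match(wellformed).groups()

print(old_accepts_malformed, new_accepts_malformed, parsed)
```

With one shared regex, a line is either accepted everywhere or rejected everywhere, so the later parse in TestResult cannot fail on a line that an earlier filter let through.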
---
 contrib/testsuite-management/validate_failures.py | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 26ea1d6f53b..f2d7b099d78 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -60,9 +60,10 @@ import os
 import re
 import sys
 
-# Handled test results.
 _VALID_TEST_RESULTS = [ 'FAIL', 'UNRESOLVED', 'XPASS', 'ERROR' ]
-_VALID_TEST_RESULTS_REX = re.compile("%s" % "|".join(_VALID_TEST_RESULTS))
+# :

[PATCH 02/12] [contrib] validate_failures.py: Support expiry attributes in manifests

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

---
 contrib/testsuite-management/validate_failures.py | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 94ba2e58b51..7351ba120b7 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -338,7 +338,13 @@ def ParseManifestWorker(result_set, manifest_path):
 elif IsInclude(line):
   ParseManifestWorker(result_set, GetIncludeFile(line, manifest_path))
 elif IsInterestingResult(line):
-  result_set.add(result_set.MakeTestResult(line))
+  result = result_set.MakeTestResult(line)
+  if result.HasExpired():
+# Ignore expired manifest entries.
+if _OPTIONS.verbosity >= 4:
+  print('WARNING: Expected failure "%s" has expired.' % line.strip())
+continue
+  result_set.add(result)
 elif IsExpLine(orig_line):
   result_set.current_exp = _EXP_LINE_REX.match(orig_line).groups()[0]
 elif IsToolLine(orig_line):
@@ -369,6 +375,8 @@ def ParseSummary(sum_fname):
   result = result_set.MakeTestResult(line, ordinal)
   ordinal += 1
   if result.HasExpired():
+# ??? What is the use-case for this?  How "expiry" annotations are
+# ??? supposed to be added to .sum results?
 # Tests that have expired are not added to the set of expected
 # results. If they are still present in the set of actual results,
 # they will cause an error to be reported.
-- 
2.34.1

[PATCH 03/12] [contrib] validate_failures.py: Read in manifest when comparing build dirs

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

This allows comparison of two build directories with a manifest
listing known flaky tests on the side.
---
 contrib/testsuite-management/validate_failures.py | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 7351ba120b7..4733dd89dcb 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -420,9 +420,10 @@ def CollectSumFiles(builddir):
   return sum_files
 
 
-def GetResults(sum_files):
+def GetResults(sum_files, build_results = None):
   """Collect all the test results from the given .sum files."""
-  build_results = ResultSet()
+  if build_results == None:
+build_results = ResultSet()
   for sum_fname in sum_files:
 print('\t%s' % sum_fname)
 build_results |= ParseSummary(sum_fname)
@@ -567,8 +568,15 @@ def CompareBuilds():
   sum_files = GetSumFiles(_OPTIONS.results, _OPTIONS.build_dir)
   actual = GetResults(sum_files)
 
+  clean = ResultSet()
+
+  if _OPTIONS.manifest:
+manifest_path = GetManifestPath(srcdir, target, True)
+print('Manifest: %s' % manifest_path)
+clean = GetManifest(manifest_path)
+
   clean_sum_files = GetSumFiles(_OPTIONS.results, _OPTIONS.clean_build)
-  clean = GetResults(clean_sum_files)
+  clean = GetResults(clean_sum_files, clean)
 
   return PerformComparison(clean, actual, _OPTIONS.ignore_missing_failures)
 
-- 
2.34.1

[PATCH 10/12] [contrib] validate_failures.py: Add new option --invert_match

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

This option is used to detect flaky tests that FAILed in the clean
build (or manifest), but PASSed in the current build (or manifest).

The option inverts output logic similar to what "-v/--invert-match"
does for grep.
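
A simplified sketch of the inversion, using plain sets of strings. The function names and test names are hypothetical, and the real CompareResults() additionally skips flaky entries and testsuites that were not run.

```python
def compare(expected, actual):
    """Return (actual_vs_expected, expected_vs_actual) as set differences."""
    return actual - expected, expected - actual


def perform_comparison(expected, actual, inverse_match=False):
    """Return (unexpected, missing); swap the two when inverse_match is set."""
    unexpected, missing = compare(expected, actual)
    if inverse_match:
        unexpected, missing = missing, unexpected
    return unexpected, missing


# flaky_test FAILed in the clean build but PASSed in the current build,
# so it is absent from the current set of failures.
clean = {'FAIL: flaky_test', 'FAIL: known_bad'}
current = {'FAIL: known_bad'}

normal = perform_comparison(clean, current)
inverted = perform_comparison(clean, current, inverse_match=True)
print(normal)
print(inverted)
```

In the normal direction the FAIL->PASS transition only appears among the "fixed" tests and does not affect the exit status. With the inversion it is reported as an unexpected result, so it can be collected like any other regression.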
---
 .../testsuite-management/validate_failures.py | 34 +--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 1919935cf53..6eb1acd473f 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -217,11 +217,17 @@ class ResultSet(set):
   Attributes:
 current_tool: Name of the current top-level DejaGnu testsuite.
 current_exp: Name of the current .exp testsuite file.
+testsuites: A set of (tool, exp) tuples representing encountered 
testsuites.
   """
 
   def __init__(self):
 super().__init__()
 self.ResetToolExp()
+self.testsuites=set()
+
+  def update(self, other):
+super().update(other)
+self.testsuites.update(other.testsuites)
 
   def ResetToolExp(self):
 self.current_tool = None
@@ -246,6 +252,10 @@ class ResultSet(set):
 
 outfile.write(_SUMMARY_LINE_FORMAT % 'Results')
 
+  # Check if testsuite of expected_result is present in current results.
+  # This is used to compare partial test results against a full manifest.
+  def HasTestsuite(self, expected_result):
+return (expected_result.tool, expected_result.exp) in self.testsuites
 
 def GetMakefileValue(makefile_name, value_name):
   if os.path.exists(makefile_name):
@@ -391,6 +401,8 @@ def ParseSummary(sum_fname):
   result_set.add(result)
 elif IsExpLine(line):
   result_set.current_exp = _EXP_LINE_REX.match(line).groups()[0]
+  result_set.testsuites.add((result_set.current_tool,
+ result_set.current_exp))
 elif IsToolLine(line):
   result_set.current_tool = _TOOL_LINE_REX.match(line).groups()[0]
   result_set.current_exp = None
@@ -433,7 +445,7 @@ def GetResults(sum_files, build_results = None):
   for sum_fname in sum_files:
 if _OPTIONS.verbosity >= 3:
   print('\t%s' % sum_fname)
-build_results |= ParseSummary(sum_fname)
+build_results.update(ParseSummary(sum_fname))
   return build_results
 
 
@@ -458,7 +470,11 @@ def CompareResults(manifest, actual):
 # Ignore tests marked flaky.
 if 'flaky' in expected_result.attrs:
   continue
-if expected_result not in actual:
+# We try to support comparing partial results vs full manifest
+# (e.g., manifest has failures for gcc, g++, gfortran, but we ran only
+# g++ testsuite).  To achieve this we record encountered testsuites in
+# actual.testsuites set, and then we check it here using HasTestsuite().
+if expected_result not in actual and actual.HasTestsuite(expected_result):
   manifest_vs_actual.add(expected_result)
 
   return actual_vs_manifest, manifest_vs_actual
@@ -520,6 +536,13 @@ def GetSumFiles(results, build_dir):
 def PerformComparison(expected, actual):
   actual_vs_expected, expected_vs_actual = CompareResults(expected, actual)
 
+  if _OPTIONS.inverse_match:
+# Switch results if inverse comparison is requested.
+# This is useful in detecting flaky tests that FAILed in expected set,
+# but PASSed in actual set.
+actual_vs_expected, expected_vs_actual \
+  = expected_vs_actual, actual_vs_expected
+
   tests_ok = True
   if len(actual_vs_expected) > 0:
 if _OPTIONS.verbosity >= 3:
@@ -613,6 +636,13 @@ def Main(argv):
 default=False, help='When used with --produce_manifest, '
 'it will overwrite an existing manifest file '
 '(default = False)')
+  parser.add_option('--inverse_match', action='store_true',
+dest='inverse_match', default=False,
+help='Inverse result sets in comparison. '
+'Output unexpected passes as unexpected failures and '
+'unexpected failures as unexpected passes. '
+'This is used to catch FAIL->PASS flaky tests. '
+'(default = False)')
   parser.add_option('--manifest', action='store', type='string',
 dest='manifest', default=None,
 help='Name of the manifest file to use (default = '
-- 
2.34.1

[PATCH 04/12] [contrib] validate_failures.py: Simplify GetManifestPath()

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

... and don't require a valid build directory when no data from it
is necessary.
---
 contrib/testsuite-management/validate_failures.py | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 4733dd89dcb..1bd09e0c20c 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -457,7 +457,7 @@ def CompareResults(manifest, actual):
   return actual_vs_manifest, manifest_vs_actual
 
 
-def GetManifestPath(srcdir, target, user_provided_must_exist):
+def GetManifestPath(user_provided_must_exist):
   """Return the full path to the manifest file."""
   manifest_path = _OPTIONS.manifest
   if manifest_path:
@@ -465,6 +465,7 @@ def GetManifestPath(srcdir, target, 
user_provided_must_exist):
   Error('Manifest does not exist: %s' % manifest_path)
 return manifest_path
   else:
+(srcdir, target) = GetBuildData()
 if not srcdir:
   Error('Could not determine the location of GCC\'s source tree. '
 'The Makefile does not contain a definition for "srcdir".')
@@ -530,8 +531,7 @@ def PerformComparison(expected, actual, 
ignore_missing_failures):
 
 
 def CheckExpectedResults():
-  srcdir, target = GetBuildData()
-  manifest_path = GetManifestPath(srcdir, target, True)
+  manifest_path = GetManifestPath(True)
   print('Manifest: %s' % manifest_path)
   manifest = GetManifest(manifest_path)
   sum_files = GetSumFiles(_OPTIONS.results, _OPTIONS.build_dir)
@@ -545,8 +545,7 @@ def CheckExpectedResults():
 
 
 def ProduceManifest():
-  (srcdir, target) = GetBuildData()
-  manifest_path = GetManifestPath(srcdir, target, False)
+  manifest_path = GetManifestPath(False)
   print('Manifest: %s' % manifest_path)
   if os.path.exists(manifest_path) and not _OPTIONS.force:
 Error('Manifest file %s already exists.\nUse --force to overwrite.' %
@@ -563,15 +562,13 @@ def ProduceManifest():
 
 
 def CompareBuilds():
-  (srcdir, target) = GetBuildData()
-
   sum_files = GetSumFiles(_OPTIONS.results, _OPTIONS.build_dir)
   actual = GetResults(sum_files)
 
   clean = ResultSet()
 
   if _OPTIONS.manifest:
-manifest_path = GetManifestPath(srcdir, target, True)
+manifest_path = GetManifestPath(True)
 print('Manifest: %s' % manifest_path)
 clean = GetManifest(manifest_path)
 
-- 
2.34.1

[PATCH 01/12] [contrib] validate_failures.py: Avoid testsuite aliasing

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

This patch adds tracking of the current testsuite "tool" and "exp"
to the processing of .sum files.  This avoids aliasing between
tests from different testsuites that have the same name+description.

E.g., this is necessary for testsuite/c-c++-common, which is run
for both gcc and g++ "tools".

This patch changes manifest format from ...

FAIL: gcc_test
FAIL: g++_test

... to ...

=== gcc tests ===
Running gcc/foo.exp ...
FAIL: gcc_test
=== gcc Summary ==
=== g++ tests ===
Running g++/bar.exp ...
FAIL: g++_test
=== g++ Summary ==
.

The new format uses the same formatting as DejaGnu's .sum files
to specify which "tool" and "exp" the test belongs to.
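
A minimal sketch of the aliasing problem, using named tuples in place of TestResult. The key types, the test name, and the .exp names are made up for illustration; the fields mirror those used by __eq__ and __hash__ before and after this patch.

```python
from collections import namedtuple

# Identity of a result before the patch ...
OldKey = namedtuple('OldKey', 'state name description')
# ... and after it, with the testsuite "tool" and "exp" included.
NewKey = namedtuple('NewKey', 'state tool exp name description')

# The same c-c++-common test failing under both the gcc and g++ tools.
runs = [('gcc', 'gcc.dg/dg.exp'), ('g++', 'g++.dg/dg.exp')]
name = 'c-c++-common/example.c'
description = '(test for excess errors)'

old_results = {OldKey('FAIL', name, description) for _tool, _exp in runs}
new_results = {NewKey('FAIL', tool, exp, name, description)
               for tool, exp in runs}

print(len(old_results), len(new_results))
```

With the old identity the two failures collapse into a single set element, so a fix in one tool could hide behind a remaining failure in the other. With tool and exp in the key they stay distinct.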
---
 .../testsuite-management/validate_failures.py | 137 +++---
 1 file changed, 115 insertions(+), 22 deletions(-)

diff --git a/contrib/testsuite-management/validate_failures.py 
b/contrib/testsuite-management/validate_failures.py
index 43d9d50af8d..94ba2e58b51 100755
--- a/contrib/testsuite-management/validate_failures.py
+++ b/contrib/testsuite-management/validate_failures.py
@@ -64,6 +64,16 @@ import sys
 _VALID_TEST_RESULTS = [ 'FAIL', 'UNRESOLVED', 'XPASS', 'ERROR' ]
 _VALID_TEST_RESULTS_REX = re.compile("%s" % "|".join(_VALID_TEST_RESULTS))
 
+# Formats of .sum file sections
+_TOOL_LINE_FORMAT = '\t\t=== %s tests ===\n'
+_EXP_LINE_FORMAT = '\nRunning %s ...\n'
+_SUMMARY_LINE_FORMAT = '\n\t\t=== %s Summary ===\n'
+
+# ... and their compiled regexs.
+_TOOL_LINE_REX = re.compile('^\t\t=== (.*) tests ===\n')
+_EXP_LINE_REX = re.compile('^Running (.*\.exp) \.\.\.\n')
+_SUMMARY_LINE_REX = re.compile('^\t\t=== (.*) Summary ===\n')
+
 # Subdirectory of srcdir in which to find the manifest file.
 _MANIFEST_SUBDIR = 'contrib/testsuite-management'
 
@@ -111,9 +121,11 @@ class TestResult(object):
 ordinal: Monotonically increasing integer.
  It is used to keep results for one .exp file sorted
  by the order the tests were run.
+tool: Top-level testsuite name (aka "tool" in DejaGnu parlance) of the 
test.
+exp: Name of .exp testsuite file.
   """
 
-  def __init__(self, summary_line, ordinal=-1):
+  def __init__(self, summary_line, ordinal, tool, exp):
 try:
   (self.attrs, summary_line) = SplitAttributesFromSummaryLine(summary_line)
   try:
@@ -125,6 +137,12 @@ class TestResult(object):
 print('Failed to parse summary line: "%s"' % summary_line)
 raise
   self.ordinal = ordinal
+  if tool == None or exp == None:
+# .sum file seem to be broken.  There was no "tool" and/or "exp"
+# lines preceding this result.
+raise
+  self.tool = tool
+  self.exp = exp
 except ValueError:
   Error('Cannot parse summary line "%s"' % summary_line)
 
@@ -133,14 +151,27 @@ class TestResult(object):
 self.state, summary_line, self))
 
   def __lt__(self, other):
-return (self.name < other.name or
-(self.name == other.name and self.ordinal < other.ordinal))
+if (self.tool != other.tool):
+  return self.tool < other.tool
+if (self.exp != other.exp):
+  return self.exp < other.exp
+if (self.name != other.name):
+  return self.name < other.name
+return self.ordinal < other.ordinal
 
   def __hash__(self):
-return hash(self.state) ^ hash(self.name) ^ hash(self.description)
-
+return (hash(self.state) ^ hash(self.tool) ^ hash(self.exp)
+^ hash(self.name) ^ hash(self.description))
+
+  # Note that we don't include "attrs" in this comparison.  This means that
+  # result entries "FAIL: test" and "flaky | FAIL: test" are considered
+  # the same.  Therefore the ResultSet will preserve only the first occurrence.
+  # In practice this means that flaky entries should precede expected fails
+  # entries.
   def __eq__(self, other):
 return (self.state == other.state and
+self.tool == other.tool and
+self.exp == other.exp and
 self.name == other.name and
 self.description == other.description)
 
@@ -174,6 +205,43 @@ class TestResult(object):
   return now > expiration_date
 
 
+class ResultSet(set):
+  """Describes a set of DejaGNU test results.
+  This set can be read in from .sum files or emitted as a manifest.
+
+  Attributes:
+current_tool: Name of the current top-level DejaGnu testsuite.
+current_exp: Name of the current .exp testsuite file.
+  """
+
+  def __init__(self):
+super().__init__()
+self.ResetToolExp()
+
+  def ResetToolExp(self):
+self.current_tool = None
+self.current_exp = None
+
+  def MakeTestResult(self, summary_line, ordinal=-1):
+return TestResult(summary_line, ordinal,
+  self.current_tool, self.current_exp)
+
+  def Print(self, outfile=sys.stdout):
+current_tool = None
+current_exp = None
+
+for result in sorted(self):
+  if current_tool != result.tool:
+current_tool = result.tool
+

[contrib] Extend and improve validate_failures.py

2023-06-02 Thread Maxim Kuvyrkov via Gcc-patches

This patch series extends and improves the validate_failures.py script
to provide a powerful tool for handling DejaGnu test results in an
automated CI environment.

Linaro TCWG uses validate_failures.py to ...
- compare test results without human oversight,
- detect unexpected FAILs vs baseline,
- detect unexpected PASSes vs baseline,
- automatically detect flaky tests,
- create lists of expected failures and flaky tests, see [1].

[1] 
https://ci.linaro.org/job/tcwg_gcc_check--master-arm-build/lastSuccessfulBuild/artifact/artifacts/sumfiles/xfails.xfail/*view*/

[PATCH] c++: replace in_template_function

2023-06-02 Thread Patrick Palka via Gcc-patches

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

All uses of in_template_function besides the one in cp_make_fname_decl
seem like they could be generalized to apply to all template contexts,
not just function templates.  To that end this patch replaces the
predicate with a cheaper and more general in_template_context predicate
that returns true for all template contexts.  If we legitimately need
to consider only function template contexts, as in cp_make_fname_decl,
we can just additionally check e.g. current_function_decl.

One concrete benefit of this is that we no longer instantiate/odr-use
entities based on uses within a non-function template such as in the
adjusted testcase below.

gcc/cp/ChangeLog:

* class.cc (build_base_path): Check in_template_context instead
of in_template_function.
(resolves_to_fixed_type_p): Likewise.
* cp-tree.h (in_template_context): Define.
(in_template_function): Remove.
* decl.cc (cp_make_fname_decl): Check current_function_decl
and in_template_context instead of in_template_function.
* decl2.cc (mark_used): Check in_template_context instead of
in_template_function.
* pt.cc (in_template_function): Remove.
* semantics.cc (enforce_access): Check in_template_context
instead of current_template_parms directly.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Waddress-of-packed-member2.C: No longer expect a()
to be marked as odr-used.
---
 gcc/cp/class.cc   |  4 ++--
 gcc/cp/cp-tree.h  |  2 +-
 gcc/cp/decl.cc|  2 +-
 gcc/cp/decl2.cc   |  2 +-
 gcc/cp/pt.cc  | 19 ---
 gcc/cp/semantics.cc   |  2 +-
 .../g++.dg/warn/Waddress-of-packed-member2.C  |  2 +-
 7 files changed, 7 insertions(+), 26 deletions(-)

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index bc84f4f731a..778759237dc 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -344,7 +344,7 @@ build_base_path (enum tree_code code,
 
   bool uneval = (cp_unevaluated_operand != 0
 || processing_template_decl
-|| in_template_function ());
+|| in_template_context);
 
   /* For a non-pointer simple base reference, express it as a COMPONENT_REF
  without taking its address (and so causing lambda capture, 91933).  */
@@ -8055,7 +8055,7 @@ resolves_to_fixed_type_p (tree instance, int* nonnull)
   /* processing_template_decl can be false in a template if we're in
  instantiate_non_dependent_expr, but we still want to suppress
  this check.  */
-  if (in_template_function ())
+  if (in_template_context)
 {
   /* In a template we only care about the type of the result.  */
   if (nonnull)
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a1b882f11fe..ce2095c7aaa 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -1924,6 +1924,7 @@ extern GTY(()) struct saved_scope *scope_chain;
 #define current_template_parms scope_chain->template_parms
 #define current_template_depth \
   (current_template_parms ? TMPL_PARMS_DEPTH (current_template_parms) : 0)
+#define in_template_context (current_template_parms != NULL_TREE)
 
 #define processing_template_decl scope_chain->x_processing_template_decl
 #define processing_specialization scope_chain->x_processing_specialization
@@ -7353,7 +7354,6 @@ extern tree lookup_template_variable  (tree, tree);
 extern bool uses_template_parms(tree);
 extern bool uses_template_parms_level  (tree, int);
 extern bool uses_outer_template_parms_in_constraints (tree);
-extern bool in_template_function   (void);
 extern bool need_generic_capture   (void);
 extern tree instantiate_class_template (tree);
 extern tree instantiate_template   (tree, tree, tsubst_flags_t);
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index a672e4844f1..3985c6d2d1f 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -5021,7 +5021,7 @@ cp_make_fname_decl (location_t loc, tree id, int type_dep)
   tree domain = NULL_TREE;
   tree init = NULL_TREE;
 
-  if (!(type_dep && in_template_function ()))
+  if (!(type_dep && current_function_decl && in_template_context))
 {
   const char *name = NULL;
   bool release_name = false;
diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b510cdac554..b402befba6d 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -5782,7 +5782,7 @@ mark_used (tree decl, tsubst_flags_t complain /* = tf_warning_or_error */)
  && DECL_OMP_DECLARE_REDUCTION_P (decl)))
 maybe_instantiate_decl (decl);
 
-  if (processing_template_decl || in_template_function ())
+  if (processing_template_decl || in_template_context)
 return true;
 
   /* Check this too in case we're within instantiate_non_dependent_expr.

Re: [PATCH V2, rs6000] Disable generation of scalar modulo instructions

2023-06-02 Thread Pat Haugen via Gcc-patches


Ping ^3

On 4/18/23 7:22 AM, Pat Haugen via Gcc-patches wrote:

Updated from prior patch to also disable for int128.


Disable generation of scalar modulo instructions.

It was recently discovered that the scalar modulo instructions can suffer
noticeable performance issues for certain input values. This patch disables
their generation since the equivalent div/mul/sub sequence does not suffer
the same problem.
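
As a rough illustration of the replacement sequence (not the rs6000 expander itself), the remainder can be written as a divide, a multiply, and a subtract. The function names below are made up for this sketch, and the signed version assumes b is nonzero and the INT64_MIN / -1 overflow case does not occur:

```cpp
#include <cassert>
#include <cstdint>

// Signed remainder computed as quotient, multiply, subtract.  C++ integer
// division truncates toward zero, so this agrees with the % operator.
// Hypothetical helper for illustration; assumes b != 0.
static int64_t mod_via_div (int64_t a, int64_t b)
{
  int64_t q = a / b;   // corresponds to the div step
  int64_t p = q * b;   // corresponds to the mul step
  return a - p;        // corresponds to the sub step
}

// Unsigned variant of the same sequence; assumes b != 0.
static uint64_t umod_via_div (uint64_t a, uint64_t b)
{
  uint64_t q = a / b;
  return a - q * b;
}
```

The three operations mirror the temp1/temp2 values that the patch introduces in the expanders.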

Bootstrapped and regression tested on powerpc64/powerpc64le.
Ok for master and backports after burn in?

-Pat


2023-04-18  Pat Haugen  

gcc/
 * config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New.
 * config/rs6000/rs6000.md (mod3, *mod3): Disable.
 (define_expand umod3): New.
 (define_insn umod3): Rename to *umod3 and disable.
 (umodti3, modti3): Disable.

gcc/testsuite/
 * gcc.target/powerpc/clone1.c: Add xfails.
 * gcc.target/powerpc/clone3.c: Likewise.
 * gcc.target/powerpc/mod-1.c: Likewise.
 * gcc.target/powerpc/mod-2.c: Likewise.
 * gcc.target/powerpc/p10-vdivq-vmodq.c: Likewise.


diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 3503614efbd..1cf0a0013c0 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2492,3 +2492,9 @@ while (0)
     rs6000_asm_output_opcode (STREAM);    \
  }    \
    while (0)
+
+/* Disable generation of scalar modulo instructions due to performance issues
+   with certain input values. This can be removed in the future when the
+   issues have been resolved.  */
+#define RS6000_DISABLE_SCALAR_MODULO 1
+
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 44f7dd509cb..4f397bc9179 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3421,6 +3421,17 @@ (define_expand "mod3"
  FAIL;

    operands[2] = force_reg (mode, operands[2]);
+
+  if (RS6000_DISABLE_SCALAR_MODULO)
+    {
+  temp1 = gen_reg_rtx (mode);
+  temp2 = gen_reg_rtx (mode);
+
+  emit_insn (gen_div3 (temp1, operands[1], operands[2]));
+  emit_insn (gen_mul3 (temp2, temp1, operands[2]));
+  emit_insn (gen_sub3 (operands[0], operands[1], temp2));
+  DONE;
+    }
  }
    else
  {
@@ -3440,17 +3451,42 @@ (define_insn "*mod3"
    [(set (match_operand:GPR 0 "gpc_reg_operand" "=,r")
  (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
   (match_operand:GPR 2 "gpc_reg_operand" "r,r")))]
-  "TARGET_MODULO"
+  "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO"
    "mods %0,%1,%2"
    [(set_attr "type" "div")
     (set_attr "size" "")])

+;; This define_expand can be removed when RS6000_DISABLE_SCALAR_MODULO is
+;; removed.
+(define_expand "umod3"
+  [(set (match_operand:GPR 0 "gpc_reg_operand")
+    (umod:GPR (match_operand:GPR 1 "gpc_reg_operand")
+  (match_operand:GPR 2 "gpc_reg_operand")))]
+  ""
+{
+  rtx temp1;
+  rtx temp2;
+
+  if (!TARGET_MODULO)
+    FAIL;

-(define_insn "umod3"
+  if (RS6000_DISABLE_SCALAR_MODULO)
+    {
+  temp1 = gen_reg_rtx (mode);
+  temp2 = gen_reg_rtx (mode);
+
+  emit_insn (gen_udiv3 (temp1, operands[1], operands[2]));
+  emit_insn (gen_mul3 (temp2, temp1, operands[2]));
+  emit_insn (gen_sub3 (operands[0], operands[1], temp2));
+  DONE;
+    }
+})
+
+(define_insn "*umod3"
    [(set (match_operand:GPR 0 "gpc_reg_operand" "=,r")
  (umod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
    (match_operand:GPR 2 "gpc_reg_operand" "r,r")))]
-  "TARGET_MODULO"
+  "TARGET_MODULO && !RS6000_DISABLE_SCALAR_MODULO"
    "modu %0,%1,%2"
    [(set_attr "type" "div")
     (set_attr "size" "")])
@@ -3507,7 +3543,7 @@ (define_insn "umodti3"
    [(set (match_operand:TI 0 "altivec_register_operand" "=v")
  (umod:TI (match_operand:TI 1 "altivec_register_operand" "v")
   (match_operand:TI 2 "altivec_register_operand" "v")))]
-  "TARGET_POWER10 && TARGET_POWERPC64"
+  "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO"
    "vmoduq %0,%1,%2"
    [(set_attr "type" "vecdiv")
     (set_attr "size" "128")])
@@ -3516,7 +3552,7 @@ (define_insn "modti3"
    [(set (match_operand:TI 0 "altivec_register_operand" "=v")
  (mod:TI (match_operand:TI 1 "altivec_register_operand" "v")
  (match_operand:TI 2 "altivec_register_operand" "v")))]
-  "TARGET_POWER10 && TARGET_POWERPC64"
+  "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO"
    "vmodsq %0,%1,%2"
    [(set_attr "type" "vecdiv")
     (set_attr "size" "128")])
diff --git a/gcc/testsuite/gcc.target/powerpc/clone1.c b/gcc/testsuite/gcc.target/powerpc/clone1.c
index c69fd2aa1b8..74323ca0e8c 100644
--- a/gcc/testsuite/gcc.target/powerpc/clone1.c
+++ b/gcc/testsuite/gcc.target/powerpc/clone1.c
@@ -21,6 +21,7 @@ long mod_func_or (long a, long b, long c)
    return mod_func (a, b) | c;
  }

-/* { dg-final { scan-assembler-times {\mdivd\M}  1 } } */
-/* { dg-final { scan-assembler-times {\mmulld\M} 1 } }

PING: [PATCH] fix radix sort on 32bit platforms [PR109670]

2023-06-02 Thread Thomas Neumann via Gcc-patches

Summary: The radix sort did not handle the uppermost byte correctly, 
which sometimes broke win32 exceptions. Bugzilla #109670. The reporter 
confirmed that the patch fixes the bug.
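
The general hazard named in the summary can be sketched with a byte-wise LSD radix sort. This is not the libgcc code; the function name and the choice of 32-bit keys and 8-bit digits are assumptions for illustration. The point is that the pass loop is bounded by the key width, so the most significant byte is sorted too:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort on 32-bit keys, one byte per pass.  The loop bound is
// derived from the key width, so all four bytes, including the uppermost
// one, take part in the sort.  Illustrative only.
static void radix_sort_u32 (std::vector<uint32_t> &keys)
{
  std::vector<uint32_t> tmp (keys.size ());
  for (unsigned shift = 0; shift < 8 * sizeof (uint32_t); shift += 8)
    {
      // count[d + 1] first holds the number of keys whose digit is d.
      size_t count[257] = {0};
      for (uint32_t k : keys)
        count[((k >> shift) & 0xff) + 1]++;
      // Prefix sum: count[d] becomes the first output slot for digit d.
      for (int i = 0; i < 256; i++)
        count[i + 1] += count[i];
      // Stable scatter into the temporary buffer.
      for (uint32_t k : keys)
        tmp[count[(k >> shift) & 0xff]++] = k;
      keys.swap (tmp);
    }
}
```

Stopping the loop one pass early would leave keys that differ only in their top byte in arbitrary order, which is the kind of failure described above.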


See:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618000.html

Best

Thomas

Re: [RFC] Introduce -finline-memset-loops

2023-06-02 Thread Fangrui Song via Gcc-patches

On Fri, Jun 2, 2023 at 3:11 AM Alexandre Oliva via Gcc-patches
 wrote:
>
> On Jan 19, 2023, Alexandre Oliva  wrote:
>
> > Would it make more sense to extend it, even constrained by the
> > limitations mentioned above, or handle memset only?  In the latter case,
> > would it still make sense to adopt a command-line option that suggests a
> > broader effect than it already has, even if it's only a hopeful future
> > extension?  -finline-all-stringops[={memset,memcpy,...}], that you
> > suggested, seems to be a reasonable and extensible one to adopt.
>
> I ended up implementing all of memset, memcpy, memmove, and memcmp:
>
> Introduce -finline-stringops
>
> try_store_by_multiple_pieces was added not long ago, enabling
> variable-sized memset to be expanded inline when the worst-case
> in-range constant length would, using conditional blocks with powers
> of two to cover all possibilities of length and alignment.
>
> This patch introduces -finline-stringops[=fn] to request expansions to
> start with a loop, so as to still take advantage of known alignment
> even with long lengths, but without necessarily adding store blocks
> for every power of two.
>
> This makes it possible for the supported stringops (memset, memcpy,
> memmove, memcmp) to be expanded, even if storing a single byte per
> iteration.  Surely efficient implementations can run faster, with a
> pre-loop to increase alignment, but that would likely be excessive for
> inline expansions.
>
> Still, in some cases, such as in freestanding environments, users
> prefer to inline such stringops, especially those that the compiler
> may introduce itself, even if the expansion is not as performant as a
> highly optimized C library implementation could be, to avoid
> depending on a C runtime library.
>
> Regstrapped on x86_64-linux-gnu, also bootstrapped with
> -finline-stringops enabled by default, and tested with arm, aarch, 32-
> and 64-bit riscv with gcc-12.  Ok to install?
>[...]

This seems to be related to Clang's __builtin_mem{set,cpy}_inline. I just
created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110094 ("Support
__builtin_mem*_inline").

PING^2: [PATCH] release the sorted FDE array when deregistering a frame [PR109685]

2023-06-02 Thread Thomas Neumann via Gcc-patches

Summary: The old linear scan logic called free while searching the list
of frames. The atomic fast path finds the frame quickly, but forgot the
free call. This patch adds the missing free. Bugzilla #109685.


See:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619026.html

Best

Thomas

[COMMITTED] reg-stack: Change return type of predicate functions from int to bool

2023-06-02 Thread Uros Bizjak via Gcc-patches

Also change some internal variables to bool and recode handling of
boolean variables to not use bitwise or.
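
A small sketch of the kind of recoding described, using made-up names rather than the reg-stack functions: an int flag accumulated with `|=` becomes a bool that is set explicitly. The predicate is still called on every iteration, which matters when the callee has side effects:

```cpp
#include <cassert>

// Hypothetical predicate standing in for a helper that used to return int.
static bool is_negative (int x)
{
  return x < 0;
}

// Old style: an int flag accumulated with bitwise or.
static int any_negative_old (const int *v, int n)
{
  int found = 0;
  for (int i = 0; i < n; i++)
    found |= is_negative (v[i]);
  return found;
}

// New style: a bool flag set explicitly.  The call is not short-circuited,
// so every element is still visited exactly as before.
static bool any_negative_new (const int *v, int n)
{
  bool found = false;
  for (int i = 0; i < n; i++)
    if (is_negative (v[i]))
      found = true;
  return found;
}
```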

gcc/ChangeLog:

* rtl.h (stack_regs_mentioned): Change return type from int to bool.
* reg-stack.cc (struct_block_info_def): Change "done" to bool.
(stack_regs_mentioned_p): Change return type from int to bool
and adjust function body accordingly.
(stack_regs_mentioned): Ditto.
(check_asm_stack_operands): Ditto.  Change "malformed_asm"
variable to bool.
(move_for_stack_reg): Recode handling of control_flow_insn_deleted.
(swap_rtx_condition_1): Change return type from int to bool
and adjust function body accordingly.  Change "r" variable to bool.
(swap_rtx_condition): Change return type from int to bool
and adjust function body accordingly.
(subst_stack_regs_pat): Recode handling of control_flow_insn_deleted.
(subst_stack_regs): Ditto.
(convert_regs_entry): Change return type from int to bool and adjust
function body accordingly.  Change "inserted" variable to bool.
(convert_regs_1): Recode handling of control_flow_insn_deleted.
(convert_regs_2): Recode handling of cfg_altered.
(convert_regs): Ditto.  Change "inserted" variable to bool.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/reg-stack.cc b/gcc/reg-stack.cc
index 635b5825333..e988b797c41 100644
--- a/gcc/reg-stack.cc
+++ b/gcc/reg-stack.cc
@@ -214,7 +214,7 @@ typedef struct block_info_def
   struct stack_def stack_in;   /* Input stack configuration.  */
   struct stack_def stack_out;  /* Output stack configuration.  */
   HARD_REG_SET out_reg_set;/* Stack regs live on output.  */
-  int done;/* True if block already converted.  */
+  bool done;   /* True if block already converted.  */
   int predecessors;/* Number of predecessors that need
   to be visited.  */
 } *block_info;
@@ -248,11 +248,11 @@ static rtx not_a_num;
 
 /* Forward declarations */
 
-static int stack_regs_mentioned_p (const_rtx pat);
+static bool stack_regs_mentioned_p (const_rtx pat);
 static void pop_stack (stack_ptr, int);
 static rtx *get_true_reg (rtx *);
 
-static int check_asm_stack_operands (rtx_insn *);
+static bool check_asm_stack_operands (rtx_insn *);
 static void get_asm_operands_in_out (rtx, int *, int *);
 static rtx stack_result (tree);
 static void replace_reg (rtx *, int);
@@ -262,8 +262,8 @@ static rtx_insn *emit_pop_insn (rtx_insn *, stack_ptr, rtx, enum emit_where);
 static void swap_to_top (rtx_insn *, stack_ptr, rtx, rtx);
 static bool move_for_stack_reg (rtx_insn *, stack_ptr, rtx);
 static bool move_nan_for_stack_reg (rtx_insn *, stack_ptr, rtx);
-static int swap_rtx_condition_1 (rtx);
-static int swap_rtx_condition (rtx_insn *, int &);
+static bool swap_rtx_condition_1 (rtx);
+static bool swap_rtx_condition (rtx_insn *, int &);
 static void compare_for_stack_reg (rtx_insn *, stack_ptr, rtx, bool);
 static bool subst_stack_regs_pat (rtx_insn *, stack_ptr, rtx);
 static void subst_asm_stack_regs (rtx_insn *, stack_ptr);
@@ -272,16 +272,16 @@ static void change_stack (rtx_insn *, stack_ptr, stack_ptr, enum emit_where);
 static void print_stack (FILE *, stack_ptr);
 static rtx_insn *next_flags_user (rtx_insn *, int &);
 
-/* Return nonzero if any stack register is mentioned somewhere within PAT.  */
+/* Return true if any stack register is mentioned somewhere within PAT.  */
 
-static int
+static bool
 stack_regs_mentioned_p (const_rtx pat)
 {
   const char *fmt;
   int i;
 
   if (STACK_REG_P (pat))
-return 1;
+return true;
 
   fmt = GET_RTX_FORMAT (GET_CODE (pat));
   for (i = GET_RTX_LENGTH (GET_CODE (pat)) - 1; i >= 0; i--)
@@ -292,25 +292,25 @@ stack_regs_mentioned_p (const_rtx pat)
 
  for (j = XVECLEN (pat, i) - 1; j >= 0; j--)
if (stack_regs_mentioned_p (XVECEXP (pat, i, j)))
- return 1;
+ return true;
}
   else if (fmt[i] == 'e' && stack_regs_mentioned_p (XEXP (pat, i)))
-   return 1;
+   return true;
 }
 
-  return 0;
+  return false;
 }
 
-/* Return nonzero if INSN mentions stacked registers, else return zero.  */
+/* Return true if INSN mentions stacked registers, else return zero.  */
 
-int
+bool
 stack_regs_mentioned (const_rtx insn)
 {
   unsigned int uid, max;
   int test;
 
   if (! INSN_P (insn) || !stack_regs_mentioned_data.exists ())
-return 0;
+return false;
 
   uid = INSN_UID (insn);
   max = stack_regs_mentioned_data.length ();
@@ -467,12 +467,12 @@ static bool any_malformed_asm;
follow.  Those rules are explained at the top of this file: the rule
numbers below refer to that explanation.  */
 
-static int
+static bool
 check_asm_stack_operands (rtx_insn *insn)
 {
   int i;
   int n_clobbers;
-  int malformed_asm = 0;
+  bool malformed_asm = false;
   rtx body = PATTERN (insn);
 
   char reg_used_as_output[FIRST_PSEUDO_REGISTER];
@@

[PATCH] c++: simplify TEMPLATE_TEMPLATE_PARM hashing

2023-06-02 Thread Patrick Palka via Gcc-patches

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

r10-7815-gaa576f2a860c82 added special hashing for TEMPLATE_TEMPLATE_PARM
since non-lowered ttps had TYPE_CANONICAL but level lowered ttps did not.
But this is no longer the case ever since r13-737-gd0ef9e06197d14 made
us set TYPE_CANONICAL for level lowered ttps as well.  So this special
hashing is now unnecessary, and we can fall back to using TYPE_CANONICAL.

gcc/cp/ChangeLog:

* pt.cc (iterative_hash_template_arg): Don't hash
TEMPLATE_TEMPLATE_PARM specially.
---
 gcc/cp/pt.cc | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 688a87a4bd3..7c2a5647665 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -1879,19 +1879,8 @@ iterative_hash_template_arg (tree arg, hashval_t val)
  return hash_tmpl_and_args (TI_TEMPLATE (ti), TI_ARGS (ti));
}
 
-  switch (TREE_CODE (arg))
+  switch (code)
{
-   case TEMPLATE_TEMPLATE_PARM:
- {
-   tree tpi = TEMPLATE_TYPE_PARM_INDEX (arg);
-
-   /* Do not recurse with TPI directly, as that is unbounded
-  recursion.  */
-   val = iterative_hash_object (TEMPLATE_PARM_LEVEL (tpi), val);
-   val = iterative_hash_object (TEMPLATE_PARM_IDX (tpi), val);
- }
- break;
-
case  DECLTYPE_TYPE:
  val = iterative_hash_template_arg (DECLTYPE_TYPE_EXPR (arg), val);
  break;
-- 
2.41.0.rc1.10.g9e49351c30

[PATCH] c++: is_specialization_of_friend confusion [PR109923]

2023-06-02 Thread Patrick Palka via Gcc-patches

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

The check for a non-template member function of a class template in
is_specialization_of_friend is overbroad, and accidentally holds for a
non-template hidden friend too, which causes the predicate to return
true for

  decl = void non_templ_friend(A, A)
  friend_decl = void non_templ_friend(A, A)

This patch refines the check appropriately.

PR c++/109923

gcc/cp/ChangeLog:

* pt.cc (is_specialization_of_friend): Fix overbroad check for
a non-template member function of a class template.

gcc/testsuite/ChangeLog:

* g++.dg/template/friend79.C: New test.
---
 gcc/cp/pt.cc |  1 +
 gcc/testsuite/g++.dg/template/friend79.C | 20 
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/template/friend79.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 7c2a5647665..a15d1d062c6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -1319,6 +1319,7 @@ is_specialization_of_friend (tree decl, tree friend_decl)
  of a template class, we want to check if DECL is a specialization
  if this.  */
   if (TREE_CODE (friend_decl) == FUNCTION_DECL
+  && DECL_CLASS_SCOPE_P (friend_decl)
   && DECL_TEMPLATE_INFO (friend_decl)
   && !DECL_USE_TEMPLATE (friend_decl))
 {
diff --git a/gcc/testsuite/g++.dg/template/friend79.C 
b/gcc/testsuite/g++.dg/template/friend79.C
new file mode 100644
index 000..cd2030df019
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/friend79.C
@@ -0,0 +1,20 @@
+// PR c++/109923
+
+template
+struct A {
+private:
+  int x;
+
+public:
+  A() : x(0) { }
+
+  friend void non_templ_friend(A val, A weird) {
+val.x++;   // always works
+weird.x++; // { dg-error "private" } should only work when T=void
+  }
+};
+
+int main() {
+  non_templ_friend(A(), A()); // { dg-bogus "" }
+  non_templ_friend(A(), A());  // { dg-message "required from here" }
+}
-- 
2.41.0.rc1.10.g9e49351c30

Re: [PATCH 1/4] rs6000: build constant via li;rotldi

2023-06-02 Thread David Edelsohn via Gcc-patches

Hi, Jiufu

* config/rs6000/rs6000.cc (can_be_rotated_to_possitive_li): New function.
(can_be_rotated_to_negative_li): New function.
(can_be_built_by_li_and_rotldi): New function.
(rs6000_emit_set_long_const): Call can_be_built_by_li_and_rotldi.

In English the word "positive" contains one "s", not two.  Please
correct throughout the patches.

Also a style issue, comments before a function should be followed by a
blank line.

> +/* Check if C can be rotated to a possitive value which 'li' instruction

positive
> +   is able to load.  If so, set *ROT to the number by which C is rotated,
> +   and return true.  Return false otherwise.  */

Add a blank line here
> +static bool
> +can_be_rotated_to_possitive_li (HOST_WIDE_INT c, int *rot)

positive
> +{
> +  /* 49 leading zeros and 15 lowbits on the possitive value

low bits, positive

> + generated by 'li' instruction.  */
> +  return can_be_rotated_to_lowbits (c, 15, rot);
> +}

> +/* Check if value C can be built by 2 instructions: one is 'li', another is
> +   rotldi.
> +
> +   If so, *SHIFT is set to the shift operand of rotldi(rldicl), and *MASK
> +   is set to -1, and return true.  Return false otherwise.  */
> +static bool
> +can_be_built_by_li_and_rotldi (HOST_WIDE_INT c, int *shift,
> +HOST_WIDE_INT *mask)
> +{
> +  int n;
> +  if (can_be_rotated_to_possitive_li (c, )
> +  || can_be_rotated_to_negative_li (c, ))
> +{
> +  *mask = HOST_WIDE_INT_M1;
> +  *shift = HOST_BITS_PER_WIDE_INT - n;
> +  return true;
> +}
> +
> +  return false;
> +}
> +
>  /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
> Output insns to set DEST equal to the constant C as a series of
> lis, ori and shl instructions.  */
> @@ -10246,15 +10285,14 @@ static void
>  rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>  {
>    rtx temp;
> +  int shift;
> +  HOST_WIDE_INT mask;
>    HOST_WIDE_INT ud1, ud2, ud3, ud4;
>
>    ud1 = c & 0x;
> -  c = c >> 16;
> -  ud2 = c & 0x;
> -  c = c >> 16;
> -  ud3 = c & 0x;
> -  c = c >> 16;
> -  ud4 = c & 0x;
> +  ud2 = (c >> 16) & 0x;
> +  ud3 = (c >> 32) & 0x;
> +  ud4 = (c >> 48) & 0x;
>
>    if ((ud4 == 0x && ud3 == 0x && ud2 == 0x && (ud1 & 0x8000))
>        || (ud4 == 0 && ud3 == 0 && ud2 == 0 && ! (ud1 & 0x8000)))
> @@ -10278,6 +10316,19 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>        emit_move_insn (dest, gen_rtx_XOR (DImode, temp,
>                                           GEN_INT ((ud2 ^ 0x) << 16)));
>      }
> +  else if (can_be_built_by_li_and_rotldi (c, , ))
> +    {
> +      temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> +      unsigned HOST_WIDE_INT imm = (c | ~mask);
> +      imm = (imm >> shift) | (imm << (HOST_BITS_PER_WIDE_INT - shift));
> +
> +      emit_move_insn (temp, GEN_INT (imm));
> +      if (shift != 0)
> +        temp = gen_rtx_ROTATE (DImode, temp, GEN_INT (shift));
> +      if (mask != HOST_WIDE_INT_M1)

How can mask != HOST_WIDE_INT_M1 hold here? The call to
can_be_built_by_li_and_rotldi() sets it to that value, and it is not
modified in the intervening statements.

> +        temp = gen_rtx_AND (DImode, temp, GEN_INT (mask));
> +      emit_move_insn (dest, temp);
> +    }
>    else if (ud3 == 0 && ud4 == 0)
>      {
>        temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);

Thanks, David
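
The idea under review can be sketched outside the compiler as a brute-force search. This is not the rs6000 implementation; all names below are hypothetical, and the only assumption carried over is that li loads a sign-extended 16-bit immediate. The search looks for a rotate amount that turns the 64-bit constant into such an immediate and reports the left-rotate that rebuilds the constant:

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left by N bits (N taken modulo 64).
static uint64_t rotl64 (uint64_t v, unsigned n)
{
  n &= 63;
  return n ? (v << n) | (v >> (64 - n)) : v;
}

// True if V, viewed as a signed value, is a sign-extended 16-bit immediate,
// which is what this sketch assumes li can load.
static bool fits_li (uint64_t v)
{
  int64_t s = (int64_t) v;
  return s >= -32768 && s <= 32767;
}

// If C is a rotation of an li-loadable value, store that immediate in *IMM
// and the left-rotate count that rebuilds C in *ROT, and return true.
static bool li_rotldi_candidate (uint64_t c, int16_t *imm, unsigned *rot)
{
  for (unsigned n = 0; n < 64; n++)
    {
      uint64_t v = rotl64 (c, n);
      if (fits_li (v))
        {
          *imm = (int16_t) (int64_t) v;
          // C == rotl64 (v, 64 - n); the mask keeps the count in 0..63.
          *rot = (64 - n) & 63;
          return true;
        }
    }
  return false;
}
```

The count stored in *ROT follows the same "64 minus n" relation as the `*shift = HOST_BITS_PER_WIDE_INT - n` assignment in the quoted patch.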

RE: [PATCH] New wi::bitreverse function.

2023-06-02 Thread Roger Sayle

Doh!  Wrong patch...

Roger
--

-Original Message-
From: Roger Sayle  
Sent: Friday, June 2, 2023 3:17 PM
To: 'gcc-patches@gcc.gnu.org' 
Cc: 'Richard Sandiford' 
Subject: [PATCH] New wi::bitreverse function.

This patch provides a wide-int implementation of bitreverse, that implements
both of Richard Sandiford's suggestions from the review at
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html of an
improved API (as a stand-alone function matching the bswap refactoring), and
an implementation that works with any bit-width precision.
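
As a stand-alone sketch of bit reversal over an arbitrary precision (not the wide-int code), the loop below works on a little-endian vector of 64-bit blocks. The function name is invented, and missing high blocks are treated as zero, which is a simplification compared with wide-int's sign-extending block access:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reverse the low PRECISION bits of X, stored as little-endian 64-bit
// blocks.  Blocks beyond the end of X are read as zero (a simplification).
static std::vector<uint64_t>
bitreverse_blocks (const std::vector<uint64_t> &x, unsigned precision)
{
  const unsigned bits = 64;
  std::vector<uint64_t> r ((precision + bits - 1) / bits, 0);
  for (unsigned s = 0; s < precision; s++)
    {
      unsigned block = s / bits;
      unsigned offset = s % bits;
      uint64_t word = block < x.size () ? x[block] : 0;
      if ((word >> offset) & 1)
        {
          // Source bit S lands at destination bit PRECISION - 1 - S.
          unsigned d = precision - 1 - s;
          // The shifted one must be 64 bits wide; a plain int 1 would
          // overflow for offsets of 31 and above.
          r[d / bits] |= uint64_t (1) << (d % bits);
        }
    }
  return r;
}
```

Because the destination index is computed from the precision rather than from the block size, the same loop handles widths such as 3, 8, or 72 bits.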

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap (and a
make check-gcc).  Ok for mainline?  Are the remaining pieces of the above
patch pre-approved (pending re-testing)?  The aim is that this new code will
be thoroughly tested by the new *-2.c test cases in
https://gcc.gnu.org/git/?p=gcc.git;h=c09471fbc7588db2480f036aa56a2403d3c03ae5
with a minor tweak to use the BITREVERSE rtx in the NVPTX back-end, followed
by similar tests on other targets that provide bit-reverse built-ins (such
as ARM and xstormy16), in advance of support for a backend-independent
solution to PR middle-end/50481.

2023-06-02  Roger Sayle  

gcc/ChangeLog
* wide-int.cc (wi::bitreverse_large): New function implementing
bit reversal of an integer.
* wide-int.h (wi::bitreverse): New (template) function prototype.
(bitreverse_large): Prototype helper function/implementation.
(wi::bitreverse): New template wrapper around bitreverse_large.

Thanks again,
Roger
--

diff --git a/gcc/wide-int.cc b/gcc/wide-int.cc
index 1e4c046..24bdce2 100644
--- a/gcc/wide-int.cc
+++ b/gcc/wide-int.cc
@@ -766,6 +766,33 @@ wi::bswap_large (HOST_WIDE_INT *val, const HOST_WIDE_INT *xval,
   return canonize (val, len, precision);
 }

+/* Bitreverse the integer represented by XVAL and LEN into VAL.  Return
+   the number of blocks in VAL.  Both XVAL and VAL have PRECISION bits.  */
+unsigned int
+wi::bitreverse_large (HOST_WIDE_INT *val, const HOST_WIDE_INT *xval,
+ unsigned int len, unsigned int precision)
+{
+  unsigned int i, s;
+
+  for (i = 0; i < len; i++)
+val[i] = 0;
+
+  for (s = 0; s < precision; s++)
+{
+  unsigned int block = s / HOST_BITS_PER_WIDE_INT;
+  unsigned int offset = s & (HOST_BITS_PER_WIDE_INT - 1);
+  if (((safe_uhwi (xval, len, block) >> offset) & 1) != 0)
+   {
+ unsigned int d = (precision - 1) - s;
+ block = d / HOST_BITS_PER_WIDE_INT;
+ offset = d & (HOST_BITS_PER_WIDE_INT - 1);
+  val[block] |= HOST_WIDE_INT_1U << offset;
+   }
+}
+
+  return canonize (val, len, precision);
+}
+
 /* Fill VAL with a mask where the lower WIDTH bits are ones and the bits
above that up to PREC are zeros.  The result is inverted if NEGATE
is true.  Return the number of blocks in VAL.  */
diff --git a/gcc/wide-int.h b/gcc/wide-int.h
index e4723ad..498d14d 100644
--- a/gcc/wide-int.h
+++ b/gcc/wide-int.h
@@ -553,6 +553,7 @@ namespace wi
   UNARY_FUNCTION zext (const T &, unsigned int);
   UNARY_FUNCTION set_bit (const T &, unsigned int);
   UNARY_FUNCTION bswap (const T &);
+  UNARY_FUNCTION bitreverse (const T &);

   BINARY_FUNCTION min (const T1 &, const T2 &, signop);
   BINARY_FUNCTION smin (const T1 &, const T2 &);
@@ -1748,6 +1749,8 @@ namespace wi
  unsigned int, unsigned int, unsigned int);
   unsigned int bswap_large (HOST_WIDE_INT *, const HOST_WIDE_INT *,
unsigned int, unsigned int);
+  unsigned int bitreverse_large (HOST_WIDE_INT *, const HOST_WIDE_INT *,
+unsigned int, unsigned int);

   unsigned int lshift_large (HOST_WIDE_INT *, const HOST_WIDE_INT *,
 unsigned int, unsigned int, unsigned int);
@@ -2281,6 +2284,18 @@ wi::bswap (const T )
   return result;
 }

+/* Bitreverse the integer X.  */
+template <typename T>
+inline WI_UNARY_RESULT (T)
+wi::bitreverse (const T &x)
+{
+  WI_UNARY_RESULT_VAR (result, val, T, x);
+  unsigned int precision = get_precision (result);
+  WIDE_INT_REF_FOR (T) xi (x, precision);
+  result.set_len (bitreverse_large (val, xi.val, xi.len, precision));
+  return result;
+}
+
 /* Return the mininum of X and Y, treating them both as having
signedness SGN.  */
 template

[PATCH] New wi::bitreverse function.

2023-06-02 Thread Roger Sayle


This patch provides a wide-int implementation of bitreverse, that
implements both of Richard Sandiford's suggestions from the review at
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html of an
improved API (as a stand-alone function matching the bswap refactoring),
and an implementation that works with any bit-width precision.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
(and a make check-gcc).  Ok for mainline?  Are the remaining pieces
of the above patch pre-approved (pending re-testing)?  The aim is that
this new code will be thoroughly tested by the new *-2.c test cases in
https://gcc.gnu.org/git/?p=gcc.git;h=c09471fbc7588db2480f036aa56a2403d3c03ae5
with a minor tweak to use the BITREVERSE rtx in the NVPTX back-end,
followed by similar tests on other targets that provide bit-reverse
built-ins (such as ARM and xstormy16), in advance of support for a
backend-independent solution to PR middle-end/50481.


2023-06-02  Roger Sayle  

gcc/ChangeLog
* wide-int.cc (wi::bitreverse_large): New function implementing
bit reversal of an integer.
* wide-int.h (wi::bitreverse): New (template) function prototype.
(bitreverse_large): Prototype helper function/implementation.
(wi::bitreverse): New template wrapper around bitreverse_large.


Thanks again,
Roger
--

diff --git a/gcc/fold-const-call.cc b/gcc/fold-const-call.cc
index 340cb66..663eae2 100644
--- a/gcc/fold-const-call.cc
+++ b/gcc/fold-const-call.cc
@@ -1060,7 +1060,8 @@ fold_const_call_ss (wide_int *result, combined_fn fn, 
const wide_int_ref ,
 case CFN_BUILT_IN_BSWAP32:
 case CFN_BUILT_IN_BSWAP64:
 case CFN_BUILT_IN_BSWAP128:
-  *result = wide_int::from (arg, precision, TYPE_SIGN (arg_type)).bswap ();
+  *result = wi::bswap (wide_int::from (arg, precision,
+  TYPE_SIGN (arg_type)));
   return true;
 
 default:
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index d4aeebc..d93d632 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2111,7 +2111,7 @@ simplify_const_unary_operation (enum rtx_code code, machine_mode mode,
  break;
 
case BSWAP:
- result = wide_int (op0).bswap ();
+ result = wi::bswap (op0);
  break;
 
case TRUNCATE:
diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 6fb371c..26d5e44 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -2401,11 +2401,12 @@ evaluate_stmt (gimple *stmt)
  wide_int wval = wi::to_wide (val.value);
  val.value
= wide_int_to_tree (type,
-   wide_int::from (wval, prec,
-   UNSIGNED).bswap ());
+   wi::bswap (wide_int::from (wval, prec,
+  UNSIGNED)));
  val.mask
-   = widest_int::from (wide_int::from (val.mask, prec,
-   UNSIGNED).bswap (),
+   = widest_int::from (wi::bswap (wide_int::from (val.mask,
+  prec,
+  UNSIGNED)),
UNSIGNED);
  if (wi::sext (val.mask, prec) != -1)
break;
diff --git a/gcc/wide-int.cc b/gcc/wide-int.cc
index c0987aa..1e4c046 100644
--- a/gcc/wide-int.cc
+++ b/gcc/wide-int.cc
@@ -731,16 +731,13 @@ wi::set_bit_large (HOST_WIDE_INT *val, const HOST_WIDE_INT *xval,
 }
 }
 
-/* bswap THIS.  */
-wide_int
-wide_int_storage::bswap () const
+/* Byte swap the integer represented by XVAL and LEN into VAL.  Return
+   the number of blocks in VAL.  Both XVAL and VAL have PRECISION bits.  */
+unsigned int
+wi::bswap_large (HOST_WIDE_INT *val, const HOST_WIDE_INT *xval,
+unsigned int len, unsigned int precision)
 {
-  wide_int result = wide_int::create (precision);
   unsigned int i, s;
-  unsigned int len = BLOCKS_NEEDED (precision);
-  unsigned int xlen = get_len ();
-  const HOST_WIDE_INT *xval = get_val ();
-  HOST_WIDE_INT *val = result.write_val ();
 
   /* This is not a well defined operation if the precision is not a
  multiple of 8.  */
@@ -758,7 +755,7 @@ wide_int_storage::bswap () const
   unsigned int block = s / HOST_BITS_PER_WIDE_INT;
   unsigned int offset = s & (HOST_BITS_PER_WIDE_INT - 1);
 
-  byte = (safe_uhwi (xval, xlen, block) >> offset) & 0xff;
+  byte = (safe_uhwi (xval, len, block) >> offset) & 0xff;
 
   block = d / HOST_BITS_PER_WIDE_INT;
   offset = d & (HOST_BITS_PER_WIDE_INT - 1);
@@ -766,8 +763,7 @@ wide_int_storage::bswap () const
   val[block] |= byte << offset;
 }
 
-  result.set_len (canonize (val, len, precision));
-  return result;
+  return canonize

Ping: [PATCH v2] libcpp: Handle extended characters in user-defined literal suffix [PR103902]

2023-06-02 Thread Lewis Hyatt via Gcc-patches

Hello-

Ping please? Thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html

-Lewis

On Tue, May 2, 2023 at 9:27 AM Lewis Hyatt  wrote:
>
> May I please ping this one? Thanks...
> https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613247.html
>
> On Thu, Mar 2, 2023 at 6:21 PM Lewis Hyatt  wrote:
> >
> > The PR complains that we do not handle UTF-8 in the suffix for a user-defined
> > literal, such as:
> >
> > bool operator ""_π (unsigned long long);
> >
> > In fact we don't handle any extended identifier characters there, whether
> > UTF-8, UCNs, or the $ sign. We do handle it fine if the optional space after
> > the "" tokens is included, since then the identifier is lexed in the "normal"
> > way as its own token. But when it is lexed as part of the string token, this
> > is handled in lex_string() with a one-off loop that is not aware of extended
> > characters.
> >
> > This patch fixes it by adding a new function scan_cur_identifier() that can be
> > used to lex an identifier while in the middle of lexing another token.
> >
> > BTW, the other place that has been mis-lexing identifiers is
> > lex_identifier_intern(), which is used to implement #pragma push_macro
> > and #pragma pop_macro. This does not support extended characters either.
> > I will add that in a subsequent patch, because it can't directly reuse the
> > new function, but rather needs to lex from a string instead of a cpp_buffer.
> >
> > With scan_cur_identifier(), we do also correctly warn about bidi and
> > normalization issues in the extended identifiers comprising the suffix.
> >
> > libcpp/ChangeLog:
> >
> > PR preprocessor/103902
> > * lex.cc (identifier_diagnostics_on_lex): New function refactoring
> > some common code.
> > (lex_identifier_intern): Use the new function.
> > (lex_identifier): Don't run identifier diagnostics here, rather let
> > the call site do it when needed.
> > (_cpp_lex_direct): Adjust the call sites of lex_identifier ()
> > accordingly.
> > (struct scan_id_result): New struct.
> > (scan_cur_identifier): New function.
> > (create_literal2): New function.
> > (lit_accum::create_literal2): New function.
> > (is_macro): Folded into new function...
> > (maybe_ignore_udl_macro_suffix): ...here.
> > (is_macro_not_literal_suffix): Folded likewise.
> > (lex_raw_string): Handle UTF-8 in UDL suffix via 
> > scan_cur_identifier ().
> > (lex_string): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR preprocessor/103902
> > * g++.dg/cpp0x/udlit-extended-id-1.C: New test.
> > * g++.dg/cpp0x/udlit-extended-id-2.C: New test.
> > * g++.dg/cpp0x/udlit-extended-id-3.C: New test.
> > * g++.dg/cpp0x/udlit-extended-id-4.C: New test.
> > ---
> >
> > Notes:
> > Hello-
> >
> > This is the updated version of the patch, incorporating feedback from 
> > Jakub
> > and Jason, most recently discussed here:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612073.html
> >
> > Please let me know how it looks? It is simpler than before with the new
> > approach. Thanks!
> >
> > One thing to note. As Jason clarified for me, a usage like this:
> >
> >  #pragma GCC poison _x
> > const char * operator "" _x (const char *, unsigned long);
> >
> > The space between the "" and the _x is currently allowed but will be
> > deprecated in C++23. GCC currently will complain about the poisoned use 
> > of
> > _x in this case, and this patch, which is just focused on handling UTF-8
> > properly, does not change this. But it seems that it would be correct
> > not to apply poison in this case. I can try to follow up with a patch 
> > to do
> > so, if it seems worthwhile? Given the syntax is deprecated, maybe it's 
> > not
> > worth it...
> >
> > For the time being, this patch does add a testcase for the above and 
> > xfails
> > it. For the case where no space is present, which is the part touched 
> > by the
> > present patch, existing behavior is preserved correctly and no 
> > diagnostics
> > such as poison are issued for the UDL suffix. (Contrary to v1 of this
> > patch.)
> >
> > Thanks! bootstrap + regtested all languages on x86-64 Linux with
> > no regressions.
> >
> > -Lewis
> >
> >  .../g++.dg/cpp0x/udlit-extended-id-1.C|  68 
> >  .../g++.dg/cpp0x/udlit-extended-id-2.C|   6 +
> >  .../g++.dg/cpp0x/udlit-extended-id-3.C|  15 +
> >  .../g++.dg/cpp0x/udlit-extended-id-4.C|  14 +
> >  libcpp/lex.cc | 382 ++
> >  5 files changed, 317 insertions(+), 168 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
> >  create mode 100644 gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-2.C
> >  create mode 100644

Re: [Patch, fortran] PR37336 finalization

2023-06-02 Thread Paul Richard Thomas via Gcc-patches

Hi All,

I propose to backport
r13-6747-gd7caf313525a46f200d7f5db1ba893f853774aee to 12-branch very
soon. Before that, I propose to remove the F2003/2008 finalization of
structure and array constructors in 13- and 14-branches. I can see why
it was removed from the standard in a correction to F2008 and think
that it is likely to cause endless confusion and maintenance
complications. However, finalization of function results within
constructors will be retained.

If there are any objections, please let me know.

Paul

[pushed] analyzer: implement various atomic builtins [PR109015]

2023-06-02 Thread David Malcolm via Gcc-patches

This patch implements many of the __atomic_* builtins from
sync-builtins.def as known_function subclasses within the analyzer.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-1497-gef768035ae8090.
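
For readers less familiar with these builtins, the semantics that the new
known_function subclasses model can be illustrated with a small standalone
fragment. This is only a sketch of the documented behaviour of the GCC
__atomic builtins, not analyzer code; the toy_* wrapper names are made up
for illustration and the fragment assumes a GCC-compatible compiler.

```cpp
#include <cassert>

// Illustrative only: the toy_* names are hypothetical, not part of GCC.

// __atomic_exchange_n is effectively: RET = *PTR; *PTR = VAL; return RET;
int toy_exchange (int *ptr, int val)
{
  return __atomic_exchange_n (ptr, val, __ATOMIC_SEQ_CST);
}

// __atomic_fetch_add is effectively:
//   RET = *PTR; *PTR = RET + VAL; return RET;   (returns the OLD value)
int toy_fetch_add (int *ptr, int val)
{
  return __atomic_fetch_add (ptr, val, __ATOMIC_SEQ_CST);
}

// __atomic_add_fetch is effectively:
//   *PTR = *PTR + VAL; return *PTR;             (returns the NEW value)
int toy_add_fetch (int *ptr, int val)
{
  return __atomic_add_fetch (ptr, val, __ATOMIC_SEQ_CST);
}
```

The only difference between the fetch_op and op_fetch families is which
value is returned, which is why the patch can handle each family with one
class parameterized on the tree code.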

gcc/analyzer/ChangeLog:
PR analyzer/109015
* kf.cc (class kf_atomic_exchange): New.
(class kf_atomic_exchange_n): New.
(class kf_atomic_fetch_op): New.
(class kf_atomic_op_fetch): New.
(class kf_atomic_load): New.
(class kf_atomic_load_n): New.
(class kf_atomic_store_n): New.
(register_atomic_builtins): New function.
(register_known_functions): Call register_atomic_builtins.

gcc/testsuite/ChangeLog:
PR analyzer/109015
* gcc.dg/analyzer/atomic-builtins-1.c: New test.
* gcc.dg/analyzer/atomic-builtins-haproxy-proxy.c: New test.
* gcc.dg/analyzer/atomic-builtins-qemu-sockets.c: New test.
* gcc.dg/analyzer/atomic-types-1.c: New test.
---
 gcc/analyzer/kf.cc| 355 
 .../gcc.dg/analyzer/atomic-builtins-1.c   | 544 ++
 .../analyzer/atomic-builtins-haproxy-proxy.c  |  55 ++
 .../analyzer/atomic-builtins-qemu-sockets.c   |  18 +
 .../gcc.dg/analyzer/atomic-types-1.c  |  11 +
 5 files changed, 983 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/atomic-builtins-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/atomic-builtins-haproxy-proxy.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/atomic-builtins-qemu-sockets.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/atomic-types-1.c

diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
index 93c46630f36..104499e 100644
--- a/gcc/analyzer/kf.cc
+++ b/gcc/analyzer/kf.cc
@@ -69,6 +69,235 @@ kf_alloca::impl_call_pre (const call_details &cd) const
   cd.maybe_set_lhs (ptr_sval);
 }
 
+/* Handler for:
+   void __atomic_exchange (type *ptr, type *val, type *ret, int memorder).  */
+
+class kf_atomic_exchange : public internal_known_function
+{
+public:
+  /* This is effectively:
+   *RET = *PTR;
+   *PTR = *VAL;
+  */
+  void impl_call_pre (const call_details &cd) const final override
+  {
+const svalue *ptr_ptr_sval = cd.get_arg_svalue (0);
+tree ptr_ptr_tree = cd.get_arg_tree (0);
+const svalue *val_ptr_sval = cd.get_arg_svalue (1);
+tree val_ptr_tree = cd.get_arg_tree (1);
+const svalue *ret_ptr_sval = cd.get_arg_svalue (2);
+tree ret_ptr_tree = cd.get_arg_tree (2);
+/* Ignore the memorder param.  */
+
+region_model *model = cd.get_model ();
+region_model_context *ctxt = cd.get_ctxt ();
+
+const region *val_region
+  = model->deref_rvalue (val_ptr_sval, val_ptr_tree, ctxt);
+const svalue *star_val_sval = model->get_store_value (val_region, ctxt);
+const region *ptr_region
+  = model->deref_rvalue (ptr_ptr_sval, ptr_ptr_tree, ctxt);
+const svalue *star_ptr_sval = model->get_store_value (ptr_region, ctxt);
+const region *ret_region
+  = model->deref_rvalue (ret_ptr_sval, ret_ptr_tree, ctxt);
+model->set_value (ptr_region, star_val_sval, ctxt);
+model->set_value (ret_region, star_ptr_sval, ctxt);
+  }
+};
+
+/* Handler for:
+   __atomic_exchange_n (type *ptr, type val, int memorder).  */
+
+class kf_atomic_exchange_n : public internal_known_function
+{
+public:
+  /* This is effectively:
+   RET = *PTR;
+   *PTR = VAL;
+   return RET;
+  */
+  void impl_call_pre (const call_details &cd) const final override
+  {
+const svalue *ptr_sval = cd.get_arg_svalue (0);
+tree ptr_tree = cd.get_arg_tree (0);
+const svalue *set_sval = cd.get_arg_svalue (1);
+/* Ignore the memorder param.  */
+
+region_model *model = cd.get_model ();
+region_model_context *ctxt = cd.get_ctxt ();
+
+const region *dst_region = model->deref_rvalue (ptr_sval, ptr_tree, ctxt);
+const svalue *ret_sval = model->get_store_value (dst_region, ctxt);
+model->set_value (dst_region, set_sval, ctxt);
+cd.maybe_set_lhs (ret_sval);
+  }
+};
+
+/* Handler for:
+   type __atomic_fetch_add (type *ptr, type val, int memorder);
+   type __atomic_fetch_sub (type *ptr, type val, int memorder);
+   type __atomic_fetch_and (type *ptr, type val, int memorder);
+   type __atomic_fetch_xor (type *ptr, type val, int memorder);
+   type __atomic_fetch_or (type *ptr, type val, int memorder);
+*/
+
+class kf_atomic_fetch_op : public internal_known_function
+{
+public:
+  kf_atomic_fetch_op (enum tree_code op): m_op (op) {}
+
+  /* This is effectively:
+   RET = *PTR;
+   *PTR = RET OP VAL;
+   return RET;
+  */
+  void impl_call_pre (const call_details &cd) const final override
+  {
+const svalue *ptr_sval = cd.get_arg_svalue (0);
+tree ptr_tree = cd.get_arg_tree (0);
+const svalue *val_sval = cd.get_arg_svalue (1);
+/* Ignore the memorder param.  */
+
+region_model *model = cd.get_model ();
+region_model_manager *mgr =

[pushed] analyzer: regions in different memory spaces can't alias

2023-06-02 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-1496-gb8a916726e7f4b.

gcc/analyzer/ChangeLog:
* store.cc (store::eval_alias_1): Regions in different memory
spaces can't alias.
---
 gcc/analyzer/store.cc | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index e8c927b9fe9..4d1de825e79 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -2710,6 +2710,18 @@ tristate
 store::eval_alias_1 (const region *base_reg_a,
 const region *base_reg_b) const
 {
+  /* If they're in different memory spaces, they can't alias.  */
+  {
+enum memory_space memspace_a = base_reg_a->get_memory_space ();
+if (memspace_a != MEMSPACE_UNKNOWN)
+  {
+   enum memory_space memspace_b = base_reg_b->get_memory_space ();
+   if (memspace_b != MEMSPACE_UNKNOWN
+   && memspace_a != memspace_b)
+ return tristate::TS_FALSE;
+  }
+  }
+
   if (const symbolic_region *sym_reg_a
   = base_reg_a->dyn_cast_symbolic_region ())
 {
-- 
2.26.3
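
The early-out added above can be read in isolation as a three-valued
check. The following is a minimal sketch under the assumption of toy
enums standing in for the analyzer's memory_space and tristate types;
none of the toy_* names exist in GCC, and the real function goes on to
perform further checks where this sketch simply answers "unknown".

```cpp
#include <cassert>

// Hypothetical stand-ins for the analyzer's enum memory_space and tristate.
enum toy_memspace
{
  TOY_MEMSPACE_UNKNOWN,
  TOY_MEMSPACE_STACK,
  TOY_MEMSPACE_HEAP,
  TOY_MEMSPACE_GLOBALS
};

enum toy_tristate { TOY_TS_UNKNOWN, TOY_TS_TRUE, TOY_TS_FALSE };

// Two base regions whose memory spaces are both known and differ cannot
// alias.  In every other case this sketch does not decide.
toy_tristate toy_eval_alias (toy_memspace a, toy_memspace b)
{
  if (a != TOY_MEMSPACE_UNKNOWN
      && b != TOY_MEMSPACE_UNKNOWN
      && a != b)
    return TOY_TS_FALSE;
  return TOY_TS_UNKNOWN;
}
```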

[PATCH] Add more ForEachMacros to clang-format file

2023-06-02 Thread Lehua Ding

Hi,

This patch adds some missed ForEachMacros to the contrib/clang-format file,
which allows the clang-format tool to format gcc code correctly.
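
As background, clang-format treats every name listed in ForEachMacros as
a loop keyword rather than as a function call, so the statement that
follows is formatted as a loop body. A minimal sketch of the kind of code
this affects is below; TOY_EXECUTE_IF_SET_IN_MASK is a made-up macro over
a plain integer, since GCC's real EXECUTE_IF_SET_IN_BITMAP takes
different arguments and operates on bitmap objects.

```cpp
#include <cassert>

// Hypothetical toy macro, for illustration only: visit each set bit of a
// 32-bit mask, binding its index to BITNUM.
#define TOY_EXECUTE_IF_SET_IN_MASK(MASK, BITNUM)            \
  for (unsigned BITNUM = 0; BITNUM < 32; ++BITNUM)          \
    if ((((MASK) >> BITNUM) & 1u) != 0)

// Sum the indices of all set bits in MASK.
unsigned toy_sum_set_bit_indices (unsigned mask)
{
  unsigned sum = 0;
  // With the macro listed in ForEachMacros, the braces below are laid out
  // like the body of a 'for' statement.
  TOY_EXECUTE_IF_SET_IN_MASK (mask, bit)
    {
      sum += bit;
    }
  return sum;
}
```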

Best,
Lehua

---
 contrib/clang-format | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/contrib/clang-format b/contrib/clang-format
index 5d264aee3c6..8cfee99cd15 100644
--- a/contrib/clang-format
+++ b/contrib/clang-format
@@ -194,7 +194,15 @@ ForEachMacros: [
 'FOR_EACH_WIDER_MODE',
 'FOR_EXPR',
 'FOR_INIT_STMT',
-'FOR_SCOPE'
+'FOR_SCOPE',
+'EXECUTE_IF_SET_IN_BITMAP',
+'EXECUTE_IF_AND_IN_BITMAP',
+'EXECUTE_IF_AND_COMPL_IN_BITMAP',
+'EXECUTE_IF_SET_IN_REG_SET',
+'EXECUTE_IF_SET_IN_HARD_REG_SET',
+'EXECUTE_IF_AND_COMPL_IN_REG_SET',
+'EXECUTE_IF_AND_IN_REG_SET',
+'EXECUTE_IF_SET_IN_SPARSESET'
 ]
 IndentCaseLabels: false
 NamespaceIndentation: None
-- 
2.36.1

Re: [PATCH] Move std::search into algobase.h

2023-06-02 Thread Jonathan Wakely via Gcc-patches

On Fri, 2 Jun 2023 at 12:30, Jonathan Wakely  wrote:

>
>
> On Fri, 2 Jun 2023 at 10:47, François Dumont  wrote:
>
>> Ok, push done.
>>
>
> Thanks.
>
>
>> Even after full rebuild those tests are still UNRESOLVED on my system.
>>
> What is the error in the log?
>
> What is your system? How and where did you install "OMP"?
>
> Does the libgomp directory exist in the GCC build tree, at the same level
> as libstdc++-v3?
>
> e.g. in $objdir/x86_64-pc-linux-gnu/libgomp or equivalent?
>
> That directory should contain omp.h and .libs/libgomp.* which will be used
> by the libstdc++ testsuite for the check-parallel target (see the
> libgomp_flags variable which sets the paths to find libgomp in the build
> tree).
>
> But because that test only runs for normal mode (not parallel mode) it
> doesn't use libgomp_flags, and so it will only find omp.h if it already
> exists in the compiler's default include paths, which will happen if you've
> already run "make install" on the GCC built with libgomp enabled.
>
> If you haven't enabled libgomp, or you haven't installed the new GCC yet,
> then the __has_include() should fail, and so the test does nothing
> and so should just PASS. If it's UNRESOLVED for you then that implies it's
> finding an  header, but probably not the one from GCC, so it fails
> to compile. I think that's due to how you've installed "OMP" (whatever that
> means ... I don't think you've installed libgomp and so I don't think you
> should have done that ... maybe you installed Clang's libomp headers
> instead and GCC is finding those somehow?)
>

Since we already have dg-require-parallel-mode that is used for most
parallel mode tests, I don't think it's worth adding a new "openmp"
effective-target just for these three tests. But it would be helpful if I
added this comment to them instead:

--- a/libstdc++-v3/testsuite/17_intro/headers/c++2014/parallel_mode.cc
+++ b/libstdc++-v3/testsuite/17_intro/headers/c++2014/parallel_mode.cc
@@ -19,6 +19,12 @@
// { dg-do compile { target c++14 } }
// { dg-require-normal-mode "" }

+// In order to improve coverage this test is run by the normal 'make check'
+// target, not only the infrequently-tested check-parallel target. That means
+// the makefile variable $(libgomp_flags) is not used, so the libgomp files
+// in the build tree will not be found. The parallel mode headers will only
+// be able to include <omp.h> if libgomp has already been installed to the
+// $prefix of the GCC being tested, so use __has_include to fail gracefully.
#if __has_include(<omp.h>)
# define _GLIBCXX_PARALLEL 1
# include 

Or we could just remove those tests and ensure that somebody runs 'make
check-parallel' at least once every six months.

N.B. running 'make check-parallel' would have found the problem with the
missing #include in , even without an installed
libgomp.
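
For reference, the "fail gracefully" idiom discussed in this thread looks
roughly like the following when written out on its own. It is a sketch,
not the contents of the libstdc++ test: TOY_HAVE_OMP_H and
toy_omp_header_found are made-up names, and the fragment only records
whether <omp.h> is visible on the default include path instead of
including anything.

```cpp
#include <cassert>

// Only use __has_include if the preprocessor provides it.
#if defined(__has_include)
# if __has_include(<omp.h>)
#  define TOY_HAVE_OMP_H 1
# endif
#endif
#ifndef TOY_HAVE_OMP_H
# define TOY_HAVE_OMP_H 0
#endif

// True if <omp.h> was found on the compiler's include paths.  A test
// written this way compiles to (almost) nothing when the header is absent.
bool toy_omp_header_found ()
{
  return TOY_HAVE_OMP_H != 0;
}
```

Note that this only says a header with that name exists; as discussed
above, it cannot tell whether the header found is the one from libgomp.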

Re: [PATCH] libstdc++: Correct NTTP and simd_mask ctor call

2023-06-02 Thread Jonathan Wakely via Gcc-patches

On Fri, 2 Jun 2023 at 10:30, Alexandre Oliva via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

>
> ISTM that rtems is missing some of the math.h functions expected by
> libstdc++, but also that even those that are present are not visible in
> namespace ::std::, where the macros reasonably expect to find them.  Is
> this known?  Should I file a PR about it?
>

That looks like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109818

We only import the C99 <math.h> functions into namespace std when the
target libc supports all of them.

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-06-02 Thread Richard Sandiford via Gcc-patches

Just some very minor things.

"Andre Vieira (lists)"  writes:
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 
> 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb
>  100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -90,6 +90,71 @@ lookup_internal_fn (const char *name)
>return entry ? *entry : IFN_LAST;
>  }
>  
> +/*  Given an internal_fn IFN that is either a widening or narrowing 
> function, return its
> +corresponding LO and HI internal_fns.  */

Long line and too much space after "/*":

/* Given an internal_fn IFN that is either a widening or narrowing function,
   return its corresponding _LO and _HI internal_fns in *LO and *HI.  */

> +extern void
> +lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
> +{
> +  gcc_assert (widening_fn_p (ifn) || narrowing_fn_p (ifn));
> +
> +  switch (ifn)
> +{
> +default:
> +  gcc_unreachable ();
> +#undef DEF_INTERNAL_FN
> +#undef DEF_INTERNAL_WIDENING_OPTAB_FN
> +#undef DEF_INTERNAL_NARROWING_OPTAB_FN
> +#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
> +#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T)\
> +case IFN_##NAME: \
> +  *lo = internal_fn (IFN_##NAME##_LO);   \
> +  *hi = internal_fn (IFN_##NAME##_HI);   \
> +  break;
> +#define DEF_INTERNAL_NARROWING_OPTAB_FN(NAME, F, O, T)   \
> +case IFN_##NAME: \
> +  *lo = internal_fn (IFN_##NAME##_LO);   \
> +  *hi = internal_fn (IFN_##NAME##_HI);   \
> +  break;
> +#include "internal-fn.def"
> +#undef DEF_INTERNAL_FN
> +#undef DEF_INTERNAL_WIDENING_OPTAB_FN
> +#undef DEF_INTERNAL_NARROWING_OPTAB_FN
> +}
> +}
> +
> +extern void
> +lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
> + internal_fn *odd)

This needs a similar comment:

/* Given an internal_fn IFN that is either a widening or narrowing function,
   return its corresponding _EVEN and _ODD internal_fns in *EVEN and *ODD.  */

> @@ -3971,6 +4036,9 @@ commutative_binary_fn_p (internal_fn fn)
>  case IFN_UBSAN_CHECK_MUL:
>  case IFN_ADD_OVERFLOW:
>  case IFN_MUL_OVERFLOW:
> +case IFN_VEC_WIDEN_PLUS:
> +case IFN_VEC_WIDEN_PLUS_LO:
> +case IFN_VEC_WIDEN_PLUS_HI:

Should include even & odd as well.

I'd suggest leaving out the narrowing stuff for now.  There are some
questions that would be easier to answer once we add the first use,
such as whether one of the hi/lo pair and one of the even/odd pair
merge with a vector containing the other half, whether all four
define the other half to be zero, etc.

OK for the optab/internal-fn parts with those changes from my POV.

Thanks again for doing this!

Richard
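
For anyone unfamiliar with the pattern used in lookup_hilo_internal_fn,
redefining the DEF_INTERNAL_* macros and then including internal-fn.def
is an X-macro: a single list expanded in different ways. A minimal
self-contained sketch follows, assuming a made-up list TOY_WIDENING_FNS
in place of internal-fn.def; the toy_* and TOY_* names are hypothetical
and the real .def file carries more arguments per entry.

```cpp
#include <cassert>

// Hypothetical stand-in for the widening entries of internal-fn.def.
#define TOY_WIDENING_FNS(X) \
  X (WIDEN_PLUS)            \
  X (WIDEN_MINUS)

// First expansion: declare NAME, NAME_LO and NAME_HI for every entry.
enum toy_fn
{
#define X(NAME) TOY_##NAME, TOY_##NAME##_LO, TOY_##NAME##_HI,
  TOY_WIDENING_FNS (X)
#undef X
  TOY_LAST
};

// Second expansion: one 'case' per entry, mapping NAME to its _LO/_HI pair.
// Returns false for anything that is not a widening function.
bool toy_lookup_hilo (toy_fn fn, toy_fn *lo, toy_fn *hi)
{
  switch (fn)
    {
#define X(NAME)                 \
    case TOY_##NAME:            \
      *lo = TOY_##NAME##_LO;    \
      *hi = TOY_##NAME##_HI;    \
      return true;
    TOY_WIDENING_FNS (X)
#undef X
    default:
      return false;
    }
}
```

Adding an entry to the list then updates both the enum and the lookup
without touching either by hand, which is the property the patch relies
on.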

RE: [PATCH V3] VECT: Change flow of decrement IV

2023-06-02 Thread Li, Pan2 via Gcc-patches

Committed, thanks all.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Richard Sandiford via Gcc-patches
Sent: Friday, June 2, 2023 7:44 PM
To: juzhe.zh...@rivai.ai
Cc: rguenther ; gcc-patches ; linkw 

Subject: Re: [PATCH V3] VECT: Change flow of decrement IV

"juzhe.zh...@rivai.ai"  writes:
> Thanks Richi. I am gonna merge it after Richard's final approve.

Thanks for checking, but no need to wait for a second ack from me!
Please go ahead and commit.

Richard

Re: [PATCH V3] VECT: Change flow of decrement IV

2023-06-02 Thread Richard Sandiford via Gcc-patches

"juzhe.zh...@rivai.ai"  writes:
> Thanks Richi. I am gonna merge it after Richard's final approve.

Thanks for checking, but no need to wait for a second ack from me!
Please go ahead and commit.

Richard

[avr, committed] Improve operations on non-LD_REGS when the operation follows a move from LD_REGS.

2023-06-02 Thread Georg-Johann Lay


Applied the following patch to improve operations on non-LD_REGS
when the operation follows a move from LD_REGS.

Johann

target/110088: Improve operation of l-reg with const after move from d-reg.

After reload, there may be sequences like
   lreg = dreg
   lreg = lreg <op> const
with an LD_REGS dreg, non-LD_REGS lreg, and <op> in PLUS, IOR, AND.
If dreg dies after the first insn, it is possible to use
   dreg = dreg <op> const
   lreg = dreg
instead, which is more efficient.
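
To see why the rewrite is safe, note that both orderings leave the same
value in lreg; the only difference is the final content of dreg, which
does not matter when dreg is dead afterwards. A small sketch of that
equivalence is below, modelling the registers as 16-bit values. The
toy_* names are made up and this says nothing about the actual AVR
instruction sequences or their cost.

```cpp
#include <cassert>
#include <cstdint>

// The three operations covered by the peephole.
enum toy_op { TOY_PLUS, TOY_IOR, TOY_AND };

static std::uint16_t toy_apply (toy_op op, std::uint16_t a, std::uint16_t c)
{
  switch (op)
    {
    case TOY_PLUS: return static_cast<std::uint16_t> (a + c);
    case TOY_IOR:  return static_cast<std::uint16_t> (a | c);
    default:       return static_cast<std::uint16_t> (a & c);
    }
}

// Original order: lreg = dreg; lreg = lreg op const.
std::uint16_t toy_before (toy_op op, std::uint16_t dreg, std::uint16_t c)
{
  std::uint16_t lreg = dreg;
  lreg = toy_apply (op, lreg, c);
  return lreg;
}

// Rewritten order: dreg = dreg op const; lreg = dreg.
// dreg is clobbered, which is only acceptable because it is dead.
std::uint16_t toy_after (toy_op op, std::uint16_t dreg, std::uint16_t c)
{
  dreg = toy_apply (op, dreg, c);
  std::uint16_t lreg = dreg;
  return lreg;
}
```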

gcc/
PR target/110088
* config/avr/avr.md: Add an RTL peephole to optimize operations on
non-LD_REGS after a move from LD_REGS.
(piaop): New code iterator.

diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index 371965938a6..9f5fabc861f 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -279,6 +279,7 @@ (define_code_iterator any_extend2 [sign_extend zero_extend])

 (define_code_iterator any_extract [sign_extract zero_extract])
 (define_code_iterator any_shiftrt [lshiftrt ashiftrt])

+(define_code_iterator piaop [plus ior and])
 (define_code_iterator bitop [xor ior and])
 (define_code_iterator xior [xor ior])
 (define_code_iterator eqne [eq ne])
@@ -4727,6 +4729,43 @@ (define_split
 DONE;
   })

+;; If  $0 = $0 <op> const  requires a QI scratch, and d-reg $1 dies after
+;; the first insn, then we can replace
+;;    $0 = $1
+;;    $0 = $0 <op> const
+;; by
+;;    $1 = $1 <op> const
+;;    $0 = $1
+;; This transforms constraint alternative "r,0,n," of the first operation
+;; to alternative "d,0,n,X".
+;; "*addhi3_clobber"  "*addpsi3"  "*addsi3"
+;; "*addhq3"  "*adduhq3"  "*addha3"  "*adduha3"
+;; "*addsq3"  "*addusq3"  "*addsa3"  "*addusa3"
+;; "*iorhi3"  "*iorpsi3"  "*iorsi3"
+;; "*andhi3"  "*andpsi3"  "*andsi3"
+(define_peephole2
+  [(parallel [(set (match_operand:ORDERED234 0 "register_operand")
+   (match_operand:ORDERED234 1 "d_register_operand"))
+  (clobber (reg:CC REG_CC))])
+   (parallel [(set (match_dup 0)
+   (piaop:ORDERED234 (match_dup 0)
+ (match_operand:ORDERED234 2 "const_operand")))
+  ; A d-reg as scratch tells that this insn is expensive, and
+  ; that $0 is not a d-register: l-reg or something like SI:14 etc.
+  (clobber (match_operand:QI 3 "d_register_operand"))
+  (clobber (reg:CC REG_CC))])]
+  "peep2_reg_dead_p (1, operands[1])"
+  [(parallel [(set (match_dup 1)
+   (piaop:ORDERED234 (match_dup 1)
+ (match_dup 2)))
+  (clobber (scratch:QI))
+  (clobber (reg:CC REG_CC))])
+   ; Unfortunately, the following insn misses a REG_DEAD note for $1,
+   ; so this peep2 works only once.
+   (parallel [(set (match_dup 0)
+   (match_dup 1))
+  (clobber (reg:CC REG_CC))])])
+

 ;; swap swap swap swap swap swap swap swap swap swap swap swap swap 
swap swap

 ;; swap

Re: [PATCH] Move std::search into algobase.h

2023-06-02 Thread Jonathan Wakely via Gcc-patches

On Fri, 2 Jun 2023 at 10:47, François Dumont  wrote:

> Ok, push done.
>

Thanks.

> Even after full rebuild those tests are still UNRESOLVED on my system.
>
What is the error in the log?

What is your system? How and where did you install "OMP"?

Does the libgomp directory exist in the GCC build tree, at the same level
as libstdc++-v3?

e.g. in $objdir/x86_64-pc-linux-gnu/libgomp or equivalent?

That directory should contain omp.h and .libs/libgomp.* which will be used
by the libstdc++ testsuite for the check-parallel target (see the
libgomp_flags variable which sets the paths to find libgomp in the build
tree).

But because that test only runs for normal mode (not parallel mode) it
doesn't use libgomp_flags, and so it will only find omp.h if it already
exists in the compiler's default include paths, which will happen if you've
already run "make install" on the GCC built with libgomp enabled.

If you haven't enabled libgomp, or you haven't installed the new GCC yet,
then the __has_include(<omp.h>) should fail, and so the test does nothing
and so should just PASS. If it's UNRESOLVED for you then that implies it's
finding an <omp.h> header, but probably not the one from GCC, so it fails
to compile. I think that's due to how you've installed "OMP" (whatever that
means ... I don't think you've installed libgomp and so I don't think you
should have done that ... maybe you installed Clang's libomp headers
instead and GCC is finding those somehow?)

[PATCH 2/2] [V3] [RISC-V] support cm.push cm.pop cm.popret in zcmp

2023-06-02 Thread Fei Gao

Zcmp can share the same logic as save-restore in stack allocation:
pre-allocation by cm.push, step 1 and step 2.

Please note that cm.push pushes ra, s0-s11 in the reverse order of what
save-restore does, so the .cfi directives are adapted accordingly in this
patch.

Signed-off-by: Fei Gao 

gcc/ChangeLog:

* config/riscv/iterators.md (-8): slot offset in bytes
(-16): likewise
(-24): likewise
(-32): likewise
(-40): likewise
(-48): likewise
(-56): likewise
(-64): likewise
(-72): likewise
(-80): likewise
(-88): likewise
(-96): likewise
(-104): likewise
* config/riscv/predicates.md
(stack_push_up_to_ra_operand): predicates for stack adjust of pushing ra
(stack_push_up_to_s0_operand): predicates for stack adjust of pushing ra, s0
(stack_push_up_to_s1_operand): likewise
(stack_push_up_to_s2_operand): likewise
(stack_push_up_to_s3_operand): likewise
(stack_push_up_to_s4_operand): likewise
(stack_push_up_to_s5_operand): likewise
(stack_push_up_to_s6_operand): likewise
(stack_push_up_to_s7_operand): likewise
(stack_push_up_to_s8_operand): likewise
(stack_push_up_to_s9_operand): likewise
(stack_push_up_to_s11_operand): likewise
(stack_pop_up_to_ra_operand): predicates for stack adjust of popping ra
(stack_pop_up_to_s0_operand): predicates for stack adjust of popping ra, s0
(stack_pop_up_to_s1_operand): likewise
(stack_pop_up_to_s2_operand): likewise
(stack_pop_up_to_s3_operand): likewise
(stack_pop_up_to_s4_operand): likewise
(stack_pop_up_to_s5_operand): likewise
(stack_pop_up_to_s6_operand): likewise
(stack_pop_up_to_s7_operand): likewise
(stack_pop_up_to_s8_operand): likewise
(stack_pop_up_to_s9_operand): likewise
(stack_pop_up_to_s11_operand): likewise
* config/riscv/riscv-protos.h (riscv_zcmp_valid_stack_adj_bytes_p): declaration
* config/riscv/riscv.cc (struct riscv_frame_info): comment change
(riscv_avoid_multi_push): helper function of riscv_use_multi_push
(riscv_use_multi_push): true if multi push is used
(riscv_multi_push_sregs_count): num of sregs in multi-push
(riscv_multi_push_regs_count): num of regs in multi-push
(riscv_16bytes_align): align to 16 bytes
(riscv_stack_align): moved to a better place
(riscv_save_libcall_count): no functional change
(riscv_compute_frame_info): add zcmp frame info
(riscv_adjust_multi_push_cfi_prologue): adjust cfi for cm.push
(riscv_gen_multi_push_pop_insn): gen function for multi push and pop
(riscv_expand_prologue): allocate stack by cm.push
(riscv_adjust_multi_pop_cfi_epilogue): adjust cfi for cm.pop[ret]
(riscv_expand_epilogue): allocate stack by cm.pop[ret]
(zcmp_base_adj): calculate stack adjustment base size
(zcmp_additional_adj): calculate stack adjustment additional size
(riscv_zcmp_valid_stack_adj_bytes_p): check if stack adjustment size is valid
* config/riscv/riscv.h (RETURN_ADDR_MASK): mask of ra
(S0_MASK): likewise
(S1_MASK): likewise
(S2_MASK): likewise
(S3_MASK): likewise
(S4_MASK): likewise
(S5_MASK): likewise
(S6_MASK): likewise
(S7_MASK): likewise
(S8_MASK): likewise
(S9_MASK): likewise
(S10_MASK): likewise
(S11_MASK): likewise
(MULTI_PUSH_GPR_MASK): GPR_MASK that cm.push can cover at most
(ZCMP_MAX_SPIMM): max spimm value
(ZCMP_SP_INC_STEP): zcmp sp increment step
(ZCMP_INVALID_S0S10_SREGS_COUNTS): num of s0-s10
(ZCMP_S0S11_SREGS_COUNTS): num of s0-s11
(ZCMP_MAX_GRP_SLOTS): max slots of pushing and popping in zcmp
* config/riscv/riscv.md: include zc.md
* config/riscv/zc.md: New file. machine description for zcmp

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rv32e_zcmp.c: New test.
* gcc.target/riscv/rv32i_zcmp.c: New test.
* gcc.target/riscv/zcmp_stack_alignment.c: New test.
---
 gcc/config/riscv/iterators.md |   15 +
 gcc/config/riscv/predicates.md|   96 ++
 gcc/config/riscv/riscv-protos.h   |1 +
 gcc/config/riscv/riscv.cc |  360 +-
 gcc/config/riscv/riscv.h  |   23 +
 gcc/config/riscv/riscv.md |2 +
 gcc/config/riscv/zc.md| 1042 +
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c   |  239 
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c   |  239 
 .../gcc.target/riscv/zcmp_stack_alignment.c   |   23 +
 10 files changed, 2000 insertions(+), 40 deletions(-)
 create mode 100644 gcc/config/riscv/zc.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
 create mode

[PATCH 1/2] [RISC-V] fix cfi issue in save-restore.

2023-06-02 Thread Fei Gao

This patch fixes a cfi issue introduced by
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=60524be1e3929d83e15fceac6e2aa053c8a6fb20

Test code:
char my_getchar();
float getf();
int test_f0()
{
  int s0 = my_getchar();
  float f0 = getf();
  int b = my_getchar();
  return f0+s0+b;
}

cflags: -g -Os -march=rv32imafc -mabi=ilp32f -msave-restore -mcmodel=medlow

before patch:
test_f0:
...
        .cfi_startproc
        call    t0,__riscv_save_1
        .cfi_offset 8, -8
        .cfi_offset 1, -4
        .cfi_def_cfa_offset 16
...
        addi    sp,sp,-16
        .cfi_def_cfa_offset 32

...

        addi    sp,sp,16
        .cfi_def_cfa_offset 0  // issue here
...
        tail    __riscv_restore_1
        .cfi_restore 8
        .cfi_restore 1
        .cfi_def_cfa_offset -16 // issue here
        .cfi_endproc

after patch:
test_f0:
...
        .cfi_startproc
        call    t0,__riscv_save_1
        .cfi_offset 8, -8
        .cfi_offset 1, -4
        .cfi_def_cfa_offset 16
...
        addi    sp,sp,-16
        .cfi_def_cfa_offset 32

...

        addi    sp,sp,16
        .cfi_def_cfa_offset 16  // corrected here
...
        tail    __riscv_restore_1
        .cfi_restore 8
        .cfi_restore 1
        .cfi_def_cfa_offset 0 // corrected here
        .cfi_endproc
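
The corrected offsets follow from simple bookkeeping: the CFA stays fixed
for the whole function, so each change to sp changes the offset between
sp and the CFA by the opposite amount. A small sketch of that arithmetic
for the listing above is given here; toy_cfa_tracker is a made-up helper,
and the 16-byte frame for __riscv_save_1 is taken from the
.cfi_def_cfa_offset 16 that follows the call.

```cpp
#include <cassert>

// Hypothetical helper: tracks the offset in "CFA = sp + cfa_offset".
struct toy_cfa_tracker
{
  int cfa_offset = 0;

  // Model "sp += delta".  The CFA itself does not move, so the offset
  // from sp to the CFA changes by -delta.
  void adjust_sp (int delta) { cfa_offset -= delta; }
};

// Offset that must be reported right after "addi sp,sp,16" in the epilogue.
int toy_offset_after_body_dealloc ()
{
  toy_cfa_tracker t;
  t.adjust_sp (-16);   // call t0,__riscv_save_1 (libcall frame, 16 bytes)
  t.adjust_sp (-16);   // addi sp,sp,-16
  t.adjust_sp (16);    // addi sp,sp,16
  return t.cfa_offset; // the libcall frame is still allocated here
}
```

Only the final tail call to __riscv_restore_1 releases the remaining 16
bytes, which is why the offset reaches 0 there and not earlier.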

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_expand_epilogue): fix cfi issue with
correct offset.
---
 gcc/config/riscv/riscv.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 85db1e3c86b..469af02bdf7 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5652,7 +5652,7 @@ riscv_expand_epilogue (int style)
   adjust));
  rtx dwarf = NULL_RTX;
  rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-GEN_INT (step2));
+GEN_INT (step2 + libcall_size));
 
  dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
  RTX_FRAME_RELATED_P (insn) = 1;
@@ -5689,7 +5689,7 @@ riscv_expand_epilogue (int style)
 
   rtx dwarf = NULL_RTX;
   rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-const0_rtx);
+GEN_INT (libcall_size ));
   dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
   RTX_FRAME_RELATED_P (insn) = 1;
 
-- 
2.17.1

Re: [RFC] Introduce -finline-memset-loops

2023-06-02 Thread Alexandre Oliva via Gcc-patches

On Jan 19, 2023, Alexandre Oliva  wrote:

> Would it make more sense to extend it, even constrained by the
> limitations mentioned above, or handle memset only?  In the latter case,
> would it still make sense to adopt a command-line option that suggests a
> broader effect than it already has, even if it's only a hopeful future
> extension?  -finline-all-stringops[={memset,memcpy,...}], that you
> suggested, seems to be a reasonable and extensible one to adopt.

I ended up implementing all of memset, memcpy, memmove, and memcmp:

Introduce -finline-stringops

try_store_by_multiple_pieces was added not long ago, enabling
variable-sized memset to be expanded inline when the worst-case
in-range constant length would, using conditional blocks with powers
of two to cover all possibilities of length and alignment.

This patch introduces -finline-stringops[=fn] to request expansions to
start with a loop, so as to still take advantage of known alignment
even with long lengths, but without necessarily adding store blocks
for every power of two.

This makes it possible for the supported stringops (memset, memcpy,
memmove, memcmp) to be expanded, even if storing a single byte per
iteration.  Surely efficient implementations can run faster, with a
pre-loop to increase alignment, but that would likely be excessive for
inline expansions.

Still, in some cases, such as in freestanding environments, users
prefer to inline such stringops, especially those that the compiler
may introduce itself, even if the expansion is not as performant as a
highly optimized C library implementation could be, to avoid
depending on a C runtime library.
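
To give a rough idea of the shape of such an expansion, here is a sketch
of a loop-based memset in source form. It is not what the patch emits:
the toy_* names are made up, the block size is capped at 8 bytes as an
arbitrary assumption, and the fixed-size memcpy merely stands in for a
single store of that width. The block size is derived from the trailing
zero bits of the length, in the spirit of the "ctz of len" arguments
added by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Largest power of two, capped at 8, that divides LEN (1 if LEN is 0).
static std::size_t toy_block_size (std::size_t len)
{
  std::size_t incr = 1;
  while (len != 0 && incr < 8 && len % (incr * 2) == 0)
    incr *= 2;
  return incr;
}

// Fill LEN bytes at DST with VAL, storing one INCR-sized block per
// iteration.  Because INCR divides LEN, no tail handling is needed.
void toy_inline_memset (unsigned char *dst, unsigned char val,
                        std::size_t len)
{
  const std::size_t incr = toy_block_size (len);
  unsigned char pattern[8];
  for (std::size_t i = 0; i < 8; ++i)
    pattern[i] = val;
  for (std::size_t off = 0; off < len; off += incr)
    std::memcpy (dst + off, pattern, incr);  // stands in for one wide store
}
```

When nothing is known about the length, the block size degenerates to one
byte per iteration, which matches the "even if storing a single byte per
iteration" case mentioned above.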

Regstrapped on x86_64-linux-gnu, also bootstrapped with
-finline-stringops enabled by default, and tested with arm, aarch, 32-
and 64-bit riscv with gcc-12.  Ok to install?


for  gcc/ChangeLog

* expr.cc (emit_block_move_hints): Take ctz of len.  Obey
-finline-stringops.  Use oriented or sized loop.
(emit_block_move): Take ctz of len, and pass it on.
(emit_block_move_via_sized_loop): New.
(emit_block_move_via_oriented_loop): New.
(emit_block_move_via_loop): Take incr.  Move an incr-sized
block per iteration.
(emit_block_cmp_via_cmpmem): Take ctz of len.  Obey
-finline-stringops.
(emit_block_cmp_via_loop): New.
* expr.h (emit_block_move): Add ctz of len defaulting to zero.
(emit_block_move_hints): Likewise.
(emit_block_cmp_hints): Likewise.
* builtins.cc (expand_builtin_memory_copy_args): Pass ctz of
len to emit_block_move_hints.
(try_store_by_multiple_pieces): Support starting with a loop.
(expand_builtin_memcmp): Pass ctz of len to
emit_block_cmp_hints.
(expand_builtin): Allow inline expansion of memset, memcpy,
memmove and memcmp if requested.
* common.opt (finline-stringops): New.
(ilsop_fn): New enum.
* flag-types.h (enum ilsop_fn): New.
* doc/invoke.texi (-finline-stringops): Add.

for  gcc/testsuite/ChangeLog

* gcc.dg/torture/inline-mem-cmp-1.c: New.
* gcc.dg/torture/inline-mem-cpy-1.c: New.
* gcc.dg/torture/inline-mem-cpy-cmp-1.c: New.
* gcc.dg/torture/inline-mem-move-1.c: New.
* gcc.dg/torture/inline-mem-set-1.c: New.
---
 gcc/builtins.cc|  114 ++
 gcc/common.opt |   34 ++
 gcc/doc/invoke.texi|   15 +
 gcc/expr.cc|  374 +++-
 gcc/expr.h |9 
 gcc/flag-types.h   |   11 +
 gcc/testsuite/gcc.dg/torture/inline-mem-cmp-1.c|7 
 gcc/testsuite/gcc.dg/torture/inline-mem-cpy-1.c|8 
 .../gcc.dg/torture/inline-mem-cpy-cmp-1.c  |   11 +
 gcc/testsuite/gcc.dg/torture/inline-mem-move-1.c   |9 
 gcc/testsuite/gcc.dg/torture/inline-mem-set-1.c|   84 
 11 files changed, 646 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/inline-mem-cmp-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/inline-mem-cpy-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/inline-mem-cpy-cmp-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/inline-mem-move-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/inline-mem-set-1.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 8400adaf5b4db..1beaa4eae97a5 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -3769,7 +3769,7 @@ expand_builtin_memory_copy_args (tree dest, tree src, tree len,
 expected_align, expected_size,
 min_size, max_size, probable_max_size,
 use_mempcpy_call, &is_move_done,
-might_overlap);
+might_overlap, tree_ctz (len));
 
   /* Bail

[PATCH, PR110086] avr: Fix ICE on optimize attribute

2023-06-02 Thread SenthilKumar.Selvaraj--- via Gcc-patches

Hi,

This patch fixes an ICE when an optimize attribute changes the prevailing
optimization level.

I found https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105069 describing the
same ICE for the sh target, where the fix was to enable save/restore of
target specific options modified via TARGET_OPTIMIZATION_TABLE hook.

For the AVR target, -mgas-isr-prologues and -mmain-is-OS_task are those
target specific options. As they enable generation of more optimal code,
this patch adds the Optimization option property to those option records,
and that fixes the ICE.

Regression run shows no regressions, and >100 new PASSes.
Ok to commit to master?

Regards
Senthil


PR 110086

gcc/ChangeLog:

* config/avr/avr.opt (mgas-isr-prologues, mmain-is-OS_task):
Add Optimization option property.

gcc/testsuite/ChangeLog:

* gcc.target/avr/pr110086.c: New test.

diff --git gcc/config/avr/avr.opt gcc/config/avr/avr.opt
index f62d746..5a0b465 100644
--- gcc/config/avr/avr.opt
+++ gcc/config/avr/avr.opt
@@ -27,7 +27,7 @@ Target RejectNegative Joined Var(avr_mmcu) 
MissingArgError(missing device or arc
 -mmcu=MCU  Select the target MCU.
 
 mgas-isr-prologues
-Target Var(avr_gasisr_prologues) UInteger Init(0)
+Target Var(avr_gasisr_prologues) UInteger Init(0) Optimization 
 Allow usage of __gcc_isr pseudo instructions in ISR prologues and epilogues.
 
 mn-flash=
@@ -65,7 +65,7 @@ Target Joined RejectNegative UInteger Var(avr_branch_cost) 
Init(0)
 Set the branch costs for conditional branch instructions.  Reasonable values 
are small, non-negative integers.  The default
branch cost is 0.
 
 mmain-is-OS_task
-Target Mask(MAIN_IS_OS_TASK)
+Target Mask(MAIN_IS_OS_TASK) Optimization
 Treat main as if it had attribute OS_task.
 
 morder1
diff --git gcc/testsuite/gcc.target/avr/pr110086.c 
gcc/testsuite/gcc.target/avr/pr110086.c
new file mode 100644
index 000..6b97620
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr110086.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+
+void __attribute__((optimize("O0"))) foo(void) {
+}

Consider '--with-build-sysroot=[...]' for target libraries' build-tree testing (instead of build-time 'CC' etc.) [PR109951]

2023-06-02 Thread Thomas Schwinge

Hi!

For context:

In early 2020, Maciej pushed a set of changes that made target libraries
capture their build-time 'CC' for test-time usage as 'GCC_UNDER_TEST'
(via the respective 'site.exp'), for build-tree testing (but not
installed testing).  This was motivated by the need for '--sysroot=[...]'
as set per '--with-build-sysroot=[...]'.  For example, for libgomp that's
Subversion r279708 (Git commit c8e759b4215ba4b376c9d468aeffe163b3d520f0)
"libgomp/test: Fix compilation for build sysroot", followed by
Git commit 749bd22ddc50b5112e5ed506ffef7249bf8e6fb3
"libgomp/test: Remove a build sysroot fix regression".  (Similarly other
commits for a number of (but not all?) other target libraries).

In recent Git commit 11f4d483600b5788a3d1cf1527e838e4a7ed1455
"libgomp testsuite: As appropriate, use the 'gcc', 'g++', 'gfortran' driver 
[PR91884]"
I've then extended this approach for 'CXX': 'GXX_UNDER_TEST',
'FC': 'GFORTRAN_UNDER_TEST'.

Per 
"libgomp, testsuite: non-native multilib c++ tests fail on Darwin"
we have however found that there is a conceptual problem in the original
approach: the build-time 'CC' etc. potentially are different per multilib
build of the target library (say, multilib-specific '-L[...]'  flags),
whereas the test-time usage will always use the 'GCC_UNDER_TEST', thus
build-time 'CC' of the default multilib.  (This is fundamental in how the
GCC/DejaGnu testsuite is set up; we're not going to change this without
considerable effort.)  Often this is not a problem (if, effectively,
non-applicable flags etc. are dropped silently, for example), but as Iain
has demonstrated, this is an actual problem in certain configurations
(if, effectively, non-applicable flags etc. cause diagnostics, for
example).

Now, back then, Chung-Lin actually had a different proposal:

On 2020-01-14T21:31:13+0800, Chung-Lin Tang  wrote:
> I understand your situation with --with-build-sysroot/--without-sysroot, [...]
>
> Can you test if the attached patch works for you? The patch exports the build 
> sysroot
> setting from the toplevel to target library subdirs, and adds the --sysroot= 
> option
> when doing build-tree testing [...]

Belatedly: thanks, I like that approach better indeed.

This is, by the way, in line with what GCC compiler testing is doing;
'gcc/Makefile.in':

# Set if the compiler was configured with --with-build-sysroot.
SYSROOT_CFLAGS_FOR_TARGET = @SYSROOT_CFLAGS_FOR_TARGET@

# TEST_ALWAYS_FLAGS are flags that should be passed to every compilation.
# They are passed first to allow individual tests to override them.
@echo "set TEST_ALWAYS_FLAGS \"$(SYSROOT_CFLAGS_FOR_TARGET)\"" >> 
./site.tmp

That is, via 'site.exp', put 'SYSROOT_CFLAGS_FOR_TARGET' into
'TEST_ALWAYS_FLAGS', which is "passed to every compilation".

> [...], if this does work, then other library testsuites (e.g. libatomic.exp) 
> might
> also need considering updating, I think.

Correct.  (I'm offering to take care of that.)

> 2020-01-14  Chung-Lin Tang  
>
>   * Makefile.tpl  (NORMAL_TARGET_EXPORTS): Add export of
>   SYSROOT_CFLAGS_FOR_TARGET variable.
>   * Makefile.in:  Regenerate.
>
>   libgomp/
>   * testsuite/lib/libgomp.exp (ALWAYS_CFLAGS): Add
>   --sysroot=$SYSROOT_CFLAGS_FOR_TARGET option when doing build-tree 
> testing.
>   Fix comment typo.
>   * testsuite/libgomp-test-support.exp.in (GCC_UNDER_TEST): Delete 
> definition.

To that, Maciej had to report:

On 2020-01-31T21:46:01+, "Maciej W. Rozycki"  wrote:
>  So it does seem to pick the right uninstalled compiler, however without
> the sysroot option and therefore all tests fail [...]

That, however, is not a conceptual but simply an implementation problem:

> --- libgomp/testsuite/lib/libgomp.exp (revision 279954)
> +++ libgomp/testsuite/lib/libgomp.exp (working copy)
> @@ -171,9 +171,16 @@ proc libgomp_init { args } {
>  lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/../../include"
>  lappend ALWAYS_CFLAGS "additional_flags=-I${srcdir}/.."
>
> -# For build-tree testing, also consider the library paths used for 
> builing.
> +# For build-tree testing, also consider the library paths used for 
> building.
>  # For installed testing, we assume all that to be provided in the 
> sysroot.
>  if { $blddir != "" } {
> +
> + # If --with-build-sysroot= was specified, we assume it will be needed
> + # for build-tree testing.
> + if [info exists SYSROOT_CFLAGS_FOR_TARGET] {
> + lappend ALWAYS_CFLAGS 
> "additional_flags=--sysroot=$SYSROOT_CFLAGS_FOR_TARGET"
> + }

Need 'global SYSROOT_CFLAGS_FOR_TARGET'.

Need to change:

-   lappend ALWAYS_CFLAGS 
"additional_flags=--sysroot=$SYSROOT_CFLAGS_FOR_TARGET"
+   lappend ALWAYS_CFLAGS "additional_flags=$SYSROOT_CFLAGS_FOR_TARGET"

..., as 'SYSROOT_CFLAGS_FOR_TARGET' already includes '--sysroot=' prefix.

> --- libgomp/testsuite/libgomp-test-support.exp.in (revision

Re: [PATCH] Move std::search into algobase.h

2023-06-02 Thread François Dumont via Gcc-patches


Ok, push done.

Even after full rebuild those tests are still UNRESOLVED on my system.

Yes, I also noticed that I could remove this check. I'll propose it later.

François

On 02/06/2023 09:43, Jonathan Wakely wrote:

On Fri, 2 Jun 2023 at 08:33, François Dumont wrote:

I haven't been able to reproduce so far.

Here is however a patch that I think will fix the problem. At
least failing tests are UNRESOLVED on my system.

    libstdc++: Fix broken _GLIBCXX_PARALLEL mode

    Add missing  include in .


This fixes the broken parallel mode.


    Detect availability of  in tests needing it to make
them UNSUPPORTED
    rather than PASS when header is missing.

    libstdc++-v3/ChangeLog:

    * include/parallel/algobase.h: Include
.
    * testsuite/lib/libstdc++.exp
(check_effective_target_omp): New.
    * testsuite/17_intro/headers/c++2011/parallel_mode.cc:
    Add { dg-require-effective-target omp }.
    * testsuite/17_intro/headers/c++2014/parallel_mode.cc:
Likewise.
    * testsuite/17_intro/headers/c++2017/parallel_mode.cc:
Likewise.

Ok to commit ?


Please just add the #include to parallel/algobase.h for now.

The effective-target keyword seems reasonable, but "omp" is not a good 
name. And if we add that dg-require-effective-target to those tests 
then they don't need to repeat the check in the test itself:

#if __has_include()

So please just add the #include and then we can revisit the 
effective-target separately.




On 01/06/2023 23:57, Jonathan Wakely wrote:

On Thu, 1 Jun 2023, 21:37 François Dumont via Libstdc++ wrote:

Now I've installed OMP and am trying to rebuild the lib to reproduce the
failure.


You shouldn't need to install anything, just build gcc and don't
configure it with --disable-libgomp


I haven't used --disable-libgomp. But maybe when I run configure
the 1st time it tried to detect OMP install and failed to find it
as I just installed it.


I don't know what you mean, because GCC doesn't depend on "OMP". GCC 
includes its own OpenMP implementation, and installs its own libgomp 
runtime library to support the -fopenmp flag. It doesn't depend on 
anything else.


Which OS are you testing on?

Re: [PATCH] inline: improve internal function costs

2023-06-02 Thread Andre Vieira (lists) via Gcc-patches





On 02/06/2023 10:13, Richard Biener wrote:

On Thu, 1 Jun 2023, Andre Vieira (lists) wrote:


Hi,

This is a follow-up of the internal function patch to add widening and
narrowing patterns.  This patch improves the inliner cost estimation for
internal functions.


I have no idea why calls are special in IPA analyze_function_body
and so I cannot say whether treating all internal fn calls as
non-calls is correct there.  Honza?

The tree-inline.cc change is OK though (you can push that separately).

I can't though, it ICEs on libgcc compilation (and other tests in the
testsuite). The estimate function is used by IPA to compute size and 
without the changes there it hits an assert because the 
estimate_num_insns no longer matches what ipa records in its 
size_time_table.


I'll wait for Honza to comment.


Thanks,
Richard.


Bootstrapped and regression tested on aarch64-unknown-linux-gnu.

gcc/ChangeLog:

 * ipa-fnsummary.cc (analyze_function_body): Correctly handle
 non-zero costed internal functions.
 * tree-inline.cc (estimate_num_insns): Improve costing for internal
 functions.

Re: [PATCH] libstdc++: Correct NTTP and simd_mask ctor call

2023-06-02 Thread Matthias Kretz via Gcc-patches

On Friday, 2 June 2023 11:30:17 CEST Alexandre Oliva wrote:
> I also noticed the same test is failing on rtems6 (at least with gcc
> 11).  AFAICT the problem is that _GLIBCXX_SIMD_MATH_CALL* macros in
> simd_math.h expect the named functions to be in std::, but I get such
> errors as:
> 
> .../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299:
> error: 'remainder' is not a member of 'std'
> .../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299:
> note: suggested alternatives: [...]
> .../aarch64-rtems6/include/math.h:346: note:   'remainder'
> [...]
> .../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299:
> note:   'std::experimental::parallelism_v2::remainder'
> .../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299:
> error: template argument 1 is invalid [...]
> .../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1328:
> error: 'fmin' is not a member of 'std'; did you mean 'min'?
> .../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1328:
> error: 'fmin' is not a member of 'std'; did you mean 'min'?
> .../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1328:
> error: template argument 1 is invalid
> 
> ISTM that rtems is missing some of the math.h functions expected by
> libstdc++, but also that even those that are present are not visible in
> namespace ::std::, where the macros reasonably expect to find them.  Is
> this known?  Should I file a PR about it?

I had/have no idea. Is rtems6 using the "freestanding" subset of C++? In which 
case simd shouldn't be there at all. Otherwise  should work, no?

- Matthias

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 1, 2023 at 8:25 PM Alexander Monakov  wrote:
>
>
> On Wed, 31 May 2023, Richard Biener wrote:
>
> > On Tue, May 30, 2023 at 4:49 PM Alexander Monakov  
> > wrote:
> > >
> > >
> > > On Thu, 25 May 2023, Richard Biener wrote:
> > >
> > > > On Wed, May 24, 2023 at 8:36 PM Alexander Monakov  
> > > > wrote:
> > > > >
> > > > >
> > > > > On Wed, 24 May 2023, Richard Biener via Gcc-patches wrote:
> > > > >
> > > > > > I’d have to check the ISAs what they actually do here - it of 
> > > > > > course depends
> > > > > > on RTL semantics as well but as you say those are not strictly 
> > > > > > defined here
> > > > > > either.
> > > > >
> > > > > Plus, we can add the following executable test to the testsuite:
> > > >
> > > > Yeah, that's probably a good idea.  I think your documentation change
> > > > with the added sentence about the truncation is OK.
> > >
> > > I am no longer confident in my patch, sorry.
> > >
> > > My claim about vector shift semantics in OpenCL was wrong. In fact it 
> > > specifies
> > > that RHS of a vector shift is masked to the exact bitwidth of the element 
> > > type.
> > >
> > > So, to collect various angles:
> > >
> > > 1. OpenCL semantics would need an 'AND' before a shift (except 
> > > VSX/Altivec).
> > >
> > > 2. From user side we had a request to follow C integer promotion semantics
> > >in https://gcc.gnu.org/PR91838 but I now doubt we can do that.
> > >
> > > 3. LLVM makes oversized vector shifts UB both for 'vector_size' and
> > >'ext_vector_type'.
> >
> > I had the impression GCC desired to do 3. as well, matching what we do
> > for scalar shifts.
> >
> > > 4. Vector lowering does not emit promotions, and starting from gcc-12
> > >ranger treats oversized shifts according to the documentation you
> > >cite below, and optimizes (e.g. with '-O2 -mno-sse')
> > >
> > > typedef short v8hi __attribute__((vector_size(16)));
> > >
> > > void f(v8hi *p)
> > > {
> > > *p >>= 16;
> > > }
> > >
> > >to zeroing '*p'. If this looks unintended, I can file a bug.
> > >
> > > I still think we need to clarify semantics of vector shifts, but probably
> > > not in the way I proposed initially. What do you think?
> >
> > I think the intent at some point was to adhere to the OpenCL spec
> > for the GCC vector extension (because that's a written spec while
> > GCCs vector extension docs are lacking).  Originally the powerpc
> > altivec 'vector' keyword spurred most of the development IIRC
> > so it might be useful to see how they specify shifts.
>
> It doesn't look like they document the semantics of '<<' and '>>'
> operators for vector types.
>
> > So yes, we probably should clarify the semantics to match the
> > implementation (since we have two targets doing things differently
> > since forever we can only document it as UB) and also note the
> > difference from OpenCL (in case OpenCL is still relevant these
> > days we might want to offer a -fopencl-vectors to emit the required
> > AND).
>
> It doesn't have to be UB, in principle we could say that shift amount
> is taken modulo some power of two depending on the target without UB.
> But since LLVM already treats that as UB, we might as well follow.
>
> I think for addition/multiplication of signed vectors everybody
> expects them to have wrapping semantics without UB on overflow though?

Actually GCC already treats them as UB on overflow by means of
vector lowering eventually turning them into scalar operations and
quite some patterns in match.pd applying to ANY_INTEGRAL_TYPE_P.

> Revised patch below.

The revised patch is OK.

Thanks,
Richard.

> > It would be also good to amend the RTL documentation.
> >
> > It would be very nice to start an internals documentation section
> > around collecting what the middle-end considers undefined
> > or implementation defined (aka target defined) behavior in the
> > GENERIC, GIMPLE and RTL ILs and what predicates eventually
> > control that (like TYPE_OVERFLOW_UNDEFINED).  Maybe spread it over
> > {gimple,generic,rtl}.texi, though gimple.texi is only about the 
> > representation
> > and all semantics are shared and documented in generic.texi.
>
> Hm, noted. Thanks.
>
> ---8<---
>
> From e4e8d9e262f2f8dbc91a94291cf7accb74d27e7c Mon Sep 17 00:00:00 2001
> From: Alexander Monakov 
> Date: Wed, 24 May 2023 15:48:29 +0300
> Subject: [PATCH] doc: clarify semantics of vector bitwise shifts
>
> Explicitly say that attempted shift past element bit width is UB for
> vector types.  Mention that integer promotions do not happen.
>
> gcc/ChangeLog:
>
> * doc/extend.texi (Vector Extensions): Clarify bitwise shift
> semantics.
> ---
>  gcc/doc/extend.texi | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e426a2eb7d..3723cfe467 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -12026,7 +12026,14 @@ elements in the operand.
>  It

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-06-02 Thread Richard Biener via Gcc-patches

On Fri, Jun 2, 2023 at 11:24 AM Alexander Monakov  wrote:
>
>
> On Fri, 2 Jun 2023, Matthias Kretz wrote:
>
> > > Okay, I see opinions will vary here. I was thinking about our immintrin.h
> > > which is partially implemented in terms of generic vectors. Imagine we
> > > extend UBSan to trap on signed overflow for vector types. I expect that
> > > will blow up on existing code that uses Intel intrinsics.
> >
> > _mm_add_epi32 is already implemented via __v4su addition (i.e. unsigned). So
> > the intrinsic would continue to wrap on signed overflow.
>
> Ah, if our intrinsics take care of it, that alleviates my concern.

Just to add when generic vectors are lowered to scalar operations then
signed vector ops become signed scalar ops which means followup
optimizations will assume undefined behavior on overflow.
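
To illustrate the __v4su point made above: a minimal sketch, assuming GCC's
generic vector extension, of how a signed lane-wise add can be done on an
unsigned vector type so that it wraps instead of relying on signed overflow.
The names v4si, v4su and add_wrap are made up for this example; they are not
the immintrin.h definitions.

```c
#include <limits.h>

/* Hypothetical type names for this sketch: 4 x 32-bit lanes.  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef unsigned int v4su __attribute__ ((vector_size (16)));

/* Add lane-wise with wrap-around: do the arithmetic on the unsigned
   vector type, where overflow is well defined, then reinterpret the
   bits as a signed vector again.  */
static v4si
add_wrap (v4si a, v4si b)
{
  return (v4si) ((v4su) a + (v4su) b);
}
```

Because the addition itself is unsigned, later lowering to scalar operations
yields unsigned scalar adds, which should leave no signed overflow for
follow-up optimizations to exploit.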

> > > I'm not sure what you consider a breaking change here. Is that the implied
> > > threat to use undefinedness for range deduction and other optimizations?
> >
> > Consider the stdx::simd implementation. It currently follows semantics of 
> > the
> > builtin types. So simd can be shifted by 30 without UB. The
> > implementation of the shift operator depends on the current behavior, even 
> > if
> > it is target-dependent. For PPC the simd implementation adds extra code to
> > avoid the "UB". With nailing down shifts > sizeof(T) as UB this extra code 
> > now
> > needs to be added for all targets.
>
> What does stdx::simd do on LLVM, where that has always been UB even on x86?
>
> Alexander

Re: Re: [PATCH] RISC-V: Fix warning in predicated.md

2023-06-02 Thread juzhe.zh...@rivai.ai

Hi, I fixed it:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620462.html 
Just feel free to commit it.

Thanks.


juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2023-06-02 17:29
To: juzhe.zhong
CC: gcc-patches; kito.cheng; kito.cheng; palmer; palmer; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Fix warning in predicated.md
../../gcc/gcc/config/riscv/predicates.md: In function ‘bool arith_operand_or_mode_mask(rtx, machine_mode)’:
../../gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison between 
signed and unsigned integer expressions [-Wsign-compare]
 (match_test "INTVAL (op) == GET_MODE_MASK (HImode)
 
-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-06-02 Thread Matthias Kretz via Gcc-patches

On Friday, 2 June 2023 11:24:23 CEST Alexander Monakov wrote:
> > > I'm not sure what you consider a breaking change here. Is that the
> > > implied
> > > threat to use undefinedness for range deduction and other optimizations?
> > 
> > Consider the stdx::simd implementation. It currently follows semantics of
> > the builtin types. So simd can be shifted by 30 without UB. The
> > implementation of the shift operator depends on the current behavior, even
> > if it is target-dependent. For PPC the simd implementation adds extra
> > code to avoid the "UB". With nailing down shifts > sizeof(T) as UB this
> > extra code now needs to be added for all targets.
> 
> What does stdx::simd do on LLVM, where that has always been UB even on x86?

At this point Clang/LLVM support is best effort. I did not know before that 
LLVM nailed this down as UB. Also my test suite didn't show any failures on 
shifts IIRC (but that doesn't say anything about UB, I know).

FWIW, I'm okay with saying nothing in the release notes. It might just be that 
some codes have become dependent on the existing (under-specified) behavior.

- Matthias
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
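
For readers following the thread, the kind of "extra code" that avoids an
oversized shift can be sketched with GCC's generic vector extension. This is
only an illustration of the OpenCL-style masking mentioned earlier in the
thread (shift count reduced to the element bit width), not the actual
stdx::simd implementation; v8hu and shr_masked are hypothetical names.

```c
/* Hypothetical type name for this sketch: 8 x 16-bit unsigned lanes.  */
typedef unsigned short v8hu __attribute__ ((vector_size (16)));

/* Shift each lane of V right by the matching lane of COUNT, first
   reducing the count modulo the element width (16 bits) so that no
   lane ever sees an oversized shift.  */
static v8hu
shr_masked (v8hu v, v8hu count)
{
  const v8hu mask = { 15, 15, 15, 15, 15, 15, 15, 15 };
  return v >> (count & mask);
}
```

With this masking a count of 16 behaves like a count of 0, so the result is
defined on every target, at the cost of one extra AND per shift.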

[PATCH V2] RISC-V: Fix warning in predicated.md

2023-06-02 Thread juzhe . zhong

From: Juzhe-Zhong 

Notice there is a warning in predicates.md:
../../../riscv-gcc/gcc/config/riscv/predicates.md: In function 'bool 
arith_operand_or_mode_mask(rtx, machine_mode)':
../../../riscv-gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison 
between signed and unsigned integer expressions [-Wsign-compare]
 (match_test "INTVAL (op) == GET_MODE_MASK (HImode)
../../../riscv-gcc/gcc/config/riscv/predicates.md:34:20: warning: comparison 
between signed and unsigned integer expressions [-Wsign-compare]
 || INTVAL (op) == GET_MODE_MASK (SImode)"

gcc/ChangeLog:

* config/riscv/predicates.md: Change INTVAL into UINTVAL.

---
 gcc/config/riscv/predicates.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index d14b1ca30bb..04ca6ceabc7 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -30,7 +30,7 @@
 (define_predicate "arith_operand_or_mode_mask"
   (ior (match_operand 0 "arith_operand")
(and (match_code "const_int")
-(match_test "INTVAL (op) == GET_MODE_MASK (HImode)
+(match_test "UINTVAL (op) == GET_MODE_MASK (HImode)
 || UINTVAL (op) == GET_MODE_MASK (SImode)"
 
 (define_predicate "lui_operand"
-- 
2.36.1
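
As background on the warning itself, here is a minimal sketch in plain C of
the comparison being fixed. The typedefs hwi and uhwi are stand-ins for
GCC's HOST_WIDE_INT and unsigned HOST_WIDE_INT (assumed 64-bit here), and
equals_mask is a hypothetical helper, not code from predicates.md.

```c
#include <stdint.h>

/* Stand-ins for GCC's host-wide integer types (assumption: 64-bit).  */
typedef int64_t hwi;
typedef uint64_t uhwi;

/* Compare a signed constant against an unsigned mask with both sides
   unsigned.  Converting the value explicitly makes the intent visible
   and avoids a -Wsign-compare diagnostic.  */
static int
equals_mask (hwi value, uhwi mask)
{
  return (uhwi) value == mask;
}
```

For operands of the same width the usual arithmetic conversions already
convert the signed side to unsigned, so the explicit conversion is not
expected to change the result of the comparison; it only removes the warning.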

Re: [PATCH] rtl-optimization: [PR102733] DSE removing address which only differ by address space.

2023-06-02 Thread Richard Biener via Gcc-patches

On Fri, Jun 2, 2023 at 9:36 AM Andrew Pinski via Gcc-patches
 wrote:
>
> The problem here is DSE was not taking into account the address space
> which meant if you had two addresses say `fs:0` and `gs:0` (on x86_64),
> DSE would think they were the same and remove the first store.
> This fixes that issue by adding a check for the address space too.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> PR rtl-optimization/102733
>
> gcc/ChangeLog:
>
> * dse.cc (store_info): Add addrspace field.
> (record_store): Record the address space
> and check to make sure they are the same.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/addr-space-6.c: New test.
> ---
>  gcc/dse.cc   |  9 -
>  gcc/testsuite/gcc.target/i386/addr-space-6.c | 21 
>  2 files changed, 29 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/addr-space-6.c
>
> diff --git a/gcc/dse.cc b/gcc/dse.cc
> index 802b949cfb2..8b07be17674 100644
> --- a/gcc/dse.cc
> +++ b/gcc/dse.cc
> @@ -251,6 +251,9 @@ public:
>   and known (rather than -1).  */
>poly_int64 width;
>
> +  /* The address space that the memory reference uses.  */
> +  unsigned char addrspace;
> +
>union
>  {
>/* A bitmask as wide as the number of bytes in the word that
> @@ -1524,6 +1527,7 @@ record_store (rtx body, bb_info_t bb_info)
>ptr = active_local_stores;
>last = NULL;
>redundant_reason = NULL;
> +  unsigned char addrspace = MEM_ADDR_SPACE (mem);
>mem = canon_rtx (mem);
>
>if (group_id < 0)
> @@ -1548,7 +1552,9 @@ record_store (rtx body, bb_info_t bb_info)
>while (!s_info->is_set)
> s_info = s_info->next;
>
> -  if (s_info->group_id == group_id && s_info->cse_base == base)
> +  if (s_info->group_id == group_id
> + && s_info->cse_base == base
> + && s_info->addrspace == addrspace)
> {
>   HOST_WIDE_INT i;
>   if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -1688,6 +1694,7 @@ record_store (rtx body, bb_info_t bb_info)
>store_info->rhs = rhs;
>store_info->const_rhs = const_rhs;
>store_info->redundant_reason = redundant_reason;
> +  store_info->addrspace = addrspace;
>
>/* If this is a clobber, we return 0.  We will only be able to
>   delete this insn if there is only one store USED store, but we
> diff --git a/gcc/testsuite/gcc.target/i386/addr-space-6.c 
> b/gcc/testsuite/gcc.target/i386/addr-space-6.c
> new file mode 100644
> index 000..82eca4d7e0c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/addr-space-6.c
> @@ -0,0 +1,21 @@
> +/* PR rtl-optimization/102733 */
> +/* { dg-do compile } */
> +/* { dg-options "-O1" } */
> +
> +/* DSE was removing a store to fs:0 (correctly)
> +   and gs:0 (incorrectly) as DSE didn't take into
> +   account the address space was different.  */
> +
> +void test_null_store (void)
> +{
> +  int __seg_fs *fs = (int __seg_fs *)0;
> +  *fs = 1;
> +
> +  int __seg_gs *gs = (int __seg_gs *)0;
> +  *gs = 2;
> +  *fs = 3;
> +}
> +
> +/* { dg-final { scan-assembler-times "movl\t" 2 } } */
> +/* { dg-final { scan-assembler "gs:" } } */
> +/* { dg-final { scan-assembler "fs:" } } */
> --
> 2.31.1
>
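
The check added by the patch can be pictured with a small standalone model.
This is a sketch only: store_rec and same_location_p are hypothetical names,
the struct is a heavily simplified stand-in for GCC's store_info, and the
address-space numbers used with it are arbitrary.

```c
#include <stdbool.h>

/* Simplified stand-in for a recorded store; not GCC's store_info.  */
struct store_rec
{
  int group_id;             /* which base group the address belongs to */
  const void *base;         /* canonicalized base address */
  unsigned char addrspace;  /* address space of the memory reference */
};

/* Two stores may only be treated as writing the same location when
   group, base and address space all agree.  Without the last test,
   fs:0 and gs:0 would look identical.  */
static bool
same_location_p (const struct store_rec *a, const struct store_rec *b)
{
  return (a->group_id == b->group_id
          && a->base == b->base
          && a->addrspace == b->addrspace);
}
```

In this model a store to address 0 in one address space no longer makes an
earlier store to address 0 in a different address space look dead.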

Re: Re: [PATCH] RISC-V: Fix warning in predicated.md

2023-06-02 Thread juzhe.zh...@rivai.ai

Oh, there are 2 INTVAL (op) == GET_MODE_MASK...
I only changed one :)



juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2023-06-02 17:29
To: juzhe.zhong
CC: gcc-patches; kito.cheng; kito.cheng; palmer; palmer; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Fix warning in predicated.md
../../gcc/gcc/config/riscv/predicates.md: In function ‘bool arith_operand_or_mode_mask(rtx, machine_mode)’:
../../gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison between 
signed and unsigned integer expressions [-Wsign-compare]
 (match_test "INTVAL (op) == GET_MODE_MASK (HImode)
 
-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [PATCH] libstdc++: Correct NTTP and simd_mask ctor call

2023-06-02 Thread Alexandre Oliva via Gcc-patches

On Jun  2, 2023, Matthias Kretz  wrote:

> I'm looking at that function again, also in light of recent improvements wrt. 
> code-gen, and will remove that assumption, that long long is vectorizable.

Thanks, I'll leave that to you, then.

I also noticed the same test is failing on rtems6 (at least with gcc
11).  AFAICT the problem is that _GLIBCXX_SIMD_MATH_CALL* macros in
simd_math.h expect the named functions to be in std::, but I get such
errors as:

.../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299: 
error: 'remainder' is not a member of 'std'
.../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299: note: 
suggested alternatives:
[...]
.../aarch64-rtems6/include/math.h:346: note:   'remainder'
[...]
.../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299: note: 
  'std::experimental::parallelism_v2::remainder'
.../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1299: 
error: template argument 1 is invalid
[...]
.../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1328: 
error: 'fmin' is not a member of 'std'; did you mean 'min'?
.../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1328: 
error: 'fmin' is not a member of 'std'; did you mean 'min'?
.../aarch64-rtems6/include/c++/11.4.1/experimental/bits/simd_math.h:1328: 
error: template argument 1 is invalid

ISTM that rtems is missing some of the math.h functions expected by
libstdc++, but also that even those that are present are not visible in
namespace ::std::, where the macros reasonably expect to find them.  Is
this known?  Should I file a PR about it?

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

Re: [PATCH RFA] varasm: check float size

2023-06-02 Thread Richard Biener via Gcc-patches

On Fri, Jun 2, 2023 at 4:44 AM Jason Merrill via Gcc-patches
 wrote:
>
> Tested x86_64-pc-linux-gnu, OK for trunk?

OK.

> -- 8< --
>
> In PR95226, the testcase was failing because we tried to output_constant a
> NOP_EXPR to float from a double REAL_CST, and so we output a double where
> the caller wanted a float.  That doesn't happen anymore, but with the
> output_constant hunk we will ICE in that situation rather than emit the
> wrong number of bytes.
>
> Part of the problem was that initializer_constant_valid_p_1 returned true
> for that NOP_EXPR, because it compared the sizes of integer types but not
> floating-point types.  So the C++ front end assumed it didn't need to fold
> the initializer.
>
> PR c++/95226
>
> gcc/ChangeLog:
>
> * varasm.cc (output_constant) [REAL_TYPE]: Check that sizes match.
> (initializer_constant_valid_p_1): Compare float precision.
> ---
>  gcc/varasm.cc | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 34400ec39ef..dd84754a283 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -4876,16 +4876,16 @@ initializer_constant_valid_p_1 (tree value, tree 
> endtype, tree *cache)
> tree src_type = TREE_TYPE (src);
> tree dest_type = TREE_TYPE (value);
>
> -   /* Allow conversions between pointer types, floating-point
> -  types, and offset types.  */
> +   /* Allow conversions between pointer types and offset types.  */
> if ((POINTER_TYPE_P (dest_type) && POINTER_TYPE_P (src_type))
> -   || (FLOAT_TYPE_P (dest_type) && FLOAT_TYPE_P (src_type))
> || (TREE_CODE (dest_type) == OFFSET_TYPE
> && TREE_CODE (src_type) == OFFSET_TYPE))
>   return initializer_constant_valid_p_1 (src, endtype, cache);
>
> -   /* Allow length-preserving conversions between integer types.  */
> -   if (INTEGRAL_TYPE_P (dest_type) && INTEGRAL_TYPE_P (src_type)
> +   /* Allow length-preserving conversions between integer types and
> +  floating-point types.  */
> +   if (((INTEGRAL_TYPE_P (dest_type) && INTEGRAL_TYPE_P (src_type))
> +|| (FLOAT_TYPE_P (dest_type) && FLOAT_TYPE_P (src_type)))
> && (TYPE_PRECISION (dest_type) == TYPE_PRECISION (src_type)))
>   return initializer_constant_valid_p_1 (src, endtype, cache);
>
> @@ -5255,6 +5255,7 @@ output_constant (tree exp, unsigned HOST_WIDE_INT size, 
> unsigned int align,
>break;
>
>  case REAL_TYPE:
> +  gcc_assert (size == thissize);
>if (TREE_CODE (exp) != REAL_CST)
> error ("initializer for floating value is not a floating constant");
>else
>
> base-commit: 5fccebdbd9666e0adf6dd8357c21d4ef3ac3f83f
> --
> 2.31.1
>
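
The rule change in initializer_constant_valid_p_1 can be modelled outside of
GCC as follows. This is a sketch under stated assumptions: type_kind,
type_desc and length_preserving_conversion_p are hypothetical names, and the
precision values used with them (32 for float, 64 for double) are only
illustrative.

```c
#include <stdbool.h>

/* Toy type description; GCC's real trees carry far more than this.  */
enum type_kind { KIND_INTEGER, KIND_FLOAT };

struct type_desc
{
  enum type_kind kind;
  unsigned precision;   /* in bits */
};

/* A conversion is accepted only within the same class of type
   (integer to integer, or float to float) and only when the precision
   is unchanged, so the emitted constant has the size the caller
   expects.  */
static bool
length_preserving_conversion_p (struct type_desc dest, struct type_desc src)
{
  bool same_class
    = ((dest.kind == KIND_INTEGER && src.kind == KIND_INTEGER)
       || (dest.kind == KIND_FLOAT && src.kind == KIND_FLOAT));
  return same_class && dest.precision == src.precision;
}
```

Under this model a double-to-float conversion is rejected, which corresponds
to the case from the PR where a double would have been output where a float
was wanted.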

Re: [PATCH] RISC-V: Fix warning in predicated.md

2023-06-02 Thread Andreas Schwab

../../gcc/gcc/config/riscv/predicates.md: In function ‘bool arith_operand_or_mode_mask(rtx, machine_mode)’:
../../gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison between 
signed and unsigned integer expressions [-Wsign-compare]
 (match_test "INTVAL (op) == GET_MODE_MASK (HImode)

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [PATCH RFA] c++: make initializer_list array static again [PR110070]

2023-06-02 Thread Richard Biener via Gcc-patches

On Fri, Jun 2, 2023 at 3:32 AM Jason Merrill via Gcc-patches
 wrote:
>
> I ended up deciding not to apply the DECL_NOT_OBSERVABLE patch that you
> approved in stage 3 because I didn't feel like it was fully baked; I'm happy
> with this version now, which seems like a more broadly useful flag.
>
> Tested x86_64-pc-linux-gnu.  OK for trunk?

OK.

Richard.

> -- 8< --
>
> After the maybe_init_list_as_* patches, I noticed that we were putting the
> array of strings into .rodata, but then memcpying it into an automatic
> array, which is pointless; we should be able to use it directly.
>
> This doesn't happen automatically because TREE_ADDRESSABLE is set (since
> r12-657 for PR100464), and so gimplify_init_constructor won't promote the
> variable to static.  Theoretically we could do escape analysis to recognize
> that the address, though taken, never leaves the function; that would allow
> promotion when we're only using the address for indexing within the
> function, as in initlist-opt2.C.  But this would be a new pass.
>
> And in initlist-opt1.C, we're passing the array address to another function,
> so it definitely escapes; it's only safe in this case because it's calling a
> standard library function that we know only uses it for indexing.  So, a
> flag seems needed.  I first thought to put the flag on the TARGET_EXPR, but
> the VAR_DECL seems more appropriate.
>
> In a previous revision of the patch I called this flag DECL_NOT_OBSERVABLE,
> but I think DECL_MERGEABLE is a better name, especially if we're going to
> apply it to the backing array of initializer_list, which is observable.  I
> then also check it in places that check for -fmerge-all-constants, so that
> multiple equivalent initializer-lists can also be combined.  And then it
> seemed to make sense for [[no_unique_address]] to have this meaning for
> user-written variables.
>
> I think the note in [dcl.init.list]/6 intended to allow this kind of merging
> for initializer_lists, but it didn't actually work; for an explicit array
> with the same initializer, if the address escapes the program could tell
> whether the same variable in two frames has the same address.  P2752 is
> trying to correct this defect, so I'm going to assume that this is the
> intent.
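
The observability point in the paragraph above can be sketched in plain C (the patch itself concerns C++ initializer_list; the function names here are hypothetical). An automatic array is a distinct object in every active call, so a program that lets the address escape can see two different addresses. A static array is a single object shared by all calls, which is what promotion to static storage would produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Automatic array: each active call has its own object, so the address
   seen in the outer frame differs from the one in the inner frame.  */
static bool
auto_array_shared (const int *outer)
{
  const int a[3] = { 1, 2, 3 };
  if (outer == NULL)
    return auto_array_shared (a);   /* Recurse once, passing our address.  */
  return outer == a;
}

/* Static array: one object for all calls, so both frames see the same
   address.  */
static bool
static_array_shared (const int *outer)
{
  static const int a[3] = { 1, 2, 3 };
  if (outer == NULL)
    return static_array_shared (a);
  return outer == a;
}

void
check_array_identity (void)
{
  assert (!auto_array_shared (NULL));
  assert (static_array_shared (NULL));
}
```

This is why the compiler cannot silently promote an addressable automatic array to static storage without a flag such as the DECL_MERGEABLE described in the patch.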
>
> PR c++/110070
> PR c++/105838
>
> gcc/ChangeLog:
>
> * tree.h (DECL_MERGEABLE): New.
> * tree-core.h (struct tree_decl_common): Mention it.
> * gimplify.cc (gimplify_init_constructor): Check it.
> * cgraph.cc (symtab_node::address_can_be_compared_p): Likewise.
> * varasm.cc (categorize_decl_for_section): Likewise.
>
> gcc/cp/ChangeLog:
>
> * call.cc (maybe_init_list_as_array): Set DECL_MERGEABLE.
> (convert_like_internal) [ck_list]: Set it.
> (set_up_extended_ref_temp): Copy it.
> * tree.cc (handle_no_unique_addr_attribute): Set it.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/tree-ssa/initlist-opt1.C: Check for static array.
> * g++.dg/tree-ssa/initlist-opt2.C: Likewise.
> * g++.dg/tree-ssa/initlist-opt4.C: New test.
> * g++.dg/opt/icf1.C: New test.
> * g++.dg/opt/icf2.C: New test.
> ---
>  gcc/tree-core.h   |  3 ++-
>  gcc/tree.h|  6 ++
>  gcc/cgraph.cc |  2 +-
>  gcc/cp/call.cc| 15 ---
>  gcc/cp/tree.cc|  9 -
>  gcc/gimplify.cc   |  3 ++-
>  gcc/testsuite/g++.dg/opt/icf1.C   | 16 
>  gcc/testsuite/g++.dg/opt/icf2.C   | 17 +
>  gcc/testsuite/g++.dg/tree-ssa/initlist-opt1.C |  1 +
>  gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C |  2 ++
>  gcc/testsuite/g++.dg/tree-ssa/initlist-opt4.C | 13 +
>  gcc/varasm.cc |  2 +-
>  12 files changed, 81 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/opt/icf1.C
>  create mode 100644 gcc/testsuite/g++.dg/opt/icf2.C
>  create mode 100644 gcc/testsuite/g++.dg/tree-ssa/initlist-opt4.C
>
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index 9d44c04bf03..6dd7b680b57 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1803,7 +1803,8 @@ struct GTY(()) tree_decl_common {
>   In VAR_DECL, PARM_DECL and RESULT_DECL, this is
>   DECL_HAS_VALUE_EXPR_P.  */
>unsigned decl_flag_2 : 1;
> -  /* In FIELD_DECL, this is DECL_PADDING_P.  */
> +  /* In FIELD_DECL, this is DECL_PADDING_P.
> + In VAR_DECL, this is DECL_MERGEABLE.  */
>unsigned decl_flag_3 : 1;
>/* Logically, these two would go in a theoretical base shared by var and
>   parm decl. */
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 0b72663e6a1..8a4beba1230 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -3233,6 +3233,12 @@ extern void decl_fini_priority_insert (tree, 
> priority_type);
>  #define

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-06-02 Thread Alexander Monakov via Gcc-patches



On Fri, 2 Jun 2023, Matthias Kretz wrote:

> > Okay, I see opinions will vary here. I was thinking about our immintrin.h
> > which is partially implemented in terms of generic vectors. Imagine we
> > extend UBSan to trap on signed overflow for vector types. I expect that
> > will blow up on existing code that uses Intel intrinsics.
> 
> _mm_add_epi32 is already implemented via __v4su addition (i.e. unsigned). So 
> the intrinsic would continue to wrap on signed overflow.

Ah, if our intrinsics take care of it, that alleviates my concern.

> > I'm not sure what you consider a breaking change here. Is that the implied
> > threat to use undefinedness for range deduction and other optimizations?
> 
> Consider the stdx::simd implementation. It currently follows semantics of the 
> builtin types. So simd can be shifted by 30 without UB. The 
> implementation of the shift operator depends on the current behavior, even if 
> it is target-dependent. For PPC the simd implementation adds extra code to 
> avoid the "UB". With nailing down shifts > sizeof(T) as UB this extra code 
> now needs to be added for all targets.

What does stdx::simd do on LLVM, where that has always been UB even on x86?

Alexander

Re: [PATCH] inline: improve internal function costs

2023-06-02 Thread Richard Biener via Gcc-patches

On Thu, 1 Jun 2023, Andre Vieira (lists) wrote:

> Hi,
> 
> This is a follow-up of the internal function patch to add widening and
> narrowing patterns.  This patch improves the inliner cost estimation for
> internal functions.

I have no idea why calls are special in IPA analyze_function_body
and so I cannot say whether treating all internal fn calls as
non-calls is correct there.  Honza?

The tree-inline.cc change is OK though (you can push that separately).

Thanks,
Richard.

> Bootstrapped and regression tested on aarch64-unknown-linux-gnu.
> 
> gcc/ChangeLog:
> 
> * ipa-fnsummary.cc (analyze_function_body): Correctly handle
> non-zero costed internal functions.
> * tree-inline.cc (estimate_num_insns): Improve costing for internal
> functions.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Re: [PATCH 2/2] ipa-cp: Feed results of IPA-CP into value numbering

2023-06-02 Thread Richard Biener via Gcc-patches

On Mon, 29 May 2023, Martin Jambor wrote:

> Hi,
> 
> PRs 68930 and 92497 show that when IPA-CP figures out constants in
> aggregate parameters or when passed by reference but the loads happen
> in an inlined function the information is lost.  This happens even
> when the inlined function itself was known to have - or even cloned to
> have - such constants in incoming parameters because the transform
> phase of IPA passes is not run on them.  See discussion in the bugs
> for reasons why.
> 
> Honza suggested that we can plug the results of IPA-CP analysis into
> value numbering, so that FRE can figure out that some loads fetch
> known constants.  This is what this patch does.
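
A minimal sketch of the situation described above, assuming hypothetical names (struct params, load_scale, apply_params, run_with_constants) that do not come from the patch or its testcases: an aggregate with known constant fields is passed by reference, and the load of one field happens in a helper that gets inlined. Whether a given GCC version actually folds the load depends on flags and on the patch being applied; the sketch only shows the code shape.

```c
#include <assert.h>

struct params
{
  int scale;
  int bias;
};

/* Small helper that is likely to be inlined into its caller; the load
   of p->scale therefore ends up in the body of apply_params.  */
static inline int
load_scale (const struct params *p)
{
  return p->scale;
}

/* Kept out of line so the aggregate really is passed by reference.
   The noinline attribute is a GCC/Clang extension.  */
static __attribute__ ((noinline)) int
apply_params (const struct params *p, int x)
{
  return x * load_scale (p) + p->bias;
}

/* The only caller passes known constants in the aggregate.  */
int
run_with_constants (int x)
{
  struct params p = { 3, 1 };
  return apply_params (&p, x);
}

void
check_ipa_cp_example (void)
{
  assert (run_with_constants (2) == 7);
}
```

With the analysis results fed into value numbering, the load inside the inlined helper is the kind of reference that FRE could replace with the constant 3.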
> 
> This version of the patch uses the new way we represent aggregate
> constants discovered by IPA-CP and so avoids a linear scan to find them.
> Similarly, it depends on the previous patch which avoids potentially
> slow linear look ups of indices of PARM_DECLs when there are many of
> them.
> 
> Bootstrapped, LTO-bootstrapped and LTO-profiledbootstrapped and tested
> on x86_64-linux.  OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2023-05-26  Martin Jambor  
> 
>   PR ipa/68930
>   PR ipa/92497
>   * ipa-prop.h (ipcp_get_aggregate_const): Declare.
>   * ipa-prop.cc (ipcp_get_aggregate_const): New function.
>   (ipcp_transform_function): Do not deallocate transformation info.
>   * tree-ssa-sccvn.cc: Include alloc-pool.h, symbol-summary.h and
>   ipa-prop.h.
>   (vn_reference_lookup_2): When hitting default-def vuse, query
>   IPA-CP transformation info for any known constants.
> 
> gcc/testsuite/ChangeLog:
> 
> 2022-09-05  Martin Jambor  
> 
>   PR ipa/68930
>   PR ipa/92497
>   * gcc.dg/ipa/pr92497-1.c: New test.
>   * gcc.dg/ipa/pr92497-2.c: Likewise.
> ---
>  gcc/ipa-prop.cc  | 33 +
>  gcc/ipa-prop.h   |  3 +++
>  gcc/testsuite/gcc.dg/ipa/pr92497-1.c | 26 
>  gcc/testsuite/gcc.dg/ipa/pr92497-2.c | 26 
>  gcc/tree-ssa-sccvn.cc| 36 +++-
>  5 files changed, 118 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92497-2.c
> 
> diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
> index f0976e363f7..fb2c0c0466b 100644
> --- a/gcc/ipa-prop.cc
> +++ b/gcc/ipa-prop.cc
> @@ -5765,6 +5765,34 @@ ipcp_modif_dom_walker::before_dom_children 
> (basic_block bb)
>return NULL;
>  }
>  
> +/* If IPA-CP discovered a constant in parameter PARM at OFFSET of a given 
> SIZE
> +   - whether passed by reference or not is given by BY_REF - return that
> +   constant.  Otherwise return NULL_TREE.  */
> +
> +tree
> +ipcp_get_aggregate_const (struct function *func, tree parm, bool by_ref,
> +   HOST_WIDE_INT bit_offset, HOST_WIDE_INT bit_size)
> +{
> +  cgraph_node *node = cgraph_node::get (func->decl);
> +  ipcp_transformation *ts = ipcp_get_transformation_summary (node);
> +
> +  if (!ts || !ts->m_agg_values)
> +return NULL_TREE;
> +
> +  int index = ts->get_param_index (func->decl, parm);
> +  if (index < 0)
> +return NULL_TREE;
> +
> +  ipa_argagg_value_list avl (ts);
> +  unsigned unit_offset = bit_offset / BITS_PER_UNIT;
> +  tree v = avl.get_value (index, unit_offset, by_ref);
> +  if (!v
> +  || maybe_ne (tree_to_poly_int64 (TYPE_SIZE (TREE_TYPE (v))), bit_size))
> +return NULL_TREE;
> +
> +  return v;
> +}
> +
>  /* Return true if we have recorded VALUE and MASK about PARM.
> Set VALUE and MASk accordingly.  */
>  
> @@ -6037,11 +6065,6 @@ ipcp_transform_function (struct cgraph_node *node)
>  free_ipa_bb_info (bi);
>fbi.bb_infos.release ();
>  
> -  ipcp_transformation *s = ipcp_transformation_sum->get (node);
> -  s->m_agg_values = NULL;
> -  s->bits = NULL;
> -  s->m_vr = NULL;
> -
>vec_free (descriptors);
>if (cfg_changed)
>  delete_unreachable_blocks_update_callgraph (node, false);
> diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> index 211b12ff6b3..f68fa4a12dd 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -1221,6 +1221,9 @@ void ipa_dump_param (FILE *, class ipa_node_params 
> *info, int i);
>  void ipa_release_body_info (struct ipa_func_body_info *);
>  tree ipa_get_callee_param_type (struct cgraph_edge *e, int i);
>  bool ipcp_get_parm_bits (tree, tree *, widest_int *);
> +tree ipcp_get_aggregate_const (struct function *func, tree parm, bool by_ref,
> +HOST_WIDE_INT bit_offset,
> +HOST_WIDE_INT bit_size);
>  bool unadjusted_ptr_and_unit_offset (tree op, tree *ret,
>poly_int64 *offset_ret);
>  
> diff --git a/gcc/testsuite/gcc.dg/ipa/pr92497-1.c 
> b/gcc/testsuite/gcc.dg/ipa/pr92497-1.c
> new file mode 100644
> index 000..eb8f2e75fd0
> --- /dev/null
> +++

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-06-02 Thread Matthias Kretz via Gcc-patches

On Friday, 2 June 2023 09:49:26 CEST Alexander Monakov wrote:
> > simd x = ...;
> > bool t = all_of(x < x + 1); // unconditionally true or not?
> > 
> > I'd expect t to be unconditionally true. Because simd simply is a
> > data- parallel version of int.
> 
> Okay, I see opinions will vary here. I was thinking about our immintrin.h
> which is partially implemented in terms of generic vectors. Imagine we
> extend UBSan to trap on signed overflow for vector types. I expect that
> will blow up on existing code that uses Intel intrinsics.

_mm_add_epi32 is already implemented via __v4su addition (i.e. unsigned). So 
the intrinsic would continue to wrap on signed overflow.
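
The scalar analogue of that trick, as a hedged sketch with a hypothetical helper name: do the addition in the unsigned type, where wrap-around is defined, and convert back. The conversion back to a signed type is implementation-defined in C; GCC documents it as reduction modulo 2^N, which is what this sketch relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed addition that wraps instead of overflowing.
   Unsigned addition wraps modulo 2^32 by definition.  Converting the
   result back to int32_t is implementation-defined; GCC reduces it
   modulo 2^32.  */
static int32_t
add_wrapping (int32_t a, int32_t b)
{
  return (int32_t) ((uint32_t) a + (uint32_t) b);
}

void
check_add_wrapping (void)
{
  assert (add_wrapping (INT32_MAX, 1) == INT32_MIN);
  assert (add_wrapping (-1, 1) == 0);
}
```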

> > > Revised patch below.
> > 
> > This can be considered a breaking change. Does it need a mention in the
> > release notes?
> 
> I'm not sure what you consider a breaking change here. Is that the implied
> threat to use undefinedness for range deduction and other optimizations?

Consider the stdx::simd implementation. It currently follows semantics of the 
builtin types. So simd can be shifted by 30 without UB. The 
implementation of the shift operator depends on the current behavior, even if 
it is target-dependent. For PPC the simd implementation adds extra code to 
avoid the "UB". With nailing down shifts > sizeof(T) as UB this extra code now 
needs to be added for all targets.
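
As a rough scalar sketch of the kind of extra code meant here (the helper name is hypothetical, and returning 0 for oversized counts is an assumption, not a statement about what stdx::simd does): the shift count is checked explicitly so the result no longer depends on what the target instruction does with out-of-range counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: left shift with a guarded count.  Shifting a
   32-bit value by 32 or more is undefined in C, so those counts are
   handled explicitly.  */
static uint32_t
shift_left_guarded (uint32_t x, unsigned int count)
{
  return count < 32 ? x << count : 0;
}

void
check_shift_left_guarded (void)
{
  assert (shift_left_guarded (1, 31) == UINT32_C (0x80000000));
  assert (shift_left_guarded (1, 32) == 0);
  assert (shift_left_guarded (3, 1) == 6);
}
```

Applied per element, a guard like this costs an extra compare and select on every target, which is the overhead being discussed.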

- Matthias

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──

Re: Re: [PATCH V3] VECT: Change flow of decrement IV

2023-06-02 Thread juzhe.zh...@rivai.ai

Thanks, Richi. I will merge it after Richard's final approval.



juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-06-02 16:56
To: juzhe.zh...@rivai.ai
CC: gcc-patches; richard.sandiford; linkw
Subject: Re: [PATCH V3] VECT: Change flow of decrement IV
On Thu, 1 Jun 2023, juzhe.zh...@rivai.ai wrote:
 
> This patch is no different from V2; it just adds
> PR tree-optimization/109971 as Kewen suggested.
> 
> Already bootstrapped and regression-tested on x86 with no differences.
> 
> Ok for trunk ?
 
OK.
 
Richard.
 
> 
> juzhe.zh...@rivai.ai
>  
> From: juzhe.zhong
> Date: 2023-06-01 12:36
> To: gcc-patches
> CC: richard.sandiford; rguenther; linkw; Ju-Zhe Zhong
> Subject: [PATCH V3] VECT: Change flow of decrement IV
> From: Ju-Zhe Zhong 
>  
> Following Richi's suggestion, I changed the current decrement IV flow from:
>  
> do {
>remain -= MIN (vf, remain);
> } while (remain != 0);
>  
> into:
>  
> do {
>old_remain = remain;
>len = MIN (vf, remain);
>remain -= vf;
> } while (old_remain >= vf);
>  
> to enhance SCEV.
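
The two flows can be written out as runnable C to check that they process the same elements in the same number of iterations. This is only a sketch with hypothetical function names, assuming vf > 0. It uses the strict comparison (old_remain > vf) that the GT_EXPR in the diff generates, rather than the >= of the pseudocode, so a trip with zero length is not added when the count is an exact multiple of vf.

```c
#include <assert.h>
#include <stddef.h>

static size_t
min_size (size_t a, size_t b)
{
  return a < b ? a : b;
}

/* Old flow: subtract the clamped length and stop when nothing remains.
   Returns the iteration count; *processed receives the summed lengths.  */
static size_t
run_old_flow (size_t remain, size_t vf, size_t *processed)
{
  size_t iterations = 0;
  *processed = 0;
  do
    {
      size_t len = min_size (vf, remain);
      *processed += len;
      remain -= len;
      iterations++;
    }
  while (remain != 0);
  return iterations;
}

/* New flow: the IV steps by the invariant vf, and the exit test uses the
   value from before the decrement.  */
static size_t
run_new_flow (size_t remain, size_t vf, size_t *processed)
{
  size_t iterations = 0;
  size_t old_remain;
  *processed = 0;
  do
    {
      old_remain = remain;
      *processed += min_size (vf, remain);
      /* May wrap on the final iteration; the wrapped value is never used.  */
      remain -= vf;
      iterations++;
    }
  while (old_remain > vf);
  return iterations;
}

void
check_decrement_iv_flows (void)
{
  size_t done;
  assert (run_old_flow (10, 4, &done) == 3 && done == 10);
  assert (run_new_flow (10, 4, &done) == 3 && done == 10);
  assert (run_new_flow (8, 4, &done) == 2 && done == 8);
}
```

Because the new IV decreases by a loop-invariant step, it is a simple affine sequence, which is presumably what makes it easier for SCEV to analyze.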
>  
> Include fixes from kewen.
>  
>  
> This patch will need to wait for Kewen's test feedback.
>  
> Testing on X86 is on-going
>  
> Co-Authored by: Kewen Lin  
>  
>   PR tree-optimization/109971
>  
> gcc/ChangeLog:
>  
> * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Change 
> decrement IV flow.
> (vect_set_loop_condition_partial_vectors): Ditto.
>  
> ---
> gcc/tree-vect-loop-manip.cc | 36 +---
> 1 file changed, 25 insertions(+), 11 deletions(-)
>  
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index acf3642ceb2..3f735945e67 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -483,7 +483,7 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
> gimple_stmt_iterator loop_cond_gsi,
> rgroup_controls *rgc, tree niters,
> tree niters_skip, bool might_wrap_p,
> - tree *iv_step)
> + tree *iv_step, tree *compare_step)
> {
>tree compare_type = LOOP_VINFO_RGROUP_COMPARE_TYPE (loop_vinfo);
>tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> @@ -538,9 +538,9 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>...
>vect__4.8_28 = .LEN_LOAD (_17, 32B, _36, 0);
>...
> -ivtmp_35 = ivtmp_9 - _36;
> +ivtmp_35 = ivtmp_9 - POLY_INT_CST [4, 4];
>...
> -if (ivtmp_35 != 0)
> +if (ivtmp_9 > POLY_INT_CST [4, 4])
>  goto ; [83.33%]
>else
>  goto ; [16.67%]
> @@ -549,13 +549,15 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>tree step = rgc->controls.length () == 1 ? rgc->controls[0]
>: make_ssa_name (iv_type);
>/* Create decrement IV.  */
> -  create_iv (nitems_total, MINUS_EXPR, step, NULL_TREE, loop, _gsi,
> - insert_after, _before_incr, _after_incr);
> +  create_iv (nitems_total, MINUS_EXPR, nitems_step, NULL_TREE, loop,
> + _gsi, insert_after, _before_incr,
> + _after_incr);
>gimple_seq_add_stmt (header_seq, gimple_build_assign (step, MIN_EXPR,
> index_before_incr,
> nitems_step));
>*iv_step = step;
> -  return index_after_incr;
> +  *compare_step = nitems_step;
> +  return index_before_incr;
>  }
>/* Create increment IV.  */
> @@ -825,6 +827,7 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
>   arbitrarily pick the last.  */
>tree test_ctrl = NULL_TREE;
>tree iv_step = NULL_TREE;
> +  tree compare_step = NULL_TREE;
>rgroup_controls *rgc;
>rgroup_controls *iv_rgc = nullptr;
>unsigned int i;
> @@ -861,7 +864,7 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
> _seq, _seq,
> loop_cond_gsi, rgc, niters,
> niters_skip, might_wrap_p,
> - _step);
> + _step, _step);
> iv_rgc = rgc;
>   }
> @@ -884,10 +887,21 @@ vect_set_loop_condition_partial_vectors (class loop 
> *loop,
>/* Get a boolean result that tells us whether to iterate.  */
>edge exit_edge = single_exit (loop);
> -  tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? EQ_EXPR : NE_EXPR;
> -  tree zero_ctrl = build_zero_cst (TREE_TYPE (test_ctrl));
> -  gcond *cond_stmt = gimple_build_cond (code, test_ctrl, zero_ctrl,
> - NULL_TREE, NULL_TREE);
> +  gcond *cond_stmt;
> +  if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> +{
> +  gcc_assert (compare_step);
> +  tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? LE_EXPR : 
> GT_EXPR;
> +  cond_stmt = gimple_build_cond (code, test_ctrl, compare_step, 
> NULL_TREE,
> +  NULL_TREE);
> +}
> +  else
> +{
> +  tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? EQ_EXPR : 
> NE_EXPR;
> +  tree zero_ctrl = build_zero_cst (TREE_TYPE (test_ctrl));
> +  cond_stmt
> + = gimple_build_cond (code, test_ctrl, zero_ctrl, NULL_TREE, NULL_TREE);
> +}
>gsi_insert_before (_cond_gsi, cond_stmt, GSI_SAME_STMT);
>/* The loop iterates (NITERS - 1) / VF +

Re: [PATCH] Optimized "(X - N * M) / N + M" to "X / N" if valid

2023-06-02 Thread Richard Biener via Gcc-patches

On Thu, 1 Jun 2023, Jiufu Guo wrote:

> Hi,
> 
> Jiufu Guo via Gcc-patches  writes:
> 
> > Hi,
> >
> > Richard Biener  writes:
> >
> >> On Wed, 17 May 2023, Jiufu Guo wrote:
> >>
> >>> Hi,
> >>> 
> >>> This patch tries to optimize "(X - N * M) / N + M" to "X / N".
> >>
> >> But if that's valid why not make the transform simpler and transform
> >> (X - N * M) / N  to X / N - M instead?
> >
> > Great catch!
> > If "N * M" is not constant, "X / N - M" would be better than
> > "(X - N * M) / N".  If "N, M" are constants, "(X - N * M) / N" and
> > "X / N - M" may be similar; while for this case, "X / N - M" should
> > also be fine!  I would try to update accordingly. 
> >
> >>
> >> You use the same optimize_x_minus_NM_div_N_plus_M validator for
> >> the division and shift variants but the overflow rules are different,
> >> so I'm not sure that's warranted.  I'd also prefer to not split out
> >> the validator to a different file - iff then the appropriate file
> >> is fold-const.cc, not gimple-match-head.cc (I see we're a bit
> >> inconsistent here, for pure gimple matches gimple-fold.cc would
> >> be another place).
> >
> > Thanks for pointing this out!
> > For shift, I guess your concerns are: 1. the right operand may be
> > negative or greater than or equal to the type width; 2. the shifted
> > value may be signed and negative.  These cases may be UB or shift the
> > sign bit.  This patch assumes it is OK to do the transform.  I will
> > check further whether this is really OK, and hope someone can point
> > out if it is invalid: "(X - N * M) >> log2(N)" ==> "X >> log2(N) - M".
> >
> > I split out the validator just because: it is shared for division and
> > shift :).  And it seems gimple-match-head.cc and generic-match-head.cc,
> > may be introduced for match.pd.  So, I put it into gimple-match-head.cc.
> >
> >>
> >> Since you use range information why is the transform restricted
> >> to constant M?
> >
> > If M is a variable, the range for "X" is varying_p. I did not find
> > the method to get the bounds for "X" (or for "X - N * M") to check no
> > wraps.  Any suggestions?
> 
> Oh, I may have misunderstood here.
> You may mean that M could have a range too, and then we can check whether
> "X - N * M" has a valid range or may wrap/overflow.

Yes.

Richard.

> BR,
> Jeff (Jiufu Guo)
> 
> >
> >
> > Again, thanks for your great help!
> >
> > BR,
> > Jeff (Jiufu Guo)
> >
> >>
> >> Richard.
> >>
> >>> As per the discussions in PR108757, we know this transformation is valid
> >>> only under some conditions.
> >>> For C code, "/" rounds toward zero (trunc_div), and "X - N * M"
> >>> may wrap/overflow/underflow.  So the transformation is valid only if
> >>> "X - N * M" does not cross zero and does not wrap/overflow/underflow.
> >>> 
> >>> This patch also handles the case when "N" is the power of 2, where
> >>> "(X - N * M) / N" is "(X - N * M) >> log2(N)".
> >>> 
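
A small worked example of the zero-crossing condition, as a sketch with hypothetical function names: with truncating division, the two forms agree when X - N * M stays on the same side of zero as X, and disagree when it crosses zero.

```c
#include <assert.h>

/* The expression before the transformation.  */
static int
original_form (int x, int n, int m)
{
  return (x - n * m) / n + m;
}

/* The expression after the transformation.  */
static int
simplified_form (int x, int n)
{
  return x / n;
}

void
check_div_transform (void)
{
  /* x - n*m = 1 stays on the same side of zero as x: the forms agree.  */
  assert (original_form (5, 4, 1) == simplified_form (5, 4));
  /* x - n*m = -1 crosses zero: -1 / 4 truncates to 0, so the original
     form yields 1 while x / n yields 0.  */
  assert (original_form (3, 4, 1) == 1);
  assert (simplified_form (3, 4) == 0);
}
```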
> >>> Bootstrap & regtest pass on ppc64{,le} and x86_64.
> >>> Is this ok for trunk?
> >>> 
> >>> BR,
> >>> Jeff (Jiufu)
> >>> 
> >>>   PR tree-optimization/108757
> >>> 
> >>> gcc/ChangeLog:
> >>> 
> >>>   * gimple-match-head.cc (optimize_x_minus_NM_div_N_plus_M): New function.
> >>>   * match.pd ((X - N * M) / N + M): New pattern.
> >>> 
> >>> gcc/testsuite/ChangeLog:
> >>> 
> >>>   * gcc.dg/pr108757-1.c: New test.
> >>>   * gcc.dg/pr108757-2.c: New test.
> >>>   * gcc.dg/pr108757.h: New test.
> >>> 
> >>> ---
> >>>  gcc/gimple-match-head.cc  |  54 ++
> >>>  gcc/match.pd  |  22 
> >>>  gcc/testsuite/gcc.dg/pr108757-1.c |  17 
> >>>  gcc/testsuite/gcc.dg/pr108757-2.c |  18 
> >>>  gcc/testsuite/gcc.dg/pr108757.h   | 160 ++
> >>>  5 files changed, 271 insertions(+)
> >>>  create mode 100644 gcc/testsuite/gcc.dg/pr108757-1.c
> >>>  create mode 100644 gcc/testsuite/gcc.dg/pr108757-2.c
> >>>  create mode 100644 gcc/testsuite/gcc.dg/pr108757.h
> >>> 
> >>> diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
> >>> index b08cd891a13..680a4cb2fc6 100644
> >>> --- a/gcc/gimple-match-head.cc
> >>> +++ b/gcc/gimple-match-head.cc
> >>> @@ -224,3 +224,57 @@ optimize_successive_divisions_p (tree divisor, tree 
> >>> inner_div)
> >>>  }
> >>>return true;
> >>>  }
> >>> +
> >>> +/* Return true if "(X - N * M) / N + M" can be optimized into "X / N".
> >>> +   Otherwise return false.
> >>> +
> >>> +   For unsigned,
> >>> +   If sign bit of M is 0 (clz is 0), valid range is [N*M, MAX].
> >>> +   If sign bit of M is 1, valid range is [0, MAX - N*(-M)].
> >>> +
> >>> +   For signed,
> >>> +   If N*M > 0, valid range: [MIN+N*M, 0] + [N*M, MAX]
> >>> +   If N*M < 0, valid range: [MIN, -(-N*M)] + [0, MAX - (-N*M)].  */
> >>> +
> >>> +static bool
> >>> +optimize_x_minus_NM_div_N_plus_M (tree x, wide_int n, wide_int m, tree 
> >>> type)
> >>> +{
> >>> +  wide_int max = wi::max_value (type);
> >>> +  signop sgn = TYPE_SIGN (type);
> >>> +  wide_int nm;
> >>> +  wi::overflow_type ovf;
> >>> +  if (TYPE_UNSIGNED (type) && wi::clz

Re: [PATCH V3] VECT: Change flow of decrement IV

2023-06-02 Thread Richard Biener via Gcc-patches

On Thu, 1 Jun 2023, juzhe.zh...@rivai.ai wrote:

> This patch is no different from V2; it just adds
> PR tree-optimization/109971 as Kewen suggested.
> 
> Already bootstrapped and regression-tested on x86 with no differences.
> 
> Ok for trunk ?

OK.

Richard.

> 
> juzhe.zh...@rivai.ai
>  
> From: juzhe.zhong
> Date: 2023-06-01 12:36
> To: gcc-patches
> CC: richard.sandiford; rguenther; linkw; Ju-Zhe Zhong
> Subject: [PATCH V3] VECT: Change flow of decrement IV
> From: Ju-Zhe Zhong 
>  
> Following Richi's suggestion, I changed the current decrement IV flow from:
>  
> do {
>remain -= MIN (vf, remain);
> } while (remain != 0);
>  
> into:
>  
> do {
>old_remain = remain;
>len = MIN (vf, remain);
>remain -= vf;
> } while (old_remain >= vf);
>  
> to enhance SCEV.
>  
> Include fixes from kewen.
>  
>  
> This patch will need to wait for Kewen's test feedback.
>  
> Testing on X86 is on-going
>  
> Co-Authored by: Kewen Lin  
>  
>   PR tree-optimization/109971
>  
> gcc/ChangeLog:
>  
> * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Change 
> decrement IV flow.
> (vect_set_loop_condition_partial_vectors): Ditto.
>  
> ---
> gcc/tree-vect-loop-manip.cc | 36 +---
> 1 file changed, 25 insertions(+), 11 deletions(-)
>  
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index acf3642ceb2..3f735945e67 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -483,7 +483,7 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
> gimple_stmt_iterator loop_cond_gsi,
> rgroup_controls *rgc, tree niters,
> tree niters_skip, bool might_wrap_p,
> - tree *iv_step)
> + tree *iv_step, tree *compare_step)
> {
>tree compare_type = LOOP_VINFO_RGROUP_COMPARE_TYPE (loop_vinfo);
>tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> @@ -538,9 +538,9 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>...
>vect__4.8_28 = .LEN_LOAD (_17, 32B, _36, 0);
>...
> -ivtmp_35 = ivtmp_9 - _36;
> +ivtmp_35 = ivtmp_9 - POLY_INT_CST [4, 4];
>...
> -if (ivtmp_35 != 0)
> +if (ivtmp_9 > POLY_INT_CST [4, 4])
>  goto ; [83.33%]
>else
>  goto ; [16.67%]
> @@ -549,13 +549,15 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>tree step = rgc->controls.length () == 1 ? rgc->controls[0]
>: make_ssa_name (iv_type);
>/* Create decrement IV.  */
> -  create_iv (nitems_total, MINUS_EXPR, step, NULL_TREE, loop, _gsi,
> - insert_after, _before_incr, _after_incr);
> +  create_iv (nitems_total, MINUS_EXPR, nitems_step, NULL_TREE, loop,
> + _gsi, insert_after, _before_incr,
> + _after_incr);
>gimple_seq_add_stmt (header_seq, gimple_build_assign (step, MIN_EXPR,
> index_before_incr,
> nitems_step));
>*iv_step = step;
> -  return index_after_incr;
> +  *compare_step = nitems_step;
> +  return index_before_incr;
>  }
>/* Create increment IV.  */
> @@ -825,6 +827,7 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
>   arbitrarily pick the last.  */
>tree test_ctrl = NULL_TREE;
>tree iv_step = NULL_TREE;
> +  tree compare_step = NULL_TREE;
>rgroup_controls *rgc;
>rgroup_controls *iv_rgc = nullptr;
>unsigned int i;
> @@ -861,7 +864,7 @@ vect_set_loop_condition_partial_vectors (class loop *loop,
> _seq, _seq,
> loop_cond_gsi, rgc, niters,
> niters_skip, might_wrap_p,
> - _step);
> + _step, _step);
> iv_rgc = rgc;
>   }
> @@ -884,10 +887,21 @@ vect_set_loop_condition_partial_vectors (class loop 
> *loop,
>/* Get a boolean result that tells us whether to iterate.  */
>edge exit_edge = single_exit (loop);
> -  tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? EQ_EXPR : NE_EXPR;
> -  tree zero_ctrl = build_zero_cst (TREE_TYPE (test_ctrl));
> -  gcond *cond_stmt = gimple_build_cond (code, test_ctrl, zero_ctrl,
> - NULL_TREE, NULL_TREE);
> +  gcond *cond_stmt;
> +  if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> +{
> +  gcc_assert (compare_step);
> +  tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? LE_EXPR : 
> GT_EXPR;
> +  cond_stmt = gimple_build_cond (code, test_ctrl, compare_step, 
> NULL_TREE,
> +  NULL_TREE);
> +}
> +  else
> +{
> +  tree_code code = (exit_edge->flags & EDGE_TRUE_VALUE) ? EQ_EXPR : 
> NE_EXPR;
> +  tree zero_ctrl = build_zero_cst (TREE_TYPE (test_ctrl));
> +  cond_stmt
> + = gimple_build_cond (code, test_ctrl, zero_ctrl, NULL_TREE, NULL_TREE);
> +}
>gsi_insert_before (_cond_gsi, cond_stmt, GSI_SAME_STMT);
>/* The loop iterates (NITERS - 1) / VF + 1 times.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Re: [PATCH] Don't try bswap + rotate when TYPE_PRECISION(n->type) > n->range.

2023-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 1, 2023 at 9:51 AM liuhongt via Gcc-patches
 wrote:
>
> For the testcase in the PR, we have
>
>   br64 = br;
>   br64 = ((br64 << 16) & 0x00ffull) | (br64 & 0xff00ull);
>
>   n->n: 0x300200.
>   n->range: 32.
>   n->type: uint64.
>
> The original code assumes n->range is the same as TYPE_PRECISION (n->type),
> and tries to rotate the mask from 0x30200 -> 0x20300 which is
> incorrect. The patch fixes this bug by not trying bswap + rotate when
> TYPE_PRECISION (n->type) is not equal to n->range.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

OK.

> gcc/ChangeLog:
>
> PR tree-optimization/110067
> * gimple-ssa-store-merging.cc (find_bswap_or_nop): Don't try
> bswap + rotate when TYPE_PRECISION(n->type) > n->range.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr110067.c: New test.
> ---
>  gcc/gimple-ssa-store-merging.cc  |  3 +
>  gcc/testsuite/gcc.target/i386/pr110067.c | 77 
>  2 files changed, 80 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr110067.c
>
> diff --git a/gcc/gimple-ssa-store-merging.cc b/gcc/gimple-ssa-store-merging.cc
> index 9cb574fa315..401496a9231 100644
> --- a/gcc/gimple-ssa-store-merging.cc
> +++ b/gcc/gimple-ssa-store-merging.cc
> @@ -1029,6 +1029,9 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number 
> *n, bool *bswap,
>/* TODO, handle cast64_to_32 and big/litte_endian memory
>  source when rsize < range.  */
>if (n->range == orig_range
> + /* There are cases like 0x30200 for a uint32->uint64 cast;
> +don't handle these.  */
> + && n->range == TYPE_PRECISION (n->type)
>   && ((orig_range == 32
>&& optab_handler (rotl_optab, SImode) != CODE_FOR_nothing)
>   || (orig_range == 64
> diff --git a/gcc/testsuite/gcc.target/i386/pr110067.c 
> b/gcc/testsuite/gcc.target/i386/pr110067.c
> new file mode 100644
> index 000..c4208811628
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr110067.c
> @@ -0,0 +1,77 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fno-strict-aliasing" } */
> +
> +#include <stdint.h>
> +#define force_inline __inline__ __attribute__ ((__always_inline__))
> +
> +__attribute__((noipa))
> +static void
> +fetch_pixel_no_alpha_32_bug (void *out)
> +{
> +  uint32_t *ret = out;
> +  *ret = 0xff499baf;
> +}
> +
> +static force_inline uint32_t
> +bilinear_interpolation_local (uint32_t tl, uint32_t tr,
> + uint32_t bl, uint32_t br,
> + int distx, int disty)
> +{
> +  uint64_t distxy, distxiy, distixy, distixiy;
> +  uint64_t tl64, tr64, bl64, br64;
> +  uint64_t f, r;
> +
> +  distx <<= 1;
> +  disty <<= 1;
> +
> +  distxy = distx * disty;
> +  distxiy = distx * (256 - disty);
> +  distixy = (256 - distx) * disty;
> +  distixiy = (256 - distx) * (256 - disty);
> +
> +  /* Alpha and Blue */
> +  tl64 = tl & 0xffff;
> +  tr64 = tr & 0xffff;
> +  bl64 = bl & 0xffff;
> +  br64 = br & 0xffff;
> +
> +  f = tl64 * distixiy + tr64 * distxiy + bl64 * distixy + br64 * distxy;
> +  r = f & 0xffffull;
> +
> +  /* Red and Green */
> +  tl64 = tl;
> +  tl64 = ((tl64 << 16) & 0x00ffull) | (tl64 & 0xff00ull);
> +
> +  tr64 = tr;
> +  tr64 = ((tr64 << 16) & 0x00ffull) | (tr64 & 0xff00ull);
> +
> +  bl64 = bl;
> +  bl64 = ((bl64 << 16) & 0x00ffull) | (bl64 & 0xff00ull);
> +
> +  br64 = br;
> +  br64 = ((br64 << 16) & 0x00ffull) | (br64 & 0xff00ull);
> +
> +  f = tl64 * distixiy + tr64 * distxiy + bl64 * distixy + br64 * distxy;
> +  r |= ((f >> 16) & 0x00ffull) | (f & 0xff00ull);
> +
> +  return (uint32_t)(r >> 16);
> +}
> +
> +__attribute__((noipa))
> +static void
> +bits_image_fetch_pixel_bilinear_32_bug (void *out)
> +{
> +  uint32_t br;
> +  uint32_t *ret = out;
> +
> +  fetch_pixel_no_alpha_32_bug (&br);
> +  *ret = bilinear_interpolation_local (0, 0, 0, br, 0x41, 0x42);
> +}
> +
> +int main() {
> +  uint32_t r;
> +  bits_image_fetch_pixel_bilinear_32_bug (&r);
> +  if (r != 0x4213282d)
> +    __builtin_abort ();
> +  return 0;
> +}
> --
> 2.39.1.388.g2fc9e9ca3c
>

Re: [PATCH] libstdc++: Correct NTTP and simd_mask ctor call

2023-06-02 Thread Matthias Kretz via Gcc-patches

Hello Alexandre,

On Friday, 2 June 2023 10:32:40 CEST Alexandre Oliva wrote:
> On May 26, 2023, Matthias Kretz via Libstdc++  wrote:
> > OK for master and all backports (after 11.4 is done)?
> > tested on powerpc64le-linux-gnu and x86_64-pc-linux-gnu
> > 
> > * testsuite/experimental/simd/pr109822_cast_functions.cc: New
> > test.
> 
> This testcase fails to compile on PowerPC targets without VSX: 64-bit
> integer and floating-point types cannot be vectorized.

Yes, and the simd implementation already encodes that both in 
__vectorized_sizeof() and __intrinsic_type.

> I wonder if the test is malformed (and should be amended to test for
> available simd types), or whether a patch like this would be desirable
> to make simd constructs more portable.  I'm not sure about the
> requirements.

The test is correct. The stdx::simd implementation has a latent bug (my 
dejagnu boards included only POWER7-POWER9; I'm at POWER5-POWER10 by now). The 
_S_store function is trying to work around bad code-gen but fails to notice 
that long long vectors can't be used.

I'm looking at that function again, also in light of recent improvements wrt. 
code-gen, and will remove the assumption that long long is vectorizable.

__intrinsic_type_t should never be T, but always the type that can be 
passed to corresponding platform intrinsics. There are traits for the 
implementation to detect whether the intrinsics types are available.

- Matthias

> 
> 
> [libstdc++] [simd] [ppc] use nonvector intrinsic fallback types
> 
> From: Alexandre Oliva 
> 
> Compiling such tests as pr109822_cast_functions.cc on powerpc targets
> that don't support VSX fails because some intrinsic types that are
> expected to be vectorizable are not defined without VSX.
> 
> Introduce fallback non-vector types to enable the code to compile.
> 
> 
> for  libstdc++-v3/ChangeLog
> 
>   * include/experimental/bits/simd.h: Introduce fallback
>   non-vector intrinsic_type_impl specializations for PowerPC
>   without VSX.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h |   12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/libstdc++-v3/include/experimental/bits/simd.h
> b/libstdc++-v3/include/experimental/bits/simd.h index
> 834fe923065bd..2691823e869e8 100644
> --- a/libstdc++-v3/include/experimental/bits/simd.h
> +++ b/libstdc++-v3/include/experimental/bits/simd.h
> @@ -2431,9 +2431,14 @@ template 
>  #define _GLIBCXX_SIMD_PPC_INTRIN(_Tp)                                        \
>    template <>                                                                \
>      struct __intrinsic_type_impl<_Tp> { using type = __vector _Tp; }
> +#define _GLIBCXX_SIMD_PPC_INTRIN_NOVEC(_Tp)                                  \
> +  template <>                                                                \
> +    struct __intrinsic_type_impl<_Tp> { using type = _Tp; }
>  _GLIBCXX_SIMD_PPC_INTRIN(float);
>  #ifdef __VSX__
>  _GLIBCXX_SIMD_PPC_INTRIN(double);
> +#else
> +_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(double);
>  #endif
>  _GLIBCXX_SIMD_PPC_INTRIN(signed char);
>  _GLIBCXX_SIMD_PPC_INTRIN(unsigned char);
> @@ -2444,12 +2449,19 @@ _GLIBCXX_SIMD_PPC_INTRIN(unsigned int);
>  #if defined __VSX__ || __SIZEOF_LONG__ == 4
>  _GLIBCXX_SIMD_PPC_INTRIN(signed long);
>  _GLIBCXX_SIMD_PPC_INTRIN(unsigned long);
> +#else
> +_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(signed long);
> +_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(unsigned long);
>  #endif
>  #ifdef __VSX__
>  _GLIBCXX_SIMD_PPC_INTRIN(signed long long);
>  _GLIBCXX_SIMD_PPC_INTRIN(unsigned long long);
> +#else
> +_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(signed long long);
> +_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(unsigned long long);
>  #endif
>  #undef _GLIBCXX_SIMD_PPC_INTRIN
> +#undef _GLIBCXX_SIMD_PPC_INTRIN_NOVEC
> 
>  template 
>struct __intrinsic_type<_Tp, _Bytes, enable_if_t<__is_vectorizable_v<_Tp>
> && _Bytes <= 16>>


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──

[patch] Fix PR101188 wrong code from postreload

2023-06-02 Thread Georg-Johann Lay


There is the following bug in postreload that can be traced back
to v5 at least:

In postreload.cc::reload_cse_move2add() there is a loop over all
insns.  If it encounters a SET, the next insn is analyzed if it
is a single_set.

After next has been analyzed, it continues with

      if (success)
        delete_insn (insn);
      changed |= success;
      insn = next; // This effectively skips analysis of next.
      move2add_record_mode (reg);
      reg_offset[regno]
        = trunc_int_for_mode (added_offset + base_offset,
                              mode);
      continue; // for continues with insn = NEXT_INSN (insn).

So it records the effect of next, but not the clobbers that
next might have.  This is a problem if next clobbers a GPR,
as can happen on avr.  A later round may then use a value
from a (partially) clobbered reg.

The patch records the effects of potential clobbers.
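
To illustrate the failure mode described above, here is a toy model in C. It is a sketch only and has nothing to do with the real RTL data structures: `toy_insn`, `scan` and the `known[]` bookkeeping are hypothetical names invented for this illustration. The scan tracks which registers are believed to hold a known value; when two consecutive insns set the same register it handles them as a pair and jumps over the second one, which is the analogue of `insn = next`.

```c
#include <assert.h>
#include <stdbool.h>

#define NREGS 4

/* Hypothetical toy insn: sets SET_REG and optionally clobbers
   CLOBBER_REG (-1 means "no clobber").  */
struct toy_insn { int set_reg; int clobber_reg; };

/* Scan INSNS and return a bitmask of the registers that are still
   believed to hold a known value afterwards.  When two consecutive
   insns set the same register, the pair is handled together and the
   regular analysis of the second insn is skipped.  */
unsigned
scan (const struct toy_insn *insns, int n, bool note_clobbers_of_next)
{
  bool known[NREGS] = { false };

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    {
      const struct toy_insn *insn = &insns[i];

      if (i + 1 < n && insns[i + 1].set_reg == insn->set_reg)
        {
          const struct toy_insn *next = &insns[i + 1];

          /* Record the effect of NEXT's set ...  */
          known[next->set_reg] = true;
          /* ... and, with the fix, also the effect of its clobber.  */
          if (note_clobbers_of_next && next->clobber_reg >= 0)
            known[next->clobber_reg] = false;
          i++;  /* Skip the regular analysis of NEXT.  */
          continue;
        }

      /* Regular path: both the set and the clobber are noted.  */
      known[insn->set_reg] = true;
      if (insn->clobber_reg >= 0)
        known[insn->clobber_reg] = false;
    }

  unsigned mask = 0;
  for (int r = 0; r < NREGS; r++)
    if (known[r])
      mask |= 1u << r;
  return mask;
}
```

For the sequence "set R1; set R0; set R0 and clobber R1", the scan without clobber handling still believes R1 is known (mask 0x3), while the variant that notes the clobbers of the skipped insn reports only R0 (mask 0x1).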

Bootstrapped and reg-tested on x86_64.  Also tested on avr where
the bug popped up.  The testcase discriminates on avr, and for now
I am not aware of any other target that's affected by the bug.

The change is not intrusive and fixes wrong code, so I'd like
to backport it.

Ok to apply?

Johann

rtl-optimization/101188: Don't bypass clobbers of some insns that are
optimized or are optimization candidates.

gcc/
PR rtl-optimization/101188
* postreload.cc (reload_cse_move2add): Record clobbers of next
insn using move2add_note_store.

gcc/testsuite/
PR rtl-optimization/101188
* gcc.c-torture/execute/pr101188.c: New test.


diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index fb392651e1b..2de3e2ea780 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -2033,6 +2033,14 @@ reload_cse_move2add (rtx_insn *first)
       if (success)
         delete_insn (insn);
       changed |= success;
+      // By setting "insn = next" below, we are bypassing the
+      // side-effects of next, see PR101188.  Do them by hand
+      subrtx_iterator::array_type array;
+      FOR_EACH_SUBRTX (iter, array, PATTERN (next), NONCONST)
+        {
+          if (GET_CODE (*iter) == CLOBBER)
+            move2add_note_store (XEXP (*iter, 0), *iter, next);
+        }
       insn = next;
       move2add_record_mode (reg);
       reg_offset[regno]
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr101188.c b/gcc/testsuite/gcc.c-torture/execute/pr101188.c
new file mode 100644
index 000..4817c69347c
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr101188.c
@@ -0,0 +1,59 @@
+typedef __UINT8_TYPE__ uint8_t;
+typedef __UINT16_TYPE__ uint16_t;
+
+typedef uint8_t (*fn1)(void *a);
+typedef void (*fn2)(void *a, int *arg);
+
+struct S
+{
+uint8_t buffer[64];
+uint16_t n;
+fn2 f2;
+void *a;
+fn1 f1;
+};
+
+volatile uint16_t x;
+
+void __attribute__((__noinline__,__noclone__))
+foo (uint16_t n)
+{
+  x = n;
+}
+
+void __attribute__((__noinline__,__noclone__))
+testfn (struct S *self)
+{
+int arg;
+
+foo (self->n);
+self->n++;
+self->f2 (self->a, &arg);
+self->buffer[0] = self->f1 (self->a);
+}
+
+static unsigned char myfn2_called = 0;
+
+static void
+myfn2 (void *a, int *arg)
+{
+  myfn2_called = 1;
+}
+
+static uint8_t
+myfn1 (void *a)
+{
+  return 0;
+}
+
+int main (void)
+{
+  struct S s;
+  s.n = 0;
+  s.f2 = myfn2;
+  s.f1 = myfn1;
+  testfn (&s);
+  if (myfn2_called != 1)
+    __builtin_abort();
+  return 0;
+}

Re: [PATCH 1/2] ipa-cp: Avoid long linear searches through DECL_ARGUMENTS

2023-06-02 Thread Richard Biener via Gcc-patches

On Wed, May 31, 2023 at 6:08 PM Martin Jambor  wrote:
>
> Hello,
>
> On Wed, May 31 2023, Richard Biener wrote:
> > On Tue, May 30, 2023 at 4:21 PM Jan Hubicka  wrote:
> >>
> >> > On Mon, May 29, 2023 at 6:20 PM Martin Jambor  wrote:
> >> > >
> >> > > Hi,
> >> > >
> >> > > there have been concerns that linear searches through DECL_ARGUMENTS
> >> > > that are often necessary to compute the index of a particular
> >> > > PARM_DECL which is the key to results of IPA-CP can happen often
> >> > > enough to be a compile time issue, especially if we plug the results
> >> > > into value numbering, as I intend to do with a follow-up patch.
> >> > >
> >> > > This patch creates a hash map to do the look-up for all functions
> >> > > which have some information discovered by IPA-CP and which have 32
> >> > > parameters or more.  32 is a hard-wired magical constant here to
> >> > > capture the trade-off between the memory allocation overhead and
> >> > > length of the linear search.  I do not think it is worth making it a
> >> > > --param but if people think it appropriate, I can turn it into one.
> >> >
> >> > Since ipcp_transformation is short-lived (is it?) is it worth the 
> >> > trouble?
> >> > Comments below ...
> >>
> >> It lives from ipa-cp time to WPA stream-out or IPA transform stage,
> >> so memory consumption is a concern with -flto.
>
> It lives longer, until the function is finished, it holds the
> information we want to use during PRE, after all (and Honza also already
> added queries to it to tree-ssa-ccp.cc though those probably could be
> avoided).
>
> The proposed mapping for long chains would only be created in the
> transformation IPA-CP hook, so would only live in LTRANS and only
> throughout the compilation of a single function.  (But I am adding a
> pointer to the transformation summary of all.)
>
> >> > > +  m_tree_to_idx = hash_map::create_ggc (c);
> >> > > +  unsigned index = 0;
> >> > > +  for (tree p = DECL_ARGUMENTS (fndecl); p; p = DECL_CHAIN (p), 
> >> > > index++)
> >> > > +m_tree_to_idx->put (p, index);
> >> >
> >> > I think allocating the hash-map with 'c' for some numbers (depending
> >> > on the "prime"
> >> > chosen) will necessarily cause re-allocation of the hash since we keep a 
> >> > load
> >> > factor of at most 3/4 upon insertion.
>
> Oh, right.
>
> >> >
> >> > But - I wonder if a UID sorted array isn't a very much better data
> >> > structure for this?
> >> > That is, a vec >?
> >>
> >> Yeah, I was thinking along these lines too.
> >> Having a field directly in the PARM_DECL node would probably be prettiest.
> >> In general this is probably not that important, as the vast majority of
> >> the time we have few parameters and linear lookup is just fine.
> >
> > There are 6 bits of DECL_OFFSET_ALIGN that could be re-purposed, but
> > 64 parameters is a bit low.  _Maybe_ PARM_DECL doesn't need any of
> > the tree_base bits so could use the full word for sth else as well ...
> >
> > I also thought it might be interesting to only record PARM_DECLs that
> > we have interesting info for and skip VARYING ones.  So with an
> > indirection DECL_OFFSET_ALIGN -> index to non-varying param or
> > -1 the encoding space could shrink.
> >
> > But still using a vec<> looks like a straight-forward improvement here.
>
> Yeah, 64 parameters seems too tight.  I guess a testcase in which we
> would record information for that many parameters would be quite
> artificial, but I can imagine something like that in machine generated
> code.
>
> Below is the patch based on DECL_UIDs in a vector.  The problem with
> std::pair is that it is not GC-friendly and the transformation summary
> unfortunately needs to live in GC.  So I added a simple GTY marked
> structure.
>
> Bootstrapped, tested and (together with the subsequent patch) LTO
> bootstrapped on an x86_64-linux, as is and with lower threshold to
> create the mapping.  OK for master now?

LGTM now.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> Subject: [PATCH 1/2] ipa-cp: Avoid long linear searches through DECL_ARGUMENTS
>
> There have been concerns that linear searches through DECL_ARGUMENTS
> that are often necessary to compute the index of a particular
> PARM_DECL which is the key to results of IPA-CP can happen often
> enough to be a compile time issue, especially if we plug the results
> into value numbering, as I intend to do with a follow-up patch.
>
> This patch creates a vector sorted according to PARM_DECLs to do the look-up
> for all functions which have some information discovered by IPA-CP and which
> have 32 parameters or more.  32 is a hard-wired magical constant here to
> capture the trade-off between the memory allocation overhead and length of the
> linear search.  I do not think it is worth making it a --param but if people
> think it appropriate, I can turn it into one.
>
> gcc/ChangeLog:
>
> 2023-05-31  Martin Jambor  
>
> * ipa-prop.h (ipa_uid_to_idx_map_elt): New type.
> (struct ipcp_transformation): Rearrange members

Re: [PATCH] libstdc++: Correct NTTP and simd_mask ctor call

2023-06-02 Thread Alexandre Oliva via Gcc-patches

Hello, Matthias,

On May 26, 2023, Matthias Kretz via Libstdc++  wrote:

> OK for master and all backports (after 11.4 is done)?
> tested on powerpc64le-linux-gnu and x86_64-pc-linux-gnu

>   * testsuite/experimental/simd/pr109822_cast_functions.cc: New
>   test.

This testcase fails to compile on PowerPC targets without VSX: 64-bit
integer and floating-point types cannot be vectorized.

I wonder if the test is malformed (and should be amended to test for
available simd types), or whether a patch like this would be desirable
to make simd constructs more portable.  I'm not sure about the
requirements.


[libstdc++] [simd] [ppc] use nonvector intrinsic fallback types

From: Alexandre Oliva 

Compiling such tests as pr109822_cast_functions.cc on powerpc targets
that don't support VSX fails because some intrinsic types that are
expected to be vectorizable are not defined without VSX.

Introduce fallback non-vector types to enable the code to compile.


for  libstdc++-v3/ChangeLog

* include/experimental/bits/simd.h: Introduce fallback
non-vector intrinsic_type_impl specializations for PowerPC
without VSX.
---
 libstdc++-v3/include/experimental/bits/simd.h |   12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h 
b/libstdc++-v3/include/experimental/bits/simd.h
index 834fe923065bd..2691823e869e8 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2431,9 +2431,14 @@ template 
 #define _GLIBCXX_SIMD_PPC_INTRIN(_Tp)                                         \
   template <>                                                                 \
     struct __intrinsic_type_impl<_Tp> { using type = __vector _Tp; }
+#define _GLIBCXX_SIMD_PPC_INTRIN_NOVEC(_Tp)                                   \
+  template <>                                                                 \
+    struct __intrinsic_type_impl<_Tp> { using type = _Tp; }
 _GLIBCXX_SIMD_PPC_INTRIN(float);
 #ifdef __VSX__
 _GLIBCXX_SIMD_PPC_INTRIN(double);
+#else
+_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(double);
 #endif
 _GLIBCXX_SIMD_PPC_INTRIN(signed char);
 _GLIBCXX_SIMD_PPC_INTRIN(unsigned char);
@@ -2444,12 +2449,19 @@ _GLIBCXX_SIMD_PPC_INTRIN(unsigned int);
 #if defined __VSX__ || __SIZEOF_LONG__ == 4
 _GLIBCXX_SIMD_PPC_INTRIN(signed long);
 _GLIBCXX_SIMD_PPC_INTRIN(unsigned long);
+#else
+_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(signed long);
+_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(unsigned long);
 #endif
 #ifdef __VSX__
 _GLIBCXX_SIMD_PPC_INTRIN(signed long long);
 _GLIBCXX_SIMD_PPC_INTRIN(unsigned long long);
+#else
+_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(signed long long);
+_GLIBCXX_SIMD_PPC_INTRIN_NOVEC(unsigned long long);
 #endif
 #undef _GLIBCXX_SIMD_PPC_INTRIN
+#undef _GLIBCXX_SIMD_PPC_INTRIN_NOVEC
 
 template 
   struct __intrinsic_type<_Tp, _Bytes, enable_if_t<__is_vectorizable_v<_Tp> && 
_Bytes <= 16>>


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

Re: [PATCH 2/2] btf: improve -dA comments for testsuite

2023-06-02 Thread Iain Sandoe

Hi David,

> On 31 May 2023, at 07:13, Indu Bhagat via Gcc-patches 
>  wrote:
> 
> On 5/30/23 11:27, David Faust wrote:
>> [Changes from v1:
>>  - Fix typos.
>>  - Split unrelated change into separate commit.
>>  - Improve asm comment for enum constants, update btf-enum-1 test.
>>  - Improve asm comment for DATASEC records, update btf-datasec-2 test.]
>> Many BTF type kinds refer to other types via index to the final types
>> list. However, the order of the final types list is not guaranteed to
>> remain the same for the same source program between different runs of
>> the compiler, making it difficult to test inter-type references.
>> This patch updates the assembler comments output when writing a
>> given BTF record to include minimal information about the referenced
>> type, if any. This allows for the regular expressions used in the gcc
>> testsuite to do some basic integrity checks on inter-type references.
>> For example, for the type
>>  unsigned int *
>> Assembly comments like the following are written with -dA:
>>  .4byte  0   ; TYPE 2 BTF_KIND_PTR ''
>>  .4byte  0x200   ; btt_info: kind=2, kflag=0, vlen=0
>>  .4byte  0x1 ; btt_type: (BTF_KIND_INT 'unsigned int')
>> Several BTF tests which can immediately be made more robust with this
>> change are updated. It will also be useful in new tests for the upcoming
>> btf_type_tag support.
>> Re-tested on BPF and x86_64, no known regressions.
>> Thanks.
> 
> LGTM.

This seems to break bootstrap on x86_64 darwin with two instances of :

gcc/btfout.cc:802:32: error: format ‘%lu’ expects argument of type ‘long 
unsigned int’, but argument 4 has type ‘ctf_id_t’ {aka ‘long long unsigned 
int’} [-Werror=format=]
802 |"TYPE %lu BTF_KIND_%s '%s'"

And another on line 970.

Could you suggest where the change should be?
Thanks,
Iain
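
The class of error reported above arises whenever a 64-bit typedef is `unsigned long` on one host and `unsigned long long` on another, so that neither `%lu` nor `%llu` matches everywhere. One generic way to sidestep it is to cast to a fixed type at the call site. The following is a minimal sketch, assuming a hypothetical typedef `my_id_t` in place of `ctf_id_t` and a hypothetical helper `format_type_comment`; how the actual fix in btfout.cc should look is not established by this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for ctf_id_t on a host where it is
   'unsigned long long'.  */
typedef unsigned long long my_id_t;

/* Write "TYPE <id> BTF_KIND_<kind>" into BUF and return the length
   snprintf reports.  The explicit cast keeps the argument type in
   sync with "%llu" no matter how my_id_t is defined on the host.  */
int
format_type_comment (char *buf, size_t size, my_id_t id, const char *kind)
{
  assert (buf && kind);
  return snprintf (buf, size, "TYPE %llu BTF_KIND_%s",
                   (unsigned long long) id, kind);
}
```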

Re: [PATCH] i386: Add missing vector truncate patterns [PR92658].

2023-06-02 Thread Uros Bizjak via Gcc-patches

On Fri, Jun 2, 2023 at 2:49 AM liuhongt  wrote:
>
> Add missing insn patterns for v2si -> v2hi/v2qi and v2hi-> v2qi vector
> truncate.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/92658
> * config/i386/mmx.md (truncv2hiv2qi2): New define_insn.
> (truncv2si2): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr92658-avx512bw-trunc-2.c: New test.

OK.

Thanks,
Uros.
> ---
>  gcc/config/i386/mmx.md| 21 +++
>  .../i386/pr92658-avx512bw-trunc-2.c   | 27 +++
>  2 files changed, 48 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c
>
> diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> index dbcb850ffde..bb45098f797 100644
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -3667,6 +3667,27 @@ (define_expand "v2qiv2hi2"
>DONE;
>  })
>
> +(define_insn "truncv2hiv2qi2"
> +  [(set (match_operand:V2QI 0 "register_operand" "=v")
> +   (truncate:V2QI
> + (match_operand:V2HI 1 "register_operand" "v")))]
> +  "TARGET_AVX512VL && TARGET_AVX512BW"
> +  "vpmovwb\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "TI")])
> +
> +(define_mode_iterator V2QI_V2HI [V2QI V2HI])
> +(define_insn "truncv2si2"
> +  [(set (match_operand:V2QI_V2HI 0 "register_operand" "=v")
> +   (truncate:V2QI_V2HI
> + (match_operand:V2SI 1 "register_operand" "v")))]
> +  "TARGET_AVX512VL && TARGET_MMX_WITH_SSE"
> +  "vpmovd\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "TI")])
> +
>  ;; Pack/unpack vector modes
>  (define_mode_attr mmxpackmode
>[(V4HI "V8QI") (V2SI "V4HI")])
> diff --git a/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c 
> b/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c
> new file mode 100644
> index 000..2f5b7dc5668
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c
> @@ -0,0 +1,27 @@
> +/* PR target/92658 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512bw -mavx512vl" } */
> +/* { dg-final { scan-assembler-times "vpmovwb" 1 } } */
> +/* { dg-final { scan-assembler-times "vpmovdb" 1 { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpmovdw" 1 { target { ! ia32 } } } } */
> +
> +void
> +foo (int* __restrict a, char* b)
> +{
> +b[0] = a[0];
> +b[1] = a[1];
> +}
> +
> +void
> +foo2 (short* __restrict a, char* b)
> +{
> +b[0] = a[0];
> +b[1] = a[1];
> +}
> +
> +void
> +foo3 (int* __restrict a, short* b)
> +{
> +b[0] = a[0];
> +b[1] = a[1];
> +}
> --
> 2.39.1.388.g2fc9e9ca3c
>

Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-06-02 Thread Alexander Monakov via Gcc-patches

On Fri, 2 Jun 2023, Matthias Kretz wrote:

> On Thursday, 1 June 2023 20:25:14 CEST Alexander Monakov wrote:
> > On Wed, 31 May 2023, Richard Biener wrote:
> > > So yes, we probably should clarify the semantics to match the
> > > implementation (since we have two targets doing things differently
> > > since forever we can only document it as UB) and also note the
> > > difference from OpenCL (in case OpenCL is still relevant these
> > > days we might want to offer a -fopencl-vectors to emit the required
> > > AND).
> > 
> > It doesn't have to be UB, in principle we could say that shift amount
> > is taken modulo some power of two depending on the target without UB.
> > But since LLVM already treats that as UB, we might as well follow.
> 
> I prefer UB (as your patch states ). If a user requires the AND, let them 
> state it explicitly. Don't let everybody pay in performance.

What I suggested does not imply a performance cost. All targets take some
lower bits of the shift amount anyway. It's only OpenCL's exact masking
that would imply a performance cost (and I agree it's inappropriate for
GCC's generic vectors).
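
For reference, stating the AND explicitly with GCC's generic vectors looks roughly like the following. This is a sketch that assumes 32-bit `unsigned int` lanes; the typedef `v4u` and the function `shl_masked` are names made up for the illustration.

```c
#include <assert.h>

static_assert (sizeof (unsigned int) == 4, "sketch assumes 32-bit lanes");

typedef unsigned int v4u __attribute__ ((vector_size (16)));

/* Shift each lane of X left by the matching lane of COUNT, taking the
   count modulo the 32-bit element width so that no lane ever shifts
   by an out-of-range amount.  */
v4u
shl_masked (v4u x, v4u count)
{
  const v4u mask = { 31, 31, 31, 31 };
  return x << (count & mask);
}
```

With the mask in place, a count of 32 behaves like 0 and a count of 33 like 1 in every lane, independent of what the target's shift instruction does with the upper bits.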

> > I think for addition/multiplication of signed vectors everybody
> > expects them to have wrapping semantics without UB on overflow though?
> 
>   simd<int> x = ...;
>   bool t = all_of(x < x + 1); // unconditionally true or not?
> 
> I'd expect t to be unconditionally true. Because simd<int> simply is a data-
> parallel version of int.

Okay, I see opinions will vary here. I was thinking about our immintrin.h
which is partially implemented in terms of generic vectors. Imagine we
extend UBSan to trap on signed overflow for vector types. I expect that
will blow up on existing code that uses Intel intrinsics. But use of
generic vectors in immintrin.h is our implementation detail, and people
might have expected intrinsics to be overflow-safe, like for aliasing
(where we use __attribute__((may_alias)) in immintrin.h). Although, we
can solve that by inventing overflow-wraps attribute for types, maybe?
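
Until something like that exists, code that wants wrapping signed addition on generic vectors regardless of how the overflow question is settled can route the arithmetic through unsigned lanes. A minimal sketch, assuming 32-bit `int` and relying on the GCC extension that a cast between vector types of the same size reinterprets the bits; `v4i`, `v4u` and `add_wrapping` are hypothetical names:

```c
#include <assert.h>

static_assert (sizeof (int) == 4, "sketch assumes 32-bit lanes");

typedef int v4i __attribute__ ((vector_size (16)));
typedef unsigned int v4u __attribute__ ((vector_size (16)));

/* Lane-wise signed addition with two's-complement wrap-around: the
   addition itself is done on unsigned lanes, where wrapping is
   well defined, and the result is reinterpreted as signed.  */
v4i
add_wrapping (v4i a, v4i b)
{
  return (v4i) ((v4u) a + (v4u) b);
}
```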

> > Revised patch below.
> 
> This can be considered a breaking change. Does it need a mention in the 
> release notes?

I'm not sure what you consider a breaking change here. Is that the implied
threat to use undefinedness for range deduction and other optimizations?

Thanks.
Alexander
