[PATCH] RISC-V: Don't try to vectorize tree-ssa/gen-vect-34.c

2022-09-02 Thread Palmer Dabbelt
We don't yet support vectorization on RISC-V.

gcc/testsuite/ChangeLog

* gcc.dg/tree-ssa/gen-vect-34.c: Skip RISC-V targets.
---
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
index 8d2d36401fe..41877e05efd 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
@@ -13,4 +13,4 @@ float summul(int n, float *arg1, float *arg2)
 return res1;   
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! { avr-*-* pru-*-* } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! { avr-*-* pru-*-* riscv*-*-* } } } } } */
-- 
2.34.1



[PATCH] RISC-V: make USE_LOAD_ADDRESS_MACRO easier to understand

2022-09-02 Thread Vineet Gupta
The current macro chains several && and || operators, making it really hard to
understand on first reading.

Signed-off-by: Vineet Gupta 
---
Since we are on this topic, perhaps we can get this simplification in too.

But I'm not sure whether the current check for a local symbol can be simplified
a bit. Isn't the first line enough for the GET_CODE == CONST case too?

---
 gcc/config/riscv/riscv.h | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index eb1284e56d69..3e3f67ef8270 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -749,18 +749,19 @@ typedef struct {
 #define CASE_VECTOR_MODE SImode
 #define CASE_VECTOR_PC_RELATIVE (riscv_cmodel != CM_MEDLOW)
 
+#define LOCAL_SYM(sym) \
+ ((SYMBOL_REF_P (sym) && SYMBOL_REF_LOCAL_P (sym)) \
+|| ((GET_CODE (sym) == CONST)  \
+&& SYMBOL_REF_P (XEXP (XEXP (sym, 0),0))   \
+&& SYMBOL_REF_LOCAL_P (XEXP (XEXP (sym, 0),0))))
+
 /* The load-address macro is used for PC-relative addressing of symbols
that bind locally.  Don't use it for symbols that should be addressed
via the GOT.  Also, avoid it for CM_MEDLOW, where LUI addressing
currently results in more opportunities for linker relaxation.  */
 #define USE_LOAD_ADDRESS_MACRO(sym)\
   (!TARGET_EXPLICIT_RELOCS &&  \
-   ((flag_pic  \
- && ((SYMBOL_REF_P (sym) && SYMBOL_REF_LOCAL_P (sym))  \
-|| ((GET_CODE (sym) == CONST)  \
-&& SYMBOL_REF_P (XEXP (XEXP (sym, 0),0))   \
-&& SYMBOL_REF_LOCAL_P (XEXP (XEXP (sym, 0),0)))))  \
- || riscv_cmodel == CM_MEDANY))
+   ((flag_pic && LOCAL_SYM(sym)) || riscv_cmodel == CM_MEDANY))
 
 /* Define this as 1 if `char' should by default be signed; else as 0.  */
 #define DEFAULT_SIGNED_CHAR 0
-- 
2.32.0
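
A minimal sketch of why the refactoring above should preserve behavior,
written as standalone C++ rather than GCC internals. The Sym struct and its
boolean fields are hypothetical stand-ins for the answers the real RTL
predicates (SYMBOL_REF_P, SYMBOL_REF_LOCAL_P, GET_CODE) would give; only the
shape of the condition is taken from the patch.

```cpp
// Hypothetical stand-in for an RTL operand: each field models the answer
// one of the real predicates would give for that operand.
struct Sym {
  bool symbol_ref;        // models SYMBOL_REF_P (sym)
  bool local;             // models SYMBOL_REF_LOCAL_P (sym)
  bool is_const;          // models GET_CODE (sym) == CONST
  bool inner_symbol_ref;  // models SYMBOL_REF_P (XEXP (XEXP (sym, 0), 0))
  bool inner_local;       // models SYMBOL_REF_LOCAL_P (XEXP (XEXP (sym, 0), 0))
};

// Shape of the condition before the patch: everything in one expression.
constexpr bool use_la_before(bool explicit_relocs, bool pic, bool medany,
                             Sym s) {
  return !explicit_relocs
         && ((pic
              && ((s.symbol_ref && s.local)
                  || (s.is_const && s.inner_symbol_ref && s.inner_local)))
             || medany);
}

// The factored-out helper, mirroring LOCAL_SYM from the patch.
constexpr bool local_sym(Sym s) {
  return (s.symbol_ref && s.local)
         || (s.is_const && s.inner_symbol_ref && s.inner_local);
}

// Shape of the condition after the patch.
constexpr bool use_la_after(bool explicit_relocs, bool pic, bool medany,
                            Sym s) {
  return !explicit_relocs && ((pic && local_sym(s)) || medany);
}

// Exhaustively compare both forms over all 2^8 input combinations.
constexpr bool forms_agree() {
  for (unsigned bits = 0; bits < 256; ++bits) {
    const Sym s{(bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0,
                (bits & 8) != 0, (bits & 16) != 0};
    const bool explicit_relocs = (bits & 32) != 0;
    const bool pic = (bits & 64) != 0;
    const bool medany = (bits & 128) != 0;
    if (use_la_before(explicit_relocs, pic, medany, s)
        != use_la_after(explicit_relocs, pic, medany, s))
      return false;
  }
  return true;
}

static_assert(forms_agree(), "refactored condition must match the original");
```

The check is done at compile time, so building the file is enough to confirm
that the two forms agree for every combination of the modeled inputs. It says
nothing about whether the CONST case itself could be dropped, which is the
open question raised in the cover note.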



Re: [PATCH] c/c++: new warning: -Wxor-used-as-pow [PR90885]

2022-09-02 Thread David Malcolm via Gcc-patches
On Tue, 2022-08-30 at 16:40 -0400, Marek Polacek wrote:
> This looks good to me, one thing though:
> 
> On Thu, Aug 11, 2022 at 09:38:12PM -0400, David Malcolm via Gcc-
> patches wrote:
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -1439,6 +1439,10 @@ Wwrite-strings
> >  C ObjC C++ ObjC++ Var(warn_write_strings) Warning
> >  In C++, nonzero means warn about deprecated conversion from string
> > literals to 'char *'.  In C, similar warning, except that the
> > conversion is of course not deprecated by the ISO C standard.
> >  
> > +Wxor-used-as-pow
> > +C C++ Common Var(warn_xor_used_as_pow) Warning Init(1)
> 
> This doesn't include ObjC/ObjC++, but...

I added ObjC/ObjC++ (and dropped Common), and retested it (and tested
it with -xobjective-c and -xobjective-c++);  I've pushed it to trunk as
r13-2386-gbedfca647a9e9c.

Thanks
Dave
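
For readers who have not met the problem this warning targets, here is a
small sketch in standalone C++ of the difference between what the ^ operator
computes and what someone writing "2 ^ 8" may have meant. The helper names
xor_of and ipow are made up for the illustration, and the operands are passed
as arguments rather than written as a literal "2 ^ 8" so that the sketch does
not depend on how the new warning behaves.

```cpp
// Bitwise exclusive or: what the ^ operator actually computes in C and C++.
constexpr int xor_of(int lhs, int rhs) { return lhs ^ rhs; }

// Integer exponentiation by repeated multiplication: what a reader may have
// expected from an expression written as "base ^ exponent".
constexpr long ipow(long base, unsigned exponent) {
  long result = 1;
  for (unsigned i = 0; i < exponent; ++i)
    result *= base;
  return result;
}

static_assert(xor_of(2, 8) == 10, "2 ^ 8 is 0b0010 xor 0b1000, i.e. 0b1010");
static_assert(ipow(2, 8) == 256, "two to the eighth power");
static_assert(xor_of(10, 3) == 9, "10 ^ 3 is 0b1010 xor 0b0011, i.e. 0b1001");
static_assert(ipow(10, 3) == 1000, "ten cubed");
```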



[PATCH v3] RISC-V: remove deprecate pic code model macro

2022-09-02 Thread Vineet Gupta
Came across this deprecated symbol while looking around the code for
-mexplicit-relocs handling.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/riscv-c.cc   | 5 -
 gcc/testsuite/gcc.target/riscv/predef-1.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-2.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-3.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-4.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-5.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-6.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-7.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-8.c | 3 ---
 9 files changed, 29 deletions(-)

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index eb7ef09297e9..8d55ad598a9c 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -93,11 +93,6 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
   break;
 
 case CM_PIC:
-  /* __riscv_cmodel_pic is deprecated, and will removed in next GCC release.
-see https://github.com/riscv/riscv-c-api-doc/pull/11  */
-  builtin_define ("__riscv_cmodel_pic");
-  /* FALLTHROUGH. */
-
 case CM_MEDANY:
   builtin_define ("__riscv_cmodel_medany");
   break;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-1.c b/gcc/testsuite/gcc.target/riscv/predef-1.c
index 2e57ce6b3954..9dddc1849635 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-1.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-1.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-2.c b/gcc/testsuite/gcc.target/riscv/predef-2.c
index c85b3c9fd32a..755fe4ef7d8a 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-2.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-2.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-3.c b/gcc/testsuite/gcc.target/riscv/predef-3.c
index 82a89d415809..513645351c09 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-3.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-3.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medany"
-#endif
-#if !defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_pic"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-4.c b/gcc/testsuite/gcc.target/riscv/predef-4.c
index 5868d39eb67a..76b6feec6b6f 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-4.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-4.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-5.c b/gcc/testsuite/gcc.target/riscv/predef-5.c
index 4b2bd3835061..54a51508afbd 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-5.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-5.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-6.c b/gcc/testsuite/gcc.target/riscv/predef-6.c
index 8e5ea366bd5e..f61709f7bf32 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-6.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-6.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medany"
-#endif
-#if !defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medpic"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-7.c b/gcc/testsuite/gcc.target/riscv/predef-7.c
index 0bde299aef1a..41217554c4db 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-7.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-7.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-8.c b/gcc/testsuite/gcc.target/riscv/predef-8.c
index 18aa591a6039..982056a53438 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-8.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-8.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
-- 
2.32.0



Re: [PATCH v2] RISC-V: remove deprecate pic code model macro

2022-09-02 Thread Andreas Schwab
On Sep 02 2022, Vineet Gupta wrote:

> diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> index eb7ef09297e9..bba72cf77a82 100644
> --- a/gcc/config/riscv/riscv-c.cc
> +++ b/gcc/config/riscv/riscv-c.cc
> @@ -93,9 +93,6 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
>break;
>  
>  case CM_PIC:
> -  /* __riscv_cmodel_pic is deprecated, and will removed in next GCC release.
> -  see https://github.com/riscv/riscv-c-api-doc/pull/11  */
> -  builtin_define ("__riscv_cmodel_pic");
>/* FALLTHROUGH. */
>  
>  case CM_MEDANY:

If there is nothing left between the case labels the fallthrough comment
is no longer needed.
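
A small sketch of the point, in standalone C++ with made-up names (CodeModel,
Predef and predef_for are illustrative, not the identifiers in riscv-c.cc):
when two case labels are adjacent with no statement in between, there is no
code to fall through from, so no fallthrough marker is required.

```cpp
// Hypothetical mirror of the code models and of the macro family chosen.
enum class CodeModel { MedLow, Pic, MedAny };
enum class Predef { MedLow, MedAny };

constexpr Predef predef_for(CodeModel model) {
  switch (model) {
    case CodeModel::MedLow:
      return Predef::MedLow;
    case CodeModel::Pic:     // no statement between these two labels, so
    case CodeModel::MedAny:  // there is nothing to annotate as a fallthrough
      return Predef::MedAny;
  }
  return Predef::MedAny;  // not reached for valid enumerators
}

static_assert(predef_for(CodeModel::Pic) == predef_for(CodeModel::MedAny),
              "both labels share the same body");
```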

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH v2, rs6000] Put dg-options before effective target checks

2022-09-02 Thread Segher Boessenkool
Hi!

On Fri, Sep 02, 2022 at 11:43:28AM +0800, HAO CHEN GUI wrote:
> On 2/9/2022 12:07 AM, Segher Boessenkool wrote:
> >> +/* { dg-do compile { target { ! has_arch_pwr9 } } } */
> > Please keep dg-do first thing in the file.
> Could you inform me if it's a must to put dg-do in the first line?

It is customary.  If you do differently it will be a lot harder for
people to truly understand your tests.

> Here I hit a problem. "! has_arch_pwr9" can not be put into
> dg-require-effective-target as it has a NOT.

dg-require-effective-target has a selector, maybe you can do something
with that?
  dg-require-effective-target { whatever { has_arch_pwr9 } }
or something like that?

> >> --- a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> >> @@ -1,5 +1,6 @@
> >> -/* { dg-do compile { target has_arch_ppc64 } } */
> >> +/* { dg-do compile } */
> >>  /* { dg-options "-mdejagnu-cpu=power6 -O2" } */
> >> +/* { dg-require-effective-target has_arch_ppc64 } */
> > This is fine, but it doesn't change anything, unless we have a bug.
> 
> This case suffers from the "empty translation unit" problem and ends up
> unsupported on all platforms. Putting dg-options before the check avoids
> the problem.

Then please fix that problem first!  It *will* come back to bite us,
multiple times per week, until it is fixed.


Segher


Re: [PATCH 2/2] RISC-V: remove CM_PIC as it doesn't do much

2022-09-02 Thread Vineet Gupta




On 8/31/22 13:39, Vineet Gupta wrote:



On 8/31/22 07:57, Palmer Dabbelt wrote:

   if (flag_pic)
-    riscv_cmodel = CM_PIC;
+    riscv_cmodel = CM_MEDANY;

   /* We get better code with explicit relocs for CM_MEDLOW, but
  worse code for the others (for now).  Pick the best default.  */


I'm fine either way on this one: having CM_PIC gone makes it a bit 
more likely to confuse CM_MEDANY with PIC, but flag_pic is overriding 
riscv_cmodel anyway so this isn't really used and deleting code is 
always a plus.


Indeed this was the most contentious part of removing CM_PIC, but it 
seems this is the way forward. I'll add Kito's comment from [1] in the code to 
make it more explicit.


[1]https://github.com/riscv-non-isa/riscv-c-api-doc/pull/11#issuecomment-686385585 


I think I'll punt on this one, in the short-term. The reason being it 
affects USE_LOAD_ADDRESS_MACRO.


#define USE_LOAD_ADDRESS_MACRO(sym) \
  (!TARGET_EXPLICIT_RELOCS &&   \
   ((flag_pic   \
 && ((SYMBOL_REF_P (sym) && SYMBOL_REF_LOCAL_P (sym))   \
 || ((GET_CODE (sym) == CONST)  \
 && SYMBOL_REF_P (XEXP (XEXP (sym, 0),0))   \
 && SYMBOL_REF_LOCAL_P (XEXP (XEXP (sym, 0),0)))))  \
 || riscv_cmodel == CM_MEDANY))

With the patch, PIC implies CM_MEDANY and thus will change codegen for 
pic non-local symbols to also use the load address macro. I think we 
want to go in the opposite direction, i.e. wean away from the asm macros 
and have gcc codegen natively. It seems there are bugs in that area so 
once we flush them out (after creating a few as I don't know of any 
existing documented ones) this will get cleaned out.


Thx
-Vineet
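
The concern above can be sketched in a few lines of standalone C++. This is
only a model under stated assumptions: the local-symbol test is collapsed
into a single flag, and the names CodeModel, pic_model_before and
pic_model_after are invented for the illustration.

```cpp
enum class CodeModel { MedLow, MedAny, Pic };

// Code model selected when flag_pic is set, before and after the proposed
// change (riscv_cmodel = CM_PIC versus riscv_cmodel = CM_MEDANY).
constexpr CodeModel pic_model_before() { return CodeModel::Pic; }
constexpr CodeModel pic_model_after() { return CodeModel::MedAny; }

// Same shape as USE_LOAD_ADDRESS_MACRO, with the symbol checks reduced to
// one boolean.
constexpr bool use_load_address_macro(bool explicit_relocs, bool flag_pic,
                                      bool local_symbol, CodeModel cmodel) {
  return !explicit_relocs
         && ((flag_pic && local_symbol) || cmodel == CodeModel::MedAny);
}

// PIC, non-local symbol, no explicit relocs: the macro is not used today...
static_assert(!use_load_address_macro(false, true, false, pic_model_before()),
              "CM_PIC keeps non-local symbols away from the macro");
// ...but would be used once PIC implies CM_MEDANY.
static_assert(use_load_address_macro(false, true, false, pic_model_after()),
              "CM_MEDANY makes the second disjunct true");
// Local symbols are unaffected by the change.
static_assert(use_load_address_macro(false, true, true, pic_model_before())
                  == use_load_address_macro(false, true, true,
                                            pic_model_after()),
              "local symbols behave the same either way");
```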


[PATCH v2] RISC-V: remove deprecate pic code model macro

2022-09-02 Thread Vineet Gupta
Came across this deprecated symbol while looking around the code for
-mexplicit-relocs handling.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/riscv-c.cc   | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-1.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-2.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-3.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-4.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-5.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-6.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-7.c | 3 ---
 gcc/testsuite/gcc.target/riscv/predef-8.c | 3 ---
 9 files changed, 27 deletions(-)

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index eb7ef09297e9..bba72cf77a82 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -93,9 +93,6 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
   break;
 
 case CM_PIC:
-  /* __riscv_cmodel_pic is deprecated, and will removed in next GCC release.
-see https://github.com/riscv/riscv-c-api-doc/pull/11  */
-  builtin_define ("__riscv_cmodel_pic");
   /* FALLTHROUGH. */
 
 case CM_MEDANY:
diff --git a/gcc/testsuite/gcc.target/riscv/predef-1.c b/gcc/testsuite/gcc.target/riscv/predef-1.c
index 2e57ce6b3954..9dddc1849635 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-1.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-1.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-2.c b/gcc/testsuite/gcc.target/riscv/predef-2.c
index c85b3c9fd32a..755fe4ef7d8a 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-2.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-2.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-3.c b/gcc/testsuite/gcc.target/riscv/predef-3.c
index 82a89d415809..513645351c09 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-3.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-3.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medany"
-#endif
-#if !defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_pic"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-4.c b/gcc/testsuite/gcc.target/riscv/predef-4.c
index 5868d39eb67a..76b6feec6b6f 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-4.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-4.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-5.c b/gcc/testsuite/gcc.target/riscv/predef-5.c
index 4b2bd3835061..54a51508afbd 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-5.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-5.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-6.c b/gcc/testsuite/gcc.target/riscv/predef-6.c
index 8e5ea366bd5e..f61709f7bf32 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-6.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-6.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medany"
-#endif
-#if !defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medpic"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-7.c b/gcc/testsuite/gcc.target/riscv/predef-7.c
index 0bde299aef1a..41217554c4db 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-7.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-7.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
diff --git a/gcc/testsuite/gcc.target/riscv/predef-8.c b/gcc/testsuite/gcc.target/riscv/predef-8.c
index 18aa591a6039..982056a53438 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-8.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-8.c
@@ -57,9 +57,6 @@ int main () {
 #endif
 #if !defined(__riscv_cmodel_medany)
 #error "__riscv_cmodel_medlow"
-#endif
-#if defined(__riscv_cmodel_pic)
-#error "__riscv_cmodel_medlow"
 #endif
 
   return 0;
-- 
2.32.0



Proxy ping [PATCH] Fortran: Fix ICE with automatic reallocation [PR100245]

2022-09-02 Thread Harald Anlauf via Gcc-patches
Dear all,

Jose posted a small patch here that was never reviewed:

  https://gcc.gnu.org/pipermail/fortran/2021-April/055982.html

IMHO the patch is fine and nearly obvious.

I inquired in the PR, and Jose did not object to my handling of
his patch.  So - unless there are objections - I will commit
the patch within the next few days, in the slightly corrected version
attached below (with the PR typo in the commit message fixed ;-).

Regtested on x86_64-pc-linux-gnu.

Thanks,
Harald

From d7e5cca20be4a4ed00705f0d577302819ad97123 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jos=C3=A9=20Rui=20Faustino=20de=20Sousa?=
 
Date: Fri, 2 Sep 2022 21:35:22 +0200
Subject: [PATCH] Fortran: Fix ICE with automatic reallocation [PR100245]

gcc/fortran/ChangeLog:

	PR fortran/100245
	* trans-expr.cc (trans_class_assignment): Add if clause to handle
	derived type in the LHS.

gcc/testsuite/ChangeLog:

	PR fortran/100245
	* gfortran.dg/PR100245.f90: New test.
---
 gcc/fortran/trans-expr.cc  |  3 +++
 gcc/testsuite/gfortran.dg/PR100245.f90 | 28 ++
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/PR100245.f90

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 850007fd2e1..13c3e7df45f 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -11436,6 +11436,9 @@ trans_class_assignment (stmtblock_t *block, gfc_expr *lhs, gfc_expr *rhs,
   class_han = GFC_CLASS_TYPE_P (TREE_TYPE (lse->expr))
 	  ? gfc_class_data_get (lse->expr) : lse->expr;

+  if (!POINTER_TYPE_P (TREE_TYPE (class_han)))
+	class_han = gfc_build_addr_expr (NULL_TREE, class_han);
+
   /* Allocate block.  */
   gfc_init_block (&alloc);
   gfc_allocate_using_malloc (&alloc, class_han, size, NULL_TREE);
diff --git a/gcc/testsuite/gfortran.dg/PR100245.f90 b/gcc/testsuite/gfortran.dg/PR100245.f90
new file mode 100644
index 00000000000..07c1f7b3a1c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/PR100245.f90
@@ -0,0 +1,28 @@
+! { dg-do run }
+!
+! Test the fix for PR100245
+!
+
+program main_p
+
+  implicit none
+
+  type :: foo_t
+integer :: a
+  end type foo_t
+
+  integer, parameter :: a = 42
+
+  class(foo_t), allocatable :: val
+  class(foo_t), allocatable :: rs1
+  type(foo_t),  allocatable :: rs2
+
+  allocate(val, source=foo_t(42))
+  if (val%a/=a) stop 1
+  rs1 = val
+  if (rs1%a/=a) stop 2
+  rs2 = val
+  if (rs2%a/=a) stop 3
+  deallocate(val, rs1, rs2)
+
+end program main_p
--
2.35.3



Re: GNU Tools Cauldron 2022

2022-09-02 Thread Thomas Schwinge
Hi!

(Currently still on parental leave, but I just had to...)  ;-P

On 2022-05-15T01:02:07+0200, Jan Hubicka via Gcc  wrote:
> We are pleased to invite you all to the next GNU Tools Cauldron,
> taking place in [Prague] on September 16-18, 2022.  We are looking forward
> to meeting you again after three years!
>
> As for the previous instances, we have set up a wiki page for
> details:
>
>  https://gcc.gnu.org/wiki/cauldron2022  
> 

Pushed to wwwdocs commit b465e8f1f72a0718aebcf483a78b68c3e68ead72
"GNU Tools Cauldron 2022", see attached.


See you in less than two weeks!

Grüße
 Thomas


From b465e8f1f72a0718aebcf483a78b68c3e68ead72 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 2 Sep 2022 21:39:18 +0200
Subject: [PATCH] GNU Tools Cauldron 2022

---
 htdocs/index.html | 4 
 1 file changed, 4 insertions(+)

diff --git a/htdocs/index.html b/htdocs/index.html
index 3bec379e..c2982a74 100644
--- a/htdocs/index.html
+++ b/htdocs/index.html
@@ -55,6 +55,10 @@ mission statement.
 News
 
 
+https://gcc.gnu.org/wiki/cauldron2022;>GNU Tools Cauldron 2022
+[2022-09-02]
+Prague, Czech Republic and online, September 16-18 2022
+
 GCC 12.2 released
 [2022-08-19]
 
-- 
2.35.1



[PATCH, committed] Fortran: avoid NULL pointer dereference on invalid DATA constant [PR99349]

2022-09-02 Thread Harald Anlauf via Gcc-patches
Dear all,

I've committed the attached fix for a NULL pointer dereference
as obvious after a discussion with Steve in the PR, and
successful regtesting on x86_64-pc-linux-gnu, as r13-2382.

See also https://gcc.gnu.org/g:b6aa7d45b502c01f8703c8d2cee2690f9aa8e282

Thanks,
Harald

From b6aa7d45b502c01f8703c8d2cee2690f9aa8e282 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Fri, 2 Sep 2022 21:07:26 +0200
Subject: [PATCH] Fortran: avoid NULL pointer dereference on invalid DATA
 constant [PR99349]

gcc/fortran/ChangeLog:

	PR fortran/99349
	* decl.cc (match_data_constant): Avoid NULL pointer dereference.

gcc/testsuite/ChangeLog:

	PR fortran/99349
	* gfortran.dg/pr99349.f90: New test.

Co-authored-by: Steven G. Kargl 
---
 gcc/fortran/decl.cc   | 3 ++-
 gcc/testsuite/gfortran.dg/pr99349.f90 | 9 +
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr99349.f90

diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index b6400514731..0f9b2ced4c2 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -423,7 +423,8 @@ match_data_constant (gfc_expr **result)
 	 data-pointer-initialization compatible (7.5.4.6) with the initial
 	 data target; the data statement object is initially associated
 	 with the target.  */
-  if ((*result)->symtree->n.sym->attr.save
+  if ((*result)->symtree
+	  && (*result)->symtree->n.sym->attr.save
 	  && (*result)->symtree->n.sym->attr.target)
 	return m;
   gfc_free_expr (*result);
diff --git a/gcc/testsuite/gfortran.dg/pr99349.f90 b/gcc/testsuite/gfortran.dg/pr99349.f90
new file mode 100644
index 00000000000..e1f4628af0b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr99349.f90
@@ -0,0 +1,9 @@
+! { dg-do compile }
+! PR fortran/99349 - ICE in match_data_constant
+! Contributed by G.Steinmetz
+
+function f()
+  logical, parameter :: a((1.)/0) = .true. ! { dg-error "Parameter array" }
+  integer :: b
+  data b /a%kind/ ! { dg-error "Syntax error" }
+end
--
2.35.3
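
The fix above relies on short-circuit evaluation of &&: the symtree pointer
is tested first, so the later member accesses are only evaluated when it is
non-null. A minimal sketch in standalone C++, using simplified hypothetical
structs (Attr, Symbol, Symtree, Expr) rather than gfortran's real data
structures:

```cpp
// Simplified, hypothetical stand-ins for the front-end structures.
struct Attr { bool save; bool target; };
struct Symbol { Attr attr; };
struct Symtree { const Symbol *sym; };
struct Expr { const Symtree *symtree; };

// Guarded check: when e.symtree is null, && stops at the first operand and
// the dereferences on the following lines are never evaluated.
constexpr bool is_saved_target(const Expr &e) {
  return e.symtree != nullptr
         && e.symtree->sym->attr.save
         && e.symtree->sym->attr.target;
}

// Sample data with static storage so it can be used at compile time.
constexpr Symbol saved_target_sym{{true, true}};
constexpr Symtree saved_target_tree{&saved_target_sym};

static_assert(!is_saved_target(Expr{nullptr}),
              "a missing symtree is rejected without being dereferenced");
static_assert(is_saved_target(Expr{&saved_target_tree}),
              "a saved target symbol is accepted");
```

Because the checks run in a constant expression, an unguarded null
dereference would be rejected by the compiler rather than crash at run time,
which makes the role of the first operand easy to see.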



Re: [PATCH] rs6000: Use NOP_EXPR to cast to MMA pointer types

2022-09-02 Thread Peter Bergner via Gcc-patches
On 9/2/22 12:23 PM, Segher Boessenkool wrote:
> On Fri, Sep 02, 2022 at 12:02:54PM -0500, Peter Bergner wrote:
>> On 9/2/22 11:31 AM, Segher Boessenkool wrote:
>>> (Did you also look at non-MMA VIEW_CONVERT_EXPR uses btw?)
>>
>> I did.  It seemed they were all related to pointers to vectors and I remember
>> you mentioning that as one of the reasons for using VIEW_CONVERT_EXPR over
>> NOP_EXPR, so I left them alone to be safe.
> 
> Huh?  I have no idea what you mean here.
> 
> Casting from one pointer type to another never needs it.  Casting from a
> scalar integer type to a pointer type does not either, AFAIK.  But I am not a
> Gimple expert, all this might be wrong, it isn't documented anywhere :-(

Ah, then I misunderstood you and didn't pick up on the non-pointer thing.
...which just goes to show I'm not an expert and someone else should look
at those uses. :-) 

Peter




Re: [PATCH] libstdc++: Consistently use ::type when deriving from __and/or/not_

2022-09-02 Thread Jonathan Wakely via Gcc-patches
On Fri, 2 Sept 2022 at 17:39, Patrick Palka via Libstdc++
 wrote:
>
> Now that these internal type traits are again class templates, it's
> better to derive from the trait's ::type (which is either false_type or
> true_type) instead of from the trait itself, for the sake of a shallower
> inheritance chain.  We usually do this but not always; this patch makes
> us consistently do so.
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?  (Compile
> time for join.cc decreases by about 0.5% with this, avg of 10 runs.)

OK, thanks.


>
> libstdc++-v3/ChangeLog:
>
> * include/std/tuple (tuple::_UseOtherCtor): Do ::type when
> deriving from __and_, __or_ or __not_.
> * include/std/type_traits (negation): Likewise.
> (is_unsigned): Likewise.
> (__is_implicitly_default_constructible): Likewise.
> (is_trivially_destructible): Likewise.
> (__is_nt_invocable_impl): Likewise.
> ---
>  libstdc++-v3/include/std/tuple   |  2 +-
>  libstdc++-v3/include/std/type_traits | 10 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
> index ddd7c226d80..26e248431ec 100644
> --- a/libstdc++-v3/include/std/tuple
> +++ b/libstdc++-v3/include/std/tuple
> @@ -826,7 +826,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>// then TUPLE should match tuple(UTypes&&...) instead.
>template
> struct _UseOtherCtor<_Tuple, tuple<_Tp>, tuple<_Up>>
> -   : __or_, is_constructible<_Tp, _Tuple>>
> +   : __or_, is_constructible<_Tp, _Tuple>>::type
> { };
>// If TUPLE and *this each have a single element of the same type,
>// then TUPLE should match a copy/move constructor instead.
> diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
> index be9f2955539..c0bb1cf64e3 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -235,7 +235,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>template
>  struct negation
> -: __not_<_Pp>
> +: __not_<_Pp>::type
>  { };
>
>/** @ingroup variable_templates
> @@ -845,7 +845,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>/// is_unsigned
>template
>  struct is_unsigned
> -: public __and_, __not_>>
> +: public __and_, __not_>>::type
>  { };
>
>/// @cond undocumented
> @@ -1222,7 +1222,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>template 
>  struct __is_implicitly_default_constructible
>  : public __and_<__is_constructible_impl<_Tp>,
> -   __is_implicitly_default_constructible_safe<_Tp>>
> +   __is_implicitly_default_constructible_safe<_Tp>>::type
>  { };
>
>/// is_trivially_copy_constructible
> @@ -1282,7 +1282,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>template
>  struct is_trivially_destructible
>  : public __and_<__is_destructible_safe<_Tp>,
> -   __bool_constant<__has_trivial_destructor(_Tp)>>
> +   __bool_constant<__has_trivial_destructor(_Tp)>>::type
>  {
>static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
> "template argument must be a complete class or an unbounded array");
> @@ -2975,7 +2975,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  struct __is_nt_invocable_impl<_Result, _Ret,
>   __void_t>
>  : __or_,
> -   __is_nothrow_convertible>
> +   __is_nothrow_convertible>::type
>  { };
>/// @endcond
>
> --
> 2.37.2.490.g6c8e4ee870
>
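
A sketch of the idea behind the patch, using the public C++17 trait
std::conjunction as a stand-in for the internal __and_ (an assumption made so
the example compiles outside libstdc++). The names signed_integral, via_trait
and via_type are invented for the illustration.

```cpp
#include <type_traits>

// Stand-in for an internal __and_<...> expression.
template <typename T>
using signed_integral =
    std::conjunction<std::is_integral<T>, std::is_signed<T>>;

// Deriving from the trait itself: the conjunction and whatever it inherits
// from sit between this class and the final true_type or false_type.
template <typename T>
struct via_trait : signed_integral<T> {};

// Deriving from the trait's ::type: the direct base is already true_type or
// false_type, so the inheritance chain is shallower.
template <typename T>
struct via_type : signed_integral<T>::type {};

// Both spellings give the same answer...
static_assert(via_trait<int>::value && via_type<int>::value,
              "int is a signed integral type");
static_assert(!via_trait<unsigned>::value && !via_type<unsigned>::value,
              "unsigned is not signed");

// ...but only the first keeps the conjunction in its list of bases.
static_assert(std::is_base_of<signed_integral<int>, via_trait<int>>::value,
              "via_trait inherits from the conjunction");
static_assert(!std::is_base_of<signed_integral<int>, via_type<int>>::value,
              "via_type skips the conjunction");
static_assert(std::is_base_of<std::true_type, via_type<int>>::value,
              "via_type derives from true_type");
```

The sketch only shows the difference in base classes; the compile-time
figure quoted above comes from the original message and is not reproduced
here.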



Re: [PATCH] Convert rest of compiler to dconst[n]inf.

2022-09-02 Thread Jeff Law via Gcc-patches




On 9/1/2022 12:08 PM, Aldy Hernandez via Gcc-patches wrote:

This is kinda obvious.

OK?

gcc/ChangeLog:

* builtins.cc (fold_builtin_inf): Convert use of real_inf to dconstinf.
(fold_builtin_fpclassify): Same.
* fold-const-call.cc (fold_const_call_cc): Same.
* match.pd: Same.
* omp-low.cc (omp_reduction_init_op): Same.
* realmpfr.cc (real_from_mpfr): Same.
* tree.cc (build_complex_inf): Same.

OK
jeff



Re: [PATCH] Improve converting between 128-bit modes that use the same format

2022-09-02 Thread Michael Meissner via Gcc-patches
On Tue, Aug 23, 2022 at 04:13:45PM -0500, Segher Boessenkool wrote:
> Please do not send new patches as replies to other patches.

This was sent as a new patch.

> On Thu, Aug 18, 2022 at 05:48:29PM -0400, Michael Meissner wrote:
> > mprove converting between 128-bit modes that use the same format.
>
> You are missing some characters?  But this is an edited version of the
> subject anyway.  Just don't do that (neither the copying or the
> editing), it just confuses things.

That is the first line from the git commit, which git format-patch puts as the
subject.  I accidentally deleted a few extra characters when trimming it down (I
remove the From:, etc. lines from the format-patch output).  But I can just
delete this line if desired.

> Please factor this patch into more pieces, pieces that can be reviewed
> more easily, pieces that change one thing only.
> 
> As is you are just rewriting the lot, and it is not an improvement at
> all this way.  No doubt there are many good pieces in it, but mixed with
> a non-trivial amount of bad pieces I cannot approve it.  It also isn't
> clear at all what you want to do; piece by piece it is easier to
> explain.
> 
> > -; Iterators for converting to/from TFmode
> > -(define_mode_iterator IFKF [IF KF])
> 
> Yes, IFmode and KFmode have almost nothing in common.  Good to see this
> go.  It would be even better if we would not use
> rs6000_expand_float128_convert when not needed, either, and all this
> would be just gone after expand.

I took a look at it, and I have a new version that only does the moves that are
NOPs, and it makes sure all of the functions called have the proper names.  I
will post it on Tuesday, as some of the machines that I use for testing are now
down for the US Labor Day weekend (they need to work on power infrastructure to
the lab the machines are in).

> 
> > +(define_expand "extendkfif2"
> > +  [(set (match_operand:IF 0 "gpc_reg_operand")
> > +   (float_extend:IF (match_operand:KF 1 "gpc_reg_operand")))]
> > +  "TARGET_FLOAT128_TYPE"
> > +{
> > +  rs6000_expand_float128_convert (operands[0], operands[1], false);
> > +  DONE;
> > +})
> 
> This does not belong here.
> 
> It really shouldn't *exist* at all: this is *not* a float_extend!  It is
> not converting to a wider mode (as required!), but not even to a mode of
> higher precision: both IFmode and KFmode can represent (finite, normal)
> numbers the other can not.

We know that TFmode (if -mabi=ieeelongdouble) and KFmode are the same, just
like TFmode (if -mabi=ibmlongdouble) and IFmode are the same.  But RTL does not
know that these modes use the same representation.  So to convert between them,
it needs to use either FLOAT_EXTEND or FLOAT_TRUNCATE, depending on which
precision each of the three modes has (i.e. rs6000-modes.h).  So you need
these conversions in RTL.

Unfortunately, you can't just use SUBREG before register allocation is done.
So I do define_insn_and_split to cover this.

> 
> But it certainly does not belong here in the middle of no-op moves.
> 
> > +(define_expand "trunckfif2"
> > +  [(set (match_operand:IF 0 "gpc_reg_operand")
> > +   (float_truncate:IF (match_operand:KF 1 "gpc_reg_operand")))]
> > +  "TARGET_FLOAT128_TYPE"
> > +{
> > +  rs6000_expand_float128_convert (operands[0], operands[1], false);
> > +  DONE;
> > +})
> 
> I also would expect IBM128 instead of just IF.  This would simplify a
> lot.  Why do you not use that, is there a reason?

If you use IBM128, you then need to create a mode_attr that for a given mode
gives the other mode.  Sure it can be done, but for the insns involved it was
just simpler to duplicate the insns.

So for example, for IBM floating point my current patches are:

(define_insn_and_split "extendtfif2"
  [(set (match_operand:IF 0 "gpc_reg_operand" "=wa,wa,r,r")
(float_extend:IF
 (match_operand:TF 1 "gpc_reg_operand" "0,wa,0,r")))]
  "TARGET_HARD_FLOAT && TARGET_IBM128 && FLOAT128_IBM_P (TFmode)"
  "#"
  "&& reload_completed"
  [(set (match_dup 0)
(match_dup 2))]
{
  operands[2] = gen_lowpart (IFmode, operands[1]);
}
  [(set_attr "num_insns" "2")
   (set_attr "length" "8")])

(define_insn_and_split "extendiftf2"
  [(set (match_operand:TF 0 "gpc_reg_operand" "=wa,wa,r,r")
(float_extend:TF
 (match_operand:IF 1 "gpc_reg_operand" "0,wa,0,r")))]
  "TARGET_HARD_FLOAT && TARGET_IBM128 && FLOAT128_IBM_P (TFmode)"
  "#"
  "&& reload_completed"
  [(set (match_dup 0)
(match_dup 2))]
{
  operands[2] = gen_lowpart (TFmode, operands[1]);
}
  [(set_attr "num_insns" "2")
   (set_attr "length" "8")])

You could rewrite that as:

(define_mode_attr IBM128_other [(IF "TF") (TF "IF")])

(define_insn_and_split "extend2"
  [(set (match_operand:IBM128 0 "gpc_reg_operand" 

Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-02 Thread Segher Boessenkool
On Fri, Sep 02, 2022 at 08:51:01AM +0800, Kewen.Lin wrote:
> on 2022/9/1 23:04, Segher Boessenkool wrote:
> > On Thu, Sep 01, 2022 at 05:05:44PM +0800, Kewen.Lin wrote:
> >> Without any explicit -mpowerpc64 (and -mno-), I think we all agree
> >> that -m64 should set OPTION_MASK_POWERPC64 in opts, conversely -m32
> >> should unset OPTION_MASK_POWERPC64 in opts.
> > 
> > The latter only for OSes that do not handle -mpowerpc64 correctly.
> 
> I think it's the same for the OSes that handle -mpowerpc64 correctly.

No.  -m32 should not set or unset POWERPC64.  The two options are
independent.

-m64 on the other hand forces POWERPC64 to on.  -m64 -mno-powerpc64 is
invalid (and we do indeed error on that).  But we do allow
  -m32 -mno-powerpc64 -m64
(silently enabling it again), urgh.

> 
> Note that it's for the context without any explicit -mpowerpc64 (and
> -mno-), assuming we don't "unset OPTION_MASK_POWERPC64 in opts" for
> -m32, then the command line "-m64 -m32" would not be the same as
> "-m32", since the previous "-m64" sets OPTION_MASK_POWERPC64 in opts
> and it's still kept, it's unexpected.

No.  -m64 -m32 does not set POWERPC64!  Or it shouldn't, in any case :-(


Segher


Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-02 Thread Segher Boessenkool
On Fri, Sep 02, 2022 at 08:50:52AM +0800, Kewen.Lin wrote:
> on 2022/9/1 22:57, Segher Boessenkool wrote:
> > These two are independent, but apparently we have a bug here, which will
> > make what you did malfunction in some cases -- the test will not run for
> > ilp32 if you have RUNTESTFLAGS {-m32,-m64}.
> 
> Yeah, because of the bug (or call it surprised behavior),

No, I call it a bug.  Because that is what it is!

> the test case can
> fail for some dejaGnu version like 1.5.1 (how it places the dg-options 
> matters).

Yes, but that is only one way to expose the problem.

The bug just should be fixed.

> But to be clarified, the order of 
> 
>   /* { dg-options "-O2 -mpowerpc64" } */
> 
> and 
>   
>   /* { dg-require-effective-target has_arch_ppc64 } */
> 
> matters in this proposed fix, not for the line with ilp32.

Of course :-)

> has_arch_ppc64 uses current_compiler_flags which only incorporates dg-options
> which is placed before the dg-require-effective-target.  I guess it's related
> to how dejaGnu parses lines and sets global variables, for this kind of case,
> we have to put the expected order for now.

Even just to avoid having to uselessly edit hundreds of testcases, it
would be better to just fix the bug!


Segher


Re: [PATCH] rs6000: Use NO_EXPR to cast to MMA pointer types

2022-09-02 Thread Segher Boessenkool
On Fri, Sep 02, 2022 at 12:02:54PM -0500, Peter Bergner wrote:
> On 9/2/22 11:31 AM, Segher Boessenkool wrote:
> > (Did you also look at non-MMA VIEW_CONVERT_EXPR uses btw?)
> 
> I did.  It seemed they were all related to pointers to vectors and I remember
> you mentioning that as one of the reasons for using VIEW_CONVERT_EXPR over
> NOP_EXPR, so I left them alone to be safe.

Huh?  I have no idea what you mean here.

Casting from one pointer type to another never needs it.  Casting from a
scalar integer type to a pointer type doesn't either, AFAIK.  But I am not a
Gimple expert, all this might be wrong, it isn't documented anywhere :-(


Segher


Ping: [PATCH] Rework 128-bit complex multiply and divide.

2022-09-02 Thread Michael Meissner via Gcc-patches
Ping patch:

| Date: Thu, 18 Aug 2022 17:46:51 -0400
| Subject: [PATCH] Rework 128-bit complex multiply and divide.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] rs6000: Use NO_EXPR to cast to MMA pointer types

2022-09-02 Thread Peter Bergner via Gcc-patches
On 9/2/22 11:31 AM, Segher Boessenkool wrote:
> I wouldn't worry about backports.  If it does make other backports
> easier in the future, we can decide to backport this *then*.

Ok.



> (Did you also look at non-MMA VIEW_CONVERT_EXPR uses btw?)

I did.  It seemed they were all related to pointers to vectors and I remember
you mentioning that as one of the reasons for using VIEW_CONVERT_EXPR over
NOP_EXPR, so I left them alone to be safe.



> Okay for trunk.  Thanks!

Ok, pushed to trunk.  Thanks!


Peter



[PATCH] libstdc++: Consistently use ::type when deriving from __and/or/not_

2022-09-02 Thread Patrick Palka via Gcc-patches
Now that these internal type traits are again class templates, it's
better to derive from the trait's ::type (which is either false_type or
true_type) instead of from the trait itself, for sake of a shallower
inheritance chain.  We usually do this but not always; this patch makes
us consistently do so.
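
As a rough illustration of the difference described above, here is a minimal C++17 sketch.  The names my_and, unsigned_check, deep_is_unsigned and shallow_is_unsigned are hypothetical stand-ins and are not the libstdc++ internals; the sketch only assumes that the helper trait ultimately derives from std::true_type or std::false_type, so that its ::type names that base directly.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an internal "and" trait (not libstdc++'s __and_).
// Its inherited member ::type is std::true_type or std::false_type.
template<typename... Bs>
struct my_and : std::bool_constant<(Bs::value && ...)> { };

template<typename T>
using unsigned_check
  = my_and<std::is_arithmetic<T>, std::negation<std::is_signed<T>>>;

// Deriving from the trait itself: the chain is
// deep_is_unsigned -> my_and<...> -> std::bool_constant<...>.
template<typename T>
struct deep_is_unsigned : unsigned_check<T> { };

// Deriving from ::type: the direct base is std::true_type or std::false_type,
// so my_and<...> does not appear in the inheritance chain at all.
template<typename T>
struct shallow_is_unsigned : unsigned_check<T>::type { };

static_assert(std::is_base_of_v<unsigned_check<unsigned>,
                                deep_is_unsigned<unsigned>>);
static_assert(!std::is_base_of_v<unsigned_check<unsigned>,
                                 shallow_is_unsigned<unsigned>>);
static_assert(std::is_base_of_v<std::true_type, shallow_is_unsigned<unsigned>>);
static_assert(std::is_base_of_v<std::false_type, shallow_is_unsigned<int>>);
static_assert(deep_is_unsigned<unsigned>::value
              && !deep_is_unsigned<int>::value);
```

Both forms yield the same ::value; only the shape of the base-class chain differs.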

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?  (Compile
time for join.cc decreases by about 0.5% with this, avg of 10 runs.)

libstdc++-v3/ChangeLog:

* include/std/tuple (tuple::_UseOtherCtor): Do ::type when
deriving from __and_, __or_ or __not_.
* include/std/type_traits (negation): Likewise.
(is_unsigned): Likewise.
(__is_implicitly_default_constructible): Likewise.
(is_trivially_destructible): Likewise.
(__is_nt_invocable_impl): Likewise.
---
 libstdc++-v3/include/std/tuple   |  2 +-
 libstdc++-v3/include/std/type_traits | 10 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index ddd7c226d80..26e248431ec 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -826,7 +826,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // then TUPLE should match tuple(UTypes&&...) instead.
   template
struct _UseOtherCtor<_Tuple, tuple<_Tp>, tuple<_Up>>
-   : __or_, is_constructible<_Tp, _Tuple>>
+   : __or_, is_constructible<_Tp, 
_Tuple>>::type
{ };
   // If TUPLE and *this each have a single element of the same type,
   // then TUPLE should match a copy/move constructor instead.
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index be9f2955539..c0bb1cf64e3 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -235,7 +235,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 struct negation
-: __not_<_Pp>
+: __not_<_Pp>::type
 { };
 
   /** @ingroup variable_templates
@@ -845,7 +845,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_unsigned
   template
 struct is_unsigned
-: public __and_, __not_>>
+: public __and_, __not_>>::type
 { };
 
   /// @cond undocumented
@@ -1222,7 +1222,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template 
 struct __is_implicitly_default_constructible
 : public __and_<__is_constructible_impl<_Tp>,
-   __is_implicitly_default_constructible_safe<_Tp>>
+   __is_implicitly_default_constructible_safe<_Tp>>::type
 { };
 
   /// is_trivially_copy_constructible
@@ -1282,7 +1282,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_trivially_destructible
 : public __and_<__is_destructible_safe<_Tp>,
-   __bool_constant<__has_trivial_destructor(_Tp)>>
+   __bool_constant<__has_trivial_destructor(_Tp)>>::type
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
"template argument must be a complete class or an unbounded array");
@@ -2975,7 +2975,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_nt_invocable_impl<_Result, _Ret,
  __void_t>
 : __or_,
-   __is_nothrow_convertible>
+   __is_nothrow_convertible>::type
 { };
   /// @endcond
 
-- 
2.37.2.490.g6c8e4ee870



Re: [PATCH] rs6000: Use NO_EXPR to cast to MMA pointer types

2022-09-02 Thread Segher Boessenkool
Hi!

On Fri, Sep 02, 2022 at 11:22:07AM -0500, Peter Bergner wrote:
> When we cast pointers to our opaque MMA pointers, use NOP_EXPR rather
> than VIEW_CONVERT_EXPR.

> I think this is just a cleanup and not a correctness thing, so I'm assuming a
> backport isn't needed?  Or maybe we do to make other potential backports 
> easier?
> I'm fine either way.

I wouldn't worry about backports.  If it does make other backports
easier in the future, we can decide to backport this *then*.

Okay for trunk.  Thanks!

(Did you also look at non-MMA VIEW_CONVERT_EXPR uses btw?)


Segher


[PATCH] rs6000: Use NO_EXPR to cast to MMA pointer types

2022-09-02 Thread Peter Bergner via Gcc-patches
When we cast pointers to our opaque MMA pointers, use NOP_EXPR rather
than VIEW_CONVERT_EXPR.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for trunk?

I think this is just a cleanup and not a correctness thing, so I'm assuming a
backport isn't needed?  Or maybe we do to make other potential backports easier?
I'm fine either way.

Peter


gcc/
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin): Use
NOP_EXPR for MMA pointer casting.


diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index e6948b9abb7..0d8be996f4e 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1101,7 +1101,7 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi,
  || (fncode == RS6000_BIF_DISASSEMBLE_PAIR_V
  && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_pair_type_node))
{
- tree dst = build_simple_mem_ref (build1 (VIEW_CONVERT_EXPR,
+ tree dst = build_simple_mem_ref (build1 (NOP_EXPR,
   src_type, dst_ptr));
  gimplify_assign (dst, src, &new_seq);
  pop_gimplify_context (NULL);
@@ -1125,7 +1125,7 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi,
= rs6000_builtin_decls[rs6000_builtin_info[fncode].assoc_bif];
   tree dst_type = build_pointer_type_for_mode (unsigned_V16QI_type_node,
   ptr_mode, true);
-  tree dst_base = build1 (VIEW_CONVERT_EXPR, dst_type, dst_ptr);
+  tree dst_base = build1 (NOP_EXPR, dst_type, dst_ptr);
   for (unsigned i = 0; i < nvec; i++)
{
  unsigned index = WORDS_BIG_ENDIAN ? i : nvec - 1 - i;
@@ -1151,7 +1151,7 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi,
   tree ptr = gimple_call_arg (stmt, 1);
   tree lhs = gimple_call_lhs (stmt);
   if (TREE_TYPE (TREE_TYPE (ptr)) != vector_pair_type_node)
-   ptr = build1 (VIEW_CONVERT_EXPR,
+   ptr = build1 (NOP_EXPR,
  build_pointer_type (vector_pair_type_node), ptr);
   tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
   TREE_TYPE (ptr), ptr, offset));
@@ -1168,7 +1168,7 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi,
   tree offset = gimple_call_arg (stmt, 1);
   tree ptr = gimple_call_arg (stmt, 2);
   if (TREE_TYPE (TREE_TYPE (ptr)) != vector_pair_type_node)
-   ptr = build1 (VIEW_CONVERT_EXPR,
+   ptr = build1 (NOP_EXPR,
  build_pointer_type (vector_pair_type_node), ptr);
   tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
   TREE_TYPE (ptr), ptr, offset));


Re: [PATCH 1/2] Using pli(paddi) and rotate to build 64bit constants

2022-09-02 Thread Segher Boessenkool
On Fri, Sep 02, 2022 at 10:29:35AM -0500, Peter Bergner wrote:
> On 9/1/22 4:52 PM, Segher Boessenkool wrote:
> > On Thu, Sep 01, 2022 at 11:24:00AM +0800, Jiufu Guo wrote:
> >> As mentioned in PR106550, since pli could support 34bits immediate, we could
> >> use less instructions(3insn would be ok) to build 64bits constant with pli.
> > 
> >> For example, for constant 0x020805006106003, we could generate it with:
> >> asm code1:
> >> pli 9,101736451 (0x6106003)
> >> sldi 9,9,32
> >> paddi 9,9, 213 (0x0208050)
> > 
> > 3 insns, 2 insns dependent on the previous, each.
> > 
> >> or asm code2:
> >> pli 10, 213
> >> pli 9, 101736451
> >> rldimi 9, 10, 32, 0
> > 
> > 3 insns, 1 insn dependent on both others.
> 
> Yeah, the improvement here is the fewer dependent instructions, since
> 2 64-bit + 1 32-bit instructions is the same size as 5 32-bit insns.

It also helps CSE if you do say 0x1200aa00bb0034 and 0x1200aa00bb0056,
or even just 0x1200aa001200aa maybe (we probably have a separate pattern
for the latter though :-) )
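
The two three-instruction sequences discussed above can be modelled in plain C++ as a sanity check.  This is only a sketch under stated assumptions: build_shift_add, build_insert and rotl64 are hypothetical helper names, both halves are treated as unsigned 32-bit values (which fit in a signed 34-bit immediate), and the r0 restriction on paddi is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left, guarding the n == 0 case to avoid an out-of-range shift.
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned n)
{
  n %= 64;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

// Model of "pli; sldi 32; paddi": each step depends on the previous one.
constexpr std::uint64_t build_shift_add(std::uint32_t hi, std::uint32_t lo)
{
  std::uint64_t r = hi;  // pli   r, hi
  r <<= 32;              // sldi  r, r, 32
  r += lo;               // paddi r, r, lo
  return r;
}

// Model of "pli; pli; rldimi r, t, 32, 0": the two loads are independent,
// and only the final insert depends on both.
constexpr std::uint64_t build_insert(std::uint32_t hi, std::uint32_t lo)
{
  std::uint64_t t = hi;  // pli t, hi
  std::uint64_t r = lo;  // pli r, lo
  // SH = 32, MB = 0 selects IBM bits 0..31, i.e. the upper 32 bits.
  const std::uint64_t mask = 0xFFFFFFFF00000000ULL;
  return (rotl64(t, 32) & mask) | (r & ~mask);
}

// The two constants from the CSE remark share the same high half.
static_assert(build_shift_add(0x001200aaU, 0x00bb0034U)
              == 0x001200aa00bb0034ULL);
static_assert(build_insert(0x001200aaU, 0x00bb0034U)
              == 0x001200aa00bb0034ULL);
static_assert(build_insert(0x001200aaU, 0x00bb0056U)
              == 0x001200aa00bb0056ULL);
```

In this model the load of the shared high half (0x001200aa) is the same operation for both constants, which is the part that could be computed once.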


Segher


Re: [PATCH] Fortran: add IEEE_QUIET_* and IEEE_SIGNALING_* comparisons

2022-09-02 Thread FX via Gcc-patches
> IIRC there was discussion about abort on the ML some years ago where folks 
> decided to switch to stop N.
> I don't think I participated in that discussion, maybe somebody remembers the 
> reasoning or is able to find the thread.

Found it: https://gcc.gnu.org/legacy-ml/fortran/2018-02/msg00105.html
Will replace those abort calls, then.

FX

Re: [PATCH 1/2] Using pli(paddi) and rotate to build 64bit constants

2022-09-02 Thread Segher Boessenkool
Hi!

On Fri, Sep 02, 2022 at 02:56:21PM +0800, Jiufu Guo wrote:
> >> +  /* pli 9,high32 + sldi 9,32 + paddi 9,9,low32.  */
> >> +  else
> >> +  {
> >
> > The comment goes here, in the block it refers to.  Comments for a block
> > are the first thing *in* the block.
> OK, great! I like the format you suggested here :-)

It's the normal GCC style, not my invention :-)

> >> +emit_move_insn (copy_rtx (dest), GEN_INT ((ud4 << 16) | ud3));
> >> +
> >> +emit_move_insn (copy_rtx (dest),
> >> +gen_rtx_ASHIFT (DImode, copy_rtx (dest),
> >> +GEN_INT (32)));
> >> +
> >> +bool can_use_paddi = REGNO (dest) != FIRST_GPR_REGNO;
> >
> > There should be a test that we do the right thing (or *a* right thing,
> > anyway; a working thing; but hopefully a reasonably fast thing) for
> > !can_use_paddi.
> To catch this test point, we need let the splitter run after RA,
> and register 0 happen to be the dest of an assignment.

Or force the testcase to use r0 some other way.  Well, "forcing" cannot
be done, but we can probably encourage it (via a local register asm for
example, or by tying the var to the output of an asm that is hard reg 0,
or perhaps there are other ways as well :-) )

> I will add this test case in patch.
> Is this ok?  Any suggestions?

Sounds useful yes.  Maybe describe the expected output in words as well
(in the testcase, not in email)?

> >> +/* 3 insns for each constant: pli+sldi+paddi or pli+pli+rldimi.
> >> +   And 3 additional insns: std+std+blr: 9 insns totally.  */
> >> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9 } } */
> >
> > Also test the expected insns separately please?  The std's (just with
> > \mstd so it will catch all variations as well), the blr, the pli's and
> > the rldimi etc.?
> The reason of using "(?n)^\s+[a-z]" is to keep this test case pass no
> matter the splitter running before or after RA.

Ah.  Some short comment in the testcase please?

Thanks again,


Segher


Re: [PATCH] d: Fix #error You must define PREFERRED_DEBUGGING_TYPE if DWARF is not supported (PR105659)

2022-09-02 Thread Iain Buclaw via Gcc-patches
Excerpts from Richard Biener's message of September 1, 2022 8:28 am:
> On Wed, Aug 31, 2022 at 9:21 PM Iain Buclaw  wrote:
>>
>> Excerpts from Joseph Myers's message of August 31, 2022 7:16 pm:
>> > On Wed, 31 Aug 2022, Iain Buclaw via Gcc-patches wrote:
>> >
>> >> Excerpts from Joseph Myers's message of August 30, 2022 11:53 pm:
>> >> > On Fri, 26 Aug 2022, Richard Biener via Gcc-patches wrote:
>> >> >
>> >> >> I was hoping Joseph would chime in here - I recollect debugging this 
>> >> >> kind
>> >> >> of thing and a thread about this a while back but unfortunately I do 
>> >> >> not
>> >> >> remember the details here (IIRC some things get included where they
>> >> >> better should not be).
>> >> >
>> >> > See 
>> >> > .
>> >> > Is there some reason it's problematic to avoid having defaults.h or
>> >> > ${cpu_type}/${cpu_type}.h included in tm_d.h, and instead have tm_d.h 
>> >> > only
>> >> > include D-specific headers?
>> >> >
>> >>
>> >> In targets such as arm-elf, we still need to pull in definitions from
>> >> ${cpu_type}/${cpu_type}-d.cc into default-d.cc.
>> >>
>> >> All I can think that might suffice is having D-specific prototype
>> >> headers in all targets as ${cpu_type}/${cpu_type}-d.h.
>> >
>> > As long as those prototypes don't involve any types that depend on an
>> > inclusion of tm.h, that should be fine.
>> >
>>
>> Updated patch that does what I described.
> 
> Ah yes - I think, even if a bit verbose, this is exactly how it was supposed
> to be?
> 
> OK from my side.
> 

To access the TARGET macros from arm-d.cc, arm-protos.h had to be
included (after tm_p.h was removed).

All ~200 configurations in contrib/config-list.mk now build again with
the D front-end enabled.

Regards,
Iain.

---

gcc/ChangeLog:

* config.gcc: Set tm_d_file to ${cpu_type}/${cpu_type}-d.h.
* config/aarch64/aarch64-d.cc: Include tm_d.h.
* config/aarch64/aarch64-protos.h (aarch64_d_target_versions): Move to
config/aarch64/aarch64-d.h.
(aarch64_d_register_target_info): Likewise.
* config/aarch64/aarch64.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/arm/arm-d.cc: Include tm_d.h and arm-protos.h instead of
tm_p.h.
* config/arm/arm-protos.h (arm_d_target_versions): Move to
config/arm/arm-d.h.
(arm_d_register_target_info): Likewise.
* config/arm/arm.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/default-d.cc: Remove memmodel.h include.
* config/freebsd-d.cc: Include tm_d.h instead of tm_p.h.
* config/glibc-d.cc: Likewise.
* config/i386/i386-d.cc: Include tm_d.h.
* config/i386/i386-protos.h (ix86_d_target_versions): Move to
config/i386/i386-d.h.
(ix86_d_register_target_info): Likewise.
(ix86_d_has_stdcall_convention): Likewise.
* config/i386/i386.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
(TARGET_D_HAS_STDCALL_CONVENTION): Likewise.
* config/i386/winnt-d.cc: Include tm_d.h instead of tm_p.h.
* config/mips/mips-d.cc: Include tm_d.h.
* config/mips/mips-protos.h (mips_d_target_versions): Move to
config/mips/mips-d.h.
(mips_d_register_target_info): Likewise.
* config/mips/mips.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/netbsd-d.cc: Include tm_d.h instead of tm.h and memmodel.h.
* config/openbsd-d.cc: Likewise.
* config/pa/pa-d.cc: Include tm_d.h.
* config/pa/pa-protos.h (pa_d_target_versions): Move to
config/pa/pa-d.h.
(pa_d_register_target_info): Likewise.
* config/pa/pa.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/riscv/riscv-d.cc: Include tm_d.h.
* config/riscv/riscv-protos.h (riscv_d_target_versions): Move to
config/riscv/riscv-d.h.
(riscv_d_register_target_info): Likewise.
* config/riscv/riscv.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/rs6000/rs6000-d.cc: Include tm_d.h.
* config/rs6000/rs6000-protos.h (rs6000_d_target_versions): Move to
config/rs6000/rs6000-d.h.
(rs6000_d_register_target_info): Likewise.
* config/rs6000/rs6000.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/s390/s390-d.cc: Include tm_d.h.
* config/s390/s390-protos.h (s390_d_target_versions): Move to
config/s390/s390-d.h.
(s390_d_register_target_info): Likewise.
* config/s390/s390.h (TARGET_D_CPU_VERSIONS): Likewise.
(TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
* config/sol2-d.cc: Include tm_d.h 

Re: [PATCH] Fortran: add IEEE_QUIET_* and IEEE_SIGNALING_* comparisons

2022-09-02 Thread Bernhard Reutner-Fischer via Gcc-patches
On 2 September 2022 17:54:00 CEST, FX  wrote:
>Hi Bernhard,
>
>> Please do not call the non-standard abort, but use stop N.
>
>Is there a specific reason? It’s a well-documented GNU extension, and it’s 
>useful because it can easily display a backtrace and give line info for the 
>failure, unlike STOP.
>I’ll replace if there is consensus, but apart from aesthetics I don’t see why.


IIRC there was discussion about abort on the ML some years ago where folks 
decided to switch to stop N.
I don't think I participated in that discussion, maybe somebody remembers the 
reasoning or is able to find the thread.

thanks,


[committed] libstdc++: Optimize constructible/assignable variable templates

2022-09-02 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

This defines the is_xxx_constructible_v and is_xxx_assignable_v variable
templates by using the built-ins directly. The actual logic for each one
is the same as the corresponding class template, but this way using the
variable template doesn't need to instantiate the class template.

This means that the variable templates won't use the static assertions
checking for complete types, cv void or unbounded arrays, but that's OK
because the built-ins check those anyway. We could probably remove the
static assertions from the class templates, and maybe from all type
traits that use a built-in.
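
A minimal sketch of the approach described above, outside of libstdc++.  The my_* names and the NoCopy type are hypothetical; the sketch assumes a compiler that provides the __is_constructible and __is_assignable built-ins (GCC and Clang do) and C++17 for inline variables.

```cpp
#include <cassert>
#include <type_traits>

// Variable templates defined directly on the built-ins: evaluating them
// does not instantiate std::is_constructible<...> or similar classes.
template<typename T, typename... Args>
inline constexpr bool my_is_constructible_v = __is_constructible(T, Args...);

template<typename T>
inline constexpr bool my_is_copy_constructible_v
  = __is_constructible(T, std::add_lvalue_reference_t<const T>);

template<typename T>
inline constexpr bool my_is_move_assignable_v
  = __is_assignable(std::add_lvalue_reference_t<T>,
                    std::add_rvalue_reference_t<T>);

// Hypothetical test type: default constructible and move assignable,
// but not copy constructible.
struct NoCopy
{
  NoCopy() = default;
  NoCopy(const NoCopy&) = delete;
  NoCopy& operator=(NoCopy&&) = default;
};

static_assert(my_is_constructible_v<NoCopy>);
static_assert(!my_is_copy_constructible_v<NoCopy>);
static_assert(my_is_move_assignable_v<NoCopy>);
static_assert(my_is_constructible_v<int, double>
              == std::is_constructible_v<int, double>);
static_assert(my_is_copy_constructible_v<int>
              == std::is_copy_constructible_v<int>);
```

The results agree with the standard class templates for these cases; the difference is only in how much the compiler has to instantiate.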

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_constructible_v)
(is_default_constructible_v, is_copy_constructible_v)
(is_move_constructible_v): Define using __is_constructible.
(is_assignable_v, is_copy_assignable_v, is_move_assignable_v):
Define using __is_assignable.
(is_trivially_constructible_v)
(is_trivially_default_constructible_v)
(is_trivially_copy_constructible_v)
(is_trivially_move_constructible_v): Define using
__is_trivially_constructible.
(is_trivially_assignable_v, is_trivially_copy_assignable_v)
(is_trivially_move_assignable_v): Define using
__is_trivially_assignable.
(is_nothrow_constructible_v)
(is_nothrow_default_constructible_v)
(is_nothrow_copy_constructible_v)
(is_nothrow_move_constructible_v): Define using
__is_nothrow_constructible.
(is_nothrow_assignable_v, is_nothrow_copy_assignable_v)
(is_nothrow_move_assignable_v): Define using
__is_nothrow_assignable.
---
 libstdc++-v3/include/std/type_traits | 88 
 1 file changed, 49 insertions(+), 39 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index be9f2955539..2f5fe80b98a 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3105,71 +3105,81 @@ template 
   inline constexpr bool is_signed_v = is_signed<_Tp>::value;
 template 
   inline constexpr bool is_unsigned_v = is_unsigned<_Tp>::value;
+
 template 
-  inline constexpr bool is_constructible_v =
-is_constructible<_Tp, _Args...>::value;
+  inline constexpr bool is_constructible_v = __is_constructible(_Tp, _Args...);
 template 
-  inline constexpr bool is_default_constructible_v =
-is_default_constructible<_Tp>::value;
+  inline constexpr bool is_default_constructible_v = __is_constructible(_Tp);
 template 
-  inline constexpr bool is_copy_constructible_v =
-is_copy_constructible<_Tp>::value;
+  inline constexpr bool is_copy_constructible_v
+= __is_constructible(_Tp, __add_lval_ref_t);
 template 
-  inline constexpr bool is_move_constructible_v =
-is_move_constructible<_Tp>::value;
+  inline constexpr bool is_move_constructible_v
+= __is_constructible(_Tp, __add_rval_ref_t<_Tp>);
+
 template 
-  inline constexpr bool is_assignable_v = is_assignable<_Tp, _Up>::value;
+  inline constexpr bool is_assignable_v = __is_assignable(_Tp, _Up);
 template 
-  inline constexpr bool is_copy_assignable_v = is_copy_assignable<_Tp>::value;
+  inline constexpr bool is_copy_assignable_v
+= __is_assignable(__add_lval_ref_t<_Tp>, __add_lval_ref_t);
 template 
-  inline constexpr bool is_move_assignable_v = is_move_assignable<_Tp>::value;
+  inline constexpr bool is_move_assignable_v
+= __is_assignable(__add_lval_ref_t<_Tp>, __add_rval_ref_t<_Tp>);
+
 template 
   inline constexpr bool is_destructible_v = is_destructible<_Tp>::value;
+
 template 
-  inline constexpr bool is_trivially_constructible_v =
-is_trivially_constructible<_Tp, _Args...>::value;
+  inline constexpr bool is_trivially_constructible_v
+= __is_trivially_constructible(_Tp, _Args...);
 template 
-  inline constexpr bool is_trivially_default_constructible_v =
-is_trivially_default_constructible<_Tp>::value;
+  inline constexpr bool is_trivially_default_constructible_v
+= __is_trivially_constructible(_Tp);
 template 
-  inline constexpr bool is_trivially_copy_constructible_v =
-is_trivially_copy_constructible<_Tp>::value;
+  inline constexpr bool is_trivially_copy_constructible_v
+= __is_trivially_constructible(_Tp, __add_lval_ref_t);
 template 
-  inline constexpr bool is_trivially_move_constructible_v =
-is_trivially_move_constructible<_Tp>::value;
+  inline constexpr bool is_trivially_move_constructible_v
+= __is_trivially_constructible(_Tp, __add_rval_ref_t<_Tp>);
+
 template 
-  inline constexpr bool is_trivially_assignable_v =
-is_trivially_assignable<_Tp, _Up>::value;
+  inline constexpr bool is_trivially_assignable_v
+= __is_trivially_assignable(_Tp, _Up);
 template 
-  inline constexpr bool is_trivially_copy_assignable_v =
-is_trivially_copy_assignable<_Tp>::value;
+  inline constexpr bool is_trivially_copy_assignable_v
+= 

Re: [PATCH v2, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-09-02 Thread Segher Boessenkool
Hi!

On Fri, Sep 02, 2022 at 04:31:38PM +0800, HAO CHEN GUI wrote:
>   This patch is for internal issue1136.

This isn't useful to most people.  Either just don't mention it here,
or make a public PR for it if that is useful?

> It changes insn condition from
> TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions.
> These instructions all use DI registers and can be invoked with -mpowerpc64
> in a 32-bit environment.

> gcc/
>   * config/rs6000/vsx.md (xsxexpdp): Change insn condition from
>   TARGET_64BIT to TARGET_POWERPC64.
>   (xsxsigdp): Likewise.
>   (xsiexpdp): Likewise.
>   (xsiexpdpf): Likewise.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Change effective
>   target from lp64 to has_arch_ppc64 and add -mpowerpc64 for 32-bit
>   environment.
>   * gcc.target/powerpc/bfp/scalar-extract-exp-6.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-exp-7.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-6.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-7.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-12.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-13.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Likewise.

> -  const signed long __builtin_vsx_scalar_extract_exp (double);
> +  const unsigned long long __builtin_vsx_scalar_extract_exp (double);
>  VSEEDP xsxexpdp {}
> 
> -  const signed long __builtin_vsx_scalar_extract_sig (double);
> +  const unsigned long long __builtin_vsx_scalar_extract_sig (double);
>  VSESDP xsxsigdp {}

This also brings these legacy builtins in line with the vec_ versions,
which are the preferred builtins (they are defined in the PVIPR).

> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5098,7 +5098,7 @@ (define_insn "xsxexpdp"
>[(set (match_operand:DI 0 "register_operand" "=r")
>   (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>UNSPEC_VSX_SXEXPDP))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR && TARGET_POWERPC64"
>"xsxexpdp %0,%x1"
>[(set_attr "type" "integer")])

This doesn't need POWERPC64 even -- instead, it could use :GPR instead
of :DI, the output is always tiny.
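
To make the "always tiny" remark concrete, here is a small software model of extracting the biased exponent field of a double.  It is only a sketch: extract_biased_exponent is a hypothetical helper, not the builtin, and it assumes double is IEEE 754 binary64.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 binary64 doubles");

// Return the 11-bit biased exponent field of X.  The result is always in
// the range 0..2047, so it fits easily in a 32-bit register.
inline std::uint32_t extract_biased_exponent(double x)
{
  std::uint64_t bits;
  static_assert(sizeof bits == sizeof x, "double must be 64 bits wide");
  std::memcpy(&bits, &x, sizeof bits);
  return static_cast<std::uint32_t>((bits >> 52) & 0x7FF);
}
```

For example, 1.0 has biased exponent 1023 and infinity has 2047, the largest possible value.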

> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
> @@ -1,7 +1,8 @@
> -/* { dg-do compile { target { powerpc*-*-* } } } */
> -/* { dg-require-effective-target lp64 } */
> -/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-do compile { target { powerpc*-*-linux* } } } */

Why?

>  /* { dg-options "-mdejagnu-cpu=power9" } */
> +/* { dg-additional-options "-mpowerpc64" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */

This is guaranteed already by that -mpowerpc64.

It probably is best if you do not add -mpowerpc64 at all.  That solves
both problems, is simpler, and gives better coverage as well :-)

So just use has_arch_ppc64 instead of lp64.  That makes it run on a
strict superset of cases :-)

> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-6.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-6.c
> @@ -1,7 +1,7 @@
> -/* { dg-do run { target { powerpc*-*-* } } } */
> -/* { dg-require-effective-target lp64 } */
> -/* { dg-require-effective-target p9vector_hw } */
> +/* { dg-do run { target { powerpc*-*-linux* } } } */
>  /* { dg-options "-mdejagnu-cpu=power9" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +/* { dg-require-effective-target p9vector_hw } */

Nothing in gcc.target/powerpc has to check for powerpc*-*-* at all.  If
you want to test for linux (you shouldn't here afaics?), that is just
*-*-linux* .


Segher


Re: [PATCH] Fortran: add IEEE_QUIET_* and IEEE_SIGNALING_* comparisons

2022-09-02 Thread FX via Gcc-patches
Hi Bernhard,

> Please do not call the non-standard abort, but use stop N.

Is there a specific reason? It’s a well-documented GNU extension, and it’s 
useful because it can easily display a backtrace and give line info for the 
failure, unlike STOP.
I’ll replace if there is consensus, but apart from aesthetics I don’t see why.

FX

Re: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]

2022-09-02 Thread Simon Rainer
Hi,

Thanks for committing the patch. I created PR106816 to track the noreturn/pure 
problem.

Regards
Simon Rainer

On Fri, Sep 2, 2022, at 08:03, Richard Biener wrote:
> On Thu, Sep 1, 2022 at 7:51 PM Simon Rainer  wrote:
> >
> > Hi,
> >
> > Thanks for taking a look at my patch. I tested some combinations with 
> > pure/noreturn attributes. gcc seems to ignore those attributes on 
> > multiversion functions and generates sub-optimal assembly.
> > But I wasn't able to fix this by simply copying members like DECL_PURE_P. 
> > It's pretty hard for me to tell which members of tree are relevant for a 
> > function declaration and should be copied and which should not be copied.
> >
> > Anyway, I think the TREE_NOTHROW change is the most important one, because 
> > it leads to correctness problems (and is what broke my original program :D 
> > ), so could you please commit my patch as I don't have write-access myself.
> 
> Sure, will do - thanks for the fix!
> 
> >
> > Should I open a new bug on bugzilla for the pure/noreturn issue?
> 
> Yes, I think it's worth investigating.
> 
> Richard.
> 
> > Thanks
> > Simon Rainer
> >
> >
> > On Thu, Sep 1, 2022, at 08:37, Richard Biener wrote:
> > > On Wed, Aug 31, 2022 at 11:00 PM Simon Rainer  wrote:
> > > >
> > > > Hi,
> > > >
> > > > This patch fixes PR106627. I ran the i386.exp tests on my 
> > > > x86_64-linux-gnu machine with a fully bootstrapped checkout. I also 
> > > > tested manually that no exception handling code is generated if none of 
> > > > the function versions throws an exception.
> > > > I don't have access to a machine to test the change to  rs6000.cc, but 
> > > > the code seems like an exact copy and I don't see a reason why it 
> > > > shouldn't work there the same way.
> > > >
> > > > Regards
> > > > Simon Rainer
> > > >
> > > > From 6fcb1c742fa1d61048f7d63243225a8d1931af4a Mon Sep 17 00:00:00 2001
> > > > From: Simon Rainer 
> > > > Date: Wed, 31 Aug 2022 20:56:04 +0200
> > > > Subject: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]
> > > >
> > > > Any multi-versioned function was implicitly declared as noexcept, which
> > > > leads to an abort if an exception is thrown inside the function.
> > > > The reason for this is that the function declaration is replaced by a
> > > > newly created dispatcher declaration, which has TREE_NOTHROW always set
> > > > to 1. Instead we need to set TREE_NOTHROW to the value of the original
> > > > declaration.
> > >
> > > Looks quite obvious.  The middle-end to target interface is a bit iffy
> > > since we have
> > > to duplicate this everywhere.  There's also other flags like
> > > pure/const and noreturn
> > > that do not impose correctness issues but may cause irritations if the IL 
> > > gets
> > > a call to the dispatcher not marked noreturn but there's no code 
> > > following.
> > >
> > > That said, the fix looks good to me.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > > PR ipa/106627
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/i386/i386-features.cc 
> > > > (ix86_get_function_versions_dispatcher): Set TREE_NOTHROW
> > > > correctly for dispatcher declaration
> > > > * config/rs6000/rs6000.cc 
> > > > (rs6000_get_function_versions_dispatcher): Likewise
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * g++.target/i386/pr106627.C: New test.
> > > > ---
> > > >  gcc/config/i386/i386-features.cc |  1 +
> > > >  gcc/config/rs6000/rs6000.cc  |  1 +
> > > >  gcc/testsuite/g++.target/i386/pr106627.C | 30 
> > > >  3 files changed, 32 insertions(+)
> > > >  create mode 100644 gcc/testsuite/g++.target/i386/pr106627.C
> > > >
> > > > diff --git a/gcc/config/i386/i386-features.cc 
> > > > b/gcc/config/i386/i386-features.cc
> > > > index d6bb66cbe01..5b3b1aeff28 100644
> > > > --- a/gcc/config/i386/i386-features.cc
> > > > +++ b/gcc/config/i386/i386-features.cc
> > > > @@ -3268,6 +3268,7 @@ ix86_get_function_versions_dispatcher (void *decl)
> > > >
> > > >/* Right now, the dispatching is done via ifunc.  */
> > > >dispatch_decl = make_dispatcher_decl (default_node->decl);
> > > > +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
> > > >
> > > >dispatcher_node = cgraph_node::get_create (dispatch_decl);
> > > >gcc_assert (dispatcher_node != NULL);
> > > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> > > > index 2f3146e56f8..9280da8a5c8 100644
> > > > --- a/gcc/config/rs6000/rs6000.cc
> > > > +++ b/gcc/config/rs6000/rs6000.cc
> > > > @@ -24861,6 +24861,7 @@ rs6000_get_function_versions_dispatcher (void 
> > > > *decl)
> > > >
> > > >/* Right now, the dispatching is done via ifunc.  */
> > > >dispatch_decl = make_dispatcher_decl (default_node->decl);
> > > > +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
> > > >
> > > >dispatcher_node = cgraph_node::get_create (dispatch_decl);
> > > >

Re: [PATCH] Fortran: add IEEE_QUIET_* and IEEE_SIGNALING_* comparisons

2022-09-02 Thread Bernhard Reutner-Fischer via Gcc-patches
On 2 September 2022 13:37:41 CEST, FX via Fortran  wrote:
>Hi,

Please do not call the non-standard abort, but use stop N.

IIRC I once had a trivial script.. 
https://www.mail-archive.com/search?l=gcc-patches@gcc.gnu.org=subject:%22%5C%5BPATCH%2C+OpenACC%5C%5D+Fortran+deviceptr%22=newest=1

---8<---
Like (modulo typos, untested):
$ cat abort_to_stop.awk ; echo EOF
# awk -f ./abort_to_stop.awk < foo.f90 > x && mv x foo.f90
BEGIN { IGNORECASE = 1; i = 1 } { while (sub(/call\s\s*abort/, "stop " i)) { i++ }; print $0 }
EOF

HTH and thanks,


Re: [PATCH 1/2] Using pli(paddi) and rotate to build 64bit constants

2022-09-02 Thread Peter Bergner via Gcc-patches
On 9/1/22 4:52 PM, Segher Boessenkool wrote:
> On Thu, Sep 01, 2022 at 11:24:00AM +0800, Jiufu Guo wrote:
>> As mentioned in PR106550, since pli supports a 34-bit immediate, we could
>> use fewer instructions (3 insns would be ok) to build a 64-bit constant with pli.
> 
>> For example, for constant 0x020805006106003, we could generate it with:
>> asm code1:
>> pli 9,101736451 (0x6106003)
>> sldi 9,9,32
>> paddi 9,9, 2130000 (0x0208050)
> 
> 3 insns, 2 insns dependent on the previous, each.
> 
>> or asm code2:
>> pli 10, 2130000
>> pli 9, 101736451
>> rldimi 9, 10, 32, 0
> 
> 3 insns, 1 insn dependent on both others.

Yeah, the improvement here is the fewer dependent instructions, since
2 64-bit + 1 32-bit instructions is the same size as 5 32-bit insns.
Those 5 32-bit insns are all dependent on the previous insn, so not ideal.

It's too bad we don't have paddis or poris insns where we could specify
in the prefix a shift of 32-bits rather than the normal 16-bits.
If we had those, we could generate the constant with just 2 64-bit insns.

Peter
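
For readers who want to check the arithmetic in the "asm code2" sequence, here is a minimal sketch that models it in C++. The helper names (rotl64, rldimi, build_with_rldimi) are made up for illustration, and rldimi is only modelled for the non-wrapping mask case; the high-half immediate is taken from the parenthesised hex value 0x0208050 in the quoted example. It also checks the size claim above, assuming prefixed insns take 8 bytes and ordinary insns 4.

```cpp
#include <cstdint>

// Assumption: prefixed insns (pli, paddi) are 8 bytes, ordinary insns 4 bytes.
constexpr unsigned prefixed_insn_bytes = 8;
constexpr unsigned plain_insn_bytes = 4;
static_assert(2 * prefixed_insn_bytes + plain_insn_bytes == 5 * plain_insn_bytes,
              "2 prefixed + 1 plain insn occupy the same space as 5 plain insns");

// Rotate a 64-bit value left by n bits (0 <= n < 64).
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned n)
{
  return n == 0 ? x : (x << n) | (x >> (64 - n));
}

// Simplified model of "rldimi RA,RS,SH,MB" (hypothetical helper):
// rotate RS left by SH and insert it into RA under a mask that covers
// IBM bit positions MB..63-SH (bit 0 is the most significant bit).
// Only the non-wrapping case MB <= 63-SH is handled here.
constexpr std::uint64_t rldimi(std::uint64_t ra, std::uint64_t rs,
                               unsigned sh, unsigned mb)
{
  const unsigned me = 63 - sh;
  std::uint64_t mask = 0;
  for (unsigned bit = mb; bit <= me; ++bit)
    mask |= std::uint64_t{1} << (63 - bit);
  return (rotl64(rs, sh) & mask) | (ra & ~mask);
}

// The two independent pli insns followed by one rldimi.
constexpr std::uint64_t build_with_rldimi()
{
  std::uint64_t r10 = 0x0208050;  // pli 10, <high 32-bit half>
  std::uint64_t r9 = 0x6106003;   // pli 9, 101736451
  return rldimi(r9, r10, 32, 0);  // rldimi 9, 10, 32, 0
}

static_assert(build_with_rldimi() == 0x0020805006106003ULL,
              "high half inserted above the low half");
```

The two pli results do not depend on each other, so only the final rldimi has to wait, which is the dependency improvement discussed above.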



Re: [PATCH 2/2] analyzer: strcpy and strncpy semantics

2022-09-02 Thread David Malcolm via Gcc-patches
On Fri, 2022-09-02 at 16:08 +0200, Tim Lange wrote:
> Hi,
> 
> below is my patch for the strcpy and strncpy semantics inside the
> analyzer, enabling the out-of-bounds checker to also complain about
> overflows caused by those two functions.
> 
> As the plan is to reason about the inequality of symbolic values in
> the
> future, I decided to use eval_condition to compare the number of
> bytes and
> the string size for strncpy [0].
> 
> - Tim
> 
> [0] instead of only trying to handle cases where svalues are
> constant;
>     which was how I did it in an earlier draft discussed off-list.
> 
> 
> This patch adds modelling for the semantics of strcpy and strncpy in
> the
> simple case where the analyzer is able to reason about the inequality
> of
> the size argument and the string size.
> 
> Regrtested on Linux x86_64.

Thanks for the patch.

The strcpy part looks great, but strncpy has some tricky behavior that
isn't fully modeled by the patch; see:

https://en.cppreference.com/w/c/string/byte/strncpy

For example: if the src string with null terminator is shorter than
"count", then dest is padded with additional null characters up to
"count".

You could split out the strcpy part of the patch, as that seems to be
ready to go as-is, and do the strncpy part as a followup if you like;
some further notes below...
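
As a rough illustration of the padding behavior described above (not analyzer code), the following sketch compares a small model of strncpy's effect with what the library function actually does. The names strncpy_effect, model_strncpy and padding_matches_model are hypothetical and exist only for this example; string_size here means strlen(src) + 1, i.e. including the terminator.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical summary of what strncpy (dest, src, count) writes.
struct strncpy_effect
{
  std::size_t copied;       // bytes taken from src
  std::size_t zero_filled;  // additional null bytes written after them
};

// Model: if the string including its terminator is shorter than count,
// the whole string is copied and the rest of the count bytes is zero-filled;
// otherwise exactly count bytes are copied and nothing is padded.
strncpy_effect model_strncpy(const char *src, std::size_t count)
{
  const std::size_t string_size = std::strlen(src) + 1;
  if (string_size < count)
    return {string_size, count - string_size};
  return {count, 0};
}

// Check the model against the real strncpy on a scratch buffer.
bool padding_matches_model(const char *src, std::size_t count)
{
  char buf[32];
  if (count > sizeof buf)
    return false;
  std::memset(buf, 'x', sizeof buf);
  std::strncpy(buf, src, count);

  const strncpy_effect e = model_strncpy(src, count);
  if (std::memcmp(buf, src, e.copied) != 0)
    return false;
  for (std::size_t i = e.copied; i < count; ++i)
    if (buf[i] != '\0')        // padding must be null bytes
      return false;
  for (std::size_t i = count; i < sizeof buf; ++i)
    if (buf[i] != 'x')         // bytes past count must be untouched
      return false;
  return true;
}
```

For example, model_strncpy ("Hi", 6) reports 3 copied bytes and 3 zero-filled bytes, which is the part a model that only copies the source contents would miss.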

[...snip...]
> 
> +/* Handle the on_call_pre part of "strncpy" and
> "__builtin_strncpy_chk".  */
>  
> -  /* For now, just mark region's contents as unknown.  */
> -  mark_region_as_unknown (dest_reg, cd.get_uncertainty ());
> +void
> +region_model::impl_call_strncpy (const call_details &cd)
> +{
> +  const svalue *dest_sval = cd.get_arg_svalue (0);
> +  const region *dest_reg = deref_rvalue (dest_sval, cd.get_arg_tree
> (0),
> +    cd.get_ctxt ());
> +  const svalue *src_sval = cd.get_arg_svalue (1);
> +  const region *src_reg = deref_rvalue (src_sval, cd.get_arg_tree
> (1),
> +   cd.get_ctxt ());
> +  const svalue *src_contents_sval = get_store_value (src_reg,
> +    cd.get_ctxt ());
> +  const svalue *num_bytes_sval = cd.get_arg_svalue (2);
> +
> +  cd.maybe_set_lhs (dest_sval);
> +
> +  const svalue *string_size_sval = get_string_size (src_reg);
> +  if (string_size_sval->get_kind () == SK_UNKNOWN)
> +    string_size_sval = get_string_size (src_contents_sval);
> +
> +  /* strncpy copies until a zero terminator is reached or n bytes
> were copied.
> + Determine the lesser of both here.  */
> +  tristate ts = eval_condition (string_size_sval, LT_EXPR,
> num_bytes_sval);
> +  const svalue *copied_bytes_sval;
> +  switch (ts.get_value ())
> +    {
> +  case tristate::TS_TRUE:
> +   copied_bytes_sval = string_size_sval;

This is the
  strlen(src) + 1 < count
case, and thus we want to do a zero-fill of size:
   bin_op(num_bytes_sval, MINUS_EXPR, copied_bytes_sval)
in the relevant subregion of dest_reg.
Or perhaps this could be expressed by first doing a full zero-fill of
the num_bytes_sval-sized region of dest_reg, and then copying the src
contents.

> +   break;
> +  case tristate::TS_FALSE:
> +   copied_bytes_sval = num_bytes_sval;
> +   break;
> +  case tristate::TS_UNKNOWN:
> +   copied_bytes_sval
> + = m_mgr->get_or_create_unknown_svalue (size_type_node);
> +   break;
> +  default:
> +   gcc_unreachable ();
> +    }
> +
> +  const region *sized_dest_reg = m_mgr->get_sized_region (dest_reg,
> NULL_TREE,
> +
> copied_bytes_sval);
> +  set_value (sized_dest_reg, src_contents_sval, cd.get_ctxt ());
>  }
> 

[...snip...]
> 

> b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-4.c
> new file mode 100644
> index 00000000000..382f0fb5ef4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-4.c
> @@ -0,0 +1,122 @@
> +/* { dg-additional-options "-Wno-stringop-overflow -Wno-stringop-
> truncation" } */
> +#include 
> +#include 
> +#include 
> +
> +/* Wanalyzer-out-of-bounds tests for str(n)py-related overflows.
> +  
> +   The intra-procedural tests are all catched by Wstringop-overflow.

Nit: "catched" -> "caught".

[...snip...]

> diff --git a/gcc/testsuite/gcc.dg/analyzer/strncpy-1.c
> b/gcc/testsuite/gcc.dg/analyzer/strncpy-1.c
> new file mode 100644
> index 00000000000..ea051eb761a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/strncpy-1.c
> @@ -0,0 +1,23 @@
> +#include 
> +#include "analyzer-decls.h"
> +
> +void test_1 (void)
> +{
> +  char str[] = "Hello";
> +  char buf[6];
> +  char *result = strncpy (buf, str, 6);
> +  __analyzer_describe (1, result); /* { dg-warning
> "region_svalue.*?'buf'" } */
> +  __analyzer_eval (result == buf); /* { dg-warning "TRUE" } */
> +  __analyzer_eval (buf[0] == 'H'); /* { dg-warning "TRUE" } */
> +  __analyzer_eval (buf[1] == 'e'); /* { dg-warning "TRUE" } */
> +  __analyzer_eval (buf[2] == 'l'); /* { dg-warning "TRUE" } */
> 

Re: [PATCH 1/2] analyzer: return a concrete offset for cast_regions

2022-09-02 Thread David Malcolm via Gcc-patches
On Fri, 2022-09-02 at 16:08 +0200, Tim Lange wrote:
> This patch fixes a bug where maybe_fold_sub_svalue did not fold the
> access of a single char from a string to a char when the offset was
> zero
> because get_relative_concrete_offset did return false for
> cast_regions.
> 
> Regrtested on Linux x86_64.

Thanks; this patch is OK for trunk.

Dave



[PATCH] Ignore debug insns with CONCAT and CONCATN for insn scheduling

2022-09-02 Thread H.J. Lu via Gcc-patches
CONCAT and CONCATN never appear in the insn chain.  They are only used
in debug insn.  Ignore debug insns with CONCAT and CONCATN for insn
scheduling to avoid different insn orders with and without debug insn.

gcc/

PR rtl-optimization/106746
* sched-deps.cc (sched_analyze_2): Ignore debug insns with CONCAT
and CONCATN.

gcc/testsuite/

PR rtl-optimization/106746
* gcc.dg/pr106746.c: New test.
---
 gcc/sched-deps.cc   | 14 ++
 gcc/testsuite/gcc.dg/pr106746.c | 30 ++
 2 files changed, 44 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr106746.c

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index 948aa0c3b60..b472e4fbb09 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -2794,6 +2794,20 @@ sched_analyze_2 (class deps_desc *deps, rtx x, rtx_insn 
*insn)
 
   return;
 
+case VAR_LOCATION:
+  if (GET_CODE (PAT_VAR_LOCATION_LOC (x)) == CONCAT
+ || GET_CODE (PAT_VAR_LOCATION_LOC (x)) == CONCATN)
+   {
+ /* CONCAT and CONCATN never appear in the insn chain.  They
+are only used in debug insn.  Ignore insns with CONCAT and
+CONCATN for insn scheduling to avoid different insn orders
+with and without debug insn.  */
+ if (cslr_p && sched_deps_info->finish_rhs)
+   sched_deps_info->finish_rhs ();
+ return;
+   }
+  break;
+
 default:
   break;
 }
diff --git a/gcc/testsuite/gcc.dg/pr106746.c b/gcc/testsuite/gcc.dg/pr106746.c
new file mode 100644
index 00000000000..1fc29de28c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr106746.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-Wno-psabi -O2 -fsched2-use-superblocks -fcompare-debug" } */
+typedef char __attribute__((__vector_size__ (64))) U;
+typedef short __attribute__((__vector_size__ (64))) V;
+typedef int __attribute__((__vector_size__ (64))) W;
+
+char c;
+U a;
+U *r;
+W foo0_v512u32_0;
+
+void
+foo (W)
+{
+  U u;
+  V v;
+  W w = __builtin_shuffle (foo0_v512u32_0, foo0_v512u32_0);
+  u = __builtin_shufflevector (a, u, 3, 0, 4, 9, 9, 6,
+  7, 8, 5, 0, 6, 1, 8, 1,
+  2, 8, 6, 1, 8, 4, 9, 3,
+  8, 4, 6, 0, 9, 0, 1, 8,
+  2, 3, 3, 0, 4, 9, 9, 6,
+  7, 8, 5, 0, 6, 1, 8, 1,
+  2, 8, 6, 1, 8, 4, 9, 3,
+  8, 4, 6, 0, 9, 0, 1, 8,
+  2, 3);
+  v *= c;
+  w &= c;
+  *r = (U) v + (U) w;
+}
-- 
2.37.2



Re: [PATCH] libstdc++: Fix laziness of __and/or/not_

2022-09-02 Thread Jonathan Wakely via Gcc-patches
On Fri, 2 Sep 2022, 14:35 Patrick Palka via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> On Fri, 2 Sep 2022, Patrick Palka wrote:
>
> > r13-2230-g390f94eee1ae69 redefined the internal logical operator traits
> > __and_, __or_ and __not_ as alias templates that directly resolve to
> > true_type or false_type.  But it turns out using an alias template here
> > causes the traits to be less lazy than before because we now compute
> > the logical result immediately upon _specialization_ of the trait, and
> > not later upon _completion_ of the specialization.
> >
> > Thus, for example, in
> >
> >   using type = __and_<A, __not_<B>>;
> >
> > we now compute the conjunction and thus instantiate A even though we're
> > in a context that doesn't require completion of the __and_.  What's
> > worse is that we now compute the negation and thus instantiate B as well
> > (for the same reason), independent of the __and_ and the value of A!
> > Thus the traits are now less lazy and composable than before.
>

Ah good catch.

>
> > Fortunately, the fix is cheap and simple: redefine these traits as class
> > templates instead of as alias templates so that completion not
> > specialization triggers computation of the logical result.  I added
> > comprehensive short circuiting tests for these internal logical operator
> > traits in short_circuit.cc guarded by __GLIBCXX__, not sure if
> > that's the best place for them.  (Note that before this fix, assert #5
> > and #10 guarded by __GLIBCXX__ would induce a hard error due to this
> > bug).
>

I don't bother guarding libstdc++-specific checks, because LLVM and MSVC
folk are allergic to anything that's been anywhere near GPL code.

But doing so is a kindness for any users who do decide to use our tests.
Maybe I should go through tests where I've added a comment saying "GCC
extension" and guard them this way.



> FWIW this change doesn't seem to have a measurable compile time/memory
> impact on the stress test from r13-2230.  For std/ranges/adaptors/join.cc,
> memory usage increases by around 1% and compile time decreases by around
> 1%.
>

Great.


> >
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
>


OK, thanks


>
> > libstdc++-v3/ChangeLog:
> >
> >   * include/std/type_traits (__or_, __and_, __not_): Redefine as a
> >   class template instead of an alias template.
> >   * testsuite/20_util/logical_traits/requirements/short_circuit.cc:
> >   Add more tests for conjunction and disjunction.  Add corresponding
> >   tests for __and_ and __or_v.
> > ---
> >  libstdc++-v3/include/std/type_traits  | 12 ++--
> >  .../requirements/short_circuit.cc | 29 +++
> >  2 files changed, 38 insertions(+), 3 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/std/type_traits
> b/libstdc++-v3/include/std/type_traits
> > index 615791f29c8..2feb4b145c5 100644
> > --- a/libstdc++-v3/include/std/type_traits
> > +++ b/libstdc++-v3/include/std/type_traits
> > @@ -168,13 +168,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >// to either true_type or false_type which allows for a more efficient
> >// implementation that avoids recursive class template instantiation.
> >    template<typename... _Bn>
> > -    using __or_ = decltype(__detail::__or_fn<_Bn...>(0));
> > +    struct __or_
> > +    : decltype(__detail::__or_fn<_Bn...>(0))
> > +    { };
> >
> >    template<typename... _Bn>
> > -    using __and_ = decltype(__detail::__and_fn<_Bn...>(0));
> > +    struct __and_
> > +    : decltype(__detail::__and_fn<_Bn...>(0))
> > +    { };
> >
> >    template<typename _Pp>
> > -    using __not_ = __bool_constant<!bool(_Pp::value)>;
> > +    struct __not_
> > +    : __bool_constant<!bool(_Pp::value)>
> > +    { };
> >/// @endcond
> >
> >  #if __cplusplus >= 201703L
> > diff --git
> a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
> b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
> > index 86996b27fa5..ff90f8a47c3 100644
> > ---
> a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
> > +++
> b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
> > @@ -14,6 +14,10 @@ static_assert(!std::conjunction_v invalid>);
> >  static_assert(!std::conjunction_v);
> >  static_assert(!std::conjunction_v invalid>);
> >  static_assert(!std::conjunction_v invalid, invalid>);
> > +static_assert(!std::conjunction_v > +   std::conjunction,
> > +   std::disjunction,
> > +   std::negation>);
> >
> >  // [meta.logical]/8: For a specialization disjunction, if
> >  // there is a template type argument B_i for which bool(B_i::value) is
> true,
> > @@ -24,3 +28,28 @@ static_assert(std::disjunction_v invalid>);
> >  static_assert(std::disjunction_v);
> >  static_assert(std::disjunction_v invalid>);
> >  static_assert(std::disjunction_v invalid, invalid>);
> > +static_assert(std::disjunction_v > +  std::conjunction,
> > + 

[PATCH 2/2] analyzer: strcpy and strncpy semantics

2022-09-02 Thread Tim Lange
Hi,

below is my patch for the strcpy and strncpy semantics inside the
analyzer, enabling the out-of-bounds checker to also complain about
overflows caused by those two functions.

As the plan is to reason about the inequality of symbolic values in the
future, I decided to use eval_condition to compare the number of bytes and
the string size for strncpy [0].

- Tim

[0] instead of only trying to handle cases where svalues are constant;
which was how I did it in an earlier draft discussed off-list.


This patch adds modelling for the semantics of strcpy and strncpy in the
simple case where the analyzer is able to reason about the inequality of
the size argument and the string size.

Regrtested on Linux x86_64.

2022-09-02  Tim Lange  

gcc/analyzer/ChangeLog:

* region-model-impl-calls.cc (region_model::impl_call_strncpy):
New function.
* region-model.cc (region_model::on_call_pre):
Add call to impl_call_strncpy.
(region_model::get_string_size): New function.
* region-model.h (class region_model):
Add impl_call_strncpy and get_string_size.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/out-of-bounds-4.c: New test.
* gcc.dg/analyzer/strcpy-3.c: New test.
* gcc.dg/analyzer/strncpy-1.c: New test.

---
 gcc/analyzer/region-model-impl-calls.cc   |  62 -
 gcc/analyzer/region-model.cc  |  33 +
 gcc/analyzer/region-model.h   |   4 +
 .../gcc.dg/analyzer/out-of-bounds-4.c | 122 ++
 gcc/testsuite/gcc.dg/analyzer/strcpy-3.c  |  23 
 gcc/testsuite/gcc.dg/analyzer/strncpy-1.c |  23 
 6 files changed, 264 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-4.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/strcpy-3.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/strncpy-1.c

diff --git a/gcc/analyzer/region-model-impl-calls.cc 
b/gcc/analyzer/region-model-impl-calls.cc
index 8eebd122d42..9f1ae020f4f 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -1019,13 +1019,69 @@ region_model::impl_call_strcpy (const call_details &cd)
   const svalue *dest_sval = cd.get_arg_svalue (0);
   const region *dest_reg = deref_rvalue (dest_sval, cd.get_arg_tree (0),
 cd.get_ctxt ());
+  const svalue *src_sval = cd.get_arg_svalue (1);
+  const region *src_reg = deref_rvalue (src_sval, cd.get_arg_tree (1),
+   cd.get_ctxt ());
+  const svalue *src_contents_sval = get_store_value (src_reg,
+cd.get_ctxt ());
 
   cd.maybe_set_lhs (dest_sval);
 
-  check_region_for_write (dest_reg, cd.get_ctxt ());
+  /* Try to get the string size if SRC_REG is a string_region.  */
+  const svalue *copied_bytes_sval = get_string_size (src_reg);
+  /* Otherwise, check if the contents of SRC_REG is a string.  */
+  if (copied_bytes_sval->get_kind () == SK_UNKNOWN)
+copied_bytes_sval = get_string_size (src_contents_sval);
+
+  const region *sized_dest_reg
+= m_mgr->get_sized_region (dest_reg, NULL_TREE, copied_bytes_sval);
+  set_value (sized_dest_reg, src_contents_sval, cd.get_ctxt ());
+}
+
+/* Handle the on_call_pre part of "strncpy" and "__builtin_strncpy_chk".  */
 
-  /* For now, just mark region's contents as unknown.  */
-  mark_region_as_unknown (dest_reg, cd.get_uncertainty ());
+void
+region_model::impl_call_strncpy (const call_details &cd)
+{
+  const svalue *dest_sval = cd.get_arg_svalue (0);
+  const region *dest_reg = deref_rvalue (dest_sval, cd.get_arg_tree (0),
+cd.get_ctxt ());
+  const svalue *src_sval = cd.get_arg_svalue (1);
+  const region *src_reg = deref_rvalue (src_sval, cd.get_arg_tree (1),
+   cd.get_ctxt ());
+  const svalue *src_contents_sval = get_store_value (src_reg,
+cd.get_ctxt ());
+  const svalue *num_bytes_sval = cd.get_arg_svalue (2);
+
+  cd.maybe_set_lhs (dest_sval);
+
+  const svalue *string_size_sval = get_string_size (src_reg);
+  if (string_size_sval->get_kind () == SK_UNKNOWN)
+string_size_sval = get_string_size (src_contents_sval);
+
+  /* strncpy copies until a zero terminator is reached or n bytes were copied.
+ Determine the lesser of both here.  */
+  tristate ts = eval_condition (string_size_sval, LT_EXPR, num_bytes_sval);
+  const svalue *copied_bytes_sval;
+  switch (ts.get_value ())
+{
+  case tristate::TS_TRUE:
+   copied_bytes_sval = string_size_sval;
+   break;
+  case tristate::TS_FALSE:
+   copied_bytes_sval = num_bytes_sval;
+   break;
+  case tristate::TS_UNKNOWN:
+   copied_bytes_sval
+ = m_mgr->get_or_create_unknown_svalue (size_type_node);
+   break;
+  default:
+   gcc_unreachable ();
+}
+
+  const region 

[PATCH 1/2] analyzer: return a concrete offset for cast_regions

2022-09-02 Thread Tim Lange
This patch fixes a bug where maybe_fold_sub_svalue did not fold the
access of a single char from a string to a char when the offset was zero,
because get_relative_concrete_offset returned false for cast_regions.

Regrtested on Linux x86_64.

2022-09-02  Tim Lange  

gcc/analyzer/ChangeLog:

* region.cc (cast_region::get_relative_concrete_offset):
New overloaded method.
* region.h: Add cast_region::get_relative_concrete_offset.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/fold-string-to-char.c: New test.

---
 gcc/analyzer/region.cc  | 10 ++
 gcc/analyzer/region.h   |  2 ++
 gcc/testsuite/gcc.dg/analyzer/fold-string-to-char.c |  8 
 3 files changed, 20 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/fold-string-to-char.c

diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index f4aba6b9c88..9c8279b130d 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -1556,6 +1556,16 @@ cast_region::dump_to_pp (pretty_printer *pp, bool 
simple) const
 }
 }
 
+/* Implementation of region::get_relative_concrete_offset vfunc
+   for cast_region.  */
+
+bool
+cast_region::get_relative_concrete_offset (bit_offset_t *out) const
+{
+  *out = (int) 0;
+  return true;
+}
+
 /* class heap_allocated_region : public region.  */
 
 /* Implementation of region::dump_to_pp vfunc for heap_allocated_region.  */
diff --git a/gcc/analyzer/region.h b/gcc/analyzer/region.h
index d37584b7285..34ce1fa1714 100644
--- a/gcc/analyzer/region.h
+++ b/gcc/analyzer/region.h
@@ -1087,6 +1087,8 @@ public:
   void accept (visitor *v) const final override;
   void dump_to_pp (pretty_printer *pp, bool simple) const final override;
 
+  bool get_relative_concrete_offset (bit_offset_t *out) const final override;
+
   const region *get_original_region () const { return m_original_region; }
 
 private:
diff --git a/gcc/testsuite/gcc.dg/analyzer/fold-string-to-char.c 
b/gcc/testsuite/gcc.dg/analyzer/fold-string-to-char.c
new file mode 100644
index 00000000000..46139216bba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/fold-string-to-char.c
@@ -0,0 +1,8 @@
+#include "analyzer-decls.h"
+
+void test_1 (void)
+{
+  char str[] = "Hello";
+  char *ptr = str;
+  __analyzer_eval (ptr[0] == 'H'); /* { dg-warning "TRUE" } */
+}
-- 
2.37.2



Re: [PATCH] libstdc++: Fix laziness of __and/or/not_

2022-09-02 Thread Patrick Palka via Gcc-patches
On Fri, 2 Sep 2022, Patrick Palka wrote:

> r13-2230-g390f94eee1ae69 redefined the internal logical operator traits
> __and_, __or_ and __not_ as alias templates that directly resolve to
> true_type or false_type.  But it turns out using an alias template here
> causes the traits to be less lazy than before because we now compute
> the logical result immediately upon _specialization_ of the trait, and
> not later upon _completion_ of the specialization.
> 
> Thus, for example, in
> 
>   using type = __and_<A, __not_<B>>;
> 
> we now compute the conjunction and thus instantiate A even though we're
> in a context that doesn't require completion of the __and_.  What's
> worse is that we now compute the negation and thus instantiate B as well
> (for the same reason), independent of the __and_ and the value of A!
> Thus the traits are now less lazy and composable than before.
> 
> Fortunately, the fix is cheap and simple: redefine these traits as class
> templates instead of as alias templates so that completion not
> specialization triggers computation of the logical result.  I added
> comprehensive short circuiting tests for these internal logical operator
> traits in short_circuit.cc guarded by __GLIBCXX__, not sure if
> that's the best place for them.  (Note that before this fix, assert #5
> and #10 guarded by __GLIBCXX__ would induce a hard error due to this
> bug).

FWIW this change doesn't seem to have a measurable compile time/memory
impact on the stress test from r13-2230.  For std/ranges/adaptors/join.cc,
memory usage increases by around 1% and compile time decreases by around
1%.

> 
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/type_traits (__or_, __and_, __not_): Redefine as a
>   class template instead of an alias template.
>   * testsuite/20_util/logical_traits/requirements/short_circuit.cc:
>   Add more tests for conjunction and disjunction.  Add corresponding
>   tests for __and_ and __or_v.
> ---
>  libstdc++-v3/include/std/type_traits  | 12 ++--
>  .../requirements/short_circuit.cc | 29 +++
>  2 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/libstdc++-v3/include/std/type_traits 
> b/libstdc++-v3/include/std/type_traits
> index 615791f29c8..2feb4b145c5 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -168,13 +168,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>// to either true_type or false_type which allows for a more efficient
>// implementation that avoids recursive class template instantiation.
>    template<typename... _Bn>
> -    using __or_ = decltype(__detail::__or_fn<_Bn...>(0));
> +    struct __or_
> +    : decltype(__detail::__or_fn<_Bn...>(0))
> +    { };
>
>    template<typename... _Bn>
> -    using __and_ = decltype(__detail::__and_fn<_Bn...>(0));
> +    struct __and_
> +    : decltype(__detail::__and_fn<_Bn...>(0))
> +    { };
>
>    template<typename _Pp>
> -    using __not_ = __bool_constant<!bool(_Pp::value)>;
> +    struct __not_
> +    : __bool_constant<!bool(_Pp::value)>
> +    { };
>/// @endcond
>  
>  #if __cplusplus >= 201703L
> diff --git 
> a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc 
> b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
> index 86996b27fa5..ff90f8a47c3 100644
> --- 
> a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
> +++ 
> b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
> @@ -14,6 +14,10 @@ static_assert(!std::conjunction_v invalid>);
>  static_assert(!std::conjunction_v);
>  static_assert(!std::conjunction_v);
>  static_assert(!std::conjunction_v invalid>);
> +static_assert(!std::conjunction_v +   std::conjunction,
> +   std::disjunction,
> +   std::negation>);
>  
>  // [meta.logical]/8: For a specialization disjunction, if
>  // there is a template type argument B_i for which bool(B_i::value) is true,
> @@ -24,3 +28,28 @@ static_assert(std::disjunction_v);
>  static_assert(std::disjunction_v);
>  static_assert(std::disjunction_v);
>  static_assert(std::disjunction_v invalid>);
> +static_assert(std::disjunction_v +  std::conjunction,
> +  std::disjunction,
> +  std::negation>);
> +
> +#if __GLIBCXX__
> +// Also test the corresponding internal traits __and_, __or_ and __not_.
> +static_assert(!std::__and_v);
> +static_assert(!std::__and_v);
> +static_assert(!std::__and_v);
> +static_assert(!std::__and_v invalid>);
> +static_assert(!std::__and_v + std::__and_,
> + std::__or_,
> + std::__not_>);
> +
> +static_assert(std::__or_v);
> +static_assert(std::__or_v);
> +static_assert(std::__or_v);
> +static_assert(std::__or_v invalid>);
> +static_assert(std::__or_v +   

[PATCH] libstdc++: Fix laziness of __and/or/not_

2022-09-02 Thread Patrick Palka via Gcc-patches
r13-2230-g390f94eee1ae69 redefined the internal logical operator traits
__and_, __or_ and __not_ as alias templates that directly resolve to
true_type or false_type.  But it turns out using an alias template here
causes the traits to be less lazy than before because we now compute
the logical result immediately upon _specialization_ of the trait, and
not later upon _completion_ of the specialization.

Thus, for example, in

  using type = __and_<A, __not_<B>>;

we now compute the conjunction and thus instantiate A even though we're
in a context that doesn't require completion of the __and_.  What's
worse is that we now compute the negation and thus instantiate B as well
(for the same reason), independent of the __and_ and the value of A!
Thus the traits are now less lazy and composable than before.

Fortunately, the fix is cheap and simple: redefine these traits as class
templates instead of as alias templates so that completion not
specialization triggers computation of the logical result.  I added
comprehensive short circuiting tests for these internal logical operator
traits in short_circuit.cc guarded by __GLIBCXX__, not sure if
that's the best place for them.  (Note that before this fix, assert #5
and #10 guarded by __GLIBCXX__ would induce a hard error due to this
bug).
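
The difference between the two forms can be sketched outside of libstdc++ as follows. This is only an illustration under stated assumptions: eager_not and lazy_not are hypothetical stand-ins for the alias-template and class-template forms of the negation trait, written with the public std::bool_constant (C++17), and "incomplete" is a type that is deliberately never defined so that any use of its ::value would be a hard error.

```cpp
#include <type_traits>

struct incomplete;  // deliberately never defined

// Alias-template form: naming eager_not<P> immediately needs P::value.
template<typename P>
  using eager_not = std::bool_constant<!bool(P::value)>;

// Class-template form: P::value is needed only once lazy_not<P> is completed.
template<typename P>
  struct lazy_not
  : std::bool_constant<!bool(P::value)>
  { };

// Merely naming the class-template specialization does not instantiate it.
using named_only = lazy_not<incomplete>;

// The alias form cannot even be named for such an argument:
// using broken = eager_not<incomplete>;  // error: incomplete type

// std::conjunction stops at the first false argument, so the lazy form
// composes with it without ever completing lazy_not<incomplete>.
static_assert(!std::conjunction_v<std::false_type, lazy_not<incomplete>>,
              "short circuit must not complete lazy_not<incomplete>");

// For well-formed arguments both forms give the same answer.
static_assert(eager_not<std::false_type>::value, "alias form");
static_assert(lazy_not<std::false_type>::value, "class form");

constexpr bool short_circuits()
{
  return !std::conjunction_v<std::false_type, lazy_not<incomplete>>;
}
```

Uncommenting the "broken" alias shows the eager behavior: the error appears at the point where the specialization is named, regardless of whether its value is ever used.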

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

libstdc++-v3/ChangeLog:

* include/std/type_traits (__or_, __and_, __not_): Redefine as a
class template instead of an alias template.
* testsuite/20_util/logical_traits/requirements/short_circuit.cc:
Add more tests for conjunction and disjunction.  Add corresponding
tests for __and_ and __or_v.
---
 libstdc++-v3/include/std/type_traits  | 12 ++--
 .../requirements/short_circuit.cc | 29 +++
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 615791f29c8..2feb4b145c5 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -168,13 +168,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // to either true_type or false_type which allows for a more efficient
   // implementation that avoids recursive class template instantiation.
   template<typename... _Bn>
-    using __or_ = decltype(__detail::__or_fn<_Bn...>(0));
+    struct __or_
+    : decltype(__detail::__or_fn<_Bn...>(0))
+    { };
 
   template<typename... _Bn>
-    using __and_ = decltype(__detail::__and_fn<_Bn...>(0));
+    struct __and_
+    : decltype(__detail::__and_fn<_Bn...>(0))
+    { };
 
   template<typename _Pp>
-    using __not_ = __bool_constant<!bool(_Pp::value)>;
+    struct __not_
+    : __bool_constant<!bool(_Pp::value)>
+    { };
   /// @endcond
 
 #if __cplusplus >= 201703L
diff --git 
a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc 
b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
index 86996b27fa5..ff90f8a47c3 100644
--- 
a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
+++ 
b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/short_circuit.cc
@@ -14,6 +14,10 @@ static_assert(!std::conjunction_v);
 static_assert(!std::conjunction_v);
 static_assert(!std::conjunction_v);
 static_assert(!std::conjunction_v);
+static_assert(!std::conjunction_v,
+ std::disjunction,
+ std::negation>);
 
 // [meta.logical]/8: For a specialization disjunction, if
 // there is a template type argument B_i for which bool(B_i::value) is true,
@@ -24,3 +28,28 @@ static_assert(std::disjunction_v);
 static_assert(std::disjunction_v);
 static_assert(std::disjunction_v);
 static_assert(std::disjunction_v);
+static_assert(std::disjunction_v,
+std::disjunction,
+std::negation>);
+
+#if __GLIBCXX__
+// Also test the corresponding internal traits __and_, __or_ and __not_.
+static_assert(!std::__and_v);
+static_assert(!std::__and_v);
+static_assert(!std::__and_v);
+static_assert(!std::__and_v);
+static_assert(!std::__and_v,
+   std::__or_,
+   std::__not_>);
+
+static_assert(std::__or_v);
+static_assert(std::__or_v);
+static_assert(std::__or_v);
+static_assert(std::__or_v);
+static_assert(std::__or_v,
+ std::__or_,
+ std::__not_>);
+#endif
-- 
2.37.2.490.g6c8e4ee870



[PATCH] Refactor RPO VN API to allow timevar tracking

2022-09-02 Thread Richard Biener via Gcc-patches
The following refactors things slightly so "utility" use of the RPO VN
machinery gets its own timevar when invoked from other passes.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
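
The shape of the refactoring can be sketched outside of GCC roughly as follows. Everything here is a hypothetical stand-in: `scoped_timer` plays the role of an RAII timer such as `auto_timevar`, and `run_worker`/`run_public` stand for the static worker and the single public entry point with default arguments. It is a minimal sketch of the pattern, not the actual implementation.

```cpp
#include <chrono>

// Hypothetical RAII timer: adds the elapsed time to an accumulator when
// it goes out of scope.
class scoped_timer
{
  using clock_type = std::chrono::steady_clock;
  double &acc_;
  clock_type::time_point start_;

public:
  explicit scoped_timer (double &acc)
    : acc_ (acc), start_ (clock_type::now ()) {}
  ~scoped_timer ()
  {
    acc_ += std::chrono::duration<double> (clock_type::now () - start_).count ();
  }
  scoped_timer (const scoped_timer &) = delete;
  scoped_timer &operator= (const scoped_timer &) = delete;
};

static double utility_seconds = 0.0;  // time attributed to "utility" use
static int timed_calls = 0;           // number of calls through the public API

// Untimed worker; a pass that has its own timer would call this directly.
static unsigned
run_worker (bool iterate, bool eliminate)
{
  return (iterate ? 1u : 0u) | (eliminate ? 2u : 0u);
}

// Single public entry point: defaults replace the removed overload, and
// the timer covers the whole call.
unsigned
run_public (bool iterate = false, bool eliminate = true)
{
  scoped_timer tv (utility_seconds);
  ++timed_calls;
  return run_worker (iterate, eliminate);
}
```

Callers that previously used the short overload keep working through the defaults, while callers with their own timing scope use the worker so time is not counted twice.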

* timevar.def (TV_TREE_RPO_VN): New.
* tree-ssa-sccvn.h (do_rpo_vn): Remove one overload.
* tree-ssa-sccvn.cc (do_rpo_vn_1): Rename the worker.
(do_rpo_vn): Unify the public API, track with TV_TREE_RPO_VN.
(pass_fre::execute): Adjust.
* tree-ssa-uninit.cc (execute_early_warn_uninitialized): Adjust.
---
 gcc/timevar.def|  1 +
 gcc/tree-ssa-sccvn.cc  | 28 +---
 gcc/tree-ssa-sccvn.h   |  8 ++--
 gcc/tree-ssa-uninit.cc |  5 +
 4 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/gcc/timevar.def b/gcc/timevar.def
index 651af19876f..eac4370431f 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -176,6 +176,7 @@ DEFTIMEVAR (TV_TREE_SPLIT_EDGES  , "tree split crit 
edges")
 DEFTIMEVAR (TV_TREE_REASSOC  , "tree reassociation")
 DEFTIMEVAR (TV_TREE_PRE , "tree PRE")
 DEFTIMEVAR (TV_TREE_FRE , "tree FRE")
+DEFTIMEVAR (TV_TREE_RPO_VN  , "tree RPO VN")
 DEFTIMEVAR (TV_TREE_SINK , "tree code sinking")
 DEFTIMEVAR (TV_TREE_PHIOPT  , "tree linearize phis")
 DEFTIMEVAR (TV_TREE_BACKPROP, "tree backward propagate")
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 5abc8667ce6..74b8d8d18ef 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -7290,14 +7290,14 @@ eliminate_with_rpo_vn (bitmap inserted_exprs)
   return walker.eliminate_cleanup ();
 }
 
-unsigned
-do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
-  bool iterate, bool eliminate, vn_lookup_kind kind);
+static unsigned
+do_rpo_vn_1 (function *fn, edge entry, bitmap exit_bbs,
+bool iterate, bool eliminate, vn_lookup_kind kind);
 
 void
 run_rpo_vn (vn_lookup_kind kind)
 {
-  do_rpo_vn (cfun, NULL, NULL, true, false, kind);
+  do_rpo_vn_1 (cfun, NULL, NULL, true, false, kind);
 
   /* ???  Prune requirement of these.  */
   constant_to_value_id = new hash_table (23);
@@ -7995,9 +7995,9 @@ do_unwind (unwind_state *to, rpo_elim )
executed and iterate.  If ELIMINATE is true then perform
elimination, otherwise leave that to the caller.  */
 
-unsigned
-do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
-  bool iterate, bool eliminate, vn_lookup_kind kind)
+static unsigned
+do_rpo_vn_1 (function *fn, edge entry, bitmap exit_bbs,
+bool iterate, bool eliminate, vn_lookup_kind kind)
 {
   unsigned todo = 0;
   default_vn_walk_kind = kind;
@@ -8415,12 +8415,18 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
 /* Region-based entry for RPO VN.  Performs value-numbering and elimination
on the SEME region specified by ENTRY and EXIT_BBS.  If ENTRY is not
the only edge into the region at ENTRY->dest PHI nodes in ENTRY->dest
-   are not considered.  */
+   are not considered.
+   If ITERATE is true then treat backedges optimistically as not
+   executed and iterate.  If ELIMINATE is true then perform
+   elimination, otherwise leave that to the caller.
+   KIND specifies the amount of work done for handling memory operations.  */
 
 unsigned
-do_rpo_vn (function *fn, edge entry, bitmap exit_bbs)
+do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
+  bool iterate, bool eliminate, vn_lookup_kind kind)
 {
-  unsigned todo = do_rpo_vn (fn, entry, exit_bbs, false, true, VN_WALKREWRITE);
+  auto_timevar tv (TV_TREE_RPO_VN);
+  unsigned todo = do_rpo_vn_1 (fn, entry, exit_bbs, iterate, eliminate, kind);
   free_rpo_vn ();
   return todo;
 }
@@ -8476,7 +8482,7 @@ pass_fre::execute (function *fun)
   if (iterate_p)
 loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
 
-  todo = do_rpo_vn (fun, NULL, NULL, iterate_p, true, VN_WALKREWRITE);
+  todo = do_rpo_vn_1 (fun, NULL, NULL, iterate_p, true, VN_WALKREWRITE);
   free_rpo_vn ();
 
   if (iterate_p)
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index a1b1e6bdd1e..abcf7e666c2 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -295,8 +295,12 @@ value_id_constant_p (unsigned int v)
 tree fully_constant_vn_reference_p (vn_reference_t);
 tree vn_nary_simplify (vn_nary_op_t);
 
-unsigned do_rpo_vn (function *, edge, bitmap, bool, bool, vn_lookup_kind);
-unsigned do_rpo_vn (function *, edge, bitmap);
+unsigned do_rpo_vn (function *, edge, bitmap,
+   /* iterate */ bool = false,
+   /* eliminate */ bool = true,
+   vn_lookup_kind = VN_WALKREWRITE);
+
+/* Private interface for PRE.  */
 void run_rpo_vn (vn_lookup_kind);
 unsigned eliminate_with_rpo_vn (bitmap);
 void free_rpo_vn (void);
diff --git a/gcc/tree-ssa-uninit.cc b/gcc/tree-ssa-uninit.cc
index c25fbe6381e..29dc48c4a29 100644
--- a/gcc/tree-ssa-uninit.cc
+++ b/gcc/tree-ssa-uninit.cc
@@ -1466,10 +1466,7 @@ 

Re: [PATCH 0/3] picolibc: Add picolibc linking help

2022-09-02 Thread Richard Sandiford via Gcc-patches
Keith Packard via Gcc-patches  writes:
> Picolibc is a C library for embedded systems based on code from newlib
> and avr libc. To connect some system-dependent picolibc functions
> (like stdio) to an underlying platform, the platform may provide an OS
> library.
>
> This OS library must follow the C library in the link command line. In
> current picolibc, that is done by providing an alternate .specs file
> which can rewrite the *lib spec to insert the OS library in the right
> spot.
>
> This patch series adds the ability to specify the OS library on the
> gcc command line when GCC is configured to use picolibc as the default
> C library, and then hooks that up for arm, nds32, riscv and sh targets.

Not really my area, but the approach LGTM FWIW.  Main question/points:

- In:

  +case "${with_default_libc}" in
  +glibc)
  +default_libc=LIBC_GLIBC
  +;;

  should there be a default case that raises an error for unrecognised
  libcs?  Command-line checking for configure isn't very tight, but we
  do raise similar errors for things like invalid --enable-threads values.

- I'm not sure either way about adding LIBC_NEWLIB.  On the one hand
  it makes sense for completeness, but on the other it's write-only.
  Adding it means that --with-default-libc=newlib toolchains have a
  different macro configuration from a default toolchain even in cases
  where newlib is the default.

  On balance I think it would be better to accept
  --with-default-libc=newlib but set default_libc to the empty string.

- Should we raise an error for toolchains that don't support the given
  C library?  It feels like we should, but I realise that could be
  difficult to do.

- Very minor, but in lines like:

  +#if defined(DEFAULT_LIBC) && defined(LIBC_PICOLIBC) && DEFAULT_LIBC == LIBC_PICOLIBC

  is LIBC_PICOLIBC ever undefined?  It looks like config.gcc provides
  an unconditional definition.  If it is always defined then:

  #if DEFAULT_LIBC == LIBC_PICOLIBC

  would be clearer.
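
For reference, the preprocessor behaviour behind that last point can be sketched as below. The numeric values and the macro `NOT_DEFINED_ANYWHERE` are assumptions made up for this example (the real values come from config.gcc); the sketch only relies on the rule that an identifier which is not a defined macro evaluates to 0 inside `#if`.

```cpp
// Hypothetical values, chosen only for illustration.
#define LIBC_GLIBC    1
#define LIBC_PICOLIBC 5
#define DEFAULT_LIBC  LIBC_PICOLIBC

// The simplified test: sufficient when LIBC_PICOLIBC is always defined
// to a nonzero value.
#if DEFAULT_LIBC == LIBC_PICOLIBC
constexpr bool default_is_picolibc = true;
#else
constexpr bool default_is_picolibc = false;
#endif

// An identifier that is not a macro is replaced by 0 in #if, so an
// undefined DEFAULT_LIBC would compare as 0 against the nonzero
// LIBC_PICOLIBC and the test would simply be false.
#if NOT_DEFINED_ANYWHERE == 0
constexpr bool undefined_macro_is_zero = true;
#else
constexpr bool undefined_macro_is_zero = false;
#endif
```

The `defined()` guards only change the outcome if both macros could be undefined (0 == 0) or if a libc constant were defined as 0.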

Thanks,
Richard

>
> Keith Packard (3):
>   Allow default libc to be specified to configure
>   Add newlib and picolibc as default C library choices
>   Add '--oslib=' option when default C library is picolibc
>
>  gcc/config.gcc| 56 ---
>  gcc/config/arm/elf.h  |  5 
>  gcc/config/nds32/elf.h|  4 +++
>  gcc/config/picolibc.opt   | 26 ++
>  gcc/config/riscv/elf.h|  4 +++
>  gcc/config/sh/embed-elf.h |  5 
>  gcc/configure.ac  |  4 +++
>  7 files changed, 95 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/config/picolibc.opt


[PATCH] tree-optimization/106809 - compile time hog in VN

2022-09-02 Thread Richard Biener via Gcc-patches
The dominated_by_p_w_unex function is prone to high compile time.
With GCC 12 we introduced a VN run for uninit diagnostics which now
runs into a degenerate case with bison generated code.  Fortunately
this case is easy to fix with a simple extra check - a more
general fix needs more work.
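
The extra check can be pictured with a much-simplified model, shown here as a sketch under stated assumptions: `edge_model` and `single_executable_successor` are hypothetical names, the back-edge condition of the real code is left out, and a plain vector stands in for the CFG successor list.

```cpp
#include <vector>

// Hypothetical, simplified edge: destination block index plus a flag
// saying whether the edge is considered executable.
struct edge_model
{
  int dest;
  bool executable;
};

// Return the destination of the unique executable successor edge, or -1
// if there is none or more than one.  The successor scan is only done
// when the block has more than one successor, mirroring the guard the
// patch adds.
int
single_executable_successor (const std::vector<edge_model> &succs)
{
  if (succs.size () <= 1)
    return -1;                  // cheap early-out, no scan needed

  int found = -1;
  for (const edge_model &e : succs)
    {
      if (!e.executable)
        continue;
      if (found != -1)
        return -1;              // second executable edge: not unique
      found = e.dest;
    }
  return found;
}
```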

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/106809
* tree-ssa-sccvn.cc (dominated_by_p_w_unex): Check we have
more than one successor before doing extra work.

* gcc.dg/torture/pr106809.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr106809.c | 28 
 gcc/tree-ssa-sccvn.cc   | 57 +
 2 files changed, 58 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr106809.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr106809.c 
b/gcc/testsuite/gcc.dg/torture/pr106809.c
new file mode 100644
index 000..11e158185cf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr106809.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wuninitialized" } */
+
+int foo (int x, int *val)
+{
+  switch (x)
+{
+#define C(n) \
+case n + 0: return *val; \
+case n + 1: return *val; \
+case n + 2: return *val; \
+case n + 3: return *val; \
+case n + 4: return *val; \
+case n + 5: return *val; \
+case n + 6: return *val; \
+case n + 7: return *val; \
+case n + 8: return *val; \
+case n + 9: return *val;
+#define C1(n) \
+C(n+00) C(n+10) C(n+20) C(n+30) C(n+40) \
+C(n+50) C(n+60) C(n+70) C(n+80) C(n+90)
+#define C10(n) \
+C1(n+000) C1(n+100) C1(n+200) C1(n+300) C1(n+400) \
+C1(n+500) C1(n+600) C1(n+700) C1(n+800) C1(n+900)
+C10(1000)
+}
+  return 0;
+}
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index a1f6f309609..5abc8667ce6 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -4877,41 +4877,44 @@ dominated_by_p_w_unex (basic_block bb1, basic_block 
bb2, bool allow_back)
 }
 
   /* Iterate to the single executable bb2 successor.  */
-  edge succe = NULL;
-  FOR_EACH_EDGE (e, ei, bb2->succs)
-if ((e->flags & EDGE_EXECUTABLE)
-   || (!allow_back && (e->flags & EDGE_DFS_BACK)))
-  {
-   if (succe)
- {
-   succe = NULL;
-   break;
- }
-   succe = e;
-  }
-  if (succe)
+  if (EDGE_COUNT (bb2->succs) > 1)
 {
-  /* Verify the reached block is only reached through succe.
-If there is only one edge we can spare us the dominator
-check and iterate directly.  */
-  if (EDGE_COUNT (succe->dest->preds) > 1)
-   {
- FOR_EACH_EDGE (e, ei, succe->dest->preds)
-   if (e != succe
-   && ((e->flags & EDGE_EXECUTABLE)
-   || (!allow_back && (e->flags & EDGE_DFS_BACK))))
+  edge succe = NULL;
+  FOR_EACH_EDGE (e, ei, bb2->succs)
+   if ((e->flags & EDGE_EXECUTABLE)
+   || (!allow_back && (e->flags & EDGE_DFS_BACK)))
+ {
+   if (succe)
  {
succe = NULL;
break;
  }
-   }
+   succe = e;
+ }
   if (succe)
{
- bb2 = succe->dest;
+ /* Verify the reached block is only reached through succe.
+If there is only one edge we can spare us the dominator
+check and iterate directly.  */
+ if (EDGE_COUNT (succe->dest->preds) > 1)
+   {
+ FOR_EACH_EDGE (e, ei, succe->dest->preds)
+   if (e != succe
+   && ((e->flags & EDGE_EXECUTABLE)
+   || (!allow_back && (e->flags & EDGE_DFS_BACK))))
+ {
+   succe = NULL;
+   break;
+ }
+   }
+ if (succe)
+   {
+ bb2 = succe->dest;
 
- /* Re-do the dominance check with changed bb2.  */
- if (dominated_by_p (CDI_DOMINATORS, bb1, bb2))
-   return true;
+ /* Re-do the dominance check with changed bb2.  */
+ if (dominated_by_p (CDI_DOMINATORS, bb1, bb2))
+   return true;
+   }
}
 }
 
-- 
2.35.3


[PATCH] Fortran: add IEEE_QUIET_* and IEEE_SIGNALING_* comparisons

2022-09-02 Thread FX via Gcc-patches
Hi,

These operations were added to Fortran 2018, and correspond to well-defined 
IEEE comparison operations, with defined signaling semantics for NaNs. All are 
implemented in terms of GCC expressions and built-ins, with no library support 
needed.

Bootstrapped and regtested on x86_64-linux, both 32- and 64-bit. Depends on a 
patch currently under review for the middle-end 
(https://gcc.gnu.org/pipermail/gcc-patches/2022-September/600840.html).

OK to commit?
FX




0001-Fortran-add-IEEE_QUIET_-and-IEEE_SIGNALING_-comparis.patch
Description: Binary data


Re: [PATCH] LoongArch: add -mdirect-extern-access option

2022-09-02 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-09-01 at 18:54 +0800, Xi Ruoyao wrote:
> We'd like to introduce a new codegen option to align with the old
> "-Wa,-mla-global-with-pcrel" and avoid a performance & size regression
> building the Linux kernel with new-reloc toolchain.  And it should be
> also useful for building statically linked executables, firmwares (EDK2
> for example), and other OS kernels.

Some news: getting rid of the GOT will also make the implementation of a
relocatable kernel easier, so I hope this can be reviewed quickly.

> OK for trunk?
> 
> -- >8 --
> 
> As a new target, LoongArch does not use copy relocation as it's
> problematic in some circumstances.  One bad consequence is we are
> emitting GOT for all accesses to all extern objects with default
> visibility.  The use of GOT is not needed in statically linked
> executables, OS kernels etc.  The GOT entry just wastes space, and the
> GOT access just slows down the execution in those environments.
> 
> Before -mexplicit-relocs, we used "-Wa,-mla-global-with-pcrel" to tell
> the assembler not to use GOT for extern access.  But with
> -mexplicit-relocs, we have to opt the logic in GCC.
> 
> The name "-mdirect-extern-access" is learnt from x86 port.
> 
> gcc/ChangeLog:
> 
> * config/loongarch/genopts/loongarch.opt.in: Add
> -mdirect-extern-access option.
> * config/loongarch/loongarch.opt: Regenerate.
> * config/loongarch/loongarch.cc (loongarch_classify_symbol):
> Don't use SYMBOL_GOT_DISP if TARGET_DIRECT_EXTERN_ACCESS.
> (loongarch_option_override_internal): Complain if
> -mdirect-extern-access is used with -fPIC or -fpic.
> * doc/invoke.texi: Document -mdirect-extern-access for
> LoongArch.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/loongarch/direct-extern-1.c: New test.
> * gcc.target/loongarch/direct-extern-2.c: New test.
> ---
>  gcc/config/loongarch/genopts/loongarch.opt.in |  4 
>  gcc/config/loongarch/loongarch.cc |  5 -
>  gcc/config/loongarch/loongarch.opt    |  4 
>  gcc/doc/invoke.texi   | 15
> +++
>  .../gcc.target/loongarch/direct-extern-1.c    |  6 ++
>  .../gcc.target/loongarch/direct-extern-2.c    |  6 ++
>  6 files changed, 39 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/direct-extern-
> 1.c
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/direct-extern-
> 2.c
> 
> diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in
> b/gcc/config/loongarch/genopts/loongarch.opt.in
> index ebdd9538d48..e10618777b2 100644
> --- a/gcc/config/loongarch/genopts/loongarch.opt.in
> +++ b/gcc/config/loongarch/genopts/loongarch.opt.in
> @@ -184,3 +184,7 @@ Enum(cmodel) String(@@STR_CMODEL_EXTREME@@)
> Value(CMODEL_EXTREME)
>  mcmodel=
>  Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel)
> Init(CMODEL_NORMAL)
>  Specify the code model.
> +
> +mdirect-extern-access
> +Target Var(TARGET_DIRECT_EXTERN_ACCESS) Init(0)
> +Avoid using the GOT to access external symbols.
> diff --git a/gcc/config/loongarch/loongarch.cc
> b/gcc/config/loongarch/loongarch.cc
> index 77e3a105390..2875fa5b0f3 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -1642,7 +1642,7 @@ loongarch_classify_symbol (const_rtx x)
>    if (SYMBOL_REF_TLS_MODEL (x))
>  return SYMBOL_TLS;
>  
> -  if (!loongarch_symbol_binds_local_p (x))
> +  if (!TARGET_DIRECT_EXTERN_ACCESS && !loongarch_symbol_binds_local_p
> (x))
>  return SYMBOL_GOT_DISP;
>  
>    tree t = SYMBOL_REF_DECL (x);
> @@ -6093,6 +6093,9 @@ loongarch_option_override_internal (struct
> gcc_options *opts)
>    if (loongarch_branch_cost == 0)
>  loongarch_branch_cost = loongarch_cost->branch_cost;
>  
> +  if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
> +    error ("%qs cannot be used for compiling a shared library",
> +  "-mdirect-extern-access");
>  
>    switch (la_target.cmodel)
>  {
> diff --git a/gcc/config/loongarch/loongarch.opt
> b/gcc/config/loongarch/loongarch.opt
> index 6395234218b..96c811c850b 100644
> --- a/gcc/config/loongarch/loongarch.opt
> +++ b/gcc/config/loongarch/loongarch.opt
> @@ -191,3 +191,7 @@ Enum(cmodel) String(extreme) Value(CMODEL_EXTREME)
>  mcmodel=
>  Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel)
> Init(CMODEL_NORMAL)
>  Specify the code model.
> +
> +mdirect-extern-access
> +Target Var(TARGET_DIRECT_EXTERN_ACCESS) Init(0)
> +Avoid using the GOT to access external symbols.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e5eb525a2c1..d4e86682827 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1016,6 +1016,7 @@ Objective-C and Objective-C++ Dialects}.
>  -memcpy  -mno-memcpy -mstrict-align -mno-strict-align @gol
>  -mmax-inline-memcpy-size=@var{n} @gol
>  -mexplicit-relocs -mno-explicit-relocs @gol
> +-mdirect-extern-access -mno-direct-extern-access 

Re: [PATCH] vect: Use better fallback costs in layout subpass

2022-09-02 Thread Richard Biener via Gcc-patches
On Fri, 2 Sep 2022, Richard Sandiford wrote:

> vect_optimize_slp_pass always treats the starting layout as valid,
> to avoid having to "optimise" when every possible choice is invalid.
> But it gives the starting layout a high cost if it seems like the
> target might reject it, in the hope that this will encourage other
> (valid) layouts.
> 
> The testcase for PR106787 showed that this was flawed, since it was
> triggering even in cases where the number of input lanes is different
> from the number of output lanes.  Picking such a high cost could also
> make costs for loop-invariant nodes overwhelm the costs for inner-loop
> nodes.
> 
> This patch makes the costing less aggressive by (a) restricting
> it to N-to-N permutations and (b) assigning the maximum cost of
> a permute.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
> 
> 
> gcc/
>   * tree-vect-slp.cc (vect_optimize_slp_pass::internal_node_cost):
>   Reduce the fallback cost to 1.  Only use it if the number of
>   input lanes is equal to the number of output lanes.
> 
> gcc/testsuite/
>   * gcc.dg/vect/bb-slp-layout-20.c: New test.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c | 33 
>  gcc/tree-vect-slp.cc | 40 +++-
>  2 files changed, 63 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c
> new file mode 100644
> index 000..ed7816b3f7b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fno-tree-loop-vectorize" } */
> +
> +extern int a[][4], b[][4], c[][4], d[4], e[4];
> +void f()
> +{
> +  int t0 = a[0][3];
> +  int t1 = a[1][3];
> +  int t2 = a[2][3];
> +  int t3 = a[3][3];
> +  int a0 = 0, a1 = 0, a2 = 0, a3 = 0, b0 = 0, b1 = 0, b2 = 0, b3 = 0;
> +  for (int i = 0; i < 400; i += 4)
> +{
> +  a0 += b[i][3] * t0;
> +  a1 += b[i][2] * t1;
> +  a2 += b[i][1] * t2;
> +  a3 += b[i][0] * t3;
> +  b0 += c[i][3] * t0;
> +  b1 += c[i][2] * t1;
> +  b2 += c[i][1] * t2;
> +  b3 += c[i][0] * t3;
> +}
> +  d[0] = a0;
> +  d[1] = a1;
> +  d[2] = a2;
> +  d[3] = a3;
> +  e[0] = b0;
> +  e[1] = b1;
> +  e[2] = b2;
> +  e[3] = b3;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "add new stmt: \[^\\n\\r\]* = 
> VEC_PERM_EXPR" 3 "slp1" { target { vect_int_mult && vect_perm } } } } */
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 59ec66a6f96..b10f69da133 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -4436,18 +4436,19 @@ change_vec_perm_layout (slp_tree node, 
> lane_permutation_t ,
>  
> IN_LAYOUT_I has no meaning for other types of node.
>  
> -   Keeping the node as-is is always valid.  If the target doesn't appear to
> -   support the node as-is then layout 0 has a high and arbitrary cost instead
> -   of being invalid.  On the one hand, this ensures that every node has at
> -   least one valid layout, avoiding what would otherwise be an awkward
> -   special case.  On the other, it still encourages the pass to change
> -   an invalid pre-existing layout choice into a valid one.  */
> +   Keeping the node as-is is always valid.  If the target doesn't appear
> +   to support the node as-is, but might realistically support other layouts,
> +   then layout 0 instead has the cost of a worst-case permutation.  On the
> +   one hand, this ensures that every node has at least one valid layout,
> +   avoiding what would otherwise be an awkward special case.  On the other,
> +   it still encourages the pass to change an invalid pre-existing layout
> +   choice into a valid one.  */
>  
>  int
>  vect_optimize_slp_pass::internal_node_cost (slp_tree node, int in_layout_i,
>   unsigned int out_layout_i)
>  {
> -  const int fallback_cost = 100;
> +  const int fallback_cost = 1;
>  
>if (SLP_TREE_CODE (node) == VEC_PERM_EXPR)
>  {
> @@ -4457,8 +4458,9 @@ vect_optimize_slp_pass::internal_node_cost (slp_tree 
> node, int in_layout_i,
>/* Check that the child nodes support the chosen layout.  Checking
>the first child is enough, since any second child would have the
>same shape.  */
> +  auto first_child = SLP_TREE_CHILDREN (node)[0];
>if (in_layout_i > 0
> -   && !is_compatible_layout (SLP_TREE_CHILDREN (node)[0], in_layout_i))
> +   && !is_compatible_layout (first_child, in_layout_i))
>   return -1;
>  
>change_vec_perm_layout (node, tmp_perm, in_layout_i, out_layout_i);
> @@ -4469,7 +4471,15 @@ vect_optimize_slp_pass::internal_node_cost (slp_tree 
> node, int in_layout_i,
>if (count < 0)
>   {
> if (in_layout_i == 0 && out_layout_i == 0)
> - return fallback_cost;
> 

Re: [PATCH] vect: Ensure SLP nodes don't end up in multiple BB partitions [PR106787]

2022-09-02 Thread Richard Biener via Gcc-patches
On Fri, 2 Sep 2022, Richard Sandiford wrote:

> In the PR we have two REDUC_PLUS SLP instances that share a common
> load of stride 4.  Each instance also has a unique contiguous load.
> 
> Initially all three loads are out of order, so have a nontrivial
> load permutation.  The layout pass puts them in order instead,
> For the two contiguous loads it is possible to do this by adjusting the
> SLP_LOAD_PERMUTATION to be { 0, 1, 2, 3 }.  But a SLP_LOAD_PERMUTATION
> of { 0, 4, 8, 12 } is rejected as unsupported, so the pass creates a
> separate VEC_PERM_EXPR instead.
> 
> Later the 4-stride load's initial SLP_LOAD_PERMUTATION is rejected too,
> so that the load gets replaced by an external node built from scalars.
> We then have an external node feeding a VEC_PERM_EXPR.
> 
> VEC_PERM_EXPRs created in this way do not have any associated
> SLP_TREE_SCALAR_STMTS.  This means that they do not affect the
> decision about which nodes should be in which subgraph for costing
> purposes.  If the VEC_PERM_EXPR is fed by a vect_external_def,
> then the VEC_PERM_EXPR's input doesn't affect that decision either.
> 
> The net effect is that a shared VEC_PERM_EXPR fed by an external def
> can appear in more than one subgraph.  This triggered an ICE in
> vect_schedule_node, which (rightly) expects to be called no more
> than once for the same internal def.
> 
> There seemed to be many possible fixes, including:
> 
> (1) Replace unsupported loads with external defs *before* doing
> the layout optimisation.  This would avoid the need for the
> VEC_PERM_EXPR altogether.
>
> (2) If the target doesn't support a load in its original layout,
> stop the layout optimisation from checking whether the target
> supports loads in any new candidate layout.  In other words,
> treat all layouts as if they were supported whenever the
> original layout is not in fact supported.
> 
> I'd rather not do this.  In principle, the layout optimisation
> could convert an unsupported layout to a supported one.
> Selectively ignoring target support would work against that.
> 
> We could try to look specifically for loads that will need
> to be decomposed, but that just seems like admitting that
> things are happening in the wrong order.
> 
> (3) Add SLP_TREE_SCALAR_STMTS to VEC_PERM_EXPRs.
> 
> That would be OK for this case, but wouldn't be possible
> for external defs that represent existing vectors.

In general it's good to provide SLP_TREE_SCALAR_STMTS when we
can, but yes, that's not a fix for the actual problem.

> (4) Make vect_schedule_slp share SCC info between subgraphs.
> 
> It feels like that's working around the partitioning problem
> rather than a real fix though.
> 
> (5) Directly ensure that internal def nodes belong to a single
> subgraph.
> 
> (1) is probably the best long-term fix, but (5) is much simpler.
> The subgraph partitioning code already has a hash set to record
> which nodes have been visited; we just need to convert that to a
> map from nodes to instances instead.

Agreed.
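
Option (5) can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the code from the patch: nodes and instances are plain integers, `children` and `roots` are hypothetical inputs, and a small union-find stands in for whatever the real code uses to merge instances into one partition.

```cpp
#include <cstddef>
#include <numeric>
#include <unordered_map>
#include <vector>

// Minimal union-find over instance indices.
struct partitions
{
  std::vector<int> leader;

  explicit partitions (std::size_t n) : leader (n)
  {
    std::iota (leader.begin (), leader.end (), 0);
  }

  int find (int i)
  {
    while (leader[i] != i)
      i = leader[i] = leader[leader[i]];  // path halving
    return i;
  }

  void merge (int a, int b) { leader[find (a)] = find (b); }
};

// children[n] lists the child nodes of node n; roots[k] is the root node
// of instance k.  Returns, for each instance, the partition it ends up in.
// A map from node to the instance that first reached it replaces a plain
// "visited" set: reaching a node again from another instance merges the
// two partitions, so no node belongs to more than one.
std::vector<int>
partition_instances (const std::vector<std::vector<int>> &children,
                     const std::vector<int> &roots)
{
  partitions parts (roots.size ());
  std::unordered_map<int, int> node_to_instance;

  for (int k = 0; k < static_cast<int> (roots.size ()); ++k)
    {
      std::vector<int> stack { roots[k] };
      while (!stack.empty ())
        {
          int n = stack.back ();
          stack.pop_back ();
          auto [it, inserted] = node_to_instance.emplace (n, k);
          if (!inserted)
            {
              parts.merge (it->second, k);  // shared node: same partition
              continue;                     // its subtree was already walked
            }
          for (int c : children[n])
            stack.push_back (c);
        }
    }

  std::vector<int> result (roots.size ());
  for (int k = 0; k < static_cast<int> (roots.size ()); ++k)
    result[k] = parts.find (k);
  return result;
}
```

With two instances that share one node and a third that shares nothing, the first two land in the same partition and the third stays separate.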

> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
> 
> 
> gcc/
>   PR tree-optimization/106787
>   * tree-vect-slp.cc (vect_map_to_instance): New function, split out
>   from...
>   (vect_bb_partition_graph_r): ...here.  Replace the visited set
>   with a map from nodes to instances.  Ensure that a node only
>   appears in one partition.
>   (vect_bb_partition_graph): Update accordingly.
> 
> gcc/testsuite/
>   * gcc.dg/vect/bb-slp-layout-19.c: New test.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c | 34 ++
>  gcc/tree-vect-slp.cc | 69 
>  2 files changed, 77 insertions(+), 26 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c
> new file mode 100644
> index 000..f075a83a25b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c
> @@ -0,0 +1,34 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fno-tree-loop-vectorize" } */
> +
> +extern int a[][4], b[][4], c[][4], d[4], e[4];
> +void f()
> +{
> +  int t0 = a[0][3];
> +  int t1 = a[1][3];
> +  int t2 = a[2][3];
> +  int t3 = a[3][3];
> +  int a0 = 0, a1 = 0, a2 = 0, a3 = 0, b0 = 0, b1 = 0, b2 = 0, b3 = 0;
> +  for (int j = 0; j < 100; ++j)
> +for (int i = 0; i < 400; i += 4)
> +  {
> + a0 += b[i][3] * t0;
> + a1 += b[i][2] * t1;
> + a2 += b[i][1] * t2;
> + a3 += b[i][0] * t3;
> + b0 += c[i][3] * t0;
> + b1 += c[i][2] * t1;
> + b2 += c[i][1] * t2;
> + b3 += c[i][0] * t3;
> +  }
> +  d[0] = a0;
> +  d[1] = a1;
> +  d[2] = a2;
> +  d[3] = a3;
> +  e[0] = b0;
> +  e[1] = b1;
> +  e[2] = b2;
> +  e[3] = b3;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "add new stmt: \[^\\n\\r\]* = 

[PATCH v2 1/2] xtensa: Eliminate unused stack frame allocation/freeing

2022-09-02 Thread Takayuki 'January June' Suwa via Gcc-patches
Changes from v1:
  (xtensa_expand_epilogue): Fixed a failure to consider
hard_frame_pointer_rtx when sharing code.

---
In the example below, 'x' is once placed on the stack frame and then read
into registers as the argument value of bar():

/* example */
struct foo {
  int a, b;
};
extern struct foo bar(struct foo);
struct foo test(void) {
  struct foo x = { 0, 1 };
  return bar(x);
}

Thanks to the dead store elimination, the initialization of 'x' turns into
merely loading the immediates to registers, but corresponding stack frame
growth is not rolled back.  As a result:

;; prereq: the CALL0 ABI
;; before
test:
addi    sp, sp, -16 // unused stack frame allocation/freeing
movi.n  a2, 0
movi.n  a3, 1
addi    sp, sp, 16  // because no instructions refer to
j.l     bar, a9     // the stack pointer between the two

This patch eliminates such unused stack frame allocation/freeing:

;; after
test:
movi.n  a2, 0
movi.n  a3, 1
j.l bar, a9

gcc/ChangeLog:

* config/xtensa/xtensa.cc (machine_function): New member to track
the insns for stack pointer adjustment inside of the pro/epilogue.
(xtensa_emit_adjust_stack_ptr): New function to share the common
codes and to record the insns for stack pointer adjustment.
(xtensa_expand_prologue): Change to use the function mentioned
above when using the CALL0 ABI.
(xtensa_expand_epilogue): Ditto.
And also change to cancel emitting the insns for the stack pointer
adjustment if only used for its own.
---
 gcc/config/xtensa/xtensa.cc | 230 ++--
 1 file changed, 118 insertions(+), 112 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index b673b6764da..17416fc6c3f 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -102,6 +102,7 @@ struct GTY(()) machine_function
   int callee_save_size;
   bool frame_laid_out;
   bool epilogue_done;
+  hash_set *logues_a1_adjusts;
 };
 
 /* Vector, indexed by hard register number, which contains 1 for a
@@ -3048,7 +3049,7 @@ xtensa_output_literal (FILE *file, rtx x, machine_mode 
mode, int labelno)
 }
 
 static bool
-xtensa_call_save_reg(int regno)
+xtensa_call_save_reg (int regno)
 {
   if (TARGET_WINDOWED_ABI)
 return false;
@@ -3084,7 +3085,7 @@ compute_frame_size (poly_int64 size)
   cfun->machine->callee_save_size = 0;
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
 {
-  if (xtensa_call_save_reg(regno))
+  if (xtensa_call_save_reg (regno))
cfun->machine->callee_save_size += UNITS_PER_WORD;
 }
 
@@ -3143,6 +3144,51 @@ xtensa_initial_elimination_offset (int from, int to 
ATTRIBUTE_UNUSED)
and the total number of words must be a multiple of 128 bits.  */
 #define MIN_FRAME_SIZE (8 * UNITS_PER_WORD)
 
+#define ADJUST_SP_NONE  0x0
+#define ADJUST_SP_NEED_NOTE 0x1
+#define ADJUST_SP_FRAME_PTR 0x2
+static rtx_insn *
+xtensa_emit_adjust_stack_ptr (HOST_WIDE_INT offset, int flags)
+{
+  rtx_insn *insn;
+  rtx ptr = (flags & ADJUST_SP_FRAME_PTR) ? hard_frame_pointer_rtx
+ : stack_pointer_rtx;
+
+  if (xtensa_simm8 (offset)
+  || xtensa_simm8x256 (offset))
+insn = emit_insn (gen_addsi3 (stack_pointer_rtx, ptr, GEN_INT (offset)));
+  else
+{
+  rtx tmp_reg = gen_rtx_REG (Pmode, A9_REG);
+  rtx_insn* tmp_insn;
+
+  if (offset < 0)
+   {
+ tmp_insn = emit_move_insn (tmp_reg, GEN_INT (-offset));
+ insn = emit_insn (gen_subsi3 (stack_pointer_rtx, ptr, tmp_reg));
+   }
+  else
+   {
+ tmp_insn = emit_move_insn (tmp_reg, GEN_INT (offset));
+ insn = emit_insn (gen_addsi3 (stack_pointer_rtx, ptr, tmp_reg));
+   }
+  cfun->machine->logues_a1_adjusts->add (tmp_insn);
+}
+
+  if (flags & ADJUST_SP_NEED_NOTE)
+{
+  rtx note_rtx = gen_rtx_SET (stack_pointer_rtx,
+ plus_constant (Pmode, stack_pointer_rtx,
+offset));
+
+  RTX_FRAME_RELATED_P (insn) = 1;
+  add_reg_note (insn, REG_FRAME_RELATED_EXPR, note_rtx);
+}
+
+  cfun->machine->logues_a1_adjusts->add (insn);
+  return insn;
+}
+
 void
 xtensa_expand_prologue (void)
 {
@@ -3175,16 +3221,13 @@ xtensa_expand_prologue (void)
   HOST_WIDE_INT offset = 0;
   int callee_save_size = cfun->machine->callee_save_size;
 
+  cfun->machine->logues_a1_adjusts = new hash_set;
+
   /* -128 is a limit of single addi instruction. */
   if (IN_RANGE (total_size, 1, 128))
{
- insn = emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
-   GEN_INT (-total_size)));
- RTX_FRAME_RELATED_P (insn) = 1;
- note_rtx = gen_rtx_SET (stack_pointer_rtx,
-  

Re: [PATCH v4] RISC-V: Add support for inlining subword atomic operations

2022-09-02 Thread Kito Cheng via Gcc-patches
LGTM with minor comments, it's time to move forward, thanks Patrick and Palmer.

> +
> +void
> +riscv_subword_address (rtx mem, rtx *aligned_mem, rtx *shift, rtx *mask,
> +  rtx *not_mask)
> +{
> +  /* Align the memory address to a word.  */
> +  rtx addr = force_reg (Pmode, XEXP (mem, 0));
> +
> +  rtx aligned_addr = gen_reg_rtx (Pmode);
> +  emit_move_insn (aligned_addr,  gen_rtx_AND (Pmode, addr,
> + gen_int_mode (-4, Pmode)));
> +
> +  *aligned_mem = change_address (mem, SImode, aligned_addr);
> +
> +  /* Calculate the shift amount.  */
> +  *shift = gen_reg_rtx (SImode);

Already allocated reg_rtx outside, this line could be removed.

> +  emit_move_insn (*shift, gen_rtx_AND (SImode, gen_lowpart (SImode, addr),
> + gen_int_mode (3, SImode)));
> +  emit_move_insn (*shift, gen_rtx_ASHIFT (SImode, *shift,
> +gen_int_mode(3, SImode)));
> +
> +  /* Calculate the mask.  */
> +  int unshifted_mask;
> +  if (GET_MODE (mem) == QImode)
> +unshifted_mask = 0xFF;
> +  else
> +unshifted_mask = 0xFFFF;
> +
> +  rtx mask_reg = gen_reg_rtx (SImode);

Ditto.

> @@ -152,6 +348,128 @@
>DONE;
>  })
>
> +(define_expand "atomic_compare_and_swap"
> +  [(match_operand:SI 0 "register_operand" "");; bool output
> +   (match_operand:SHORT 1 "register_operand" "") ;; val output
> +   (match_operand:SHORT 2 "memory_operand" "")   ;; memory
> +   (match_operand:SHORT 3 "reg_or_0_operand" "") ;; expected value
> +   (match_operand:SHORT 4 "reg_or_0_operand" "") ;; desired value
> +   (match_operand:SI 5 "const_int_operand" "")   ;; is_weak
> +   (match_operand:SI 6 "const_int_operand" "")   ;; mod_s
> +   (match_operand:SI 7 "const_int_operand" "")]  ;; mod_f
> +  "TARGET_ATOMIC && TARGET_INLINE_SUBWORD_ATOMIC"
> +{
> +  emit_insn (gen_atomic_cas_value_strong (operands[1], operands[2],
> +   operands[3], operands[4],
> +   operands[6], operands[7]));
> +
> +  rtx val = gen_reg_rtx (SImode);
> +  if (operands[1] != const0_rtx)
> +emit_insn (gen_rtx_SET (val, gen_rtx_SIGN_EXTEND (SImode, operands[1])));
> +  else
> +emit_insn (gen_rtx_SET (val, const0_rtx));

nit: emit_move_insn rather than emit_insn + gen_rtx_SET

> +
> +  rtx exp = gen_reg_rtx (SImode);
> +  if (operands[3] != const0_rtx)
> +emit_insn (gen_rtx_SET (exp, gen_rtx_SIGN_EXTEND (SImode, operands[3])));
> +  else
> +emit_insn (gen_rtx_SET (exp, const0_rtx));

nit: emit_move_insn rather than emit_insn + gen_rtx_SET

> +
> +  rtx compare = val;
> +  if (exp != const0_rtx)
> +{
> +  rtx difference = gen_rtx_MINUS (SImode, val, exp);
> +  compare = gen_reg_rtx (SImode);
> +  emit_insn (gen_rtx_SET (compare, difference));

nit: emit_move_insn rather than emit_insn + gen_rtx_SET

> +}
> +
> +  if (word_mode != SImode)
> +{
> +  rtx reg = gen_reg_rtx (word_mode);
> +  emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));

nit: emit_move_insn rather than emit_insn + gen_rtx_SET


> +  compare = reg;
> +}
> +
> +  emit_insn (gen_rtx_SET (operands[0], gen_rtx_EQ (SImode, compare, const0_rtx)));

nit: emit_move_insn rather than emit_insn + gen_rtx_SET
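
For readers following the review, the address arithmetic that the quoted riscv_subword_address builds as RTL can be sketched on plain integers. This is only an illustration, not code from the patch: the struct and function names are hypothetical, little-endian byte order is assumed, and the halfword mask is assumed to be 0xFFFF.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: a plain-integer analogue of the values that the quoted
// riscv_subword_address computes.  All names here are hypothetical.
struct SubwordAccess {
  std::uintptr_t aligned_addr;  // address rounded down to a 4-byte boundary
  unsigned shift;               // bit offset of the subword within the word
  std::uint32_t mask;           // selects the subword bits
  std::uint32_t not_mask;       // selects every other bit of the word
};

// size_bytes is 1 (QImode) or 2 (HImode).  Assumes little-endian byte order.
inline SubwordAccess subword_access(std::uintptr_t addr, unsigned size_bytes) {
  assert(size_bytes == 1 || size_bytes == 2);
  SubwordAccess r;
  r.aligned_addr = addr & ~static_cast<std::uintptr_t>(3);  // addr & -4
  r.shift = static_cast<unsigned>(addr & 3) << 3;           // (addr & 3) * 8
  std::uint32_t unshifted = (size_bytes == 1) ? 0xFFu : 0xFFFFu;
  r.mask = unshifted << r.shift;
  r.not_mask = ~r.mask;
  return r;
}
```

For example, a byte at address 0x1002 lives in the word at 0x1000, shifted left by 16 bits, so the mask is 0x00FF0000.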


[PATCH] vect: Use better fallback costs in layout subpass

2022-09-02 Thread Richard Sandiford via Gcc-patches
vect_optimize_slp_pass always treats the starting layout as valid,
to avoid having to "optimise" when every possible choice is invalid.
But it gives the starting layout a high cost if it seems like the
target might reject it, in the hope that this will encourage other
(valid) layouts.

The testcase for PR106787 showed that this was flawed, since it was
triggering even in cases where the number of input lanes is different
from the number of output lanes.  Picking such a high cost could also
make costs for loop-invariant nodes overwhelm the costs for inner-loop
nodes.

This patch makes the costing less aggressive by (a) restricting
it to N-to-N permutations and (b) assigning the maximum cost of
a permute.
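
The decision described in (a) and (b) can be sketched roughly as follows. This is a hypothetical stand-in rather than the code in tree-vect-slp.cc: the function name and parameters are invented for illustration, and the values returned for a supported permutation and for the unequal-lane case are assumptions.

```cpp
#include <cassert>

// Hypothetical stand-in for the costing decision; not GCC code.
// permute_count < 0 models "the target does not support this permutation".
// Returns -1 for "layout is invalid", otherwise a cost.
constexpr int kFallbackCost = 1;  // the patch lowers this from 100

inline int layout_cost_sketch(int permute_count, int in_layout, int out_layout,
                              unsigned in_lanes, unsigned out_lanes) {
  assert(in_layout >= 0 && out_layout >= 0);
  if (permute_count >= 0)
    return permute_count;   // ASSUMPTION: supported permutes cost their count
  if (in_layout != 0 || out_layout != 0)
    return -1;              // unsupported and not the starting layout
  // The starting layout (layout 0) is always treated as valid.
  if (in_lanes == out_lanes)
    return kFallbackCost;   // N-to-N: mildly discourage keeping it as-is
  return 0;                 // ASSUMPTION: placeholder for the N-to-M case
}
```

With a fallback of 1, a loop-invariant node that keeps its starting layout can no longer swamp the costs of inner-loop nodes the way a fallback of 100 could.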

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* tree-vect-slp.cc (vect_optimize_slp_pass::internal_node_cost):
Reduce the fallback cost to 1.  Only use it if the number of
input lanes is equal to the number of output lanes.

gcc/testsuite/
* gcc.dg/vect/bb-slp-layout-20.c: New test.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c | 33 
 gcc/tree-vect-slp.cc | 40 +++-
 2 files changed, 63 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c
new file mode 100644
index 000..ed7816b3f7b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-20.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+
+extern int a[][4], b[][4], c[][4], d[4], e[4];
+void f()
+{
+  int t0 = a[0][3];
+  int t1 = a[1][3];
+  int t2 = a[2][3];
+  int t3 = a[3][3];
+  int a0 = 0, a1 = 0, a2 = 0, a3 = 0, b0 = 0, b1 = 0, b2 = 0, b3 = 0;
+  for (int i = 0; i < 400; i += 4)
+{
+  a0 += b[i][3] * t0;
+  a1 += b[i][2] * t1;
+  a2 += b[i][1] * t2;
+  a3 += b[i][0] * t3;
+  b0 += c[i][3] * t0;
+  b1 += c[i][2] * t1;
+  b2 += c[i][1] * t2;
+  b3 += c[i][0] * t3;
+}
+  d[0] = a0;
+  d[1] = a1;
+  d[2] = a2;
+  d[3] = a3;
+  e[0] = b0;
+  e[1] = b1;
+  e[2] = b2;
+  e[3] = b3;
+}
+
+/* { dg-final { scan-tree-dump-times "add new stmt: \[^\\n\\r\]* = VEC_PERM_EXPR" 3 "slp1" { target { vect_int_mult && vect_perm } } } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 59ec66a6f96..b10f69da133 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4436,18 +4436,19 @@ change_vec_perm_layout (slp_tree node, 
lane_permutation_t ,
 
IN_LAYOUT_I has no meaning for other types of node.
 
-   Keeping the node as-is is always valid.  If the target doesn't appear to
-   support the node as-is then layout 0 has a high and arbitrary cost instead
-   of being invalid.  On the one hand, this ensures that every node has at
-   least one valid layout, avoiding what would otherwise be an awkward
-   special case.  On the other, it still encourages the pass to change
-   an invalid pre-existing layout choice into a valid one.  */
+   Keeping the node as-is is always valid.  If the target doesn't appear
+   to support the node as-is, but might realistically support other layouts,
+   then layout 0 instead has the cost of a worst-case permutation.  On the
+   one hand, this ensures that every node has at least one valid layout,
+   avoiding what would otherwise be an awkward special case.  On the other,
+   it still encourages the pass to change an invalid pre-existing layout
+   choice into a valid one.  */
 
 int
 vect_optimize_slp_pass::internal_node_cost (slp_tree node, int in_layout_i,
unsigned int out_layout_i)
 {
-  const int fallback_cost = 100;
+  const int fallback_cost = 1;
 
   if (SLP_TREE_CODE (node) == VEC_PERM_EXPR)
 {
@@ -4457,8 +4458,9 @@ vect_optimize_slp_pass::internal_node_cost (slp_tree 
node, int in_layout_i,
   /* Check that the child nodes support the chosen layout.  Checking
 the first child is enough, since any second child would have the
 same shape.  */
+  auto first_child = SLP_TREE_CHILDREN (node)[0];
   if (in_layout_i > 0
- && !is_compatible_layout (SLP_TREE_CHILDREN (node)[0], in_layout_i))
+ && !is_compatible_layout (first_child, in_layout_i))
return -1;
 
   change_vec_perm_layout (node, tmp_perm, in_layout_i, out_layout_i);
@@ -4469,7 +4471,15 @@ vect_optimize_slp_pass::internal_node_cost (slp_tree 
node, int in_layout_i,
   if (count < 0)
{
  if (in_layout_i == 0 && out_layout_i == 0)
-   return fallback_cost;
+   {
+ /* Use the fallback cost if the node could in principle support
+some nonzero layout for both the inputs and the outputs.
+Otherwise assume that the node will be rejected later
+and rebuilt from scalars.  */
+

[PATCH] vect: Ensure SLP nodes don't end up in multiple BB partitions [PR106787]

2022-09-02 Thread Richard Sandiford via Gcc-patches
In the PR we have two REDUC_PLUS SLP instances that share a common
load of stride 4.  Each instance also has a unique contiguous load.

Initially all three loads are out of order, so have a nontrivial
load permutation.  The layout pass puts them in order instead,
For the two contiguous loads it is possible to do this by adjusting the
SLP_LOAD_PERMUTATION to be { 0, 1, 2, 3 }.  But a SLP_LOAD_PERMUTATION
of { 0, 4, 8, 12 } is rejected as unsupported, so the pass creates a
separate VEC_PERM_EXPR instead.

Later the 4-stride load's initial SLP_LOAD_PERMUTATION is rejected too,
so that the load gets replaced by an external node built from scalars.
We then have an external node feeding a VEC_PERM_EXPR.

VEC_PERM_EXPRs created in this way do not have any associated
SLP_TREE_SCALAR_STMTS.  This means that they do not affect the
decision about which nodes should be in which subgraph for costing
purposes.  If the VEC_PERM_EXPR is fed by a vect_external_def,
then the VEC_PERM_EXPR's input doesn't affect that decision either.

The net effect is that a shared VEC_PERM_EXPR fed by an external def
can appear in more than one subgraph.  This triggered an ICE in
vect_schedule_node, which (rightly) expects to be called no more
than once for the same internal def.

There seemed to be many possible fixes, including:

(1) Replace unsupported loads with external defs *before* doing
the layout optimisation.  This would avoid the need for the
VEC_PERM_EXPR altogether.

(2) If the target doesn't support a load in its original layout,
stop the layout optimisation from checking whether the target
supports loads in any new candidate layout.  In other words,
treat all layouts as if they were supported whenever the
original layout is not in fact supported.

I'd rather not do this.  In principle, the layout optimisation
could convert an unsupported layout to a supported one.
Selectively ignoring target support would work against that.

We could try to look specifically for loads that will need
to be decomposed, but that just seems like admitting that
things are happening in the wrong order.

(3) Add SLP_TREE_SCALAR_STMTS to VEC_PERM_EXPRs.

That would be OK for this case, but wouldn't be possible
for external defs that represent existing vectors.

(4) Make vect_schedule_slp share SCC info between subgraphs.

It feels like that's working around the partitioning problem
rather than a real fix though.

(5) Directly ensure that internal def nodes belong to a single
subgraph.

(1) is probably the best long-term fix, but (5) is much simpler.
The subgraph partitioning code already has a hash set to record
which nodes have been visited; we just need to convert that to a
map from nodes to instances instead.
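
As a rough picture of approach (5), the sketch below replaces a visited set with a node-to-instance map and merges two instances whenever they reach the same node. It is a simplified illustration, not the patch itself: nodes and instances are plain integers, and the Partitioner type and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical miniature of the bookkeeping; nodes and instances are ints.
struct Partitioner {
  std::unordered_map<int, int> node_to_instance;  // replaces the visited set
  std::unordered_map<int, int> leader;            // instance -> merged-into

  // Follow the leader chain to the instance that represents the partition.
  int ultimate_leader(int instance) const {
    std::size_t steps = 0;
    auto it = leader.find(instance);
    while (it != leader.end()) {
      ++steps;
      assert(steps <= leader.size());  // the leader chain must be acyclic
      instance = it->second;
      it = leader.find(instance);
    }
    return instance;
  }

  // Record that `instance` reaches `node`.  Returns true if the node had
  // already been visited; in that case the two instances are merged, so the
  // node still belongs to exactly one partition.
  bool map_to_instance(int node, int instance) {
    auto res = node_to_instance.emplace(node, instance);
    if (res.second)
      return false;                    // first visit: nothing to merge
    int a = ultimate_leader(res.first->second);
    int b = ultimate_leader(instance);
    if (a != b)
      leader[a] = b;                   // merge the two partitions
    return true;
  }
};
```

A plain visited set can only say that a node was seen before; the map also says by which instance, which is what allows the second visitor to join the first visitor's partition instead of duplicating the node.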

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
PR tree-optimization/106787
* tree-vect-slp.cc (vect_map_to_instance): New function, split out
from...
(vect_bb_partition_graph_r): ...here.  Replace the visited set
with a map from nodes to instances.  Ensure that a node only
appears in one partition.
(vect_bb_partition_graph): Update accordingly.

gcc/testsuite/
* gcc.dg/vect/bb-slp-layout-19.c: New test.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c | 34 ++
 gcc/tree-vect-slp.cc | 69 
 2 files changed, 77 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c
new file mode 100644
index 000..f075a83a25b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-19.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+
+extern int a[][4], b[][4], c[][4], d[4], e[4];
+void f()
+{
+  int t0 = a[0][3];
+  int t1 = a[1][3];
+  int t2 = a[2][3];
+  int t3 = a[3][3];
+  int a0 = 0, a1 = 0, a2 = 0, a3 = 0, b0 = 0, b1 = 0, b2 = 0, b3 = 0;
+  for (int j = 0; j < 100; ++j)
+for (int i = 0; i < 400; i += 4)
+  {
+   a0 += b[i][3] * t0;
+   a1 += b[i][2] * t1;
+   a2 += b[i][1] * t2;
+   a3 += b[i][0] * t3;
+   b0 += c[i][3] * t0;
+   b1 += c[i][2] * t1;
+   b2 += c[i][1] * t2;
+   b3 += c[i][0] * t3;
+  }
+  d[0] = a0;
+  d[1] = a1;
+  d[2] = a2;
+  d[3] = a3;
+  e[0] = b0;
+  e[1] = b1;
+  e[2] = b2;
+  e[3] = b3;
+}
+
+/* { dg-final { scan-tree-dump-times "add new stmt: \[^\\n\\r\]* = VEC_PERM_EXPR" 3 "slp1" { target { vect_int_mult && vect_perm } } } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 1cf79eee4a6..59ec66a6f96 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6435,47 +6435,64 @@ get_ultimate_leader (slp_instance instance,
   return instance;
 }
 
+namespace {
+/* Subroutine of vect_bb_partition_graph_r.  Map KEY to INSTANCE in
+   

Re: [PATCH 0/2] New target hook TARGET_COMPUTE_MULTILIB and implementation for RISC-V

2022-09-02 Thread Kito Cheng via Gcc-patches
Got Jim's review and approval before, but apparently we missed this
last year; rebased and committed to trunk.


On Fri, Jul 30, 2021 at 3:01 AM Palmer Dabbelt  wrote:
>
> On Thu, 29 Jul 2021 11:44:09 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> > ping
> >
> > On Wed, Jul 21, 2021 at 5:28 PM Kito Cheng  wrote:
> >>
> >> This patch set allows a target to use a customized multi-lib mechanism
> >> rather than the built-in multi-lib mechanism.
> >>
> >> The motivation for this patch is that RISC-V might have very complicated
> >> multi-lib reuse rules*, which are hard to maintain with the current
> >> multi-lib scripts; we even hit the "argument list too long" error when
> >> we tried to add more multi-lib reuse rules.
> >>
> >> * Here is an example for RISC-V multi-lib rules:
> >> https://gist.github.com/kito-cheng/0289cd42d9a756382e5afeb77b42b73b
> >>
> >> V2 Changes:
> >> - No changes to the first patch (TARGET_COMPUTE_MULTILIB part) since the
> >> first version.
> >> - Handle options other than -march and -mabi for riscv_compute_multilib.
>
> This generally LGTM, but I think it's the sort of thing that should be
> looked at by a global reviewer.  There's a bit of a policy decision
> being made here in that this allows external hooks during the build
> process.
>
> I'm fine with this, as it's just the multilib list, those are really
> specific to a specific toolchain distribution, and there's never going
> to be a way to catalog all the interested cases for the embedded
> toolchains.  I'm still not comfortable calling that a review, though, as
> these things are subtle and I don't always have the same bar for
> external bits that the rest of the GCC folks do.


Re: [PATCH 1/3] STABS: remove -gstabs and -gxcoff functionality

2022-09-02 Thread Richard Biener via Gcc-patches
On Fri, Sep 2, 2022 at 9:00 AM Martin Liška  wrote:
>
> On 9/1/22 13:18, Richard Biener wrote:
> > I presume WarnRemoved will diagnose use of -gstabs but not fail
> > compilation.  Will -gstabs then still enable -g (with the default debug
> > format)?
>
> No, it won't set -g option.

That was the usual side-effect - I wonder if we want to emit extra
diagnostic when one of the obsolete options is given but -g is not
enabled in the end or whether we want to preserve the debug info
enablement effect?

> >
> > Please followup with a gcc-13/changes.html entry.
>
> Sure.
>
> >
> > I notice we have VMS_DEBUGGING_INFO left.  From a quick look
> > it is used by alpha*-dec-* (exclusively) and ia64-hp-*vms*  (maybe
> > also supports DWARF, it is ELF at least).  One of the goals of
> > non-DWARF removal was to get rid of debug hooks and instead allow
> > "free-form" early debug generation from the frontends.
>
> Can you please explain what you mean by the free-form and what's expected
> to do with the VMS_DEBUGGING_INFO macro?

Well, VMS debugging would go, just like STABS.  With "free-form" I mean
that frontend code could call into the dwarf2out API directly, creating
DWARF DIEs for language specific info (we probably want to export more
and/or nicer APIs for such use).

Richard.

> Cheers,
> Martin


[PATCH v2, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-09-02 Thread HAO CHEN GUI via Gcc-patches
Hi,

  This patch is for internal issue1136. It changes insn condition from
TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions.
These instructions all use DI registers and can be invoked with -mpowerpc64
in a 32-bit environment.

  This patch also changes prototypes of related built-ins and effective
target of test cases.
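
To show why a 64-bit unsigned return type fits the extract built-ins, here is a portable sketch of the value a scalar extract-exponent operation yields for an IEEE 754 binary64 input: the 11-bit biased exponent field, zero-extended into a 64-bit integer. It does not call the PowerPC built-in, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Portable illustration only; this is not the PowerPC built-in.
// Returns the 11-bit biased exponent field of a binary64 value,
// zero-extended to 64 bits.
inline unsigned long long extract_exp_sketch(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t), "binary64 expected");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bits
  return (bits >> 52) & 0x7FFu;         // drop the fraction, mask off the sign
}

// Unbiased exponent of a normal number.  The fields 0 (zero/subnormal) and
// 0x7FF (infinity/NaN) are excluded.
inline int unbiased_exp_sketch(unsigned long long biased) {
  assert(biased != 0 && biased != 0x7FF);
  return static_cast<int>(biased) - 1023;
}
```

The result is a 64-bit (DI) value that is never negative, which is consistent with declaring the built-ins as returning unsigned long long, a type that is 64 bits wide in a 32-bit environment as well.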

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-09-01  Haochen Gui  

gcc/
* config/rs6000/rs6000-builtins.def
(__builtin_vsx_scalar_extract_exp): Set return type to const unsigned
long long.
(__builtin_vsx_scalar_extract_sig): Likewise.
* config/rs6000/vsx.md (xsxexpdp): Change insn condition from
TARGET_64BIT to TARGET_POWERPC64.
(xsxsigdp): Likewise.
(xsiexpdp): Likewise.
(xsiexpdpf): Likewise.

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Change effective
target from lp64 to has_arch_ppc64 and add -mpowerpc64 for 32-bit
environment.
* gcc.target/powerpc/bfp/scalar-extract-exp-6.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-exp-7.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-sig-6.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-sig-7.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-12.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-13.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Likewise.

patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f76f54793d7..4ebfd4704a1 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2847,10 +2847,10 @@
   pure vsc __builtin_vsx_lxvl (const void *, signed long);
 LXVL lxvl {}

-  const signed long __builtin_vsx_scalar_extract_exp (double);
+  const unsigned long long __builtin_vsx_scalar_extract_exp (double);
 VSEEDP xsxexpdp {}

-  const signed long __builtin_vsx_scalar_extract_sig (double);
+  const unsigned long long __builtin_vsx_scalar_extract_sig (double);
 VSESDP xsxsigdp {}

   const double __builtin_vsx_scalar_insert_exp (unsigned long long, \
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index e226a93bbe5..a01711aa2cb 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5098,7 +5098,7 @@ (define_insn "xsxexpdp"
   [(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
 UNSPEC_VSX_SXEXPDP))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsxexpdp %0,%x1"
   [(set_attr "type" "integer")])

@@ -5116,7 +5116,7 @@ (define_insn "xsxsigdp"
   [(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
 UNSPEC_VSX_SXSIG))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsxsigdp %0,%x1"
   [(set_attr "type" "integer")])

@@ -5147,7 +5147,7 @@ (define_insn "xsiexpdp"
(unspec:DF [(match_operand:DI 1 "register_operand" "r")
(match_operand:DI 2 "register_operand" "r")]
 UNSPEC_VSX_SIEXPDP))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsiexpdp %x0,%1,%2"
   [(set_attr "type" "fpsimple")])

@@ -5157,7 +5157,7 @@ (define_insn "xsiexpdpf"
(unspec:DF [(match_operand:DF 1 "register_operand" "r")
(match_operand:DI 2 "register_operand" "r")]
 UNSPEC_VSX_SIEXPDP))]
-  "TARGET_P9_VECTOR && TARGET_64BIT"
+  "TARGET_P9_VECTOR && TARGET_POWERPC64"
   "xsiexpdp %x0,%1,%2"
   [(set_attr "type" "fpsimple")])

diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
index 35bf1b240f3..81565c50ec7 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
@@ -1,7 +1,8 @@
-/* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-require-effective-target lp64 } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-do compile { target { powerpc*-*-linux* } } } */
 /* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-additional-options "-mpowerpc64" } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */

 /* This test should succeed only on 64-bit configurations.  */
 #include 
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-6.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-6.c
index b9dd7d61aae..33e55d5abc1 100644
--- 

Re: [PATCH][DOCS] gcc-13: document removal of STABS

2022-09-02 Thread Gerald Pfeifer
Hi Martin,

On Fri, 2 Sep 2022, Martin Liška wrote:
> +The support for emitting the STABS debugging format has been removed
> +  (includes -gstabs and -gxcoff options) which 
> means
> +  the support for dbx debugger is removed.

how about slightly rephrasing this and breaking up the sentence for easier 
consumption?

   Support for emitting the STABS debugging format (including the
 -gstabs and -gxcoff options) has been removed.
 (This means the dbx debugger is no longer 
 supported, either.)

Just a suggestion, feel free to take what you like (only the removals and 
addition of "the" should stay.)

And if you have an idea how to phrase my second sentence a little nicer, 
absolutely go for it.

Thanks,
Gerald


RE: [PATCH] x86: Handle V8BF in expand_vec_perm_broadcast_1

2022-09-02 Thread Kong, Lingling via Gcc-patches
Hi,

I fixed it in a new patch, and added the BF vector mode to SUBST_V and
avx512fmaskhalfmode for @vec_interleave_high.
Ok for trunk?

> > Hi,
> >
> > Handle E_V8BFmode in expand_vec_perm_broadcast_1 and
> ix86_expand_vector_init_duplicate.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/106742
> > * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate):
> > Handle V8BF mode.
> > (expand_vec_perm_broadcast_1): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr106742.c: New test.
> > ---
> >  gcc/config/i386/i386-expand.cc   | 17 -
> >  gcc/testsuite/gcc.target/i386/pr106742.c | 10 ++
> >  2 files changed, 22 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr106742.c
> >
> > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> > index 4b216308a18..a08222fe1b6 100644
> > --- a/gcc/config/i386/i386-expand.cc
> > +++ b/gcc/config/i386/i386-expand.cc
> > @@ -15030,11 +15030,15 @@ ix86_expand_vector_init_duplicate (bool
> mmx_ok, machine_mode mode,
> >   dperm.op0 = dperm.op1 = gen_reg_rtx (mode);
> >   dperm.one_operand_p = true;
> >
> > - if (mode == V8HFmode)
> > + if (mode == V8HFmode || mode == V8BFmode)
> > {
> > - tmp1 = force_reg (HFmode, val);
> > + rtx (*gen_vec_set_0) (rtx, rtx, rtx) = NULL;
> > + tmp1 = mode == V8HFmode ? force_reg (HFmode, val)
> > + : force_reg (BFmode, val);
> tmp1 = force_reg (GET_MODE_INNER (mode), val);
> >   tmp2 = gen_reg_rtx (mode);
> > - emit_insn (gen_vec_setv8hf_0 (tmp2, CONST0_RTX (mode), tmp1));
> > + gen_vec_set_0 = mode == V8HFmode ? gen_vec_setv8hf_0
> > +  : gen_vec_setv8bf_0;
> add @ to vec_set_0 as (define_insn "@vec_set_0" and pass
> mode to vec_set_0 as emit_insn (gen_vec_set_0 (mode, tmp2, CONST0_RTX
> (mode), tmp1));
> > + emit_insn (gen_vec_set_0 (tmp2, CONST0_RTX (mode),
> > + tmp1));
> 
> >   tmp1 = gen_lowpart (mode, tmp2);
> > }
> >   else
> > @@ -21822,17 +21826,20 @@ expand_vec_perm_broadcast_1 (struct
> expand_vec_perm_d *d)
> >return true;
> >
> >  case E_V8HFmode:
> > +case E_V8BFmode:
> >/* This can be implemented via interleave and pshufd.  */
> >if (d->testing_p)
> > return true;
> >
> >if (elt >= nelt2)
> > {
> > - gen = gen_vec_interleave_highv8hf;
> > + gen = vmode == V8HFmode ? gen_vec_interleave_highv8hf
> > + : gen_vec_interleave_highv8bf;
> Similar, add @ to define_insn and pass gen_vec_interleave.
> >   elt -= nelt2;
> > }
> >else
> > -   gen = gen_vec_interleave_lowv8hf;
> > +   gen = vmode == V8HFmode ? gen_vec_interleave_lowv8hf
> > +   : gen_vec_interleave_lowv8bf;
> >nelt2 /= 2;
> >
> >dest = gen_reg_rtx (vmode);
> > diff --git a/gcc/testsuite/gcc.target/i386/pr106742.c
> > b/gcc/testsuite/gcc.target/i386/pr106742.c
> > new file mode 100644
> > index 000..4a53cd49902
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr106742.c
> > @@ -0,0 +1,10 @@
> > +/* { dg-do compile } */
> > +/* { dg-options " -msse2 -mno-avx2 -O1" } */
> > +typedef __bf16 v8bf __attribute__ ((__vector_size__ (16)));
> > +
> > +v8bf
> > +vec_init_dup_v8bf (__bf16 a1)
> > +{
> > +  return __extension__ (v8bf) { a1, a1, a1, a1, a1, a1, a1, a1 };
> > +}
> > +/* { dg-final { scan-assembler-times "punpcklwd" 1} } */
> > --
> > 2.18.2
> >
> 
> 
> --
> BR,
> Hongtao


0001-x86-Handle-V8BF-in-expand_vec_perm_broadcast_1.patch
Description: 0001-x86-Handle-V8BF-in-expand_vec_perm_broadcast_1.patch


[Ada] Adjust previous change to Expand_Subtype_From_Expr

2022-09-02 Thread Marc Poulhiès via Gcc-patches
An aggregate may have been rewritten before being seen by the procedure.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_util.adb (Expand_Subtype_From_Expr): Be prepared for
rewritten aggregates as expressions.

diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -5741,7 +5741,7 @@ package body Exp_Util is
--  non-statically-matching subtypes on 'Access of this object.
 
and then (Nkind (N) /= N_Object_Declaration
-  or else Nkind (Exp) = N_Aggregate
+  or else Nkind (Original_Node (Exp)) = N_Aggregate
   or else Is_Constr_Subt_For_U_Nominal (Exp_Typ))
  then
 --  Within an initialization procedure, a selected component




[Ada] Fix crash on declaration of overaligned array with constraints

2022-09-02 Thread Marc Poulhiès via Gcc-patches
The semantic analyzer was setting the Is_Constr_Subt_For_UN_Aliased flag on
the actual subtype of the object, which is incorrect because the nominal
subtype is constrained.  This also adjusts a recent related change.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_util.adb (Expand_Subtype_From_Expr): Check for the presence
of the Is_Constr_Subt_For_U_Nominal flag instead of the absence
of the Is_Constr_Subt_For_UN_Aliased flag on the subtype of the
expression of an object declaration before reusing this subtype.
* sem_ch3.adb (Analyze_Object_Declaration): Do not incorrectly
set the Is_Constr_Subt_For_UN_Aliased flag on the actual subtype
of an array with definite nominal subtype.  Remove useless test.

diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -5732,14 +5732,17 @@ package body Exp_Util is
   then
  if Is_Itype (Exp_Typ)
 
-   --  If Exp_Typ was created for a previous declaration whose nominal
-   --  subtype is unconstrained, and that declaration is aliased,
-   --  we need to generate a new subtype, because otherwise the
-   --  Is_Constr_Subt_For_U_Nominal flag will be set on the wrong
-   --  subtype, causing failure to detect non-statically-matching
-   --  subtypes on 'Access of the previously-declared object.
-
-   and then not Is_Constr_Subt_For_UN_Aliased (Exp_Typ)
+   --  When this is for an object declaration, the caller may want to
+   --  set Is_Constr_Subt_For_U_Nominal on the subtype, so we must make
+   --  sure that either the subtype has been built for the expression,
+   --  typically for an aggregate, or the flag is already set on it;
+   --  otherwise it could end up being set on the nominal constrained
+   --  subtype of an object and thus later cause the failure to detect
+   --  non-statically-matching subtypes on 'Access of this object.
+
+   and then (Nkind (N) /= N_Object_Declaration
+  or else Nkind (Exp) = N_Aggregate
+  or else Is_Constr_Subt_For_U_Nominal (Exp_Typ))
  then
 --  Within an initialization procedure, a selected component
 --  denotes a component of the enclosing record, and it appears as


diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -4770,20 +4770,13 @@ package body Sem_Ch3 is
  if not Is_Entity_Name (Object_Definition (N)) then
 Act_T := Etype (E);
 Check_Compile_Time_Size (Act_T);
-
-if Aliased_Present (N) then
-   Set_Is_Constr_Subt_For_UN_Aliased (Act_T);
-end if;
  end if;
 
  --  When the given object definition and the aggregate are specified
  --  independently, and their lengths might differ do a length check.
  --  This cannot happen if the aggregate is of the form (others =>...)
 
- if not Is_Constrained (T) then
-null;
-
- elsif Nkind (E) = N_Raise_Constraint_Error then
+ if Nkind (E) = N_Raise_Constraint_Error then
 
 --  Aggregate is statically illegal. Place back in declaration
 




[Ada] Add loop variants to Ada.Strings.Search and Ada.Strings.Maps

2022-09-02 Thread Marc Poulhiès via Gcc-patches
Add loop variants to verify that loops terminate in string handling.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-strmap.adb: Add variants to simple and while loops.
* libgnat/a-strsea.adb: Idem.

diff --git a/gcc/ada/libgnat/a-strmap.adb b/gcc/ada/libgnat/a-strmap.adb
--- a/gcc/ada/libgnat/a-strmap.adb
+++ b/gcc/ada/libgnat/a-strmap.adb
@@ -290,6 +290,7 @@ is
  loop
 pragma Loop_Invariant
   (Seq1 (Seq1'First .. J) = Seq2 (Seq2'First .. J));
+pragma Loop_Variant (Increases => J);
 
 if J = Positive'Last then
return;
@@ -440,6 +441,7 @@ is
   (Character'Pos (C) >= Character'Pos (C'Loop_Entry));
 pragma Loop_Invariant
   (for all Char in C'Loop_Entry .. C => not Set (Char));
+pragma Loop_Variant (Increases => C);
 exit when C = Character'Last;
 C := Character'Succ (C);
  end loop;
@@ -457,6 +459,7 @@ is
 pragma Loop_Invariant
   (for all Char in C'Loop_Entry .. C =>
  (if Char /= C then Set (Char)));
+pragma Loop_Variant (Increases => C);
 exit when not Set (C) or else C = Character'Last;
 C := Character'Succ (C);
  end loop;
@@ -491,6 +494,7 @@ is
  pragma Loop_Invariant
(for all Span of Max_Ranges (1 .. Range_Num) =>
   (for all Char in Span.Low .. Span.High => Set (Char)));
+ pragma Loop_Variant (Increases => Range_Num);
   end loop;
 
   return Max_Ranges (1 .. Range_Num);


diff --git a/gcc/ada/libgnat/a-strsea.adb b/gcc/ada/libgnat/a-strsea.adb
--- a/gcc/ada/libgnat/a-strsea.adb
+++ b/gcc/ada/libgnat/a-strsea.adb
@@ -113,6 +113,7 @@ package body Ada.Strings.Search with SPARK_Mode is
 
 pragma Loop_Invariant (Num <= Ind - (Source'First - 1));
 pragma Loop_Invariant (Ind >= Source'First);
+pragma Loop_Variant (Increases => Ind);
  end loop;
 
   --  Mapped case
@@ -142,6 +143,7 @@ package body Ada.Strings.Search with SPARK_Mode is
 null;
 pragma Loop_Invariant (Num <= Ind - (Source'First - 1));
 pragma Loop_Invariant (Ind >= Source'First);
+pragma Loop_Variant (Increases => Ind);
  end loop;
   end if;
 
@@ -200,6 +202,7 @@ package body Ada.Strings.Search with SPARK_Mode is
  null;
  pragma Loop_Invariant (Num <= Ind - (Source'First - 1));
  pragma Loop_Invariant (Ind >= Source'First);
+ pragma Loop_Variant (Increases => Ind);
   end loop;
 
   return Num;




[Ada] Recover proof of Scaled_Divide in System.Arith_64

2022-09-02 Thread Marc Poulhiès via Gcc-patches
Proof of Scaled_Divide was impacted by changes in provers and Why3.
Recover it partially, leaving some unproved basic inferences to be
further investigated.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-aridou.adb: Add or rework ghost code.
* libgnat/s-aridou.ads: Add Big_Positive subtype.

diff --git a/gcc/ada/libgnat/s-aridou.adb b/gcc/ada/libgnat/s-aridou.adb
--- a/gcc/ada/libgnat/s-aridou.adb
+++ b/gcc/ada/libgnat/s-aridou.adb
@@ -126,7 +126,7 @@ is
  Pre => B /= 0;
--  Length doubling remainder
 
-   function Big_2xx (N : Natural) return Big_Integer is
+   function Big_2xx (N : Natural) return Big_Positive is
  (Big (Double_Uns'(2 ** N)))
with
  Ghost,
@@ -141,6 +141,13 @@ is
with Ghost;
--  X1 as a big integer
 
+   function Big3 (X1, X2, X3 : Big_Integer) return Big_Integer is
+ (Big_2xxSingle * Big_2xxSingle * X1
++ Big_2xxSingle * X2
++ X3)
+   with Ghost;
+   --  Version of Big3 on big integers
+
function Le3 (X1, X2, X3, Y1, Y2, Y3 : Single_Uns) return Boolean
with
  Post => Le3'Result = (Big3 (X1, X2, X3) <= Big3 (Y1, Y2, Y3));
@@ -234,6 +241,17 @@ is
  Pre  => X /= Double_Uns'Last,
  Post => Big (X + Double_Uns'(1)) = Big (X) + 1;
 
+   procedure Lemma_Big_Of_Double_Uns (X : Double_Uns)
+   with
+ Ghost,
+ Post => Big (X) < Big_2xxDouble;
+
+   procedure Lemma_Big_Of_Double_Uns_Of_Single_Uns (X : Single_Uns)
+   with
+ Ghost,
+ Post => Big (Double_Uns (X)) >= 0
+   and then Big (Double_Uns (X)) < Big_2xxSingle;
+
procedure Lemma_Bounded_Powers_Of_2_Increasing (M, N : Natural)
with
  Ghost,
@@ -447,9 +465,9 @@ is
procedure Lemma_Mult_Non_Negative (X, Y : Big_Integer)
with
  Ghost,
- Pre  => (X >= Big_0 and then Y >= Big_0)
-   or else (X <= Big_0 and then Y <= Big_0),
- Post => X * Y >= Big_0;
+ Pre  => (X >= 0 and then Y >= 0)
+   or else (X <= 0 and then Y <= 0),
+ Post => X * Y >= 0;
 
procedure Lemma_Mult_Non_Positive (X, Y : Big_Integer)
with
@@ -458,6 +476,13 @@ is
or else (X >= Big_0 and then Y <= Big_0),
  Post => X * Y <= Big_0;
 
+   procedure Lemma_Mult_Positive (X, Y : Big_Integer)
+   with
+ Ghost,
+ Pre  => (X > Big_0 and then Y > Big_0)
+   or else (X < Big_0 and then Y < Big_0),
+ Post => X * Y > Big_0;
+
procedure Lemma_Neg_Div (X, Y : Big_Integer)
with
  Ghost,
@@ -604,6 +629,8 @@ is
procedure Lemma_Abs_Range (X : Big_Integer) is null;
procedure Lemma_Add_Commutation (X : Double_Uns; Y : Single_Uns) is null;
procedure Lemma_Add_One (X : Double_Uns) is null;
+   procedure Lemma_Big_Of_Double_Uns (X : Double_Uns) is null;
+   procedure Lemma_Big_Of_Double_Uns_Of_Single_Uns (X : Single_Uns) is null;
procedure Lemma_Bounded_Powers_Of_2_Increasing (M, N : Natural) is null;
procedure Lemma_Deep_Mult_Commutation
  (Factor : Big_Integer;
@@ -638,6 +665,7 @@ is
procedure Lemma_Mult_Distribution (X, Y, Z : Big_Integer) is null;
procedure Lemma_Mult_Non_Negative (X, Y : Big_Integer) is null;
procedure Lemma_Mult_Non_Positive (X, Y : Big_Integer) is null;
+   procedure Lemma_Mult_Positive (X, Y : Big_Integer) is null;
procedure Lemma_Neg_Rem (X, Y : Big_Integer) is null;
procedure Lemma_Not_In_Range_Big2xx64 is null;
procedure Lemma_Powers (A : Big_Natural; B, C : Natural) is null;
@@ -1888,7 +1916,7 @@ is
 
   --  Local ghost variables
 
-  Mult  : constant Big_Integer := abs (Big (X) * Big (Y)) with Ghost;
+  Mult  : constant Big_Natural := abs (Big (X) * Big (Y)) with Ghost;
   Quot  : Big_Integer with Ghost;
   Big_R : Big_Integer with Ghost;
   Big_Q : Big_Integer with Ghost;
@@ -1955,6 +1983,15 @@ is
   --  Proves correctness of the multiplication of divisor by quotient to
   --  compute amount to subtract.
 
+  procedure Prove_Mult_Decomposition_Split3
+(D1, D2, D3, D3_Hi, D3_Lo, D4 : Big_Integer)
+  with
+Ghost,
+Pre  => Is_Mult_Decomposition (D1, D2, D3, D4)
+  and then D3 = Big_2xxSingle * D3_Hi + D3_Lo,
+Post => Is_Mult_Decomposition (D1, D2 + D3_Hi, D3_Lo, D4);
+  --  Proves decomposition of Mult after splitting third component
+
   procedure Prove_Negative_Dividend
   with
 Ghost,
@@ -2066,6 +2103,27 @@ is
else abs Quot);
   --  Proves correctness of the rounding of the unsigned quotient
 
+  procedure Prove_Scaled_Mult_Decomposition_Regroup24
+(D1, D2, D3, D4 : Big_Integer)
+  with
+Ghost,
+Pre  => Scale < Double_Size
+  and then Is_Scaled_Mult_Decomposition (D1, D2, D3, D4),
+Post => Is_Scaled_Mult_Decomposition
+  (0, Big_2xxSingle * D1 + D2, 0, Big_2xxSingle * D3 + D4);
+  --  Proves scaled decomposition of Mult after regrouping on second and
+  --  fourth component.
+
+  procedure 

Re: [PATCH] RISC-V: Fix the V calling convention

2022-09-02 Thread Kito Cheng via Gcc-patches
CALL_USED_REGISTERS already set those registers to 1, but I think it
is worth doing some clean up like this to prevent confusion.

On Thu, Sep 1, 2022 at 11:28 AM Palmer Dabbelt  wrote:
>
> The V registers are always clobbered on calls.
>
> gcc/ChangeLog
>
> * config/riscv/riscv.cc (riscv_conditional_register_usage):
> Always mark the V registers as clobbered on calls.
> ---
>  gcc/config/riscv/riscv.cc | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 675d92c0961..c18e61f4a03 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -5442,11 +5442,18 @@ riscv_conditional_register_usage (void)
>if (!TARGET_VECTOR)
>  {
>for (int regno = V_REG_FIRST; regno <= V_REG_LAST; regno++)
> -   fixed_regs[regno] = call_used_regs[regno] = 1;
> +   fixed_regs[regno] = 1;
>
> -  fixed_regs[VTYPE_REGNUM] = call_used_regs[VTYPE_REGNUM] = 1;
> -  fixed_regs[VL_REGNUM] = call_used_regs[VL_REGNUM] = 1;
> +  fixed_regs[VTYPE_REGNUM] = 1;
> +  fixed_regs[VL_REGNUM] = 1;
>  }

So we only need the above change I think.

> +
> +  /* The standard ABIs all clobber the entire vector state on calls.  */
> +  for (int regno = V_REG_FIRST; regno <= V_REG_LAST; regno++)
> +call_used_regs[regno] = 1;
> +
> +  call_used_regs[VTYPE_REGNUM] = 1;
> +  call_used_regs[VL_REGNUM] = 1;
>  }
>
>  /* Return a register priority for hard reg REGNO.  */
> --
> 2.34.1
>
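
The point made in the review above can be sketched with a small stand-alone model. This is only an illustration, not the GCC implementation: the register numbers and the helper names (`init_call_used_regs`, `reg_is_call_clobbered`, `vector_state_call_clobbered`) are assumptions made for the sketch. It shows that if the static initializer already marks the vector state as call-clobbered, the hook only has to mark it fixed when the V extension is absent.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative register numbering; these values are assumptions made for
   this sketch and are not taken from riscv.h.  */
enum
{
  V_REG_FIRST = 96,
  V_REG_LAST = 127,
  VL_REGNUM = 128,
  VTYPE_REGNUM = 129,
  NUM_REGS = 130
};

static char fixed_regs[NUM_REGS];
static char call_used_regs[NUM_REGS];

/* Stand-in for the static CALL_USED_REGISTERS initializer: the vector
   registers, VL and VTYPE start out call-clobbered.  */
static void
init_call_used_regs (void)
{
  for (int regno = 0; regno < NUM_REGS; regno++)
    {
      fixed_regs[regno] = 0;
      call_used_regs[regno] = 0;
    }
  for (int regno = V_REG_FIRST; regno <= V_REG_LAST; regno++)
    call_used_regs[regno] = 1;
  call_used_regs[VL_REGNUM] = 1;
  call_used_regs[VTYPE_REGNUM] = 1;
}

/* Model of the reduced change: without the V extension the vector state
   only becomes fixed; call_used_regs is left untouched.  */
static void
conditional_register_usage (bool target_vector)
{
  if (!target_vector)
    {
      for (int regno = V_REG_FIRST; regno <= V_REG_LAST; regno++)
        fixed_regs[regno] = 1;
      fixed_regs[VTYPE_REGNUM] = 1;
      fixed_regs[VL_REGNUM] = 1;
    }
}

/* Bounds-checked query used by the summary helper below.  */
static bool
reg_is_call_clobbered (int regno)
{
  assert (regno >= 0 && regno < NUM_REGS);
  return call_used_regs[regno] != 0;
}

/* True if every vector register, VL and VTYPE is call-clobbered.  */
static bool
vector_state_call_clobbered (void)
{
  for (int regno = V_REG_FIRST; regno <= V_REG_LAST; regno++)
    if (!reg_is_call_clobbered (regno))
      return false;
  return reg_is_call_clobbered (VL_REGNUM)
         && reg_is_call_clobbered (VTYPE_REGNUM);
}
```

In this model the vector state stays call-clobbered whether or not `target_vector` is set, which is the behaviour the first hunk alone is expected to give.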


[Ada] Fix proof of runtime unit System.Exp_Mod

2022-09-02 Thread Marc Poulhiès via Gcc-patches
Regain the proof of System.Exp_Mod after changes in provers and Why3.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-expmod.adb (Lemma_Add_Mod): Add new lemma to factor
out a complex sub-proof.
(Exp_Modular): Add assertion to help proof.

diff --git a/gcc/ada/libgnat/s-expmod.adb b/gcc/ada/libgnat/s-expmod.adb
--- a/gcc/ada/libgnat/s-expmod.adb
+++ b/gcc/ada/libgnat/s-expmod.adb
@@ -106,6 +106,13 @@ is
---
 
procedure Lemma_Add_Mod (X, Y : Big_Natural; B : Big_Positive) is
+
+  procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) with
+Pre  => F /= 0,
+Post => (Q * F + R) mod F = R mod F;
+
+  procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) is null;
+
   Left  : constant Big_Natural := (X + Y) mod B;
   Right : constant Big_Natural := ((X mod B) + (Y mod B)) mod B;
   XQuot : constant Big_Natural := X / B;
@@ -119,6 +126,8 @@ is
(Left = ((XQuot + YQuot) * B + X mod B + Y mod B) mod B);
  pragma Assert (X mod B + Y mod B = AQuot * B + Right);
  pragma Assert (Left = ((XQuot + YQuot + AQuot) * B + Right) mod B);
+ Lemma_Euclidean_Mod (XQuot + YQuot + AQuot, B, Right);
+ pragma Assert (Left = (Right mod B));
  pragma Assert (Left = Right);
   end if;
end Lemma_Add_Mod;
@@ -259,6 +268,7 @@ is
pragma Assert (Equal_Modulo
  ((Big (Result) * Big (Factor)) * Big (Factor) ** (Exp - 1),
   Big (Left) ** Right));
+   pragma Assert (Big (Factor) >= 0);
Lemma_Mult_Mod (Big (Result) * Big (Factor),
   Big (Factor) ** (Exp - 1),
   Big (Modulus));
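
The two arithmetic facts used in this patch can be spot-checked numerically. The sketch below is not a proof and is not part of the patch; it uses fixed-width unsigned integers where the Ada code uses big integers, so it assumes the operands are small enough that the sums and products do not overflow. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side of the nested lemma: (q * f + r) mod f.
   Requires f != 0; the caller keeps q * f + r within 64 bits.  */
static uint64_t
euclidean_mod_lhs (uint64_t q, uint64_t f, uint64_t r)
{
  assert (f != 0);
  return (q * f + r) % f;
}

/* Left side of Lemma_Add_Mod: (x + y) mod b, assuming x + y fits.  */
static uint64_t
add_mod_left (uint64_t x, uint64_t y, uint64_t b)
{
  assert (b != 0);
  return (x + y) % b;
}

/* Right side of Lemma_Add_Mod: ((x mod b) + (y mod b)) mod b.  */
static uint64_t
add_mod_right (uint64_t x, uint64_t y, uint64_t b)
{
  assert (b != 0);
  return ((x % b) + (y % b)) % b;
}
```

For example, with q = 7, f = 5, r = 13 the first function yields 48 mod 5 = 3, the same as 13 mod 5, which is the step the new call to `Lemma_Euclidean_Mod` supplies to the prover.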




[Ada] Fix proof of runtime unit System.Wid_*

2022-09-02 Thread Marc Poulhiès via Gcc-patches
Regain the proof of System.Wid_* after changes in provers and Why3.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-widthu.adb (Lemma_Euclidean): Lemma to prove the
relation between the quotient/remainder of a division.

diff --git a/gcc/ada/libgnat/s-widthu.adb b/gcc/ada/libgnat/s-widthu.adb
--- a/gcc/ada/libgnat/s-widthu.adb
+++ b/gcc/ada/libgnat/s-widthu.adb
@@ -73,6 +73,14 @@ package body System.Width_U is
 Ghost,
 Post => X / Y / Z = X / (Y * Z);
 
+  procedure Lemma_Euclidian (V, Q, F, R : Big_Integer)
+  with
+Ghost,
+Pre  => F > 0 and then Q = V / F and then R = V rem F,
+Post => V = Q * F + R;
+  --  Ghost lemma to prove the relation between the quotient/remainder of
+  --  division by F and the value V.
+
   --
   -- Lemma_Lower_Mult --
   --
@@ -104,6 +112,12 @@ package body System.Width_U is
  pragma Assert (X / YZ = XYZ + R / YZ);
   end Lemma_Div_Twice;
 
+  -
+  -- Lemma_Euclidian --
+  -
+
+  procedure Lemma_Euclidian (V, Q, F, R : Big_Integer) is null;
+
   --  Local variables
 
   W : Natural;
@@ -152,7 +166,7 @@ package body System.Width_U is
 R : constant Big_Integer := Big (T_Init) rem F with Ghost;
  begin
 pragma Assert (Q < Big_10);
-pragma Assert (Big (T_Init) = Q * F + R);
+Lemma_Euclidian (Big (T_Init), Q, F, R);
 Lemma_Lower_Mult (Q, Big (9), F);
 pragma Assert (Big (T_Init) <= Big (9) * F + F - 1);
 pragma Assert (Big (T_Init) < Big_10 * F);
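
The relation stated by the lemma's postcondition, V = Q * F + R with Q = V / F and R = V rem F, can be illustrated in C, whose `/` truncates toward zero and whose `%` takes the sign of the dividend, like Ada's `/` and `rem`. This is a minimal sketch on machine integers rather than big integers, and the function name is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Check V = Q * F + R for Q = V / F and R = V rem F.  */
static bool
euclidean_relation_holds (long long v, long long f)
{
  assert (f > 0);               /* mirrors the lemma's precondition F > 0 */
  long long q = v / f;          /* truncating quotient */
  long long r = v % f;          /* remainder with the sign of V */
  return v == q * f + r;
}
```

The identity also holds for negative V; for instance V = -17 and F = 5 give Q = -3 and R = -2.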




[Ada] Error on return of object whose full view has undefaulted discriminants

2022-09-02 Thread Marc Poulhiès via Gcc-patches
The compiler wrongly reports an error about the expected type not
matching the same-named found type in a return statement for a function
whose result type has unknown discriminants when the full type is tagged
and has an undefaulted discriminant, and the return expression is an object
initialized by a function call. The processing for return statements that
creates an actual subtype based on the return expression type's underlying
type when that type has discriminants, and converts the expression to
the actual subtype, should only be done when the underlying discriminated
type is mutable (i.e., has defaulted discriminants). Otherwise the
unchecked conversion to the actual subtype (of the underlying full type)
can lead to a resolution problem later within Expand_Simple_Function_Return
in the expansion of tag assignments (because the target type of the
conversion is a full view and does not match the partial view of
the function's result type).

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch6.adb (Expand_Simple_Function_Return): Bypass creation of an
actual subtype and unchecked conversion to that subtype when the
underlying type of the expression has discriminants without defaults.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -6632,7 +6632,7 @@ package body Exp_Ch6 is
 
  begin
 if not Exp_Is_Function_Call
-  and then Has_Discriminants (Ubt)
+  and then Has_Defaulted_Discriminants (Ubt)
   and then not Is_Constrained (Ubt)
   and then not Has_Unchecked_Union (Ubt)
 then




[Ada] Fix proof of runtime unit System.Value* and System.Image*

2022-09-02 Thread Marc Poulhiès via Gcc-patches
Refactor specification of the Value* and Image* units and fix proofs.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-nbnbig.ads: Add Always_Return annotation.
* libgnat/s-vaispe.ads: New ghost unit for the specification of
System.Value_I. Restore proofs.
* libgnat/s-vauspe.ads: New ghost unit for the specification of
System.Value_U. Restore proofs.
* libgnat/s-valuei.adb: The specification only subprograms are
moved to System.Value_I_Spec. Restore proofs.
* libgnat/s-valueu.adb: The specification only subprograms are
moved to System.Value_U_Spec. Restore proofs.
* libgnat/s-valuti.ads
(Uns_Params): Generic unit used to bundle together the
specification functions of System.Value_U_Spec.
(Int_Params): Generic unit used to bundle together the
specification functions of System.Value_I_Spec.
* libgnat/s-imagef.adb: It is now possible to instantiate the
appropriate specification units instead of creating imported ghost
subprograms.
* libgnat/s-imagei.adb: Update to refactoring of specifications
and fix proofs.
* libgnat/s-imageu.adb: Likewise.
* libgnat/s-imgint.ads: Ghost parameters are grouped together in a
package now.
* libgnat/s-imglli.ads: Likewise.
* libgnat/s-imgllu.ads: Likewise.
* libgnat/s-imgllli.ads: Likewise.
* libgnat/s-imglllu.ads: Likewise.
* libgnat/s-imguns.ads: Likewise.
* libgnat/s-vallli.ads: Likewise.
* libgnat/s-vai.ads: Likewise.
* libgnat/s-imagei.ads: Likewise.
* libgnat/s-imageu.ads: Likewise.
* libgnat/s-vaispe.adb: Likewise.
* libgnat/s-valint.ads: Likewise.
* libgnat/s-valuei.ads: Likewise.
* libgnat/s-valueu.ads: Likewise.
* libgnat/s-vauspe.adb: Likewise.

patch.diff.gz
Description: application/gzip


[Ada] Update documentation about non-symbolic traceback

2022-09-02 Thread Marc Poulhiès via Gcc-patches
This documents the limitation of addr2line with Position-Independent Code,
introduces the replacement tool gnatsymbolize and adjusts obsolete stuff.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* doc/gnat_ugn/gnat_and_program_execution.rst
(Non-Symbolic Traceback): Update section.
* gnat_rm.texi, gnat_ugn.texi, gnat-style.texi: Regenerate.

patch.diff.gz
Description: application/gzip


[wwwdocs] gcc-13/changes.html + projects/gomp/: OpenMP update

2022-09-02 Thread Tobias Burnus

Update the OpenMP status for features that were added in the last months.

Comments/suggestions? Okay to commit?

Tobias
-
gcc-13/changes.html + projects/gomp/: OpenMP update

* htdocs/gcc-13/changes.html: Update OpenMP entry; fix html syntax.
* htdocs/projects/gomp/index.html: Update OpenMP 5.x implementation status;
  add missing item from libgomp.texi + flip two items to have same order as
  the .texi.

 htdocs/gcc-13/changes.html  | 42 -
 htdocs/projects/gomp/index.html | 40 +--
 2 files changed, 63 insertions(+), 19 deletions(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index c4189c1b..24b97515 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -31,7 +31,6 @@ a work-in-progress.
 The support for the cr16-elf, tilegx*-linux, tilepro*-linux,
   hppa[12]*-*-hpux10*, hppa[12]*-*-hpux11*
   and m32c-rtems configurations has been removed.
-
 
 
 
@@ -41,14 +40,39 @@ a work-in-progress.
 
   https://gcc.gnu.org/projects/gomp/;>OpenMP
   
-The following OpenMP 5.1 features have been added: the
-omp_all_memory reserved locator, the inoutset
-modifier to the depend clause, the nowait
-clause for the taskwait directive and the
-omp_target_is_accessible, omp_target_memcpy_async,
-omp_target_memcpy_rect_async and
-omp_get_mapped_ptr API routines. Fortran now supports
-non-rectangular loop nests, which were added for C/C++ in GCC 11.
+
+  Reverse offload is now supported and the all clauses to the
+  requires directive are now accepted; however, the
+  requires_offload, unified_address
+  and unified_shared_memory clauses cause that the
+  only available device is the initial device (the host).
+
+
+  The following OpenMP 5.1 features have been added: the
+  omp_all_memory reserved locator, the inoutset
+  modifier to the depend clause, the nowait
+  clause for the taskwait directive and the
+  omp_target_is_accessible, omp_target_memcpy_async,
+  omp_target_memcpy_rect_async and
+  omp_get_mapped_ptr API routines. Fortran now supports
+  non-rectangular loop nests, which were added for C/C++ in GCC 11.
+
+
+  Initial support for OpenMP 5.2 features have been added: Support for
+  firstprivate and allocate clauses on the
+  scope construct and the OpenMP 5.2 syntax of the
+  linear clause; the new enum/constants
+  omp_initial_device and omp_invalid_device; and
+  optionally omitting the map-type in target enter/exit data.
+  The enter clause (as alias for to) has been added
+  to the declare target directive.
+
+
+  For user defined allocators requesting high bandwidth or large capacity
+  memspaces or interleaved partitioning, the http://memkind.github.io/memkind/;>memkind library is used,
+  if available at runtime.
+
   
   
   
diff --git a/htdocs/projects/gomp/index.html b/htdocs/projects/gomp/index.html
index edafa0d3..92cbd9ab 100644
--- a/htdocs/projects/gomp/index.html
+++ b/htdocs/projects/gomp/index.html
@@ -307,8 +307,17 @@ than listed, depending on resolved corner cases and optimizations.
   
   
 requires directive
-GCC9GCC12
-(atomic_default_mem_order)(dynamic_allocators)rest parsing only
+
+  GCC9
+  GCC12
+  GCC13
+
+
+  (atomic_default_mem_order)
+  (dynamic_allocators)
+  complete but no non-host devices provides unified_address,
+  unified_shared_memory or reverse_offload
+
   
   
 conditional modifier to lastprivate clause
@@ -417,8 +426,14 @@ than listed, depending on resolved corner cases and optimizations.
   
   
 ancestor modifier on device clause
-GCC12
-Reverse offload unsupported
+
+  GCC12
+  GCC13
+
+
+  Reverse offload unsupported
+  See comment for requires
+
   
   
 Mapping C/C++ pointer variables and to assign the address of device memory mapped by an array section
@@ -705,6 +720,12 @@ than listed, depending on resolved corner cases and optimizations.
 No
 
   
+  
+Pointer predetermined firstprivate getting initialized
+  to address of matching mapped list item per 5.1, Sect. 2.21.7.2
+No
+
+  
   
 ompt_sync_region_t enum additions
 No
@@ -730,7 +751,6 @@ than listed, depending on resolved corner cases and optimizations.
 No
 
   
-  
 
 
 
@@ -862,8 +882,8 @@ than listed, depending on resolved corner cases and optimizations.
 
   
   
-Default map type for map clause in target enter/exit data
-No
+Default map-type for map clause in target enter/exit data
+  

Re: [PATCH 2/3] rename DBX_REGISTER_NUMBER to DEBUGGER_REGISTER_NUMBER

2022-09-02 Thread Martin Liška

On 9/1/22 14:32, Michael Matz wrote:

Hello,

okay, I'll bite :)  DBG_REGISTER_NUMBER?  DEBUGGER_REGNO?


Yep, I'm fine with the shorter macro name.

May I install such a change?

Cheers,
Martin




Ciao,
Michael.




[PATCH][DOCS] gcc-13: document removal of STABS

2022-09-02 Thread Martin Liška

Ready for master?

Thanks,
Martin

---
 htdocs/gcc-13/changes.html | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index c4189c1b..b5e49258 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -31,6 +31,9 @@ a work-in-progress.
 The support for the cr16-elf, tilegx*-linux, tilepro*-linux,
   hppa[12]*-*-hpux10*, hppa[12]*-*-hpux11*
   and m32c-rtems configurations has been removed.
+The support for emitting the STABS debugging format has been removed
+  (includes -gstabs and -gxcoff options) which means
+  the support for dbx debugger is removed.
 
 
 
--

2.37.2



Re: [PATCH 1/3] STABS: remove -gstabs and -gxcoff functionality

2022-09-02 Thread Martin Liška

On 9/1/22 13:18, Richard Biener wrote:

I presume WarnRemoved will diagnose use of -gstabs but not fail
compilation.  Will -gstabs then still enable -g (with the default debug
format)?


No, it won't set -g option.



Please followup with a gcc-13/changes.html entry.


Sure.



I notice we have VMS_DEBUGGING_INFO left.  From a quick look
it is used by alpha*-dec-* (exclusively) and ia64-hp-*vms*  (maybe
also supports DWARF, it is ELF at least).  One of the goals of
non-DWARF removal was to get rid of debug hooks and instead allow
"free-form" early debug generation from the frontends.


Can you please explain what you mean by the free-form and what's expected
to do with the VMS_DEBUGGING_INFO macro?

Cheers,
Martin


Re: [PATCH 1/2] Using pli(paddi) and rotate to build 64bit constants

2022-09-02 Thread Jiufu Guo via Gcc-patches


Hi,

Segher Boessenkool  writes:

> Hi!
>
> This patch is a clear improvement :-)
>
> On Thu, Sep 01, 2022 at 11:24:00AM +0800, Jiufu Guo wrote:
>> As mentioned in PR106550, since pli could support 34bits immediate, we could
>> use less instructions(3insn would be ok) to build 64bits constant with pli.
>
>> For example, for constant 0x020805006106003, we could generate it with:
>> asm code1:
>> pli 9,101736451 (0x6106003)
>> sldi 9,9,32
>> paddi 9,9, 213 (0x0208050)
>
> 3 insns, 2 insns dependent on the previous, each.
Yeap.
>
>> or asm code2:
>> pli 10, 213
>> pli 9, 101736451
>> rldimi 9, 10, 32, 0
>
> 3 insns, 1 insn dependent on both others.
Yes.
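
The two three-instruction shapes discussed above can be modeled in plain C. This is a sketch of the arithmetic only, under the assumption that each 32-bit half fits a single prefixed load; it says nothing about scheduling, and the function names are made up for the example. The 16-bit chunks `ud1` to `ud4` follow the naming used in the patch hunk quoted later in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Split C into 16-bit chunks ud1 (lowest) to ud4 (highest) and form the
   two 32-bit halves the patch loads separately.  */
static void
split_halves (uint64_t c, uint32_t *high32, uint32_t *low32)
{
  assert (high32 != 0 && low32 != 0);
  uint32_t ud1 = c & 0xffff;
  uint32_t ud2 = (c >> 16) & 0xffff;
  uint32_t ud3 = (c >> 32) & 0xffff;
  uint32_t ud4 = (c >> 48) & 0xffff;
  *high32 = (ud4 << 16) | ud3;
  *low32 = (ud2 << 16) | ud1;
}

/* Shape of pli+pli+rldimi: two independent loads, then one insert of the
   high half above the low 32 bits.  */
static uint64_t
rebuild_by_insert (uint64_t c)
{
  uint32_t high32, low32;
  split_halves (c, &high32, &low32);
  uint64_t h = high32;                  /* first load */
  uint64_t l = low32;                   /* second load, independent */
  return (h << 32) | (l & 0xffffffffULL);
}

/* Shape of pli+sldi+paddi: each step depends on the previous one.  */
static uint64_t
rebuild_by_shift_add (uint64_t c)
{
  uint32_t high32, low32;
  split_halves (c, &high32, &low32);
  uint64_t r = high32;                  /* load the high half */
  r <<= 32;                             /* shift it into place */
  r += low32;                           /* add the low half */
  return r;
}
```

Both functions return their argument unchanged; the difference that matters for the measurements in this thread is the dependency chain, which is why the insert form can be scheduled more freely.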
>
>> Testing with simple cases as below, run them a lot of times:
>> f1.c
>> long __attribute__ ((noinline)) foo (long *arg,long *,long*)
>> {
>>   *arg = 0x2351847027482577;
>> }
>> 5insns: base
>> pli+sldi+paddi: similar -0.08%
>> pli+pli+rldimi: faster +0.66%
>
> This mostly tests how well this micro-benchmark is scheduled.  More time
> is spent in the looping and function calls (not shown)!
>
>> f2.c
>> long __attribute__ ((noinline)) foo (long *arg, long *arg2, long *arg3)
>> {
>>   *arg = 0x2351847027482577;
>>   *arg2 = 0x3257845024384680;
>>   *arg3 = 0x1245abcef9240dec;
>> }
>> 5insns: base
>> pli+sldi+paddi: faster +1.35%
>> pli+pli+rldimi: faster +5.49%
>> 
>> f2.c would be more meaningful.  Because 'sched passes' are effective for
>> f2.c, but 'scheds' do less thing for f1.c.
>
> It still is a too small example to mean much without looking at a
> pipeview, or at the very least perf.  But the results show a solid
> improvement as expected ;-)
Right, checking how the cycles are spent on each instruction would be a
more meaningful way to show how the runtime changes.
>
>> gcc/ChangeLog:
>>  PR target/106550
>>  * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add 'pli' for
>>  constant building.
>
> "Use pli." ?
Thanks, will update.
>
>> gcc/testsuite/ChangeLog:
>>  PR target/106550
>>  * gcc.target/powerpc/pr106550.c: New test.
>
>> +  else if (TARGET_PREFIXED)
>> +{
>> +  /* pli 9,high32 + pli 10,low32 + rldimi 9,10,32,0.  */
>
> But not just 9 and 10.  Use A and B or X and Y or H and L or something
> like that?
OK, will update accordingly.
>
> The comment goes...
>
>> +  if (can_create_pseudo_p ())
>> +{
>
> ... here.
>
>> +  temp = gen_reg_rtx (DImode);
>> +  rtx temp1 = gen_reg_rtx (DImode);
>> +  emit_move_insn (copy_rtx (temp), GEN_INT ((ud4 << 16) | ud3));
>> +  emit_move_insn (copy_rtx (temp1), GEN_INT ((ud2 << 16) | ud1));
>> +
>> +  emit_insn (gen_rotldi3_insert_3 (dest, temp, GEN_INT (32), temp1,
>> +   GEN_INT (0x)));
>> +}
>> +
>
> No blank line here please.
>
>> +  /* pli 9,high32 + sldi 9,32 + paddi 9,9,low32.  */
>> +  else
>> +{
>
> The comment goes here, in the block it refers to.  Comments for a block
> are the first thing *in* the block.
OK, great! I like the format you suggested here :-)
>
>> +  emit_move_insn (copy_rtx (dest), GEN_INT ((ud4 << 16) | ud3));
>> +
>> +  emit_move_insn (copy_rtx (dest),
>> +  gen_rtx_ASHIFT (DImode, copy_rtx (dest),
>> +  GEN_INT (32)));
>> +
>> +  bool can_use_paddi = REGNO (dest) != FIRST_GPR_REGNO;
>
> There should be a test that we do the right thing (or *a* right thing,
> anyway; a working thing; but hopefully a reasonably fast thing) for
> !can_use_paddi.
To cover this case, we need the splitter to run after RA, and register 0
has to happen to be the dest of an assignment.
Oh, the case below would be useful for this test point:

/* { dg-options "-O2 -std=c99 -mdejagnu-cpu=power10 -fdisable-rtl-split1" } */
/* force the constant splitter run after RA: -fdisable-rtl-split1
   a few assignments to make sure r0 is allocated as dest of an assign.  */

void
foo (unsigned long long *a)
{
  *a++ = 0x020805006106003;
  *a++ = 0x2351847027482587;
  *a++ = 0x22513478874a2578;
  *a++ = 0x02180570670b003;
  *a++ = 0x2311847029488587;
  *a++ = 0x335184b02748757f;
  *a++ = 0x720805006096003;
  *a++ = 0x23a18b70a74e257e;
  *a++ = 0x2a518a70a74a2567;
  *a++ = 0x5208a5da0606a03;
  *a++ = 0x1391a47a2749257a;
  *a++ = 0x235a847027488576;
  *a++ = 0x23a1847027482677;  
}

/* { dg-final { scan-assembler-times {\moris\M} 1 } } */
/* { dg-final { scan-assembler-times {\mori\M} 1 } } */

I will add this test case to the patch.
Is this OK?  Any suggestions?
   
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c
>> @@ -0,0 +1,14 @@
>> +/* PR target/106550 */
>> +/* { dg-options "-O2 -std=c99 -mdejagnu-cpu=power10" } */
>> +
>> +void
>> +foo (unsigned long long *a)
>> +{
>> +  *a++ = 0x020805006106003;
>> +  *a++ = 0x2351847027482577;  
>> +}
>> +
>> +/* 3 insns for each constant: pli+sldi+paddi or pli+pli+rldimi.
>> +   And 3 additional insns: std+std+blr: 9 

Re: [committed] c: C2x removal of unprototyped functions

2022-09-02 Thread Richard Biener via Gcc-patches
On Thu, Sep 1, 2022 at 11:18 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 9/1/2022 1:12 PM, Joseph Myers wrote:
> > C2x has completely removed unprototyped functions, so that () now
> > means the same as (void) in both function declarations and
> > definitions, where previously that change had been made for
> > definitions only.  Implement this accordingly.
> >
> > This is a change where GNU/Linux distribution builders might wish to
> > try builds with a -std=gnu2x default to start early on getting old
> > code fixed that still has () declarations for functions taking
> > arguments, in advance of GCC moving to -std=gnu2x as default maybe in
> > GCC 14 or 15; I don't know how much such code is likely to be in
> > current use.
> Happy to see this happen (dropping unprototyped functions).  IIRC older
> versions of autoconf are going to generate code that runs afoul of this
> problem as well.

To catch these cases with a diagnostic earlier, are
-Wstrict-prototypes -Wold-style-declaration enough to diagnose all cases
that the new standard will reject?

I suppose -W*-c2x-compat are not the correct vehicle to diagnose these?
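
For readers who want a concrete picture of the code affected, here is a small sketch of declarations that stay valid under the change, with the old form shown only in a comment. The function names are invented for the example. The same applies to function pointer types, where `int (*fn) ()` would now mean a pointer to a function taking no arguments.

```c
#include <assert.h>

/* Prototyped declarations: valid both before and after the C2x change.  */
int get_answer (void);
int add (int a, int b);

/* Old style, shown for contrast only.  With () meaning (void), the call
   below passes too many arguments and is rejected:

     int scale ();
     int use_scale (void) { return scale (21); }
*/

int
get_answer (void)
{
  return 42;
}

int
add (int a, int b)
{
  return a + b;
}

/* The pointer parameter spells out its argument types as well.  */
int
apply (int (*fn) (int, int), int a, int b)
{
  assert (fn != 0);
  return fn (a, b);
}
```

Declarations such as `int scale ();` are what -Wstrict-prototypes reports today; whether the two options together cover every case the new standard rejects is the open question asked above.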

Thanks,
Richard.

>
> jeff
>


Re: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]

2022-09-02 Thread Richard Biener via Gcc-patches
On Thu, Sep 1, 2022 at 7:51 PM Simon Rainer  wrote:
>
> Hi,
>
> Thanks for taking a look at my patch. I tested some combinations with 
> pure/noreturn attributes. gcc seems to ignore those attributes on 
> multiversion functions and generates sub-optimal assembly.
> But I wasn't able to fix this by simply copying members like DECL_PURE_P. 
> It's pretty hard for me to tell which members of tree are relevant for a 
> function declaration and should be copied and which should not be copied.
>
> Anyway, I think the TREE_NOTHROW change is the most important one, because it 
> leads to correctness problems (and is what broke my original program :D ), so 
> could you please commit my patch as I don't have write-access myself.

Sure, will do - thanks for the fix!

>
> Should I open a new bug on bugzilla for the pure/noreturn issue?

Yes, I think it's worth investigating.

Richard.

> Thanks
> Simon Rainer
>
>
> On Thu, Sep 1, 2022, at 08:37, Richard Biener wrote:
> > On Wed, Aug 31, 2022 at 11:00 PM Simon Rainer  wrote:
> > >
> > > Hi,
> > >
> > > This patch fixes PR106627. I ran the i386.exp tests on my 
> > > x86_64-linux-gnu machine with a fully bootstrapped checkout. I also 
> > > tested manually that no exception handling code is generated if none of 
> > > the function versions throws an exception.
> > > I don't have access to a machine to test the change to  rs6000.cc, but 
> > > the code seems like an exact copy and I don't see a reason why it 
> > > shouldn't work there the same way.
> > >
> > > Regards
> > > Simon Rainer
> > >
> > > From 6fcb1c742fa1d61048f7d63243225a8d1931af4a Mon Sep 17 00:00:00 2001
> > > From: Simon Rainer 
> > > Date: Wed, 31 Aug 2022 20:56:04 +0200
> > > Subject: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]
> > >
> > > Any multi-versioned function was implicitly declared as noexcept, which
> > > leads to an abort if an exception is thrown inside the function.
> > > The reason for this is that the function declaration is replaced by a
> > > newly created dispatcher declaration, which has TREE_NOTHROW always set
> > > to 1. Instead we need to set TREE_NOTHROW to the value of the original
> > > declaration.
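
The bug described in the commit message above can be pictured with a toy model. Everything here is hypothetical: `struct fn_decl` and the two `make_dispatcher_*` functions are stand-ins invented for the sketch, and only the flag that plays the role of TREE_NOTHROW is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a function declaration node.  */
struct fn_decl
{
  const char *name;
  bool nothrow;                 /* plays the role of TREE_NOTHROW */
};

/* Before the fix: the freshly created dispatcher declaration is always
   marked nothrow, whatever the original function says.  */
static struct fn_decl
make_dispatcher_old (const struct fn_decl *fn)
{
  assert (fn != NULL);
  struct fn_decl dispatcher = { fn->name, true };
  return dispatcher;
}

/* After the fix: the flag is copied from the original declaration.  */
static struct fn_decl
make_dispatcher_fixed (const struct fn_decl *fn)
{
  assert (fn != NULL);
  struct fn_decl dispatcher = { fn->name, true };
  dispatcher.nothrow = fn->nothrow;
  return dispatcher;
}
```

With a function that may throw, the old model yields a dispatcher that claims it cannot, which matches the abort described above; the fixed model keeps the two declarations in agreement.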
> >
> > Looks quite obvious.  The middle-end to target interface is a bit iffy
> > since we have
> > to duplicate this everywhere.  There's also other flags like
> > pure/const and noreturn
> > that do not impose correctness issues but may cause irritations if the IL 
> > gets
> > a call to the dispatcher not marked noreturn but there's no code following.
> >
> > That said, the fix looks good to me.
> >
> > Thanks,
> > Richard.
> >
> > > PR ipa/106627
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/i386/i386-features.cc 
> > > (ix86_get_function_versions_dispatcher): Set TREE_NOTHROW
> > > correctly for dispatcher declaration
> > > * config/rs6000/rs6000.cc 
> > > (rs6000_get_function_versions_dispatcher): Likewise
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * g++.target/i386/pr106627.C: New test.
> > > ---
> > >  gcc/config/i386/i386-features.cc |  1 +
> > >  gcc/config/rs6000/rs6000.cc  |  1 +
> > >  gcc/testsuite/g++.target/i386/pr106627.C | 30 
> > >  3 files changed, 32 insertions(+)
> > >  create mode 100644 gcc/testsuite/g++.target/i386/pr106627.C
> > >
> > > diff --git a/gcc/config/i386/i386-features.cc 
> > > b/gcc/config/i386/i386-features.cc
> > > index d6bb66cbe01..5b3b1aeff28 100644
> > > --- a/gcc/config/i386/i386-features.cc
> > > +++ b/gcc/config/i386/i386-features.cc
> > > @@ -3268,6 +3268,7 @@ ix86_get_function_versions_dispatcher (void *decl)
> > >
> > >/* Right now, the dispatching is done via ifunc.  */
> > >dispatch_decl = make_dispatcher_decl (default_node->decl);
> > > +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
> > >
> > >dispatcher_node = cgraph_node::get_create (dispatch_decl);
> > >gcc_assert (dispatcher_node != NULL);
> > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> > > index 2f3146e56f8..9280da8a5c8 100644
> > > --- a/gcc/config/rs6000/rs6000.cc
> > > +++ b/gcc/config/rs6000/rs6000.cc
> > > @@ -24861,6 +24861,7 @@ rs6000_get_function_versions_dispatcher (void 
> > > *decl)
> > >
> > >/* Right now, the dispatching is done via ifunc.  */
> > >dispatch_decl = make_dispatcher_decl (default_node->decl);
> > > +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
> > >
> > >dispatcher_node = cgraph_node::get_create (dispatch_decl);
> > >gcc_assert (dispatcher_node != NULL);
> > > diff --git a/gcc/testsuite/g++.target/i386/pr106627.C 
> > > b/gcc/testsuite/g++.target/i386/pr106627.C
> > > new file mode 100644
> > > index 000..a67f5ae4813
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.target/i386/pr106627.C
> > > @@ -0,0 +1,30 @@
> > > +/* PR c++/103012 Exception handling with multiversioned functions */
> > > +/* { dg-do run } */