[PATCH] hardcfr: libgcc sym versioning

2023-11-29 Thread Alexandre Oliva


The libgcc-exported runtime component of control flow redundancy
hardening was missing symbol versioning information.  Add it.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  libgcc/ChangeLog

* libgcc-std.ver.in (__hardcfr_check): Add to GCC_14.0.0.
---
 libgcc/libgcc-std.ver.in | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libgcc/libgcc-std.ver.in b/libgcc/libgcc-std.ver.in
index f092752136aff..de00db647570c 100644
--- a/libgcc/libgcc-std.ver.in
+++ b/libgcc/libgcc-std.ver.in
@@ -1956,4 +1956,5 @@ GCC_14.0.0 {
   __PFX__fixdfbitint
   __PFX__floatbitintsf
   __PFX__floatbitintdf
+  __PFX__hardcfr_check
 }


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


RE: RE: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-11-29 Thread Li, Pan2
Take the file riscv-gnu-toolchain/newlib/newlib/libc/stdlib/mprec.c as an
example; the arguments and related local variables are listed below.

riscv_legitimize_move
  mode = E_DFmode
  dest = (reg:DF 144 [  ])
  src = (subreg:DF (reg:V2SI 170) 0)

nunits = 1
smode = {m_mode = E_DFmode}

Pan

From: juzhe.zh...@rivai.ai
Sent: Thursday, November 30, 2023 3:33 PM
To: Li, Pan2; gcc-patches
Cc: Wang, Yanzhang; kito.cheng
Subject: Re: RE: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

What is the RTX of the operand?


juzhe.zh...@rivai.ai

From: Li, Pan2
Date: 2023-11-30 15:31
To: juzhe.zh...@rivai.ai; gcc-patches
CC: Wang, Yanzhang; kito.cheng
Subject: RE: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f
> Why doesn't get_vector_mode find an existing vector mode?

Because zve32f is enabled here, but we try get_vector_mode for E_V1DFmode.
According to the ISA, FP64 is not supported with zve32f.

Pan

From: juzhe.zh...@rivai.ai
Sent: Thursday, November 30, 2023 3:24 PM
To: Li, Pan2; gcc-patches
Cc: Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

Why doesn't get_vector_mode find an existing vector mode?

A vector mode must exist; otherwise, it will cause an ICE in other situations.


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-11-30 15:21
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f
From: Pan Li <pan2...@intel.com>

When calling require () on the result of get_vector_mode in
riscv_legitimize_move, the precondition is that the mode exists.
Otherwise we get E_VOIDmode and hit an ICE when calling require ().

Typically we should first check whether the mode exists before
calling require (), or, more conveniently, use exists (U *mode),
which stores the expected mode if it exists and leaves *mode
unchanged if not.

This patch fixes the problem by using exists (U *mode) instead of
require () after get_vector_mode.

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Use
exists (U *mode) instead of calling require () directly.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv.cc | 47 ++---
.../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
2 files changed, 79 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..19413b2c976 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2615,32 +2615,39 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-  for (unsigned int i = 0; i < num; i++)
+  opt_machine_mode opt_mode = riscv_vector::get_vector_mode (smode, nunits);
+
+  if (opt_mode.exists (&vmode))
{
-   rtx result;
-   if (num == 1)
- result = dest;
-   else if (i == 0)
- result = gen_lowpart (smode, dest);
-   else
- result = gen_reg_rtx (smode);
-   riscv_vector::emit_vec_extract (result, v, index + i);
+   rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-   if (i == 1)
+   for (unsigned int i = 0; i < num; i++)
{
-   rtx tmp
- = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
- gen_int_mode (32, Pmode), NULL_RTX, 0,
- OPTAB_DIRECT);
-   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-OPTAB_DIRECT);
-   emit_move_insn (dest, tmp2);
+   rtx result;
+   if (num == 1)
+ result = dest;
+   else if (i == 0)
+ result = gen_lowpart (smode, dest);
+   else
+ result = gen_reg_rtx (smode);
+
+   riscv_vector::emit_vec_extract (result, v, index + i);
+
+   if (i == 1)
+ {
+   rtx tmp = expand_binop (Pmode, ashl_optab,
+   gen_lowpart (Pmode, result),
+   gen_int_mode (32, Pmode), NULL_RTX, 0,
+   OPTAB_DIRECT);
+   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest,
+NULL_RTX, 0,
+OPTAB_DIRECT);
+   emit_move_insn (dest, tmp2);
+ }
}
+   return true;
}
-  

RE: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-11-29 Thread Li, Pan2
> Why doesn't get_vector_mode find an existing vector mode?

Because zve32f is enabled here, but we try get_vector_mode for E_V1DFmode.
According to the ISA, FP64 is not supported with zve32f.

Pan
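
To make the point about zve32f concrete, here is a small C++ sketch of which floating-point element widths each embedded vector subset provides, as I read the RISC-V vector specification. The names zve_caps, zve_table, zve_lookup and fp_vector_element_ok are made up for illustration and are not GCC functions; FP16 (zvfh) is deliberately not modelled.

```cpp
#include <cassert>
#include <cstring>

// Capabilities of the Zve* embedded vector subsets.
// Hypothetical helper type for illustration, not part of GCC.
struct zve_caps
{
  const char *name;
  int elen;   // Maximum element width in bits.
  bool fp32;  // Single-precision vector elements supported.
  bool fp64;  // Double-precision vector elements supported.
};

static const zve_caps zve_table[] = {
  {"zve32x", 32, false, false},
  {"zve32f", 32, true, false},
  {"zve64x", 64, false, false},
  {"zve64f", 64, true, false},
  {"zve64d", 64, true, true},
};

// Return the table entry for EXT, or nullptr if EXT is not modelled.
static const zve_caps *
zve_lookup (const char *ext)
{
  assert (ext != nullptr);
  for (const zve_caps &c : zve_table)
    if (std::strcmp (c.name, ext) == 0)
      return &c;
  return nullptr;
}

// True if EXT provides FP vector elements that are FP_BITS (32 or 64) wide.
bool
fp_vector_element_ok (const char *ext, int fp_bits)
{
  const zve_caps *c = zve_lookup (ext);
  if (!c)
    return false;
  if (fp_bits == 32)
    return c->fp32;
  if (fp_bits == 64)
    return c->fp64;
  return false;
}
```

With this table, a request for a DF-element vector under zve32f is rejected, which matches the situation described above where no V1DF mode is available.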

From: juzhe.zh...@rivai.ai
Sent: Thursday, November 30, 2023 3:24 PM
To: Li, Pan2; gcc-patches
Cc: Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

Why doesn't get_vector_mode find an existing vector mode?

A vector mode must exist; otherwise, it will cause an ICE in other situations.


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-11-30 15:21
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f
From: Pan Li <pan2...@intel.com>

When calling require () on the result of get_vector_mode in
riscv_legitimize_move, the precondition is that the mode exists.
Otherwise we get E_VOIDmode and hit an ICE when calling require ().

Typically we should first check whether the mode exists before
calling require (), or, more conveniently, use exists (U *mode),
which stores the expected mode if it exists and leaves *mode
unchanged if not.

This patch fixes the problem by using exists (U *mode) instead of
require () after get_vector_mode.

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Use
exists (U *mode) instead of calling require () directly.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv.cc | 47 ++---
.../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
2 files changed, 79 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..19413b2c976 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2615,32 +2615,39 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-  for (unsigned int i = 0; i < num; i++)
+  opt_machine_mode opt_mode = riscv_vector::get_vector_mode (smode, nunits);
+
+  if (opt_mode.exists (&vmode))
{
-   rtx result;
-   if (num == 1)
- result = dest;
-   else if (i == 0)
- result = gen_lowpart (smode, dest);
-   else
- result = gen_reg_rtx (smode);
-   riscv_vector::emit_vec_extract (result, v, index + i);
+   rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-   if (i == 1)
+   for (unsigned int i = 0; i < num; i++)
{
-   rtx tmp
- = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
- gen_int_mode (32, Pmode), NULL_RTX, 0,
- OPTAB_DIRECT);
-   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-OPTAB_DIRECT);
-   emit_move_insn (dest, tmp2);
+   rtx result;
+   if (num == 1)
+ result = dest;
+   else if (i == 0)
+ result = gen_lowpart (smode, dest);
+   else
+ result = gen_reg_rtx (smode);
+
+   riscv_vector::emit_vec_extract (result, v, index + i);
+
+   if (i == 1)
+ {
+   rtx tmp = expand_binop (Pmode, ashl_optab,
+   gen_lowpart (Pmode, result),
+   gen_int_mode (32, Pmode), NULL_RTX, 0,
+   OPTAB_DIRECT);
+   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest,
+NULL_RTX, 0,
+OPTAB_DIRECT);
+   emit_move_insn (dest, tmp2);
+ }
}
+   return true;
}
-  return true;
 }
   /* Expand
(set (reg:QI target) (mem:QI (address)))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32f_zvfh_zfh -mabi=lp64 -O2" } */
+
+#include 
+
+union double_union
+{
+  double d;
+  __uint32_t i[2];
+};
+
+#define word0(x)  (x.i[1])
+#define word1(x)  (x.i[0])
+
+#define P 53
+#define Exp_shift 20
+#define Exp_msk1  ((__uint32_t)0x10L)
+#define Exp_mask  ((__uint32_t)0x7ff0L)
+
+double ulp (double _x)
+{
+  union double_union x, a;
+  register int L;
+
+  x.d = _x;
+  L = (word0 (x) & Exp_mask) - (P - 1) * Exp_msk1;
+
+  if (L > 0)
+{
+  L |= Exp_msk1 >> 4;
+  word0 (a) = L;
+  word1 (a) = 0;
+}
+  else
+{
+  L = -L >> Exp_shift;
+  if (L < Exp_shift)
+ {
+   word0 (a) = 0x8 >> L;
+   word1 (a) = 0;
+ }
+  else
+ {
+   word0 (a) = 0;
+   L -= Exp_shift;
+   

Re: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-11-29 Thread juzhe.zh...@rivai.ai
Why doesn't get_vector_mode find an existing vector mode?

A vector mode must exist; otherwise, it will cause an ICE in other situations.



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2023-11-30 15:21
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f
From: Pan Li 
 
When calling require () on the result of get_vector_mode in
riscv_legitimize_move, the precondition is that the mode exists.
Otherwise we get E_VOIDmode and hit an ICE when calling require ().

Typically we should first check whether the mode exists before
calling require (), or, more conveniently, use exists (U *mode),
which stores the expected mode if it exists and leaves *mode
unchanged if not.

This patch fixes the problem by using exists (U *mode) instead of
require () after get_vector_mode.

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Use
exists (U *mode) instead of calling require () directly.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/pr112743-2.c: New test.
 
Signed-off-by: Pan Li 
---
gcc/config/riscv/riscv.cc | 47 ++---
.../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
2 files changed, 79 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..19413b2c976 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2615,32 +2615,39 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-  for (unsigned int i = 0; i < num; i++)
+  opt_machine_mode opt_mode = riscv_vector::get_vector_mode (smode, nunits);
+
+  if (opt_mode.exists (&vmode))
{
-   rtx result;
-   if (num == 1)
- result = dest;
-   else if (i == 0)
- result = gen_lowpart (smode, dest);
-   else
- result = gen_reg_rtx (smode);
-   riscv_vector::emit_vec_extract (result, v, index + i);
+   rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-   if (i == 1)
+   for (unsigned int i = 0; i < num; i++)
{
-   rtx tmp
- = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
- gen_int_mode (32, Pmode), NULL_RTX, 0,
- OPTAB_DIRECT);
-   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-OPTAB_DIRECT);
-   emit_move_insn (dest, tmp2);
+   rtx result;
+   if (num == 1)
+ result = dest;
+   else if (i == 0)
+ result = gen_lowpart (smode, dest);
+   else
+ result = gen_reg_rtx (smode);
+
+   riscv_vector::emit_vec_extract (result, v, index + i);
+
+   if (i == 1)
+ {
+   rtx tmp = expand_binop (Pmode, ashl_optab,
+   gen_lowpart (Pmode, result),
+   gen_int_mode (32, Pmode), NULL_RTX, 0,
+   OPTAB_DIRECT);
+   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest,
+NULL_RTX, 0,
+OPTAB_DIRECT);
+   emit_move_insn (dest, tmp2);
+ }
}
+   return true;
}
-  return true;
 }
   /* Expand
(set (reg:QI target) (mem:QI (address)))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32f_zvfh_zfh -mabi=lp64 -O2" } */
+
+#include 
+
+union double_union
+{
+  double d;
+  __uint32_t i[2];
+};
+
+#define word0(x)  (x.i[1])
+#define word1(x)  (x.i[0])
+
+#define P 53
+#define Exp_shift 20
+#define Exp_msk1  ((__uint32_t)0x10L)
+#define Exp_mask  ((__uint32_t)0x7ff0L)
+
+double ulp (double _x)
+{
+  union double_union x, a;
+  register int L;
+
+  x.d = _x;
+  L = (word0 (x) & Exp_mask) - (P - 1) * Exp_msk1;
+
+  if (L > 0)
+{
+  L |= Exp_msk1 >> 4;
+  word0 (a) = L;
+  word1 (a) = 0;
+}
+  else
+{
+  L = -L >> Exp_shift;
+  if (L < Exp_shift)
+ {
+   word0 (a) = 0x8 >> L;
+   word1 (a) = 0;
+ }
+  else
+ {
+   word0 (a) = 0;
+   L -= Exp_shift;
+   word1 (a) = L >= 31 ? 1 : 1 << (31 - L);
+ }
+}
+
+  return a.d;
+}
-- 
2.34.1
 
 


Re: [PATCH #2/4] c++: mark short-enums as packed

2023-11-29 Thread Alexandre Oliva
On Nov 29, 2023, Jason Merrill  wrote:

> On 11/29/23 04:39, Alexandre Oliva wrote:
>> Hello, Jason,
>> On Nov 22, 2023, Jason Merrill  wrote:
>> 
>>> On 11/22/23 13:12, Jason Merrill wrote:
>>>> I'm coming to the conclusion that your C++ patch is fine but we
>>>> should remove the TYPE_PACKED warning from
>>>> check_address_or_pointer_of_packed_member.  And maybe add
>>>> -Wcast-align=strict to -Wextra.
>> 
>>> Since I seem to have opinions, I'm preparing a patch for this.
>> Thanks for that patch.  It makes sense to me, but I suppose that, if
>> it goes in, I should revert the already-installed #1/4 in this series
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637244.html
>> rather than install #4/4 that Mike approved.
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637336.html
>> I wasn't sure whether your earlier conclusion (quoted above) was meant
>> as an 'Ok' for the C++ patch.  Please confirm if so.  TIA,

> Yes.

Thanks, the C++ patch is now in, and so is the testsuite patch reversal.
The pr108251 analyzer tests are again failing on -fshort-enum platforms,
in hope that your patchset is going to make it.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


[PATCH v1] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-11-29 Thread pan2 . li
From: Pan Li 

When calling require () on the result of get_vector_mode in
riscv_legitimize_move, the precondition is that the mode exists.
Otherwise we get E_VOIDmode and hit an ICE when calling require ().

Typically we should first check whether the mode exists before
calling require (), or, more conveniently, use exists (U *mode),
which stores the expected mode if it exists and leaves *mode
unchanged if not.

This patch fixes the problem by using exists (U *mode) instead of
require () after get_vector_mode.

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Use
exists (U *mode) instead of calling require () directly.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 47 ++---
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
 2 files changed, 79 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..19413b2c976 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2615,32 +2615,39 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
 
-  for (unsigned int i = 0; i < num; i++)
+  opt_machine_mode opt_mode = riscv_vector::get_vector_mode (smode, nunits);
+
+  if (opt_mode.exists (&vmode))
{
- rtx result;
- if (num == 1)
-   result = dest;
- else if (i == 0)
-   result = gen_lowpart (smode, dest);
- else
-   result = gen_reg_rtx (smode);
- riscv_vector::emit_vec_extract (result, v, index + i);
+ rtx v = gen_lowpart (vmode, SUBREG_REG (src));
 
- if (i == 1)
+ for (unsigned int i = 0; i < num; i++)
{
- rtx tmp
-   = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
-   gen_int_mode (32, Pmode), NULL_RTX, 0,
-   OPTAB_DIRECT);
- rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-  OPTAB_DIRECT);
- emit_move_insn (dest, tmp2);
+ rtx result;
+ if (num == 1)
+   result = dest;
+ else if (i == 0)
+   result = gen_lowpart (smode, dest);
+ else
+   result = gen_reg_rtx (smode);
+
+ riscv_vector::emit_vec_extract (result, v, index + i);
+
+ if (i == 1)
+   {
+ rtx tmp = expand_binop (Pmode, ashl_optab,
+ gen_lowpart (Pmode, result),
+ gen_int_mode (32, Pmode), NULL_RTX, 0,
+ OPTAB_DIRECT);
+ rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest,
+  NULL_RTX, 0,
+  OPTAB_DIRECT);
+ emit_move_insn (dest, tmp2);
+   }
}
+ return true;
}
-  return true;
 }
   /* Expand
(set (reg:QI target) (mem:QI (address)))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32f_zvfh_zfh -mabi=lp64 -O2" } */
+
+#include 
+
+union double_union
+{
+  double d;
+  __uint32_t i[2];
+};
+
+#define word0(x)  (x.i[1])
+#define word1(x)  (x.i[0])
+
+#define P 53
+#define Exp_shift 20
+#define Exp_msk1  ((__uint32_t)0x10L)
+#define Exp_mask  ((__uint32_t)0x7ff0L)
+
+double ulp (double _x)
+{
+  union double_union x, a;
+  register int L;
+
+  x.d = _x;
+  L = (word0 (x) & Exp_mask) - (P - 1) * Exp_msk1;
+
+  if (L > 0)
+{
+  L |= Exp_msk1 >> 4;
+  word0 (a) = L;
+  word1 (a) = 0;
+}
+  else
+{
+  L = -L >> Exp_shift;
+  if (L < Exp_shift)
+   {
+ word0 (a) = 0x8 >> L;
+ word1 (a) = 0;
+   }
+  else
+   {
+ word0 (a) = 0;
+ L -= Exp_shift;
+ word1 (a) = L >= 31 ? 1 : 1 << (31 - L);
+   }
+}
+
+  return a.d;
+}
-- 
2.34.1
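
The difference between require () and exists (U *mode) that the commit message relies on can be sketched outside of GCC. The following is a simplified stand-in, not GCC's actual opt_mode class: the enum sketch_mode, the class opt_mode_sketch, and the helpers lookup_vector_mode and try_vector_path are all hypothetical names, and the lookup simply pretends that FP64 vector modes are unavailable, as under zve32f.

```cpp
#include <cassert>

// Hypothetical stand-in for machine_mode; only two values are modelled.
enum sketch_mode { SK_VOIDmode, SK_V2SImode };

// Simplified optional-mode wrapper (an illustration, not GCC's real class).
class opt_mode_sketch
{
public:
  opt_mode_sketch () : m_mode (SK_VOIDmode) {}
  explicit opt_mode_sketch (sketch_mode m) : m_mode (m) {}

  // True if a mode is present.
  bool exists () const { return m_mode != SK_VOIDmode; }

  // If a mode is present, store it in *OUT and return true;
  // otherwise leave *OUT untouched and return false.
  bool exists (sketch_mode *out) const
  {
    if (!exists ())
      return false;
    *out = m_mode;
    return true;
  }

  // Only valid when a mode is present; asserts otherwise.
  sketch_mode require () const
  {
    assert (exists ());
    return m_mode;
  }

private:
  sketch_mode m_mode;
};

// Hypothetical lookup: no result when a 64-bit FP element is requested.
opt_mode_sketch
lookup_vector_mode (bool want_fp64)
{
  return want_fp64 ? opt_mode_sketch () : opt_mode_sketch (SK_V2SImode);
}

// Take the vector path only if a mode was found, instead of requiring one.
bool
try_vector_path (bool want_fp64, sketch_mode *vmode)
{
  return lookup_vector_mode (want_fp64).exists (vmode);
}
```

In this sketch a failed lookup makes try_vector_path return false so the caller can fall through to another expansion path, whereas calling require () on the same result would trip the assertion.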



Re: [PATCH] RISC-V: Update crypto vector ISA info with latest spec

2023-11-29 Thread Kito Cheng
LGTM
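
For readers following the quoted riscv_implied_info hunk, here is a rough C++ sketch of how such a pair table expands transitively. The pairs are copied from the post-patch context visible in the quoted diff; the real table in GCC has many more entries, and implied_pair, implied_info and expand_implied are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One "EXT implies IMPLIED" entry, mirroring the shape of the quoted table.
struct implied_pair
{
  const char *ext;
  const char *implied;
};

// Only the pairs visible in the quoted hunk, after the patch.
static const implied_pair implied_info[] = {
  {"zvkn", "zvkned"}, {"zvkn", "zvknhb"}, {"zvkn", "zvkb"}, {"zvkn", "zvkt"},
  {"zvknc", "zvkn"},  {"zvknc", "zvbc"},
  {"zvkng", "zvkg"},
  {"zvks", "zvksed"}, {"zvks", "zvksh"},  {"zvks", "zvkb"}, {"zvks", "zvkt"},
  {"zvksc", "zvks"},  {"zvksc", "zvbc"},
};

// Return EXT together with everything it implies, directly or indirectly.
std::set<std::string>
expand_implied (const std::string &ext)
{
  assert (!ext.empty ());
  std::set<std::string> seen;
  std::vector<std::string> worklist{ext};
  while (!worklist.empty ())
    {
      std::string cur = worklist.back ();
      worklist.pop_back ();
      // Skip extensions that were already expanded.
      if (!seen.insert (cur).second)
        continue;
      for (const implied_pair &p : implied_info)
        if (cur == p.ext)
          worklist.push_back (p.implied);
    }
  return seen;
}
```

Under these pairs, expanding zvknc pulls in zvkb through zvkn and no longer pulls in zvbb, which is the behaviour the updated test cases check via the feature macros.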

On Thu, Nov 30, 2023 at 2:16 PM Feng Wang  wrote:
>
> This patch adds the Zvkb subset of the crypto vector extension. The
> corresponding test cases have also been modified.
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Add zvkb ISA info.
> * config/riscv/riscv.opt: Add Mask(ZVKB)
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zvkn-1.c: Replace zvbb with zvkb.
> * gcc.target/riscv/zvkn.c:  Ditto.
> * gcc.target/riscv/zvknc-1.c: Ditto.
> * gcc.target/riscv/zvknc-2.c: Ditto.
> * gcc.target/riscv/zvknc.c: Ditto.
> * gcc.target/riscv/zvkng-1.c: Ditto.
> * gcc.target/riscv/zvkng-2.c: Ditto.
> * gcc.target/riscv/zvkng.c: Ditto.
> * gcc.target/riscv/zvks-1.c: Ditto.
> * gcc.target/riscv/zvks.c: Ditto.
> * gcc.target/riscv/zvksc-1.c: Ditto.
> * gcc.target/riscv/zvksc-2.c: Ditto.
> * gcc.target/riscv/zvksc.c: Ditto.
> * gcc.target/riscv/zvksg-1.c: Ditto.
> * gcc.target/riscv/zvksg-2.c: Ditto.
> * gcc.target/riscv/zvksg.c: Ditto.
> ---
>  gcc/common/config/riscv/riscv-common.cc  | 6 --
>  gcc/config/riscv/riscv.opt   | 2 ++
>  gcc/testsuite/gcc.target/riscv/zvkn-1.c  | 8 
>  gcc/testsuite/gcc.target/riscv/zvkn.c| 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvknc-1.c | 8 
>  gcc/testsuite/gcc.target/riscv/zvknc-2.c | 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvknc.c   | 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvkng-1.c | 8 
>  gcc/testsuite/gcc.target/riscv/zvkng-2.c | 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvkng.c   | 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvks-1.c  | 8 
>  gcc/testsuite/gcc.target/riscv/zvks.c| 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvksc-1.c | 8 
>  gcc/testsuite/gcc.target/riscv/zvksc-2.c | 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvksc.c   | 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvksg-1.c | 8 
>  gcc/testsuite/gcc.target/riscv/zvksg-2.c | 4 ++--
>  gcc/testsuite/gcc.target/riscv/zvksg.c   | 4 ++--
>  18 files changed, 50 insertions(+), 46 deletions(-)
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index ded85b4c578..6c210412515 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -106,7 +106,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
>
>{"zvkn", "zvkned"},
>{"zvkn", "zvknhb"},
> -  {"zvkn", "zvbb"},
> +  {"zvkn", "zvkb"},
>{"zvkn", "zvkt"},
>{"zvknc", "zvkn"},
>{"zvknc", "zvbc"},
> @@ -114,7 +114,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
>{"zvkng", "zvkg"},
>{"zvks", "zvksed"},
>{"zvks", "zvksh"},
> -  {"zvks", "zvbb"},
> +  {"zvks", "zvkb"},
>{"zvks", "zvkt"},
>{"zvksc", "zvks"},
>{"zvksc", "zvbc"},
> @@ -253,6 +253,7 @@ static const struct riscv_ext_version 
> riscv_ext_version_table[] =
>
>{"zvbb", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zvbc", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"zvkb", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zvkg", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zvkned", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zvknha", ISA_SPEC_CLASS_NONE, 1, 0},
> @@ -1624,6 +1625,7 @@ static const riscv_ext_flag_table_t 
> riscv_ext_flag_table[] =
>
>    {"zvbb",     &gcc_options::x_riscv_zvb_subext, MASK_ZVBB},
>    {"zvbc",     &gcc_options::x_riscv_zvb_subext, MASK_ZVBC},
> +  {"zvkb",     &gcc_options::x_riscv_zvb_subext, MASK_ZVKB},
>    {"zvkg",     &gcc_options::x_riscv_zvk_subext, MASK_ZVKG},
>    {"zvkned",   &gcc_options::x_riscv_zvk_subext, MASK_ZVKNED},
>    {"zvknha",   &gcc_options::x_riscv_zvk_subext, MASK_ZVKNHA},
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index 0c6517bdc8b..78186fff6c5 100644
> --- a/gcc/config/riscv/riscv.opt
> +++ b/gcc/config/riscv/riscv.opt
> @@ -319,6 +319,8 @@ Mask(ZVBB) Var(riscv_zvb_subext)
>
>  Mask(ZVBC) Var(riscv_zvb_subext)
>
> +Mask(ZVKB) Var(riscv_zvb_subext)
> +
>  TargetVariable
>  int riscv_zvk_subext
>
> diff --git a/gcc/testsuite/gcc.target/riscv/zvkn-1.c 
> b/gcc/testsuite/gcc.target/riscv/zvkn-1.c
> index 23b255b4779..069a8f66c92 100644
> --- a/gcc/testsuite/gcc.target/riscv/zvkn-1.c
> +++ b/gcc/testsuite/gcc.target/riscv/zvkn-1.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
> -/* { dg-options "-march=rv64gc_zvkned_zvknhb_zvbb_zvkt" { target { rv64 } } } */
> -/* { dg-options "-march=rv32gc_zvkned_zvknhb_zvbb_zvkt" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_zvkned_zvknhb_zvkb_zvkt" { target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_zvkned_zvknhb_zvkb_zvkt" { target { rv32 } } } */
>
>  #ifndef __riscv_zvkn
>  #error "Feature macro for `Zvkn' not defined"
> @@ -14,8 +14,8 @@
>  #error "Feature macro for `Zvknhb' not defined"
>  #endif
>
> -#ifndef __riscv_zvbb
> -#error "Feature macro for `Zvbb' not defined"
> +#ifndef __riscv_zvkb
> +#error "Feature macro for `Zvkb' not defined"
>  

Re: [PATCH] RISC-V: Support widening register overlap for vf4/vf8

2023-11-29 Thread Kito Cheng
LGTM, thanks :)

On Thu, Nov 30, 2023 at 2:49 PM Juzhe-Zhong  wrote:
>
>
> size_t
> foo (char const *buf, size_t len)
> {
>   size_t sum = 0;
>   size_t vl = __riscv_vsetvlmax_e8m8 ();
>   size_t step = vl * 4;
>   const char *it = buf, *end = buf + len;
>   for (; it + step <= end;)
> {
>   vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl);
>   it += vl;
>   vint8m1_t v1 = __riscv_vle8_v_i8m1 ((void *) it, vl);
>   it += vl;
>   vint8m1_t v2 = __riscv_vle8_v_i8m1 ((void *) it, vl);
>   it += vl;
>   vint8m1_t v3 = __riscv_vle8_v_i8m1 ((void *) it, vl);
>   it += vl;
>
>   asm volatile("nop" ::: "memory");
>   vint64m8_t vw0 = __riscv_vsext_vf8_i64m8 (v0, vl);
>   vint64m8_t vw1 = __riscv_vsext_vf8_i64m8 (v1, vl);
>   vint64m8_t vw2 = __riscv_vsext_vf8_i64m8 (v2, vl);
>   vint64m8_t vw3 = __riscv_vsext_vf8_i64m8 (v3, vl);
>
>   asm volatile("nop" ::: "memory");
>   size_t sum0 = __riscv_vmv_x_s_i64m8_i64 (vw0);
>   size_t sum1 = __riscv_vmv_x_s_i64m8_i64 (vw1);
>   size_t sum2 = __riscv_vmv_x_s_i64m8_i64 (vw2);
>   size_t sum3 = __riscv_vmv_x_s_i64m8_i64 (vw3);
>
>   sum += sumation (sum0, sum1, sum2, sum3);
> }
>   return sum;
> }
>
> Before this patch:
>
> add a3,s0,s1
> add a4,s6,s1
> add a5,s7,s1
> vsetvli zero,s0,e64,m8,ta,ma
> vle8.v  v4,0(s1)
> vle8.v  v3,0(a3)
> mv  s1,s2
> vle8.v  v2,0(a4)
> vle8.v  v1,0(a5)
> nop
> vsext.vf8   v8,v4
> vsext.vf8   v16,v2
> vs8r.v  v8,0(sp)
> vsext.vf8   v24,v1
> vsext.vf8   v8,v3
> nop
> vmv.x.s a1,v8
> vl8re64.v   v8,0(sp)
> vmv.x.s a3,v24
> vmv.x.s a2,v16
> vmv.x.s a0,v8
> add s2,s2,s5
> call    sumation
> add s3,s3,a0
> bgeu    s4,s2,.L5
>
> After this patch:
>
> add a3,s0,s1
> add a4,s6,s1
> add a5,s7,s1
> vsetvli zero,s0,e64,m8,ta,ma
> vle8.v  v15,0(s1)
> vle8.v  v23,0(a3)
> mv  s1,s2
> vle8.v  v31,0(a4)
> vle8.v  v7,0(a5)
> vsext.vf8   v8,v15
> vsext.vf8   v16,v23
> vsext.vf8   v24,v31
> vsext.vf8   v0,v7
> vmv.x.s a3,v0
> vmv.x.s a2,v24
> vmv.x.s a1,v16
> vmv.x.s a0,v8
> add s2,s2,s5
> call    sumation
> add s3,s3,a0
> bgeu    s4,s2,.L5
>
> PR target/112431
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md: Add widening overlap of vf2/vf4.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr112431-16.c: New test.
> * gcc.target/riscv/rvv/base/pr112431-17.c: New test.
> * gcc.target/riscv/rvv/base/pr112431-18.c: New test.
>
> ---
>  gcc/config/riscv/vector.md| 38 ++-
>  .../gcc.target/riscv/rvv/base/pr112431-16.c   | 68 +++
>  .../gcc.target/riscv/rvv/base/pr112431-17.c   | 51 ++
>  .../gcc.target/riscv/rvv/base/pr112431-18.c   | 51 ++
>  4 files changed, 190 insertions(+), 18 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-16.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-17.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-18.c
>
> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index 6b891c11324..e5d62c6e58b 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -3704,43 +3704,45 @@
>
>  ;; Vector Quad-Widening Sign-extend and Zero-extend.
>  (define_insn "@pred__vf4"
> -  [(set (match_operand:VQEXTI 0 "register_operand"  "=,")
> +  [(set (match_operand:VQEXTI 0 "register_operand"   "=vr,   vr, 
>   vr,   vr, ?, ?")
> (if_then_else:VQEXTI
>   (unspec:
> -   [(match_operand: 1 "vector_mask_operand"   "vmWc1,vmWc1")
> -(match_operand 4 "vector_length_operand"  "   rK,   rK")
> -(match_operand 5 "const_int_operand"  "i,i")
> -(match_operand 6 "const_int_operand"  "i,i")
> -(match_operand 7 "const_int_operand"  "i,i")
> +   [(match_operand: 1 "vector_mask_operand"   
> "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
> +(match_operand 4 "vector_length_operand"  "   rK,   rK,  
>  rK,   rK,   rK,   rK")
> +(match_operand 5 "const_int_operand"  "i,i,  
>   i,i,i,i")
> +(match_operand 6 "const_int_operand"  "i,i,  
>   i,i,i,i")
> +(match_operand 7 "const_int_operand"  "i,i,  
>   i,i,i,i")
>  (reg:SI VL_REGNUM)
>  (reg:SI VTYPE_REGNUM)] 

[PATCH] RISC-V: Support widening register overlap for vf4/vf8

2023-11-29 Thread Juzhe-Zhong


size_t
foo (char const *buf, size_t len)
{
  size_t sum = 0;
  size_t vl = __riscv_vsetvlmax_e8m8 ();
  size_t step = vl * 4;
  const char *it = buf, *end = buf + len;
  for (; it + step <= end;)
{
  vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl);
  it += vl;
  vint8m1_t v1 = __riscv_vle8_v_i8m1 ((void *) it, vl);
  it += vl;
  vint8m1_t v2 = __riscv_vle8_v_i8m1 ((void *) it, vl);
  it += vl;
  vint8m1_t v3 = __riscv_vle8_v_i8m1 ((void *) it, vl);
  it += vl;
  
  asm volatile("nop" ::: "memory");
  vint64m8_t vw0 = __riscv_vsext_vf8_i64m8 (v0, vl);
  vint64m8_t vw1 = __riscv_vsext_vf8_i64m8 (v1, vl);
  vint64m8_t vw2 = __riscv_vsext_vf8_i64m8 (v2, vl);
  vint64m8_t vw3 = __riscv_vsext_vf8_i64m8 (v3, vl);

  asm volatile("nop" ::: "memory");
  size_t sum0 = __riscv_vmv_x_s_i64m8_i64 (vw0);
  size_t sum1 = __riscv_vmv_x_s_i64m8_i64 (vw1);
  size_t sum2 = __riscv_vmv_x_s_i64m8_i64 (vw2);
  size_t sum3 = __riscv_vmv_x_s_i64m8_i64 (vw3);

  sum += sumation (sum0, sum1, sum2, sum3);
}
  return sum;
}

Before this patch:

add a3,s0,s1
add a4,s6,s1
add a5,s7,s1
vsetvli zero,s0,e64,m8,ta,ma
vle8.v  v4,0(s1)
vle8.v  v3,0(a3)
mv  s1,s2
vle8.v  v2,0(a4)
vle8.v  v1,0(a5)
nop
vsext.vf8   v8,v4
vsext.vf8   v16,v2
vs8r.v  v8,0(sp)
vsext.vf8   v24,v1
vsext.vf8   v8,v3
nop
vmv.x.s a1,v8
vl8re64.v   v8,0(sp)
vmv.x.s a3,v24
vmv.x.s a2,v16
vmv.x.s a0,v8
add s2,s2,s5
call    sumation
add s3,s3,a0
bgeu    s4,s2,.L5

After this patch:

add a3,s0,s1
add a4,s6,s1
add a5,s7,s1
vsetvli zero,s0,e64,m8,ta,ma
vle8.v  v15,0(s1)
vle8.v  v23,0(a3)
mv  s1,s2
vle8.v  v31,0(a4)
vle8.v  v7,0(a5)
vsext.vf8   v8,v15
vsext.vf8   v16,v23
vsext.vf8   v24,v31
vsext.vf8   v0,v7
vmv.x.s a3,v0
vmv.x.s a2,v24
vmv.x.s a1,v16
vmv.x.s a0,v8
add s2,s2,s5
call    sumation
add s3,s3,a0
bgeu    s4,s2,.L5
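
The register choices in the "after" listing (for example vsext.vf8 v8,v15) follow the RVV rule for widening operations: as I read the specification, when the destination EEW is greater than the source EEW and the source EMUL is at least 1, the source may overlap the destination only in the highest-numbered part of the destination register group. A minimal C++ sketch of that check, with a hypothetical helper name widening_overlap_ok and register groups modelled as [base, base + lmul):

```cpp
#include <cassert>

// Sketch of the widening-overlap rule for integer LMUL values only.
// DST occupies registers [dst_base, dst_base + dst_lmul) and SRC occupies
// [src_base, src_base + src_lmul); the destination group is the larger one.
bool
widening_overlap_ok (int dst_base, int dst_lmul, int src_base, int src_lmul)
{
  assert (src_lmul >= 1 && dst_lmul > src_lmul);
  int dst_end = dst_base + dst_lmul;
  int src_end = src_base + src_lmul;

  // Disjoint register groups are always acceptable.
  if (src_end <= dst_base || src_base >= dst_end)
    return true;

  // Otherwise the source must sit at the top of the destination group.
  return src_end == dst_end;
}
```

Applied to the listings above, v8 (m8) with source v15 (m1) passes because v15 is the last register of the v8..v15 group, while a source of v8 for the same destination would be rejected.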

PR target/112431

gcc/ChangeLog:

* config/riscv/vector.md: Add widening overlap of vf2/vf4.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-16.c: New test.
* gcc.target/riscv/rvv/base/pr112431-17.c: New test.
* gcc.target/riscv/rvv/base/pr112431-18.c: New test.

---
 gcc/config/riscv/vector.md| 38 ++-
 .../gcc.target/riscv/rvv/base/pr112431-16.c   | 68 +++
 .../gcc.target/riscv/rvv/base/pr112431-17.c   | 51 ++
 .../gcc.target/riscv/rvv/base/pr112431-18.c   | 51 ++
 4 files changed, 190 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-17.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-18.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 6b891c11324..e5d62c6e58b 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -3704,43 +3704,45 @@
 
 ;; Vector Quad-Widening Sign-extend and Zero-extend.
 (define_insn "@pred__vf4"
-  [(set (match_operand:VQEXTI 0 "register_operand"  "=,")
+  [(set (match_operand:VQEXTI 0 "register_operand"   "=vr,   vr,   
vr,   vr, ?, ?")
(if_then_else:VQEXTI
  (unspec:
-   [(match_operand: 1 "vector_mask_operand"   "vmWc1,vmWc1")
-(match_operand 4 "vector_length_operand"  "   rK,   rK")
-(match_operand 5 "const_int_operand"  "i,i")
-(match_operand 6 "const_int_operand"  "i,i")
-(match_operand 7 "const_int_operand"  "i,i")
+   [(match_operand: 1 "vector_mask_operand"   
"vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
+(match_operand 4 "vector_length_operand"  "   rK,   rK,   
rK,   rK,   rK,   rK")
+(match_operand 5 "const_int_operand"  "i,i,
i,i,i,i")
+(match_operand 6 "const_int_operand"  "i,i,
i,i,i,i")
+(match_operand 7 "const_int_operand"  "i,i,
i,i,i,i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
  (any_extend:VQEXTI
-   (match_operand: 3 "register_operand" "   vr,   vr"))
- (match_operand:VQEXTI 2 "vector_merge_operand"   "   vu,0")))]
+   (match_operand: 3 "register_operand" "  W43,  W43,  
W86,  W86,   vr,   vr"))
+ (match_operand:VQEXTI 2 

Re: hurd: Add multilib paths for gnu-x86_64

2023-11-29 Thread rep . dot . nop
On 27 November 2023 15:48:33 CET, Thomas Schwinge  
wrote:
>Hi!
>
>On 2023-10-28T21:19:59+0200, Samuel Thibault  wrote:
>> We need the multilib paths in gcc to find e.g. glibc crt files on
>> Debian.
>
>ACK.
>
>> This is essentially based on t-linux64 version.
>
>Yes, but isn't the overall setup diverged from GNU/Linux?
>
>Currently, x86_64 GNU/Hurd first gets 'i386/t-linux64', whose definitions
>are only later:
>
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -5828,6 +5828,9 @@ case ${target} in
>>   visium-*-*)
>>   target_cpu_default2="TARGET_CPU_$with_cpu"
>>   ;;
>> + x86_64-*-gnu*)
>> + tmake_file="$tmake_file i386/t-gnu64"
>> + ;;
>>  esac
>
>... then here (effectively) overwritten by 'i386/t-gnu64'.  Instead, I
>suppose, we should handle 'i386/t-linux64' and 'i386/t-gnu64' alike, and
>resolve relevant configuration differences.
>
>As far as I can tell, 'i386/t-linux64' is also used for multilib-enabled
>('test x$enable_targets = xall') x86 GNU/Linux, and that's not
>(correspondingly) done for x86 GNU/Hurd?
>
>However, such things can certainly be resolved incrementally, later on.
>I understand that your change does work for you as-is, so I've now pushed
>that to master branch in commit 5707e9db9c398d311defc80c5b7822c9a07ead60
>"hurd: Add multilib paths for gnu-x86_64", see attached.

+# To support i386, x86-64 and x32 libraries, the directory structrue

I guess one could legitimately understand this as a "structure setting 
standards " ;) but let's spell it structure nevertheless?

cheers


Re: [PATCH v6 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-29 Thread waffl3x

On Wednesday, November 29th, 2023 at 10:00 PM, Jason Merrill  
wrote:


> 
> 
> On 11/27/23 00:35, waffl3x wrote:
> 
> > I think this is cleaned up enough to be presentable. Bootstrapped but
> > not tested but I don't think I changed anything substantial. I am
> > running tests right now and will report if anything fails. I have not
> > fixed the problem in tsubst_lambda_expr that we talked about, that will
> > be first on my list tomorrow. While writing/cleaning up tests I had to
> > investigate some things, one of which is calling an overloaded
> > function, where one of the candidates is introduced by a using
> > declaration, is considered ambiguous. I haven't narrowed down the case
> > for this yet so I don't know if it's related to xobj member
> > functions/lambda with xobj parameters or not. I had to pull a few tests
> > because of it though.
> > 
> > I did not get as much done as I would have hoped today. This really
> > just serves as a small progress update. Once again, todo is the issue
> > you raised in tsubst_lambda_expr, and fixing handling of captures when
> > a const xobj parameter is deduced in a lambda call operator.
> 
> > +#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
> > +#define DECL_XOBJ_MEMBER_FUNC_P(NODE) \
> > +#define DECL_OBJECT_MEMBER_FUNC_P(NODE) \
> 
> 
> Let's use the full word FUNCTION in these macros for consistency with
> DECL_STATIC_FUNCTION_P.

Okay.

> > @@ -6544,7 +6544,7 @@ add_candidates (tree fns, tree first_arg, const 
> > vec<tree, va_gc> *args,
> > tree fn_first_arg = NULL_TREE;
> > const vec<tree, va_gc> *fn_args = args;
> > 
> > - if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn))
> > + if (DECL_OBJECT_MEMBER_FUNC_P (fn))
> > {
> > /* Figure out where the object arg comes from. If this
> > function is a non-static member and we didn't get an
> 
> 
> Hmm, that this function explicitly pulls out the object argument into
> first_arg strengthens your earlier argument that we shouldn't bother
> trying to handle null first_arg. But let's not mess with that in this
> patch.

Maybe I'll take some time to look into it when this patch is done, I
came across another block that seems to guarantee that first_arg gets
passed in a bit ago.

> > + val = handle_arg(TREE_VALUE (parm),
> 
> 
> Missing space.

Is there a script I can use for this so I'm not wasting your time on
little typos like this one?

> > /* We know an iobj parameter must be a reference. If our xobj
> > parameter is a pointer, we know this is not a redeclaration.
> > This also catches array parameters, those are pointers too. */
> > if (TYPE_PTR_P (xobj_param))
> > continue;
> 
> 
> Handling pointers specifically here seems unnecessary, they should be
> rare and will be handled by the next check for unrelated type.

Ah, it took me a second but I see it now, yeah I think I'll make this
change, with a comment that notes that it also handles the pointer case.

> > dealing with a by-value xobj parameter we can procede following
> 
> 
> "proceed"
> 
> > /* An operator function must either be a non-static member function
> > or have at least one parameter of a class, a reference to a class,
> > an enumeration, or a reference to an enumeration. 13.4.0.6 */
> > - if (! methodp || DECL_STATIC_FUNCTION_P (decl))
> > + /* This can just be DECL_STATIC_FUNCTION_P (decl) I think? */
> > + if ((!methodp && !DECL_XOBJ_MEMBER_FUNC_P (decl))
> > + || DECL_STATIC_FUNCTION_P (decl))
> 
> 
> No, it also needs to be true for non-members; rather, I think it can
> just be if (!DECL_OBJECT_FUNCTION_P (decl))

Yeah that seems to make sense, I'll try that.

> > + if (xobj_param)
> > + {
> > + quals = TYPE_UNQUALIFIED;
> > + if (TYPE_REF_P (xobj_param)
> > + && !(cp_type_quals (TREE_TYPE (xobj_param)) & TYPE_QUAL_CONST))
> > + LAMBDA_EXPR_MUTABLE_P (lambda_expr) = 1;
> > + }
> 
> 
> I don't think we want to mess with MUTABLE_P here. But then
> capture_decltype needs to be fixed to refer to the type of the object
> parameter rather than MUTABLE_P.
> 
> Actually, I think we can do away with the MUTABLE_P flag entirely. I'm
> going to push a patch to do that.

I've been working on code that concerns it today and yesterday and I
was beginning to think the same thing. My current version doesn't set
it at all. I'm happy to see it removed.

When I looked into whether I should or should not be setting it I found
a new bug with decltype((capture)) in lambdas with default capture. I
am not going to try to fix the value capture default case but I was
able to fix the ref capture default case. Basically decltype((x)) is
not dependent right now as far as I can tell. Both MSVC and clang have
the same bug so I'm not worried about it tbh.
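
For reference, the baseline behaviour that capture_decltype has to get
right can be checked with a small standalone test.  This is only a
sketch: the function names are made up for illustration, and it covers
ordinary by-copy captures in a non-template context, not xobj
parameters or the default-capture bug mentioned above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical test helper (not from the patch).  In a lambda that is
// not declared 'mutable', decltype((x)) for a by-copy capture names a
// const lvalue; with 'mutable' the const goes away.
inline bool capture_decltype_demo ()
{
  int x = 0;

  auto non_mutable = [=] {
    static_assert (std::is_same<decltype ((x)), const int &>::value,
                   "by-copy capture is const in a non-mutable lambda");
    return x;
  };

  auto is_mutable = [=] () mutable {
    static_assert (std::is_same<decltype ((x)), int &>::value,
                   "by-copy capture is modifiable in a mutable lambda");
    return ++x;  // modifies the closure's copy only
  };

  // The outer x is untouched because both lambdas captured by copy.
  return non_mutable () == 0 && is_mutable () == 1 && x == 0;
}

// Runtime wrapper so the check can be called from a test driver.
inline void check_capture_decltype ()
{
  assert (capture_decltype_demo ());
}
```

With an explicit object parameter there is no 'mutable' keyword to
consult, which is why the qualifiers would have to come from the object
parameter's type instead.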

> > - if (!LAMBDA_EXPR_STATIC_P (lambda_expr))
> > + if (!LAMBDA_EXPR_STATIC_P (lambda_expr)
> > + && !DECL_XOBJ_MEMBER_FUNC_P (fco))
> 
> 
> This could be if (DECL_IOBJ_MEMBER_FUNCTION_P (fco))
> 
> > - if (closure && !DECL_STATIC_FUNCTION_P (t))
> > + if (closure && DECL_IOBJ_MEMBER_FUNC_P (t) && !DECL_STATIC_FUNCTION_P (t))
> 
> 
> This shouldn't 

[PATCH] RISC-V: Update crypto vector ISA info with latest spec

2023-11-29 Thread Feng Wang
This patch adds the Zvkb subset of the crypto vector extension. The
corresponding test cases have also been modified.
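
As a rough illustration of what the implied-extension change amounts
to, here is a standalone sketch (not GCC code) that expands an
extension name through a table of implication pairs.  The pairs are
copied from the patched riscv_implied_info hunks below; the names
`implied` and `expand` are hypothetical, and the real table contains
further entries that this sketch ignores.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Subset of the implication pairs visible in the diff, after the patch.
static const std::vector<std::pair<std::string, std::string>> implied = {
  {"zvkn", "zvkned"}, {"zvkn", "zvknhb"}, {"zvkn", "zvkb"}, {"zvkn", "zvkt"},
  {"zvknc", "zvkn"},  {"zvknc", "zvbc"},
  {"zvks", "zvksed"}, {"zvks", "zvksh"},  {"zvks", "zvkb"}, {"zvks", "zvkt"},
  {"zvksc", "zvks"},  {"zvksc", "zvbc"},
};

// Hypothetical helper: transitive closure of the implications for EXT.
inline std::set<std::string> expand (const std::string &ext)
{
  std::set<std::string> result{ext};
  std::vector<std::string> work{ext};
  while (!work.empty ())
    {
      std::string cur = work.back ();
      work.pop_back ();
      for (const auto &p : implied)
        if (p.first == cur && result.insert (p.second).second)
          work.push_back (p.second);  // newly seen, expand it later
    }
  return result;
}

// With the patched table, zvknc pulls in zvkb but no longer zvbb.
inline void check_expand ()
{
  std::set<std::string> s = expand ("zvknc");
  assert (s.count ("zvkb") == 1);
  assert (s.count ("zvbb") == 0);
}
```

Under these assumptions, the updated testsuite expectations follow
directly: the feature macro for `Zvkb' is defined for zvkn/zvks and
their supersets, while `Zvbb' has to be requested explicitly.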

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add zvkb ISA info.
* config/riscv/riscv.opt: Add Mask(ZVKB)

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zvkn-1.c: Replace zvbb with zvkb.
* gcc.target/riscv/zvkn.c:  Ditto.
* gcc.target/riscv/zvknc-1.c: Ditto.
* gcc.target/riscv/zvknc-2.c: Ditto.
* gcc.target/riscv/zvknc.c: Ditto.
* gcc.target/riscv/zvkng-1.c: Ditto.
* gcc.target/riscv/zvkng-2.c: Ditto.
* gcc.target/riscv/zvkng.c: Ditto.
* gcc.target/riscv/zvks-1.c: Ditto.
* gcc.target/riscv/zvks.c: Ditto.
* gcc.target/riscv/zvksc-1.c: Ditto.
* gcc.target/riscv/zvksc-2.c: Ditto.
* gcc.target/riscv/zvksc.c: Ditto.
* gcc.target/riscv/zvksg-1.c: Ditto.
* gcc.target/riscv/zvksg-2.c: Ditto.
* gcc.target/riscv/zvksg.c: Ditto.
---
 gcc/common/config/riscv/riscv-common.cc  | 6 --
 gcc/config/riscv/riscv.opt   | 2 ++
 gcc/testsuite/gcc.target/riscv/zvkn-1.c  | 8 
 gcc/testsuite/gcc.target/riscv/zvkn.c| 4 ++--
 gcc/testsuite/gcc.target/riscv/zvknc-1.c | 8 
 gcc/testsuite/gcc.target/riscv/zvknc-2.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/zvknc.c   | 4 ++--
 gcc/testsuite/gcc.target/riscv/zvkng-1.c | 8 
 gcc/testsuite/gcc.target/riscv/zvkng-2.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/zvkng.c   | 4 ++--
 gcc/testsuite/gcc.target/riscv/zvks-1.c  | 8 
 gcc/testsuite/gcc.target/riscv/zvks.c| 4 ++--
 gcc/testsuite/gcc.target/riscv/zvksc-1.c | 8 
 gcc/testsuite/gcc.target/riscv/zvksc-2.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/zvksc.c   | 4 ++--
 gcc/testsuite/gcc.target/riscv/zvksg-1.c | 8 
 gcc/testsuite/gcc.target/riscv/zvksg-2.c | 4 ++--
 gcc/testsuite/gcc.target/riscv/zvksg.c   | 4 ++--
 18 files changed, 50 insertions(+), 46 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index ded85b4c578..6c210412515 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -106,7 +106,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
 
   {"zvkn", "zvkned"},
   {"zvkn", "zvknhb"},
-  {"zvkn", "zvbb"},
+  {"zvkn", "zvkb"},
   {"zvkn", "zvkt"},
   {"zvknc", "zvkn"},
   {"zvknc", "zvbc"},
@@ -114,7 +114,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"zvkng", "zvkg"},
   {"zvks", "zvksed"},
   {"zvks", "zvksh"},
-  {"zvks", "zvbb"},
+  {"zvks", "zvkb"},
   {"zvks", "zvkt"},
   {"zvksc", "zvks"},
   {"zvksc", "zvbc"},
@@ -253,6 +253,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
 
   {"zvbb", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvbc", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zvkb", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvkg", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvkned", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvknha", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1624,6 +1625,7 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
 
   {"zvbb", &gcc_options::x_riscv_zvb_subext, MASK_ZVBB},
   {"zvbc", &gcc_options::x_riscv_zvb_subext, MASK_ZVBC},
+  {"zvkb", &gcc_options::x_riscv_zvb_subext, MASK_ZVKB},
   {"zvkg", &gcc_options::x_riscv_zvk_subext, MASK_ZVKG},
   {"zvkned",   &gcc_options::x_riscv_zvk_subext, MASK_ZVKNED},
   {"zvknha",   &gcc_options::x_riscv_zvk_subext, MASK_ZVKNHA},
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 0c6517bdc8b..78186fff6c5 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -319,6 +319,8 @@ Mask(ZVBB) Var(riscv_zvb_subext)
 
 Mask(ZVBC) Var(riscv_zvb_subext)
 
+Mask(ZVKB) Var(riscv_zvb_subext)
+
 TargetVariable
 int riscv_zvk_subext
 
diff --git a/gcc/testsuite/gcc.target/riscv/zvkn-1.c 
b/gcc/testsuite/gcc.target/riscv/zvkn-1.c
index 23b255b4779..069a8f66c92 100644
--- a/gcc/testsuite/gcc.target/riscv/zvkn-1.c
+++ b/gcc/testsuite/gcc.target/riscv/zvkn-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gc_zvkned_zvknhb_zvbb_zvkt" { target { rv64 } } } */
-/* { dg-options "-march=rv32gc_zvkned_zvknhb_zvbb_zvkt" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc_zvkned_zvknhb_zvkb_zvkt" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zvkned_zvknhb_zvkb_zvkt" { target { rv32 } } } */
 
 #ifndef __riscv_zvkn
 #error "Feature macro for `Zvkn' not defined"
@@ -14,8 +14,8 @@
 #error "Feature macro for `Zvknhb' not defined"
 #endif
 
-#ifndef __riscv_zvbb
-#error "Feature macro for `Zvbb' not defined"
+#ifndef __riscv_zvkb
+#error "Feature macro for `Zvkb' not defined"
 #endif
 
 #ifndef __riscv_zvkt
diff --git a/gcc/testsuite/gcc.target/riscv/zvkn.c 
b/gcc/testsuite/gcc.target/riscv/zvkn.c
index 0047ebdede6..bcecbcc7e77 100644
--- a/gcc/testsuite/gcc.target/riscv/zvkn.c
+++ b/gcc/testsuite/gcc.target/riscv/zvkn.c
@@ -14,8 +14,8 @@
 #error "Feature macro for 

Re: [PATCH v4] Introduce strub: machine-independent stack scrubbing

2023-11-29 Thread Alexandre Oliva
On Nov 29, 2023, Richard Biener  wrote:

> On Wed, Nov 29, 2023 at 9:53 AM Alexandre Oliva  wrote:

>> Because _#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.

> 'arg_#(D)' looks like a SSA name, and no, taking the address doesn't work,
> so I assume it was [arg_(D)][n_#] which is indeed OK.  But you
> shouldn't need to change a pointer argument to be passed by reference,
> do you?  As said, you want to restrict by-reference passing to arguments
> that are !is_gimple_reg_type ().  Everywhere where a plain PARM_DECL
> was valid a *ptr indirection is as well.

> Can you check on the pass-by-reference thing again please?

Applying the following patchlet on top of refs/users/aoliva/heads/strub
(that has a -fstrub=all patchlet in it) I get a build error in libgo
building golang.org/x/mod/sumdb.o:

In function ‘golang_0org_1x_1mod_1sumdb.Client.checkTrees.strub.0’:
go1: error: invalid argument to gimple call
_195(D)->Hash
# VUSE <.MEM_55>
_16 = __builtin_memcmp (, _195(D)->Hash, 32);
during IPA pass: strub

golang.org/x/mod/sumdb.go.057i.remove_symbols:  _5 = __builtin_memcmp (, 
, 32);

within golang_0org_1x_1mod_1sumdb.Client.checkTrees becomes, in wrapped
version thereof:

golang.org/x/mod/sumdb.go.058i.strub:  _16 = __builtin_memcmp (, 
_195(D)->Hash, 32);

It's not even the case that Hash is at offset 0 into older's type, but I
suspect that what makes the former well-formed and the latter malformed
is not the offset, but the SSA_NAME.


diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
index 293bec132b8..515ab9a2ee5 100644
--- a/gcc/ipa-strub.cc
+++ b/gcc/ipa-strub.cc
@@ -1950,7 +1950,7 @@ walk_regimplify_addr_expr (tree *op, int *rec, void *arg)
   if (!*op || TREE_CODE (*op) != ADDR_EXPR)
 return NULL_TREE;
 
-  if (!is_gimple_val (*op))
+  if (0 && !is_gimple_val (*op))
 {
   tree ret = force_gimple_operand_gsi (, *op, true,
   NULL_TREE, true, GSI_SAME_STMT);

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


[pushed] c++: remove LAMBDA_EXPR_MUTABLE_P

2023-11-29 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

In review of the deducing 'this' patch it came up that LAMBDA_EXPR_MUTABLE_P
doesn't make sense for a lambda with an explicit object parameter.  And it
was never necessary, so let's remove it.

gcc/cp/ChangeLog:

* cp-tree.h (LAMBDA_EXPR_MUTABLE_P): Remove.
* cp-tree.def: Remove documentation.
* lambda.cc (build_lambda_expr): Remove reference.
* parser.cc (cp_parser_lambda_declarator_opt): Likewise.
* pt.cc (tsubst_lambda_expr): Likewise.
* ptree.cc (cxx_print_lambda_node): Likewise.
* semantics.cc (capture_decltype): Get the object quals
from the object instead.
---
 gcc/cp/cp-tree.h| 5 -
 gcc/cp/lambda.cc| 1 -
 gcc/cp/parser.cc| 1 -
 gcc/cp/pt.cc| 1 -
 gcc/cp/ptree.cc | 2 --
 gcc/cp/semantics.cc | 9 ++---
 gcc/cp/cp-tree.def  | 3 +--
 7 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 5614b71eed4..964af1ddd85 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -461,7 +461,6 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
   TYPENAME_IS_CLASS_P (in TYPENAME_TYPE)
   STMT_IS_FULL_EXPR_P (in _STMT)
   TARGET_EXPR_LIST_INIT_P (in TARGET_EXPR)
-  LAMBDA_EXPR_MUTABLE_P (in LAMBDA_EXPR)
   DECL_FINAL_P (in FUNCTION_DECL)
   QUALIFIED_NAME_IS_TEMPLATE (in SCOPE_REF)
   CONSTRUCTOR_IS_DEPENDENT (in CONSTRUCTOR)
@@ -1478,10 +1477,6 @@ enum cp_lambda_default_capture_mode_type {
 #define LAMBDA_EXPR_CAPTURES_THIS_P(NODE) \
   LAMBDA_EXPR_THIS_CAPTURE(NODE)
 
-/* Predicate tracking whether the lambda was declared 'mutable'.  */
-#define LAMBDA_EXPR_MUTABLE_P(NODE) \
-  TREE_LANG_FLAG_1 (LAMBDA_EXPR_CHECK (NODE))
-
 /* True iff uses of a const variable capture were optimized away.  */
 #define LAMBDA_EXPR_CAPTURE_OPTIMIZED(NODE) \
   TREE_LANG_FLAG_2 (LAMBDA_EXPR_CHECK (NODE))
diff --git a/gcc/cp/lambda.cc b/gcc/cp/lambda.cc
index 34d0190a89b..be8d240944d 100644
--- a/gcc/cp/lambda.cc
+++ b/gcc/cp/lambda.cc
@@ -44,7 +44,6 @@ build_lambda_expr (void)
   LAMBDA_EXPR_THIS_CAPTURE (lambda) = NULL_TREE;
   LAMBDA_EXPR_REGEN_INFO   (lambda) = NULL_TREE;
   LAMBDA_EXPR_PENDING_PROXIES  (lambda) = NULL;
-  LAMBDA_EXPR_MUTABLE_P(lambda) = false;
   return lambda;
 }
 
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 2464d1a0783..1826b6175f5 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -11770,7 +11770,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, 
tree lambda_expr)
 
   if (lambda_specs.storage_class == sc_mutable)
 {
-  LAMBDA_EXPR_MUTABLE_P (lambda_expr) = 1;
   quals = TYPE_UNQUALIFIED;
 }
   else if (lambda_specs.storage_class == sc_static)
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c18718b319d..00a808bf323 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -19341,7 +19341,6 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
 = LAMBDA_EXPR_LOCATION (t);
   LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (r)
 = LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (t);
-  LAMBDA_EXPR_MUTABLE_P (r) = LAMBDA_EXPR_MUTABLE_P (t);
   if (tree ti = LAMBDA_EXPR_REGEN_INFO (t))
 LAMBDA_EXPR_REGEN_INFO (r)
   = build_template_info (t, add_to_template_args (TI_ARGS (ti),
diff --git a/gcc/cp/ptree.cc b/gcc/cp/ptree.cc
index 32c5b5280dc..d1f58921fab 100644
--- a/gcc/cp/ptree.cc
+++ b/gcc/cp/ptree.cc
@@ -265,8 +265,6 @@ cxx_print_identifier (FILE *file, tree node, int indent)
 void
 cxx_print_lambda_node (FILE *file, tree node, int indent)
 {
-  if (LAMBDA_EXPR_MUTABLE_P (node))
-fprintf (file, " /mutable");
   fprintf (file, " default_capture_mode=[");
   switch (LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (node))
 {
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 04b0540599a..36b57ac9524 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12792,9 +12792,12 @@ capture_decltype (tree decl)
 
   if (!TYPE_REF_P (type))
 {
-  if (!LAMBDA_EXPR_MUTABLE_P (lam))
-   type = cp_build_qualified_type (type, (cp_type_quals (type)
-  |TYPE_QUAL_CONST));
+  int quals = cp_type_quals (type);
+  tree obtype = TREE_TYPE (DECL_ARGUMENTS (current_function_decl));
+  gcc_checking_assert (!WILDCARD_TYPE_P (non_reference (obtype)));
+  if (INDIRECT_TYPE_P (obtype))
+   quals |= cp_type_quals (TREE_TYPE (obtype));
+  type = cp_build_qualified_type (type, quals);
   type = build_reference_type (type);
 }
   return type;
diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
index bf3bcd1bf13..fe47b0a10e6 100644
--- a/gcc/cp/cp-tree.def
+++ b/gcc/cp/cp-tree.def
@@ -446,8 +446,7 @@ DEFTREECODE (TRAIT_TYPE, "trait_type", tcc_type, 0)
LAMBDA_EXPR_CAPTURE_LIST holds the capture-list, including `this'.
LAMBDA_EXPR_THIS_CAPTURE goes straight to the capture of `this', if it 
exists.
LAMBDA_EXPR_PENDING_PROXIES is a vector of 

Re: [PATCH v6 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-29 Thread Jason Merrill

On 11/27/23 00:35, waffl3x wrote:

I think this is cleaned up enough to be presentable. Bootstrapped but
not tested but I don't think I changed anything substantial. I am
running tests right now and will report if anything fails. I have not
fixed the problem in tsubst_lambda_expr that we talked about, that will
be first on my list tomorrow. While writing/cleaning up tests I had to
investigate some things, one of which is calling an overloaded
function, where one of the candidates is introduced by a using
declaration, is considered ambiguous. I haven't narrowed down the case
for this yet so I don't know if it's related to xobj member
functions/lambda with xobj parameters or not. I had to pull a few tests
because of it though.

I did not get as much done as I would have hoped today. This really
just serves as a small progress update. Once again, todo is the issue
you raised in tsubst_lambda_expr, and fixing handling of captures when
a const xobj parameter is deduced in a lambda call operator.



+#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
+#define DECL_XOBJ_MEMBER_FUNC_P(NODE)  \
+#define DECL_OBJECT_MEMBER_FUNC_P(NODE) \


Let's use the full word FUNCTION in these macros for consistency with 
DECL_STATIC_FUNCTION_P.



@@ -6544,7 +6544,7 @@ add_candidates (tree fns, tree first_arg, const vec<tree, va_gc> *args,
   tree fn_first_arg = NULL_TREE;
   const vec<tree, va_gc> *fn_args = args;
 
-  if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn))

+  if (DECL_OBJECT_MEMBER_FUNC_P (fn))
{
  /* Figure out where the object arg comes from.  If this
 function is a non-static member and we didn't get an


Hmm, that this function explicitly pulls out the object argument into 
first_arg strengthens your earlier argument that we shouldn't bother 
trying to handle null first_arg.  But let's not mess with that in this 
patch.



+  val = handle_arg(TREE_VALUE (parm),


Missing space.

  /* We know an iobj parameter must be a reference. If our xobj 
 parameter is a pointer, we know this is not a redeclaration.   
 This also catches array parameters, those are pointers too.  */

  if (TYPE_PTR_P (xobj_param))
continue;


Handling pointers specifically here seems unnecessary, they should be 
rare and will be handled by the next check for unrelated type.


 dealing with a by-value xobj parameter we can procede following


"proceed"


   /* An operator function must either be a non-static member function
  or have at least one parameter of a class, a reference to a class,
  an enumeration, or a reference to an enumeration.  13.4.0.6 */
-  if (! methodp || DECL_STATIC_FUNCTION_P (decl))
+  /* This can just be DECL_STATIC_FUNCTION_P (decl) I think?  */
+  if ((!methodp && !DECL_XOBJ_MEMBER_FUNC_P (decl))
+  || DECL_STATIC_FUNCTION_P (decl))


No, it also needs to be true for non-members; rather, I think it can 
just be if (!DECL_OBJECT_FUNCTION_P (decl))



+  if (xobj_param)
+{
+  quals = TYPE_UNQUALIFIED;
+  if (TYPE_REF_P (xobj_param)
+ && !(cp_type_quals (TREE_TYPE (xobj_param)) & TYPE_QUAL_CONST))
+LAMBDA_EXPR_MUTABLE_P (lambda_expr) = 1;
+}


I don't think we want to mess with MUTABLE_P here.  But then 
capture_decltype needs to be fixed to refer to the type of the object 
parameter rather than MUTABLE_P.


Actually, I think we can do away with the MUTABLE_P flag entirely.  I'm 
going to push a patch to do that.



-   if (!LAMBDA_EXPR_STATIC_P (lambda_expr))
+   if (!LAMBDA_EXPR_STATIC_P (lambda_expr)
+   && !DECL_XOBJ_MEMBER_FUNC_P (fco))


This could be if (DECL_IOBJ_MEMBER_FUNCTION_P (fco))


-  if (closure && !DECL_STATIC_FUNCTION_P (t))
+  if (closure && DECL_IOBJ_MEMBER_FUNC_P (t) && !DECL_STATIC_FUNCTION_P (t))


This shouldn't need to still check DECL_STATIC_FUNCTION_P.


+  /* We don't touch a lambda's func when it's just trying to create the
+ closure type.  */


We need to check it somewhere, currently this crashes:

template <class T> void f()
{
  int i;
  [=](this T&& self){ return i; }(); // error, unrelated
}
int main() { f<int>(); }


@@ -3691,18 +3691,7 @@ build_min_non_dep_op_overload (enum tree_code op,
   releasing_vec args;
   va_start (p, overload);
 
-  if (TREE_CODE (TREE_TYPE (overload)) == FUNCTION_TYPE)

-{
-  fn = overload;
-  if (op == ARRAY_REF)
-   obj = va_arg (p, tree);
-  for (int i = 0; i < nargs; i++)
-   {
- tree arg = va_arg (p, tree);
- vec_safe_push (args, arg);
-   }
-}


Maybe change the test to !DECL_OBJECT_MEMBER_FUNC_P to avoid reordering 
the cases?



@@ -15402,6 +15450,8 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain,
  gcc_checking_assert (TYPE_MAIN_VARIANT (TREE_TYPE (ve))
 

Re: [PATCH] testsuite: scev: expect fail on ilp32

2023-11-29 Thread Alexandre Oliva
On Nov 29, 2023, Hans-Peter Nilsson  wrote:

>> XPASS: gcc.dg/tree-ssa/scev-3.c scan-tree-dump-times ivopts "" 1
>> XPASS: gcc.dg/tree-ssa/scev-4.c scan-tree-dump-times ivopts "" 1
>> XPASS: gcc.dg/tree-ssa/scev-5.c scan-tree-dump-times ivopts "" 1

> It XPASSes on the ilp32 targets I've tried - except "ia32"
> (as in i686-elf) and h8300-elf.  Notably XPASSing targets
> includes a *default* configuration of arm-eabi, which in
> part contradicts your observation above.

My arm-eabi testing then targeted tms570 (big-endian cortex-r5).

I borrowed the ilp32 vs lp64 line from an internal patch by Eric that
we've had in gcc-11 and gcc-12, when I hit this fail while transitioning
the first and then the second of our 32-bit targets to gcc-13.

Eric, would you happen to recall where the notion that lp64 was a good
heuristic for these tests came from?
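
For anyone unsure what the two selectors distinguish: as I understand
the testsuite documentation, ilp32 means 32-bit int, long and pointers,
and lp64 means 32-bit int with 64-bit long and pointers.  A minimal
sketch of the same distinction in plain C++ (the helper names are made
up, they are not part of the testsuite):

```cpp
#include <cassert>
#include <climits>

// Hypothetical helpers mirroring the ilp32 / lp64 effective targets.
constexpr bool is_ilp32 ()
{
  return sizeof (int) * CHAR_BIT == 32
         && sizeof (long) * CHAR_BIT == 32
         && sizeof (void *) * CHAR_BIT == 32;
}

constexpr bool is_lp64 ()
{
  return sizeof (int) * CHAR_BIT == 32
         && sizeof (long) * CHAR_BIT == 64
         && sizeof (void *) * CHAR_BIT == 64;
}

// The two models are mutually exclusive, but a target may be neither
// (for example one with 32-bit long and 64-bit pointers).
static_assert (!(is_ilp32 () && is_lp64 ()), "models are exclusive");

inline void check_data_model ()
{
  assert (!(is_ilp32 () && is_lp64 ()));
}
```

Nothing in either definition says anything about addressing modes or
induction-variable costs, which is presumably why it is only a loose
proxy for whether ivopts performs the expected transformation.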

> Alex, can you share the presumably plural set of targets
> where you found gcc.dg/tree-ssa/scev-[3-5].c to fail before
> your patch, besides "ia32"?

I haven't even seen scev-4.c fail, I only got reports that it did.

I'm not even claiming it fails, I'm only claiming it has been observed
to fail on some ilp32 targets, and nobody seems to have a good sense of
when it's supposed to pass or fail, so my reasoning was that making it
an expected fail is less alarming than seeing actual failures on some
targets.  It was known to be imprecise, but to be an improvement over
getting a FAIL for some reasonably common targets when there was no
reason to expect it to actually pass, or even to have ever passed.

> So, ilp32 is IMO a really bad approximation for the elusive
> property.

Yeah.  Maybe we should just drop the ilp32, so that it's an unsurprising
fail on any targets?

> Would you please consider changing those "ilp32" to a
> specific set of targets where these tests failed?

I'd normally have aimed for that, but the challenge is that arm-eabi is
not uniform in the results for this test, and there doesn't seem to be
much support or knowledge to delineate on which target variants it's
meant to pass or not.  The test expects the transformation to take
place, as if it ought to, but there's no strong reason to expect that it
should.  There's nothing wrong if it doesn't.  Going about trying to
match the expectations to the current results may be useful, but
investigating the reasons why we get the current results for each target
is beyond my available resources for a set of tests that used to *seem*
to pass uniformly only because of a bug in the test pattern.

I don't see much value in these tests as they are, TBH.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH v4] Introduce strub: machine-independent stack scrubbing

2023-11-29 Thread Alexandre Oliva
On Nov 29, 2023, Richard Biener  wrote:

>> Because _#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.

> 'arg_#(D)' looks like a SSA name, and no, taking the address doesn't work,
> so I assume it was [arg_(D)][n_#] which is indeed OK.

Yeah.

> But you shouldn't need to change a pointer argument to be passed by
> reference, do you?

True, my attempt to simplify the example moved it past the breaking point.

IIRC the actual situations I hit involved computing address of members
of compound objects, such as struct members, even array elements
thereof.  They became problematic after replacing the object with a
dereference in gimple stmts.  The (effectively) offsetting operation is
well-formed gimple, but IIRC adding dereferencing to it made it
malformed gimple.  I don't immediately see why this should be the case,
since it's still offsetting, so perhaps I misremember.

> As said, you want to restrict by-reference passing to arguments
> that are !is_gimple_reg_type ()

*nod*, it was already there:

  if (!(0 /* DECL_BY_REFERENCE (narg) */
|| is_gimple_reg_type (TREE_TYPE (nparm))
...
{
  indirect_nparms.add (nparm);

>> Here are changes.html entries for this and for the other newly-added
>> features:

> LGTM.

Was that an ok to install, once the relevant pieces are in?

> Can you check on the pass-by-reference thing again please?

Sure.  I'll get back to you shortly.

If argument indirection becomes the only blocking issue, I'd be happy to
disable it, or even split out the patch that introduces it, so that the
bulk of the feature can go in while we sort out these details.
Disabling it is as simple as:

diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
index 293bec132b885..90770202fb851 100644
--- a/gcc/ipa-strub.cc
+++ b/gcc/ipa-strub.cc
@@ -2831,6 +2831,7 @@ pass_ipa_strub::execute (function *)
   parm = DECL_CHAIN (parm),
   nparm = DECL_CHAIN (nparm),
   nparmt = nparmt ? TREE_CHAIN (nparmt) : NULL_TREE)
+  if (true) ; else // ??? Disable parm indirection for now.
   if (!(0 /* DECL_BY_REFERENCE (narg) */
|| is_gimple_reg_type (TREE_TYPE (nparm))
|| VECTOR_TYPE_P (TREE_TYPE (nparm))


> Let's see if Honza or Martin have any comments on the IPA bits, I just
> mentioned what I think should be doable ...

I'm curious as to what you're hoping for.  I mean, I am using
create_version_clone_with_body, adding the new params and copying the
preexisting ones, and modifying some argument types for indirection
after cloning.  The problems I faced were as I tried to replace params
with their indirected versions.  According to my notes and my
recollection, that's where I hit most of the trouble.  But what would
this really buy us?  Do you envision a possibility of actually splitting
out the original function body, so that IPA takes care of the whole
wrapping?  AFAICT that would require a lot more infrastructure to deal
with new and modified parameters, though the details of what I learned
about this API back then, and that made it clear I wouldn't be able to
use it, seem to have faded away from my memory.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-29 Thread Joern Rennecke
I originally computed mmask in carry_backpropagate from XEXP (x, 0),
but abandoned that when I realized we also get called for RTX_OBJ
things.  I forgot to adjust the SIGN_EXTEND code, though.  Fixed
in the attached revised patch.  Also made sure to not make inputs
of LSHIFTRT / ASHIFTRT live if the output is dead (and commented
the checks for (mask == 0) in the process).

Something that could be done to further simplify the code is to make
carry_backpropagate do all the rtx_code-dependent propagation
decisions.  I.e. it would have cases for RTX_OBJ, AND, OR, IOR etc
that propagate the mask, and the default action would be to make
the input live (after the check not to make any bits in the input
live if the output is dead).

Then we wouldn't need safe_for_live_propagation any more.

Not sure if carry_backpropagate would then still be a suitable name
anymore, though.
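
To make the idea of backpropagating a live-bit mask concrete, here is a
simplified standalone sketch.  It is not the code from the patch: the
function names are invented, masks are plain 64-bit integers rather
than the pass's bit groups, and shift counts are assumed to be constant
and smaller than 64.

```cpp
#include <cassert>
#include <cstdint>

// For addition-like operations a carry can travel upward, so every
// input bit at or below the highest live output bit may matter.
inline std::uint64_t live_inputs_for_add (std::uint64_t out_mask)
{
  std::uint64_t m = out_mask;
  // Smear the highest set bit into all lower positions.
  m |= m >> 1;
  m |= m >> 2;
  m |= m >> 4;
  m |= m >> 8;
  m |= m >> 16;
  m |= m >> 32;
  return m;  // 0 stays 0: a dead output makes no input bits live
}

// out = in << n: output bit k comes from input bit k - n.
inline std::uint64_t live_inputs_for_shl (std::uint64_t out_mask, unsigned n)
{
  return out_mask >> n;
}

// out = in >> n (logical): output bit k comes from input bit k + n.
inline std::uint64_t live_inputs_for_lshr (std::uint64_t out_mask, unsigned n)
{
  return out_mask << n;
}

inline void check_backpropagation ()
{
  assert (live_inputs_for_add (0x100) == 0x1FF);
  assert (live_inputs_for_add (0) == 0);
  assert (live_inputs_for_shl (0xFF00, 8) == 0xFF);
  assert (live_inputs_for_lshr (0xFF, 8) == 0xFF00);
}
```

An arithmetic right shift would additionally need the sign bit of the
input whenever the requested mask reaches past the bits that the
logical shift covers; that case is left out here.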
* ext-dce.cc (carry_backpropagate): Always return 0 when output is dead.  
Fix SIGN_EXTEND input mask.

* ext-dce.cc: handle vector modes.

* ext-dce.cc: Amend comment to explain how liveness of vectors is tracked.
  (carry_backpropagate): Use GET_MODE_INNER.
  (ext_dce_process_sets): Likewise.  Only apply big endian correction for
  subregs if they don't have a vector mode.
  (ext_dce_process_uses): Likewise.

* ext-dce.cc: carry_backpropagate: [US]S_ASHIFT fix, handle [LA]SHIFTRT

* ext-dce.cc (safe_for_live_propagation): Add LSHIFTRT and ASHIFTRT.
  (carry_backpropagate): Reformat top comment.
  Add handling of LSHIFTRT and ASHIFTRT.
  Fix bit count for [SU]MUL_HIGHPART.
  Fix pasto for [SU]S_ASHIFT.

* ext-dce.c: Fixes for carry handling.

* ext-dce.c (safe_for_live_propagation): Handle MINUS.
  (ext_dce_process_uses): Break out carry handling into ..
  (carry_backpropagate): This new function.
  Better handling of ASHIFT.
  Add handling of SMUL_HIGHPART, UMUL_HIGHPART, SIGN_EXTEND, SS_ASHIFT and
  US_ASHIFT.

* ext-dce.c: fix SUBREG_BYTE test

As mentioned in
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637486.html
and
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638473.html

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 4e4c57de117..fd80052ad75 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -38,7 +38,10 @@ along with GCC; see the file COPYING3.  If not see
bit 0..7   (least significant byte)
bit 8..15  (second least significant byte)
bit 16..31
-   bit 32..BITS_PER_WORD-1  */
+   bit 32..BITS_PER_WORD-1
+
+   For vector modes, we apply these bit groups to every lane; if any of the
+   bits in the group are live in any lane, we consider this group live.  */
 
 /* Note this pass could be used to narrow memory loads too.  It's
not clear if that's profitable or not in general.  */
@@ -83,6 +86,7 @@ safe_for_live_propagation (rtx_code code)
 case SIGN_EXTEND:
 case TRUNCATE:
 case PLUS:
+case MINUS:
 case MULT:
 case SMUL_HIGHPART:
 case UMUL_HIGHPART:
@@ -96,6 +100,8 @@ safe_for_live_propagation (rtx_code code)
 case SS_ASHIFT:
 case US_ASHIFT:
 case ASHIFT:
+case LSHIFTRT:
+case ASHIFTRT:
   return true;
 
 /* There may be other safe codes.  If so they can be added
@@ -215,13 +221,22 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap 
livenow, bitmap live_tmp)
 
  /* Phase one of destination handling.  First remove any wrapper
 such as SUBREG or ZERO_EXTRACT.  */
- unsigned HOST_WIDE_INT mask = GET_MODE_MASK (GET_MODE (x));
+ unsigned HOST_WIDE_INT mask
+   = GET_MODE_MASK (GET_MODE_INNER (GET_MODE (x)));
  if (SUBREG_P (x)
  && !paradoxical_subreg_p (x)
  && SUBREG_BYTE (x).is_constant ())
{
- bit = subreg_lsb (x).to_constant ();
- mask = GET_MODE_MASK (GET_MODE (SUBREG_REG (x))) << bit;
+ enum machine_mode omode = GET_MODE_INNER (GET_MODE (x));
+ enum machine_mode imode = GET_MODE (SUBREG_REG (x));
+ bit = 0;
+ if (!VECTOR_MODE_P (GET_MODE (x))
+ || (GET_MODE_SIZE (imode).is_constant ()
+ && (GET_MODE_SIZE (omode).to_constant ()
+ > GET_MODE_SIZE (imode).to_constant ())))
+   bit = subreg_lsb (x).to_constant ();
+ mask = (GET_MODE_MASK (GET_MODE_INNER (GET_MODE (SUBREG_REG (x))))
+ << bit);
  gcc_assert (mask);
  if (!mask)
mask = -0x1ULL;
@@ -365,6 +380,85 @@ binop_implies_op2_fully_live (rtx_code code)
 }
 }
 
+/* X, with code CODE, is an operation for which safe_for_live_propagation
+   holds true, and bits set in MASK are live in the result.  Compute a
+   mask of (potentially) live bits in the non-constant inputs.  In case of
+   binop_implies_op2_fully_live (e.g. shifts), the computed mask may
+   exclusively 

[PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

2023-11-29 Thread juzhe.zh...@rivai.ai
Hi, Richard and Tamar.

Sorry to bother you.
I hope you don't mind if I add some comments:

Can we also support length-controlled partial vectors?

IMHO, we can do that as follows:

bool length_loop_p = LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);

if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
  {
if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
OPTIMIZE_FOR_SPEED))
  vect_record_loop_len (loop_vinfo, lens, ncopies, vectype, 1);
else
  vect_record_loop_mask (loop_vinfo, masks, ncopies, truth_type, NULL);
  }

if (length_loop_p)
  {
tree len = vect_get_loop_len (loop_vinfo, gsi, loop_lens, 1, vectype, 0, 0);
/* Use VCOND_MASK_LEN (all true, cond, all false, len, bias) to generate
   final mask = i < len + bias ? cond[i] : false.  */
cond = gimple_build (_gsi, IFN_VCOND_MASK_LEN, truth_type,
 all true mask, cond, all false mask, len, bias);
  }
else if (masked_loop_p)
  {
tree mask
  = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies, truth_type, 0);
cond
  = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond, _gsi);
  }

This is a prototype. Is this idea reasonable to Richi ?
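
For reference, the selection that the comment in the prototype describes (final mask = i < len + bias ? cond[i] : false) can be modelled with a scalar loop.  This is only a sketch of the intended semantics under the assumption that the five arguments are (mask, then-value, else-value, len, bias); the function name and the use of bool arrays are hypothetical and have nothing to do with how the vectorizer represents masks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar model of a length-limited masked select over NUNITS lanes:
   lanes at index >= LEN + BIAS, and lanes whose MASK element is false,
   take ELSE_VAL; all other lanes take THEN_VAL.  */
static void
vcond_mask_len_model (const bool *mask, const bool *then_val,
		      const bool *else_val, size_t nunits,
		      long len, long bias, bool *out)
{
  for (size_t i = 0; i < nunits; i++)
    out[i] = ((long) i < len + bias && mask[i]) ? then_val[i] : else_val[i];
}
```

With an all-true MASK, THEN_VAL set to the early-exit condition and an all-false ELSE_VAL, lanes beyond the active length can never trigger the exit.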

Thanks.



juzhe.zh...@rivai.ai


[PATCH v2] LoongArch: Add intrinsic function descriptions for LSX and LASX instructions to doc.

2023-11-29 Thread Lulu Cheng
From: chenxiaolong 

gcc/ChangeLog:

* doc/extend.texi: Add information about the intrinsic function of the
vector instruction.

Change-Id: I0117d6f5d68731f1596b6c3016fd82f3d5e2a98d
---
 gcc/doc/extend.texi | 1662 +++
 1 file changed, 1662 insertions(+)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1ae589aeb29..04748ea6d81 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -15268,6 +15268,8 @@ instructions, but allow the compiler to schedule those 
calls.
 * BPF Built-in Functions::
 * FR-V Built-in Functions::
 * LoongArch Base Built-in Functions::
+* LoongArch SX Vector Intrinsics::
+* LoongArch ASX Vector Intrinsics::
 * MIPS DSP Built-in Functions::
 * MIPS Paired-Single Support::
 * MIPS Loongson Built-in Functions::
@@ -17052,6 +17054,1666 @@ Returns the value that is currently set in the 
@samp{tp} register.
 void * __builtin_thread_pointer (void)
 @end smallexample
 
+@node LoongArch SX Vector Intrinsics
+@subsection LoongArch SX Vector Intrinsics
+
+GCC provides intrinsics to access the LSX (Loongson SIMD Extension) 
instructions.
+The interface is made available by including @code{<lsxintrin.h>} and using
+@option{-mlsx}.
+
+The following vectors typedefs are included in @code{lsxintrin.h}:
+
+@itemize
+@item @code{__m128i}, a 128-bit vector of fixed point;
+@item @code{__m128}, a 128-bit vector of single precision floating point;
+@item @code{__m128d}, a 128-bit vector of double precision floating point.
+@end itemize
+
+Instructions and corresponding built-ins may have additional restrictions 
and/or
+input/output values manipulated:
+@itemize
+@item @code{imm0_1}, an integer literal in range 0 to 1;
+@item @code{imm0_3}, an integer literal in range 0 to 3;
+@item @code{imm0_7}, an integer literal in range 0 to 7;
+@item @code{imm0_15}, an integer literal in range 0 to 15;
+@item @code{imm0_31}, an integer literal in range 0 to 31;
+@item @code{imm0_63}, an integer literal in range 0 to 63;
+@item @code{imm0_127}, an integer literal in range 0 to 127;
+@item @code{imm0_255}, an integer literal in range 0 to 255;
+@item @code{imm_n16_15}, an integer literal in range -16 to 15;
+@item @code{imm_n128_127}, an integer literal in range -128 to 127;
+@item @code{imm_n256_255}, an integer literal in range -256 to 255;
+@item @code{imm_n512_511}, an integer literal in range -512 to 511;
+@item @code{imm_n1024_1023}, an integer literal in range -1024 to 1023;
+@item @code{imm_n2048_2047}, an integer literal in range -2048 to 2047.
+@end itemize
+
+For convenience, GCC defines functions @code{__lsx_vrepli_@{b/h/w/d@}} and
+@code{__lsx_b[n]z_@{v/b/h/w/d@}}, which are implemented as follows:
+
+@smallexample
+a. @code{__lsx_vrepli_@{b/h/w/d@}}: Implemented the case where the highest
+   bit of @code{vldi} instruction @code{i13} is 1.
+
+   i13[12] == 1'b0
+   case i13[11:10] of :
+ 2'b00: __lsx_vrepli_b (imm_n512_511)
+ 2'b01: __lsx_vrepli_h (imm_n512_511)
+ 2'b10: __lsx_vrepli_w (imm_n512_511)
+ 2'b11: __lsx_vrepli_d (imm_n512_511)
+
+b. @code{__lsx_b[n]z_@{v/b/h/w/d@}}: Since the @code{vseteqz} class directive
+   cannot be used on its own, this function is defined.
+
+   _lsx_bz_v  => vseteqz.v + bcnez
+   _lsx_bnz_v => vsetnez.v + bcnez
+   _lsx_bz_b  => vsetanyeqz.b + bcnez
+   _lsx_bz_h  => vsetanyeqz.h + bcnez
+   _lsx_bz_w  => vsetanyeqz.w + bcnez
+   _lsx_bz_d  => vsetanyeqz.d + bcnez
+   _lsx_bnz_b => vsetallnez.b + bcnez
+   _lsx_bnz_h => vsetallnez.h + bcnez
+   _lsx_bnz_w => vsetallnez.w + bcnez
+   _lsx_bnz_d => vsetallnez.d + bcnez
+@end smallexample
+
+@smallexample
+eg:
+  #include <lsxintrin.h>
+
+  extern __m128i @var{a};
+
+  void
+  test (void)
+  @{
+if (__lsx_bz_v (@var{a}))
+  printf ("1\n");
+else
+  printf ("2\n");
+  @}
+@end smallexample
+
+@emph{Note:} For directives where the intent operand is also the source operand
+(modifying only part of the bitfield of the intent register), the first 
parameter
+in the builtin call function is used as the intent operand.
+
+@smallexample
+eg:
+  #include <lsxintrin.h>
+
+  extern __m128i @var{dst};
+  extern int @var{src};
+
+  void
+  test (void)
+  @{
+@var{dst} = __lsx_vinsgr2vr_b (@var{dst}, @var{src}, 3);
+  @}
+@end smallexample
+
+The intrinsics provided are listed below:
+@smallexample
+int __lsx_bnz_b (__m128i);
+int __lsx_bnz_d (__m128i);
+int __lsx_bnz_h (__m128i);
+int __lsx_bnz_v (__m128i);
+int __lsx_bnz_w (__m128i);
+int __lsx_bz_b (__m128i);
+int __lsx_bz_d (__m128i);
+int __lsx_bz_h (__m128i);
+int __lsx_bz_v (__m128i);
+int __lsx_bz_w (__m128i);
+__m128i __lsx_vabsd_b (__m128i, __m128i);
+__m128i __lsx_vabsd_bu (__m128i, __m128i);
+__m128i __lsx_vabsd_di (__m128i, __m128i);
+__m128i __lsx_vabsd_du (__m128i, __m128i);
+__m128i __lsx_vabsd_h (__m128i, __m128i);
+__m128i __lsx_vabsd_hu (__m128i, __m128i);
+__m128i __lsx_vabsd_w (__m128i, __m128i);
+__m128i __lsx_vabsd_wu (__m128i, __m128i);
+__m128i __lsx_vadda_b (__m128i, __m128i);

[committed (pre-approved)] RISC-V: Fix 'E' extension version to test

2023-11-29 Thread Tsukasa OI
From: Tsukasa OI 

Commit 006e90e1 ("RISC-V: Initial RV64E and LP64E support") caused a
regression (test failure) but this is caused by failing to track GCC
changes in that test case (not a true GCC bug).

This commit fixes the test case to track the latest GCC with 'E'
extension version 2.0 (ratified).
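
As a quick illustration of the encoding the test checks, the extension version macros pack major and minor numbers as major * 1000000 + minor * 1000.  The helper below is a hypothetical sketch (neither the macro nor the function exists in GCC or the testsuite); it only reproduces the arithmetic used in predef-13.c.

```c
/* Hypothetical helper mirroring the arithmetic in the test case.  */
#define EXT_VERSION(major, minor) ((major) * 1000 * 1000 + (minor) * 1000)

static int
ext_version (int major, int minor)
{
  return EXT_VERSION (major, minor);
}
```

So version 1.9 encodes as 1009000 and the ratified version 2.0 encodes as 2000000, which is the value the updated test now expects for __riscv_e.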

gcc/testsuite/ChangeLog:

* gcc.target/riscv/predef-13.c: Fix 'E' extension version to test.
---
 gcc/testsuite/gcc.target/riscv/predef-13.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/predef-13.c 
b/gcc/testsuite/gcc.target/riscv/predef-13.c
index 3836255c8553..93ebb337dbd5 100644
--- a/gcc/testsuite/gcc.target/riscv/predef-13.c
+++ b/gcc/testsuite/gcc.target/riscv/predef-13.c
@@ -19,7 +19,7 @@ int main () {
 #error "__riscv_c"
 #endif
 
-#if !defined(__riscv_e) || (__riscv_e != (1 * 1000 * 1000 + 9 * 1000))
+#if !defined(__riscv_e) || (__riscv_e != (2 * 1000 * 1000 + 0 * 1000))
 #error "__riscv_e"
 #endif
 

base-commit: 8614cbb253484e28c3eb20cde4d1067aad56de58
-- 
2.42.0



Re: [V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-29 Thread Joern Rennecke
On Wed, 29 Nov 2023 at 20:05, Joern Rennecke
 wrote:

> > I suspect it'd be more useful to add handling of LSHIFTRT and ASHIFTRT.
> > Some ports do a lot of static shifting.
>
> > +case SS_ASHIFT:
> > +case US_ASHIFT:
> > +  if (!mask || XEXP (x, 1) == const0_rtx)
> > +   return 0;
>
> P.S.: I just realized that this is a pasto: in the case of a const0_rtx
> shift count, returning 0 will usually be wrong.
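
To illustrate why a zero shift count must not yield an empty input mask, here is a standalone sketch for an unsigned saturating left shift.  It is my own model rather than the patch's code, and the function names and the explicit WIDTH parameter are hypothetical.  The assumption is that the result saturates to all-ones whenever any of the top COUNT input bits is set, so those bits influence every result bit.

```c
#include <stdint.h>

/* All-ones mask for a WIDTH-bit mode (1 <= WIDTH <= 64).  */
static uint64_t
mode_mask (unsigned width)
{
  return width >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << width) - 1;
}

/* Live input bits for an unsigned saturating left shift by a constant
   COUNT < WIDTH, given the live result bits in MASK.  */
static uint64_t
us_ashift_live_in (uint64_t mask, unsigned count, unsigned width)
{
  uint64_t mmask = mode_mask (width);

  mask &= mmask;
  if (mask == 0)
    return 0;	/* Dead output.  */

  /* The top COUNT input bits decide whether the result saturates.
     For COUNT == 0 this set is empty.  */
  uint64_t top = mmask & ~(mmask >> count);
  return top | (mask >> count);
}
```

For COUNT == 0 the function returns MASK unchanged, since the operation is then a plain copy; returning 0 there would wrongly mark the input as dead.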

I've attached my current patch version.
ext-dce.cc: handle vector modes.

* ext-dce.cc: Amend comment to explain how liveness of vectors is tracked.
  (carry_backpropagate): Use GET_MODE_INNER.
  (ext_dce_process_sets): Likewise.  Only apply big endian correction for
  subregs if they don't have a vector mode.
  (ext_dce_process_uses): Likewise.

* ext-dce.cc: carry_backpropagate: [US]S_ASHIFT fix, handle [LA]SHIFTRT

* ext-dce.cc (safe_for_live_propagation): Add LSHIFTRT and ASHIFTRT.
  (carry_backpropagate): Reformat top comment.
  Add handling of LSHIFTRT and ASHIFTRT.
  Fix bit count for [SU]MUL_HIGHPART.
  Fix pasto for [SU]S_ASHIFT.

* ext-dce.c: Fixes for carry handling.

* ext-dce.c (safe_for_live_propagation): Handle MINUS.
  (ext_dce_process_uses): Break out carry handling into ..
  (carry_backpropagate): This new function.
  Better handling of ASHIFT.
  Add handling of SMUL_HIGHPART, UMUL_HIGHPART, SIGN_EXTEND, SS_ASHIFT and
  US_ASHIFT.

* ext-dce.c: fix SUBREG_BYTE test

As mentioned in
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637486.html
and
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638473.html


diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 4e4c57de117..228c50e8b73 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -38,7 +38,10 @@ along with GCC; see the file COPYING3.  If not see
bit 0..7   (least significant byte)
bit 8..15  (second least significant byte)
bit 16..31
-   bit 32..BITS_PER_WORD-1  */
+   bit 32..BITS_PER_WORD-1
+
+   For vector modes, we apply these bit groups to every lane; if any of the
+   bits in the group are live in any lane, we consider this group live.  */
 
 /* Note this pass could be used to narrow memory loads too.  It's
not clear if that's profitable or not in general.  */
@@ -83,6 +86,7 @@ safe_for_live_propagation (rtx_code code)
 case SIGN_EXTEND:
 case TRUNCATE:
 case PLUS:
+case MINUS:
 case MULT:
 case SMUL_HIGHPART:
 case UMUL_HIGHPART:
@@ -96,6 +100,8 @@ safe_for_live_propagation (rtx_code code)
 case SS_ASHIFT:
 case US_ASHIFT:
 case ASHIFT:
+case LSHIFTRT:
+case ASHIFTRT:
   return true;
 
 /* There may be other safe codes.  If so they can be added
@@ -215,13 +221,22 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap 
livenow, bitmap live_tmp)
 
  /* Phase one of destination handling.  First remove any wrapper
 such as SUBREG or ZERO_EXTRACT.  */
- unsigned HOST_WIDE_INT mask = GET_MODE_MASK (GET_MODE (x));
+ unsigned HOST_WIDE_INT mask
+   = GET_MODE_MASK (GET_MODE_INNER (GET_MODE (x)));
  if (SUBREG_P (x)
  && !paradoxical_subreg_p (x)
  && SUBREG_BYTE (x).is_constant ())
{
- bit = subreg_lsb (x).to_constant ();
- mask = GET_MODE_MASK (GET_MODE (SUBREG_REG (x))) << bit;
+ enum machine_mode omode = GET_MODE_INNER (GET_MODE (x));
+ enum machine_mode imode = GET_MODE (SUBREG_REG (x));
+ bit = 0;
+ if (!VECTOR_MODE_P (GET_MODE (x))
+ || (GET_MODE_SIZE (imode).is_constant ()
+ && (GET_MODE_SIZE (omode).to_constant ()
+ > GET_MODE_SIZE (imode).to_constant ())))
+   bit = subreg_lsb (x).to_constant ();
+ mask = (GET_MODE_MASK (GET_MODE_INNER (GET_MODE (SUBREG_REG (x))))
+ << bit);
  gcc_assert (mask);
  if (!mask)
mask = -0x1ULL;
@@ -365,6 +380,84 @@ binop_implies_op2_fully_live (rtx_code code)
 }
 }
 
+/* X, with code CODE, is an operation for which safe_for_live_propagation
+   holds true, and bits set in MASK are live in the result.  Compute a
+   mask of (potentially) live bits in the non-constant inputs.  In case of
+   binop_implies_op2_fully_live (e.g. shifts), the computed mask may
+   exclusively pertain to the first operand.  */
+
+HOST_WIDE_INT
+carry_backpropagate (HOST_WIDE_INT mask, enum rtx_code code, rtx x)
+{
+  enum machine_mode mode = GET_MODE_INNER (GET_MODE (x));
+  HOST_WIDE_INT mmask = GET_MODE_MASK (mode);
+  switch (code)
+{
+case ASHIFT:
+  if (CONSTANT_P (XEXP (x, 1))
+ && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode)))
+   return mask >> INTVAL (XEXP (x, 1));
+  /* Fall through.  */
+case PLUS: case MINUS:
+case MULT:
+  

[Committed] RISC-V: Support highpart overlap for floating-point widen instructions

2023-11-29 Thread Juzhe-Zhong
This patch applies the same approach as the already-approved vwcvt/vext.vf2
change to the floating-point widening instructions.

Tested with no regressions and committed.

PR target/112431

gcc/ChangeLog:

* config/riscv/vector.md: Add widening overlap.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-10.c: New test.
* gcc.target/riscv/rvv/base/pr112431-11.c: New test.
* gcc.target/riscv/rvv/base/pr112431-12.c: New test.
* gcc.target/riscv/rvv/base/pr112431-13.c: New test.
* gcc.target/riscv/rvv/base/pr112431-14.c: New test.
* gcc.target/riscv/rvv/base/pr112431-15.c: New test.
* gcc.target/riscv/rvv/base/pr112431-7.c: New test.
* gcc.target/riscv/rvv/base/pr112431-8.c: New test.
* gcc.target/riscv/rvv/base/pr112431-9.c: New test.

---
 gcc/config/riscv/vector.md|  78 
 .../gcc.target/riscv/rvv/base/pr112431-10.c   | 104 ++
 .../gcc.target/riscv/rvv/base/pr112431-11.c   |  68 +++
 .../gcc.target/riscv/rvv/base/pr112431-12.c   |  51 +
 .../gcc.target/riscv/rvv/base/pr112431-13.c   | 188 ++
 .../gcc.target/riscv/rvv/base/pr112431-14.c   | 119 +++
 .../gcc.target/riscv/rvv/base/pr112431-15.c   |  86 
 .../gcc.target/riscv/rvv/base/pr112431-7.c| 106 ++
 .../gcc.target/riscv/rvv/base/pr112431-8.c|  68 +++
 .../gcc.target/riscv/rvv/base/pr112431-9.c|  51 +
 10 files changed, 882 insertions(+), 37 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-13.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-14.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-9.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 74716c73e98..6b891c11324 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7622,84 +7622,88 @@
 ;; 
---
 
 (define_insn "@pred_widen_fcvt_x_f"
-  [(set (match_operand:VWCONVERTI 0 "register_operand" "=,  ")
+  [(set (match_operand:VWCONVERTI 0 "register_operand"  "=vr,   vr,   
vr,   vr,  vr,vr, ?, ?")
(if_then_else:VWCONVERTI
  (unspec:
-   [(match_operand: 1 "vector_mask_operand"  "vmWc1,vmWc1")
-(match_operand 4 "vector_length_operand" "   rK,   rK")
-(match_operand 5 "const_int_operand" "i,i")
-(match_operand 6 "const_int_operand" "i,i")
-(match_operand 7 "const_int_operand" "i,i")
-(match_operand 8 "const_int_operand" "i,i")
+   [(match_operand: 1 "vector_mask_operand"  
"vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
+(match_operand 4 "vector_length_operand" "   rK,   rK,   
rK,   rK,   rK,   rK,   rK,   rK")
+(match_operand 5 "const_int_operand" "i,i,
i,i,i,i,i,i")
+(match_operand 6 "const_int_operand" "i,i,
i,i,i,i,i,i")
+(match_operand 7 "const_int_operand" "i,i,
i,i,i,i,i,i")
+(match_operand 8 "const_int_operand" "i,i,
i,i,i,i,i,i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)
 (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
  (unspec:VWCONVERTI
-[(match_operand: 3 "register_operand" "   vr,   vr")] 
VFCVTS)
- (match_operand:VWCONVERTI 2 "vector_merge_operand"  "   vu,0")))]
+[(match_operand: 3 "register_operand" "  W21,  W21,  
W42,  W42,  W84,  W84,   vr,   vr")] VFCVTS)
+ (match_operand:VWCONVERTI 2 "vector_merge_operand"  "   vu,0,   
vu,0,   vu,0,   vu,0")))]
   "TARGET_VECTOR"
   "vfwcvt.x.f.v\t%0,%3%p1"
   [(set_attr "type" "vfwcvtftoi")
(set_attr "mode" "")
(set (attr "frm_mode")
-   (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))])
+   (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))
+   (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")])
 
 (define_insn "@pred_widen_"
-  [(set (match_operand:VWCONVERTI 0 "register_operand""=,  ")
+  [(set (match_operand:VWCONVERTI 0 "register_operand" "=vr,   vr,   
vr,   vr,  vr,vr, ?, ?")
(if_then_else:VWCONVERTI
  

Re: [PATCH v1 1/1] RISC-V: Initial RV64E and LP64E support

2023-11-29 Thread Kito Cheng
Pre-approve the fix :)

On Thu, Nov 30, 2023 at 6:07 AM Tsukasa OI  wrote:
>
> Hi Patrick,
>
> Found a cause (although GCC is functionally correct, I forgot to fix
> corresponding test case [which assumes that 'E' is not ratified]).
>
> > #if !defined(__riscv_e) || (__riscv_e != (1 * 1000 * 1000 + 9 * 1000))
> > #error "__riscv_e"
> > #endif
>
> 1*1000*1000 + 9*1000 ('E' version 1.9) should have been fixed to
> 2*1000*1000 + 0*1000 because 'E' extension is now ratified version 2.0.
>
> I'll submit a fix later.
>
> Thanks,
> Tsukasa
>
>
> On 2023/11/30 6:15, Patrick O'Neill wrote:
> > Hi Tsukasa,
> >
> > I'm seeing a new regression across all tested riscv targets:
> > https://github.com/patrick-rivos/gcc-postcommit-ci/issues/224
> >
> > Regression:
> >
> > FAIL: gcc.target/riscv/predef-13.c -O0 (test for excess errors)
> > FAIL: gcc.target/riscv/predef-13.c -O1 (test for excess errors)
> > FAIL: gcc.target/riscv/predef-13.c -O2 (test for excess errors)
> > FAIL: gcc.target/riscv/predef-13.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
> > FAIL: gcc.target/riscv/predef-13.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors)
> > FAIL: gcc.target/riscv/predef-13.c -O3 -g (test for excess errors)
> > FAIL: gcc.target/riscv/predef-13.c -Os (test for excess errors)
> >
> > Debug log:
> >
> > Executing on host: 
> > /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
> >  
> > -B/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
> >   
> > /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
> >   -march=rv32gc -mabi=ilp32d -mcmodel=medlow   -fdiagnostics-plain-output   
> >  -O0  -march=rv32e -mabi=ilp32e -mcmodel=medlow -misa-spec=2.2 -S   -o 
> > predef-13.s(timeout = 600)
> > spawn -ignore SIGHUP 
> > /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
> >  
> > -B/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
> >  
> > /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
> >  -march=rv32gc -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output -O0 
> > -march=rv32e -mabi=ilp32e -mcmodel=medlow -misa-spec=2.2 -S -o predef-13.s
> > /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:
> >  In function 'main':
> > /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2:
> >  error: #error "__riscv_e"
> > compiler exited with status 1
> > FAIL: gcc.target/riscv/predef-13.c   -O0  (test for excess errors)
> > Excess errors:
> > /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2:
> >  error: #error "__riscv_e"
> >
> > I bisected it locally to commit 006e90e13441c3716b40616282b200a0ef689376
> > (this patch):
> >
> >> ./bin/riscv64-unknown-linux-gnu-gcc -march=rv32e -mabi=ilp32e -S 
> >> ../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
> > ../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c: In function 'main':
> > ../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2: error: #error 
> > "__riscv_e"
> >23 | #error "__riscv_e"
> >   |  ^
> >
> > Let me know if you need any additional info/investigation from me.
> >
> > Thanks,
> > Patrick
> >
> > On 11/24/23 02:18, Tsukasa OI wrote:
> >> From: Tsukasa OI 
> >>
> >> Along with RV32E, RV64E is ratified.  Though ILP32E and LP64E ABIs are
> >> still draft, it's worth supporting it.
> >>
> >> gcc/ChangeLog:
> >>
> >>  * common/config/riscv/riscv-common.cc
> >>  (riscv_ext_version_table): Set version to ratified 2.0.
> >>  (riscv_subset_list::parse_std_ext): Allow RV64E.
> >>  * config.gcc: Parse base ISA 'rv64e' and ABI 'lp64e'.
> >>  * config/riscv/arch-canonicalize: Parse base ISA 'rv64e'.
> >>  * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
> >>  Define different macro per XLEN.  Add handling for ABI_LP64E.
> >>  * config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
> >>  Add handling for ABI_LP64E.
> >>  * config/riscv/riscv-opts.h (enum riscv_abi_type): Add ABI_LP64E.
> >>  * config/riscv/riscv.cc (riscv_option_override): Enhance error
> >>  handling to support RV64E and LP64E.
> >>  (riscv_conditional_register_usage): Change "RV32E" in a comment
> >>  to "RV32E/RV64E".
> >>  * config/riscv/riscv.h
> >>  (UNITS_PER_FP_ARG): Add handling for ABI_LP64E.
> >>  (STACK_BOUNDARY): Ditto.
> >>  (ABI_STACK_BOUNDARY): Ditto.
> >>  (MAX_ARGS_IN_REGISTERS): Ditto.
> >>  (ABI_SPEC): Add support for "lp64e".
> >>  * 

ping: [PATCH] diagnostics: Fix behavior of permerror options after diagnostic pop [PR111918]

2023-11-29 Thread Lewis Hyatt
On Thu, Nov 09, 2023 at 04:16:10PM -0500, Lewis Hyatt wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111918
> 
> This patch fixes the behavior of `#pragma GCC diagnostic pop' for permissive
> error diagnostics such as -Wnarrowing (in C++11). Those currently do not
> return to the correct state after the last pop; they become effectively
> simple warnings instead. Bootstrap + regtest all languages on x86-64, does
> it look OK please? Thanks!

Hello-

May I please ping this bug fix?
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635871.html

Please note, it requires a trivial rebase on top of recent changes to
the class diagnostic_context public interface. I attached the rebased patch
here as well. Thanks!

-Lewis
When a diagnostic pragma changes the classification of a given diagnostic,
the global options flags (such as warn_narrowing, etc.) may get changed too.
Specifically, if a warning was not enabled initially and was later enabled
by a pragma, then the corresponding global flag will change from false to
true when the pragma is processed. That change is permanent and is not
undone by a subsequent `#pragma GCC diagnostic pop'; the warning flag needs
to remain enabled since a diagnostic could be generated later on for a
source location prior to the pop.

So in order to support popping to the initial classification, given that the
global options flags no longer reflect that state, the diagnostic_context
object itself remembers the way things were before it changed anything. The
current implementation works fine for diagnostics that are always errors or
always warnings, but it doesn't do the right thing for diagnostics that
could be either, such as -Wnarrowing. The classification of that diagnostic
(or any permerror diagnostic) depends on the state of -fpermissive; for the
particular case of -Wnarrowing it also matters whether a compile-time or
run-time narrowing is being diagnosed.

The problem is that the current implementation insists on recording whether
an enabled diagnostic should be a DK_WARNING or a DK_ERROR, and then, after
popping to the initial state, it overrides it always to that type only. Fix
that up by adding a new internal diagnostic type DK_ANY. This just indicates
that the diagnostic is enabled without mandating exactly what type of
diagnostic it should be. Then the diagnostic can be emitted with whatever
type the frontend asks for.

Incidentally, while making this change, I noticed that classify_diagnostic()
spends some time computing a return value (the old classification kind) that
is not used anywhere. The computed value seems to have some problems, mainly
that it does not take into account `#pragma GCC diagnostic pop' at all, and
so the returned value doesn't seem like it could make sense in many
contexts. Given it would also not be desirable to leak the new internal-only
DK_ANY type to outside callers, I think it would make sense in a subsequent
cleanup patch to remove the return value altogether.
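
The idea can be sketched outside of GCC with a toy model.  Everything below is hypothetical (the enum, its values and the function are not the diagnostic_context interface); it only shows how a saved "enabled, any kind" state differs from a saved DK_WARNING or DK_ERROR when the frontend later asks for a specific kind.

```c
/* Toy stand-ins for the diagnostic kinds discussed above.  */
enum kind { K_UNSPECIFIED, K_IGNORED, K_WARNING, K_ERROR, K_ANY };

/* SAVED is the classification remembered for the state we popped back
   to; REQUESTED is the kind the frontend asks for.  Return the kind
   that should actually be emitted.  */
static enum kind
effective_kind (enum kind saved, enum kind requested)
{
  switch (saved)
    {
    case K_UNSPECIFIED:
    case K_ANY:
      /* Enabled without mandating a type: keep the frontend's choice.  */
      return requested;
    case K_IGNORED:
      return K_IGNORED;
    default:
      /* An explicit classification overrides the requested kind.  */
      return saved;
    }
}
```

In this model a permerror requested as K_ERROR stays an error when the saved state is K_ANY, whereas a saved K_WARNING would turn it into a warning, which corresponds to the behavior described in the bug report.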

gcc/ChangeLog:

PR c++/111918
* diagnostic-core.h (enum diagnostic_t): Add DK_ANY special flag.
* diagnostic.cc (diagnostic_option_classifier::classify_diagnostic):
Make use of DK_ANY to indicate a diagnostic was initially enabled.
(diagnostic_context::diagnostic_enabled): Do not change the type of
a diagnostic if the saved classification is type DK_ANY.

gcc/testsuite/ChangeLog:

PR c++/111918
* g++.dg/cpp0x/Wnarrowing21a.C: New test.
* g++.dg/cpp0x/Wnarrowing21b.C: New test.
* g++.dg/cpp0x/Wnarrowing21c.C: New test.
* g++.dg/cpp0x/Wnarrowing21d.C: New test.

diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 04eba3d140e..4926c48da96 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -33,7 +33,10 @@ typedef enum
   DK_LAST_DIAGNOSTIC_KIND,
   /* This is used for tagging pragma pops in the diagnostic
  classification history chain.  */
-  DK_POP
+  DK_POP,
+  /* This is used internally to note that a diagnostic is enabled
+ without mandating any specific type.  */
+  DK_ANY,
 } diagnostic_t;
 
 /* RAII-style class for grouping related diagnostics.  */
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 4f66fa6acaa..fd40018a734 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -1136,8 +1136,7 @@ classify_diagnostic (const diagnostic_context *context,
   if (old_kind == DK_UNSPECIFIED)
{
  old_kind = !context->option_enabled_p (option_index)
-   ? DK_IGNORED : (context->warning_as_error_requested_p ()
-   ? DK_ERROR : DK_WARNING);
+   ? DK_IGNORED : DK_ANY;
  m_classify_diagnostic[option_index] = old_kind;
}
 
@@ -1472,7 +1471,15 @@ diagnostic_context::diagnostic_enabled (diagnostic_info 
*diagnostic)
  option.  */
   if (diag_class == DK_UNSPECIFIED
   && !option_unspecified_p (diagnostic->option_index))
-diagnostic->kind = 

Re: [PATCH v2] aarch64: Add support for Ampere-1B (-mcpu=ampere1b) CPU

2023-11-29 Thread Philipp Tomsich
Applied to master, thanks!
Philipp.

On Tue, 28 Nov 2023 at 12:57, Richard Sandiford
 wrote:
>
> Philipp Tomsich  writes:
> > On Tue, 28 Nov 2023 at 12:21, Richard Sandiford
> >  wrote:
> >>
> >> Philipp Tomsich  writes:
> >> > This patch adds initial support for Ampere-1B core.
> >> >
> >> > The Ampere-1B core implements ARMv8.7 with the following (compiler
> >> > visible) extensions:
> >> >  - CSSC (Common Short Sequence Compression instructions),
> >> >  - MTE (Memory Tagging Extension)
> >> >  - SM3/SM4
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> >   * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add ampere-1b
> >> >   * config/aarch64/aarch64-cost-tables.h: Add ampere1b_extra_costs
> >> >   * config/aarch64/aarch64-tune.md: Regenerate
> >> >   * config/aarch64/aarch64.cc: Include ampere1b tuning model
> >> >   * doc/invoke.texi: Document -mcpu=ampere1b
> >> >   * config/aarch64/tuning_models/ampere1b.h: New file.
> >>
> >> OK, thanks, but:
> >>
> >> >
> >> > Signed-off-by: Philipp Tomsich 
> >> > ---
> >> >
> >> > Changes in v2:
> >> > - moved ampere1b model to a separated file
> >> > - regenerated aarch64-tune.md after rebase
> >> >
> >> >  gcc/config/aarch64/aarch64-cores.def|   1 +
> >> >  gcc/config/aarch64/aarch64-cost-tables.h| 107 ++
> >> >  gcc/config/aarch64/aarch64-tune.md  |   2 +-
> >> >  gcc/config/aarch64/aarch64.cc   |   1 +
> >> >  gcc/config/aarch64/tuning_models/ampere1b.h | 114 
> >> >  gcc/doc/invoke.texi |   2 +-
> >> >  6 files changed, 225 insertions(+), 2 deletions(-)
> >> >  create mode 100644 gcc/config/aarch64/tuning_models/ampere1b.h
> >> >
> >> > diff --git a/gcc/config/aarch64/aarch64-cores.def 
> >> > b/gcc/config/aarch64/aarch64-cores.def
> >> > index 16752b77f4b..ad896a80f1f 100644
> >> > --- a/gcc/config/aarch64/aarch64-cores.def
> >> > +++ b/gcc/config/aarch64/aarch64-cores.def
> >> > @@ -74,6 +74,7 @@ AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx, 
> >> >  V8A,  (CRC, CRYPTO), thu
> >> >  /* Ampere Computing ('\xC0') cores. */
> >> >  AARCH64_CORE("ampere1", ampere1, cortexa57, V8_6A, (F16, RNG, AES, 
> >> > SHA3), ampere1, 0xC0, 0xac3, -1)
> >> >  AARCH64_CORE("ampere1a", ampere1a, cortexa57, V8_6A, (F16, RNG, AES, 
> >> > SHA3, SM4, MEMTAG), ampere1a, 0xC0, 0xac4, -1)
> >> > +AARCH64_CORE("ampere1b", ampere1b, cortexa57, V8_7A, (F16, RNG, AES, 
> >> > SHA3, SM4, MEMTAG, CSSC), ampere1b, 0xC0, 0xac5, -1)
> >> >  /* Do not swap around "emag" and "xgene1",
> >> > this order is required to handle variant correctly. */
> >> >  AARCH64_CORE("emag",emag,  xgene1,V8A,  (CRC, CRYPTO), 
> >> > emag, 0x50, 0x000, 3)
> >> > diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
> >> > b/gcc/config/aarch64/aarch64-cost-tables.h
> >> > index 0cb638f3a13..4c8da7f119b 100644
> >> > --- a/gcc/config/aarch64/aarch64-cost-tables.h
> >> > +++ b/gcc/config/aarch64/aarch64-cost-tables.h
> >> > @@ -882,4 +882,111 @@ const struct cpu_cost_table ampere1a_extra_costs =
> >> >}
> >> >  };
> >> >
> >> > +const struct cpu_cost_table ampere1b_extra_costs =
> >> > +{
> >> > +  /* ALU */
> >> > +  {
> >> > +0, /* arith.  */
> >> > +0, /* logical.  */
> >> > +0, /* shift.  */
> >> > +COSTS_N_INSNS (1), /* shift_reg.  */
> >> > +0, /* arith_shift.  */
> >> > +COSTS_N_INSNS (1), /* arith_shift_reg.  */
> >> > +0, /* log_shift.  */
> >> > +COSTS_N_INSNS (1), /* log_shift_reg.  */
> >> > +0, /* extend.  */
> >> > +COSTS_N_INSNS (1), /* extend_arith.  */
> >> > +0, /* bfi.  */
> >> > +0, /* bfx.  */
> >> > +0, /* clz.  */
> >> > +0, /* rev.  */
> >> > +0, /* non_exec.  */
> >> > +true   /* non_exec_costs_exec.  */
> >> > +  },
> >> > +  {
> >> > +/* MULT SImode */
> >> > +{
> >> > +  COSTS_N_INSNS (2),   /* simple.  */
> >> > +  COSTS_N_INSNS (2),   /* flag_setting.  */
> >> > +  COSTS_N_INSNS (2),   /* extend.  */
> >> > +  COSTS_N_INSNS (3),   /* add.  */
> >> > +  COSTS_N_INSNS (3),   /* extend_add.  */
> >> > +  COSTS_N_INSNS (12)   /* idiv.  */
> >> > +},
> >> > +/* MULT DImode */
> >> > +{
> >> > +  COSTS_N_INSNS (2),   /* simple.  */
> >> > +  0,   /* flag_setting (N/A).  */
> >> > +  COSTS_N_INSNS (2),   /* extend.  */
> >> > +  COSTS_N_INSNS (3),   /* add.  */
> >> > +  COSTS_N_INSNS (3),   /* extend_add.  */
> >> > +  COSTS_N_INSNS (18)   /* idiv.  */
> >> > +}
> >> > +  },
> >> > +  /* LD/ST */
> >> > +  {
> >> > +COSTS_N_INSNS (2), /* load.  */
> >> > +COSTS_N_INSNS (2), /* load_sign_extend.  */
> >> > +0, /* 

Re: [PATCH] libatomic: Enable lock-free 128-bit atomics on AArch64 [PR110061]

2023-11-29 Thread Richard Sandiford
Not my specialist subject, but here goes anyway:

Wilco Dijkstra  writes:
> ping
>
> From: Wilco Dijkstra
> Sent: 02 June 2023 18:28
> To: GCC Patches 
> Cc: Richard Sandiford ; Kyrylo Tkachov 
> 
> Subject: [PATCH] libatomic: Enable lock-free 128-bit atomics on AArch64 
> [PR110061]
>
>
> Enable lock-free 128-bit atomics on AArch64.  This is backwards compatible 
> with
> existing binaries, gives better performance than locking atomics and is what
> most users expect.

Please add a justification for why it's backwards compatible, rather
than just stating that it's so.

> Note 128-bit atomic loads use a load/store exclusive loop if LSE2 is not 
> supported.
> This results in an implicit store which is invisible to software as long as 
> the given
> address is writeable (which will be true when using atomics in actual code).

Thanks for adding this.  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95722
suggests that it's still an open question whether this is a correct thing
to do, but it sounds from Joseph's comment that he isn't sure whether
atomic loads from read-only data are valid.

Linus's comment in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70490
suggests that a reasonable compromise might be to use a storing
implementation but not advertise that it is lock-free.  Also,
the comment above libat_is_lock_free says:

/* Note that this can return that a size/alignment is not lock-free even if
   all the operations that we use to implement the respective accesses provide
   lock-free forward progress as specified in C++14:  Users likely expect
   "lock-free" to also mean "fast", which is why we do not return true if, for
   example, we implement loads with this size/alignment using a CAS.  */

We don't use a CAS for the fallbacks, but like you say, we do use a
load/store exclusive loop.  So did you consider not doing this:

> +/* State we have lock-free 128-bit atomics.  */
> +#undef FAST_ATOMIC_LDST_16
> +#define FAST_ATOMIC_LDST_16 1

?
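
To make the "storing implementation" point concrete, here is a minimal sketch, not the libatomic code, of an atomic load built from a compare-and-swap. It uses a 64-bit value so that it does not depend on libatomic, and `load_via_cas` is a hypothetical name chosen for illustration. Like the LDXP/STXP loop in the patch, the "load" may write to the location, so it only works on writable memory.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only: an atomic load implemented
// with a compare-and-swap instead of a plain load instruction.
inline std::uint64_t load_via_cas(std::atomic<std::uint64_t> &obj)
{
  std::uint64_t expected = 0;
  // If obj holds 0, the CAS succeeds and stores 0 back (no visible change).
  // Otherwise it fails and copies the current value into `expected`.
  // Either way `expected` ends up holding the value of obj.
  obj.compare_exchange_strong(expected, expected);
  return expected;
}
```

This is the kind of implementation the libat_is_lock_free comment has in mind: it makes forward progress without a lock, but it is a read-modify-write rather than a plain load.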

Otherwise it looks reasonable to me, for whatever that's worth, but:

> A simple test on an old Cortex-A72 showed 2.7x speedup of 128-bit atomics.
>
> Passes regress, OK for commit?
>
> libatomic/
> PR target/110061
> config/linux/aarch64/atomic_16.S: Implement lock-free ARMv8.0 atomics.
> config/linux/aarch64/host-config.h: Use atomic_16.S for baseline v8.0.
> State we have lock-free atomics.
>
> ---
>
> diff --git a/libatomic/config/linux/aarch64/atomic_16.S 
> b/libatomic/config/linux/aarch64/atomic_16.S
> index 
> 05439ce394b9653c9bcb582761ff7aaa7c8f9643..0485c284117edf54f41959d2fab9341a9567b1cf
>  100644
> --- a/libatomic/config/linux/aarch64/atomic_16.S
> +++ b/libatomic/config/linux/aarch64/atomic_16.S
> @@ -22,6 +22,21 @@
> .  */
>
>
> +/* AArch64 128-bit lock-free atomic implementation.
> +
> +   128-bit atomics are now lock-free for all AArch64 architecture versions.
> +   This is backwards compatible with existing binaries and gives better
> +   performance than locking atomics.
> +
> +   128-bit atomic loads use a exclusive loop if LSE2 is not supported.
> +   This results in an implicit store which is invisible to software as long
> +   as the given address is writeable.  Since all other atomics have explicit
> +   writes, this will be true when using atomics in actual code.
> +
> +   The libat__16 entry points are ARMv8.0.
> +   The libat__16_i1 entry points are used when LSE2 is available.  */
> +
> +
>  .arch   armv8-a+lse
>
>  #define ENTRY(name) \
> @@ -37,6 +52,10 @@ name:\
>  .cfi_endproc;   \
>  .size name, .-name;
>
> +#define ALIAS(alias,name)  \
> +   .global alias;  \
> +   .set alias, name;
> +
>  #define res0 x0
>  #define res1 x1
>  #define in0  x2
> @@ -70,6 +89,24 @@ name:\
>  #define SEQ_CST 5
>
>
> +ENTRY (libat_load_16)
> +   mov x5, x0
> +   cbnz w1, 2f
> +
> +   /* RELAXED.  */
> +1: ldxp res0, res1, [x5]
> +   stxp w4, res0, res1, [x5]
> +   cbnz w4, 1b
> +   ret
> +
> +   /* ACQUIRE/CONSUME/SEQ_CST.  */
> +2: ldaxp   res0, res1, [x5]
> +   stxp w4, res0, res1, [x5]
> +   cbnz w4, 2b
> +   ret
> +END (libat_load_16)
> +
> +
>  ENTRY (libat_load_16_i1)
>  cbnz w1, 1f
>
> @@ -93,6 +130,23 @@ ENTRY (libat_load_16_i1)
>  END (libat_load_16_i1)
>
>
> +ENTRY (libat_store_16)
> +   cbnz w4, 2f
> +
> +   /* RELAXED.  */
> +1: ldxp xzr, tmp0, [x0]
> +   stxp w4, in0, in1, [x0]
> +   cbnz w4, 1b
> +   ret
> +
> +   /* RELEASE/SEQ_CST.  */
> +2: ldxp xzr, tmp0, [x0]
> +   stlxp   w4, in0, in1, [x0]
> +   cbnz w4, 2b
> +   ret
> +END (libat_store_16)
> +
> +
>  ENTRY (libat_store_16_i1)
>  cbnz w4, 1f
>
> @@ -101,14 +155,14 @@ ENTRY (libat_store_16_i1)
>  ret
>
>  /* 

[PATCH]: gcc/doc/extend.texi: Update builtin example for __builtin_FILE, __builtin_LINE __builtin_FUNCTION

2023-11-29 Thread Jonny Grant


gcc/ChangeLog:
doc/extend.texi: Update builtin example for __builtin_FILE
 __builtin_LINE __builtin_FUNCTION.



From 66290eb477dd1a99310ad9972c45391c2a87c1c7 Mon Sep 17 00:00:00 2001
From: Jonathan Grant 
Date: Wed, 29 Nov 2023 11:02:06 +
Subject: [PATCH] gcc/doc: Update builtin example for __builtin_FILE
 __builtin_LINE __builtin_FUNCTION

---
 gcc/doc/extend.texi | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1ae589aeb29..f17a4b215de 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -14660,20 +14660,22 @@ to @var{F} or the empty string if the call was not 
made at function
 scope.
 
 For example, in the following, each call to function @code{foo} will
-print a line similar to @code{"file.c:123: foo: message"} with the name
+print a line similar to @code{"file.c:5: foo: message"} with the name
 of the file and the line number of the @code{printf} call, the name of
 the function @code{foo}, followed by the word @code{message}.
 
 @smallexample
-const char*
-function (const char *func = __builtin_FUNCTION ())
+#include <stdio.h>
+
+void foo (void)
 @{
-  return func;
+  printf ("%s:%i: %s: message\n", __builtin_FILE (), __builtin_LINE (), 
__builtin_FUNCTION ());
+  printf ("%s:%i: %s: message\n", __builtin_FILE (), __builtin_LINE (), 
__builtin_FUNCTION ());
 @}
 
-void foo (void)
+int main (void)
 @{
-  printf ("%s:%i: %s: message\n", file (), line (), function ());
+  foo();
 @}
 @end smallexample
 
-- 
2.40.1
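
For comparison with the example in the patch, the builtins can also be used as C++ default arguments, in which case they are evaluated at each call site rather than inside the callee. The following is only a sketch under that assumption; `log_here` is a hypothetical helper, not something from the GCC manual.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical logging helper.  Because the builtins appear as default
// arguments, they report the file, line and function of the caller.
inline int log_here(const char *msg,
                    const char *file = __builtin_FILE(),
                    int line = __builtin_LINE(),
                    const char *func = __builtin_FUNCTION())
{
  std::printf("%s:%d: %s: %s\n", file, line, func, msg);
  return line;  // returned so that a caller can inspect the captured line
}
```

A call such as `log_here ("message")` inside `foo` would then print a line of the same shape as the one described in the documentation text.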


[Committed] RISC-V: Rename vconstraint into group_overlap

2023-11-29 Thread Juzhe-Zhong
Fix for Robin's suggestion.

gcc/ChangeLog:

* config/riscv/constraints.md (TARGET_VECTOR ? V_REGS : NO_REGS): Fix 
constraint.
* config/riscv/riscv.md (no,W21,W42,W84,W41,W81,W82): Rename 
vconstraint into group_overlap.
(no,yes): Ditto.
(none,W21,W42,W84,W43,W86,W87): Ditto.
* config/riscv/vector.md: Ditto.

---
 gcc/config/riscv/constraints.md | 12 ++--
 gcc/config/riscv/riscv.md   | 21 -
 gcc/config/riscv/vector.md  |  4 ++--
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 19bb36616bf..9836fd34460 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -183,14 +183,14 @@
 (define_register_constraint "W84" "TARGET_VECTOR ? V_REGS : NO_REGS"
   "A vector register has register number % 8 == 4." "regno % 8 == 4")
 
-(define_register_constraint "W41" "TARGET_VECTOR ? V_REGS : NO_REGS"
-  "A vector register has register number % 4 == 1." "regno % 4 == 1")
+(define_register_constraint "W43" "TARGET_VECTOR ? V_REGS : NO_REGS"
+  "A vector register has register number % 4 == 3." "regno % 4 == 3")
 
-(define_register_constraint "W81" "TARGET_VECTOR ? V_REGS : NO_REGS"
-  "A vector register has register number % 8 == 1." "regno % 8 == 1")
+(define_register_constraint "W86" "TARGET_VECTOR ? V_REGS : NO_REGS"
+  "A vector register has register number % 8 == 6." "regno % 8 == 6")
 
-(define_register_constraint "W82" "TARGET_VECTOR ? V_REGS : NO_REGS"
-  "A vector register has register number % 8 == 2." "regno % 8 == 2")
+(define_register_constraint "W87" "TARGET_VECTOR ? V_REGS : NO_REGS"
+  "A vector register has register number % 8 == 7." "regno % 8 == 7")
 
 ;; This constraint is used to match instruction "csrr %0, vlenb" which is 
generated in "mov".
 ;; VLENB is a run-time constant which represent the vector register length in 
bytes.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6bf2dfdf9b4..4c6f63677df 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -501,22 +501,25 @@
   ]
   (const_string "no")))
 
-(define_attr "vconstraint" "no,W21,W42,W84,W41,W81,W82"
-  (const_string "no"))
-
-(define_attr "vconstraint_enabled" "no,yes"
-  (cond [(eq_attr "vconstraint" "no")
+;; Widening instructions have group-overlap constraints.  Those are only
+;; valid for certain register-group sizes.  This attribute marks the
+;; alternatives not matching the required register-group size as disabled.
+(define_attr "group_overlap" "none,W21,W42,W84,W43,W86,W87"
+  (const_string "none"))
+
+(define_attr "group_overlap_valid" "no,yes"
+  (cond [(eq_attr "group_overlap" "none")
  (const_string "yes")
 
- (and (eq_attr "vconstraint" "W21")
+ (and (eq_attr "group_overlap" "W21")
  (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) 
!= 2"))
 (const_string "no")
 
- (and (eq_attr "vconstraint" "W42,W41")
+ (and (eq_attr "group_overlap" "W42,W43")
  (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) 
!= 4"))
 (const_string "no")
 
- (and (eq_attr "vconstraint" "W84,W81,W82")
+ (and (eq_attr "group_overlap" "W84,W86,W87")
  (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) 
!= 8"))
 (const_string "no")
 ]
@@ -531,7 +534,7 @@
 (eq_attr "fp_vector_disabled" "yes")
 (const_string "no")
 
-(eq_attr "vconstraint_enabled" "no")
+(eq_attr "group_overlap_valid" "no")
 (const_string "no")
   ]
   (const_string "yes")))
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 5667f8bd2b6..74716c73e98 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -3700,7 +3700,7 @@
   "vext.vf2\t%0,%3%p1"
   [(set_attr "type" "vext")
(set_attr "mode" "")
-   (set_attr "vconstraint" "W21,W21,W42,W42,W84,W84,no,no")])
+   (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")])
 
 ;; Vector Quad-Widening Sign-extend and Zero-extend.
 (define_insn "@pred__vf4"
@@ -3923,7 +3923,7 @@
(set (attr "ta") (symbol_ref "riscv_vector::get_ta(operands[5])"))
(set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
(set (attr "avl_type_idx") (const_int 7))
-   (set_attr "vconstraint" "W21,W21,W42,W42,W84,W84,no,no")])
+   (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")])
 
 ;; 
---
 ;;  Predicated integer Narrowing operations
-- 
2.36.3
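
Reading the constraint descriptions in the diff, a name like W84 pairs a register-group size (8) with a remainder (4), and group_overlap_valid disables an alternative when riscv_get_v_regno_alignment does not equal that group size. The sketch below models just that arithmetic. It assumes the naming pattern holds for every listed constraint (W21 and W42 are not shown in this diff), and all names in it are hypothetical.

```cpp
#include <cassert>

// Hypothetical model of a "W<group><remainder>" constraint.
struct overlap_constraint
{
  unsigned group;      // register-group size: 2, 4 or 8
  unsigned remainder;  // required value of regno % group
};

// Mirrors the "regno % 8 == 4" style conditions in constraints.md.
constexpr bool regno_matches(overlap_constraint c, unsigned regno)
{
  return regno % c.group == c.remainder;
}

// Mirrors group_overlap_valid: an alternative is only usable when the
// register alignment of the operand's mode equals the group size.
constexpr bool alternative_valid(overlap_constraint c, unsigned regno_alignment)
{
  return c.group == regno_alignment;
}
```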



Re: [PATCH v1 1/1] RISC-V: Initial RV64E and LP64E support

2023-11-29 Thread Tsukasa OI
Hi Patrick,

Found the cause: GCC is functionally correct, but I forgot to update the
corresponding test case (which assumes that 'E' is not ratified).

> #if !defined(__riscv_e) || (__riscv_e != (1 * 1000 * 1000 + 9 * 1000))
> #error "__riscv_e"
> #endif

1*1000*1000 + 9*1000 ('E' version 1.9) should have been changed to
2*1000*1000 + 0*1000, because the 'E' extension is now ratified as version 2.0.
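
The arithmetic behind those two values can be written out as a small helper. This is only an illustration of the encoding used in the test; `riscv_ext_version` is a hypothetical name, not a GCC function.

```cpp
#include <cassert>

// Hypothetical helper mirroring the arithmetic in predef-13.c:
// major * 1000 * 1000 + minor * 1000.
constexpr long riscv_ext_version(int major, int minor)
{
  return major * 1000L * 1000L + minor * 1000L;
}

static_assert(riscv_ext_version(1, 9) == 1009000, "'E' draft version 1.9");
static_assert(riscv_ext_version(2, 0) == 2000000, "'E' ratified version 2.0");
```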

I'll submit a fix later.

Thanks,
Tsukasa


On 2023/11/30 6:15, Patrick O'Neill wrote:
> Hi Tsukasa,
> 
> I'm seeing a new regression across all tested riscv targets:
> https://github.com/patrick-rivos/gcc-postcommit-ci/issues/224
> 
> Regression:
> 
> FAIL: gcc.target/riscv/predef-13.c -O0 (test for excess errors)
> FAIL: gcc.target/riscv/predef-13.c -O1 (test for excess errors)
> FAIL: gcc.target/riscv/predef-13.c -O2 (test for excess errors)
> FAIL: gcc.target/riscv/predef-13.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
> FAIL: gcc.target/riscv/predef-13.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors)
> FAIL: gcc.target/riscv/predef-13.c -O3 -g (test for excess errors)
> FAIL: gcc.target/riscv/predef-13.c -Os (test for excess errors)
> 
> Debug log:
> 
> Executing on host: 
> /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
>  
> -B/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
>   
> /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
>   -march=rv32gc -mabi=ilp32d -mcmodel=medlow   -fdiagnostics-plain-output
> -O0  -march=rv32e -mabi=ilp32e -mcmodel=medlow -misa-spec=2.2 -S   -o 
> predef-13.s(timeout = 600)
> spawn -ignore SIGHUP 
> /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
>  
> -B/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
>  
> /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
>  -march=rv32gc -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output -O0 
> -march=rv32e -mabi=ilp32e -mcmodel=medlow -misa-spec=2.2 -S -o predef-13.s
> /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:
>  In function 'main':
> /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2:
>  error: #error "__riscv_e"
> compiler exited with status 1
> FAIL: gcc.target/riscv/predef-13.c   -O0  (test for excess errors)
> Excess errors:
> /home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2:
>  error: #error "__riscv_e"
> 
> I bisected it locally to commit 006e90e13441c3716b40616282b200a0ef689376
> (this patch):
> 
>> ./bin/riscv64-unknown-linux-gnu-gcc -march=rv32e -mabi=ilp32e -S 
>> ../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
> ../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c: In function 'main':
> ../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2: error: #error 
> "__riscv_e"
>23 | #error "__riscv_e"
>   |  ^
> 
> Let me know if you need any additional info/investigation from me.
> 
> Thanks,
> Patrick
> 
> On 11/24/23 02:18, Tsukasa OI wrote:
>> From: Tsukasa OI 
>>
>> Along with RV32E, RV64E is ratified.  Though ILP32E and LP64E ABIs are
>> still draft, it's worth supporting it.
>>
>> gcc/ChangeLog:
>>
>>  * common/config/riscv/riscv-common.cc
>>  (riscv_ext_version_table): Set version to ratified 2.0.
>>  (riscv_subset_list::parse_std_ext): Allow RV64E.
>>  * config.gcc: Parse base ISA 'rv64e' and ABI 'lp64e'.
>>  * config/riscv/arch-canonicalize: Parse base ISA 'rv64e'.
>>  * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
>>  Define different macro per XLEN.  Add handling for ABI_LP64E.
>>  * config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
>>  Add handling for ABI_LP64E.
>>  * config/riscv/riscv-opts.h (enum riscv_abi_type): Add ABI_LP64E.
>>  * config/riscv/riscv.cc (riscv_option_override): Enhance error
>>  handling to support RV64E and LP64E.
>>  (riscv_conditional_register_usage): Change "RV32E" in a comment
>>  to "RV32E/RV64E".
>>  * config/riscv/riscv.h
>>  (UNITS_PER_FP_ARG): Add handling for ABI_LP64E.
>>  (STACK_BOUNDARY): Ditto.
>>  (ABI_STACK_BOUNDARY): Ditto.
>>  (MAX_ARGS_IN_REGISTERS): Ditto.
>>  (ABI_SPEC): Add support for "lp64e".
>>  * config/riscv/riscv.opt: Parse -mabi=lp64e as ABI_LP64E.
>>  * doc/invoke.texi: Add documentation of the LP64E ABI.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/riscv/predef-1.c: Test for __riscv_64e.
>>  * gcc.target/riscv/predef-2.c: Ditto.
>>  * 

Re: [PATCH 2/4] libbacktrace: detect executable path on windows

2023-11-29 Thread Ian Lance Taylor
On Mon, Nov 20, 2023 at 11:57 AM Björn Schäpers  wrote:
>
> this is what I'm using with GCC 12 and 13 on my windows machines, rebased onto
> the current HEAD.

Thanks.  Committed as follows.

Ian

* fileline.c: Include <windows.h> if available.
(windows_get_executable_path): New static function.
(fileline_initialize): Call windows_get_executable_path.
* configure.ac: Check for windows.h.
* configure: Regenerate.
* config.h.in: Regenerate.
0ee01dfacbcc9bc05d11433a69c0a0ac13afa42f
diff --git a/libbacktrace/config.h.in b/libbacktrace/config.h.in
index a4f5bf6..ee2616335c7 100644
--- a/libbacktrace/config.h.in
+++ b/libbacktrace/config.h.in
@@ -104,6 +104,9 @@
 /* Define to 1 if you have the <unistd.h> header file. */
 #undef HAVE_UNISTD_H
 
+/* Define to 1 if you have the <windows.h> header file. */
+#undef HAVE_WINDOWS_H
+
 /* Define if -lz is available. */
 #undef HAVE_ZLIB
 
diff --git a/libbacktrace/configure b/libbacktrace/configure
index 0ccc060901d..7ade966b54d 100755
--- a/libbacktrace/configure
+++ b/libbacktrace/configure
@@ -13509,6 +13509,19 @@ $as_echo "#define HAVE_LOADQUERY 1" >>confdefs.h
 
 fi
 
+for ac_header in windows.h
+do :
+  ac_fn_c_check_header_mongrel "$LINENO" "windows.h" "ac_cv_header_windows_h" 
"$ac_includes_default"
+if test "x$ac_cv_header_windows_h" = xyes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_WINDOWS_H 1
+_ACEOF
+
+fi
+
+done
+
+
 # Check for the fcntl function.
 if test -n "${with_target_subdir}"; then
case "${host}" in
diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac
index 71cd50f8cdf..00acb42eb6d 100644
--- a/libbacktrace/configure.ac
+++ b/libbacktrace/configure.ac
@@ -379,6 +379,8 @@ if test "$have_loadquery" = "yes"; then
   AC_DEFINE(HAVE_LOADQUERY, 1, [Define if AIX loadquery is available.])
 fi
 
+AC_CHECK_HEADERS(windows.h)
+
 # Check for the fcntl function.
 if test -n "${with_target_subdir}"; then
case "${host}" in
diff --git a/libbacktrace/fileline.c b/libbacktrace/fileline.c
index 0e560b44e7a..773f3a92969 100644
--- a/libbacktrace/fileline.c
+++ b/libbacktrace/fileline.c
@@ -47,6 +47,18 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #include 
 #endif
 
+#ifdef HAVE_WINDOWS_H
+#ifndef WIN32_MEAN_AND_LEAN
+#define WIN32_MEAN_AND_LEAN
+#endif
+
+#ifndef NOMINMAX
+#define NOMINMAX
+#endif
+
+#include <windows.h>
+#endif
+
 #include "backtrace.h"
 #include "internal.h"
 
@@ -165,6 +177,37 @@ macho_get_executable_path (struct backtrace_state *state,
 
 #endif /* !HAVE_DECL__PGMPTR */
 
+#ifdef HAVE_WINDOWS_H
+
+#define FILENAME_BUF_SIZE (MAX_PATH)
+
+static char *
+windows_get_executable_path (char *buf, backtrace_error_callback 
error_callback,
+void *data)
+{
+  size_t got;
+  int error;
+
+  got = GetModuleFileNameA (NULL, buf, FILENAME_BUF_SIZE - 1);
+  error = GetLastError ();
+  if (got == 0
+  || (got == FILENAME_BUF_SIZE - 1 && error == ERROR_INSUFFICIENT_BUFFER))
+{
+  error_callback (data,
+ "could not get the filename of the current executable",
+ error);
+  return NULL;
+}
+  return buf;
+}
+
+#else /* !defined (HAVE_WINDOWS_H) */
+
+#define windows_get_executable_path(buf, error_callback, data) NULL
+#define FILENAME_BUF_SIZE 64
+
+#endif /* !defined (HAVE_WINDOWS_H) */
+
 /* Initialize the fileline information from the executable.  Returns 1
on success, 0 on failure.  */
 
@@ -178,7 +221,7 @@ fileline_initialize (struct backtrace_state *state,
   int called_error_callback;
   int descriptor;
   const char *filename;
-  char buf[64];
+  char buf[FILENAME_BUF_SIZE];
 
   if (!state->threaded)
 failed = state->fileline_initialization_failed;
@@ -202,7 +245,7 @@ fileline_initialize (struct backtrace_state *state,
 
   descriptor = -1;
   called_error_callback = 0;
-  for (pass = 0; pass < 9; ++pass)
+  for (pass = 0; pass < 10; ++pass)
 {
   int does_not_exist;
 
@@ -239,6 +282,9 @@ fileline_initialize (struct backtrace_state *state,
case 8:
  filename = macho_get_executable_path (state, error_callback, data);
  break;
+   case 9:
+ filename = windows_get_executable_path (buf, error_callback, data);
+ break;
default:
  abort ();
}
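
The loop in fileline_initialize tries one candidate filename per pass until one can be opened, and the new pass 9 simply adds one more source to that list. A rough sketch of the pattern, with hypothetical names that are not part of libbacktrace's API:

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical type: each source either yields a candidate path or
// returns nullptr when it is unavailable on this platform.
using path_source = std::function<const char *()>;

// Returns the first candidate that can be opened for reading, or an
// empty string when every source fails.
inline std::string first_openable(const std::vector<path_source> &sources)
{
  for (const auto &get : sources)
    {
      const char *name = get();
      if (name == nullptr)
        continue;  // comparable to the macro that expands to NULL
      if (std::FILE *f = std::fopen(name, "rb"))
        {
          std::fclose(f);
          return name;
        }
    }
  return std::string();
}
```

Keeping each platform-specific lookup behind the same interface is what lets the non-Windows build replace windows_get_executable_path with a macro that yields NULL.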


[PATCH v2] c++: wrong ambiguity in accessing static field [PR112744]

2023-11-29 Thread Marek Polacek
On Wed, Nov 29, 2023 at 03:28:44PM -0500, Jason Merrill wrote:
> On 11/29/23 12:43, Marek Polacek wrote:
> > On Wed, Nov 29, 2023 at 12:23:46PM -0500, Patrick Palka wrote:
> > > On Wed, 29 Nov 2023, Marek Polacek wrote:
> > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > 
> > > > Now that I'm posting this patch, I think you'll probably want me to use
> > > > ba_any unconditionally.  That works too; g++.dg/tc1/dr52.C just needs
> > > > a trivial testsuite tweak:
> > > >'C' is not an accessible base of 'X'
> > > > v.
> > > >'C' is an inaccessible base of 'X'
> > > > We should probably unify those messages...
> > > > 
> > > > -- >8 --
> > > > Given
> > > > 
> > > >struct A { constexpr static int a = 0; };
> > > >struct B : A {};
> > > >struct C : A {};
> > > >struct D : B, C {};
> > > > 
> > > > we give the "'A' is an ambiguous base of 'D'" error for
> > > > 
> > > >D{}.A::a;
> > > > 
> > > > which seems wrong: 'a' is a static data member so there is only one copy
> > > > so it can be unambiguously referred to even if there are multiple A
> > > > objects.  clang++/MSVC/icx agree.
> > > > 
> > > > PR c++/112744
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * typeck.cc (finish_class_member_access_expr): When accessing
> > > > a static data member, use ba_any for lookup_base.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/lookup/scoped11.C: New test.
> > > > * g++.dg/lookup/scoped12.C: New test.
> > > > * g++.dg/lookup/scoped13.C: New test.
> > > > ---
> > > >   gcc/cp/typeck.cc   | 21 ++---
> > > >   gcc/testsuite/g++.dg/lookup/scoped11.C | 14 ++
> > > >   gcc/testsuite/g++.dg/lookup/scoped12.C | 14 ++
> > > >   gcc/testsuite/g++.dg/lookup/scoped13.C | 14 ++
> > > >   4 files changed, 60 insertions(+), 3 deletions(-)
> > > >   create mode 100644 gcc/testsuite/g++.dg/lookup/scoped11.C
> > > >   create mode 100644 gcc/testsuite/g++.dg/lookup/scoped12.C
> > > >   create mode 100644 gcc/testsuite/g++.dg/lookup/scoped13.C
> > > > 
> > > > diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
> > > > index e995fb6ddd7..c4de8bb2616 100644
> > > > --- a/gcc/cp/typeck.cc
> > > > +++ b/gcc/cp/typeck.cc
> > > > @@ -3476,7 +3476,7 @@ finish_class_member_access_expr (cp_expr object, 
> > > > tree name, bool template_p,
> > > >name, scope);
> > > >   return error_mark_node;
> > > > }
> > > > -   
> > > > +
> > > >   if (TREE_SIDE_EFFECTS (object))
> > > > val = build2 (COMPOUND_EXPR, TREE_TYPE (val), object, 
> > > > val);
> > > >   return val;
> > > > @@ -3493,9 +3493,24 @@ finish_class_member_access_expr (cp_expr object, 
> > > > tree name, bool template_p,
> > > >   return error_mark_node;
> > > > }
> > > > + /* NAME may refer to a static data member, in which case 
> > > > there is
> > > > +one copy of the data member that is shared by all the 
> > > > objects of
> > > > +the class.  So NAME can be unambiguously referred to even 
> > > > if
> > > > +there are multiple indirect base classes containing NAME.  
> > > > */
> > > > + const base_access ba = [scope, name] ()
> > > > +   {
> > > > + if (identifier_p (name))
> > > > +   {
> > > > + tree m = lookup_member (scope, name, /*protect=*/0,
> > > > + /*want_type=*/false, tf_none);
> > > > + if (!m || VAR_P (m))
> > > > +   return ba_any;
> > > 
> > > I wonder if we want to return ba_check_bit instead of ba_any so that we
> > > still check access of the selected base?
> > 
> > That would certainly make sense to me.  I didn't do that because
> > I'd not seen ba_check_bit being used except as part of ba_check,
> > but that may not mean much.
> > 
> > So either I can tweak the lambda to return ba_check_bit rather
> > than ba_any or use ba_check_bit unconditionally.  Any opinions on that?
> 
> The relevant passage seems to be
> https://eel.is/c++draft/class.access.base#6
> after DR 52, which seems to have clarified that the pointer conversion only
> applies to non-static members.
> 
> > >struct A { constexpr static int a = 0; };
> > >struct D : private A {};
> > > 
> > >void f() {
> > >  D{}.A::a; // #1 GCC (and Clang) currently rejects
> > >}
> 
> I see that MSVC also rejects it, while EDG accepts.
> 
> https://eel.is/c++draft/class.access.base#5.1 seems to say that a is
> accessible when named in A.
> 
> https://eel.is/c++draft/expr.ref#7 also only constrains references to
> non-static members.
> 
> But first we need to look up A in D, and A's injected-class-name looked up
> as a member of D is not accessible; it's private, and f() is not a friend,
> and we 

Re: [PATCH] c++: wrong ambiguity in accessing static field [PR112744]

2023-11-29 Thread Marek Polacek
On Wed, Nov 29, 2023 at 01:58:31PM -0500, Jason Merrill wrote:
> On 11/29/23 10:45, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > Now that I'm posting this patch, I think you'll probably want me to use
> > ba_any unconditionally.  That works too; g++.dg/tc1/dr52.C just needs
> > a trivial testsuite tweak:
> >'C' is not an accessible base of 'X'
> > v.
> >'C' is an inaccessible base of 'X'
> > We should probably unify those messages...
> 
> Hmm, won't using ba_any unconditionally break ambiguous base checking for
> non-static data members?

Yes.  I thought not but that's only because we weren't properly testing
that case (added scoped14.C, patch to follow).  So that settles that.
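
The distinction can be illustrated with a small sketch (not one of the new tests): the static member has a single copy shared by both A subobjects of D, while the non-static member exists once per subobject and is therefore ambiguous. The member `n` and the function name are hypothetical additions for illustration.

```cpp
#include <cassert>

struct A { static int a; int n = 0; };
int A::a = 0;  // the single definition shared by every A subobject
struct B : A {};
struct C : A {};
struct D : B, C {};

inline bool static_member_is_shared()
{
  B::a = 5;                                   // write through one base ...
  const bool same_value = (C::a == 5);        // ... read through the other
  const bool same_object = (&B::a == &C::a);  // both name the same object
  // D d; d.n = 1;  // would be rejected: D contains two A subobjects,
  //                // so the non-static member 'n' is ambiguous
  return same_value && same_object && D::a == 5;
}
```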
 
> > @@ -3493,9 +3493,24 @@ finish_class_member_access_expr (cp_expr object, 
> > tree name, bool template_p,
> >   return error_mark_node;
> > }
> > + /* NAME may refer to a static data member, in which case there is
> > +one copy of the data member that is shared by all the objects of
> > +the class.  So NAME can be unambiguously referred to even if
> > +there are multiple indirect base classes containing NAME.  */
> > + const base_access ba = [scope, name] ()
> 
> Why a lambda?

Only so that I can set 'ba' once and make it const.  I don't believe it
deserves a named function.  It seems more readable than using a ?:.
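
For readers unfamiliar with the idiom, this is the general shape of an immediately-invoked lambda used to initialize a const variable. The enum and function below are hypothetical stand-ins, not the code from typeck.cc.

```cpp
#include <cassert>

// Hypothetical stand-ins for ba_check and ba_any.
enum access_kind { check_access, any_access };

inline access_kind pick_access(bool names_static_member)
{
  // The lambda is called on the spot, so 'kind' is initialized exactly
  // once and stays const even though choosing its value takes more than
  // one statement.
  const access_kind kind = [names_static_member] ()
    {
      if (names_static_member)
        return any_access;
      return check_access;
    } ();
  return kind;
}
```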

> > +   {
> > + if (identifier_p (name))
> > +   {
> > + tree m = lookup_member (scope, name, /*protect=*/0,
> > + /*want_type=*/false, tf_none);
> > + if (!m || VAR_P (m))
> 
> Do you want shared_member_p here?

That looks like the right thing to use, thanks.

Marek



Re: [RFC PATCH] RISC-V: Remove f{r,s}flags builtins

2023-11-29 Thread Philipp Tomsich
These builtins are used internally for the
TARGET_ATOMIC_ASSIGN_EXPAND_FENV expansion (and therefore cannot be
removed):

/* Implement TARGET_ATOMIC_ASSIGN_EXPAND_FENV.  */

void
riscv_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
{
  if (!(TARGET_HARD_FLOAT || TARGET_ZFINX))
    return;

  tree frflags = GET_BUILTIN_DECL (CODE_FOR_riscv_frflags);
  tree fsflags = GET_BUILTIN_DECL (CODE_FOR_riscv_fsflags);
  tree old_flags = create_tmp_var_raw (RISCV_ATYPE_USI);

  *hold = build4 (TARGET_EXPR, RISCV_ATYPE_USI, old_flags,
  build_call_expr (frflags, 0), NULL_TREE, NULL_TREE);
  *clear = build_call_expr (fsflags, 1, old_flags);
  *update = NULL_TREE;
}
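
As a rough user-level analogue of what the hook above sets up, the standard <cfenv> functions can play the roles of frflags and fsflags: read the exception flags before an attempt, and write them back afterwards so that any exceptions raised by the attempt are discarded. This is a sketch under that analogy, with a hypothetical helper name, not what GCC emits.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: runs `attempt`, then restores the floating-point
// exception flags to the state they had on entry.
template <typename Attempt>
void discard_fp_exceptions_of(Attempt attempt)
{
  std::fexcept_t saved;
  std::fegetexceptflag(&saved, FE_ALL_EXCEPT);  // like *hold: old_flags = frflags ()
  attempt();
  std::fesetexceptflag(&saved, FE_ALL_EXCEPT);  // like *clear: fsflags (old_flags)
}
```

In the hook itself, *update is left as NULL_TREE, so there is no third step in this sketch either.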


On Wed, 29 Nov 2023 at 20:58, Christoph Müllner
 wrote:
>
> On Wed, Nov 29, 2023 at 8:24 PM Patrick O'Neill  wrote:
> >
> > Hi Christoph,
> >
> > The precommit-ci is seeing a large number of ICE segmentation faults as a 
> > result of this patch:
> > https://github.com/ewlu/gcc-precommit-ci/issues/796#issuecomment-1831853523
> >
> > The failures aren't in riscv.exp testsuite files so that's likely why you 
> > didn't run into them in your testing.
>
> Oh, I see.
> Then keeping things like they are is probably the best idea.
> Sorry for the noise!
>
> BR
> Christoph
>
> >
> > Debug log:
> >
> > /home/runner/work/gcc-precommit-ci/gcc-precommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.dg/c11-atomic-2.c:110:3:
> >  internal compiler error: Segmentation fault
> > 0x133afb3 crash_signal
> > ../../../gcc/gcc/toplev.cc:316
> > 0x1678d1f contains_struct_check(tree_node*, tree_node_structure_enum, char 
> > const*, int, char const*)
> > ../../../gcc/gcc/tree.h:3747
> > 0x1678d1f build_call_expr_loc_array(unsigned int, tree_node*, int, 
> > tree_node**)
> > ../../../gcc/gcc/tree.cc:10815
> > 0x1679043 build_call_expr(tree_node*, int, ...)
> > ../../../gcc/gcc/tree.cc:10865
> > 0x17f816e riscv_atomic_assign_expand_fenv(tree_node**, tree_node**, 
> > tree_node**)
> > ../../../gcc/gcc/config/riscv/riscv-builtins.cc:420
> > 0xc5209b build_atomic_assign
> > ../../../gcc/gcc/c/c-typeck.cc:4289
> > 0xc60a47 build_modify_expr(unsigned int, tree_node*, tree_node*, tree_code, 
> > unsigned int, tree_node*, tree_node*)
> > ../../../gcc/gcc/c/c-typeck.cc:6406
> > 0xc85a61 c_parser_expr_no_commas
> > ../../../gcc/gcc/c/c-parser.cc:9112
> > 0xc85db1 c_parser_expression
> > ../../../gcc/gcc/c/c-parser.cc:12725
> > 0xc862bb c_parser_expression_conv
> > ../../../gcc/gcc/c/c-parser.cc:12765
> > 0xca3607 c_parser_statement_after_labels
> > ../../../gcc/gcc/c/c-parser.cc:7755
> > 0xc9f27e c_parser_compound_statement_nostart
> > ../../../gcc/gcc/c/c-parser.cc:7242
> > 0xc9f804 c_parser_compound_statement
> > ../../../gcc/gcc/c/c-parser.cc:6527
> > 0xca359c c_parser_statement_after_labels
> > ../../../gcc/gcc/c/c-parser.cc:7590
> > 0xca5713 c_parser_statement
> > ../../../gcc/gcc/c/c-parser.cc:7561
> > 0xca5713 c_parser_c99_block_statement
> > ../../../gcc/gcc/c/c-parser.cc:7820
> > 0xca6a2c c_parser_do_statement
> > ../../../gcc/gcc/c/c-parser.cc:8194
> > 0xca3d51 c_parser_statement_after_labels
> > ../../../gcc/gcc/c/c-parser.cc:7605
> > 0xc9f27e c_parser_compound_statement_nostart
> > ../../../gcc/gcc/c/c-parser.cc:7242
> > 0xc9f804 c_parser_compound_statement
> > ../../../gcc/gcc/c/c-parser.cc:6527
> > Please submit a full bug report, with preprocessed source (by using 
> > -freport-bug).
> > Please include the complete backtrace with any bug report.
> > See  for instructions.
> > compiler exited with status 1
> > FAIL: gcc.dg/c11-atomic-2.c (internal compiler error: Segmentation fault)
> >
> > Let me know if you need any additional info/investigation from me.
> >
> > Thanks,
> > Patrick
> >
> > On 11/29/23 03:49, Christoph Muellner wrote:
> >
> > From: Christoph Müllner 
> >
> > We have two builtins which are undocumented and have no known users.
> > Further, they don't exist in LLVM (so are no portable).
> > This means they are in an unclear state of being supported or not.
> > Let's remove them get them out of this undecided state.
> >
> > A discussion about making these builtins available in all
> > compilers was held many years ago with the decision to
> > not document them in the RISC-V C API documentation:
> >   https://github.com/riscv-non-isa/riscv-c-api-doc/pull/3
> >
> > This is an RFC patch as this breaks existing code that uses
> > these builtins, even if we don't know if such code exists.
> >
> > An alternative to this patch would be to document them
> > in gcc/doc/extend.texi (like has been done with __builtin_riscv_pause)
> > and put them into a supported state.
> >
> > This patch removes two tests for these builtins.
> > A test of this patch did not trigger any regressions in riscv.exp.
> >
> > Signed-off-by: Christoph Müllner 
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv-builtins.cc: Remove the builtins
> > __builtin_riscv_frflags and __builtin_riscv_fsflags.
> >
> > gcc/testsuite/ChangeLog:
> 

RE: [PATCH 8/21]middle-end: update vectorizable_live_reduction with support for multiple exits and different exits

2023-11-29 Thread Tamar Christina
> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, November 29, 2023 2:29 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: RE: [PATCH 8/21]middle-end: update vectorizable_live_reduction
> with support for multiple exits and different exits
> 
> On Mon, 27 Nov 2023, Tamar Christina wrote:
> 
> >  >
> > > > This is a respun patch with a fix for VLA.
> > > >
> > > > > This adds support to vectorizable_live_reduction to handle
> > > > > multiple exits by doing a search for which exit the live value
> > > > > should be materialized in.
> > > >
> > > > Additionally which value in the index we're after depends on
> > > > whether the exit it's materialized in is an early exit or whether
> > > > the loop's main exit is different from the loop's natural one
> > > > (i.e. the one with the same src block as the latch).
> > > >
> > > > In those two cases we want the first rather than the last value as
> > > > we're going to restart the iteration in the scalar loop.  For VLA
> > > > this means we need to reverse both the mask and vector since
> > > > there's only a way to get the last active element and not the first.
> > > >
> > > > For inductions and multiple exits:
> > > >   - we test if the target will support vectorizing the induction
> > > >   - mark all inductions in the loop as relevant
> > > >   - for codegen of non-live inductions during codegen
> > > > >   - induction during an early exit gets the first element rather than
> > > > > the last.
> > > >
> > > > > For reductions and multiple exits:
> > > > >   - Reductions for early exits reduce the reduction definition statement
> > > > > rather than the reduction step.  This allows us to get the value at the
> > > > > start of the iteration.
> > > > >   - The peeling layout means that we just have to update one block, the
> > > > > merge block.  We expect all the reductions to be the same but we leave it
> > > > > up to the value numbering to clean up any duplicate code as we iterate
> > > > > over all edges.
> > > >
> > > > These two changes fix the reduction codegen given before which has
> > > > been added to the testsuite for early vect.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > > * tree-vect-loop.cc (vectorizable_live_operation): Support early exits.
> > > > > (vect_analyze_loop_operations): Check if target supports vectorizing IV.
> > > > > (vect_transform_loop): Call vectorizable_live_operation for non-live
> > > > > inductions or reductions.
> > > > > (find_connected_edge, vectorizable_live_operation_1): New.
> > > > > (vect_create_epilog_for_reduction): Support reductions in early break.
> > > > > * tree-vect-stmts.cc (perm_mask_for_reverse): Expose.
> > > > > (vect_stmt_relevant_p): Mark all inductions when early break as being
> > > > > relevant.
> > > > * tree-vectorizer.h (perm_mask_for_reverse): Expose.
> > > > (vect_iv_increment_position): New.
> > > > * tree-vect-loop-manip.cc (vect_iv_increment_position): Expose.
> > > >
> > > > --- inline copy of patch ---
> > > >
> > > > diff --git a/gcc/tree-vect-loop-manip.cc
> > > > b/gcc/tree-vect-loop-manip.cc index
> > > >
> > >
> 476be8a0bb6da2d06c4ca7052cb07bacecca60b1..1a4ba349fb6ae39c79401
> > > aecd4e7
> > > > e9e2b8a0 100644
> > > > --- a/gcc/tree-vect-loop-manip.cc
> > > > +++ b/gcc/tree-vect-loop-manip.cc
> > > > @@ -453,7 +453,7 @@ vect_adjust_loop_lens_control (tree iv_type,
> > > gimple_seq *seq,
> > > > INSERT_AFTER is set to true if the increment should be inserted 
> > > > after
> > > > *BSI.  */
> > > >
> > > > -static void
> > > > +void
> > > >  vect_iv_increment_position (edge loop_exit, gimple_stmt_iterator *bsi,
> > > > bool *insert_after)
> > > >  {
> > > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index
> > > >
> > >
> 8a50380de49bc12105be47ea1d8ee3cf1f2bdab4..b42318b2999e6a27e698
> > > 33821907
> > > > 92602cb25af1 100644
> > > > --- a/gcc/tree-vect-loop.cc
> > > > +++ b/gcc/tree-vect-loop.cc
> > > > @@ -2163,6 +2163,15 @@ vect_analyze_loop_operations
> (loop_vec_info
> > > loop_vinfo)
> > > > ok = vectorizable_live_operation (loop_vinfo, stmt_info,
> > > > NULL,
> > > NULL,
> > > >   -1, false, _vec);
> > > >
> > > > + /* Check if we can perform the operation for early break if 
> > > > we force
> > > > +the live operation.  */
> > > > + if (ok
> > > > + && LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> > > > + && !STMT_VINFO_LIVE_P (stmt_info)
> > > > + && STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def)
> > > > +   ok 

[PATCH] Fortran: fix TARGET attribute of associating entity in ASSOCIATE [PR112764]

2023-11-29 Thread Harald Anlauf
Dear all,

the attached simple patch fixes the handling of the TARGET
attribute of an associate variable in an ASSOCIATE construct.

See e.g. F2018:11.1.3.3 for a standard reference.

(Note that the patch does not touch the pointer or allocatable
attributes, as that would lead to several testsuite regressions
and thus needs more work.)
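
As a rough illustration of the rule the patch implements (F2018:11.1.3.3, quoted in the commit message), the decision can be modelled as a small predicate.  This is only a simplified C++ sketch with hypothetical names (SelectorAttr, assoc_name_has_target, assoc_name_selfcheck); it is not the code in primary.cc, which works on gfc_expr and symbol_attribute.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the selector's attributes;
// the real front end uses symbol_attribute.
struct SelectorAttr {
  bool is_variable;  // selector is a variable, not a general expression
  bool pointer;      // selector has the POINTER attribute
  bool target;       // selector has the TARGET attribute
};

// F2018:11.1.3.3: the associating entity has the TARGET attribute if and
// only if the selector is a variable and has either TARGET or POINTER.
inline bool assoc_name_has_target(const SelectorAttr &sel) {
  return sel.is_variable && (sel.pointer || sel.target);
}

// Small self-check covering the two selectors used in the new test.
inline void assoc_name_selfcheck() {
  assert(assoc_name_has_target(SelectorAttr{true, true, false}));   // x(:,1), x is a POINTER
  assert(assoc_name_has_target(SelectorAttr{true, false, true}));   // z(2:3), z is a TARGET
  assert(!assoc_name_has_target(SelectorAttr{false, true, true}));  // selector is an expression
}
```

The two positive cases correspond to the selectors x(:,1) and z(2:3) in associate_62.f90; the negative case is a selector that is not a variable.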

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 023dc4691c73ed594d5c1085f1aab897ca4a7153 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 29 Nov 2023 21:47:24 +0100
Subject: [PATCH] Fortran: fix TARGET attribute of associating entity in
 ASSOCIATE [PR112764]

The associating entity in an ASSOCIATE construct has the TARGET attribute
if and only if the selector is a variable and has either the TARGET or
POINTER attribute (e.g. F2018:11.1.3.3).

gcc/fortran/ChangeLog:

	PR fortran/112764
	* primary.cc (gfc_variable_attr): Set TARGET attribute of associating
	entity dependent on TARGET or POINTER attribute of selector.

gcc/testsuite/ChangeLog:

	PR fortran/112764
	* gfortran.dg/associate_62.f90: New test.
---
 gcc/fortran/primary.cc | 16 ++
 gcc/testsuite/gfortran.dg/associate_62.f90 | 25 ++
 2 files changed, 41 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/associate_62.f90

diff --git a/gcc/fortran/primary.cc b/gcc/fortran/primary.cc
index d3aeeb89362..7278932b634 100644
--- a/gcc/fortran/primary.cc
+++ b/gcc/fortran/primary.cc
@@ -2653,6 +2653,22 @@ gfc_variable_attr (gfc_expr *expr, gfc_typespec *ts)
   if (pointer || attr.proc_pointer)
 target = 1;

+  /* F2018:11.1.3.3: Other attributes of associate names
+ "The associating entity does not have the ALLOCATABLE or POINTER
+ attributes; it has the TARGET attribute if and only if the selector is
+ a variable and has either the TARGET or POINTER attribute."  */
+  if (sym->attr.associate_var && sym->assoc && sym->assoc->target)
+{
+  if (sym->assoc->target->expr_type == EXPR_VARIABLE)
+	{
+	  symbol_attribute tgt_attr;
+	  tgt_attr = gfc_expr_attr (sym->assoc->target);
+	  target = (tgt_attr.pointer || tgt_attr.target);
+	}
+  else
+	target = 0;
+}
+
   if (ts != NULL && expr->ts.type == BT_UNKNOWN)
 *ts = sym->ts;

diff --git a/gcc/testsuite/gfortran.dg/associate_62.f90 b/gcc/testsuite/gfortran.dg/associate_62.f90
new file mode 100644
index 000..ce5bf286ee8
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/associate_62.f90
@@ -0,0 +1,25 @@
+! { dg-do compile }
+! PR fortran/112764
+! Contributed by martin 
+
+program assoc_target
+  implicit none
+  integer, dimension(:,:), pointer :: x
+  integer, pointer :: j
+  integer, allocatable, target :: z(:)
+  allocate (x(1:100,1:2), source=1)
+  associate (i1 => x(:,1))
+j => i1(1)
+print *, j
+if (j /= 1) stop 1
+  end associate
+  deallocate (x)
+  allocate (z(3))
+  z(:) = [1,2,3]
+  associate (i2 => z(2:3))
+j => i2(1)
+print *, j
+if (j /= 2) stop 2
+  end associate
+  deallocate (z)
+end program assoc_target
--
2.35.3



Re: [PATCH v1 1/1] RISC-V: Initial RV64E and LP64E support

2023-11-29 Thread Patrick O'Neill

Hi Tsukasa,

I'm seeing a new regression across all tested riscv targets:
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/224

Regression:

FAIL: gcc.target/riscv/predef-13.c -O0 (test for excess errors)
FAIL: gcc.target/riscv/predef-13.c -O1 (test for excess errors)
FAIL: gcc.target/riscv/predef-13.c -O2 (test for excess errors)
FAIL: gcc.target/riscv/predef-13.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
FAIL: gcc.target/riscv/predef-13.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors)
FAIL: gcc.target/riscv/predef-13.c -O3 -g (test for excess errors)
FAIL: gcc.target/riscv/predef-13.c -Os (test for excess errors)


Debug log:

Executing on host: 
/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
 
-B/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
  
/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
  -march=rv32gc -mabi=ilp32d -mcmodel=medlow   -fdiagnostics-plain-output
-O0  -march=rv32e -mabi=ilp32e -mcmodel=medlow -misa-spec=2.2 -S   -o 
predef-13.s(timeout = 600)
spawn -ignore SIGHUP 
/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
 
-B/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
 
/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c
 -march=rv32gc -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output -O0 
-march=rv32e -mabi=ilp32e -mcmodel=medlow -misa-spec=2.2 -S -o predef-13.s
/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:
 In function 'main':
/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2:
 error: #error "__riscv_e"
compiler exited with status 1
FAIL: gcc.target/riscv/predef-13.c   -O0  (test for excess errors)
Excess errors:
/home/runner/work/gcc-postcommit-ci/gcc-postcommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2:
 error: #error "__riscv_e"

I bisected it locally to commit 006e90e13441c3716b40616282b200a0ef689376 
(this patch):



./bin/riscv64-unknown-linux-gnu-gcc -march=rv32e -mabi=ilp32e -S 
../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c

../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c: In function 'main':
../gcc/gcc/testsuite/gcc.target/riscv/predef-13.c:23:2: error: #error 
"__riscv_e"
   23 | #error "__riscv_e"
  |  ^

Let me know if you need any additional info/investigation from me.

Thanks,
Patrick

On 11/24/23 02:18, Tsukasa OI wrote:

From: Tsukasa OI

Along with RV32E, RV64E is ratified.  Though the ILP32E and LP64E ABIs are
still drafts, it's worth supporting them.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_ext_version_table): Set version to ratified 2.0.
(riscv_subset_list::parse_std_ext): Allow RV64E.
* config.gcc: Parse base ISA 'rv64e' and ABI 'lp64e'.
* config/riscv/arch-canonicalize: Parse base ISA 'rv64e'.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
Define different macro per XLEN.  Add handling for ABI_LP64E.
* config/riscv/riscv-d.cc (riscv_d_handle_target_float_abi):
Add handling for ABI_LP64E.
* config/riscv/riscv-opts.h (enum riscv_abi_type): Add ABI_LP64E.
* config/riscv/riscv.cc (riscv_option_override): Enhance error
handling to support RV64E and LP64E.
(riscv_conditional_register_usage): Change "RV32E" in a comment
to "RV32E/RV64E".
* config/riscv/riscv.h
(UNITS_PER_FP_ARG): Add handling for ABI_LP64E.
(STACK_BOUNDARY): Ditto.
(ABI_STACK_BOUNDARY): Ditto.
(MAX_ARGS_IN_REGISTERS): Ditto.
(ABI_SPEC): Add support for "lp64e".
* config/riscv/riscv.opt: Parse -mabi=lp64e as ABI_LP64E.
* doc/invoke.texi: Add documentation of the LP64E ABI.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/predef-1.c: Test for __riscv_64e.
* gcc.target/riscv/predef-2.c: Ditto.
* gcc.target/riscv/predef-3.c: Ditto.
* gcc.target/riscv/predef-4.c: Ditto.
* gcc.target/riscv/predef-5.c: Ditto.
* gcc.target/riscv/predef-6.c: Ditto.
* gcc.target/riscv/predef-7.c: Ditto.
* gcc.target/riscv/predef-8.c: Ditto.
* gcc.target/riscv/predef-9.c: New test for RV64E and LP64E,
based on predef-7.c.
---

Re: [PATCH] c++: bogus -Wparentheses warning [PR112765]

2023-11-29 Thread Jason Merrill

On 11/29/23 14:42, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


-- >8 --

We need to consistently look through implicit INDIRECT_REF when
setting/checking for -Wparentheses warning suppression.  In passing
use STRIP_REFERENCE_REF consistently as well.

PR c++/112765

gcc/cp/ChangeLog:

* pt.cc (tsubst_expr) : Look through
implicit INDIRECT_REF when propagating -Wparentheses
warning suppression.
* semantics.cc (maybe_warn_unparenthesized_assignment):
Replace REFERENCE_REF_P stripping with STRIP_REFERENCE_REF.
(finish_parenthesized_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wparentheses-33.C: New test.
---
  gcc/cp/pt.cc|  2 +-
  gcc/cp/semantics.cc |  6 ++
  gcc/testsuite/g++.dg/warn/Wparentheses-33.C | 24 +
  3 files changed, 27 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wparentheses-33.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 00b095265b6..fc4464dec02 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -20282,7 +20282,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
   build_x_modify_expr sets it and it must not be reset
   here.  */
if (warning_suppressed_p (t, OPT_Wparentheses))
- suppress_warning (r, OPT_Wparentheses);
+ suppress_warning (STRIP_REFERENCE_REF (r), OPT_Wparentheses);
  
  	RETURN (r);

}
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 3bf586453dc..fc00c20cba4 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -871,8 +871,7 @@ is_assignment_op_expr_p (tree t)
  void
  maybe_warn_unparenthesized_assignment (tree t, tsubst_flags_t complain)
  {
-  if (REFERENCE_REF_P (t))
-t = TREE_OPERAND (t, 0);
+  t = STRIP_REFERENCE_REF (t);
  
if ((complain & tf_warning)

&& warn_parentheses
@@ -2176,8 +2175,7 @@ finish_parenthesized_expr (cp_expr expr)
  {
/* This inhibits warnings in maybe_warn_unparenthesized_assignment
 and c_common_truthvalue_conversion.  */
-  tree inner = REFERENCE_REF_P (expr) ? TREE_OPERAND (expr, 0) : *expr;
-  suppress_warning (inner, OPT_Wparentheses);
+  suppress_warning (STRIP_REFERENCE_REF (*expr), OPT_Wparentheses);
  }
  
if (TREE_CODE (expr) == OFFSET_REF

diff --git a/gcc/testsuite/g++.dg/warn/Wparentheses-33.C 
b/gcc/testsuite/g++.dg/warn/Wparentheses-33.C
new file mode 100644
index 000..6c65037d1b4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wparentheses-33.C
@@ -0,0 +1,24 @@
+// PR c++/112765
+
+struct A {
+  A& operator=(const A&);
+  operator bool() const;
+};
+
+template
+void f(A a1, A a2) {
+  if ((a2 = a1)) // { dg-bogus "suggest parentheses" }
+return;
+  bool b = (a2 = a1); // { dg-bogus "suggest parentheses" }
+}
+
+template void f(A, A);
+
+template
+void g(T a1, T a2) {
+  if ((a2 = a1)) // { dg-bogus "suggest parentheses" }
+return;
+  bool b = (a2 = a1); // { dg-bogus "suggest parentheses" }
+}
+
+template void g(A, A);




Re: [PATCH v2] c++: P2280R4, Using unknown refs in constant expr [PR106650]

2023-11-29 Thread Jason Merrill

On 11/29/23 13:56, Marek Polacek wrote:

On Mon, Nov 20, 2023 at 04:29:33PM -0500, Jason Merrill wrote:

On 11/17/23 16:46, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch is an attempt to implement (part of?) P2280, Using unknown
pointers and references in constant expressions.  (Note that R4 seems to
only allow References to unknown/Accesses via this, but not Pointers to
unknown.)


Indeed.  That seems a bit arbitrary to me, but there it is.

We were rejecting the testcase before because cxx_bind_parameters_in_call
was trying to perform an lvalue->rvalue conversion on the reference itself;
this isn't really a thing in the language, but worked to implement the
reference bullet that the paper removes.  Your approach to fixing that makes
sense to me.

We should do the same for VAR_DECL references, e.g.

extern int (&r)[42];
constexpr int i = array_size (r);


Argh, right.
  

You also need to allow (implicit or explicit) use of 'this', as in:

struct A
{
   constexpr int f() { return 42; }
   void g() { constexpr int i = f(); }
};


Ah, I thought that already worked, but not so.  Apology apology.


This patch works to the extent that the test case added in [expr.const]
works as expected, as well as the test in


Most importantly, the proposal makes this compile:

template <typename T, size_t N>
constexpr auto array_size(T (&)[N]) -> size_t {
    return N;
}

void check(int const (&param)[3]) {
    constexpr auto s = array_size(param);
    static_assert (s == 3);
}

and I think it would be a pity not to have it in GCC 14.

What still doesn't work (and I don't know if it should) is the test in $3.2:

struct A2 { constexpr int f() { return 0; } };
struct B2 : virtual A2 {};
void f2(B2 &b) { constexpr int k = b.f(); }

where we say
error: '* & b' is not a constant expression


It seems like that is supposed to work, the problem is accessing the vtable
to perform the conversion.  I have WIP to recognize that conversion better
in order to fix PR53288; this testcase can wait for that fix.


Great.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
This patch is an attempt to implement (part of?) P2280, Using unknown
pointers and references in constant expressions.  (Note that R4 seems to
only allow References to unknown/Accesses via this, but not Pointers to
unknown.)

This patch works to the extent that the test case added in [expr.const]
works as expected, as well as the test in


Most importantly, the proposal makes this compile:

   template <typename T, size_t N>
   constexpr auto array_size(T (&)[N]) -> size_t {
       return N;
   }

   void check(int const (&param)[3]) {
       constexpr auto s = array_size(param);
       static_assert (s == 3);
   }

and I think it would be a pity not to have it in GCC 14.

What still doesn't work is the test in $3.2:

   struct A2 { constexpr int f() { return 0; } };
   struct B2 : virtual A2 {};
   void f2(B2 &b) { constexpr int k = b.f(); }

where we say
error: '* & b' is not a constant expression

This will be fixed in the future.

PR c++/106650

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression) : Allow
reference to unknown/this as per P2280.
: Allow reference to unknown as per P2280.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-array-ptr6.C: Remove dg-error.
* g++.dg/cpp0x/constexpr-ref12.C: Likewise.
* g++.dg/cpp0x/constexpr-ref2.C: Adjust dg-error.
* g++.dg/cpp0x/noexcept34.C: Remove dg-error.
* g++.dg/cpp1y/lambda-generic-const10.C: Likewise.
* g++.dg/cpp0x/constexpr-ref13.C: New test.
* g++.dg/cpp1z/constexpr-ref1.C: New test.
* g++.dg/cpp1z/constexpr-ref2.C: New test.
* g++.dg/cpp2a/constexpr-ref1.C: New test.
---
  gcc/cp/constexpr.cc   |  8 ++-
  .../g++.dg/cpp0x/constexpr-array-ptr6.C   |  2 +-
  gcc/testsuite/g++.dg/cpp0x/constexpr-ref12.C  |  4 +-
  gcc/testsuite/g++.dg/cpp0x/constexpr-ref13.C  | 41 ++
  gcc/testsuite/g++.dg/cpp0x/constexpr-ref2.C   |  4 +-
  gcc/testsuite/g++.dg/cpp0x/noexcept34.C   |  8 +--
  .../g++.dg/cpp1y/lambda-generic-const10.C |  2 +-
  gcc/testsuite/g++.dg/cpp1z/constexpr-ref1.C   | 26 +
  gcc/testsuite/g++.dg/cpp1z/constexpr-ref2.C   | 23 
  gcc/testsuite/g++.dg/cpp2a/constexpr-ref1.C   | 54 +++
  10 files changed, 161 insertions(+), 11 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-ref13.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-ref1.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-ref2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-ref1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 

Re: [PATCH] Add C intrinsics for scalar crypto extension

2023-11-29 Thread Craig Topper
The intrinsics that use macros are the ones that require an integer constant 
expression for one of the arguments. Clang needs to be able to see the constant 
expression as an argument to the underlying builtin. Thus the macro.

Based on my previous x86 experience, gcc may only require a macro for -O0. 
There are many x86 intrinsics in gcc that have two versions based on whether 
__OPTIMIZE__ is defined. For example 
https://github.com/gcc-mirror/gcc/blob/master/gcc/config/i386/xmmintrin.h#L52
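
A minimal sketch of that two-version pattern, assuming a made-up operation: demo_builtin_rotl below is a hypothetical stand-in for a target builtin that needs an integer constant expression as its second operand, and demo_rotl is the hypothetical user-facing intrinsic.  The real headers additionally use attributes such as __always_inline__, which are omitted here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a target builtin whose second operand must be
// an integer constant expression.  A real builtin would be rejected by the
// compiler if `imm` were not constant at the point of the call.
static inline uint32_t demo_builtin_rotl(uint32_t x, unsigned imm) {
  imm &= 31u;
  return imm ? (x << imm) | (x >> (32u - imm)) : x;
}

#ifdef __OPTIMIZE__
// With optimization enabled, inlining lets the constant reach the builtin,
// so a type-checked inline function is sufficient.
static inline uint32_t demo_rotl(uint32_t x, const unsigned imm) {
  return demo_builtin_rotl(x, imm);
}
#else
// At -O0 nothing is inlined or propagated, so the header falls back to a
// macro that passes the constant expression through textually.
#define demo_rotl(x, imm) demo_builtin_rotl((x), (imm))
#endif

// Callers look the same in both configurations.
static inline void demo_rotl_selfcheck() {
  assert(demo_rotl(0x80000001u, 1) == 0x00000003u);
}
```

Either way the call site is written identically; only the mechanism that gets the constant to the builtin differs.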

> On Nov 29, 2023, at 9:58 AM, Christoph Müllner  
> wrote:
> 
> On Wed, Nov 29, 2023 at 5:49 PM Liao Shihua  > wrote:
>> 
>> 
>> 在 2023/11/29 23:03, Christoph Müllner 写道:
>> 
>> On Mon, Nov 27, 2023 at 9:36 AM Liao Shihua  wrote:
>> 
>> This patch adds C intrinsics for the scalar crypto extension.
>> Because the riscv-c-api
>> (https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44/files) includes the
>> zbkb/zbkc/zbkx intrinsics in the bit manipulation extension, this patch only
>> supports the zkn*/zks* intrinsics.
>> 
>> Thanks for working on this!
>> Looking forward to seeing the second patch (covering bitmanip) soon as well!
>> A couple of comments can be found below.
>> 
>> 
>> Thanks for your comments, Christoph. Typos will be corrected in the next
>> patch.
>> 
>> The implementation of the intrinsics follows the implementation in
>> LLVM. (It does look a little strange.)
>> 
>> I will unify the implementation method in the next patch.
>> 
>> 
>> 
>> gcc/ChangeLog:
>> 
>>* config.gcc: Add riscv_crypto.h
>>* config/riscv/riscv_crypto.h: New file.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>* gcc.target/riscv/zknd32.c: Use intrinsics instead of builtins.
>>* gcc.target/riscv/zknd64.c: Likewise.
>>* gcc.target/riscv/zkne32.c: Likewise.
>>* gcc.target/riscv/zkne64.c: Likewise.
>>* gcc.target/riscv/zknh-sha256-32.c: Likewise.
>>* gcc.target/riscv/zknh-sha256-64.c: Likewise.
>>* gcc.target/riscv/zknh-sha512-32.c: Likewise.
>>* gcc.target/riscv/zknh-sha512-64.c: Likewise.
>>* gcc.target/riscv/zksed32.c: Likewise.
>>* gcc.target/riscv/zksed64.c: Likewise.
>>* gcc.target/riscv/zksh32.c: Likewise.
>>* gcc.target/riscv/zksh64.c: Likewise.
>> 
>> ---
>> gcc/config.gcc|   2 +-
>> gcc/config/riscv/riscv_crypto.h   | 219 ++
>> gcc/testsuite/gcc.target/riscv/zknd32.c   |   6 +-
>> gcc/testsuite/gcc.target/riscv/zknd64.c   |  12 +-
>> gcc/testsuite/gcc.target/riscv/zkne32.c   |   6 +-
>> gcc/testsuite/gcc.target/riscv/zkne64.c   |  10 +-
>> .../gcc.target/riscv/zknh-sha256-32.c |  22 +-
>> .../gcc.target/riscv/zknh-sha256-64.c |  10 +-
>> .../gcc.target/riscv/zknh-sha512-32.c |  14 +-
>> .../gcc.target/riscv/zknh-sha512-64.c |  10 +-
>> gcc/testsuite/gcc.target/riscv/zksed32.c  |   6 +-
>> gcc/testsuite/gcc.target/riscv/zksed64.c  |   6 +-
>> gcc/testsuite/gcc.target/riscv/zksh32.c   |   6 +-
>> gcc/testsuite/gcc.target/riscv/zksh64.c   |   6 +-
>> 14 files changed, 288 insertions(+), 47 deletions(-)
>> create mode 100644 gcc/config/riscv/riscv_crypto.h
>> 
>> diff --git a/gcc/config.gcc b/gcc/config.gcc
>> index b88591b6fd8..d67fe8b6a6f 100644
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -548,7 +548,7 @@ riscv*)
>>extra_objs="${extra_objs} riscv-vector-builtins.o 
>> riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
>>extra_objs="${extra_objs} thead.o riscv-target-attr.o"
>>d_target_objs="riscv-d.o"
>> -   extra_headers="riscv_vector.h"
>> +   extra_headers="riscv_vector.h riscv_crypto.h"
>>target_gtfiles="$target_gtfiles 
>> \$(srcdir)/config/riscv/riscv-vector-builtins.cc"
>>target_gtfiles="$target_gtfiles 
>> \$(srcdir)/config/riscv/riscv-vector-builtins.h"
>>;;
>> diff --git a/gcc/config/riscv/riscv_crypto.h 
>> b/gcc/config/riscv/riscv_crypto.h
>> new file mode 100644
>> index 000..149c1132e10
>> --- /dev/null
>> +++ b/gcc/config/riscv/riscv_crypto.h
>> @@ -0,0 +1,219 @@
>> +/* RISC-V 'K' Extension intrinsics include file.
>> +   Copyright (C) 2023 Free Software Foundation, Inc.
>> +
>> +   This file is part of GCC.
>> +
>> +   GCC is free software; you can redistribute it and/or modify it
>> +   under the terms of the GNU General Public License as published
>> +   by the Free Software Foundation; either version 3, or (at your
>> +   option) any later version.
>> +
>> +   GCC is distributed in the hope that it will be useful, but WITHOUT
>> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +   License for more details.
>> +
>> +   Under Section 7 of GPL version 3, you are granted additional
>> +   permissions described in the GCC Runtime Library Exception, version
>> +   3.1, as published 

Re: [PATCH] c++: wrong ambiguity in accessing static field [PR112744]

2023-11-29 Thread Jason Merrill

On 11/29/23 12:43, Marek Polacek wrote:

On Wed, Nov 29, 2023 at 12:23:46PM -0500, Patrick Palka wrote:

On Wed, 29 Nov 2023, Marek Polacek wrote:


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

Now that I'm posting this patch, I think you'll probably want me to use
ba_any unconditionally.  That works too; g++.dg/tc1/dr52.C just needs
a trivial testsuite tweak:
   'C' is not an accessible base of 'X'
v.
   'C' is an inaccessible base of 'X'
We should probably unify those messages...

-- >8 --
Given

   struct A { constexpr static int a = 0; };
   struct B : A {};
   struct C : A {};
   struct D : B, C {};

we give the "'A' is an ambiguous base of 'D'" error for

   D{}.A::a;

which seems wrong: 'a' is a static data member so there is only one copy
so it can be unambiguously referred to even if there are multiple A
objects.  clang++/MSVC/icx agree.
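
As a small illustration of the "only one copy" argument (not part of the patch, and using a non-constexpr member plus the hypothetical helper same_static_member so that it builds with any standard mode), both inheritance paths name the same object.  The sketch deliberately avoids the D{}.A::a spelling itself, since whether that is accepted depends on the compiler having this fix.

```cpp
#include <cassert>

struct A { static int a; };
int A::a = 0;          // the single definition shared by every A subobject
struct B : A {};
struct C : A {};
struct D : B, C {};    // D contains two distinct A subobjects

// Both paths name the same static member, so the addresses compare equal.
inline bool same_static_member() { return &B::a == &C::a; }

// A write through one path is visible through the other.
inline void static_member_selfcheck() {
  B::a = 7;
  assert(C::a == 7);
  assert(same_static_member());
}
```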

PR c++/112744

gcc/cp/ChangeLog:

* typeck.cc (finish_class_member_access_expr): When accessing
a static data member, use ba_any for lookup_base.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/scoped11.C: New test.
* g++.dg/lookup/scoped12.C: New test.
* g++.dg/lookup/scoped13.C: New test.
---
  gcc/cp/typeck.cc   | 21 ++---
  gcc/testsuite/g++.dg/lookup/scoped11.C | 14 ++
  gcc/testsuite/g++.dg/lookup/scoped12.C | 14 ++
  gcc/testsuite/g++.dg/lookup/scoped13.C | 14 ++
  4 files changed, 60 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped11.C
  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped12.C
  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped13.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index e995fb6ddd7..c4de8bb2616 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -3476,7 +3476,7 @@ finish_class_member_access_expr (cp_expr object, tree 
name, bool template_p,
   name, scope);
  return error_mark_node;
}
-   
+
  if (TREE_SIDE_EFFECTS (object))
val = build2 (COMPOUND_EXPR, TREE_TYPE (val), object, val);
  return val;
@@ -3493,9 +3493,24 @@ finish_class_member_access_expr (cp_expr object, tree 
name, bool template_p,
  return error_mark_node;
}
  
+	  /* NAME may refer to a static data member, in which case there is

+one copy of the data member that is shared by all the objects of
+the class.  So NAME can be unambiguously referred to even if
+there are multiple indirect base classes containing NAME.  */
+ const base_access ba = [scope, name] ()
+   {
+ if (identifier_p (name))
+   {
+ tree m = lookup_member (scope, name, /*protect=*/0,
+ /*want_type=*/false, tf_none);
+ if (!m || VAR_P (m))
+   return ba_any;


I wonder if we want to return ba_check_bit instead of ba_any so that we
still check access of the selected base?


That would certainly make sense to me.  I didn't do that because
I'd not seen ba_check_bit being used except as part of ba_check,
but that may not mean much.

So either I can tweak the lambda to return ba_check_bit rather
than ba_any or use ba_check_bit unconditionally.  Any opinions on that?


The relevant passage seems to be
https://eel.is/c++draft/class.access.base#6
after DR 52, which seems to have clarified that the pointer conversion 
only applies to non-static members.



   struct A { constexpr static int a = 0; };
   struct D : private A {};

   void f() {
 D{}.A::a; // #1 GCC (and Clang) currently rejects
   }


I see that MSVC also rejects it, while EDG accepts.

https://eel.is/c++draft/class.access.base#5.1 seems to say that a is 
accessible when named in A.


https://eel.is/c++draft/expr.ref#7 also only constrains references to 
non-static members.


But first we need to look up A in D, and A's injected-class-name looked 
up as a member of D is not accessible; it's private, and f() is not a 
friend, and we correctly complain about that.


If we avoid the lookup of A in D with

D{}.::A::a;

clang accepts it, which is consistent with accepting the template 
version, and seems correct.


So, I think ba_any is what we want here.

Jason



Re: [V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-29 Thread Joern Rennecke
On Wed, 29 Nov 2023 at 19:57, Joern Rennecke
 wrote:
>
> Attached is what I have for carry_backpropagate .
>
> The utility of special handling for SS_ASHIFT / US_ASHIFT seems
> somewhat marginal.
>
> I suspect it'd be more useful to add handling of LSHIFTRT and ASHIFTRT
> .  Some ports do
> a lot of static shifting.

> +case SS_ASHIFT:
> +case US_ASHIFT:
> +  if (!mask || XEXP (x, 1) == const0_rtx)
> +   return 0;

P.S.: I just realized that this is a pasto: in the case of a const0_rtx
shift count, returning 0 will usually be wrong.  OTOH the code below will
handle this almost perfectly - the one imperfection being that SS_ASHIFT
will see the sign bit as live if anything is live.  Not that it actually
matters if we track liveness in 8 / 8 / 16 / 32 bit chunks.
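
To make the shift-count-zero point concrete, here is a simplified model for a plain (non-saturating) left shift by a constant.  The helper live_in_for_shl is hypothetical and is not the patch's carry_backpropagate; it only shows that a zero count must pass the live mask through unchanged rather than yield 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: for a plain left shift by a constant, result bit i
// comes from input bit i - count, so the live input bits are the live
// result bits shifted right by count.  A count of 0 returns the mask as is.
inline uint64_t live_in_for_shl(uint64_t live_out, unsigned count) {
  return count < 64 ? live_out >> count : 0;
}

inline void live_in_selfcheck() {
  assert(live_in_for_shl(0xff00u, 8) == 0xffu);  // bits 8..15 come from bits 0..7
  assert(live_in_for_shl(0xffu, 0) == 0xffu);    // zero count: nothing is dropped
}
```

The saturating variants need more than this, since detecting overflow also depends on the bits that are shifted out.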


Re: [RFC PATCH] RISC-V: Remove f{r,s}flags builtins

2023-11-29 Thread Christoph Müllner
On Wed, Nov 29, 2023 at 8:24 PM Patrick O'Neill  wrote:
>
> Hi Christoph,
>
> The precommit-ci is seeing a large number of ICE segmentation faults as a 
> result of this patch:
> https://github.com/ewlu/gcc-precommit-ci/issues/796#issuecomment-1831853523
>
> The failures aren't in riscv.exp testsuite files so that's likely why you 
> didn't run into them in your testing.

Oh, I see.
Then keeping things like they are is probably the best idea.
Sorry for the noise!

BR
Christoph

>
> Debug log:
>
> /home/runner/work/gcc-precommit-ci/gcc-precommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.dg/c11-atomic-2.c:110:3:
>  internal compiler error: Segmentation fault
> 0x133afb3 crash_signal
> ../../../gcc/gcc/toplev.cc:316
> 0x1678d1f contains_struct_check(tree_node*, tree_node_structure_enum, char 
> const*, int, char const*)
> ../../../gcc/gcc/tree.h:3747
> 0x1678d1f build_call_expr_loc_array(unsigned int, tree_node*, int, 
> tree_node**)
> ../../../gcc/gcc/tree.cc:10815
> 0x1679043 build_call_expr(tree_node*, int, ...)
> ../../../gcc/gcc/tree.cc:10865
> 0x17f816e riscv_atomic_assign_expand_fenv(tree_node**, tree_node**, 
> tree_node**)
> ../../../gcc/gcc/config/riscv/riscv-builtins.cc:420
> 0xc5209b build_atomic_assign
> ../../../gcc/gcc/c/c-typeck.cc:4289
> 0xc60a47 build_modify_expr(unsigned int, tree_node*, tree_node*, tree_code, 
> unsigned int, tree_node*, tree_node*)
> ../../../gcc/gcc/c/c-typeck.cc:6406
> 0xc85a61 c_parser_expr_no_commas
> ../../../gcc/gcc/c/c-parser.cc:9112
> 0xc85db1 c_parser_expression
> ../../../gcc/gcc/c/c-parser.cc:12725
> 0xc862bb c_parser_expression_conv
> ../../../gcc/gcc/c/c-parser.cc:12765
> 0xca3607 c_parser_statement_after_labels
> ../../../gcc/gcc/c/c-parser.cc:7755
> 0xc9f27e c_parser_compound_statement_nostart
> ../../../gcc/gcc/c/c-parser.cc:7242
> 0xc9f804 c_parser_compound_statement
> ../../../gcc/gcc/c/c-parser.cc:6527
> 0xca359c c_parser_statement_after_labels
> ../../../gcc/gcc/c/c-parser.cc:7590
> 0xca5713 c_parser_statement
> ../../../gcc/gcc/c/c-parser.cc:7561
> 0xca5713 c_parser_c99_block_statement
> ../../../gcc/gcc/c/c-parser.cc:7820
> 0xca6a2c c_parser_do_statement
> ../../../gcc/gcc/c/c-parser.cc:8194
> 0xca3d51 c_parser_statement_after_labels
> ../../../gcc/gcc/c/c-parser.cc:7605
> 0xc9f27e c_parser_compound_statement_nostart
> ../../../gcc/gcc/c/c-parser.cc:7242
> 0xc9f804 c_parser_compound_statement
> ../../../gcc/gcc/c/c-parser.cc:6527
> Please submit a full bug report, with preprocessed source (by using 
> -freport-bug).
> Please include the complete backtrace with any bug report.
> See  for instructions.
> compiler exited with status 1
> FAIL: gcc.dg/c11-atomic-2.c (internal compiler error: Segmentation fault)
>
> Let me know if you need any additional info/investigation from me.
>
> Thanks,
> Patrick
>
> On 11/29/23 03:49, Christoph Muellner wrote:
>
> From: Christoph Müllner 
>
> We have two builtins which are undocumented and have no known users.
> Further, they don't exist in LLVM (so are not portable).
> This means they are in an unclear state of being supported or not.
> Let's remove them to get them out of this undecided state.
>
> A discussion about making these builtins available in all
> compilers was held many years ago with the decision to
> not document them in the RISC-V C API documentation:
>   https://github.com/riscv-non-isa/riscv-c-api-doc/pull/3
>
> This is an RFC patch as this breaks existing code that uses
> these builtins, even if we don't know if such code exists.
>
> An alternative to this patch would be to document them
> in gcc/doc/extend.texi (like has been done with __builtin_riscv_pause)
> and put them into a supported state.
>
> This patch removes two tests for these builtins.
> A test of this patch did not trigger any regressions in riscv.exp.
>
> Signed-off-by: Christoph Müllner 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-builtins.cc: Remove the builtins
> __builtin_riscv_frflags and __builtin_riscv_fsflags.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/riscv/frflags.C: Removed.
> * gcc.target/riscv/fsflags.c: Removed.
> ---
>  gcc/config/riscv/riscv-builtins.cc   |  2 --
>  gcc/testsuite/g++.target/riscv/frflags.C |  7 ---
>  gcc/testsuite/gcc.target/riscv/fsflags.c | 16 
>  3 files changed, 25 deletions(-)
>  delete mode 100644 gcc/testsuite/g++.target/riscv/frflags.C
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/fsflags.c
>
> diff --git a/gcc/config/riscv/riscv-builtins.cc 
> b/gcc/config/riscv/riscv-builtins.cc
> index fc3976f3ba1..1655492b246 100644
> --- a/gcc/config/riscv/riscv-builtins.cc
> +++ b/gcc/config/riscv/riscv-builtins.cc
> @@ -188,8 +188,6 @@ static const struct riscv_builtin_description 
> riscv_builtins[] = {
>#include "riscv-scalar-crypto.def"
>#include "corev.def"
>
> -  DIRECT_BUILTIN (frflags, RISCV_USI_FTYPE, hard_float),
> -  DIRECT_NO_TARGET_BUILTIN (fsflags, RISCV_VOID_FTYPE_USI, hard_float),
>RISCV_BUILTIN (pause, "pause", 

[V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-29 Thread Joern Rennecke
Attached is what I have for carry_backpropagate .

The utility of special handling for SS_ASHIFT / US_ASHIFT seems
somewhat marginal.

I suspect it'd be more useful to add handling of LSHIFTRT and ASHIFTRT.
Some ports do a lot of static shifting.
commit ed47c3d0d38f85c9b4e22bdbd079e0665465ef9c
Author: Joern Rennecke 
Date:   Wed Nov 29 18:46:06 2023 +

* ext-dce.c: Fixes for carry handling.

* ext-dce.c (safe_for_live_propagation): Handle MINUS.
  (ext_dce_process_uses): Break out carry handling into ..
  (carry_backpropagate): This new function.
  Better handling of ASHIFT.
  Add handling of SMUL_HIGHPART, UMUL_HIGHPART, SIGN_EXTEND, SS_ASHIFT and
  US_ASHIFT.

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 590656f72c7..2a4508181a1 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -83,6 +83,7 @@ safe_for_live_propagation (rtx_code code)
 case SIGN_EXTEND:
 case TRUNCATE:
 case PLUS:
+case MINUS:
 case MULT:
 case SMUL_HIGHPART:
 case UMUL_HIGHPART:
@@ -365,6 +366,67 @@ binop_implies_op2_fully_live (rtx_code code)
 }
 }
 
+/* X, with code CODE, is an operation for which safe_for_live_propagation
+   holds true, and bits set in MASK are live in the result.  Compute a mask
+   of (potentially) live bits in the non-constant inputs.  In case of
+   binop_implies_op2_fully_live (e.g. shifts), the computed mask may
+   exclusively pertain to the first operand.  */
+
+HOST_WIDE_INT
+carry_backpropagate (HOST_WIDE_INT mask, enum rtx_code code, rtx x)
+{
+  enum machine_mode mode = GET_MODE (x);
+  HOST_WIDE_INT mmask = GET_MODE_MASK (mode);
+  switch (code)
+{
+case ASHIFT:
+  if (CONSTANT_P (XEXP (x, 1))
+ && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode)))
+   return mask >> INTVAL (XEXP (x, 1));
+  /* Fall through.  */
+case PLUS: case MINUS:
+case MULT:
+  return mask ? ((2ULL << floor_log2 (mask)) - 1) : 0;
+case SMUL_HIGHPART: case UMUL_HIGHPART:
+  if (!mask || XEXP (x, 1) == const0_rtx)
+   return 0;
+  if (CONSTANT_P (XEXP (x, 1)))
+   {
+ if (pow2p_hwi (INTVAL (XEXP (x, 1))))
+   return mmask & (mask << (GET_MODE_BITSIZE (mode).to_constant ()
+- exact_log2 (INTVAL (XEXP (x, 1)))));
+
+ int bits = (2 * GET_MODE_BITSIZE (mode).to_constant ()
+ - clz_hwi (mask) - ctz_hwi (INTVAL (XEXP (x, 1))));
+ if (bits < GET_MODE_BITSIZE (mode).to_constant ())
+   return (1ULL << bits) - 1;
+   }
+  return mmask;
+case SIGN_EXTEND:
+  if (mask & ~mmask)
+   mask |= 1ULL << (GET_MODE_BITSIZE (mode).to_constant () - 1);
+  return mask;
+
+/* We propagate for the shifted operand, but not the shift
+   count.  The count is handled specially.  */
+case SS_ASHIFT:
+case US_ASHIFT:
+  if (!mask || XEXP (x, 1) == const0_rtx)
+   return 0;
+  if (CONSTANT_P (XEXP (x, 1))
+ && UINTVAL (XEXP (x, 1)) < GET_MODE_BITSIZE (mode).to_constant ())
+   {
+ return ((mmask & ~((unsigned HOST_WIDE_INT)mmask
+>> (INTVAL (XEXP (x, 1)) + (code == SS_ASHIFT))))
+ | (mask >> INTVAL (XEXP (x, 1))));
+   }
+  return mmask;
+default:
+  return mask;
+}
+}
 /* Process uses in INSN contained in OBJ.  Set appropriate bits in LIVENOW
for any chunks of pseudos that become live, potentially filtering using
bits from LIVE_TMP.
@@ -480,11 +542,7 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap 
livenow,
 sure everything that should get marked as live is marked
 from here onward.  */
 
- /* ?!? What is the point of this adjustment to DST_MASK?  */
- if (code == PLUS || code == MINUS
- || code == MULT || code == ASHIFT)
-   dst_mask
- = dst_mask ? ((2ULL << floor_log2 (dst_mask)) - 1) : 0;
+ dst_mask = carry_backpropagate (dst_mask, code, src);
 
  /* We will handle the other operand of a binary operator
 at the bottom of the loop by resetting Y.  */


[PATCH] c++: bogus -Wparentheses warning [PR112765]

2023-11-29 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

We need to consistently look through implicit INDIRECT_REF when
setting/checking for -Wparentheses warning suppression.  In passing
use STRIP_REFERENCE_REF consistently as well.

PR c++/112765

gcc/cp/ChangeLog:

* pt.cc (tsubst_expr) : Look through
implicit INDIRECT_REF when propagating -Wparentheses
warning suppression.
* semantics.cc (maybe_warn_unparenthesized_assignment):
Replace REFERENCE_REF_P stripping with STRIP_REFERENCE_REF.
(finish_parenthesized_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wparentheses-33.C: New test.
---
 gcc/cp/pt.cc|  2 +-
 gcc/cp/semantics.cc |  6 ++
 gcc/testsuite/g++.dg/warn/Wparentheses-33.C | 24 +
 3 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wparentheses-33.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 00b095265b6..fc4464dec02 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -20282,7 +20282,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
   build_x_modify_expr sets it and it must not be reset
   here.  */
if (warning_suppressed_p (t, OPT_Wparentheses))
- suppress_warning (r, OPT_Wparentheses);
+ suppress_warning (STRIP_REFERENCE_REF (r), OPT_Wparentheses);
 
RETURN (r);
   }
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 3bf586453dc..fc00c20cba4 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -871,8 +871,7 @@ is_assignment_op_expr_p (tree t)
 void
 maybe_warn_unparenthesized_assignment (tree t, tsubst_flags_t complain)
 {
-  if (REFERENCE_REF_P (t))
-t = TREE_OPERAND (t, 0);
+  t = STRIP_REFERENCE_REF (t);
 
   if ((complain & tf_warning)
   && warn_parentheses
@@ -2176,8 +2175,7 @@ finish_parenthesized_expr (cp_expr expr)
 {
   /* This inhibits warnings in maybe_warn_unparenthesized_assignment
 and c_common_truthvalue_conversion.  */
-  tree inner = REFERENCE_REF_P (expr) ? TREE_OPERAND (expr, 0) : *expr;
-  suppress_warning (inner, OPT_Wparentheses);
+  suppress_warning (STRIP_REFERENCE_REF (*expr), OPT_Wparentheses);
 }
 
   if (TREE_CODE (expr) == OFFSET_REF
diff --git a/gcc/testsuite/g++.dg/warn/Wparentheses-33.C 
b/gcc/testsuite/g++.dg/warn/Wparentheses-33.C
new file mode 100644
index 000..6c65037d1b4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wparentheses-33.C
@@ -0,0 +1,24 @@
+// PR c++/112765
+
+struct A {
+  A& operator=(const A&);
+  operator bool() const;
+};
+
+template
+void f(A a1, A a2) {
+  if ((a2 = a1)) // { dg-bogus "suggest parentheses" }
+return;
+  bool b = (a2 = a1); // { dg-bogus "suggest parentheses" }
+}
+
+template void f(A, A);
+
+template
+void g(T a1, T a2) {
+  if ((a2 = a1)) // { dg-bogus "suggest parentheses" }
+return;
+  bool b = (a2 = a1); // { dg-bogus "suggest parentheses" }
+}
+
+template void g(A, A);
-- 
2.43.0.rc1



Re: [RFC PATCH] RISC-V: Remove f{r,s}flags builtins

2023-11-29 Thread Patrick O'Neill

Hi Christoph,

The precommit-ci is seeing a large number of ICE segmentation faults as 
a result of this patch:

https://github.com/ewlu/gcc-precommit-ci/issues/796#issuecomment-1831853523

The failures aren't in riscv.exp testsuite files so that's likely why 
you didn't run into them in your testing.


Debug log:

/home/runner/work/gcc-precommit-ci/gcc-precommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.dg/c11-atomic-2.c:110:3:
 internal compiler error: Segmentation fault
0x133afb3 crash_signal
../../../gcc/gcc/toplev.cc:316
0x1678d1f contains_struct_check(tree_node*, tree_node_structure_enum, char 
const*, int, char const*)
../../../gcc/gcc/tree.h:3747
0x1678d1f build_call_expr_loc_array(unsigned int, tree_node*, int, tree_node**)
../../../gcc/gcc/tree.cc:10815
0x1679043 build_call_expr(tree_node*, int, ...)
../../../gcc/gcc/tree.cc:10865
0x17f816e riscv_atomic_assign_expand_fenv(tree_node**, tree_node**, tree_node**)
../../../gcc/gcc/config/riscv/riscv-builtins.cc:420
0xc5209b build_atomic_assign
../../../gcc/gcc/c/c-typeck.cc:4289
0xc60a47 build_modify_expr(unsigned int, tree_node*, tree_node*, tree_code, 
unsigned int, tree_node*, tree_node*)
../../../gcc/gcc/c/c-typeck.cc:6406
0xc85a61 c_parser_expr_no_commas
../../../gcc/gcc/c/c-parser.cc:9112
0xc85db1 c_parser_expression
../../../gcc/gcc/c/c-parser.cc:12725
0xc862bb c_parser_expression_conv
../../../gcc/gcc/c/c-parser.cc:12765
0xca3607 c_parser_statement_after_labels
../../../gcc/gcc/c/c-parser.cc:7755
0xc9f27e c_parser_compound_statement_nostart
../../../gcc/gcc/c/c-parser.cc:7242
0xc9f804 c_parser_compound_statement
../../../gcc/gcc/c/c-parser.cc:6527
0xca359c c_parser_statement_after_labels
../../../gcc/gcc/c/c-parser.cc:7590
0xca5713 c_parser_statement
../../../gcc/gcc/c/c-parser.cc:7561
0xca5713 c_parser_c99_block_statement
../../../gcc/gcc/c/c-parser.cc:7820
0xca6a2c c_parser_do_statement
../../../gcc/gcc/c/c-parser.cc:8194
0xca3d51 c_parser_statement_after_labels
../../../gcc/gcc/c/c-parser.cc:7605
0xc9f27e c_parser_compound_statement_nostart
../../../gcc/gcc/c/c-parser.cc:7242
0xc9f804 c_parser_compound_statement
../../../gcc/gcc/c/c-parser.cc:6527
Please submit a full bug report, with preprocessed source (by using 
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
compiler exited with status 1
FAIL: gcc.dg/c11-atomic-2.c (internal compiler error: Segmentation fault)
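
For context on why a generic C test is affected: the backtrace goes through
build_atomic_assign into riscv_atomic_assign_expand_fenv, i.e. the path taken
for compound assignment on an atomic floating-point object, which has to save
and restore the floating-point exception flags around its retry loop.  The
sketch below is only an illustration of that hold / clear / update shape in
portable C++ using <cfenv>; the function name is made up, and it is not the
code the RISC-V back end emits (which, as far as the backtrace suggests,
relies on the two builtins being removed).

```cpp
#include <atomic>
#include <cfenv>

// Hypothetical helper: add V to OBJ atomically while keeping only the
// floating-point exceptions raised by the attempt that finally succeeds.
inline float atomic_add_with_fenv(std::atomic<float> &obj, float v)
{
    std::fenv_t saved;
    std::feholdexcept(&saved);              // "hold": save env, clear flags
    float old_val = obj.load();
    float new_val;
    for (;;) {
        new_val = old_val + v;
        if (obj.compare_exchange_strong(old_val, new_val))
            break;
        std::feclearexcept(FE_ALL_EXCEPT);  // "clear": drop flags of a failed try
    }
    std::feupdateenv(&saved);               // "update": restore env, re-raise flags
    return new_val;
}
```

A target hook that builds these three steps from builtins will dereference a
null decl if the builtins are no longer registered, which would be consistent
with the contains_struct_check frame at the top of the log.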

Let me know if you need any additional info/investigation from me.

Thanks,
Patrick

On 11/29/23 03:49, Christoph Muellner wrote:

From: Christoph Müllner

We have two builtins which are undocumented and have no known users.
Further, they don't exist in LLVM (so are not portable).
This means they are in an unclear state of being supported or not.
Let's remove them to get them out of this undecided state.

A discussion about making these builtins available in all
compilers was held many years ago with the decision to
not document them in the RISC-V C API documentation:
   https://github.com/riscv-non-isa/riscv-c-api-doc/pull/3

This is an RFC patch as this breaks existing code that uses
these builtins, even if we don't know if such code exists.

An alternative to this patch would be to document them
in gcc/doc/extend.texi (like has been done with __builtin_riscv_pause)
and put them into a supported state.

This patch removes two tests for these builtins.
A test of this patch did not trigger any regressions in riscv.exp.

Signed-off-by: Christoph Müllner

gcc/ChangeLog:

* config/riscv/riscv-builtins.cc: Remove the builtins
__builtin_riscv_frflags and __builtin_riscv_fsflags.

gcc/testsuite/ChangeLog:

* g++.target/riscv/frflags.C: Removed.
* gcc.target/riscv/fsflags.c: Removed.
---
  gcc/config/riscv/riscv-builtins.cc   |  2 --
  gcc/testsuite/g++.target/riscv/frflags.C |  7 ---
  gcc/testsuite/gcc.target/riscv/fsflags.c | 16 
  3 files changed, 25 deletions(-)
  delete mode 100644 gcc/testsuite/g++.target/riscv/frflags.C
  delete mode 100644 gcc/testsuite/gcc.target/riscv/fsflags.c

diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index fc3976f3ba1..1655492b246 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -188,8 +188,6 @@ static const struct riscv_builtin_description 
riscv_builtins[] = {
#include "riscv-scalar-crypto.def"
#include "corev.def"
  
-  DIRECT_BUILTIN (frflags, RISCV_USI_FTYPE, hard_float),

-  DIRECT_NO_TARGET_BUILTIN (fsflags, RISCV_VOID_FTYPE_USI, hard_float),
RISCV_BUILTIN (pause, "pause", RISCV_BUILTIN_DIRECT_NO_TARGET, 
RISCV_VOID_FTYPE, hint_pause),
  };
  
diff --git a/gcc/testsuite/g++.target/riscv/frflags.C b/gcc/testsuite/g++.target/riscv/frflags.C

deleted file mode 100644

Re: [V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-29 Thread Jivan Hakobyan


The cause is the removal of MINUS from safe_for_live_propagation.
That was not intentional; we will restore it in V3.




> On 29 Nov 2023, at 19:46, Xi Ruoyao  wrote:
> 
> On Wed, 2023-11-29 at 20:37 +0800, Xi Ruoyao wrote:
>>> On Wed, 2023-11-29 at 17:33 +0800, Xi Ruoyao wrote:
>>> On Mon, 2023-11-27 at 23:06 -0700, Jeff Law wrote:
 This has (of course) been tested on rv64.  It's also been bootstrapped
 and regression tested on x86.  Bootstrap and regression tested (C only)
 for m68k, sh4, sh4eb, alpha.  Earlier versions were also bootstrapped
 and regression tested on ppc, hppa and s390x (C only for those as well).
   It's also been tested on the various crosses in my tester.  So we've
 got reasonable coverage of 16, 32 and 64 bit targets, big and little
 endian, with and without SHIFT_COUNT_TRUNCATED and all kinds of other
 oddities.
 
 The included tests are for RISC-V only because not all targets are going
 to have extraneous extensions.   There's tests from coremark, x264 and
 GCC's bz database.  It probably wouldn't be hard to add aarch64
 test cases.  The BZs listed are improved by this patch for aarch64.
>>> 
>>> I've successfully bootstrapped this on loongarch64-linux-gnu and tried
>>> the added test cases.  For loongarch64 the redundant extensions are
>>> removed for core_bench_list.c, core_init_matrix.c, core_list_init.c,
>>> matrix_add_const.c, and pr111384.c, but not mem-extend.c.
> 
>> Follow up: no regression in GCC test suite on LoongArch.
>> 
>>> Should I change something in LoongArch backend in order to make ext_dce
>>> work for mem-extend.c too?  If yes then any pointers?
> 
> Hmm... This test seems not working even for RISC-V:
> 
> $ ./gcc/cc1 -O2 ../gcc/gcc/testsuite/gcc.target/riscv/mem-extend.c  -nostdinc 
> -fdump-rtl-ext_dce -march=rv64gc_zbb -mabi=lp64d -o- 2>&1 | grep -F zext.h
>    zext.h  a5,a5
>    zext.h  a4,a4
> 
> and the 294r.ext_dce file does not contain "Successfully transformed
> to:" lines.
> 
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [RFA] New pass for sign/zero extension elimination

2023-11-29 Thread Jivan Hakobyan
We already noticed it and will roll back in V3



With the best regards
Jivan Hakobyan

> On 29 Nov 2023, at 21:37, Joern Rennecke  wrote:
> 
> Why did you leave out MINUS from safe_for_live_propagation ?


Re: [PATCH] bpf: change ASM_COMMENT_START to '#'

2023-11-29 Thread Jose E. Marchesi


Hi David.
OK.  Thanks.
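
As a quick illustration of why the relaxed dg-final scans in the quoted patch
are insensitive to the comment character, here is a small standalone sketch.
It uses C++ std::regex (ECMAScript syntax) rather than the Tcl regexps that
DejaGnu actually runs, the helper names are made up, and the sample assembler
lines are assumptions derived from the old scan pattern, not real compiler
output.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: the relaxed BPF_ENUMVAL_EXISTS pattern, which does
// not mention the comment character at all.
inline bool matches_relaxed_kind_a(const std::string &line)
{
    static const std::regex relaxed(R"(0xa[\t ]+[^\n]*bpfcr_kind)");
    return std::regex_search(line, relaxed);
}

// Hypothetical helper: the old pattern, which hard-codes ';' as the
// comment starter.
inline bool matches_old_kind_a(const std::string &line)
{
    static const std::regex old_pattern(R"(\t.4byte\t0xa\t; bpfcr_kind)");
    return std::regex_search(line, old_pattern);
}
```

Under these assumptions the relaxed pattern accepts a line ending in either
"; bpfcr_kind" or "# bpfcr_kind", while the old one accepts only the former.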

> The BPF "pseudo-C" assembly dialect uses semi-colon (;) to separate
> statements, not to begin line comments. The GNU assembler was recently
> changed accordingly:
>
>   https://sourceware.org/pipermail/binutils/2023-November/130867.html
>
> This patch adapts the BPF backend in GCC accordingly, to use a hash (#)
> instead of semi-colon (;) for ASM_COMMENT_START. This is supported
> already in clang.
>
> Tested on x86_64-linux-gnu host for bpf-unknown-none target.
>
> gcc/
>   * config/bpf/bpf.h (ASM_COMMENT_START): Change from ';' to '#'.
>
> gcc/testsuite/
>   * gcc.target/bpf/core-builtin-enumvalue-opt.c: Change dg-final
>   scans to not assume a specific comment character.
>   * gcc.target/bpf/core-builtin-enumvalue.c: Likewise.
>   * gcc.target/bpf/core-builtin-type-based.c: Likewise.
>   * gcc.target/bpf/core-builtin-type-id.c: Likewise.
> ---
>  gcc/config/bpf/bpf.h |  2 +-
>  .../gcc.target/bpf/core-builtin-enumvalue-opt.c  |  8 
>  .../gcc.target/bpf/core-builtin-enumvalue.c  | 12 ++--
>  .../gcc.target/bpf/core-builtin-type-based.c |  8 
>  gcc/testsuite/gcc.target/bpf/core-builtin-type-id.c  |  6 +++---
>  5 files changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
> index 1f177ec4c4e..d175e99046c 100644
> --- a/gcc/config/bpf/bpf.h
> +++ b/gcc/config/bpf/bpf.h
> @@ -393,7 +393,7 @@ enum reg_class
>  
>  /*** The Overall Framework of an Assembler File.  */
>  
> -#define ASM_COMMENT_START ";"
> +#define ASM_COMMENT_START "#"
>  
>  /* Output to assembler file text saying following lines
> may contain character constants, extra white space, comments, etc.  */
> diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c 
> b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c
> index c87e1a3ba3b..fc3c299fe9c 100644
> --- a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c
> +++ b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c
> @@ -26,10 +26,10 @@ unsigned long foo(void *data)
>   return 0;
>  }
>  
> -/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type 
> \\(named_ue64\\)" 2 } } */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type 
> \\(named_se64\\)" 2} } */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0xa\t; bpfcr_kind" 2 } } 
> BPF_ENUMVAL_EXISTS */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0xb\t; bpfcr_kind" 2 } } 
> BPF_ENUMVAL_VALUE */
> +/* { dg-final { scan-assembler-times "bpfcr_type \\(named_ue64\\)" 2 } } */
> +/* { dg-final { scan-assembler-times "bpfcr_type \\(named_se64\\)" 2} } */
> +/* { dg-final { scan-assembler-times "0xa\[\t \]+\[^\n\]*bpfcr_kind" 2 } } 
> BPF_ENUMVAL_EXISTS */
> +/* { dg-final { scan-assembler-times "0xb\[\t \]+\[^\n\]*bpfcr_kind" 2 } } 
> BPF_ENUMVAL_VALUE */
>  
>  /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0\"\\)" 4 } } */
>  
> diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c 
> b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c
> index 2f16903b8d6..23dfd8a10bf 100644
> --- a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c
> +++ b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c
> @@ -40,12 +40,12 @@ int foo(void *data)
>   return 0;
>  }
>  
> -/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type 
> \\(named_ue64\\)" 5 } } */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type 
> \\(named_se64\\)" 5} } */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type 
> \\(named_ue\\)" 5 } } */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type 
> \\(named_se\\)" 5} } */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0xa\t; bpfcr_kind" 12 } } 
> BPF_ENUMVAL_EXISTS */
> -/* { dg-final { scan-assembler-times "\t.4byte\t0xb\t; bpfcr_kind" 8 } } 
> BPF_ENUMVAL_VALUE */
> +/* { dg-final { scan-assembler-times "bpfcr_type \\(named_ue64\\)" 5 } } */
> +/* { dg-final { scan-assembler-times "bpfcr_type \\(named_se64\\)" 5} } */
> +/* { dg-final { scan-assembler-times "bpfcr_type \\(named_ue\\)" 5 } } */
> +/* { dg-final { scan-assembler-times "bpfcr_type \\(named_se\\)" 5} } */
> +/* { dg-final { scan-assembler-times "0xa\[\t \]+\[^\n\]*bpfcr_kind" 12 } } 
> BPF_ENUMVAL_EXISTS */
> +/* { dg-final { scan-assembler-times "0xb\[\t \]+\[^\n\]*bpfcr_kind" 8 } } 
> BPF_ENUMVAL_VALUE */
>  
>  /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0\"\\)" 8 } } */
>  /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"1\"\\)" 8 } } */
> diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c 
> b/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
> index 16b48ae0a00..74a8d5a14d9 100644
> --- a/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
> +++ b/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
> @@ 

Re: [PATCH #2/4] c++: mark short-enums as packed

2023-11-29 Thread Jason Merrill

On 11/29/23 04:39, Alexandre Oliva wrote:

Hello, Jason,

On Nov 22, 2023, Jason Merrill  wrote:


On 11/22/23 13:12, Jason Merrill wrote:

I'm coming to the conclusion that your C++ patch is fine but we
should remove the TYPE_PACKED warning from
check_address_or_pointer_of_packed_member.  And maybe add
-Wcast-align=strict to -Wextra.



Since I seem to have opinions, I'm preparing a patch for this.


Thanks for that patch.  It makes sense to me, but I suppose that, if
it goes in, I should revert the already-installed #1/4 in this series
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637244.html
rather than install #4/4 that Mike approved.
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637336.html

I wasn't sure whether your earlier conclusion (quoted above) was meant
as an 'Ok' for the C++ patch.  Please confirm if so.  TIA,


Yes.

Jason



Re: [PATCH] c++: wrong ambiguity in accessing static field [PR112744]

2023-11-29 Thread Jason Merrill

On 11/29/23 10:45, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

Now that I'm posting this patch, I think you'll probably want me to use
ba_any unconditionally.  That works too; g++.dg/tc1/dr52.C just needs
a trivial testsuite tweak:
   'C' is not an accessible base of 'X'
v.
   'C' is an inaccessible base of 'X'
We should probably unify those messages...


Hmm, won't using ba_any unconditionally break ambiguous base checking 
for non-static data members?



@@ -3493,9 +3493,24 @@ finish_class_member_access_expr (cp_expr object, tree 
name, bool template_p,
  return error_mark_node;
}
  
+	  /* NAME may refer to a static data member, in which case there is

+one copy of the data member that is shared by all the objects of
+the class.  So NAME can be unambiguously referred to even if
+there are multiple indirect base classes containing NAME.  */
+ const base_access ba = [scope, name] ()


Why a lambda?


+   {
+ if (identifier_p (name))
+   {
+ tree m = lookup_member (scope, name, /*protect=*/0,
+ /*want_type=*/false, tf_none);
+ if (!m || VAR_P (m))


Do you want shared_member_p here?

Jason



[PATCH v2] c++: P2280R4, Using unknown refs in constant expr [PR106650]

2023-11-29 Thread Marek Polacek
On Mon, Nov 20, 2023 at 04:29:33PM -0500, Jason Merrill wrote:
> On 11/17/23 16:46, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > This patch is an attempt to implement (part of?) P2280, Using unknown
> > pointers and references in constant expressions.  (Note that R4 seems to
> > only allow References to unknown/Accesses via this, but not Pointers to
> > unknown.)
> 
> Indeed.  That seems a bit arbitrary to me, but there it is.
> 
> We were rejecting the testcase before because cxx_bind_parameters_in_call
> was trying to perform an lvalue->rvalue conversion on the reference itself;
> this isn't really a thing in the language, but worked to implement the
> reference bullet that the paper removes.  Your approach to fixing that makes
> sense to me.
> 
> We should do the same for VAR_DECL references, e.g.
> 
> extern int ()[42];
> constexpr int i = array_size (r);

Argh, right.
 
> You also need to allow (implicit or explicit) use of 'this', as in:
> 
> struct A
> {
>   constexpr int f() { return 42; }
>   void g() { constexpr int i = f(); }
> };

Ah, I thought that already worked, but not so.  Apology apology.

> > This patch works to the extent that the test case added in [expr.const]
> > works as expected, as well as the test in
> > 
> > 
> > Most importantly, the proposal makes this compile:
> > 
> >template <typename T, size_t N>
> >constexpr auto array_size(T (&)[N]) -> size_t {
> >return N;
> >}
> > 
> >void check(int const (&param)[3]) {
> >constexpr auto s = array_size(param);
> >static_assert (s == 3);
> >}
> > 
> > and I think it would be a pity not to have it in GCC 14.
> > 
> > What still doesn't work (and I don't know if it should) is the test in $3.2:
> > 
> >struct A2 { constexpr int f() { return 0; } };
> >struct B2 : virtual A2 {};
> >void f2(B2 &b) { constexpr int k = b.f(); }
> > 
> > where we say
> > error: '* & b' is not a constant expression
> 
> It seems like that is supposed to work, the problem is accessing the vtable
> to perform the conversion.  I have WIP to recognize that conversion better
> in order to fix PR53288; this testcase can wait for that fix.

Great.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch is an attempt to implement (part of?) P2280, Using unknown
pointers and references in constant expressions.  (Note that R4 seems to
only allow References to unknown/Accesses via this, but not Pointers to
unknown.)

This patch works to the extent that the test case added in [expr.const]
works as expected, as well as the test in


Most importantly, the proposal makes this compile:

  template <typename T, size_t N>
  constexpr auto array_size(T (&)[N]) -> size_t {
  return N;
  }

  void check(int const (&param)[3]) {
  constexpr auto s = array_size(param);
  static_assert (s == 3);
  }

and I think it would be a pity not to have it in GCC 14.

What still doesn't work is the test in $3.2:

  struct A2 { constexpr int f() { return 0; } };
  struct B2 : virtual A2 {};
  void f2(B2 &b) { constexpr int k = b.f(); }

where we say
error: '* & b' is not a constant expression

This will be fixed in the future.

PR c++/106650

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression) : Allow
reference to unknown/this as per P2280.
: Allow reference to unknown as per P2280.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-array-ptr6.C: Remove dg-error.
* g++.dg/cpp0x/constexpr-ref12.C: Likewise.
* g++.dg/cpp0x/constexpr-ref2.C: Adjust dg-error.
* g++.dg/cpp0x/noexcept34.C: Remove dg-error.
* g++.dg/cpp1y/lambda-generic-const10.C: Likewise.
* g++.dg/cpp0x/constexpr-ref13.C: New test.
* g++.dg/cpp1z/constexpr-ref1.C: New test.
* g++.dg/cpp1z/constexpr-ref2.C: New test.
* g++.dg/cpp2a/constexpr-ref1.C: New test.
---
 gcc/cp/constexpr.cc   |  8 ++-
 .../g++.dg/cpp0x/constexpr-array-ptr6.C   |  2 +-
 gcc/testsuite/g++.dg/cpp0x/constexpr-ref12.C  |  4 +-
 gcc/testsuite/g++.dg/cpp0x/constexpr-ref13.C  | 41 ++
 gcc/testsuite/g++.dg/cpp0x/constexpr-ref2.C   |  4 +-
 gcc/testsuite/g++.dg/cpp0x/noexcept34.C   |  8 +--
 .../g++.dg/cpp1y/lambda-generic-const10.C |  2 +-
 gcc/testsuite/g++.dg/cpp1z/constexpr-ref1.C   | 26 +
 gcc/testsuite/g++.dg/cpp1z/constexpr-ref2.C   | 23 
 gcc/testsuite/g++.dg/cpp2a/constexpr-ref1.C   | 54 +++
 10 files changed, 161 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-ref13.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-ref1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-ref2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-ref1.C


Re: [PATCH] c++, v4: Implement C++26 P2169R4 - Placeholder variables with no name [PR110349]

2023-11-29 Thread Jason Merrill

On 11/29/23 13:01, Jakub Jelinek wrote:

On Tue, Nov 28, 2023 at 11:27:55AM -0500, Jason Merrill wrote:

On 11/24/23 03:34, Jakub Jelinek wrote:

On Mon, Sep 18, 2023 at 07:12:40PM +0200, Jakub Jelinek via Gcc-patches wrote:

On Tue, Aug 22, 2023 at 09:39:11AM +0200, Jakub Jelinek via Gcc-patches wrote:

The following patch implements the C++26 P2169R4 paper.
As written in the PR, the patch expects that:
1) https://eel.is/c++draft/expr.prim.lambda.capture#2
 "Ignoring appearances in initializers of init-captures, an identifier
  or this shall not appear more than once in a lambda-capture."
 is adjusted such that name-independent lambda captures with initializers
 can violate this rule (but lambda captures which aren't name-independent
 can't appear after name-independent ones)
2) https://eel.is/c++draft/class.mem#general-5
 "A member shall not be declared twice in the member-specification,
  except that"
 having an exception that name-independent non-static data member
 declarations can appear multiple times (but again, if there is
 a member which isn't name-independent, it can't appear after
 name-independent ones)
3) it assumes that any name-independent declarations which weren't
 previously valid result in the _ lookups being ambiguous, not just
 if there are 2 _ declarations in the same scope, in particular the
 https://eel.is/c++draft/basic.scope#block-2 mentioned cases
4) it assumes that _ in static function/block scope structured bindings
 is never name-independent like in namespace scope structured bindings;
 it matches clang behavior and is consistent with e.g. static type _;
 not being name-independent both at namespace scope and at function/block
 scope



2023-11-23  Jakub Jelinek  

PR c++/110349
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__cpp_placeholder_variables=202306L for C++26.
gcc/cp/
* cp-tree.h: Implement C++26 P2169R4 - Placeholder variables with no
name.
(OVL_PLACEHOLDER_P): Define.


Throughout this patch let's follow the standard and avoid using the word
"placeholder" to refer to name-independent decls.  There are a lot of other
things that could be meant by "placeholder", notably "auto".


Ok, changed it almost everywhere, except for the predefined macro which
the C++ standard defines that way and the name of the P2169R4 paper in
ChangeLog and test comments because the paper is called like that.

+ else
+   {


Please add a comment here with your rationale from the commit message.


Done.


+ binding->value = nreverse (binding->value);
+ /* Skip over TREE_LISTs added for check_local_shadow detected


This should mention pushdecl, since that's where they're actually added
(which, as mentioned below, I don't understand why).


Done.


- check_local_shadow (decl);
+ tree local_shadow = check_local_shadow (decl);
+ if (placeholder_p && local_shadow)
+   {
+ if (cxx_dialect < cxx26 && !placeholder_diagnosed_p)
+   pedwarn (DECL_SOURCE_LOCATION (decl), OPT_Wc__26_extensions,
+"placeholder variables only available with "
+"%<-std=c++2c%> or %<-std=gnu++2c%>");
+ placeholder_diagnosed_p = true;
+ if (old == NULL_TREE)
+   {
+ old = build_tree_list (error_mark_node, local_shadow);
+ TREE_TYPE (old) = error_mark_node;
+   }
+   }


This needs a rationale comment; I don't understand what the purpose is.


Added it, but in short it is the assumption 3) above (I don't know whether
it has been discussed, or whether I or you should file it as a DR).
I believe it would be very weird if, say, for
void foo (int x) { int x = 5; }
the standard says it must be rejected even when the 2 x decls don't
live in the same scope (ditto all other cases check_local_shadow deals
with), while
int foo (int _) { int _ = 5; return _; }
would be accepted and
int bar () { int _ = 4; int _ = 5; return _; }
not.


@@ -7579,7 +7800,30 @@ lookup_name (tree name, LOOK_where where
&& (bool (want & LOOK_want::HIDDEN_LAMBDA)
|| !is_lambda_ignored_entity (iter->value))
&& qualify_lookup (iter->value, want))
- binding = iter->value;
+ {
+   binding = iter->value;
+   if (binding
+   && TREE_CODE (binding) == TREE_LIST
+   && name_independent_decl_p (TREE_VALUE (binding)))
+ {
+   for (tree b = binding; b; b = TREE_CHAIN (b))
+ if (TREE_CHAIN (b) == NULL
+ && TREE_CODE (TREE_VALUE (b)) == OVERLOAD)
+   {
+ /* If the scope has an overload with _ function
+declarations followed by at least one

[PATCH] bpf: change ASM_COMMENT_START to '#'

2023-11-29 Thread David Faust
The BPF "pseudo-C" assembly dialect uses semi-colon (;) to separate
statements, not to begin line comments. The GNU assembler was recently
changed accordingly:

  https://sourceware.org/pipermail/binutils/2023-November/130867.html

This patch adapts the BPF backend in GCC to match, using a hash (#)
instead of a semi-colon (;) for ASM_COMMENT_START. This is already
supported in clang.

Tested on x86_64-linux-gnu host for bpf-unknown-none target.

gcc/
* config/bpf/bpf.h (ASM_COMMENT_START): Change from ';' to '#'.

gcc/testsuite/
* gcc.target/bpf/core-builtin-enumvalue-opt.c: Change dg-final
scans to not assume a specific comment character.
* gcc.target/bpf/core-builtin-enumvalue.c: Likewise.
* gcc.target/bpf/core-builtin-type-based.c: Likewise.
* gcc.target/bpf/core-builtin-type-id.c: Likewise.
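
As an illustration of why the relaxed scans no longer depend on the
comment character, here is a minimal sketch in C. It uses POSIX regex.h
rather than the Tcl regexps DejaGnu actually runs, so it only approximates
the real matching. The helper name line_matches and the two sample
assembler lines are assumptions made up for this example; only the two
patterns are taken from the testsuite changes in this patch.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if LINE matches the POSIX extended
   regular expression PATTERN, 0 otherwise.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);  /* The patterns used here are expected to compile.  */
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Old scan: hard-codes ';' as the comment character.  */
const char *const old_scan = "\t.4byte\t0xa\t; bpfcr_kind";

/* New scan: the value, a run of tabs or spaces, then anything on the
   same line up to bpfcr_kind.  */
const char *const new_scan = "0xa[\t ]+[^\n]*bpfcr_kind";

/* Made-up sample output before and after the ASM_COMMENT_START change.  */
const char *const line_semicolon = "\t.4byte\t0xa\t; bpfcr_kind";
const char *const line_hash = "\t.4byte\t0xa\t# bpfcr_kind";
```

Calling line_matches (new_scan, ...) on either sample line should succeed,
while old_scan should only match the line that uses ';'.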
---
 gcc/config/bpf/bpf.h |  2 +-
 .../gcc.target/bpf/core-builtin-enumvalue-opt.c  |  8 
 .../gcc.target/bpf/core-builtin-enumvalue.c  | 12 ++--
 .../gcc.target/bpf/core-builtin-type-based.c |  8 
 gcc/testsuite/gcc.target/bpf/core-builtin-type-id.c  |  6 +++---
 5 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
index 1f177ec4c4e..d175e99046c 100644
--- a/gcc/config/bpf/bpf.h
+++ b/gcc/config/bpf/bpf.h
@@ -393,7 +393,7 @@ enum reg_class
 
 /*** The Overall Framework of an Assembler File.  */
 
-#define ASM_COMMENT_START ";"
+#define ASM_COMMENT_START "#"
 
 /* Output to assembler file text saying following lines
may contain character constants, extra white space, comments, etc.  */
diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c
index c87e1a3ba3b..fc3c299fe9c 100644
--- a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c
+++ b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue-opt.c
@@ -26,10 +26,10 @@ unsigned long foo(void *data)
  return 0;
 }
 
-/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type \\(named_ue64\\)" 2 } } */
-/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type \\(named_se64\\)" 2} } */
-/* { dg-final { scan-assembler-times "\t.4byte\t0xa\t; bpfcr_kind" 2 } } BPF_ENUMVAL_EXISTS */
-/* { dg-final { scan-assembler-times "\t.4byte\t0xb\t; bpfcr_kind" 2 } } BPF_ENUMVAL_VALUE */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(named_ue64\\)" 2 } } */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(named_se64\\)" 2} } */
+/* { dg-final { scan-assembler-times "0xa\[\t \]+\[^\n\]*bpfcr_kind" 2 } } BPF_ENUMVAL_EXISTS */
+/* { dg-final { scan-assembler-times "0xb\[\t \]+\[^\n\]*bpfcr_kind" 2 } } BPF_ENUMVAL_VALUE */
 
 /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0\"\\)" 4 } } */
 
diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c
index 2f16903b8d6..23dfd8a10bf 100644
--- a/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c
+++ b/gcc/testsuite/gcc.target/bpf/core-builtin-enumvalue.c
@@ -40,12 +40,12 @@ int foo(void *data)
  return 0;
 }
 
-/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type \\(named_ue64\\)" 5 } } */
-/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type \\(named_se64\\)" 5} } */
-/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type \\(named_ue\\)" 5 } } */
-/* { dg-final { scan-assembler-times "\t.4byte\t0x\[0-9a-f\]+\t; bpfcr_type \\(named_se\\)" 5} } */
-/* { dg-final { scan-assembler-times "\t.4byte\t0xa\t; bpfcr_kind" 12 } } BPF_ENUMVAL_EXISTS */
-/* { dg-final { scan-assembler-times "\t.4byte\t0xb\t; bpfcr_kind" 8 } } BPF_ENUMVAL_VALUE */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(named_ue64\\)" 5 } } */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(named_se64\\)" 5} } */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(named_ue\\)" 5 } } */
+/* { dg-final { scan-assembler-times "bpfcr_type \\(named_se\\)" 5} } */
+/* { dg-final { scan-assembler-times "0xa\[\t \]+\[^\n\]*bpfcr_kind" 12 } } BPF_ENUMVAL_EXISTS */
+/* { dg-final { scan-assembler-times "0xb\[\t \]+\[^\n\]*bpfcr_kind" 8 } } BPF_ENUMVAL_VALUE */
 
 /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0\"\\)" 8 } } */
 /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"1\"\\)" 8 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c b/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
index 16b48ae0a00..74a8d5a14d9 100644
--- a/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
+++ b/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
@@ -52,7 +52,7 @@ int foo(void *data)
   return 0;
 }
 
-/* { dg-final { scan-assembler-times "\t.4byte\t0x0\t; bpfcr_type" 0 } } */
-/* { dg-final { scan-assembler-times "\t.4byte\t0x8\t; bpfcr_kind" 13 } } BPF_TYPE_EXISTS */

[PATCH] s390: Fix builtin-classify-type-1.c on s390 too [PR112725]

2023-11-29 Thread Jakub Jelinek
Hi!

Given that the s390 backend defines pretty much the same target hook
as rs6000, I believe it suffers from the same problem as rs6000 (at
least when using -mvx?), though admittedly this is so far completely
untested.

Ok for trunk if it passes bootstrap/regtest there?

2023-11-29  Jakub Jelinek  

PR target/112725
* config/s390/s390.cc (s390_invalid_arg_for_unprototyped_fn): Return
NULL for __builtin_classify_type calls with vector arguments.

--- gcc/config/s390/s390.cc.jj  2023-11-27 17:34:25.684287136 +0100
+++ gcc/config/s390/s390.cc 2023-11-29 09:41:08.569491077 +0100
@@ -12650,7 +12650,8 @@ s390_invalid_arg_for_unprototyped_fn (co
   && VECTOR_TYPE_P (TREE_TYPE (val))
   && (funcdecl == NULL_TREE
   || (TREE_CODE (funcdecl) == FUNCTION_DECL
-  && DECL_BUILT_IN_CLASS (funcdecl) != BUILT_IN_MD)))
+  && DECL_BUILT_IN_CLASS (funcdecl) != BUILT_IN_MD
+  && !fndecl_built_in_p (funcdecl, BUILT_IN_CLASSIFY_TYPE))))
  ? N_("vector argument passed to unprototyped function")
  : NULL);
 }

Jakub



[committed] rs6000: Fix up c-c++-common/builtin-classify-type-1.c failure [PR112725]

2023-11-29 Thread Jakub Jelinek
Hi!

The rs6000 backend (and s390 one as well) diagnoses passing vector types
to unprototyped functions, which breaks the builtin-classify-type-1.c test.
The builtin isn't really unprototyped; it is just type-generic, and accepting
vector types is fine there, since all it does is categorize the vector type.
The following patch makes sure we don't diagnose it for this builtin.
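
To make the "type-generic" point concrete, here is a small sketch in C of
the kind of call the test exercises. It is not taken from
builtin-classify-type-1.c; the typedef v4si and the function names are
assumptions for illustration. The value returned for a vector argument
depends on the compiler version, so the sketch only checks that it differs
from the scalar classes.

```c
#include <assert.h>

/* GNU C generic vector of four ints (a GCC/Clang extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Each wrapper classifies the type of its argument expression.  */
int classify_int (void) { return __builtin_classify_type (0); }
int classify_double (void) { return __builtin_classify_type (0.0); }

int
classify_v4si (void)
{
  v4si v = { 1, 2, 3, 4 };
  /* A vector argument: the builtin only categorizes the type, so no
     "unprototyped function" diagnostic is wanted here.  */
  return __builtin_classify_type (v);
}

/* Sanity check: the three kinds of argument land in distinct classes.  */
void
check_classification (void)
{
  assert (classify_int () != classify_double ());
  assert (classify_v4si () != classify_int ());
  assert (classify_v4si () != classify_double ());
}
```

On targets without the diagnostic this compiles cleanly either way; on
rs6000 and s390 the vector call is the case the patch is meant to allow.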

Preapproved in the PR, bootstrapped/regtested on powerpc64le-linux where it
fixes
-FAIL: c-c++-common/builtin-classify-type-1.c  -Wc++-compat  (test for excess errors)
-UNRESOLVED: c-c++-common/builtin-classify-type-1.c  -Wc++-compat  compilation failed to produce executable
and committed to trunk.

2023-11-29  Jakub Jelinek  

PR target/112725
* config/rs6000/rs6000.cc (invalid_arg_for_unprototyped_fn): Return
NULL for __builtin_classify_type calls with vector arguments.

--- gcc/config/rs6000/rs6000.cc.jj  2023-11-17 15:08:20.816961466 +0100
+++ gcc/config/rs6000/rs6000.cc 2023-11-29 09:40:35.782955603 +0100
@@ -24389,7 +24389,8 @@ invalid_arg_for_unprototyped_fn (const_t
  && VECTOR_TYPE_P (TREE_TYPE (val))
   && (funcdecl == NULL_TREE
   || (TREE_CODE (funcdecl) == FUNCTION_DECL
-  && DECL_BUILT_IN_CLASS (funcdecl) != BUILT_IN_MD)))
+  && DECL_BUILT_IN_CLASS (funcdecl) != BUILT_IN_MD
+  && !fndecl_built_in_p (funcdecl, BUILT_IN_CLASSIFY_TYPE))))
  ? N_("AltiVec argument passed to unprototyped function")
  : NULL;
 }

Jakub



[PATCH] c++, v4: Implement C++26 P2169R4 - Placeholder variables with no name [PR110349]

2023-11-29 Thread Jakub Jelinek
On Tue, Nov 28, 2023 at 11:27:55AM -0500, Jason Merrill wrote:
> On 11/24/23 03:34, Jakub Jelinek wrote:
> > On Mon, Sep 18, 2023 at 07:12:40PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > > On Tue, Aug 22, 2023 at 09:39:11AM +0200, Jakub Jelinek via Gcc-patches wrote:
> > > > The following patch implements the C++26 P2169R4 paper.
> > > > As written in the PR, the patch expects that:
> > > > 1) https://eel.is/c++draft/expr.prim.lambda.capture#2
> > > > "Ignoring appearances in initializers of init-captures, an identifier
> > > >  or this shall not appear more than once in a lambda-capture."
> > > > is adjusted such that name-independent lambda captures with initializers
> > > > can violate this rule (but lambda captures which aren't name-independent
> > > > can't appear after name-independent ones)
> > > > 2) https://eel.is/c++draft/class.mem#general-5
> > > > "A member shall not be declared twice in the member-specification,
> > > >  except that"
> > > > having an exception that name-independent non-static data member
> > > > declarations can appear multiple times (but again, if there is
> > > > a member which isn't name-independent, it can't appear after
> > > > name-independent ones)
> > > > 3) it assumes that any name-independent declarations which weren't
> > > > previously valid result in the _ lookups being ambiguous, not just
> > > > if there are 2 _ declarations in the same scope, in particular the
> > > > https://eel.is/c++draft/basic.scope#block-2 mentioned cases
> > > > 4) it assumes that _ in static function/block scope structured bindings
> > > > is never name-independent like in namespace scope structured bindings;
> > > > it matches clang behavior and is consistent with e.g. static type _;
> > > > not being name-independent both at namespace scope and at function/block
> > > > scope

> > 2023-11-23  Jakub Jelinek  
> > 
> > PR c++/110349
> > gcc/c-family/
> > * c-cppbuiltin.cc (c_cpp_builtins): Predefine
> > __cpp_placeholder_variables=202306L for C++26.
> > gcc/cp/
> > * cp-tree.h: Implement C++26 P2169R4 - Placeholder variables with no
> > name.
> > (OVL_PLACEHOLDER_P): Define.
> 
> Throughout this patch let's follow the standard and avoid using the word
> "placeholder" to refer to name-independent decls.  There are a lot of other
> things that could be meant by "placeholder", notably "auto".

Ok, changed it almost everywhere, except for the predefined macro, which
the C++ standard defines that way, and the name of the P2169R4 paper in
ChangeLog and test comments, because that is the paper's title.

> > + else
> > +   {
> 
> Please add a comment here with your rationale from the commit message.

Done.

> > + binding->value = nreverse (binding->value);
> > + /* Skip over TREE_LISTs added for check_local_shadow detected
> 
> This should mention pushdecl, since that's where they're actually added
> (which, as mentioned below, I don't understand why).

Done.

> > - check_local_shadow (decl);
> > + tree local_shadow = check_local_shadow (decl);
> > + if (placeholder_p && local_shadow)
> > +   {
> > + if (cxx_dialect < cxx26 && !placeholder_diagnosed_p)
> > +   pedwarn (DECL_SOURCE_LOCATION (decl), OPT_Wc__26_extensions,
> > +"placeholder variables only available with "
> > +"%<-std=c++2c%> or %<-std=gnu++2c%>");
> > + placeholder_diagnosed_p = true;
> > + if (old == NULL_TREE)
> > +   {
> > + old = build_tree_list (error_mark_node, local_shadow);
> > + TREE_TYPE (old) = error_mark_node;
> > +   }
> > +   }
> 
> This needs a rationale comment; I don't understand what the purpose is.

Added it, but in short it is the assumption 3) above (I don't know whether
it has been discussed, or whether I or you should file it as a DR).
I believe it would be very weird if, say, for
void foo (int x) { int x = 5; }
the standard says it must be rejected even when the 2 x decls don't
live in the same scope (ditto all other cases check_local_shadow deals
with), while
int foo (int _) { int _ = 5; return _; }
would be accepted and
int bar () { int _ = 4; int _ = 5; return _; }
not.

> > @@ -7579,7 +7800,30 @@ lookup_name (tree name, LOOK_where where
> > && (bool (want & LOOK_want::HIDDEN_LAMBDA)
> > || !is_lambda_ignored_entity (iter->value))
> > && qualify_lookup (iter->value, want))
> > - binding = iter->value;
> > + {
> > +   binding = iter->value;
> > +   if (binding
> > +   && TREE_CODE (binding) == TREE_LIST
> > +   && name_independent_decl_p (TREE_VALUE (binding)))
> > + {
> > +   for (tree b = binding; b; b = TREE_CHAIN (b))
> > + if (TREE_CHAIN (b) == NULL
> > +   

Re: [PATCH v2] AArch64: Add inline memmove expansion

2023-11-29 Thread Richard Sandiford
Wilco Dijkstra  writes:
> v2: further cleanups, improved comments
>
> Add support for inline memmove expansions.  The generated code is identical
> to that for memcpy, except that all loads are emitted before stores rather than
> being interleaved.  The maximum size is 256 bytes which requires at most 16
> registers.
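
The "all loads before stores" idea quoted above can be sketched in portable
C. This is only a model, not the RTL the patch emits: the function names,
the regs array standing in for Q registers, and the restriction to
multiples of 16 bytes are assumptions for illustration. With 16-byte
chunks, 256 bytes need 256 / 16 = 16 temporaries, which is where the
register count in the description comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_INLINE_BYTES = 256, CHUNK_BYTES = 16 };

/* Hypothetical model of the memmove expansion: load every chunk into a
   temporary first, then store them all.  Safe for overlapping buffers.  */
void
move_loads_first (void *dst, const void *src, size_t n)
{
  unsigned char regs[MAX_INLINE_BYTES / CHUNK_BYTES][CHUNK_BYTES];
  unsigned char *d = dst;
  const unsigned char *s = src;
  size_t chunks = n / CHUNK_BYTES;

  assert (n <= MAX_INLINE_BYTES && n % CHUNK_BYTES == 0);
  for (size_t i = 0; i < chunks; i++)  /* Phase 1: every load.  */
    memcpy (regs[i], s + i * CHUNK_BYTES, CHUNK_BYTES);
  for (size_t i = 0; i < chunks; i++)  /* Phase 2: every store.  */
    memcpy (d + i * CHUNK_BYTES, regs[i], CHUNK_BYTES);
}

/* For contrast, the memcpy-style order: each load is followed at once by
   its store, so a later load may read bytes an earlier store clobbered.  */
void
copy_interleaved (void *dst, const void *src, size_t n)
{
  unsigned char reg[CHUNK_BYTES];
  unsigned char *d = dst;
  const unsigned char *s = src;

  assert (n % CHUNK_BYTES == 0);
  for (size_t i = 0; i < n / CHUNK_BYTES; i++)
    {
      memcpy (reg, s + i * CHUNK_BYTES, CHUNK_BYTES);
      memcpy (d + i * CHUNK_BYTES, reg, CHUNK_BYTES);
    }
}
```

When dst overlaps the tail of src, only the loads-first variant is expected
to agree with the library memmove.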
>
> Passes regress/bootstrap, OK for commit?
>
> gcc/ChangeLog/
> * config/aarch64/aarch64.opt (aarch64_mops_memmove_size_threshold):
> Change default.
> * config/aarch64/aarch64.md (cpymemdi): Add a parameter.
> (movmemdi): Call aarch64_expand_cpymem.
> * config/aarch64/aarch64.cc (aarch64_copy_one_block): Rename function,
> simplify, support storing generated loads/stores.
> (aarch64_expand_cpymem): Support expansion of memmove.
> * config/aarch64/aarch64-protos.h (aarch64_expand_cpymem): Add bool arg.
>
> gcc/testsuite/ChangeLog/
> * gcc.target/aarch64/memmove.c: Add new test.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 60a55f4bc1956786ea687fc7cad7ec9e4a84e1f0..0d39622bd2826a3fde54d67b5c5da9ee9286cbbd 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -769,7 +769,7 @@ bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
>  tree aarch64_vector_load_decl (tree);
>  void aarch64_expand_call (rtx, rtx, rtx, bool);
>  bool aarch64_expand_cpymem_mops (rtx *, bool);
> -bool aarch64_expand_cpymem (rtx *);
> +bool aarch64_expand_cpymem (rtx *, bool);
>  bool aarch64_expand_setmem (rtx *);
>  bool aarch64_float_const_zero_rtx_p (rtx);
>  bool aarch64_float_const_rtx_p (rtx);
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 2fa5d09de85d385c1165e399bcc97681ef170916..e19e2d1de2e5b30eca672df05d9dcc1bc106ecc8 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -25238,52 +25238,37 @@ aarch64_progress_pointer (rtx pointer)
>return aarch64_move_pointer (pointer, GET_MODE_SIZE (GET_MODE (pointer)));
>  }
>
> -/* Copy one MODE sized block from SRC to DST, then progress SRC and DST by
> -   MODE bytes.  */
> +/* Copy one block of size MODE from SRC to DST at offset OFFSET.  */
>
>  static void
> -aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
> - machine_mode mode)
> +aarch64_copy_one_block (rtx *load, rtx *store, rtx src, rtx dst,
> +   int offset, machine_mode mode)
>  {
> -  /* Handle 256-bit memcpy separately.  We do this by making 2 adjacent memory
> - address copies using V4SImode so that we can use Q registers.  */
> -  if (known_eq (GET_MODE_BITSIZE (mode), 256))
> +  /* Emit explicit load/store pair instructions for 32-byte copies.  */
> +  if (known_eq (GET_MODE_SIZE (mode), 32))
>  {
>mode = V4SImode;
> +  rtx src1 = adjust_address (src, mode, offset);
> +  rtx src2 = adjust_address (src, mode, offset + 16);
> +  rtx dst1 = adjust_address (dst, mode, offset);
> +  rtx dst2 = adjust_address (dst, mode, offset + 16);
>rtx reg1 = gen_reg_rtx (mode);
>rtx reg2 = gen_reg_rtx (mode);
> -  /* "Cast" the pointers to the correct mode.  */
> -  *src = adjust_address (*src, mode, 0);
> -  *dst = adjust_address (*dst, mode, 0);
> -  /* Emit the memcpy.  */
> -  emit_insn (aarch64_gen_load_pair (mode, reg1, *src, reg2,
> -   aarch64_progress_pointer (*src)));
> -  emit_insn (aarch64_gen_store_pair (mode, *dst, reg1,
> -aarch64_progress_pointer (*dst), reg2));
> -  /* Move the pointers forward.  */
> -  *src = aarch64_move_pointer (*src, 32);
> -  *dst = aarch64_move_pointer (*dst, 32);
> +  *load = aarch64_gen_load_pair (mode, reg1, src1, reg2, src2);
> +  *store = aarch64_gen_store_pair (mode, dst1, reg1, dst2, reg2);
>return;
>  }
>
>rtx reg = gen_reg_rtx (mode);
> -
> -  /* "Cast" the pointers to the correct mode.  */
> -  *src = adjust_address (*src, mode, 0);
> -  *dst = adjust_address (*dst, mode, 0);
> -  /* Emit the memcpy.  */
> -  emit_move_insn (reg, *src);
> -  emit_move_insn (*dst, reg);
> -  /* Move the pointers forward.  */
> -  *src = aarch64_progress_pointer (*src);
> -  *dst = aarch64_progress_pointer (*dst);
> +  *load = gen_move_insn (reg, adjust_address (src, mode, offset));
> +  *store = gen_move_insn (adjust_address (dst, mode, offset), reg);
>  }
>
>  /* Expand a cpymem/movmem using the MOPS extension.  OPERANDS are taken
> from the cpymem/movmem pattern.  IS_MEMMOVE is true if this is a memmove
> rather than memcpy.  Return true iff we succeeded.  */
>  bool
> -aarch64_expand_cpymem_mops (rtx *operands, bool is_memmove = false)
> +aarch64_expand_cpymem_mops (rtx *operands, bool is_memmove)
>  {
>if (!TARGET_MOPS)
>  return false;
> @@ 

Re: [PATCH v2] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-11-29 Thread Richard Sandiford
Wilco Dijkstra  writes:
> v2: Use UINTVAL, rename max_mops_size.
>
> The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
> Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
> Clean up the condition for when to use MOPS.
>
> Passes regress/bootstrap, OK for commit?
>
> gcc/ChangeLog/
> PR target/103100
> * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.
> (setmemdi): Likewise.
> * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
> strict-align.  Cleanup condition for using MOPS.
> (aarch64_expand_setmem): Likewise.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index dd6874d13a75f20d10a244578afc355b25c73da2..8a12894d6b80de1031d6e7d02dca680c57bce136 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -25261,27 +25261,23 @@ aarch64_expand_cpymem (rtx *operands)
>int mode_bits;
>rtx dst = operands[0];
>rtx src = operands[1];
> +  unsigned align = UINTVAL (operands[3]);
>rtx base;
>machine_mode cur_mode = BLKmode;
> +  bool size_p = optimize_function_for_size_p (cfun);
>
> -  /* Variable-sized memcpy can go through the MOPS expansion if available.  */
> -  if (!CONST_INT_P (operands[2]))
> +  /* Variable-sized or strict-align copies may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_cpymem_mops (operands);
>
> -  unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
> -
> -  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  */
> -  unsigned HOST_WIDE_INT max_copy_size
> -= TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
> +  unsigned HOST_WIDE_INT size = UINTVAL (operands[2]);
>
> -  bool size_p = optimize_function_for_size_p (cfun);
> +  /* Try to inline up to 256 bytes.  */
> +  unsigned max_copy_size = 256;
> +  unsigned mops_threshold = aarch64_mops_memcpy_size_threshold;
>
> -  /* Large constant-sized cpymem should go through MOPS when possible.
> - It should be a win even for size optimization in the general case.
> - For speed optimization the choice between MOPS and the SIMD sequence
> - depends on the size of the copy, rather than number of instructions,
> - alignment etc.  */
> -  if (size > max_copy_size)
> +  /* Large copies use MOPS when available or a library call.  */
> +  if (size > max_copy_size || (TARGET_MOPS && size > mops_threshold))
>  return aarch64_expand_cpymem_mops (operands);

It feels a little unintuitive to be calling aarch64_expand_cpymem_mops
for !TARGET_MOPS, but that's pre-existing, and I can see there are
arguments both ways.
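
For readers following along, the two early exits in the quoted
aarch64_expand_cpymem hunk can be modeled outside GCC roughly as below.
This is a sketch under assumptions: struct copy_query, the enum, and
choose_copy_strategy are invented names, and the rest of the real function
(such as the size-optimization checks) is left out. The "MOPS or libcall"
outcome reflects that aarch64_expand_cpymem_mops returns false when
TARGET_MOPS is not set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of the early checks.  */
enum copy_strategy
{
  TRY_MOPS_ELSE_LIBCALL,   /* return aarch64_expand_cpymem_mops (...)  */
  CONTINUE_INLINE          /* fall through to the inline load/store path  */
};

/* Hypothetical bundle of the inputs the quoted hunk looks at.  */
struct copy_query
{
  bool size_is_constant;    /* CONST_INT_P (operands[2])  */
  unsigned long size;       /* UINTVAL (operands[2])  */
  unsigned align;           /* UINTVAL (operands[3])  */
  bool strict_alignment;    /* STRICT_ALIGNMENT  */
  bool target_mops;         /* TARGET_MOPS  */
  unsigned mops_threshold;  /* aarch64_mops_memcpy_size_threshold  */
};

enum copy_strategy
choose_copy_strategy (const struct copy_query *q)
{
  const unsigned max_copy_size = 256;

  assert (q);
  /* Variable-sized or under-aligned strict-align copies.  */
  if (!q->size_is_constant || (q->strict_alignment && q->align < 16))
    return TRY_MOPS_ELSE_LIBCALL;
  /* Large copies, or copies above the MOPS threshold when MOPS exists.  */
  if (q->size > max_copy_size
      || (q->target_mops && q->size > q->mops_threshold))
    return TRY_MOPS_ELSE_LIBCALL;
  return CONTINUE_INLINE;
}
```

In this model a 64-byte, 16-aligned constant copy continues to the inline
path, while the same copy with alignment 8 under STRICT_ALIGNMENT does not.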

Although !TARGET_SIMD is a niche interest on current trunk, it becomes
important for streaming-compatible mode.  So we might want to look
again at the different handling of !TARGET_SIMD in this function (where
we lower the copy size but not the threshold) and aarch64_expand_setmem
(where we bail out early).  That's not something for this patch though,
just mentioning it.

The patch is OK with me, but please give Richard E a day to object.

Thanks,
Richard

>
>int copy_bits = 256;
> @@ -25445,12 +25441,13 @@ aarch64_expand_setmem (rtx *operands)
>unsigned HOST_WIDE_INT len;
>rtx dst = operands[0];
>rtx val = operands[2], src;
> +  unsigned align = UINTVAL (operands[3]);
>rtx base;
>machine_mode cur_mode = BLKmode, next_mode;
>
> -  /* If we don't have SIMD registers or the size is variable use the MOPS
> - inlined sequence if possible.  */
> -  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD)
> +  /* Variable-sized or strict-align memset may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD
> +  || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_setmem_mops (operands);
>
>bool size_p = optimize_function_for_size_p (cfun);
> @@ -25458,10 +25455,13 @@ aarch64_expand_setmem (rtx *operands)
>/* Default the maximum to 256-bytes when considering only libcall vs
>   SIMD broadcast sequence.  */
>unsigned max_set_size = 256;
> +  unsigned mops_threshold = aarch64_mops_memset_size_threshold;
>
> -  len = INTVAL (operands[1]);
> -  if (len > max_set_size && !TARGET_MOPS)
> -return false;
> +  len = UINTVAL (operands[1]);
> +
> +  /* Large memset uses MOPS when available or a library call.  */
> +  if (len > max_set_size || (TARGET_MOPS && len > mops_threshold))
> +return aarch64_expand_setmem_mops (operands);
>
>int cst_val = !!(CONST_INT_P (val) && (INTVAL (val) != 0));
>/* The MOPS sequence takes:
> @@ -25474,12 +25474,6 @@ aarch64_expand_setmem (rtx *operands)
>   the arguments + 1 for the call.  */
>unsigned libcall_cost = 4;
>
> -  /* Upper bound check.  For large constant-sized setmem use the MOPS sequence
> - when available.  */
> -  if (TARGET_MOPS
> -  && len >= 

Re: [PATCH] testsuite: scev: expect fail on ilp32

2023-11-29 Thread Hans-Peter Nilsson
> From: Rainer Orth 
> Date: Tue, 28 Nov 2023 16:13:35 +0100

> Richard Biener  writes:
> 
> > On Sun, 19 Nov 2023, Jeff Law wrote:
> >
> >> 
> >> 
> >> On 11/19/23 00:30, Alexandre Oliva wrote:
> >> > 
> >> > I've recently patched scev-3.c and scev-5.c because it only passed by
> >> > accident on ia32.  It also fails on some (but not all) arm-eabi
> >> > variants.  It seems hard to characterize the conditions in which the
> >> > optimization is supposed to pass, but expecting them to fail on ilp32
> >> > targets, though probably a little excessive and possibly noisy, is not
> >> > quite as alarming as getting a fail in test reports, so I propose
> >> > changing the xfail marker from ia32 to ilp32.
> >> > 
> >> > I'm also proposing to add a similar marker to scev-4.c.  Though it
> >> > doesn't appear to be failing for me, I've got reports that suggest it
> >> > still does for others, and it certainly did for us as well.
> >> > 
> >> > Regstrapped on x86_64-linux-gnu, also tested on arm-eabi with default
> >> > cpu on trunk, and with tms570 on gcc-13.  Ok to install?
> >> > 
> >> > 
> >> > for  gcc/testsuite/ChangeLog
> >> > 
> >> >  * gcc.dg/tree-ssa/scev-3.c: xfail on all ilp32 targets,
> >> >  though some of these do pass.
> >> >  * gcc.dg/tree-ssa/scev-4.c: Likewise.
> >> >  * gcc.dg/tree-ssa/scev-5.c: Likewise.
> >> OK.  Though hopefully someone will figure out what properties actually cause
> >> the differences so that we can do the right thing without the noisy XPASS at
> >> some point.
> >
> > The tests all test IVOPTs induction variable selecting results
> > (assuming every target would come to the "obvious" conclusion),
> > so it's probably not only target but also sub-target (aka -mtune)
> > sensitive ...
> >
> > In the end we might need to move/duplicate the test to some
> > gcc.target/* dir and restrict it to a specific tuning.
> 
> FWIW, since Alexandre's patch all three tests XPASS on 32-bit
> Solaris/SPARC:
> 
> XPASS: gcc.dg/tree-ssa/scev-3.c scan-tree-dump-times ivopts "" 1
> XPASS: gcc.dg/tree-ssa/scev-4.c scan-tree-dump-times ivopts "" 1
> XPASS: gcc.dg/tree-ssa/scev-5.c scan-tree-dump-times ivopts "" 1

It XPASSes on the ilp32 targets I've tried - except "ia32"
(as in i686-elf) and h8300-elf.  Notably, the XPASSing targets
include a *default* configuration of arm-eabi, which in
part contradicts your observation above.  I see it even
XPASSes in H.J.'s x86_64-pc-linux-gnu -mx32 results.  Right,
that's not ia32, but it's as ilp32ish as ia32 and can be
expected to share most "interesting" properties with ia32.
Example report at
https://gcc.gnu.org/pipermail/gcc-testresults/2023-November/801862.html.

Alex, can you share the presumably plural set of targets
where you found gcc.dg/tree-ssa/scev-[3-5].c to fail before
your patch, besides "ia32"?

I see them XPASS for:
m68k-unknown-linux-gnu
(https://gcc.gnu.org/pipermail/gcc-testresults/2023-November/801839.html)
pru-unknown-elf
(https://gcc.gnu.org/pipermail/gcc-testresults/2023-November/801732.html)

and from my own testing, at r14-5608-g69741355e6dbcf:
cris-elf, c6x-elf, epiphany-elf, ft32-elf,
hppa-unknown-linux-gnu, lm32-elf, microblaze-linux,
m32r-elf, arm-eabi.

So, ilp32 is IMO a really bad approximation for the elusive
property.

Would you please consider changing those "ilp32" to a
specific set of targets where these tests failed?

I'd prefer not to complicate those expressions by adding the
right spelling of "ilp32 except { list }".

brgds, H-P


Re: [PATCH] Add C intrinsics for scalar crypto extension

2023-11-29 Thread Christoph Müllner
On Wed, Nov 29, 2023 at 5:49 PM Liao Shihua  wrote:
>
>
> On 2023/11/29 23:03, Christoph Müllner wrote:
>
> On Mon, Nov 27, 2023 at 9:36 AM Liao Shihua  wrote:
>
> This patch adds C intrinsics for the scalar crypto extension.
> Because riscv-c-api (https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44/files)
> includes the zbkb/zbkc/zbkx intrinsics in the bit manipulation extension,
> this patch only supports the zkn*/zks* intrinsics.
>
> Thanks for working on this!
> Looking forward to seeing the second patch (covering bitmanip) soon as well!
> A couple of comments can be found below.
>
>
> Thanks for your comments, Christoph. Typos will be corrected in the next patch.
>
> The implementation of the intrinsics is based on the implementation in LLVM
> (it does look a little strange).
>
> I will unify the implementation method in the next patch.
>
>
>
> gcc/ChangeLog:
>
> * config.gcc: Add riscv_crypto.h
> * config/riscv/riscv_crypto.h: New file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zknd32.c: Use intrinsics instead of builtins.
> * gcc.target/riscv/zknd64.c: Likewise.
> * gcc.target/riscv/zkne32.c: Likewise.
> * gcc.target/riscv/zkne64.c: Likewise.
> * gcc.target/riscv/zknh-sha256-32.c: Likewise.
> * gcc.target/riscv/zknh-sha256-64.c: Likewise.
> * gcc.target/riscv/zknh-sha512-32.c: Likewise.
> * gcc.target/riscv/zknh-sha512-64.c: Likewise.
> * gcc.target/riscv/zksed32.c: Likewise.
> * gcc.target/riscv/zksed64.c: Likewise.
> * gcc.target/riscv/zksh32.c: Likewise.
> * gcc.target/riscv/zksh64.c: Likewise.
>
> ---
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/riscv_crypto.h   | 219 ++
>  gcc/testsuite/gcc.target/riscv/zknd32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zknd64.c   |  12 +-
>  gcc/testsuite/gcc.target/riscv/zkne32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zkne64.c   |  10 +-
>  .../gcc.target/riscv/zknh-sha256-32.c |  22 +-
>  .../gcc.target/riscv/zknh-sha256-64.c |  10 +-
>  .../gcc.target/riscv/zknh-sha512-32.c |  14 +-
>  .../gcc.target/riscv/zknh-sha512-64.c |  10 +-
>  gcc/testsuite/gcc.target/riscv/zksed32.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksed64.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh64.c   |   6 +-
>  14 files changed, 288 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/config/riscv/riscv_crypto.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b88591b6fd8..d67fe8b6a6f 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -548,7 +548,7 @@ riscv*)
> extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
> extra_objs="${extra_objs} thead.o riscv-target-attr.o"
> d_target_objs="riscv-d.o"
> -   extra_headers="riscv_vector.h"
> +   extra_headers="riscv_vector.h riscv_crypto.h"
> target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins.cc"
> target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins.h"
> ;;
> diff --git a/gcc/config/riscv/riscv_crypto.h b/gcc/config/riscv/riscv_crypto.h
> new file mode 100644
> index 000..149c1132e10
> --- /dev/null
> +++ b/gcc/config/riscv/riscv_crypto.h
> @@ -0,0 +1,219 @@
> +/* RISC-V 'K' Extension intrinsics include file.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published
> +   by the Free Software Foundation; either version 3, or (at your
> +   option) any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +   License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef __RISCV_CRYPTO_H
> +#define __RISCV_CRYPTO_H
> +
> +#include 
> +
> +#if defined (__cplusplus)
> +extern "C" {
> +#endif
> +
> +#if defined(__riscv_zknd)
> +#if __riscv_xlen == 32
> +#define __riscv_aes32dsi(x, y, bs) __builtin_riscv_aes32dsi(x, y, bs)
> +#define __riscv_aes32dsmi(x, y, bs) __builtin_riscv_aes32dsmi(x, y, bs)
> +#endif
> +
> +#if __riscv_xlen == 

Re: [PATCH v2 4/5] Add support for target_version attribute

2023-11-29 Thread Richard Sandiford
Andrew Carlotti  writes:
> This patch adds support for the "target_version" attribute to the middle
> end and the C++ frontend, which will be used to implement function
> multiversioning in the aarch64 backend.
>
> On targets that don't use the "target" attribute for multiversioning,
> there is no conflict between the "target" and "target_clones"
> attributes.  This patch therefore makes the mutual exclusion in
> C-family, D and Ada conditional upon the value of the
> expanded_clones_attribute target hook.
>
> The "target_version" attribute is only added to C++ in this patch,
> because this is currently the only frontend which supports
> multiversioning using the "target" attribute.  Support for the
> "target_version" attribute will be extended to C at a later date.
>
> Targets that currently use the "target" attribute for function
> multiversioning (i.e. i386 and rs6000) are not affected by this patch.
>
> Ok for master?
>
> gcc/ChangeLog:
>
>   * attribs.cc (decl_attributes): Pass attribute name to target.
>   (is_function_default_version): Update comment to specify
>   incompatibility with target_version attributes.
>   * cgraphclones.cc (cgraph_node::create_version_clone_with_body):
>   Call valid_version_attribute_p for target_version attributes.
>   * target.def (valid_version_attribute_p): New hook.
>   (expanded_clones_attribute): New hook.
>   * doc/tm.texi.in: Add new hooks.
>   * doc/tm.texi: Regenerate.
>   * multiple_target.cc (create_dispatcher_calls): Remove redundant
>   is_function_default_version check.
>   (expand_target_clones): Use target hook for attribute name.
>   * targhooks.cc (default_target_option_valid_version_attribute_p):
>   New.
>   * targhooks.h (default_target_option_valid_version_attribute_p):
>   New.
>   * tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
>   target_version attributes.
>
> gcc/c-family/ChangeLog:
>
>   * c-attribs.cc (CLONES_USES_TARGET): New macro.
>   (attr_target_exclusions): Use new macro.
>   (attr_target_clones_exclusions): Ditto, and add target_version.
>   (attr_target_version_exclusions): New.
>   (c_common_attribute_table): Add target_version.
>   (handle_target_version_attribute): New.
>
> gcc/ada/ChangeLog:
>
>   * gcc-interface/utils.cc (CLONES_USES_TARGET): New macro.
>   (attr_target_exclusions): Use new macro.
>   (attr_target_clones_exclusions): Ditto.
>
> gcc/d/ChangeLog:
>
>   * d-attribs.cc (CLONES_USES_TARGET): New macro.
>   (attr_target_exclusions): Use new macro.
>   (attr_target_clones_exclusions): Ditto.
>
> gcc/cp/ChangeLog:
>
>   * decl2.cc (check_classfn): Update comment to include
>   target_version attributes.
>
>
> diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
> index 
> e33a63948cebdeafc3abcdd539a35141969ad978..8850943cb3326568b4679a73405f50487aa1b7c6
>  100644
> --- a/gcc/ada/gcc-interface/utils.cc
> +++ b/gcc/ada/gcc-interface/utils.cc
> @@ -143,16 +143,21 @@ static const struct attribute_spec::exclusions 
> attr_noinline_exclusions[] =
>{ NULL, false, false, false },
>  };
>  
> +#define CLONES_USES_TARGET \
> +  (strcmp (targetm.target_option.expanded_clones_attribute, \
> +"target") == 0)
> +

Sorry for the slower review on this part.  I was hoping inspiration
would strike for a way to resolve this, but it hasn't, so:

The codebase usually avoids static variables that need dynamic
initialisation.  So although macros are not the preferred way of
doing things, I think one is probably appropriate here.  How about:

  TARGET_HAS_FMV_TARGET_ATTRIBUTE

with the default being true, and with AArch64 defining it to false?

This would replace the expanded_clones_attribute hook, with:

  const char *new_attr_name = targetm.target_option.expanded_clones_attribute;

becoming:

  const char *new_attr_name = (TARGET_HAS_FMV_TARGET_ATTRIBUTE
   ? "target" : "target_version");

I realise this is anything but elegant, but I think it's probably
the least worst option, given where we are.
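
To make the suggestion concrete, here is a minimal standalone sketch of the
macro-based approach.  It is not GCC code: the exclusion struct and the
clones_attr_name helper are hypothetical stand-ins for
attribute_spec::exclusions and the lookup in expand_target_clones.  Only the
macro name and the two attribute strings come from the discussion above.

```cpp
#include <cstring>

// Default: "target" is the attribute used for function multiversioning.
// A port such as AArch64 would define this to false in its target headers.
#ifndef TARGET_HAS_FMV_TARGET_ATTRIBUTE
#define TARGET_HAS_FMV_TARGET_ATTRIBUTE true
#endif

// Hypothetical stand-in for attribute_spec::exclusions.
struct exclusion
{
  const char *name;
  bool function;
  bool variable;
  bool type;
};

// The macro is a compile-time constant, so this table is constant
// initialized and needs no dynamic initialization.
static const exclusion attr_target_exclusions[] =
{
  { "target_clones", TARGET_HAS_FMV_TARGET_ATTRIBUTE,
    TARGET_HAS_FMV_TARGET_ATTRIBUTE, TARGET_HAS_FMV_TARGET_ATTRIBUTE },
  { nullptr, false, false, false },
};

// Hypothetical helper mirroring the proposed new_attr_name expression.
static const char *
clones_attr_name ()
{
  return TARGET_HAS_FMV_TARGET_ATTRIBUTE ? "target" : "target_version";
}
```

With the default definition the exclusion stays in force and the expanded
clones use "target"; a port that defines the macro to false would get
"target_version" and no mutual exclusion.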

>  static const struct attribute_spec::exclusions attr_target_exclusions[] =
>  {
> -  { "target_clones", true, true, true },
> +  { "target_clones", CLONES_USES_TARGET, CLONES_USES_TARGET,
> +CLONES_USES_TARGET },
>{ NULL, false, false, false },
>  };
>  
>  static const struct attribute_spec::exclusions 
> attr_target_clones_exclusions[] =
>  {
>{ "always_inline", true, true, true },
> -  { "target", true, true, true },
> +  { "target", CLONES_USES_TARGET, CLONES_USES_TARGET, CLONES_USES_TARGET },
>{ NULL, false, false, false },
>  };
>  
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index 
> f9fd258598914ce2112ecaaeaad6c63cd69a44e2..27533023ef5c481ba085c2f0c605dfb992987b3e
>  100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -657,7 +657,8 @@ decl_attributes (tree *node, tree attributes, int flags,
>   options 

Re: [PATCH] c++: wrong ambiguity in accessing static field [PR112744]

2023-11-29 Thread Marek Polacek
On Wed, Nov 29, 2023 at 12:23:46PM -0500, Patrick Palka wrote:
> On Wed, 29 Nov 2023, Marek Polacek wrote:
> 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > Now that I'm posting this patch, I think you'll probably want me to use
> > ba_any unconditionally.  That works too; g++.dg/tc1/dr52.C just needs
> > a trivial testsuite tweak:
> >   'C' is not an accessible base of 'X'
> > v.
> >   'C' is an inaccessible base of 'X'
> > We should probably unify those messages...
> > 
> > -- >8 --
> > Given
> > 
> >   struct A { constexpr static int a = 0; };
> >   struct B : A {};
> >   struct C : A {};
> >   struct D : B, C {};
> > 
> > we give the "'A' is an ambiguous base of 'D'" error for
> > 
> >   D{}.A::a;
> > 
> > which seems wrong: 'a' is a static data member so there is only one copy
> > so it can be unambiguously referred to even if there are multiple A
> > objects.  clang++/MSVC/icx agree.
> > 
> > PR c++/112744
> > 
> > gcc/cp/ChangeLog:
> > 
> > * typeck.cc (finish_class_member_access_expr): When accessing
> > a static data member, use ba_any for lookup_base.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/lookup/scoped11.C: New test.
> > * g++.dg/lookup/scoped12.C: New test.
> > * g++.dg/lookup/scoped13.C: New test.
> > ---
> >  gcc/cp/typeck.cc   | 21 ++---
> >  gcc/testsuite/g++.dg/lookup/scoped11.C | 14 ++
> >  gcc/testsuite/g++.dg/lookup/scoped12.C | 14 ++
> >  gcc/testsuite/g++.dg/lookup/scoped13.C | 14 ++
> >  4 files changed, 60 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped11.C
> >  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped12.C
> >  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped13.C
> > 
> > diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
> > index e995fb6ddd7..c4de8bb2616 100644
> > --- a/gcc/cp/typeck.cc
> > +++ b/gcc/cp/typeck.cc
> > @@ -3476,7 +3476,7 @@ finish_class_member_access_expr (cp_expr object, tree 
> > name, bool template_p,
> >name, scope);
> >   return error_mark_node;
> > }
> > - 
> > +
> >   if (TREE_SIDE_EFFECTS (object))
> > val = build2 (COMPOUND_EXPR, TREE_TYPE (val), object, val);
> >   return val;
> > @@ -3493,9 +3493,24 @@ finish_class_member_access_expr (cp_expr object, 
> > tree name, bool template_p,
> >   return error_mark_node;
> > }
> >  
> > + /* NAME may refer to a static data member, in which case there is
> > +one copy of the data member that is shared by all the objects of
> > +the class.  So NAME can be unambiguously referred to even if
> > +there are multiple indirect base classes containing NAME.  */
> > + const base_access ba = [scope, name] ()
> > +   {
> > + if (identifier_p (name))
> > +   {
> > + tree m = lookup_member (scope, name, /*protect=*/0,
> > + /*want_type=*/false, tf_none);
> > + if (!m || VAR_P (m))
> > +   return ba_any;
> 
> I wonder if we want to return ba_check_bit instead of ba_any so that we
> still check access of the selected base?

That would certainly make sense to me.  I didn't do that because
I'd not seen ba_check_bit being used except as part of ba_check,
but that may not mean much.

So either I can tweak the lambda to return ba_check_bit rather
than ba_any or use ba_check_bit unconditionally.  Any opinions on that?

>   struct A { constexpr static int a = 0; };
>   struct D : private A {};
> 
>   void f() {
>     D{}.A::a; // #1 GCC (and Clang) currently rejects
>   }
> 
>   template<class T>
>   void g() {
>     D{}.T::a; // #2 GCC currently rejects, Clang accepts?!
>   }
> 
>   template void g<A>();

Thanks for looking at the patch and the testcase.  I'll add it.

> > +   }
> > + return ba_check;
> > +   } ();
> > +
> >   /* Find the base of OBJECT_TYPE corresponding to SCOPE.  */
> > - access_path = lookup_base (object_type, scope, ba_check,
> > -NULL, complain);
> > + access_path = lookup_base (object_type, scope, ba, NULL, complain);
> >   if (access_path == error_mark_node)
> > return error_mark_node;
> >   if (!access_path)
> > diff --git a/gcc/testsuite/g++.dg/lookup/scoped11.C 
> > b/gcc/testsuite/g++.dg/lookup/scoped11.C
> > new file mode 100644
> > index 000..be743522fce
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/lookup/scoped11.C
> > @@ -0,0 +1,14 @@
> > +// PR c++/112744
> > +// { dg-do compile }
> > +
> > +struct A { const static int a = 0; };
> > +struct B : A {};
> > +struct C : A {};
> > +struct D : B, C {};
> > +
> > +int main()
> > +{
> > +  D d;
> > +  (void) d.a;
> > +  (void) d.A::a;
> > +}
> > diff --git a/gcc/testsuite/g++.dg/lookup/scoped12.C 
> > b/gcc/testsuite/g++.dg/lookup/scoped12.C
> > new file mode 100644
> > 

Re: [RFA] New pass for sign/zero extension elimination

2023-11-29 Thread Joern Rennecke
Why did you leave out MINUS from safe_for_live_propagation ?


Re: [PATCH] c++: wrong ambiguity in accessing static field [PR112744]

2023-11-29 Thread Patrick Palka
On Wed, 29 Nov 2023, Marek Polacek wrote:

> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> Now that I'm posting this patch, I think you'll probably want me to use
> ba_any unconditionally.  That works too; g++.dg/tc1/dr52.C just needs
> a trivial testsuite tweak:
>   'C' is not an accessible base of 'X'
> v.
>   'C' is an inaccessible base of 'X'
> We should probably unify those messages...
> 
> -- >8 --
> Given
> 
>   struct A { constexpr static int a = 0; };
>   struct B : A {};
>   struct C : A {};
>   struct D : B, C {};
> 
> we give the "'A' is an ambiguous base of 'D'" error for
> 
>   D{}.A::a;
> 
> which seems wrong: 'a' is a static data member so there is only one copy
> so it can be unambiguously referred to even if there are multiple A
> objects.  clang++/MSVC/icx agree.
> 
>   PR c++/112744
> 
> gcc/cp/ChangeLog:
> 
>   * typeck.cc (finish_class_member_access_expr): When accessing
>   a static data member, use ba_any for lookup_base.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/lookup/scoped11.C: New test.
>   * g++.dg/lookup/scoped12.C: New test.
>   * g++.dg/lookup/scoped13.C: New test.
> ---
>  gcc/cp/typeck.cc   | 21 ++---
>  gcc/testsuite/g++.dg/lookup/scoped11.C | 14 ++
>  gcc/testsuite/g++.dg/lookup/scoped12.C | 14 ++
>  gcc/testsuite/g++.dg/lookup/scoped13.C | 14 ++
>  4 files changed, 60 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped11.C
>  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped12.C
>  create mode 100644 gcc/testsuite/g++.dg/lookup/scoped13.C
> 
> diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
> index e995fb6ddd7..c4de8bb2616 100644
> --- a/gcc/cp/typeck.cc
> +++ b/gcc/cp/typeck.cc
> @@ -3476,7 +3476,7 @@ finish_class_member_access_expr (cp_expr object, tree 
> name, bool template_p,
>  name, scope);
> return error_mark_node;
>   }
> -   
> +
> if (TREE_SIDE_EFFECTS (object))
>   val = build2 (COMPOUND_EXPR, TREE_TYPE (val), object, val);
> return val;
> @@ -3493,9 +3493,24 @@ finish_class_member_access_expr (cp_expr object, tree 
> name, bool template_p,
> return error_mark_node;
>   }
>  
> +   /* NAME may refer to a static data member, in which case there is
> +  one copy of the data member that is shared by all the objects of
> +  the class.  So NAME can be unambiguously referred to even if
> +  there are multiple indirect base classes containing NAME.  */
> +   const base_access ba = [scope, name] ()
> + {
> +   if (identifier_p (name))
> + {
> +   tree m = lookup_member (scope, name, /*protect=*/0,
> +   /*want_type=*/false, tf_none);
> +   if (!m || VAR_P (m))
> + return ba_any;

I wonder if we want to return ba_check_bit instead of ba_any so that we
still check access of the selected base?

  struct A { constexpr static int a = 0; };
  struct D : private A {};

  void f() {
    D{}.A::a; // #1 GCC (and Clang) currently rejects
  }

  template<class T>
  void g() {
    D{}.T::a; // #2 GCC currently rejects, Clang accepts?!
  }

  template void g<A>();
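
For reference, the underlying point about static members (one shared object
regardless of how many A subobjects exist) can be checked with a small
standalone sketch.  The names counter and write_via_b_read_via_c are
hypothetical and not part of the patch; the sketch deliberately avoids the
d.A::a form that the PR is about, so it should compile with or without the
fix.

```cpp
// A static data member has exactly one definition, shared by every
// A subobject inside D.
struct A { static int counter; };
int A::counter = 0;

struct B : A {};
struct C : A {};
struct D : B, C {};   // D contains two A subobjects.

// Write through the B path and read back through the C path; both name
// the same object, A::counter.
int
write_via_b_read_via_c (int value)
{
  D d;
  d.B::counter = value;
  return d.C::counter;
}

// Lookup of a static member through D is not ambiguous, even though
// A is an ambiguous base of D.
int
read_through_d ()
{
  return D::counter;
}
```

This only illustrates the ambiguity side of the question; it says nothing
about the access checking that ba_check_bit would preserve.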

> + }
> +   return ba_check;
> + } ();
> +
> /* Find the base of OBJECT_TYPE corresponding to SCOPE.  */
> -   access_path = lookup_base (object_type, scope, ba_check,
> -  NULL, complain);
> +   access_path = lookup_base (object_type, scope, ba, NULL, complain);
> if (access_path == error_mark_node)
>   return error_mark_node;
> if (!access_path)
> diff --git a/gcc/testsuite/g++.dg/lookup/scoped11.C 
> b/gcc/testsuite/g++.dg/lookup/scoped11.C
> new file mode 100644
> index 000..be743522fce
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/lookup/scoped11.C
> @@ -0,0 +1,14 @@
> +// PR c++/112744
> +// { dg-do compile }
> +
> +struct A { const static int a = 0; };
> +struct B : A {};
> +struct C : A {};
> +struct D : B, C {};
> +
> +int main()
> +{
> +  D d;
> +  (void) d.a;
> +  (void) d.A::a;
> +}
> diff --git a/gcc/testsuite/g++.dg/lookup/scoped12.C 
> b/gcc/testsuite/g++.dg/lookup/scoped12.C
> new file mode 100644
> index 000..ffa145598fd
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/lookup/scoped12.C
> @@ -0,0 +1,14 @@
> +// PR c++/112744
> +// { dg-do compile }
> +
> +class A { const static int a = 0; };
> +struct B : A {};
> +struct C : A {};
> +struct D : B, C {};
> +
> +int main()
> +{
> +  D d;
> +  (void) d.a;  // { dg-error "private" }
> +  (void) d.A::a;  // { dg-error "private" }
> +}
> diff --git a/gcc/testsuite/g++.dg/lookup/scoped13.C 
> b/gcc/testsuite/g++.dg/lookup/scoped13.C
> new file mode 100644
> index 000..970e1aa833e
> --- /dev/null
> +++ 

Re: [PATCH v2 1/6] libgomp: basic pinned memory on Linux

2023-11-29 Thread Andrew Stubbs

On 22/11/2023 14:26, Tobias Burnus wrote:

Hi Andrew,

Side remark:


-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
-  calloc (1, (((void)(MEMSPACE), (SIZE))))


This fits a bit more to previous patch, but I wonder whether that should
use (MEMSPACE, NMEMB, SIZE) instead - to fit to the actual calloc 
arguments.


I think the main/only difference between SIZE and (NMEMB, SIZE) is that
"If the multiplication of nmemb and size would result in integer overflow,
then calloc() returns an error." (Linux manpage)

However, this wording seems to be neither in POSIX nor in the OpenMP
spec.  There was some alignment discussion at https://gcc.gnu.org/PR112364
regarding whether C (since C23) has a different alignment for
calloc(1, n) vs. calloc(n, 1), but Joseph believes it doesn't.

Thus, this is more bikeshedding than making a real difference.


[Addressing this point separately to the others]

The size has already been calculated, aligned, and padded, before we get 
to calling MEMSPACE_CALLOC. I don't think we can revert to "nmemb, size" 
without breaking that.
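
A rough sketch of what I mean, outside of libgomp.  All three helpers
(checked_total, round_up, calloc_precomputed) are made up for illustration
and are not the allocator's actual code; the point is only that once the
total has been computed and padded up front, the overflow check has to
happen there, and calloc (1, total) is all that is left to call.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: compute nmemb * size, failing on overflow in the
// way the Linux manpage describes for calloc (nmemb, size).
static bool
checked_total (std::size_t nmemb, std::size_t size, std::size_t *total)
{
  if (size != 0 && nmemb > SIZE_MAX / size)
    return false;
  *total = nmemb * size;
  return true;
}

// Hypothetical helper: round SIZE up to a multiple of ALIGN, which must
// be a power of two.
static std::size_t
round_up (std::size_t size, std::size_t align)
{
  return (size + align - 1) & ~(align - 1);
}

// With a single precomputed byte count, only the (1, total) form of
// calloc is available.
static void *
calloc_precomputed (std::size_t total)
{
  return std::calloc (1, total);
}
```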


Andrew


Re: [PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342]

2023-11-29 Thread Richard Sandiford
"Andre Vieira (lists)"  writes:
> Rebased, no major changes, still needs review.
>
> On 30/08/2023 10:19, Andre Vieira (lists) via Gcc-patches wrote:
>> This patch finalizes adding support for the generation of SVE simd 
>> clones when no simdlen is provided, following the ABI rules where the 
>> widest data type determines the minimum amount of elements in a length 
>> agnostic vector.
>> 
>> gcc/ChangeLog:
>> 
>>      * config/aarch64/aarch64-protos.h (add_sve_type_attribute): 
>> Declare.
>>  * config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute): 
>> Make
>>  visibility global.
>>  * config/aarch64/aarch64.cc (aarch64_fntype_abi): Ensure SVE ABI is
>>  chosen over SIMD ABI if a SVE type is used in return or arguments.
>>  (aarch64_simd_clone_compute_vecsize_and_simdlen): Create VLA simd 
>> clone
>>  when no simdlen is provided, according to ABI rules.
>>  (aarch64_simd_clone_adjust): Add '+sve' attribute to SVE simd clones.
>>  (aarch64_simd_clone_adjust_ret_or_param): New.
>>  (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Define.
>>  * omp-simd-clone.cc (simd_clone_mangle): Print 'x' for VLA simdlen.
>>  (simd_clone_adjust): Adapt safelen check to be compatible with VLA
>>  simdlen.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * c-c++-common/gomp/declare-variant-14.c: Adapt aarch64 scan.
>>  * gfortran.dg/gomp/declare-variant-14.f90: Likewise.
>>  * gcc.target/aarch64/declare-simd-1.c: Remove warning checks where no
>>  longer necessary.
>>  * gcc.target/aarch64/declare-simd-2.c: Add SVE clone scan.
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 
> 60a55f4bc1956786ea687fc7cad7ec9e4a84e1f0..769d637f63724a7f0044f48f3dd683e0fb46049c
>  100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -1005,6 +1005,8 @@ namespace aarch64_sve {
>  #ifdef GCC_TARGET_H
>bool verify_type_context (location_t, type_context_kind, const_tree, bool);
>  #endif
> + void add_sve_type_attribute (tree, unsigned int, unsigned int,
> +   const char *, const char *);
>  }
>  
>  extern void aarch64_split_combinev16qi (rtx operands[3]);
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
> b/gcc/config/aarch64/aarch64-sve-builtins.cc
> index 
> 161a14edde7c9fb1b13b146cf50463e2d78db264..6f99c438d10daa91b7e3b623c995489f1a8a0f4c
>  100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -569,14 +569,16 @@ static bool reported_missing_registers_p;
>  /* Record that TYPE is an ABI-defined SVE type that contains NUM_ZR SVE 
> vectors
> and NUM_PR SVE predicates.  MANGLED_NAME, if nonnull, is the ABI-defined
> mangling of the type.  ACLE_NAME is the  name of the type.  */
> -static void
> +void
>  add_sve_type_attribute (tree type, unsigned int num_zr, unsigned int num_pr,
>   const char *mangled_name, const char *acle_name)
>  {
>tree mangled_name_tree
>  = (mangled_name ? get_identifier (mangled_name) : NULL_TREE);
> +  tree acle_name_tree
> += (acle_name ? get_identifier (acle_name) : NULL_TREE);
>  
> -  tree value = tree_cons (NULL_TREE, get_identifier (acle_name), NULL_TREE);
> +  tree value = tree_cons (NULL_TREE, acle_name_tree, NULL_TREE);
>value = tree_cons (NULL_TREE, mangled_name_tree, value);
>value = tree_cons (NULL_TREE, size_int (num_pr), value);
>value = tree_cons (NULL_TREE, size_int (num_zr), value);
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> 37507f091c2a6154fa944c3a9fad6a655ab5d5a1..cb0947b18c6a611d55579b5b08d93f6a4a9c3b2c
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -4080,13 +4080,13 @@ aarch64_takes_arguments_in_sve_regs_p (const_tree 
> fntype)
>  static const predefined_function_abi &
>  aarch64_fntype_abi (const_tree fntype)
>  {
> -  if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)))
> -return aarch64_simd_abi ();
> -
>if (aarch64_returns_value_in_sve_regs_p (fntype)
>|| aarch64_takes_arguments_in_sve_regs_p (fntype))
>  return aarch64_sve_abi ();
>  
> +  if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)))
> +return aarch64_simd_abi ();
> +
>return default_function_abi;
>  }
>  

I think we discussed this off-list later, but the change above shouldn't
be necessary.  aarch64_vector_pcs must not be attached to SVE PCS functions,
so the two cases should be mutually exclusive.

> @@ -27467,7 +27467,7 @@ aarch64_simd_clone_compute_vecsize_and_simdlen 
> (struct cgraph_node *node,
>   int num, bool explicit_p)
>  {
>tree t, ret_type;
> -  unsigned int nds_elt_bits;
> +  unsigned int nds_elt_bits, wds_elt_bits;
>int count;
>unsigned HOST_WIDE_INT const_simdlen;
>  
> @@ -27513,10 +27513,14 @@ 

[COMMITTED 2/2] PR tree-optimization/111922 - Check operands before invoking fold_range.

2023-11-29 Thread Andrew MacLeod
This patch utilizes the new operand_check_p() routine in range-ops to 
verify the operands are compatible before IPA tries to call 
fold_range().  I do not know if there are other places in IPA that 
should be checking this, but we have a bug report for this place at least.


The other option would be to have fold_range simply return false when 
operands don't match, but then we lose the compile time checking that 
everything is as it should be and bugs may sneak through.
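
As a toy illustration of the call-site guard (not the real range-op
classes): handler_sketch and guarded_fold below are hypothetical, precisions
are plain integers, and only the method names operand_check_p and fold_range
come from the patch.

```cpp
// Hypothetical miniature of a range-op handler.
struct handler_sketch
{
  int fold_calls = 0;

  // Stand-in check: all three precisions must agree.
  bool operand_check_p (unsigned lhs, unsigned op1, unsigned op2) const
  { return lhs == op1 && op1 == op2; }

  // Stand-in fold: just records that it was reached.
  bool fold_range (unsigned, unsigned, unsigned)
  { ++fold_calls; return true; }
};

// Mirrors the shape of the ipa-cp.cc change: the short-circuit && means
// fold_range is never reached with mismatched operands.
static bool
guarded_fold (handler_sketch &h, unsigned dst_prec, unsigned src_prec)
{
  return (h.operand_check_p (dst_prec, src_prec, dst_prec)
	  && h.fold_range (dst_prec, src_prec, dst_prec));
}
```

The caller decides what a mismatch means (here, simply "no useful range"),
while fold_range itself can keep asserting on bad input.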


Bootstraps on x86_64-pc-linux-gnu with no regressions. Committed.

Andrew


From 5f0c0f02702eba568374a7d82ec9463edd1a905c Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 28 Nov 2023 13:02:35 -0500
Subject: [PATCH 2/2] Check operands before invoking fold_range.

Call operand_check_p before fold_range to make sure it is a valid operation.

	PR tree-optimization/111922
	gcc/
	* ipa-cp.cc (ipa_vr_operation_and_type_effects): Check the
	operands are valid before calling fold_range.

	gcc/testsuite/
	* gcc.dg/pr111922.c: New.
---
 gcc/ipa-cp.cc   |  3 ++-
 gcc/testsuite/gcc.dg/pr111922.c | 29 +
 2 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr111922.c

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 34fae065454..649ad536161 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1926,7 +1926,8 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr,
   Value_Range varying (dst_type);
   varying.set_varying (dst_type);
 
-  return (handler.fold_range (dst_vr, dst_type, src_vr, varying)
+  return (handler.operand_check_p (dst_type, src_type, dst_type)
+	  && handler.fold_range (dst_vr, dst_type, src_vr, varying)
 	  && !dst_vr.varying_p ()
 	  && !dst_vr.undefined_p ());
 }
diff --git a/gcc/testsuite/gcc.dg/pr111922.c b/gcc/testsuite/gcc.dg/pr111922.c
new file mode 100644
index 000..4f429d741c7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111922.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-tree-fre" } */
+
+void f2 (void);
+void f4 (int, int, int);
+struct A { int a; };
+struct B { struct A *b; int c; } v;
+
+static int
+f1 (x, y)
+  struct C *x;
+  struct A *y;
+{
+  (v.c = v.b->a) || (v.c = v.b->a);
+  f2 ();
+}
+
+static void
+f3 (int x, int y)
+{
+  int b = f1 (0, ~x);
+  f4 (0, 0, v.c);
+}
+
+void
+f5 (void)
+{
+  f3 (0, 0);
+}
-- 
2.41.0



[COMMITTED 1/2] Add operand_check_p to range-ops.

2023-11-29 Thread Andrew MacLeod
I've been going back and forth with this for the past week, and finally 
settled on a solution.


This patch adds an operand_check_p() (lhs_type, op1_type, op2_type) 
method to range_ops which will confirm whether the types of the operands 
being passed to fold_range, op1_range, and op2_range  are properly 
compatible.   For range-ops this basically means the precision matches.


It was a bit tricky to do it any other way because various operations 
allow different precision or even different types in some operand positions.


This patch sets up the operand_check_p to return true by default, which 
means there is no variation from what we do today.  However, I have gone 
into all the integral/mixed range operators and added checks for 
things like X = Y + Z requiring the precision to be the same for all 3 
operands.  By contrast, x = y && z only requires OP1 and OP2 to be the same 
precision, and x = ~y only requires the LHS and OP1 to match.


This call is utilized in a gcc_assert when CHECKING_P is on for 
fold_range(), op1_range() and op2_range() to provide compilation time 
verification while not costing anything for a release build.
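
A simplified model of the scheme, for anyone skimming.  This is a sketch
under assumptions, not the patch itself: the *_sketch classes are
hypothetical, precisions are passed as plain integers instead of trees, and
plain assert stands in for gcc_assert.

```cpp
#include <cassert>

// Base class: by default every operand combination is accepted, so
// operators that do not override the check behave as before.
struct range_operator_sketch
{
  virtual ~range_operator_sketch () = default;
  virtual bool operand_check_p (unsigned, unsigned, unsigned) const
  { return true; }
};

// X = Y + Z: all three precisions must match.
struct plus_sketch final : range_operator_sketch
{
  bool operand_check_p (unsigned lhs, unsigned op1,
			unsigned op2) const override
  { return lhs == op1 && op1 == op2; }
};

// x = y && z: only OP1 and OP2 must match.
struct logical_and_sketch final : range_operator_sketch
{
  bool operand_check_p (unsigned, unsigned op1,
			unsigned op2) const override
  { return op1 == op2; }
};

// x = ~y: only the LHS and OP1 must match.
struct bitwise_not_sketch final : range_operator_sketch
{
  bool operand_check_p (unsigned lhs, unsigned op1,
			unsigned) const override
  { return lhs == op1; }
};

#ifndef CHECKING_P
#define CHECKING_P 1
#endif

// The check only runs in checking builds; otherwise it compiles away.
static bool
fold_range_sketch (const range_operator_sketch &op, unsigned lhs,
		   unsigned op1, unsigned op2)
{
#if CHECKING_P
  assert (op.operand_check_p (lhs, op1, op2));
#endif
  (void) op; (void) lhs; (void) op1; (void) op2;
  return true;
}
```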


Bootstraps on x86_64-pc-linux-gnu with no regressions. Committed.

Andrew

From 9f1149ef823b64ead6115f79f99ddf8eead1c2f4 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 28 Nov 2023 09:39:30 -0500
Subject: [PATCH 1/2] Add operand_check_p to range-ops.

Add an optional method to verify operands are compatible, and check
the operands before all range operations.

	* range-op-mixed.h (operator_equal::operand_check_p): New.
	(operator_not_equal::operand_check_p): New.
	(operator_lt::operand_check_p): New.
	(operator_le::operand_check_p): New.
	(operator_gt::operand_check_p): New.
	(operator_ge::operand_check_p): New.
	(operator_plus::operand_check_p): New.
	(operator_abs::operand_check_p): New.
	(operator_minus::operand_check_p): New.
	(operator_negate::operand_check_p): New.
	(operator_mult::operand_check_p): New.
	(operator_bitwise_not::operand_check_p): New.
	(operator_bitwise_xor::operand_check_p): New.
	(operator_bitwise_and::operand_check_p): New.
	(operator_bitwise_or::operand_check_p): New.
	(operator_min::operand_check_p): New.
	(operator_max::operand_check_p): New.
	* range-op.cc (range_op_handler::fold_range): Check operand
	parameter types.
	(range_op_handler::op1_range): Ditto.
	(range_op_handler::op2_range): Ditto.
	(range_op_handler::operand_check_p): New.
	(range_operator::operand_check_p): New.
	(operator_lshift::operand_check_p): New.
	(operator_rshift::operand_check_p): New.
	(operator_logical_and::operand_check_p): New.
	(operator_logical_or::operand_check_p): New.
	(operator_logical_not::operand_check_p): New.
	* range-op.h (range_operator::operand_check_p): New.
	(range_op_handler::operand_check_p): New.
---
 gcc/range-op-mixed.h | 63 +---
 gcc/range-op.cc  | 53 ++---
 gcc/range-op.h   |  5 
 3 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/gcc/range-op-mixed.h b/gcc/range-op-mixed.h
index 45e11df57df..4386a68e946 100644
--- a/gcc/range-op-mixed.h
+++ b/gcc/range-op-mixed.h
@@ -138,6 +138,9 @@ public:
   const frange &) const final override;
   void update_bitmask (irange &r, const irange &lh,
 		   const irange &rh) const final override;
+  // Check op1 and op2 for compatibility.
+  bool operand_check_p (tree, tree t1, tree t2) const final override
+{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
 };
 
 class operator_not_equal : public range_operator
@@ -174,6 +177,9 @@ public:
   const frange &) const final override;
   void update_bitmask (irange &r, const irange &lh,
 		   const irange &rh) const final override;
+  // Check op1 and op2 for compatibility.
+  bool operand_check_p (tree, tree t1, tree t2) const final override
+{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
 };
 
 class operator_lt :  public range_operator
@@ -207,6 +213,9 @@ public:
   const frange &) const final override;
   void update_bitmask (irange &r, const irange &lh,
 		   const irange &rh) const final override;
+  // Check op1 and op2 for compatibility.
+  bool operand_check_p (tree, tree t1, tree t2) const final override
+{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
 };
 
 class operator_le :  public range_operator
@@ -243,6 +252,9 @@ public:
   const frange &) const final override;
   void update_bitmask (irange &r, const irange &lh,
 		   const irange &rh) const final override;
+  // Check op1 and op2 for compatibility.
+  bool operand_check_p (tree, tree t1, tree t2) const final override
+{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
 };
 
 class operator_gt :  public range_operator
@@ -278,6 +290,9 @@ public:
   const frange &) const final override;
   void update_bitmask (irange &r, const irange &lh,
 		   const irange &rh) const final override;
+  // Check op1 and op2 for compatibility.
+  bool operand_check_p 

Re: [PATCH] Add C intrinsics for scalar crypto extension

2023-11-29 Thread Liao Shihua


On 2023/11/29 23:03, Christoph Müllner wrote:

On Mon, Nov 27, 2023 at 9:36 AM Liao Shihua  wrote:

This patch adds C intrinsics for the scalar crypto extension.
Because riscv-c-api 
(https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44/files) includes the 
zbkb/zbkc/zbkx intrinsics in the bit manipulation extension, this patch only 
supports the zkn*/zks* intrinsics.

Thanks for working on this!
Looking forward to seeing the second patch (covering bitmanip) soon as well!
A couple of comments can be found below.



Thanks for your comments, Christoph. Typos will be corrected in the next 
patch.


The implementation of the intrinsics is taken from the implementation in 
LLVM. (It does look a little strange.)


I will unify the implementation method in the next patch.





gcc/ChangeLog:

 * config.gcc: Add riscv_crypto.h
 * config/riscv/riscv_crypto.h: New file.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/zknd32.c: Use intrinsics instead of builtins.
 * gcc.target/riscv/zknd64.c: Likewise.
 * gcc.target/riscv/zkne32.c: Likewise.
 * gcc.target/riscv/zkne64.c: Likewise.
 * gcc.target/riscv/zknh-sha256-32.c: Likewise.
 * gcc.target/riscv/zknh-sha256-64.c: Likewise.
 * gcc.target/riscv/zknh-sha512-32.c: Likewise.
 * gcc.target/riscv/zknh-sha512-64.c: Likewise.
 * gcc.target/riscv/zksed32.c: Likewise.
 * gcc.target/riscv/zksed64.c: Likewise.
 * gcc.target/riscv/zksh32.c: Likewise.
 * gcc.target/riscv/zksh64.c: Likewise.

---
  gcc/config.gcc|   2 +-
  gcc/config/riscv/riscv_crypto.h   | 219 ++
  gcc/testsuite/gcc.target/riscv/zknd32.c   |   6 +-
  gcc/testsuite/gcc.target/riscv/zknd64.c   |  12 +-
  gcc/testsuite/gcc.target/riscv/zkne32.c   |   6 +-
  gcc/testsuite/gcc.target/riscv/zkne64.c   |  10 +-
  .../gcc.target/riscv/zknh-sha256-32.c |  22 +-
  .../gcc.target/riscv/zknh-sha256-64.c |  10 +-
  .../gcc.target/riscv/zknh-sha512-32.c |  14 +-
  .../gcc.target/riscv/zknh-sha512-64.c |  10 +-
  gcc/testsuite/gcc.target/riscv/zksed32.c  |   6 +-
  gcc/testsuite/gcc.target/riscv/zksed64.c  |   6 +-
  gcc/testsuite/gcc.target/riscv/zksh32.c   |   6 +-
  gcc/testsuite/gcc.target/riscv/zksh64.c   |   6 +-
  14 files changed, 288 insertions(+), 47 deletions(-)
  create mode 100644 gcc/config/riscv/riscv_crypto.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b88591b6fd8..d67fe8b6a6f 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -548,7 +548,7 @@ riscv*)
 extra_objs="${extra_objs} riscv-vector-builtins.o 
riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
 extra_objs="${extra_objs} thead.o riscv-target-attr.o"
 d_target_objs="riscv-d.o"
-   extra_headers="riscv_vector.h"
+   extra_headers="riscv_vector.h riscv_crypto.h"
 target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.cc"
 target_gtfiles="$target_gtfiles 
\$(srcdir)/config/riscv/riscv-vector-builtins.h"
 ;;
diff --git a/gcc/config/riscv/riscv_crypto.h b/gcc/config/riscv/riscv_crypto.h
new file mode 100644
index 000..149c1132e10
--- /dev/null
+++ b/gcc/config/riscv/riscv_crypto.h
@@ -0,0 +1,219 @@
+/* RISC-V 'K' Extension intrinsics include file.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+#ifndef __RISCV_CRYPTO_H
+#define __RISCV_CRYPTO_H
+
+#include 
+
+#if defined (__cplusplus)
+extern "C" {
+#endif
+
+#if defined(__riscv_zknd)
+#if __riscv_xlen == 32
+#define __riscv_aes32dsi(x, y, bs) __builtin_riscv_aes32dsi(x, y, bs)
+#define __riscv_aes32dsmi(x, y, bs) __builtin_riscv_aes32dsmi(x, y, bs)
+#endif
+
+#if __riscv_xlen == 64
+static __inline__ uint64_t __attribute__ ((__always_inline__, __nodebug__))
+__riscv_aes64ds (uint64_t __x, uint64_t __y)
+{
+  return __builtin_riscv_aes64ds (__x, __y);
+}

I don't understand why some intrinsic functions are implemented as
macros 

Re: [committed v2] libstdc++: Define std::ranges::to for C++23 (P1206R7) [PR111055]

2023-11-29 Thread Patrick Palka
On Thu, 23 Nov 2023, Jonathan Wakely wrote:

> Here's the finished version of the std::ranges::to patch, which I've
> pushed to trunk.
> 
> Tested x86_64-linux.
> 
> -- >8 --
> 
> This adds the std::ranges::to functions for C++23. The rest of P1206R7
> is not yet implemented, i.e. the new constructors taking the
> std::from_range tag, and the new insert_range, assign_range, etc. member
> functions. std::ranges::to works with the standard containers even
> without the new constructors, so this is useful immediately.
> 
> The __cpp_lib_ranges_to_container feature test macro can be defined now,
> because that only indicates support for the changes in , which
> are implemented by this patch. The __cpp_lib_containers_ranges macro
> will be defined once all containers support the new member functions.
> 
> libstdc++-v3/ChangeLog:
> 
>   PR libstdc++/111055
>   * include/bits/ranges_base.h (from_range_t): Define new tag
>   type.
>   (from_range): Define new tag object.
>   * include/bits/version.def (ranges_to_container): Define.
>   * include/bits/version.h: Regenerate.
>   * include/std/ranges (ranges::to): Define.
>   * testsuite/std/ranges/conv/1.cc: New test.
>   * testsuite/std/ranges/conv/2_neg.cc: New test.
>   * testsuite/std/ranges/conv/version.cc: New test.
> ---
>  libstdc++-v3/include/bits/ranges_base.h   |   8 +-
>  libstdc++-v3/include/bits/version.def |  34 +-
>  libstdc++-v3/include/bits/version.h   | 111 +++---
>  libstdc++-v3/include/std/ranges   | 361 -
>  libstdc++-v3/testsuite/std/ranges/conv/1.cc   | 369 ++
>  .../testsuite/std/ranges/conv/2_neg.cc|  24 ++
>  .../testsuite/std/ranges/conv/version.cc  |  19 +
>  7 files changed, 866 insertions(+), 60 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/std/ranges/conv/1.cc
>  create mode 100644 libstdc++-v3/testsuite/std/ranges/conv/2_neg.cc
>  create mode 100644 libstdc++-v3/testsuite/std/ranges/conv/version.cc
> 
> diff --git a/libstdc++-v3/include/bits/ranges_base.h 
> b/libstdc++-v3/include/bits/ranges_base.h
> index 7fa43d1965a..1ca2c5ce2bb 100644
> --- a/libstdc++-v3/include/bits/ranges_base.h
> +++ b/libstdc++-v3/include/bits/ranges_base.h
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #ifdef __cpp_lib_concepts
>  namespace std _GLIBCXX_VISIBILITY(default)
> @@ -1056,8 +1057,13 @@ namespace ranges
>  using borrowed_iterator_t = __conditional_t,
>   iterator_t<_Range>,
>   dangling>;
> -
>  } // namespace ranges
> +
> +#if __glibcxx_ranges_to_container // C++ >= 23
> +  struct from_range_t { explicit from_range_t() = default; };
> +  inline constexpr from_range_t from_range{};
> +#endif
> +
>  _GLIBCXX_END_NAMESPACE_VERSION
>  } // namespace std
>  #endif // library concepts
> diff --git a/libstdc++-v3/include/bits/version.def 
> b/libstdc++-v3/include/bits/version.def
> index 605708dfee7..140777832ed 100644
> --- a/libstdc++-v3/include/bits/version.def
> +++ b/libstdc++-v3/include/bits/version.def
> @@ -1439,19 +1439,21 @@ ftms = {
>};
>  };
>  
> -ftms = {
> -  name = to_underlying;
> -  values = {
> -v = 202102;
> -cxxmin = 23;
> -  };
> -};
> +//ftms = {
> +//  name = container_ranges;
> +//  values = {
> +//v = 202202;
> +//cxxmin = 23;
> +//hosted = yes;
> +//  };
> +//};
>  
>  ftms = {
> -  name = unreachable;
> +  name = ranges_to_container;
>values = {
>  v = 202202;
>  cxxmin = 23;
> +hosted = yes;
>};
>  };
>  
> @@ -1683,6 +1685,22 @@ ftms = {
>};
>  };
>  
> +ftms = {
> +  name = to_underlying;
> +  values = {
> +v = 202102;
> +cxxmin = 23;
> +  };
> +};
> +
> +ftms = {
> +  name = unreachable;
> +  values = {
> +v = 202202;
> +cxxmin = 23;
> +  };
> +};
> +
>  ftms = {
>name = fstream_native_handle;
>values = {
> diff --git a/libstdc++-v3/include/bits/version.h 
> b/libstdc++-v3/include/bits/version.h
> index cacd9375cab..1fb1d148459 100644
> --- a/libstdc++-v3/include/bits/version.h
> +++ b/libstdc++-v3/include/bits/version.h
> @@ -1740,29 +1740,18 @@
>  #endif /* !defined(__cpp_lib_reference_from_temporary) && 
> defined(__glibcxx_want_reference_from_temporary) */
>  #undef __glibcxx_want_reference_from_temporary
>  
> -// from version.def line 1443
> -#if !defined(__cpp_lib_to_underlying)
> -# if (__cplusplus >= 202100L)
> -#  define __glibcxx_to_underlying 202102L
> -#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_to_underlying)
> -#   define __cpp_lib_to_underlying 202102L
> +// from version.def line 1452
> +#if !defined(__cpp_lib_ranges_to_container)
> +# if (__cplusplus >= 202100L) && _GLIBCXX_HOSTED
> +#  define __glibcxx_ranges_to_container 202202L
> +#  if defined(__glibcxx_want_all) || 
> defined(__glibcxx_want_ranges_to_container)
> +#   define __cpp_lib_ranges_to_container 202202L
>  

Re: [PATCH v2 1/3] libgomp, nvptx: low-latency memory allocator

2023-11-29 Thread Andrew Stubbs

On 08/09/2023 10:04, Tobias Burnus wrote:


Regarding patch 2/3 and MEMSPACE_VALIDATE.

In general, I wonder how to handle memory spaces (and traits) that
aren't supported. Namely, when to return 0L and when to silently
ignore the trait / use another memory space.

The current omp_init_allocator code only returns omp_null_allocator for
invalid value – or for pinned memory (as it is unsupported). [RFC: Shall
we keep doing so – or return omp_null_mem_alloc more often? →
https://gcc.gnu.org/PR111044 for this question, improving libmemkind
usage, and extending the allocator-related documentation.]

As we do it on the host, I think auto-fallback to omp_default_mem_space
is also fine for nvptx (and gcn), though not as done in 2/3 but slightly
differently:

(a) In omp_init_allocator, there should be a check whether it is
supported, if not, we can fallback to using default memory space. (In
line with the current code host + 1/2+2/3 nvptx behaviour.)

Note: That's not the same as the current 2/3 patch. Currently, if
MEMSPACE_VALIDATE fails, a retry is attempted – but the outcome depends
on the value for 'fallback'. When changing the memory space during
omp_init_allocator, only failed 'malloc' will give abort with abort_fb.

(b) For nvptx_memspace_validate, I think an additional check should be
done based on the __PTX_ISA_VERSION* as it feels off if plugin first
claims support for it but later unconditionally uses malloc at runtime.


I have looked at moving the MEMSPACE_VALIDATE call into 
omp_init_allocator so that we can't even create allocators that would be 
invalid, but that changes the semantics of the fall-back traits.  Here's 
the example from testcase omp_alloc-traits.c:


  omp_alloctrait_t traits_all[2]
= { { omp_atk_fallback, omp_atv_null_fb },
{ omp_atk_access, omp_atv_all } };
  omp_allocator_handle_t lowlat_all
= omp_init_allocator (omp_low_lat_mem_space, 2, traits_all);

  /* ... */

  void *b = omp_alloc (1, lowlat_all);

With my patch as proposed, "lowlat_all" is a valid allocator, but 
allocating low-latency memory fails in omp_alloc, so "b" ends up NULL 
(the fall-back setting).


With the proposed change, "lowlat_all" becomes omp_null_allocator, and 
"b" is non-NULL, pointing to default memory. This is probably surprising 
to the user because they thought they specified "low-latency or nothing".
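To make the difference concrete, here is a minimal sketch of the two behaviours. It is a plain C++ model, not libgomp code: allocator_model, init_as_posted, init_validate_early and alloc_model are hypothetical names, and std::malloc merely stands in for both low-latency and default memory. It assumes, as the OpenMP spec describes, that allocating through omp_null_allocator uses the default allocator.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical model only; none of these names are libgomp API.
enum class fallback { null_fb, default_mem_fb };

struct allocator_model
{
  bool valid;          // false models omp_null_allocator
  bool lowlat_usable;  // can low-latency memory satisfy requests?
  fallback fb;         // models the omp_atk_fallback trait
};

// Patch as posted: creating the allocator always succeeds; an unusable
// memory space is only noticed at allocation time.
allocator_model
init_as_posted (bool lowlat_usable, fallback fb)
{
  return {true, lowlat_usable, fb};
}

// Suggested change: validate in the init function, so an unusable
// memory space yields the null allocator.
allocator_model
init_validate_early (bool lowlat_usable, fallback fb)
{
  if (!lowlat_usable)
    return {false, false, fb};
  return {true, true, fb};
}

// Models omp_alloc.  A null allocator means "use the default allocator",
// so the fallback trait is never consulted in that case.
void *
alloc_model (std::size_t size, const allocator_model &a)
{
  if (!a.valid)
    return std::malloc (size);   // default memory
  if (a.lowlat_usable)
    return std::malloc (size);   // stands in for low-latency memory
  return a.fb == fallback::null_fb ? nullptr : std::malloc (size);
}
```

With low-latency memory unusable and the null_fb trait, alloc_model returns NULL for an allocator from init_as_posted, but a non-NULL pointer for one from init_validate_early, which is the surprise described above.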


Another option would be to create a custom allocator that goes straight 
to the fall-back somehow (we could invent an internal value 
"ompx_fallback_mem_space", or some such).


What is the desired behaviour in this case? I'm not sure that what the 
OpenMP spec actually says matches what the intention seems to have been 
with fallbacks.



(c) We also need to handle omp_low_lat_mem_alloc. I think the spec
implies access:all but nvptx/gcn only support cgroup (+ pteams +
thread), potentially leading to wrong code.

If we're not allowed to default to "cgroup" then surely 
omp_low_lat_mem_alloc is useless on all GPU devices (that I am aware of) 
on all toolchains? There may be some use on some specialist NUMA host 
devices, but that's it.


Andrew


Re: [PATCH v7 2/5] OpenMP/OpenACC: Rework clause expansion and nested struct handling

2023-11-29 Thread Tobias Burnus

Hi Julian,

On 29.11.23 12:43, Julian Brown wrote:

Here is a patch incorporating your initial review comments
(hopefully!).


Thanks.

The patch LGTM - with the two remarks below addressed.

(i.e. fixing one testcase and filing two PRs (or one common PR) about the missing
features exposed by the two testcases, also referencing those testcases
- and, for the lvalues, mentioning the OpenMP spec issue number.)

* * *

BTW: Patch 1/5 has been approved several times and is just reindenting - so it
is obviously still OK.


(Review-wise, 3/5, 4/5 and 5/5 still have to be done.

I think the patch can go in before that, given the huge improvements, even though
it regresses a few cases (xfail added for 2 Fortran testcases). 3/5 un-xfails
one and a half of the testcases, 5/5 un-xfails the remaining half, and all of
{3,4,5}/5 contain very useful improvements besides this. But maybe waiting for
at least 3/5 makes sense.

In either case, I will try to review the remaining patches soon.)

* * *

Question regarding the following:
(a) The dg-xfail-run-if looks bogus as this is an OpenMP test and not an OpenACC test.
(b) If there is shared memory, using 'omp target' should be fine.

Namely, given that:


--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-49.C
@@ -0,0 +1,37 @@
+#include 
+#include 
+
+struct s {
+  int ()[10];
+  s(int ()[10]) : a(a0) {}
+};
+
+int
+main (int argc, char *argv[])
+{
+  int la[10];
+  s v_real(la);
+  s *v = _real;
+
+  memset (la, 0, sizeof la);
+
+  #pragma omp target enter data map(to: v)
+
+  /* Copying the whole v[0] here DOES NOT WORK yet because the reference 'a' is
+     not copied "as if" it was mapped explicitly as a member.  FIXME.  */
+  #pragma omp target enter data map(to: v[0])
+
+  //#pragma omp target
+  {
+    v->a[5]++;
+  }
+
+  #pragma omp target exit data map(release: v[0])
+  #pragma omp target exit data map(from: v)
+
+  assert (v->a[5] == 1);
+
+  return 0;
+}
+
+// { dg-xfail-run-if "TODO" { *-*-* } { "-DACC_MEM_SHARED=0" } }

Shouldn't the XFAIL instead be based on '{ target offload_device_nonshared_as }'
and the 'omp target' be uncommented?

And I wonder whether we need to file a PR about this issue - I guess it is not
addressed by any of the follow-up issues and might get forgotten unless there
is a PR.


* * *

libgomp/testsuite/libgomp.c++/baseptrs-4.C
...
// Needs map clause "lvalue"-parsing support.
//#define REF2ARRAY_DECL_BASE


There is an open OpenMP issue to disallow some lvalues, namely:
OpenMP Issue 2618 ("Clarify behavior of mapping lvalues on target construct")
discusses whether code like the following

  map(*p = 10)
  map(x = 20)
  map(x ? y[0] : p[1])
  map(f(y))

is valid or not. The sentiment was to require that a 'map' clause list item
must have a base pointer or a base variable.
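For contrast, here is a minimal sketch of list items that do satisfy that requirement: every item names a base variable (x) or goes through a base pointer (y). The function name bump_mapped and the data are made up for illustration. Without -fopenmp the pragma is ignored and the block simply runs on the host.

```cpp
#include <cassert>

// Every map list item below has a base variable (x) or a base
// pointer (y), unlike map(x = 20) or map(f(y)).
int
bump_mapped ()
{
  int x = 20;
  int buf[4] = {1, 2, 3, 4};
  int *y = buf;

  #pragma omp target map(tofrom: x) map(tofrom: y[0:4])
  {
    x += y[0];   // reads the mapped array section
    y[1] = x;    // writes back through the base pointer
  }

  return x + y[1];
}
```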


However, it looks as if your examples would be valid in this regard. Can you file
a PR about this one, referencing both this testcase and the OpenMP issue?

(I do note that Clang and GCC reject the lvalue examples from the OpenMP issue
but not your reference examples; those are accepted by clang++-14.)


Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder

2023-11-29 Thread Szabolcs Nagy
The 11/10/2023 19:48, Florian Weimer wrote:
>   * config/aarch64/linux-unwind.h
>   (aarch64_fallback_frame_state): Add cast to the expected type
>   in sc assignment.
> 
> (Almost a v2, but the other issue was already fixed in r14-4183.)
> 
> ---
>  libgcc/config/aarch64/linux-unwind.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/libgcc/config/aarch64/linux-unwind.h 
> b/libgcc/config/aarch64/linux-unwind.h
> index 00eba866049..18b3df71e7b 100644
> --- a/libgcc/config/aarch64/linux-unwind.h
> +++ b/libgcc/config/aarch64/linux-unwind.h
> @@ -77,7 +77,10 @@ aarch64_fallback_frame_state (struct _Unwind_Context 
> *context,
>  }
>  
>rt_ = context->cfa;
> -  sc = _->uc.uc_mcontext;
> +  /* Historically, the uc_mcontext member was of type struct sigcontext, but
> + glibc uses a different type now with member names in the implementation
> + namespace.  */
> +  sc = (struct sigcontext *) _->uc.uc_mcontext;

FWIW this looks good to me.
(but i cannot approve patches)

(changing the type of sc to mcontext_t* is another option,
but then _GNU_SOURCE is required for the field names to
remain the same across glibc versions, while struct
sigcontext* is unlikely to cause API issues.)

>  
>  /* This define duplicates the definition in aarch64.md */
>  #define SP_REGNUM 31
> 
> base-commit: 3a6df3281a525ae6113f50d7b38b09fcd803801e
> 


Re: [PATCH] aarch64: enable mixed-types for aarch64 simdclones

2023-11-29 Thread Richard Sandiford
Sorry for the very slow review on this.  It LGTM apart from some
minor comments below:

"Andre Vieira (lists)"  writes:
> Hey,
>
> Just a minor update to the patch, I had missed the libgomp testsuite, so 
> had to make some adjustments there too.
>
> gcc/ChangeLog:
>
>  * config/aarch64/aarch64.cc (lane_size): New function.
>  (aarch64_simd_clone_compute_vecsize_and_simdlen): Determine 
> simdlen according to NDS rule
>  and reject combination of simdlen and types that lead to 
> vectors larger than 128bits.
>
> gcc/testsuite/ChangeLog:
>
>  * lib/target-supports.exp: Add aarch64 targets to vect_simd_clones.
>  * c-c++-common/gomp/declare-variant-14.c: Adapt test for aarch64.
>  * c-c++-common/gomp/pr60823-1.c: Likewise.
>  * c-c++-common/gomp/pr60823-2.c: Likewise.
>  * c-c++-common/gomp/pr60823-3.c: Likewise.
>  * g++.dg/gomp/attrs-10.C: Likewise.
>  * g++.dg/gomp/declare-simd-1.C: Likewise.
>  * g++.dg/gomp/declare-simd-3.C: Likewise.
>  * g++.dg/gomp/declare-simd-4.C: Likewise.
>  * g++.dg/gomp/declare-simd-7.C: Likewise.
>  * g++.dg/gomp/declare-simd-8.C: Likewise.
>  * g++.dg/gomp/pr88182.C: Likewise.
>  * gcc.dg/declare-simd.c: Likewise.
>  * gcc.dg/gomp/declare-simd-1.c: Likewise.
>  * gcc.dg/gomp/declare-simd-3.c: Likewise.
>  * gcc.dg/gomp/pr87887-1.c: Likewise.
>  * gcc.dg/gomp/pr87895-1.c: Likewise.
>  * gcc.dg/gomp/pr89246-1.c: Likewise.
>  * gcc.dg/gomp/pr99542.c: Likewise.
>  * gcc.dg/gomp/simd-clones-2.c: Likewise.
>  * gcc.dg/gcc.dg/vect/vect-simd-clone-1.c: Likewise.
>  * gcc.dg/gcc.dg/vect/vect-simd-clone-2.c: Likewise.
>  * gcc.dg/gcc.dg/vect/vect-simd-clone-4.c: Likewise.
>  * gcc.dg/gcc.dg/vect/vect-simd-clone-5.c: Likewise.
>  * gcc.dg/gcc.dg/vect/vect-simd-clone-8.c: Likewise.
>  * gfortran.dg/gomp/declare-simd-2.f90: Likewise.
>  * gfortran.dg/gomp/declare-simd-coarray-lib.f90: Likewise.
>  * gfortran.dg/gomp/declare-variant-14.f90: Likewise.
>  * gfortran.dg/gomp/pr79154-1.f90: Likewise.
>  * gfortran.dg/gomp/pr83977.f90: Likewise.
>
> libgomp/testsuite/ChangeLog:
>
>  * libgomp.c/declare-variant-1.c: Adapt test for aarch64.
>  * libgomp.fortran/declare-simd-1.f90: Likewise.
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> 9fbfc548a891f5d11940c6fd3c49a14bfbdec886..37507f091c2a6154fa944c3a9fad6a655ab5d5a1
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -27414,33 +27414,62 @@ supported_simd_type (tree t)
>return false;
>  }
>  
> -/* Return true for types that currently are supported as SIMD return
> -   or argument types.  */
> +/* Determine the lane size for the clone argument/return type.  This follows
> +   the LS(P) rule in the VFABIA64.  */
>  
> -static bool
> -currently_supported_simd_type (tree t, tree b)
> +static unsigned
> +lane_size (cgraph_simd_clone_arg_type clone_arg_type, tree type)
>  {
> -  if (COMPLEX_FLOAT_TYPE_P (t))
> -return false;
> +  gcc_assert (clone_arg_type != SIMD_CLONE_ARG_TYPE_MASK);
>  
> -  if (TYPE_SIZE (t) != TYPE_SIZE (b))
> -return false;
> +  /* For non map-to-vector types that are pointers we use the element type it
> + points to.  */
> +  if (POINTER_TYPE_P (type))
> +switch (clone_arg_type)
> +  {
> +  default:
> + break;
> +  case SIMD_CLONE_ARG_TYPE_UNIFORM:
> +  case SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP:
> +  case SIMD_CLONE_ARG_TYPE_LINEAR_VARIABLE_STEP:
> + type = TREE_TYPE (type);
> + break;
> +  }
>  
> -  return supported_simd_type (t);
> +  /* For types (or types pointers of non map-to-vector types point to) that 
> are
> + integers or floating point, we use their size if they are 1, 2, 4 or 8.
> +   */
> +  if (INTEGRAL_TYPE_P (type)
> +  || SCALAR_FLOAT_TYPE_P (type))
> +  switch (TYPE_PRECISION (type) / BITS_PER_UNIT)
> + {
> + default:
> +   break;
> + case 1:
> + case 2:
> + case 4:
> + case 8:
> +   return TYPE_PRECISION (type);
> + }

The formatting looks a bit off here.  The switch should be indented by
4 columns and the { by 6.
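As an illustration of the layout I mean, here is a standalone sketch of the size rule in GNU style. It is only a model under stated assumptions: type_model, lane_size_model, bits_per_unit and pointer_size are hypothetical stand-ins for the tree predicates, BITS_PER_UNIT and POINTER_SIZE (assumed 64 here), and the pointer-dereferencing step is left out.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC internals; not the real tree API.
constexpr unsigned bits_per_unit = 8;   // models BITS_PER_UNIT
constexpr unsigned pointer_size = 64;   // models POINTER_SIZE (assumed LP64)

struct type_model
{
  bool is_int_or_float;  // models INTEGRAL_TYPE_P || SCALAR_FLOAT_TYPE_P
  unsigned precision;    // models TYPE_PRECISION, in bits
};

static unsigned
lane_size_model (const type_model &type)
{
  /* Integer and floating-point types of 1, 2, 4 or 8 bytes use their
     own size.  */
  if (type.is_int_or_float)
    switch (type.precision / bits_per_unit)
      {
      default:
        break;
      case 1:
      case 2:
      case 4:
      case 8:
        return type.precision;
      }

  /* Everything else uses the size of a pointer.  */
  return pointer_size;
}
```

Note the switch at 4 columns and its braces at 6, with the case labels aligned to the braces.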

> +  /* For any other we use the size of uintptr_t.  For map-to-vector types 
> that
> + are pointers, using the size of uintptr_t is the same as using the size 
> of
> + their type, seeing all pointers are the same size as uintptr_t.  */
> +  return POINTER_SIZE;
>  }
>  
> +
>  /* Implement TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN.  */
>  
>  static int
>  aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
>   struct cgraph_simd_clone *clonei,
> - tree base_type, int num,
> - bool 

Re: [V2] New pass for sign/zero extension elimination -- not ready for "final" review

2023-11-29 Thread Xi Ruoyao
On Wed, 2023-11-29 at 20:37 +0800, Xi Ruoyao wrote:
> On Wed, 2023-11-29 at 17:33 +0800, Xi Ruoyao wrote:
> > On Mon, 2023-11-27 at 23:06 -0700, Jeff Law wrote:
> > > This has (of course) been tested on rv64.  It's also been bootstrapped
> > > and regression tested on x86.  Bootstrap and regression tested (C only) 
> > > for m68k, sh4, sh4eb, alpha.  Earlier versions were also bootstrapped 
> > > and regression tested on ppc, hppa and s390x (C only for those as well). 
> > >   It's also been tested on the various crosses in my tester.  So we've
> > > got reasonable coverage of 16, 32 and 64 bit targets, big and little
> > > endian, with and without SHIFT_COUNT_TRUNCATED and all kinds of other 
> > > oddities.
> > > 
> > > > The included tests are for RISC-V only because not all targets are going 
> > > > to have extraneous extensions.   There are tests from coremark, x264 and
> > > > GCC's bz database.  It probably wouldn't be hard to add aarch64 
> > > > testcases.  The BZs listed are improved by this patch for aarch64.
> > 
> > I've successfully bootstrapped this on loongarch64-linux-gnu and tried
> > the added test cases.  For loongarch64 the redundant extensions are
> > removed for core_bench_list.c, core_init_matrix.c, core_list_init.c,
> > matrix_add_const.c, and pr111384.c, but not mem-extend.c.

> Follow up: no regression in GCC test suite on LoongArch.
> 
> > Should I change something in LoongArch backend in order to make ext_dce
> > work for mem-extend.c too?  If yes then any pointers?

Hmm... This test does not seem to work even for RISC-V:

$ ./gcc/cc1 -O2 ../gcc/gcc/testsuite/gcc.target/riscv/mem-extend.c  -nostdinc 
-fdump-rtl-ext_dce -march=rv64gc_zbb -mabi=lp64d -o- 2>&1 | grep -F zext.h
zext.h  a5,a5
zext.h  a4,a4

and the 294r.ext_dce file does not contain "Successfully transformed
to:" lines.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] c++: wrong ambiguity in accessing static field [PR112744]

2023-11-29 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

Now that I'm posting this patch, I think you'll probably want me to use
ba_any unconditionally.  That works too; g++.dg/tc1/dr52.C just needs
a trivial testsuite tweak:
  'C' is not an accessible base of 'X'
v.
  'C' is an inaccessible base of 'X'
We should probably unify those messages...

-- >8 --
Given

  struct A { constexpr static int a = 0; };
  struct B : A {};
  struct C : A {};
  struct D : B, C {};

we give the "'A' is an ambiguous base of 'D'" error for

  D{}.A::a;

which seems wrong: 'a' is a static data member so there is only one copy
so it can be unambiguously referred to even if there are multiple A
objects.  clang++/MSVC/icx agree.
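A minimal sketch of the "only one copy" argument, using a non-constexpr static so that its address can be taken in any language mode. The helper names same_static_member and bump_through_d are made up for illustration. The sketch deliberately avoids the d.A::a form, since compilers without this fix reject it.

```cpp
#include <cassert>

struct A { static int a; };
int A::a = 0;
struct B : A {};
struct C : A {};
struct D : B, C {};   // D contains two A subobjects

// Every path to the static member names the same object, so the
// addresses compare equal whichever class is used to name it.
bool
same_static_member ()
{
  return &B::a == &C::a && &D::a == &A::a;
}

// A store through a D object is visible through A::a, because there
// is a single shared copy of the member.
int
bump_through_d ()
{
  D d;
  d.a = 7;
  return A::a;
}
```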

PR c++/112744

gcc/cp/ChangeLog:

* typeck.cc (finish_class_member_access_expr): When accessing
a static data member, use ba_any for lookup_base.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/scoped11.C: New test.
* g++.dg/lookup/scoped12.C: New test.
* g++.dg/lookup/scoped13.C: New test.
---
 gcc/cp/typeck.cc   | 21 ++---
 gcc/testsuite/g++.dg/lookup/scoped11.C | 14 ++
 gcc/testsuite/g++.dg/lookup/scoped12.C | 14 ++
 gcc/testsuite/g++.dg/lookup/scoped13.C | 14 ++
 4 files changed, 60 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lookup/scoped11.C
 create mode 100644 gcc/testsuite/g++.dg/lookup/scoped12.C
 create mode 100644 gcc/testsuite/g++.dg/lookup/scoped13.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index e995fb6ddd7..c4de8bb2616 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -3476,7 +3476,7 @@ finish_class_member_access_expr (cp_expr object, tree 
name, bool template_p,
   name, scope);
  return error_mark_node;
}
- 
+
  if (TREE_SIDE_EFFECTS (object))
val = build2 (COMPOUND_EXPR, TREE_TYPE (val), object, val);
  return val;
@@ -3493,9 +3493,24 @@ finish_class_member_access_expr (cp_expr object, tree 
name, bool template_p,
  return error_mark_node;
}
 
+ /* NAME may refer to a static data member, in which case there is
+one copy of the data member that is shared by all the objects of
+the class.  So NAME can be unambiguously referred to even if
+there are multiple indirect base classes containing NAME.  */
+ const base_access ba = [scope, name] ()
+   {
+ if (identifier_p (name))
+   {
+ tree m = lookup_member (scope, name, /*protect=*/0,
+ /*want_type=*/false, tf_none);
+ if (!m || VAR_P (m))
+   return ba_any;
+   }
+ return ba_check;
+   } ();
+
  /* Find the base of OBJECT_TYPE corresponding to SCOPE.  */
- access_path = lookup_base (object_type, scope, ba_check,
-NULL, complain);
+ access_path = lookup_base (object_type, scope, ba, NULL, complain);
  if (access_path == error_mark_node)
return error_mark_node;
  if (!access_path)
diff --git a/gcc/testsuite/g++.dg/lookup/scoped11.C 
b/gcc/testsuite/g++.dg/lookup/scoped11.C
new file mode 100644
index 000..be743522fce
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/scoped11.C
@@ -0,0 +1,14 @@
+// PR c++/112744
+// { dg-do compile }
+
+struct A { const static int a = 0; };
+struct B : A {};
+struct C : A {};
+struct D : B, C {};
+
+int main()
+{
+  D d;
+  (void) d.a;
+  (void) d.A::a;
+}
diff --git a/gcc/testsuite/g++.dg/lookup/scoped12.C 
b/gcc/testsuite/g++.dg/lookup/scoped12.C
new file mode 100644
index 000..ffa145598fd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/scoped12.C
@@ -0,0 +1,14 @@
+// PR c++/112744
+// { dg-do compile }
+
+class A { const static int a = 0; };
+struct B : A {};
+struct C : A {};
+struct D : B, C {};
+
+int main()
+{
+  D d;
+  (void) d.a;// { dg-error "private" }
+  (void) d.A::a;  // { dg-error "private" }
+}
diff --git a/gcc/testsuite/g++.dg/lookup/scoped13.C 
b/gcc/testsuite/g++.dg/lookup/scoped13.C
new file mode 100644
index 000..970e1aa833e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/scoped13.C
@@ -0,0 +1,14 @@
+// PR c++/112744
+// { dg-do compile }
+
+struct A { const static int a = 0; };
+struct B : A {};
+struct C : A {};
+struct D : B, C {};
+
+int main()
+{
+  D d;
+  (void) d.x;// { dg-error ".struct D. has no member named .x." }
+  (void) d.A::x;  // { dg-error ".struct A. has no member named .x." }
+}

base-commit: 3d104d93a7011146b0870ab160613147adb8d9b3
-- 
2.42.0



Re: [PATCH] aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c

2023-11-29 Thread Richard Earnshaw




On 10/11/2023 11:22, Florian Weimer wrote:

This test looks like it intends to pass a small struct argument
through both a non-variadic and variadic argument, but due to
the typo, it does not achieve that.

gcc/testsuite/

* gcc.target/aarch64/aapcs64/ice_1.c (foo): Call named.

---
  gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c 
b/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c
index 906ccebf616..edc35db2f6e 100644
--- a/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c
@@ -16,6 +16,6 @@ void unnamed (int, ...);
  
  void foo ()

  {
-  name (0, );
+  named (0, );
unnamed (0, );
  }

base-commit: 5f6c5fe078c45bc32c8d21da6b14c27c0ed7be6e



OK.

R.


Re: [PATCH] aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c

2023-11-29 Thread Szabolcs Nagy
The 11/10/2023 12:22, Florian Weimer wrote:
> This test looks like it intends to pass a small struct argument
> through both a non-variadic and variadic argument, but due to
> the typo, it does not achieve that.
> 
> gcc/testsuite/
> 
>   * gcc.target/aarch64/aapcs64/ice_1.c (foo): Call named.


FWIW, this looks good to me.
(but i cannot approve patches)

> 
> ---
>  gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c 
> b/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c
> index 906ccebf616..edc35db2f6e 100644
> --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/ice_1.c
> @@ -16,6 +16,6 @@ void unnamed (int, ...);
>  
>  void foo ()
>  {
> -  name (0, );
> +  named (0, );
>unnamed (0, );
>  }
> 
> base-commit: 5f6c5fe078c45bc32c8d21da6b14c27c0ed7be6e
> 


Re: [PATCH v2 2/2] libatomic: Enable LSE128 128-bit atomics for armv9.4-a

2023-11-29 Thread Richard Earnshaw




On 13/11/2023 11:37, Victor Do Nascimento wrote:

The armv9.4-a architectural revision adds three new atomic operations
associated with the LSE128 feature:

   * LDCLRP - Atomic AND NOT (bitclear) of a location with 128-bit
   value held in a pair of registers, with original data loaded into
   the same 2 registers.
   * LDSETP - Atomic OR (bitset) of a location with 128-bit value held
   in a pair of registers, with original data loaded into the same 2
   registers.
   * SWPP - Atomic swap of one 128-bit value with 128-bit value held
   in a pair of registers.
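The data transformation each instruction performs can be sketched as follows. This is a plain, non-atomic C++ model for illustration only: u128 and the *_model functions are hypothetical names, and fetch_and_model shows how an AND could presumably be expressed through the bit-clear form by inverting the operand first, which is an assumption about the approach rather than a copy of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Two 64-bit halves model the register pair.  NOT atomic; only the
// value transformation is modelled.
struct u128 { std::uint64_t lo, hi; };

// LDCLRP model: clear in memory the bits set in VAL, return the old value.
u128
ldclrp_model (u128 &mem, u128 val)
{
  u128 old = mem;
  mem.lo &= ~val.lo;
  mem.hi &= ~val.hi;
  return old;
}

// LDSETP model: set in memory the bits set in VAL, return the old value.
u128
ldsetp_model (u128 &mem, u128 val)
{
  u128 old = mem;
  mem.lo |= val.lo;
  mem.hi |= val.hi;
  return old;
}

// SWPP model: store VAL, return the old value.
u128
swpp_model (u128 &mem, u128 val)
{
  u128 old = mem;
  mem = val;
  return old;
}

// AND with V is the same as clearing the bits of ~V.
u128
fetch_and_model (u128 &mem, u128 v)
{
  return ldclrp_model (mem, u128{~v.lo, ~v.hi});
}
```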

This patch adds the logic required to make use of these when the
architectural feature is present and a suitable assembler available.

In order to do this, the following changes are made:

   1. Add a configure-time check to check for LSE128 support in the
   assembler.
   2. Edit host-config.h so that when N == 16, nifunc = 2.
   3. Where available due to LSE128, implement the second ifunc, making
   use of the novel instructions.
   4. For atomic functions unable to make use of these new
   instructions, define a new alias which causes the _i1 function
   variant to point ahead to the corresponding _i2 implementation.
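The selection logic in steps 2 to 4 can be modelled roughly as below. This is a hedged sketch, not the libatomic code: cpu_features, impl, op_variants and resolve are hypothetical names, the real mechanism is an ifunc resolver reading hwcap bits, and the assumption that the _i1 slot is the LSE128 one is inferred from step 4.

```cpp
#include <cassert>

// Hypothetical model of choosing among 16-byte variants.
struct cpu_features { bool lse2; bool lse128; };

enum class impl { baseline, lse2, lse128 };

// One entry per atomic operation: which implementation each slot holds.
struct op_variants
{
  impl i1;        // first alternative tried (assumed: the LSE128 slot)
  impl i2;        // second alternative tried
  impl fallback;  // used when no condition holds
};

// An operation with an LSE128 form, e.g. exchange.
constexpr op_variants exchange_16 = { impl::lse128, impl::lse2, impl::baseline };

// An operation without an LSE128 form: its _i1 slot is just an alias
// pointing ahead to the _i2 implementation (step 4).
constexpr op_variants other_op_16 = { impl::lse2, impl::lse2, impl::baseline };

// With two alternatives (nifunc = 2), conditions are tested in order.
impl
resolve (const op_variants &op, cpu_features cpu)
{
  if (cpu.lse128)
    return op.i1;
  if (cpu.lse2)
    return op.i2;
  return op.fallback;
}
```

The alias keeps the resolver uniform: it never needs to know which operations actually have an LSE128 form.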

libatomic/ChangeLog:

* Makefile.am (AM_CPPFLAGS): add conditional setting of
-DHAVE_FEAT_LSE128.
* acinclude.m4 (LIBAT_TEST_FEAT_LSE128): New.
* config/linux/aarch64/atomic_16.S (LSE128): New macro
definition.
(libat_exchange_16): New LSE128 variant.
(libat_fetch_or_16): Likewise.
(libat_or_fetch_16): Likewise.
(libat_fetch_and_16): Likewise.
(libat_and_fetch_16): Likewise.
* config/linux/aarch64/host-config.h (IFUNC_COND_2): New.
(IFUNC_NCOND): Add operand size checking.
(has_lse2): Renamed from `ifunc1`.
(has_lse128): New.
(HAS_LSE128): Likewise.
* libatomic/configure.ac: Add call to LIBAT_TEST_FEAT_LSE128.
* configure (ac_subst_vars): Regenerated via autoreconf.
* libatomic/Makefile.in: Likewise.
* libatomic/auto-config.h.in: Likewise.
---
  libatomic/Makefile.am|   3 +
  libatomic/Makefile.in|   1 +
  libatomic/acinclude.m4   |  19 +++
  libatomic/auto-config.h.in   |   3 +
  libatomic/config/linux/aarch64/atomic_16.S   | 170 ++-
  libatomic/config/linux/aarch64/host-config.h |  27 ++-
  libatomic/configure  |  59 ++-
  libatomic/configure.ac   |   1 +
  8 files changed, 274 insertions(+), 9 deletions(-)

diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index c0b8dea5037..24e843db67d 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -130,6 +130,9 @@ libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix 
_$(s)_.lo,$(SIZEOBJS)))
  ## On a target-specific basis, include alternates to be selected by IFUNC.
  if HAVE_IFUNC
  if ARCH_AARCH64_LINUX
+if ARCH_AARCH64_HAVE_LSE128
+AM_CPPFLAGS = -DHAVE_FEAT_LSE128
+endif
  IFUNC_OPTIONS  = -march=armv8-a+lse
  libatomic_la_LIBADD += $(foreach s,$(SIZES),$(addsuffix 
_$(s)_1_.lo,$(SIZEOBJS)))
  libatomic_la_SOURCES += atomic_16.S
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index dc2330b91fd..cd48fa21334 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -452,6 +452,7 @@ M_SRC = $(firstword $(filter %/$(M_FILE), $(all_c_files)))
  libatomic_la_LIBADD = $(foreach s,$(SIZES),$(addsuffix \
_$(s)_.lo,$(SIZEOBJS))) $(am__append_1) $(am__append_3) \
$(am__append_4) $(am__append_5)
+@ARCH_AARCH64_HAVE_LSE128_TRUE@@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@AM_CPPFLAGS
 = -DHAVE_FEAT_LSE128
  @ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv8-a+lse
  @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=armv7-a+fp 
-DHAVE_KERNEL64
  @ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@IFUNC_OPTIONS = -march=i586
diff --git a/libatomic/acinclude.m4 b/libatomic/acinclude.m4
index f35ab5b60a5..4197db8f404 100644
--- a/libatomic/acinclude.m4
+++ b/libatomic/acinclude.m4
@@ -83,6 +83,25 @@ AC_DEFUN([LIBAT_TEST_ATOMIC_BUILTIN],[
])
  ])
  
+dnl

+dnl Test if the host assembler supports armv9.4-a LSE128 isns.
+dnl
+AC_DEFUN([LIBAT_TEST_FEAT_LSE128],[
+  AC_CACHE_CHECK([for armv9.4-a LSE128 insn support],
+[libat_cv_have_feat_lse128],[
+AC_LANG_CONFTEST([AC_LANG_PROGRAM([],[asm(".arch armv9-a+lse128")])])
+if AC_TRY_EVAL(ac_link); then
+  eval libat_cv_have_feat_lse128=yes
+else
+  eval libat_cv_have_feat_lse128=no
+fi
+rm -f conftest*
+  ])
+  LIBAT_DEFINE_YESNO([HAVE_FEAT_LSE128], [$libat_cv_have_feat_lse128],
+   [Have LSE128 support for 16 byte integers.])
+  AM_CONDITIONAL([ARCH_AARCH64_HAVE_LSE128], [test x$libat_cv_have_feat_lse128 
= xyes])
+])
+
  dnl
  dnl Test if we have __atomic_load and __atomic_store for mode $1, size $2
  dnl
diff --git 

Re: [PATCH] Add C intrinsics for scalar crypto extension

2023-11-29 Thread Christoph Müllner
On Mon, Nov 27, 2023 at 9:36 AM Liao Shihua  wrote:
>
> This patch adds C intrinsics for the scalar crypto extension.
> Because riscv-c-api 
> (https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44/files) includes 
> zbkb/zbkc/zbkx's
> intrinsics in the bit manipulation extension, this patch only supports zkn*/zks*'s 
> intrinsics.

Thanks for working on this!
Looking forward to seeing the second patch (covering bitmanip) soon as well!
A couple of comments can be found below.

>
> gcc/ChangeLog:
>
> * config.gcc: Add riscv_crypto.h
> * config/riscv/riscv_crypto.h: New file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zknd32.c: Use intrinsics instead of builtins.
> * gcc.target/riscv/zknd64.c: Likewise.
> * gcc.target/riscv/zkne32.c: Likewise.
> * gcc.target/riscv/zkne64.c: Likewise.
> * gcc.target/riscv/zknh-sha256-32.c: Likewise.
> * gcc.target/riscv/zknh-sha256-64.c: Likewise.
> * gcc.target/riscv/zknh-sha512-32.c: Likewise.
> * gcc.target/riscv/zknh-sha512-64.c: Likewise.
> * gcc.target/riscv/zksed32.c: Likewise.
> * gcc.target/riscv/zksed64.c: Likewise.
> * gcc.target/riscv/zksh32.c: Likewise.
> * gcc.target/riscv/zksh64.c: Likewise.
>
> ---
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/riscv_crypto.h   | 219 ++
>  gcc/testsuite/gcc.target/riscv/zknd32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zknd64.c   |  12 +-
>  gcc/testsuite/gcc.target/riscv/zkne32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zkne64.c   |  10 +-
>  .../gcc.target/riscv/zknh-sha256-32.c |  22 +-
>  .../gcc.target/riscv/zknh-sha256-64.c |  10 +-
>  .../gcc.target/riscv/zknh-sha512-32.c |  14 +-
>  .../gcc.target/riscv/zknh-sha512-64.c |  10 +-
>  gcc/testsuite/gcc.target/riscv/zksed32.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksed64.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh64.c   |   6 +-
>  14 files changed, 288 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/config/riscv/riscv_crypto.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b88591b6fd8..d67fe8b6a6f 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -548,7 +548,7 @@ riscv*)
> extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
> extra_objs="${extra_objs} thead.o riscv-target-attr.o"
> d_target_objs="riscv-d.o"
> -   extra_headers="riscv_vector.h"
> +   extra_headers="riscv_vector.h riscv_crypto.h"
> target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins.cc"
> target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins.h"
> ;;
> diff --git a/gcc/config/riscv/riscv_crypto.h b/gcc/config/riscv/riscv_crypto.h
> new file mode 100644
> index 000..149c1132e10
> --- /dev/null
> +++ b/gcc/config/riscv/riscv_crypto.h
> @@ -0,0 +1,219 @@
> +/* RISC-V 'K' Extension intrinsics include file.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published
> +   by the Free Software Foundation; either version 3, or (at your
> +   option) any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +   License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef __RISCV_CRYPTO_H
> +#define __RISCV_CRYPTO_H
> +
> +#include <stdint.h>
> +
> +#if defined (__cplusplus)
> +extern "C" {
> +#endif
> +
> +#if defined(__riscv_zknd)
> +#if __riscv_xlen == 32
> +#define __riscv_aes32dsi(x, y, bs) __builtin_riscv_aes32dsi(x, y, bs)
> +#define __riscv_aes32dsmi(x, y, bs) __builtin_riscv_aes32dsmi(x, y, bs)
> +#endif
> +
> +#if __riscv_xlen == 64
> +static __inline__ uint64_t __attribute__ ((__always_inline__, __nodebug__))
> +__riscv_aes64ds (uint64_t __x, uint64_t __y)
> +{
> +  return __builtin_riscv_aes64ds (__x, __y);
> +}

I don't understand why some intrinsic functions are implemented as
macros to builtins
and some are implemented as static inline wrappers around builtins.
Is there a particular reason that this 
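
For reference, the two wrapper styles in question can be compared side by side. This is a minimal sketch, not code from the patch: `fake_builtin_mix` is a hypothetical stand-in for a target builtin, and both wrapper names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for a target builtin (an assumption, not a GCC builtin).
static inline std::uint64_t
fake_builtin_mix (std::uint64_t x, std::uint64_t y)
{
  return (x ^ y) + 1;
}

// Style 1: macro.  The arguments are substituted textually, so a literal
// argument reaches the callee as a literal, and no separate function exists.
#define wrap_macro_mix(x, y) fake_builtin_mix ((x), (y))

// Style 2: static inline wrapper.  The arguments are converted to the
// declared parameter types, and the wrapper is a real function that is
// type-checked on its own.
static inline std::uint64_t __attribute__ ((__always_inline__))
wrap_inline_mix (std::uint64_t x, std::uint64_t y)
{
  return fake_builtin_mix (x, y);
}
```

Both forms return the same value for the same inputs; they differ only in how the arguments are checked and forwarded.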

RE: [PATCH 10/21]middle-end: implement relevancy analysis support for control flow

2023-11-29 Thread Richard Biener
On Mon, 27 Nov 2023, Tamar Christina wrote:

> Ping
> 
> > -Original Message-
> > From: Tamar Christina 
> > Sent: Monday, November 6, 2023 7:40 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com
> > Subject: [PATCH 10/21]middle-end: implement relevancy analysis support for
> > control flow
> > 
> > Hi All,
> > 
> > > This updates relevancy analysis to support marking gcond's belonging to
> > > early breaks as relevant for vectorization.
> > 
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > 
> > Ok for master?
> > 
> > Thanks,
> > Tamar
> > 
> > gcc/ChangeLog:
> > 
> > * tree-vect-stmts.cc (vect_stmt_relevant_p,
> > vect_mark_stmts_to_be_vectorized, vect_analyze_stmt,
> > vect_is_simple_use,
> > vect_get_vector_types_for_stmt): Support early breaks.
> > 
> > --- inline copy of patch --
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index
> > 4809b822632279493a843d402a833c9267bb315e..31474e923cc3feb2604
> > ca2882ecfb300cd211679 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -359,9 +359,14 @@ vect_stmt_relevant_p (stmt_vec_info stmt_info,
> > loop_vec_info loop_vinfo,
> >*live_p = false;
> > 
> >/* cond stmt other than loop exit cond.  */
> > -  if (is_ctrl_stmt (stmt_info->stmt)
> > -  && STMT_VINFO_TYPE (stmt_info) != loop_exit_ctrl_vec_info_type)
> > -*relevant = vect_used_in_scope;
> > +  gimple *stmt = STMT_VINFO_STMT (stmt_info);
> > > +  if (is_ctrl_stmt (stmt) && is_a <gcond *> (stmt))

is_ctrl_stmt (stmt) is redundant

> > +{
> > > +  gcond *cond = as_a <gcond *> (stmt);

in total better written as

   if (gcond *cond = dyn_cast <gcond *> (stmt))
     {
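
The point about folding the check and the cast into one step can be illustrated outside of GCC. The sketch below is an analogy only: it uses standard C++ `dynamic_cast` on a small hypothetical class hierarchy (`stmt_base`, `cond_stmt`, `call_stmt`), not GCC's own `is_a`/`as_a`/`dyn_cast` helpers.

```cpp
// Hypothetical hierarchy standing in for statement kinds.
struct stmt_base { virtual ~stmt_base () = default; };
struct cond_stmt : stmt_base { int code = 0; };
struct call_stmt : stmt_base { };

// Two-step form: test the dynamic type, then cast in a second statement.
int
cond_code_two_step (stmt_base *s)
{
  if (dynamic_cast<cond_stmt *> (s) != nullptr)
    {
      cond_stmt *c = static_cast<cond_stmt *> (s);
      return c->code;
    }
  return -1;
}

// One-step form: the cast result is declared in the condition, so the
// pointer is only in scope where it is known to be non-null.
int
cond_code_one_step (stmt_base *s)
{
  if (cond_stmt *c = dynamic_cast<cond_stmt *> (s))
    return c->code;
  return -1;
}
```

Both functions behave the same; the one-step form simply avoids stating the target type twice.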

> > +  if (LOOP_VINFO_LOOP_CONDS (loop_vinfo).contains (cond)

linear search ...

> > + && LOOP_VINFO_LOOP_IV_COND (loop_vinfo) != cond)
> > +   *relevant = vect_used_in_scope;

but why not simply mark all gconds as vect_used_in_scope?

> > +}
> > 
> >/* changing memory.  */
> >if (gimple_code (stmt_info->stmt) != GIMPLE_PHI) @@ -374,6 +379,11 @@
> > vect_stmt_relevant_p (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
> > *relevant = vect_used_in_scope;
> >}
> > 
> > +  auto_vec exits = get_loop_exit_edges (loop);  auto_bitmap
> > + exit_bbs;  for (edge exit : exits)

is it your mail client messing patches up?  missing line-break
again.

> > +bitmap_set_bit (exit_bbs, exit->dest->index);
> > +

you don't seem to use the bitmap?

> >/* uses outside the loop.  */
> >FOR_EACH_PHI_OR_STMT_DEF (def_p, stmt_info->stmt, op_iter,
> > SSA_OP_DEF)
> >  {
> > @@ -392,7 +402,6 @@ vect_stmt_relevant_p (stmt_vec_info stmt_info,
> > loop_vec_info loop_vinfo,
> >   /* We expect all such uses to be in the loop exit phis
> >  (because of loop closed form)   */
> >   gcc_assert (gimple_code (USE_STMT (use_p)) == GIMPLE_PHI);
> > - gcc_assert (bb == single_exit (loop)->dest);
> > 
> >*live_p = true;
> > }
> > @@ -793,6 +802,20 @@ vect_mark_stmts_to_be_vectorized (loop_vec_info
> > loop_vinfo, bool *fatal)
> > return res;
> > }
> >   }
> > +   }
> > > + else if (gcond *cond = dyn_cast <gcond *> (stmt_vinfo->stmt))
> > +   {
> > + enum tree_code rhs_code = gimple_cond_code (cond);
> > + gcc_assert (TREE_CODE_CLASS (rhs_code) == tcc_comparison);
> > + opt_result res
> > +   = process_use (stmt_vinfo, gimple_cond_lhs (cond),
> > +  loop_vinfo, relevant, , false);
> > + if (!res)
> > +   return res;
> > + res = process_use (stmt_vinfo, gimple_cond_rhs (cond),
> > +   loop_vinfo, relevant, , false);
> > + if (!res)
> > +   return res;
> >  }

I guess we're missing an

  else
gcc_unreachable ();

to catch not handled stmt kinds (do we have gcond patterns yet?)

> > >   else if (gcall *call = dyn_cast <gcall *> (stmt_vinfo->stmt))
> > {
> > @@ -13043,11 +13066,15 @@ vect_analyze_stmt (vec_info *vinfo,
> >  node_instance, cost_vec);
> >if (!res)
> > return res;
> > -   }
> > +}
> > +
> > +  if (is_ctrl_stmt (stmt_info->stmt))
> > +STMT_VINFO_DEF_TYPE (stmt_info) = vect_early_exit_def;

I think it should rather be vect_condition_def.  It's also not
this functions business to set STMT_VINFO_DEF_TYPE.  If we ever
get to handle not if-converted code (or BB vectorization of that)
then a gcond would define the mask stmts are under.

> >switch (STMT_VINFO_DEF_TYPE (stmt_info))
> >  {
> >case vect_internal_def:
> > +  case vect_early_exit_def:
> >  break;
> > 
> >case vect_reduction_def:
> > @@ -13080,6 +13107,7 @@ vect_analyze_stmt (vec_info *vinfo,
> >  {
> > >gcall *call = dyn_cast <gcall *> (stmt_info->stmt);
> >gcc_assert (STMT_VINFO_VECTYPE (stmt_info)
> > +  

RE: [PATCH v4] [tree-optimization/110279] Consider FMA in get_reassociation_width

2023-11-29 Thread Di Zhao OS
> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, November 21, 2023 9:01 PM
> To: Di Zhao OS 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
> 
> On Thu, Nov 9, 2023 at 6:53 PM Di Zhao OS 
> wrote:
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Tuesday, October 31, 2023 9:48 PM
> > > To: Di Zhao OS 
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> > > get_reassociation_width
> > >
> > > On Sun, Oct 8, 2023 at 6:40 PM Di Zhao OS 
> > > wrote:
> > > >
> > > > Attached is a new version of the patch.
> > > >
> > > > > -Original Message-
> > > > > From: Richard Biener 
> > > > > Sent: Friday, October 6, 2023 5:33 PM
> > > > > To: Di Zhao OS 
> > > > > Cc: gcc-patches@gcc.gnu.org
> > > > > Subject: Re: [PATCH v4] [tree-optimization/110279] Consider FMA in
> > > > > get_reassociation_width
> > > > >
> > > > > On Thu, Sep 14, 2023 at 2:43 PM Di Zhao OS
> > > > >  wrote:
> > > > > >
> > > > > > This is a new version of the patch on "nested FMA".
> > > > > > Sorry for updating this after so long, I've been studying and
> > > > > > writing micro cases to sort out the cause of the regression.
> > > > >
> > > > > Sorry for taking so long to reply.
> > > > >
> > > > > > First, following previous discussion:
> > > > > > > (https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629080.html)
> > > > > >
> > > > > > 1. From testing more altered cases, I don't think the
> > > > > > problem is that reassociation works locally. In that:
> > > > > >
> > > > > >   1) On the example with multiplications:
> > > > > >
> > > > > > tmp1 = a + c * c + d * d + x * y;
> > > > > > tmp2 = x * tmp1;
> > > > > > result += (a + c + d + tmp2);
> > > > > >
> > > > > >   Given "result" rewritten by width=2, the performance is
> > > > > >   worse if we rewrite "tmp1" with width=2. In contrast, if we
> > > > > >   remove the multiplications from the example (and make "tmp1"
> > > > > >   not singe used), and still rewrite "result" by width=2, then
> > > > > >   rewriting "tmp1" with width=2 is better. (Make sense because
> > > > > >   the tree's depth at "result" is still smaller if we rewrite
> > > > > >   "tmp1".)
> > > > > >
> > > > > >   2) I tried to modify the assembly code of the example without
> > > > > >   FMA, so the width of "result" is 4. On Ampere1 there's no
> > > > > >   obvious improvement. So although this is an interesting
> > > > > >   problem, it doesn't seem like the cause of the regression.
> > > > >
> > > > > OK, I see.
> > > > >
> > > > > > 2. From assembly code of the case with FMA, one problem is
> > > > > > that, rewriting "tmp1" to parallel didn't decrease the
> > > > > > minimum CPU cycles (taking MULT_EXPRs into account), but
> > > > > > increased code size, so the overhead is increased.
> > > > > >
> > > > > >a) When "tmp1" is not re-written to parallel:
> > > > > > fmadd d31, d2, d2, d30
> > > > > > fmadd d31, d3, d3, d31
> > > > > > fmadd d31, d4, d5, d31  //"tmp1"
> > > > > > fmadd d31, d31, d4, d3
> > > > > >
> > > > > >b) When "tmp1" is re-written to parallel:
> > > > > > fmul  d31, d4, d5
> > > > > > fmadd d27, d2, d2, d30
> > > > > > fmadd d31, d3, d3, d31
> > > > > > fadd  d31, d31, d27 //"tmp1"
> > > > > > fmadd d31, d31, d4, d3
> > > > > >
> > > > > > For version a), there are 3 dependent FMAs to calculate "tmp1".
> > > > > > For version b), there are also 3 dependent instructions in the
> > > > > > longer path: the 1st, 3rd and 4th.
> > > > >
> > > > > Yes, it doesn't really change anything.  The patch has
> > > > >
> > > > > > +  /* If there's code like "acc = a * b + c * d + acc" in a tight loop, some
> > > > > > + uarchs can execute results like:
> > > > > +
> > > > > +   _1 = a * b;
> > > > > +   _2 = .FMA (c, d, _1);
> > > > > +   acc_1 = acc_0 + _2;
> > > > > +
> > > > > + in parallel, while turning it into
> > > > > +
> > > > > +   _1 = .FMA(a, b, acc_0);
> > > > > +   acc_1 = .FMA(c, d, _1);
> > > > > +
> > > > > > + hinders that, because then the first FMA depends on the result of preceding
> > > > > > + iteration.  */
> > > > >
> > > > > I can't see what can be run in parallel for the first case.  The .FMA
> > > > > depends on the multiplication a * b.  Iff the uarch somehow decomposes
> > > > > .FMA into multiply + add then the c * d multiply could run in parallel
> > > > > with the a * b multiply which _might_ be able to hide some of the
> > > > > latency of the full .FMA.  Like on x86 Zen FMA has a latency of 4
> > > > > cycles but a multiply only 3.  But I never got confirmation from any
> > > > > of the CPU designers that .FMAs are issued when the multiply
> > > > > operands are ready and the add operand can be forwarded.
> > > > >
> 
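
The two shapes discussed in the quoted patch comment can be written out in plain C++. This is only a sketch under stated assumptions: `std::fma` stands in for the internal `.FMA` function, and the function names `acc_mul_fma_add` and `acc_chained_fma` are made up here. It shows the data dependencies, not the timing on any particular CPU.

```cpp
#include <cmath>

// Form 1: _1 = a * b; _2 = .FMA (c, d, _1); acc_1 = acc_0 + _2.
// Only the final addition uses the incoming accumulator.
double
acc_mul_fma_add (double a, double b, double c, double d, double acc)
{
  double t1 = a * b;
  double t2 = std::fma (c, d, t1);
  return acc + t2;
}

// Form 2: _1 = .FMA (a, b, acc_0); acc_1 = .FMA (c, d, _1).
// Both FMAs sit on the chain that starts at the incoming accumulator.
double
acc_chained_fma (double a, double b, double c, double d, double acc)
{
  double t1 = std::fma (a, b, acc);
  return std::fma (c, d, t1);
}
```

In a loop, form 1 leaves one addition on the loop-carried chain through the accumulator, while form 2 puts two dependent FMAs on it. Whether that difference shows up in practice depends on how the hardware issues FMAs, which is the open question in the quoted discussion.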

Re: [PATCH] libiberty: Disable hwcaps for sha1.o

2023-11-29 Thread Jakub Jelinek
On Wed, Nov 29, 2023 at 03:10:00PM +0100, Rainer Orth wrote:
> 2023-11-29  Rainer Orth  
> 
>   config:
>   * hwcaps.m4 (GCC_CHECK_ASSEMBLER_HWCAP): Require
>   AC_CANONICAL_TARGET.
> 
>   libiberty:
>   * configure.ac (GCC_CHECK_ASSEMBLER_HWCAP): Invoke.
>   * configure, aclocal.m4: Regenerate.
>   * Makefile.in (COMPILE.c): Add HWCAP_CFLAGS.

Ok, thanks.

Jakub



In 'libgomp.c/target-simd-clone-{1,2,3}.c', restrict 'scan-offload-ipa-dump's to 'only_for_offload_target amdgcn-amdhsa' (was: [PATCH v4] OpenMP: Generate SIMD clones for functions with "declare targe

2023-11-29 Thread Thomas Schwinge
Hi!

On 2022-11-14T21:46:15-0700, Sandra Loosemore via Gcc-patches 
 wrote:
> [...] I've added infrastructure to support testing on the offload
> compiler, added new test cases, and reworked the existing test cases to
> scan for interesting things written to the dump file instead of
> examining the .s output.

Thanks!  (..., belatedly.  I think it was me who suggested that.)

Just one minor fix-up:

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c/target-simd-clone-1.c
> @@ -0,0 +1,43 @@
> +/* { dg-do link { target { offload_target_amdgcn } } } */

This means, the test case is active if GCN offloading compilation is
enabled.  But, consider the case that nvptx offloading compilation also
is enabled:

> +/* { dg-additional-options "-O2 -foffload-options=-fdump-ipa-simdclone-details" } */

This requests dump files for both GCN and nvptx, separately.
However, this isn't applicable for nvptx, thus no dump file is produced for
that.

> +[...]
> +/* { dg-final { scan-offload-ipa-dump "Generated local clone _ZGV.*N.*_addit" "simdclone" } } */
> +/* { dg-final { scan-offload-ipa-dump "Generated local clone _ZGV.*M.*_addit" "simdclone" } } */

..., and this will try to scan dump files for both GCN and nvptx.  The
latter don't exist, resulting in UNRESOLVEDs for nvptx.  I've pushed to
master branch commit 4c909c6ee381a43081d68abc1ff8a35ce20d24d9
"In 'libgomp.c/target-simd-clone-{1,2,3}.c', restrict 'scan-offload-ipa-dump's 
to 'only_for_offload_target amdgcn-amdhsa'",
see attached.

(This obviously depends on

"testsuite: Add 'only_for_offload_target' wrapper for 'scan-offload-tree-dump' 
etc.",
which I've also just pushed to master branch in
commit 27c79b91f6008a21006d4e7053a98e63f2990bb2.)


Grüße
 Thomas


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c/target-simd-clone-2.c
> @@ -0,0 +1,39 @@
> +/* { dg-do link { target { offload_target_amdgcn } } } */
> +/* { dg-additional-options "-foffload-options=-fdump-ipa-simdclone-details -foffload-options=-fno-openmp-target-simd-clone" } */
> +[...]
> +/* { dg-final { scan-offload-ipa-dump-not "Generated .* clone" "simdclone" } } */

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c/target-simd-clone-3.c
> @@ -0,0 +1,40 @@
> +/* { dg-do link { target { offload_target_amdgcn } } } */
> +/* { dg-additional-options "-O2 -foffload-options=-fdump-ipa-simdclone-details" } */
> +[...]
> +/* { dg-final { scan-offload-ipa-dump "device doesn't match" "simdclone" { target x86_64-*-* } } } */
> +/* { dg-final { scan-offload-ipa-dump-not "Generated .* clone" "simdclone" { target x86_64-*-* } } } */


From 4c909c6ee381a43081d68abc1ff8a35ce20d24d9 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 21 Nov 2023 20:20:21 +0100
Subject: [PATCH] In 'libgomp.c/target-simd-clone-{1,2,3}.c', restrict
 'scan-offload-ipa-dump's to 'only_for_offload_target amdgcn-amdhsa'

This gets rid of UNRESOLVEDs if nvptx offloading compilation is enabled in
addition to GCN:

 PASS: libgomp.c/target-simd-clone-1.c (test for excess errors)
 PASS: libgomp.c/target-simd-clone-1.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "Generated local clone _ZGV.*N.*_addit"
-UNRESOLVED: libgomp.c/target-simd-clone-1.c scan-nvptx-none-offload-ipa-dump simdclone "Generated local clone _ZGV.*N.*_addit"
 PASS: libgomp.c/target-simd-clone-1.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "Generated local clone _ZGV.*M.*_addit"
-UNRESOLVED: libgomp.c/target-simd-clone-1.c scan-nvptx-none-offload-ipa-dump simdclone "Generated local clone _ZGV.*M.*_addit"
 PASS: libgomp.c/target-simd-clone-2.c (test for excess errors)
 PASS: libgomp.c/target-simd-clone-2.c scan-amdgcn-amdhsa-offload-ipa-dump-not simdclone "Generated .* clone"
-UNRESOLVED: libgomp.c/target-simd-clone-2.c scan-nvptx-none-offload-ipa-dump-not simdclone "Generated .* clone"
 PASS: libgomp.c/target-simd-clone-3.c (test for excess errors)
 PASS: libgomp.c/target-simd-clone-3.c scan-amdgcn-amdhsa-offload-ipa-dump simdclone "device doesn't match"
-UNRESOLVED: libgomp.c/target-simd-clone-3.c scan-nvptx-none-offload-ipa-dump simdclone "device doesn't match"
 PASS: libgomp.c/target-simd-clone-3.c scan-amdgcn-amdhsa-offload-ipa-dump-not simdclone "Generated .* clone"
-UNRESOLVED: libgomp.c/target-simd-clone-3.c scan-nvptx-none-offload-ipa-dump-not simdclone "Generated .* clone"

Minor fix-up for commit 309e2d95e3b930c6f15c8a5346b913158404c76d
'OpenMP: Generate SIMD clones for functions with "declare target"'.

	libgomp/
	* testsuite/libgomp.c/target-simd-clone-1.c: Restrict
	'scan-offload-ipa-dump's to
	'only_for_offload_target amdgcn-amdhsa'.
	* 

Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp

2023-11-29 Thread Ajit Agarwal
Hello All:

I am working on fixing the below issues and incorporating comments from Kewen
and Michael.

Thanks & Regards
Ajit

On 28/11/23 9:11 pm, Michael Meissner wrote:
> On Tue, Nov 28, 2023 at 05:44:43PM +0800, Kewen.Lin wrote:
>> on 2023/11/28 15:05, Michael Meissner wrote:
>>> I tried using this patch to compare with the vector size attribute patch I
>>> posted.  I could not build it as a cross compiler on my x86_64 because the
>>> assembler gives the following error:
>>>
>>> Error: operand out of domain (11 is not a multiple of 2) for
>>> std_stacktrace-elf.o.  If you look at the assembler, it has combined a lxvp
>>> 11 and lxvp 12 into:
>>>
>>> lxvp 11,0(9)
>>>
>>> The powerpc architecture requires that registers that are loaded with load
>>> vector pair and stored with store vector pair instructions only load/store
>>> even/odd register pairs, and not odd/even pairs.  Unfortunately, it will
>>> mean that this optimization will match less often.
>>>
>>
>> Yes, the current implementation needs some refinements, as comments in [1]:
>>
>>> Besides, it seems a bad idea to put this pass after reload? as register
>>> allocation finishes, this pairing has to be restricted by the reg No. (I
>>> didn't see any checking on the reg No. relationship for pairing btw.)
>>>
>>> Looking forward to the comments from Segher/David/Peter/Mike etc.
>>
>> I wonder if we should consider running such pass before reload instead.
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638070.html
>>
>> BR,
>> Kewen
> 
> If I add code to check if the target register is even, then the following
> fails:
> 
> /home/meissner/fsf-src/work148-ajit/libquadmath/math/erfq.c: In function ‘erfcq’:
> /home/meissner/fsf-src/work148-ajit/libquadmath/math/erfq.c:943:1: error: insn does not satisfy its constraints:
>   943 | }
>   | ^
> (insn 1087 1939 1088 66 (set (reg/v:KF 74 10 [orig:643 y ] [643])
> (fma:KF (reg/v:KF 64 0 [orig:153 z ] [153])
> (reg/v:KF 65 1 [orig:639 y ] [639])
> (reg:KF 76 12 [orig:642 MEM[(const _Float128 *)p_276 + 16B] ] [642]))) "/home/meissner/fsf-src/work148-ajit/libquadmath/math/erfq.c":112:9 1004 {fmakf4_hw}
>  (expr_list:REG_DEAD (reg/v:KF 65 1 [orig:639 y ] [639])
> (nil)))
> 
> In particular, the IEEE 128-bit arithmetic functions require Altivec
> registers.  So we would need to make sure the new insns all meet their
> constraints.
> 
> I tend to think that it would be desirable to do it before reload.  But then
> we will need to check if extra moves are generated.  I suspect we will need
> Peter's patch to allow 128-bit types that are subregs of OOmode.  I.e., the
> code generated would change:
> 
>   (set (reg:MODE1 tmp-reg)
>(mem ...+8))
> 
>   (set (reg:MODE2 tmp-reg+1)
>(mem ...))
> 
> to:
> 
>   (set (reg:OO vp-reg)
>(mem ...))
> 
>   (set (reg:MODE1 tmp-reg)
>(subreg:MODE1 (reg:OO vp-reg) 0))
> 
>   (set (reg:MODE2 tmp-reg+1)
>(subreg:MODE2 (reg:OO vp-reg) 16))
> 
> Note, I may have the offsets and register numbers backwards in terms of
> endian.
> 
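
The even/odd constraint described in the quoted message can be expressed as a small predicate. This is a hedged sketch, not code from the pass: `can_form_vector_pair` is a hypothetical helper, the register numbers are plain integers, and it assumes the pair is named by its first register, which must be even, in line with the assembler error quoted above.

```cpp
// Hypothetical helper: true if FIRST and SECOND could be the two halves
// of one load/store vector pair, i.e. FIRST is even and SECOND follows it.
bool
can_form_vector_pair (unsigned first, unsigned second)
{
  return first % 2 == 0 && second == first + 1;
}
```

With this check, registers 11 and 12 are rejected because 11 is odd, while 12 and 13 are accepted. A pass that runs after register allocation would need a test of this kind before combining two loads.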


[PATCH] libiberty: Disable hwcaps for sha1.o

2023-11-29 Thread Rainer Orth
This patch

commit bf4f40cc3195eb7b900bf5535cdba1ee51fdbb8e
Author: Jakub Jelinek 
Date:   Tue Nov 28 13:14:05 2023 +0100

libiberty: Use x86 HW optimized sha1

broke Solaris/x86 bootstrap with the native as:

libtool: compile:  /var/gcc/regression/master/11.4-gcc/build/./gcc/gccgo 
-B/var/gcc/regression/master/11.4-gcc/build/./gcc/ 
-B/vol/gcc/i386-pc-solaris2.11/bin/ -B/vol/gcc/i386-pc-solaris2.11/lib/ 
-isystem /vol/gcc/i386-pc-solaris2.11/include -isystem 
/vol/gcc/i386-pc-solaris2.11/sys-include -fchecking=1 -minline-all-stringops 
-O2 -g -I . -c -fgo-pkgpath=internal/goarch 
/vol/gcc/src/hg/master/local/libgo/go/internal/goarch/goarch.go zgoarch.go
ld.so.1: go1: fatal: /var/gcc/regression/master/11.4-gcc/build/gcc/go1: 
hardware capability (CA_SUNW_HW_2) unsupported: 0x400  [ SHA1 ]
gccgo: fatal error: Killed signal terminated program go1

As is already done in a couple of other similar cases, this patch
disables hwcaps support for libiberty.

Initially, this didn't work because config/hwcaps.m4 uses target_os, but
didn't ensure it is defined.

Tested on i386-pc-solaris2.11 with as and gas (gcc build completed, make
check still running).  binutils-gdb build is currently broken in
gdb/procfs.c, unfortunately.

Ok for both gcc and binutils?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2023-11-29  Rainer Orth  

config:
* hwcaps.m4 (GCC_CHECK_ASSEMBLER_HWCAP): Require
AC_CANONICAL_TARGET.

libiberty:
* configure.ac (GCC_CHECK_ASSEMBLER_HWCAP): Invoke.
* configure, aclocal.m4: Regenerate.
* Makefile.in (COMPILE.c): Add HWCAP_CFLAGS.

# HG changeset patch
# Parent  fd492da0442de0bcc8afbf1aa71957f3e05afdb7
libiberty: Disable hwcaps for sha1.o

diff --git a/config/hwcaps.m4 b/config/hwcaps.m4
--- a/config/hwcaps.m4
+++ b/config/hwcaps.m4
@@ -7,6 +7,7 @@ dnl  HWCAP_CFLAGS='-Wa,-nH' if possible.
 dnl
 AC_DEFUN([GCC_CHECK_ASSEMBLER_HWCAP], [
   test -z "$HWCAP_CFLAGS" && HWCAP_CFLAGS=''
+  AC_REQUIRE([AC_CANONICAL_TARGET])
 
   # Restrict the test to Solaris, other assemblers (e.g. AIX as) have -nH
   # with a different meaning.
diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in
--- a/libiberty/Makefile.in
+++ b/libiberty/Makefile.in
@@ -114,7 +114,7 @@ INCDIR=$(srcdir)/$(MULTISRCTOP)../includ
 
 COMPILE.c = $(CC) -c @DEFS@ $(CFLAGS) $(CPPFLAGS) -I. -I$(INCDIR) \
$(HDEFINES) @ac_libiberty_warn_cflags@ -D_GNU_SOURCE \
-   @CET_HOST_FLAGS@
+   @CET_HOST_FLAGS@ @HWCAP_CFLAGS@
 
 # Just to make sure we don't use a built-in rule with VPATH
 .c.$(objext):
diff --git a/libiberty/configure.ac b/libiberty/configure.ac
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -265,6 +265,8 @@ AC_SUBST(NOASANFLAG)
 GCC_CET_HOST_FLAGS(CET_HOST_FLAGS)
 AC_SUBST(CET_HOST_FLAGS)
 
+GCC_CHECK_ASSEMBLER_HWCAP
+
 echo "# Warning: this fragment is automatically generated" > temp-frag
 
 if [[ -n "${frag}" ]] && [[ -f "${frag}" ]]; then


Re: GCN: Generally enable the 'gcc.target/gcn/avgpr-[...]' test cases

2023-11-29 Thread Andrew Stubbs




On 29/11/2023 13:44, Thomas Schwinge wrote:

Hi!

On 2023-11-15T14:10:47+, Andrew Stubbs  wrote:

   * gcc.target/gcn/avgpr-mem-double.c: New test.
   * gcc.target/gcn/avgpr-mem-int.c: New test.
   * gcc.target/gcn/avgpr-mem-long.c: New test.
   * gcc.target/gcn/avgpr-mem-short.c: New test.
   * gcc.target/gcn/avgpr-spill-double.c: New test.
   * gcc.target/gcn/avgpr-spill-int.c: New test.
   * gcc.target/gcn/avgpr-spill-long.c: New test.
   * gcc.target/gcn/avgpr-spill-short.c: New test.



--- /dev/null
+++ b/gcc/testsuite/gcc.target/gcn/avgpr-mem-double.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=gfx90a -O1" } */
+/* { dg-skip-if "incompatible ISA" { *-*-* } { "-march=gfx90[068]" } } */
+[...]


Etc.

OK to push the attached
"GCN: Generally enable the 'gcc.target/gcn/avgpr-[...]' test cases"?


Assuming that the (internal) Dejagnu Bug has been fixed, this is OK (on 
both mainline and OG13).


Andrew


Re: [PATCH] testsuite, i386: Only check for cfi directives if supported [PR112729]

2023-11-29 Thread Rainer Orth
Rainer Orth  writes:

> gcc.target/i386/apx-interrupt-1.c and two more tests FAIL on Solaris/x86
> with the native assembler.  Like Darwin as, it doesn't support cfi
> directives.  Instead of adding more and more targets in every affected
> test, this patch introduces a cfi effective-target keyword to check for
> the prerequisite.
>
> Tested on i386-pc-solaris2.11 (as and gas), x86_64-pc-linux-gnu, and
> x86_64-apple-darwin23.1.0.
>
> Any comments on the CFI detection in target-supports.exp?  Otherwise,
> I'll commit the patch as is.

Given that nobody found fault with the CFI detection, I've rebased the
patch to account for the -fomit-frame-pointer changes, retested and
committed it to trunk.

> The tests still FAIL on Solaris/x86 and FreeBSD/x86_64 with gas due to
> their -fno-omit-frame-pointer default; this will be addressed
> separately.

This has been handled now.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


# HG changeset patch
# Parent  0314b8e4604293f389540a558f0c9580f957aaa1
testsuite, i386: Only check for cfi directives if supported

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2671,6 +2671,9 @@ The language for the compiler under test
 @item c99_runtime
 Target provides a full C99 runtime.
 
+@item cfi
+Target supports DWARF CFI directives.
+
 @item correct_iso_cpp_string_wchar_protos
 Target @code{string.h} and @code{wchar.h} headers provide C++ required
 overloads for @code{strchr} etc. functions.
diff --git a/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c b/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c
--- a/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c
+++ b/gcc/testsuite/gcc.target/i386/apx-interrupt-1.c
@@ -1,6 +1,5 @@
-/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-do compile { target { { ! ia32 } && cfi } } } */
 /* { dg-options "-mapx-features=egpr -m64 -O2 -mgeneral-regs-only -mno-cld -mno-push-args -maccumulate-outgoing-args -fomit-frame-pointer" } */
-/* { dg-skip-if "does not emit .cfi_xxx" "*-*-darwin*" } */
 
 extern void foo (void *) __attribute__ ((interrupt));
 extern int bar (int);
diff --git a/gcc/testsuite/gcc.target/i386/apx-push2pop2-1.c b/gcc/testsuite/gcc.target/i386/apx-push2pop2-1.c
--- a/gcc/testsuite/gcc.target/i386/apx-push2pop2-1.c
+++ b/gcc/testsuite/gcc.target/i386/apx-push2pop2-1.c
@@ -1,6 +1,5 @@
-/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-do compile { target { { ! ia32 } && cfi } } } */
 /* { dg-options "-O2 -mapx-features=push2pop2 -fomit-frame-pointer" } */
-/* { dg-skip-if "does not emit .cfi_xxx" "*-*-darwin*" } */
 
 extern int bar (int);
 
diff --git a/gcc/testsuite/gcc.target/i386/apx-push2pop2_force_drap-1.c b/gcc/testsuite/gcc.target/i386/apx-push2pop2_force_drap-1.c
--- a/gcc/testsuite/gcc.target/i386/apx-push2pop2_force_drap-1.c
+++ b/gcc/testsuite/gcc.target/i386/apx-push2pop2_force_drap-1.c
@@ -1,6 +1,5 @@
-/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-do compile { target { { ! ia32 } && cfi } } } */
 /* { dg-options "-O2 -mapx-features=push2pop2 -fomit-frame-pointer -mforce-drap" } */
-/* { dg-skip-if "does not emit .cfi_xxx" "*-*-darwin*" } */
 
 #include "apx-push2pop2-1.c"
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -10082,6 +10082,18 @@ proc check_effective_target_c99_runtime 
 }]
 }
 
+# Return 1 if the target supports DWARF CFI directives.
+
+proc check_effective_target_cfi { } {
+return [check_no_compiler_messages cfi assembly {
+	#ifdef __GCC_HAVE_DWARF2_CFI_ASM
+/* ok */
+	#else
+	#error unsupported
+	#endif
+} ""]
+}
+
 # Return 1 if the target provides the D runtime.
 
 proc check_effective_target_d_runtime { } {
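
The new effective-target check keys off the `__GCC_HAVE_DWARF2_CFI_ASM` preprocessor macro. The same test can be made from ordinary source code; the sketch below is illustrative only, and `have_dwarf2_cfi_asm` is a name invented here.

```cpp
// Reports at compile time whether the compiler defined
// __GCC_HAVE_DWARF2_CFI_ASM for this translation unit.
constexpr bool
have_dwarf2_cfi_asm ()
{
#ifdef __GCC_HAVE_DWARF2_CFI_ASM
  return true;
#else
  return false;
#endif
}
```

The result depends on the compiler and assembler configuration in use, so no particular value is expected here.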


RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

2023-11-29 Thread Richard Biener
On Mon, 27 Nov 2023, Tamar Christina wrote:

> Ping
> 
> > -Original Message-
> > From: Tamar Christina 
> > Sent: Monday, November 6, 2023 7:40 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com
> > Subject: [PATCH 9/21]middle-end: implement vectorizable_early_exit for
> > codegen of exit code
> > 
> > Hi All,
> > 
> > > This implements vectorizable_early_exit which is used as the codegen part of
> > > vectorizing a gcond.
> > > 
> > > For the most part it shares the majority of the code with
> > > vectorizable_comparison with the addition that it needs to be able to reduce
> > > multiple resulting statements into a single one for use in the gcond, and
> > > also needs to be able to perform masking on the comparisons.
> > 
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > 
> > Ok for master?
> > 
> > Thanks,
> > Tamar
> > 
> > gcc/ChangeLog:
> > 
> > * tree-vect-stmts.cc (vectorizable_comparison_1): Support stmts
> > without
> > lhs.
> > (vectorizable_early_exit): New.
> > (vect_analyze_stmt, vect_transform_stmt): Use it.
> > (vect_is_simple_use, vect_get_vector_types_for_stmt): Support
> > gcond.
> > 
> > --- inline copy of patch --
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index
> > 36aeca60a22cfaea8d3b43348000d75de1d525c7..4809b822632279493a84
> > 3d402a833c9267bb315e 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -12475,7 +12475,7 @@ vectorizable_comparison_1 (vec_info *vinfo,
> > tree vectype,
> >vec vec_oprnds0 = vNULL;
> >vec vec_oprnds1 = vNULL;
> >tree mask_type;
> > -  tree mask;
> > +  tree mask = NULL_TREE;
> > 
> >if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> >  return false;
> > @@ -12615,8 +12615,9 @@ vectorizable_comparison_1 (vec_info *vinfo,
> > tree vectype,
> >/* Transform.  */
> > 
> >/* Handle def.  */
> > -  lhs = gimple_assign_lhs (STMT_VINFO_STMT (stmt_info));
> > -  mask = vect_create_destination_var (lhs, mask_type);
> > +  lhs = gimple_get_lhs (STMT_VINFO_STMT (stmt_info));  if (lhs)
> > +mask = vect_create_destination_var (lhs, mask_type);

wrecked line-break / white-space

> > 
> >vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
> >  rhs1, &vec_oprnds0, vectype,
> > @@ -12630,7 +12631,10 @@ vectorizable_comparison_1 (vec_info *vinfo,
> > tree vectype,
> >gimple *new_stmt;
> >vec_rhs2 = vec_oprnds1[i];
> > 
> > -  new_temp = make_ssa_name (mask);
> > +  if (lhs)
> > +   new_temp = make_ssa_name (mask);
> > +  else
> > +   new_temp = make_temp_ssa_name (mask_type, NULL, "cmp");
> >if (bitop1 == NOP_EXPR)
> > {
> >   new_stmt = gimple_build_assign (new_temp, code, @@ -12709,6
> > +12713,196 @@ vectorizable_comparison (vec_info *vinfo,
> >return true;
> >  }
> > 
> > +/* Check to see if the current early break given in STMT_INFO is valid for
> > +   vectorization.  */
> > +
> > +static bool
> > +vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
> > +gimple_stmt_iterator *gsi, gimple **vec_stmt,
> > +slp_tree slp_node, stmt_vector_for_cost *cost_vec) {

{ goes to the next line

> > +  loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
> > +  if (!loop_vinfo
> > +  || !is_a <gcond *> (STMT_VINFO_STMT (stmt_info)))
> > +return false;
> > +
> > +  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_early_exit_def)
> > +return false;
> > +
> > +  if (!STMT_VINFO_RELEVANT_P (stmt_info))
> > +return false;
> > +
> > +  gimple_match_op op;
> > +  if (!gimple_extract_op (stmt_info->stmt, &op))
> > +gcc_unreachable ();
> > +  gcc_assert (op.code.is_tree_code ());  auto code = tree_code
> > + (op.code);

missed line break

> > +
> > +  tree vectype_out = STMT_VINFO_VECTYPE (stmt_info);  gcc_assert
> > + (vectype_out);

likewise.

> > +  tree var_op = op.ops[0];
> > +
> > +  /* When vectorizing things like pointer comparisons we will assume that
> > + the VF of both operands are the same. e.g. a pointer must be compared
> > + to a pointer.  We'll leave this up to vectorizable_comparison_1 to
> > + check further.  */
> > +  tree vectype_op = vectype_out;
> > +  if (SSA_VAR_P (var_op))

TREE_CODE (var_op) == SSA_NAME

> > +{
> > +  stmt_vec_info operand0_info
> > +   = loop_vinfo->lookup_stmt (SSA_NAME_DEF_STMT (var_op));

lookup_def (var_op)

> > +  if (!operand0_info)
> > +   return false;
> > +
> > +  /* If we're in a pattern get the type of the original statement.  */
> > +  if (STMT_VINFO_IN_PATTERN_P (operand0_info))
> > +   operand0_info = STMT_VINFO_RELATED_STMT (operand0_info);
> > +  vectype_op = STMT_VINFO_VECTYPE (operand0_info);
> > +}

I think you want to use vect_is_simple_use on var_op instead, that's
the canonical way for querying operands.

> > +
> > +  tree truth_type = truth_type_for (vectype_op);  machine_mode mode =
> > + TYPE_MODE 

GCN: Generally enable the 'gcc.target/gcn/avgpr-[...]' test cases (was: [committed] amdgcn: Add Accelerator VGPR registers)

2023-11-29 Thread Thomas Schwinge
Hi!

On 2023-11-15T14:10:47+, Andrew Stubbs  wrote:
>   * gcc.target/gcn/avgpr-mem-double.c: New test.
>   * gcc.target/gcn/avgpr-mem-int.c: New test.
>   * gcc.target/gcn/avgpr-mem-long.c: New test.
>   * gcc.target/gcn/avgpr-mem-short.c: New test.
>   * gcc.target/gcn/avgpr-spill-double.c: New test.
>   * gcc.target/gcn/avgpr-spill-int.c: New test.
>   * gcc.target/gcn/avgpr-spill-long.c: New test.
>   * gcc.target/gcn/avgpr-spill-short.c: New test.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/gcn/avgpr-mem-double.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=gfx90a -O1" } */
> +/* { dg-skip-if "incompatible ISA" { *-*-* } { "-march=gfx90[068]" } } */
> +[...]

Etc.

OK to push the attached
"GCN: Generally enable the 'gcc.target/gcn/avgpr-[...]' test cases"?


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From a21b6768b2267cf831089ea2c950c0d77408b1bf Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 16 Nov 2023 23:17:36 +0100
Subject: [PATCH] GCN: Generally enable the 'gcc.target/gcn/avgpr-[...]' test
 cases

... added in commit ae0d2c240213c5a7f6959c032bfc9f0703cab787
"amdgcn: Add Accelerator VGPR registers".  This way, they're correctly tested
no matter what '-march=[...]' is used with 'make check'.

	gcc/testsuite/
	* gcc.target/gcn/avgpr-mem-double.c: Remove
	'dg-skip-if "incompatible ISA" [...]'.
	* gcc.target/gcn/avgpr-mem-int.c: Likewise.
	* gcc.target/gcn/avgpr-mem-long.c: Likewise.
	* gcc.target/gcn/avgpr-mem-short.c: Likewise.
	* gcc.target/gcn/avgpr-spill-double.c: Likewise.
	* gcc.target/gcn/avgpr-spill-int.c: Likewise.
	* gcc.target/gcn/avgpr-spill-long.c: Likewise.
	* gcc.target/gcn/avgpr-spill-short.c: Likewise.
---
 gcc/testsuite/gcc.target/gcn/avgpr-mem-double.c   | 1 -
 gcc/testsuite/gcc.target/gcn/avgpr-mem-int.c  | 1 -
 gcc/testsuite/gcc.target/gcn/avgpr-mem-long.c | 1 -
 gcc/testsuite/gcc.target/gcn/avgpr-mem-short.c| 1 -
 gcc/testsuite/gcc.target/gcn/avgpr-spill-double.c | 1 -
 gcc/testsuite/gcc.target/gcn/avgpr-spill-int.c| 1 -
 gcc/testsuite/gcc.target/gcn/avgpr-spill-long.c   | 1 -
 gcc/testsuite/gcc.target/gcn/avgpr-spill-short.c  | 1 -
 8 files changed, 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/gcn/avgpr-mem-double.c b/gcc/testsuite/gcc.target/gcn/avgpr-mem-double.c
index ce089fb198d..34317a50715 100644
--- a/gcc/testsuite/gcc.target/gcn/avgpr-mem-double.c
+++ b/gcc/testsuite/gcc.target/gcn/avgpr-mem-double.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-march=gfx90a -O1" } */
-/* { dg-skip-if "incompatible ISA" { *-*-* } { "-march=gfx90[068]" } } */
 /* { dg-final { scan-assembler {load[^\n]*a[0-9[]} } } */
 /* { dg-final { scan-assembler {store[^\n]*a[0-9[]} } } */
 
diff --git a/gcc/testsuite/gcc.target/gcn/avgpr-mem-int.c b/gcc/testsuite/gcc.target/gcn/avgpr-mem-int.c
index 03d81486466..5ea3755e1b8 100644
--- a/gcc/testsuite/gcc.target/gcn/avgpr-mem-int.c
+++ b/gcc/testsuite/gcc.target/gcn/avgpr-mem-int.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-march=gfx90a -O1" } */
-/* { dg-skip-if "incompatible ISA" { *-*-* } { "-march=gfx90[068]" } } */
 /* { dg-final { scan-assembler {load[^\n]*a[0-9[]} } } */
 /* { dg-final { scan-assembler {store[^\n]*a[0-9[]} } } */
 
diff --git a/gcc/testsuite/gcc.target/gcn/avgpr-mem-long.c b/gcc/testsuite/gcc.target/gcn/avgpr-mem-long.c
index dcfb483f3f3..b52fc98da85 100644
--- a/gcc/testsuite/gcc.target/gcn/avgpr-mem-long.c
+++ b/gcc/testsuite/gcc.target/gcn/avgpr-mem-long.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-march=gfx90a -O1" } */
-/* { dg-skip-if "incompatible ISA" { *-*-* } { "-march=gfx90[068]" } } */
 /* { dg-final { scan-assembler {load[^\n]*a[0-9[]} } } */
 /* { dg-final { scan-assembler {store[^\n]*a[0-9[]} } } */
 
diff --git a/gcc/testsuite/gcc.target/gcn/avgpr-mem-short.c b/gcc/testsuite/gcc.target/gcn/avgpr-mem-short.c
index 91cc14ef181..a3e4a8bf9a9 100644
--- a/gcc/testsuite/gcc.target/gcn/avgpr-mem-short.c
+++ b/gcc/testsuite/gcc.target/gcn/avgpr-mem-short.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-march=gfx90a -O1" } */
-/* { dg-skip-if "incompatible ISA" { *-*-* } { "-march=gfx90[068]" } } */
 /* { dg-final { scan-assembler {load[^\n]*a[0-9[]} } } */
 /* { dg-final { scan-assembler {store[^\n]*a[0-9[]} } } */
 
diff --git a/gcc/testsuite/gcc.target/gcn/avgpr-spill-double.c b/gcc/testsuite/gcc.target/gcn/avgpr-spill-double.c
index 3e9996d3d10..53853a4b075 100644
--- a/gcc/testsuite/gcc.target/gcn/avgpr-spill-double.c
+++ b/gcc/testsuite/gcc.target/gcn/avgpr-spill-double.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-march=gfx908 -O1" } */

Re: [PATCH v2 1/2] libatomic: atomic_16.S: Improve ENTRY, END and ALIAS macro interface

2023-11-29 Thread Richard Earnshaw




On 13/11/2023 11:37, Victor Do Nascimento wrote:

The introduction of further architectural-feature dependent ifuncs
for AArch64 makes hard-coding ifunc `_i' suffixes to functions
cumbersome to work with.  It is awkward to remember which ifunc maps
onto which arch feature and makes the code harder to maintain when new
ifuncs are added and their suffixes possibly altered.

This patch uses pre-processor `#define' statements to map each suffix to
a descriptive feature name macro, for example:

   #define LSE2 _i1

and reconstructs function names with the pre-processor's token
concatenation feature, such that for `MACRO(name)', we would now have
`MACRO(name, feature)' and in the macro definition body we replace
`name` with `name##feature`.
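
To illustrate why a macro like ENTRY has to go through a second level such as ENTRY1: an operand of `##' is pasted before it is macro-expanded, so the extra level is what lets LSE2 become _i1 first.  A minimal C sketch, assuming the CORE/LSE2 definitions described above; NAME, NAME1, NAME_DIRECT, STR and STR1 are illustrative names only and are not part of atomic_16.S:

```c
/* Feature suffixes as described above: CORE is empty, LSE2 maps to _i1.  */
#define CORE
#define LSE2 _i1

/* One-level pasting: `feat' is an operand of ## and is NOT expanded.  */
#define NAME_DIRECT(name, feat) name##feat

/* Two-level pasting: NAME expands `feat' first, NAME1 then pastes.  */
#define NAME(name, feat) NAME1(name, feat)
#define NAME1(name, feat) name##feat

/* Stringify after expansion, purely so the result can be inspected.  */
#define STR(x) STR1(x)
#define STR1(x) #x

static const char *core_name(void)
{
  return STR(NAME(libat_load_16, CORE));
}

static const char *lse2_name(void)
{
  return STR(NAME(libat_load_16, LSE2));
}

static const char *direct_name(void)
{
  return STR(NAME_DIRECT(libat_load_16, LSE2));
}
```

Here core_name () yields "libat_load_16" and lse2_name () yields "libat_load_16_i1", while direct_name () yields "libat_load_16LSE2", which is the failure mode the extra level avoids.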

libatomic/ChangeLog:
* config/linux/aarch64/atomic_16.S (CORE): New macro.
(LSE2): Likewise.
(ENTRY): Modify macro to take in `arch' argument.
(END): Likewise.
(ALIAS): Likewise.
(ENTRY1): New macro.
(END1): Likewise.
(ALIAS): Likewise.
---
  libatomic/config/linux/aarch64/atomic_16.S | 147 +++--
  1 file changed, 79 insertions(+), 68 deletions(-)

diff --git a/libatomic/config/linux/aarch64/atomic_16.S b/libatomic/config/linux/aarch64/atomic_16.S
index 0485c284117..3f6225830e6 100644
--- a/libatomic/config/linux/aarch64/atomic_16.S
+++ b/libatomic/config/linux/aarch64/atomic_16.S
@@ -39,22 +39,34 @@
  
  	.arch	armv8-a+lse
  
-#define ENTRY(name)		\

-   .global name;   \
-   .hidden name;   \
-   .type name,%function;   \
-   .p2align 4; \
-name:  \
-   .cfi_startproc; \
+#define ENTRY(name, feat)  \
+   ENTRY1(name, feat)


I'd be much more inclined to keep the 'API' of ENTRY and the related 
functions the same and then define a new macro ENTRY_FEAT that took the 
second parameter; then you could define ENTRY as


#define ENTRY(name) ENTRY_FEAT (name, CORE)

and save the need to modify all the base functionality.
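
A rough sketch of that shape, written as plain C so it can be compiled on its own.  ENTRY_FEAT1 and the libat_demo functions are made-up names for illustration, and the real macros would of course emit assembler directives rather than C function headers:

```c
/* Feature suffixes as in the patch.  */
#define CORE
#define LSE2 _i1

/* New two-argument macro; the extra level expands `feat' before pasting.  */
#define ENTRY_FEAT(name, feat) ENTRY_FEAT1(name, feat)
#define ENTRY_FEAT1(name, feat) int name##feat(void)

/* Existing one-argument interface, unchanged for all the base functions.  */
#define ENTRY(name) ENTRY_FEAT(name, CORE)

/* Base variant: defines libat_demo.  */
ENTRY(libat_demo)
{
  return 0;
}

/* Feature variant: defines libat_demo_i1.  */
ENTRY_FEAT(libat_demo, LSE2)
{
  return 1;
}
```

Only the feature-specific variants need to spell out the second argument; every existing ENTRY (name) use stays as it is.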


+
+#define ENTRY1(name, feat) \
+   .global name##feat; \
+   .hidden name##feat; \
+   .type name##feat,%function; \
+   .p2align 4; \
+name##feat:\
+   .cfi_startproc; \
hint    34  // bti c
  
-#define END(name)		\

-   .cfi_endproc;   \
-   .size name, .-name;
+#define END(name, feat)\
+   END1(name, feat)
  
-#define ALIAS(alias,name)	\

-   .global alias;  \
-   .set alias, name;
+#define END1(name, feat)   \
+   .cfi_endproc;   \
+   .size name##feat, .-name##feat;
+
+#define ALIAS(alias, from, to) \
+   ALIAS1(alias,from,to)
+
+#define ALIAS1(alias, from, to)\
+   .global alias##from;\
+   .set alias##from, alias##to
+
+#define CORE
+#define LSE2   _i1
  
  #define res0 x0

  #define res1 x1
@@ -89,7 +101,7 @@ name:\
  #define SEQ_CST 5
  
  
-ENTRY (libat_load_16)

+ENTRY (libat_load_16, CORE)
mov x5, x0
cbnz    w1, 2f
  
@@ -104,10 +116,10 @@ ENTRY (libat_load_16)

stxp    w4, res0, res1, [x5]
cbnz    w4, 2b
ret
-END (libat_load_16)
+END (libat_load_16, CORE)
  
  
-ENTRY (libat_load_16_i1)

+ENTRY (libat_load_16, LSE2)
cbnz    w1, 1f
  
  	/* RELAXED.  */

@@ -127,10 +139,10 @@ ENTRY (libat_load_16_i1)
ldp res0, res1, [x0]
dmb ishld
ret
-END (libat_load_16_i1)
+END (libat_load_16, LSE2)
  
  
-ENTRY (libat_store_16)

+ENTRY (libat_store_16, CORE)
cbnz    w4, 2f
  
  	/* RELAXED.  */

@@ -144,10 +156,10 @@ ENTRY (libat_store_16)
stlxp   w4, in0, in1, [x0]
cbnz    w4, 2b
ret
-END (libat_store_16)
+END (libat_store_16, CORE)
  
  
-ENTRY (libat_store_16_i1)

+ENTRY (libat_store_16, LSE2)
cbnz    w4, 1f
  
  	/* RELAXED.  */

@@ -159,10 +171,10 @@ ENTRY (libat_store_16_i1)
stlxp   w4, in0, in1, [x0]
cbnz    w4, 1b
ret
-END (libat_store_16_i1)
+END (libat_store_16, LSE2)
  
  
-ENTRY (libat_exchange_16)

+ENTRY (libat_exchange_16, CORE)
mov x5, x0
cbnz    w4, 2f
  
@@ -186,10 +198,10 @@ ENTRY (libat_exchange_16)

stlxp   w4, in0, in1, [x5]
cbnz    w4, 4b
ret
-END (libat_exchange_16)
+END (libat_exchange_16, CORE)
  
  
-ENTRY (libat_compare_exchange_16)

+ENTRY (libat_compare_exchange_16, CORE)
ldp exp0, exp1, [x1]
cbz w4, 3f
cmp w4, RELEASE
@@ -228,10 +240,10 @@ ENTRY (libat_compare_exchange_16)
cbnz    w4, 4b
mov x0, 1
ret
-END (libat_compare_exchange_16)
+END (libat_compare_exchange_16, CORE)
  
  
-ENTRY (libat_compare_exchange_16_i1)

+ENTRY (libat_compare_exchange_16, LSE2)
ldp exp0, exp1, [x1]
 

Re: Re: [PATCH] RISC-V: Support highpart register overlap for vwcvt

2023-11-29 Thread 钟居哲
>> overlap or group_overlap.  Then change "no" to "none" and rename
>> "vconstraint_enabled" to "group_overlap_valid" (or without the group).

>> Add a comment to group_overlap_valid:

>> ; Widening instructions have group-overlap constraints.  Those are only
>> ; valid for certain register-group sizes.  This attribute marks the
>> ; alternatives not matching the required register-group size as disabled.

Ok will send a patch to fix them. Thanks.


juzhe.zh...@rivai.ai
 


Re: T-Head Vector for GCC-14? (was Re: RISC-V: Support XTheadVector extensions)

2023-11-29 Thread Jason Kridner
On Tue, Nov 28, 2023 at 5:21 PM Jeff Law  wrote:
>
> On 11/28/23 12:56, Philipp Tomsich wrote:
>
> >> That's obviously a risky thing to do given it was sent right at the end
> >> of the window, but it meets the rules.
> >>
> >> Folks in the call seemed generally amenable to at least trying for 14,
> >> so unless anyone's opposed on the lists it seems like the way to go.
> >> IIRC we ended up with the following TODO list:
> >>
> >> * Make sure this doesn't regress on the targets we already support.
> >>   From the sounds of things there's been test suite runs that look
> >>   fine, so hopefully that's all manageable.  Christoph said he'd send
> >>   something out, we've had a bunch of test skew so there might be a
> >>   bit lurking but it should be generally manageable.
> >> * We agree on some sort of support lifecycle.  There seemed to be
> >>   basically two proposals: merge for 14 with the aim of quickly
> >>   deprecating it (maybe even for 15), or merge for 14 with the aim of
> >>   keeping it until it ends up un-tested (ie, requiring test results
> >>   are published for every release).
> >
> > We expect real-world users, including the BeagleV-AHEAD community, to
> > need support for the foreseeable future.
> > Keeping it until it ends up untested (and test cases are reasonably
> > clean) sounds like a good threshold to ensure the integrity of the
> > codebase while giving this a clear path to stay in for its useful
> > life.
> I can live with it being in the tree as long as it's maintained
> (measured by ongoing testing with reasonable results).
>
> I'd proposed that it could end up deprecated quickly, but that was based
> on the assumption that once V1.0 compliant hardware was widely available
> that we'd see less and less interest in the thead extensions.
>

At BeagleBoard.org, we focus on long-term support and availability.
Long-term support is key to our engaging with education, both
institutional and continuing, and industrial automation. Getting this into
mainline such that we can develop solutions that integrate with mainline
Linux distributions is key for us to enable broader RISC-V adoption. If it
is deprecated at some point, that won't be terrible as long as we are able
to get to a good snapshot where integration with the rest of the open
source developer community has reasonably happened.

The good news is it *will* get tested. We have confidence in that side of
things. We have a great community that will engage the compiler and
identify regressions.

My expectation is that the Alibaba folks really know the C910 CPU core and
will help us get things right. I'll be here to help escalate issues to them
if they become unresponsive to the list. Others involved in the
BeagleBoard.org project will help make sure I know when I need to escalate
such issues.

Let me know if there's anything I can do to encourage this being merged and
worrying about deprecation later.

--
https://beagleboard.org/about/jkridner - a 501c3 non-profit educating
around open hardware computing


Re: Re: [PATCH] RISC-V: Support highpart register overlap for vwcvt

2023-11-29 Thread 钟居哲
>> But don't we allow e.g. v2 and v4 with W82?  Shouldn't it be % 8 == 6
>> and % 8 == 7 for W82 and W81? Or for W41, % 4 == 3?  At least when looking
>> at the given spec example that would correspond to W82?

I think you are right.  It should be W86 for vsext.vf4 (LMUL2 -> LMUL8),
W87 for vsext.vf8 (LMUL1 -> LMUL8), and
W43 for vsext.vf4 (LMUL1 -> LMUL4).

This patch only uses W21, W42 and W84.
I will adapt that in the following patches.
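
For reference, a small sketch of how the constraint names relate to register numbers, assuming the rule as discussed in this thread: the source group may only overlap the highest-numbered part of the destination group.  The helper names are hypothetical and this is not the riscv backend code:

```c
#include <stdbool.h>

/* Offset of the highest-numbered source-sized part inside a destination
   group: 8 - 2 = 6 (W86), 8 - 1 = 7 (W87), 4 - 1 = 3 (W43), and so on.  */
static unsigned highpart_offset(unsigned dest_lmul, unsigned src_lmul)
{
  return dest_lmul - src_lmul;
}

/* True if the source group starting at SRC_REGNO sits exactly in the
   highest-numbered part of the destination group starting at DEST_REGNO.
   Assumes integral LMULs with 1 <= src_lmul < dest_lmul and a destination
   register number aligned to dest_lmul.  */
static bool highpart_overlap_p(unsigned dest_regno, unsigned dest_lmul,
                               unsigned src_regno, unsigned src_lmul)
{
  return src_regno == dest_regno + highpart_offset(dest_lmul, src_lmul);
}
```

With this, highpart_overlap_p (0, 8, 4, 4) holds for the vwcvt v0, v4 case, and for an LMUL2 source only v6 qualifies against a v0 LMUL8 destination, which matches the % 8 == 6 check.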



juzhe.zh...@rivai.ai
 


RE: [PATCH 8/21]middle-end: update vectorizable_live_reduction with support for multiple exits and different exits

2023-11-29 Thread Richard Biener
On Mon, 27 Nov 2023, Tamar Christina wrote:

>  >
> > > This is a respun patch with a fix for VLA.
> > >
> > > This adds support to vectorizable_live_reduction to handle multiple
> > > exits by doing a search for which exit the live value should be
> > > materialized in.
> > >
> > > Additionally which value in the index we're after depends on whether
> > > the exit it's materialized in is an early exit or whether the loop's
> > > main exit is different from the loop's natural one (i.e. the one with
> > > the same src block as the latch).
> > >
> > > In those two cases we want the first rather than the last value as
> > > we're going to restart the iteration in the scalar loop.  For VLA this
> > > means we need to reverse both the mask and vector since there's only a
> > > way to get the last active element and not the first.
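
(A scalar model of the trick described in the paragraph above, assuming only a "last active element" extraction is available.  All function names are hypothetical and nothing here is vectorizer code.)

```c
#include <stdbool.h>
#include <stddef.h>

/* Upper bound on the number of lanes handled by this sketch.  */
#define MAX_LANES 64

/* Model of the only primitive assumed available: the value of the last
   element whose mask bit is set, or FALLBACK if no bit is set.  */
static int extract_last_active(const int *vec, const bool *mask, size_t n,
                               int fallback)
{
  int result = fallback;
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      result = vec[i];
  return result;
}

/* First active element, obtained by reversing both the data and the mask
   and then asking for the last active element of the reversed pair.  */
static int extract_first_active(const int *vec, const bool *mask, size_t n,
                                int fallback)
{
  int rvec[MAX_LANES];
  bool rmask[MAX_LANES];

  if (n > MAX_LANES)
    return fallback;

  for (size_t i = 0; i < n; i++)
    {
      rvec[i] = vec[n - 1 - i];
      rmask[i] = mask[n - 1 - i];
    }
  return extract_last_active(rvec, rmask, n, fallback);
}
```

Reversing only one of the two would pair data lanes with the wrong mask bits, which is why both the mask and the vector need the permute.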
> > >
> > > For inductions and multiple exits:
> > >   - we test if the target will support vectorizing the induction
> > >   - mark all inductions in the loop as relevant
> > >   - for codegen of non-live inductions during codegen
> > >   - induction during an early exit gets the first element rather than
> > >     last.
> > >
> > > For reductions and multiple exits:
> > >   - Reductions for early exits reduces the reduction definition statement
> > >     rather than the reduction step.  This allows us to get the value at
> > >     the start of the iteration.
> > >   - The peeling layout means that we just have to update one block, the
> > >     merge block.  We expect all the reductions to be the same but we leave
> > >     it up to the value numbering to clean up any duplicate code as we
> > >     iterate over all edges.
> > >
> > > These two changes fix the reduction codegen given before which has
> > > been added to the testsuite for early vect.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-vect-loop.cc (vectorizable_live_operation): Support early exits.
> > >   (vect_analyze_loop_operations): Check if target supports vectorizing
> > >   IV.
> > >   (vect_transform_loop): Call vectorizable_live_operation for non-live
> > >   inductions or reductions.
> > >   (find_connected_edge, vectorizable_live_operation_1): New.
> > >   (vect_create_epilog_for_reduction): Support reductions in early break.
> > >   * tree-vect-stmts.cc (perm_mask_for_reverse): Expose.
> > >   (vect_stmt_relevant_p): Mark all inductions when early break as being
> > >   relevant.
> > >   * tree-vectorizer.h (perm_mask_for_reverse): Expose.
> > >   (vect_iv_increment_position): New.
> > >   * tree-vect-loop-manip.cc (vect_iv_increment_position): Expose.
> > >
> > > --- inline copy of patch ---
> > >
> > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > index
> > >
> > 476be8a0bb6da2d06c4ca7052cb07bacecca60b1..1a4ba349fb6ae39c79401
> > aecd4e7
> > > e9e2b8a0 100644
> > > --- a/gcc/tree-vect-loop-manip.cc
> > > +++ b/gcc/tree-vect-loop-manip.cc
> > > @@ -453,7 +453,7 @@ vect_adjust_loop_lens_control (tree iv_type,
> > gimple_seq *seq,
> > > INSERT_AFTER is set to true if the increment should be inserted after
> > > *BSI.  */
> > >
> > > -static void
> > > +void
> > >  vect_iv_increment_position (edge loop_exit, gimple_stmt_iterator *bsi,
> > >   bool *insert_after)
> > >  {
> > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index
> > >
> > 8a50380de49bc12105be47ea1d8ee3cf1f2bdab4..b42318b2999e6a27e698
> > 33821907
> > > 92602cb25af1 100644
> > > --- a/gcc/tree-vect-loop.cc
> > > +++ b/gcc/tree-vect-loop.cc
> > > @@ -2163,6 +2163,15 @@ vect_analyze_loop_operations (loop_vec_info
> > loop_vinfo)
> > >   ok = vectorizable_live_operation (loop_vinfo, stmt_info, NULL,
> > NULL,
> > > -1, false, &cost_vec);
> > >
> > > +   /* Check if we can perform the operation for early break if we force
> > > +  the live operation.  */
> > > +   if (ok
> > > +   && LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> > > +   && !STMT_VINFO_LIVE_P (stmt_info)
> > > +   && STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def)
> > > + ok = vectorizable_live_operation (loop_vinfo, stmt_info, NULL,
> > NULL,
> > > +   -1, false, &cost_vec);
> > 
> > can you add && !PURE_SLP_STMT?
> > 
> 
> I've cleaned up the patch a bit more, so these hunks are now all gone.
> 
> > > @@ -6132,23 +6147,30 @@ vect_create_epilog_for_reduction
> > (loop_vec_info loop_vinfo,
> > >   Store them in NEW_PHIS.  */
> > >if (double_reduc)
> > >  loop = outer_loop;
> > > -  exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
> > > +  /* We need to reduce values in all exits.  */  exit_bb =
> > > + loop_exit->dest;
> > >exit_gsi = gsi_after_labels (exit_bb);
> > >reduc_inputs.create (slp_node ? vec_num : ncopies);
> > > +  vec  vec_stmts;
> > 

Re: [PATCH] RISC-V: Support highpart overlap for vext.vf

2023-11-29 Thread Robin Dapp
LGTM (in context of the last message) but please consider adding
the comments/naming I suggested. 

Regards
 Robin


Re: [PATCH] RISC-V: Support highpart register overlap for vwcvt

2023-11-29 Thread Robin Dapp
>>> I can't really match spec and code.  For the lmul = 2 case sure,
>>> but W84 e.g. allows v4 and not v6?  What actually is "highest-numbered 
>>>part"?
> Yes.
> 
> For vwcvt, LMUL 4 -> LMUL 8. 
> We allow overlap  vwcvt v0 (occupy v0 - v7), v4 (occupy v4 - v7)
> This patch support the overlap above.

Ok thanks, that way it makes sense.  The allowed overlap size is the
size of the source group which is determined by the "extension factor".

But don't we allow e.g. v2 and v4 with W82?  Shouldn't it be % 8 == 6
and % 8 == 7 for W82 and W81? Or for W41, % 4 == 3?  At least when looking
at the given spec example that would correspond to W82?

> This is kito's code. Could you suggest another name ? I can modify it.

overlap or group_overlap.  Then change "no" to "none" and rename
"vconstraint_enabled" to "group_overlap_valid" (or without the group).

Add a comment to group_overlap_valid:

; Widening instructions have group-overlap constraints.  Those are only
; valid for certain register-group sizes.  This attribute marks the
; alternatives not matching the required register-group size as disabled.


> I experiment with many tests, turns out adding ? generate better codegen.
> You can try it (remove ?) and testing it on case (I added in this patch).

It looks like we spill without it but I don't get why.  Well, as long
as it works, I guess we can defer that question.

Regards
 Robin


Re: [PATCH] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2023-11-29 Thread Uros Bizjak
On Wed, Nov 29, 2023 at 1:25 PM Richard Biener
 wrote:
>
> On Wed, Nov 29, 2023 at 10:35 AM Uros Bizjak  wrote:
> >
> > The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:
> >
> > internal compiler error: RTL check: expected elt 0 type 'e' or 'u',
> > have 'E' (rtx unspec) in try_combine, at combine.cc:3237
> >
> > This is
> >
> > 3236  /* Just replace the CC reg with a new mode.  */
> > 3237  SUBST (XEXP (*cc_use_loc, 0), newpat_dest);
> > 3238  undobuf.other_insn = cc_use_insn;
> >
> > in combine.cc, where *cc_use_loc is
> >
> > (unspec:DI [
> > (reg:CC 17 flags)
> > ] UNSPEC_PUSHFL)
> >
> > combine assumes CC must be used inside of a comparison and uses
> > XEXP (..., 0) without checking on the RTX type of the argument.
> >
> > Skip the modification of CC-using operation if *cc_use_loc is not
> > COMPARISON_P.
> >
> > PR middle-end/112560
> >
> > gcc/ChangeLog:
> >
> > * combine.cc (try_combine): Skip the modification of CC-using
> > operation if *cc_use_loc is not COMPARISON_P.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > OK for master?
>
> Don't we need to stop the attempt to combine when we cannot handle a use?
> Simply not adjusting another use doesn't look correct, does it?

I was assuming that if the CC reg is not used inside the comparison,
then the mode of CC reg is irrelevant. We can still combine the
instructions into new insn, without updating the use of CC reg.

In this particular case, the combined insn is rejected, but
UNSPEC_PUSHFL does not care about the mode of the CC reg and would
handle combined insn just fine.

Alternatively, the attached patch skips the combination altogether.
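
The control flow of that alternative, reduced to a toy model: check the class of the single CC user before touching its operand 0, and give up otherwise.  The toy_* types and functions below are hypothetical stand-ins, not GCC's rtx API:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for RTL codes: two comparisons and an unspec.  */
enum toy_code { TOY_EQ, TOY_NE, TOY_UNSPEC };

struct toy_rtx
{
  enum toy_code code;
  int operand0;   /* Only meaningful for comparison codes in this model.  */
};

/* Plays the role of COMPARISON_P in this sketch.  */
static bool toy_comparison_p(const struct toy_rtx *x)
{
  return x->code == TOY_EQ || x->code == TOY_NE;
}

/* Rewrite operand 0 of the CC user only if it is a comparison.  Return
   false without modifying anything otherwise, so that the caller can
   abandon the combination attempt.  */
static bool toy_update_cc_use(struct toy_rtx *cc_use, int new_dest)
{
  if (!toy_comparison_p(cc_use))
    return false;
  cc_use->operand0 = new_dest;
  return true;
}
```

A caller would treat a false return the same way the patch treats the else branch: undo and stop.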

Thanks,
Uros.
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 6344cd3c9f2..e533631d0e6 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -3184,11 +3184,21 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
  && (cc_use_loc = find_single_use (SET_DEST (newpat), i3,
&cc_use_insn)))
{
- compare_code = orig_compare_code = GET_CODE (*cc_use_loc);
- if (is_a <scalar_int_mode> (GET_MODE (i2dest), &mode))
-   compare_code = simplify_compare_const (compare_code, mode,
-  &op0, &op1);
- target_canonicalize_comparison (&compare_code, &op0, &op1, 1);
+ if (COMPARISON_P (*cc_use_loc))
+   {
+ compare_code = orig_compare_code = GET_CODE (*cc_use_loc);
+ if (is_a <scalar_int_mode> (GET_MODE (i2dest), &mode))
+   compare_code = simplify_compare_const (compare_code, mode,
+  &op0, &op1);
+ target_canonicalize_comparison (&compare_code, &op0, &op1, 1);
+   }
+ else
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "CC register not used in comparison.\n");
+ undo_all ();
+ return 0;
+   }
}
 
   /* Do the rest only if op1 is const0_rtx, which may be the


Re: Fix 'g++.dg/cpp26/static_assert1.C' for '-fno-exceptions' configurations

2023-11-29 Thread Thomas Schwinge
Hi!

On 2023-11-28T12:11:22-0500, Jason Merrill  wrote:
> On 11/28/23 12:08, Thomas Schwinge wrote:
>>  // { dg-options "" }
>> +// Override any default-'-fno-exceptions':
>> +// { dg-additional-options -fexceptions }
>
> Might as well put the -fexceptions into the dg-options instead of having
> two separate lines?

The net effect is the same, but in my opinion, the intentions are clearer
in the "separate" form: 'dg-options ""' cancels the standard options, and
then 'dg-additional-options -fexceptions' adds an additional option, with
rationale.


Grüße
 Thomas


> OK either way.



