date:20240311

[Committed] Reject -fno-multiflags [PR114314]

2024-03-11 Thread Andrew Pinski

When -fmultiflags option support was added in r13-3693-g6b1a2474f9e422,
it accidentally allowed -fno-multiflags, which would then be passed on to cc1.
This fixes that oversight.

Committed as obvious after bootstrap/test on x86_64-linux-gnu.

gcc/ChangeLog:

PR driver/114314
* common.opt (fmultiflags): Add RejectNegative.

Signed-off-by: Andrew Pinski 
---
 gcc/common.opt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 51c4a17da83..1ad0169bd6f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2295,7 +2295,7 @@ Common Var(flag_move_loop_stores) Optimization
 Move stores out of loops.
 
 fmultiflags
-Common Driver
+Common Driver RejectNegative
 Building block for specs-based multilib-aware TFLAGS.
 
 fdce
-- 
2.43.0

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-03-11 Thread Li, Pan2

Hi Jeff,

Are there any suggestions for how to fix this ICE in a reasonable way?
Thanks a lot.

Pan

-Original Message-
From: Li, Pan2 
Sent: Tuesday, March 5, 2024 2:23 PM
To: Jeff Law ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Thanks, Jeff, for the comments.

> But in the case of vector modes, we can usually reinterpret the
> underlying bits in whatever mode we want and do any of the usual 
> operations on those bits.

Yes, I think that is why we can allow vector modes in get_stored_val, if my
understanding is correct.
The different modes are then handled by gen_lowpart. Unfortunately, some
modes (smaller than a vector register, like V2SF and V2QI for vlen=128) are
considered invalid by validate_subreg, and the resulting NULL_RTX leads to
the final ICE.

Thus, considering that we are in stage 4, I wonder if this is an acceptable
fix, i.e. finding somewhere to filter out the invalid modes before going to
gen_lowpart.

Pan

-Original Message-
From: Jeff Law  
Sent: Monday, March 4, 2024 6:47 AM
To: Robin Dapp ; Li, Pan2 ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

On 2/29/24 06:28, Robin Dapp wrote:
> On 2/29/24 02:38, Li, Pan2 wrote:
>>> So it's going to check if V2SF can be tied to DI and V4QI with SI.  I
>>> suspect those are going to fail for RISC-V as those aren't tieable.
>>
>> Yes, you are right. Different REG_CLASS are not allowed to be tieable in 
>> RISC-V.
>>
>> static bool
>> riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>> {
>>   /* We don't allow different REG_CLASS modes tieable since it
>>      will cause ICE in register allocation (RA).
>>      E.g. V2SI and DI are not tieable.  */
>>   if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
>>     return false;
>>   return (mode1 == mode2
>>           || !(GET_MODE_CLASS (mode1) == MODE_FLOAT
>>                && GET_MODE_CLASS (mode2) == MODE_FLOAT));
>> }
> 
> Yes, but what we set tieable is e.g. V4QI and V2SF.
But in the case of vector modes, we can usually reinterpret the
underlying bits in whatever mode we want and do any of the usual 
operations on those bits.

In my mind that's fundamentally different than the int vs fp case.  If 
we have an integer value in an FP register, we can't really operate on 
the value in any sensible way without first copying it over to the 
integer register file and vice-versa.

Jeff

RE: [PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-11 Thread Li, Pan2

Thanks, Vineet, for the reminder.

> While at it, can you also add the support for feature detection macro
> __riscv_v_fixed_vlen

Kito told me that Greg will help to add that part. Let's wait for comments
from Kito.
Personally, I prefer a separate patch to cover that instead of appending it here.

Pan

-Original Message-
From: Vineet Gupta  
Sent: Thursday, March 7, 2024 3:19 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, Yanzhang 
; rdapp@gmail.com; pal...@rivosinc.com
Subject: Re: [PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits 
for RVV



On 3/5/24 23:27, pan2...@intel.com wrote:
> From: Pan Li 
>
> Update in v2:
> * Cleanup some unused code.
> * Fix some typo of commit log.
>
> Original log:
>
> This patch introduces a new GCC attribute for RVV.
> The attribute is used to define fixed-length variants of the
> existing sizeless RVV types.
>
> The attribute is valid if and only if -mrvv-vector-bits=zvl is given. Its
> only argument should be an integer constant, and its value is determined
> by the LMUL and the vector register bits in zvl*b.  For example:
>
> typedef vint32m2_t fixed_vint32m2_t 
> __attribute__((riscv_rvv_vector_bits(128)));
>
> The above typedef is valid with -march=rv64gc_zve64d_zvl64b
> (i.e. 2 (m2) * 64 = 128 for vint32m2_t), and reports an error with
> -march=rv64gcv_zvl128b, similar to the one below.
>
> "error: invalid RVV vector size '128', expected size is '256' based on
> LMUL of type and '-mrvv-vector-bits=zvl'"
>
> For the vint*m*_t types, the operations below are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -
>
> For the vfloat*m*_t types, the operations below are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, -
>
> For the vbool*_t types, only the operations below are allowed. CMP and
> ALU are excluded because those operations on vbool*_t are not
> well defined currently.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
>
> The vint*x*m*_t tuple types are not supported in this patch,
> which is compatible with clang.
>
> This patch passed the below testsuites.
> * The full riscv regression tests.

While at it, can you also add the support for feature detection macro
__riscv_v_fixed_vlen

Thx,
-Vineet

>
> gcc/ChangeLog:
>
>   * config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
>   New static func to take care of the RVV types decorated by
>   the attributes.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv.cc |  87 +-
>  .../riscv/rvv/base/riscv_rvv_vector_bits-1.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-10.c |  53 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-11.c |  76 
>  .../riscv/rvv/base/riscv_rvv_vector_bits-12.c |  14 +++
>  .../riscv/rvv/base/riscv_rvv_vector_bits-2.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-3.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-4.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-5.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-6.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-7.c  |  76 
>  .../riscv/rvv/base/riscv_rvv_vector_bits-8.c  |  75 
>  .../riscv/rvv/base/riscv_rvv_vector_bits-9.c  |  76 
>  .../riscv/rvv/base/riscv_rvv_vector_bits.h| 108 ++
>  14 files changed, 599 insertions(+), 2 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c
>  create

[PATCH v1] LoongArch: Remove masking process for operand 3 of xvpermi.q.

2024-03-11 Thread Chenghui Pan

The behavior of non-zero unused bits in the xvpermi.q instruction's
third operand is undefined on LoongArch. Following our
discussion (https://github.com/llvm/llvm-project/pull/83540),
we think that leaving the original insn operand unmodified
is the better solution.

This patch partially reverts 7b158e036a95b1ab40793dd53bed7dbd770ffdaf.
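
For readers following the test changes, here is a rough sketch of what the removed masking step computed. It is not the lasx.md code: the helper name mask_xvpermi_q_imm is hypothetical, and only the 0x33 mask and the immediates are taken from the patch.

```cpp
#include <cassert>

// Hypothetical helper mirroring the removed
// "mask = 0x33; mask &= INTVAL (operands[3]);" step from the insn template.
constexpr unsigned mask_xvpermi_q_imm(unsigned imm) {
    return imm & 0x33u;  // keep only the bits the old code let through
}

// The three immediates used by the old test, and the values the updated
// test now passes directly.
static_assert(mask_xvpermi_q_imm(0x2a) == 0x22, "first test vector");
static_assert(mask_xvpermi_q_imm(0xb9) == 0x31, "second test vector");
static_assert(mask_xvpermi_q_imm(0xca) == 0x02, "third test vector");
```

With the masking gone from the compiler, the test supplies the already-masked immediates itself, which is why the expected results in the test stay unchanged.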

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove masking of operand 3.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c:
  Move operand 3's value into the instruction's defined accepted range.
---
 gcc/config/loongarch/lasx.md| 5 -
 .../gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c   | 6 +++---
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index ac84db7f0ce..3f25c0c1756 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -640,8 +640,6 @@ (define_insn "lasx_xvpermi_d__1"
(set_attr "mode" "")])
 
 ;; xvpermi.q
-;; Unused bits in operands[3] need be set to 0 to avoid
-;; causing undefined behavior on LA464.
 (define_insn "lasx_xvpermi_q_"
   [(set (match_operand:LASX 0 "register_operand" "=f")
(unspec:LASX
@@ -651,9 +649,6 @@ (define_insn "lasx_xvpermi_q_"
  UNSPEC_LASX_XVPERMI_Q))]
   "ISA_HAS_LASX"
 {
-  int mask = 0x33;
-  mask &= INTVAL (operands[3]);
-  operands[3] = GEN_INT (mask);
   return "xvpermi.q\t%u0,%u2,%3";
 }
   [(set_attr "type" "simd_splat")
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c 
b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
index dbc29d2fb22..f89dfc31120 100644
--- a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c
@@ -27,7 +27,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x7fff7fff7fff;
   *((unsigned long*)& __m256i_result[1]) = 0x7fe37fe3001d001d;
   *((unsigned long*)& __m256i_result[0]) = 0x7fff7fff7fff;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x2a);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x22);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   *((unsigned long*)& __m256i_op0[3]) = 0x;
@@ -42,7 +42,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x0019001c;
   *((unsigned long*)& __m256i_result[1]) = 0x;
   *((unsigned long*)& __m256i_result[0]) = 0x01fe;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0xb9);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x31);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   *((unsigned long*)& __m256i_op0[3]) = 0x00ff00ff00ff00ff;
@@ -57,7 +57,7 @@ main ()
   *((unsigned long*)& __m256i_result[2]) = 0x;
   *((unsigned long*)& __m256i_result[1]) = 0x00ff00ff00ff00ff;
   *((unsigned long*)& __m256i_result[0]) = 0x00ff00ff00ff00ff;
-  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0xca);
+  __m256i_out = __lasx_xvpermi_q (__m256i_op0, __m256i_op1, 0x02);
   ASSERTEQ_64 (__LINE__, __m256i_result, __m256i_out);
 
   return 0;
-- 
2.39.3

Re: [PATCH v3] LoongArch: Add support for TLS descriptors.

2024-03-11 Thread mengqinggang

The patch is here:
https://sourceware.org/pipermail/gcc-patches/2024-March/647578.html

The first email was blocked by the server.


On 2024/3/11 4:21 PM, mengqinggang wrote:

Add support for TLS descriptors on normal code model and extreme code model.

Normal code model instruction sequence:
  -mno-explicit-relocs:
    la.tls.desc  $r4, s
    add.d        $r12, $r4, $r2
  -mexplicit-relocs:
    pcalau12i    $r4,%desc_pc_hi20(s)
    addi.d       $r4,$r4,%desc_pc_lo12(s)
    ld.d         $r1,$r4,%desc_ld(s)
    jirl         $r1,$r1,%desc_call(s)
    add.d        $r12, $r4, $r2

Extreme code model instruction sequence:
  -mno-explicit-relocs:
    la.tls.desc  $r4, $r12, s
    add.d        $r12, $r4, $r2
  -mexplicit-relocs:
    pcalau12i    $r4,%desc_pc_hi20(s)
    addi.d       $r12,$r0,%desc_pc_lo12(s)
    lu32i.d      $r12,%desc64_pc_lo20(s)
    lu52i.d      $r12,$r12,%desc64_pc_hi12(s)
    add.d        $r4,$r4,$r12
    ld.d         $r1,$r4,%desc_ld(s)
    jirl         $r1,$r1,%desc_call(s)
    add.d        $r12, $r4, $r2

The default is still the traditional TLS model, but it can be configured with
--with-tls={trad,desc}. The default can change to TLS descriptors once
libc and LLVM support this.

gcc/ChangeLog:

* config.gcc: Add --with-tls option to change TLS flavor.
* config/loongarch/genopts/loongarch.opt.in: Add -mtls-dialect to
configure TLS flavor.
* config/loongarch/loongarch-def.h (struct loongarch_target): Add
tls_dialect.
* config/loongarch/loongarch-driver.cc (la_driver_init): Add tls
flavor.
* config/loongarch/loongarch-opts.cc (loongarch_init_target): Add
tls_dialect.
(loongarch_config_target): Ditto.
(loongarch_update_gcc_opt_status): Ditto.
* config/loongarch/loongarch-opts.h (loongarch_init_target): Ditto.
(TARGET_TLS_DESC): New define.
* config/loongarch/loongarch.cc (loongarch_symbol_insns): Add TLS DESC
instructions sequence length.
(loongarch_legitimize_tls_address): New TLS DESC instruction sequence.
(loongarch_option_override_internal): Add la_opt_tls_dialect.
(loongarch_option_restore): Add la_target.tls_dialect.
* config/loongarch/loongarch.md (@got_load_tls_desc): Normal
code model for TLS DESC.
(got_load_tls_desc_off64): Extreme code model for TLS DESC.
* config/loongarch/loongarch.opt: Regenerated.
---
Changes v2 -> v3:
- Set default to traditional TLS model.
- Add support for -mexplicit-relocs and extreme code model.

Changes v1 -> v2:
- Clobber fcc0-fcc7 registers in got_load_tls_desc template.
- Support --with-tls in configure.

v2 link: https://sourceware.org/pipermail/gcc-patches/2024-February/646817.html
v1 link: https://sourceware.org/pipermail/gcc-patches/2023-December/638907.html

  gcc/config.gcc| 19 +-
  gcc/config/loongarch/genopts/loongarch.opt.in | 14 
  gcc/config/loongarch/loongarch-def.h  |  7 ++
  gcc/config/loongarch/loongarch-driver.cc  |  2 +-
  gcc/config/loongarch/loongarch-opts.cc| 12 +++-
  gcc/config/loongarch/loongarch-opts.h |  2 +
  gcc/config/loongarch/loongarch.cc | 48 +
  gcc/config/loongarch/loongarch.md | 68 +++
  gcc/config/loongarch/loongarch.opt| 14 
  9 files changed, 170 insertions(+), 16 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 624e0dae191..baebafdbf5d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4991,7 +4991,7 @@ case "${target}" in
;;
  
  	loongarch*-*)

-   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib"
+   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib tls"
  
  		# Local variables

unset \
@@ -5249,6 +5249,18 @@ case "${target}" in
with_multilib_list="${abi_base}/${abi_ext}"
fi
  
+		# Handle --with-tls.

+   case "$with_tls" in
+   "" \
+   | trad | desc)
+   # OK
+   ;;
+   *)
+   echo "Unknown TLS method used in --with-tls=$with_tls" 1>&2
+   exit 1
+   ;;
+   esac
+
# Check if the configured default ABI combination is included in
# ${with_multilib_list}.
loongarch_multilib_list_sane=no
@@ -5914,6 +5926,11 @@ case ${target} in
lasx)tm_defines="$tm_defines 
DEFAULT_ISA_EXT_SIMD=ISA_EXT_SIMD_LASX" ;;
esac
  
+		case ${with_tls} in

+   "" | trad)tm_defines="$tm_defines 
DEFAULT_TLS_TYPE=TLS_TRADITIONAL" ;;
+   desc)   tm_defines="$tm_defines 
DEFAULT_TLS_TYPE=TLS_DESCRIPTORS" ;;
+   esac
+
tmake_file="loongarch/t-loongarch $tmake_file"
;;
  
diff

[PATCH v3] LoongArch: Add support for TLS descriptors.

2024-03-11 Thread mengqinggang

Add support for TLS descriptors on normal code model and extreme code model.

Normal code model instruction sequence:
  -mno-explicit-relocs:
    la.tls.desc  $r4, s
    add.d        $r12, $r4, $r2
  -mexplicit-relocs:
    pcalau12i    $r4,%desc_pc_hi20(s)
    addi.d       $r4,$r4,%desc_pc_lo12(s)
    ld.d         $r1,$r4,%desc_ld(s)
    jirl         $r1,$r1,%desc_call(s)
    add.d        $r12, $r4, $r2

Extreme code model instruction sequence:
  -mno-explicit-relocs:
    la.tls.desc  $r4, $r12, s
    add.d        $r12, $r4, $r2
  -mexplicit-relocs:
    pcalau12i    $r4,%desc_pc_hi20(s)
    addi.d       $r12,$r0,%desc_pc_lo12(s)
    lu32i.d      $r12,%desc64_pc_lo20(s)
    lu52i.d      $r12,$r12,%desc64_pc_hi12(s)
    add.d        $r4,$r4,$r12
    ld.d         $r1,$r4,%desc_ld(s)
    jirl         $r1,$r1,%desc_call(s)
    add.d        $r12, $r4, $r2

The default is still the traditional TLS model, but it can be configured with
--with-tls={trad,desc}. The default can change to TLS descriptors once
libc and LLVM support this.

gcc/ChangeLog:

* config.gcc: Add --with-tls option to change TLS flavor.
* config/loongarch/genopts/loongarch.opt.in: Add -mtls-dialect to
configure TLS flavor.
* config/loongarch/loongarch-def.h (struct loongarch_target): Add
tls_dialect.
* config/loongarch/loongarch-driver.cc (la_driver_init): Add tls
flavor.
* config/loongarch/loongarch-opts.cc (loongarch_init_target): Add
tls_dialect.
(loongarch_config_target): Ditto.
(loongarch_update_gcc_opt_status): Ditto.
* config/loongarch/loongarch-opts.h (loongarch_init_target): Ditto.
(TARGET_TLS_DESC): New define.
* config/loongarch/loongarch.cc (loongarch_symbol_insns): Add TLS DESC
instructions sequence length.
(loongarch_legitimize_tls_address): New TLS DESC instruction sequence.
(loongarch_option_override_internal): Add la_opt_tls_dialect.
(loongarch_option_restore): Add la_target.tls_dialect.
* config/loongarch/loongarch.md (@got_load_tls_desc): Normal
code model for TLS DESC.
(got_load_tls_desc_off64): Extreme code model for TLS DESC.
* config/loongarch/loongarch.opt: Regenerated.
---
Changes v2 -> v3:
- Set default to traditional TLS model.
- Add support for -mexplicit-relocs and extreme code model.

Changes v1 -> v2:
- Clobber fcc0-fcc7 registers in got_load_tls_desc template.
- Support --with-tls in configure.

v2 link: https://sourceware.org/pipermail/gcc-patches/2024-February/646817.html
v1 link: https://sourceware.org/pipermail/gcc-patches/2023-December/638907.html

 gcc/config.gcc| 19 +-
 gcc/config/loongarch/genopts/loongarch.opt.in | 14 
 gcc/config/loongarch/loongarch-def.h  |  7 ++
 gcc/config/loongarch/loongarch-driver.cc  |  2 +-
 gcc/config/loongarch/loongarch-opts.cc| 12 +++-
 gcc/config/loongarch/loongarch-opts.h |  2 +
 gcc/config/loongarch/loongarch.cc | 48 +
 gcc/config/loongarch/loongarch.md | 68 +++
 gcc/config/loongarch/loongarch.opt| 14 
 9 files changed, 170 insertions(+), 16 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 624e0dae191..baebafdbf5d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4991,7 +4991,7 @@ case "${target}" in
;;
 
loongarch*-*)
-   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib"
+   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib tls"
 
# Local variables
unset \
@@ -5249,6 +5249,18 @@ case "${target}" in
with_multilib_list="${abi_base}/${abi_ext}"
fi
 
+   # Handle --with-tls.
+   case "$with_tls" in
+   "" \
+   | trad | desc)
+   # OK
+   ;;
+   *)
+   echo "Unknown TLS method used in --with-tls=$with_tls" 1>&2
+   exit 1
+   ;;
+   esac
+
# Check if the configured default ABI combination is included in
# ${with_multilib_list}.
loongarch_multilib_list_sane=no
@@ -5914,6 +5926,11 @@ case ${target} in
lasx)tm_defines="$tm_defines 
DEFAULT_ISA_EXT_SIMD=ISA_EXT_SIMD_LASX" ;;
esac
 
+   case ${with_tls} in
+   "" | trad)  tm_defines="$tm_defines 
DEFAULT_TLS_TYPE=TLS_TRADITIONAL" ;;
+   desc)   tm_defines="$tm_defines 
DEFAULT_TLS_TYPE=TLS_DESCRIPTORS" ;;
+   esac
+
tmake_file="loongarch/t-loongarch $tmake_file"
;;
 
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 02f918053f5..7de107c3e3d 100644
---

Re: [RFC] [PR tree-optimization/92539] Optimize away tests against invalid pointers

2024-03-11 Thread Jeff Law





On 3/11/24 1:46 AM, Richard Biener wrote:

On Sun, Mar 10, 2024 at 10:09 PM Jeff Law  wrote:




On 3/10/24 3:05 PM, Andrew Pinski wrote:

On Sun, Mar 10, 2024 at 2:04 PM Jeff Law  wrote:


Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false
positive triggered by loop unrolling.

As I speculated a couple years ago, we could eliminate the comparisons
against bogus pointers.  Consider:


 [local count: 30530247]:
if (last_12 !=   [(void *)"aa" + 3B])
  goto ; [54.59%]
else
  goto ; [45.41%]



That's a valid comparison as ISO allows us to generate, but not
dereference, a pointer one element past the end of the object.

But +4B is a bogus pointer.  So given an EQ comparison against that
pointer we could always return false and for NE always return true.
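
The distinction drawn above can be sketched as a small classification of byte offsets into an object. This is an illustrative model only, not the VRP change: the names PtrKind, classify_offset and ne_folds_to_true are hypothetical, and the size 3 for "aa" counts the terminating NUL.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: where does a pointer at `offset` bytes land relative
// to an object that is `size` bytes long?
enum class PtrKind { InBounds, OnePastEnd, Invalid };

constexpr PtrKind classify_offset(std::size_t size, std::size_t offset) {
    if (offset < size)
        return PtrKind::InBounds;    // may be formed and dereferenced
    if (offset == size)
        return PtrKind::OnePastEnd;  // may be formed and compared only
    return PtrKind::Invalid;         // forming it is already undefined
}

// Under the proposal, `p != q` could fold to true (and `p == q` to false)
// only when q is an invalid pointer.
constexpr bool ne_folds_to_true(std::size_t size, std::size_t offset) {
    return classify_offset(size, offset) == PtrKind::Invalid;
}

// "aa" occupies 3 bytes, so +3B is one past the end and +4B is bogus.
static_assert(!ne_folds_to_true(3, 3), "+3B must stay a real comparison");
static_assert(ne_folds_to_true(3, 4), "+4B comparison can be folded");
```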

VRP and DOM seem to be the most natural choices for this kind of
optimization on the surface.  However DOM is actually not viable because
the out-of-bounds pointer warning pass is run at the end of VRP.  So
we've got to take care of this prior to the end of VRP.



I haven't done a bootstrap or regression test with this.  But if it
looks reasonable I can certainly push on it further. I have confirmed it
does eliminate the tests and shuts up the bogus warning.

The downside is this would also shut up valid warnings if user code did
this kind of test.

Comments/Suggestions?


ENOPATCH

Yea, realized it as I pushed the send button.  Then t-bird crashed,
repeatedly.

Attached this time..


There's fold-const.cc:address_compare and
tree-ssa-alias.cc:ptrs_compare_unequal,
both eventually used by match.pd that could see this change, the former already
special-cases STRING_CST to some extent.

I'll note that the value we simplify such a comparison to is arbitrary.
Doing such simplification directly (as opposed to only benefiting from its
undefinedness indirectly) always gives me the creeps ;)

IMO we should instead simplify the condition to __builtin_unreachable/trap aka
isolate the path as unreachable.

Path isolation is a better conceptual place, but to do that I think we'd
need to finish moving the array warnings out of VRP to a later point in
the pipeline.


That's probably a good thing to do regardless -- even with the almost 
certain fallout.


jeff


Richard.


jeff

[11/12/13 only] build error: libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:180:10: fatal error: crypt.h: No such file or directory

2024-03-11 Thread Chris Packham

Hi from crosstool-ng,

I've had a user report a build error for GCC 13.2.0 with an aarch64
config with libsanitizer enabled
(https://github.com/crosstool-ng/crosstool-ng/issues/2010).

[ERROR]
/home/ctng/crosstool-ng/.build/aarch64-unknown-linux-gnu/src/gcc/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp:180:10:
fatal error: crypt.h: No such file or directory
[ALL  ]  180 | #include <crypt.h>
[ALL  ]  |  ^
[ALL  ]compilation terminated.
[ERROR]make[5]: *** [Makefile:614:
sanitizer_platform_limits_posix.lo] Error 1
[ERROR]make[5]: *** Waiting for unfinished jobs

It looks like this may have already been fixed in master by commit
d96e14ceb947 ("libsanitizer: merge from upstream (87e6e490e79384a5)").

Attached is a minimal backport from upstream libsanitizer. I posted it
to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111057 but it was
pointed out that I should send it to gcc-patches if I want it actually
applied.

Thanks,
Chris
From 9b116160a1482c5c0c199f9c21d78a527d11d9ea Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Fri, 28 Apr 2023 09:59:17 -0700
Subject: [PATCH] Remove crypt and crypt_r interceptors

From Florian Weimer's D144073

> On GNU/Linux (glibc), the crypt and crypt_r functions are not part of the main shared object (libc.so.6), but libcrypt (with multiple possible sonames). The sanitizer libraries do not depend on libcrypt, so it can happen that during sanitizer library initialization, no real implementation will be found because the crypt, crypt_r functions are not present in the process image (yet). If its interceptors are called nevertheless, this results in a call through a null pointer when the sanitizer library attempts to forward the call to the real implementation.
>
> Many distributions have already switched to libxcrypt, a library that is separate from glibc and that can be built with sanitizers directly (avoiding the need for interceptors). This patch disables building the interceptor for glibc targets.

Let's remove crypt and crypt_r interceptors (D68431) to fix issues with
newer glibc.

For older glibc, msan will not know that an uninstrumented crypt_r call
initializes `data`, so there is a risk for false positives. However, with some
codebase survey, I think crypt_r uses are very few and the call sites typically
have a `memset(&data, 0, sizeof(data));` anyway.

Fix https://github.com/google/sanitizers/issues/1365
Related: https://bugzilla.redhat.com/show_bug.cgi?id=2169432

Reviewed By: #sanitizers, fweimer, thesamesam, vitalybuka

Differential Revision: https://reviews.llvm.org/D149403
---
 .../sanitizer_common_interceptors.inc | 37 ---
 .../sanitizer_platform_interceptors.h |  2 -
 .../sanitizer_platform_limits_posix.cpp   |  8 
 .../sanitizer_platform_limits_posix.h |  1 -
 4 files changed, 48 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
index ba4b80081f0f..662c41997422 100644
--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -10187,41 +10187,6 @@ INTERCEPTOR(SSIZE_T, getrandom, void *buf, SIZE_T buflen, unsigned int flags) {
 #define INIT_GETRANDOM
 #endif
 
-#if SANITIZER_INTERCEPT_CRYPT
-INTERCEPTOR(char *, crypt, char *key, char *salt) {
-  void *ctx;
-  COMMON_INTERCEPTOR_ENTER(ctx, crypt, key, salt);
-  COMMON_INTERCEPTOR_READ_RANGE(ctx, key, internal_strlen(key) + 1);
-  COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1);
-  char *res = REAL(crypt)(key, salt);
-  if (res != nullptr)
-COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1);
-  return res;
-}
-#define INIT_CRYPT COMMON_INTERCEPT_FUNCTION(crypt);
-#else
-#define INIT_CRYPT
-#endif
-
-#if SANITIZER_INTERCEPT_CRYPT_R
-INTERCEPTOR(char *, crypt_r, char *key, char *salt, void *data) {
-  void *ctx;
-  COMMON_INTERCEPTOR_ENTER(ctx, crypt_r, key, salt, data);
-  COMMON_INTERCEPTOR_READ_RANGE(ctx, key, internal_strlen(key) + 1);
-  COMMON_INTERCEPTOR_READ_RANGE(ctx, salt, internal_strlen(salt) + 1);
-  char *res = REAL(crypt_r)(key, salt, data);
-  if (res != nullptr) {
-COMMON_INTERCEPTOR_WRITE_RANGE(ctx, data,
-   __sanitizer::struct_crypt_data_sz);
-COMMON_INTERCEPTOR_INITIALIZE_RANGE(res, internal_strlen(res) + 1);
-  }
-  return res;
-}
-#define INIT_CRYPT_R COMMON_INTERCEPT_FUNCTION(crypt_r);
-#else
-#define INIT_CRYPT_R
-#endif
-
 #if SANITIZER_INTERCEPT_GETENTROPY
 INTERCEPTOR(int, getentropy, void *buf, SIZE_T buflen) {
   void *ctx;
@@ -10772,8 +10737,6 @@ static void InitializeCommonInterceptors() {
   INIT_GETUSERSHELL;
   INIT_SL_INIT;
   INIT_GETRANDOM;
-  INIT_CRYPT;
-  INIT_CRYPT_R;
   INIT_GETENTROPY;
   INIT_QSORT;
   INIT_QSORT_R;
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h

Re: [PATCH] Fix PR ipa/113996

2024-03-11 Thread Jeff Law





On 3/11/24 4:38 PM, Eric Botcazou wrote:

Hi,

this is a regression present on all active branches: the attached Ada testcase
triggers an assertion failure when compiled with -O2 -gnatp -flto:

   /* Initialize the static chain.  */
   p = DECL_STRUCT_FUNCTION (fn)->static_chain_decl;
   gcc_assert (fn != current_function_decl);
   if (p)
 {
   /* No static chain?  Seems like a bug in tree-nested.cc.  */
   gcc_assert (static_chain);  <--- here

   setup_one_parameter (id, p, static_chain, fn, bb, &vars);
 }

The problem is that the ICF pass identifies two functions, one of which has a
static chain but the other does not.  The proposed fix is just to prevent this
identification from occurring.
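
The idea of the fix can be sketched with a toy model. This is not GCC's sem_function: FuncSummary, summary_hash and summaries_equal are hypothetical names, and the hash mixing is an arbitrary choice made for the illustration. The point is only that a static-chain flag takes part in both the hash and the equality check, so two otherwise identical bodies are not merged.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a per-function summary used by a code-folding pass.
struct FuncSummary {
    std::size_t body_hash;      // hash of the function body (assumed given)
    bool static_chain_present;  // does the function take a static chain?
};

// Mix the flag into the hash so the two kinds land in different buckets.
inline std::size_t summary_hash(const FuncSummary &f) {
    return f.body_hash * 31u + (f.static_chain_present ? 1u : 0u);
}

// Two functions are merge candidates only if the flag matches as well.
inline bool summaries_equal(const FuncSummary &a, const FuncSummary &b) {
    return a.body_hash == b.body_hash
           && a.static_chain_present == b.static_chain_present;
}
```

Comparing the flag in the equality check is what prevents the merge; hashing it merely avoids needless collisions between the two groups.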

Tested on x86-64/Linux, OK for all active branches?


2024-03-11  Eric Botcazou  

PR ipa/113996
* ipa-icf.h (sem_function): Add static_chain_present member.
* ipa-icf.cc (sem_function::get_hash): Hash it.
(sem_function::equals_wpa): Compare it.
(sem_function::equals_private): Likewise.
(sem_function::init): Initialize it.


2024-03-11  Eric Botcazou  

* gnat.dg/lto27.adb: New test.

OK.
jeff

[PATCH v1] libstdc++: Optimize removal from unique assoc containers [PR112934]

2024-03-11 Thread Barnabás Pőcze

Previously, calling erase(key) on both std::map and std::set
would execute the same code that std::multi{map,set} would.
However, doing that is unnecessary because std::{map,set}
guarantee that all elements are unique.

It is reasonable to expect that erase(key) is equivalent to
or better than:

  auto it = m.find(key);
  if (it != m.end())
m.erase(it);

However, this was not the case. Fix that by adding a new
function _Rb_tree<>::_M_erase_unique() that is essentially
equivalent to the above snippet, and use this from both
std::map and std::set.
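
For illustration, the same find-then-erase logic can be written as a user-level helper. This is a sketch rather than the libstdc++ internals: erase_unique is a hypothetical free function, and it relies only on the public find/end/erase members of a unique associative container.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical free-function analogue of the proposed _M_erase_unique:
// at most one element can match, so a single find is enough.
template <typename Container>
typename Container::size_type
erase_unique(Container &c, const typename Container::key_type &key) {
    auto it = c.find(key);
    if (it == c.end())
        return 0;  // nothing to erase
    c.erase(it);   // erase by iterator, no second lookup
    return 1;
}
```

Like erase(key), the helper returns the number of elements removed, which for std::map and std::set can only be 0 or 1.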

libstdc++-v3/ChangeLog:

PR libstdc++/112934
* include/bits/stl_tree.h (_Rb_tree<>::_M_erase_unique): Add.
* include/bits/stl_map.h (map<>::erase): Use _M_erase_unique.
* include/bits/stl_set.h (set<>::erase): Likewise.
---
 libstdc++-v3/include/bits/stl_map.h  |  2 +-
 libstdc++-v3/include/bits/stl_set.h  |  2 +-
 libstdc++-v3/include/bits/stl_tree.h | 17 +
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_map.h 
b/libstdc++-v3/include/bits/stl_map.h
index ad58a631af5..229643b77fd 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -1115,7 +1115,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   size_type
   erase(const key_type& __x)
-  { return _M_t.erase(__x); }
+  { return _M_t._M_erase_unique(__x); }
 
 #if __cplusplus >= 201103L
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
diff --git a/libstdc++-v3/include/bits/stl_set.h 
b/libstdc++-v3/include/bits/stl_set.h
index c0eb4dbf65f..51a1717ec62 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -684,7 +684,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
*/
   size_type
   erase(const key_type& __x)
-  { return _M_t.erase(__x); }
+  { return _M_t._M_erase_unique(__x); }
 
 #if __cplusplus >= 201103L
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
diff --git a/libstdc++-v3/include/bits/stl_tree.h 
b/libstdc++-v3/include/bits/stl_tree.h
index 6f470f04f6a..9e80d449c7e 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -1225,6 +1225,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   size_type
   erase(const key_type& __x);
 
+  size_type
+  _M_erase_unique(const key_type& __x);
+
 #if __cplusplus >= 201103L
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // DR 130. Associative erase should return an iterator.
@@ -2518,6 +2521,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return __old_size - size();
 }
 
+  template
+typename _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::size_type
+_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::
+_M_erase_unique(const _Key& __x)
+{
+  iterator __it = find(__x);
+  if (__it == end())
+   return 0;
+
+  _M_erase_aux(__it);
+  return 1;
+}
+
   template
 typename _Rb_tree<_Key, _Val, _KeyOfValue,
-- 
2.44.0

[PATCH] c++: ICE with temporary of class type in array DMI [PR109966]

2024-03-11 Thread Marek Polacek

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/13?

-- >8 --
This ICE started with the fairly complicated r13-765.  We crash in
gimplify_var_or_parm_decl because a stray VAR_DECL leaked there.
The problem is ultimately that potential_prvalue_result_of wasn't
correctly handling arrays and replace_placeholders_for_class_temp_r
replaced a PLACEHOLDER_EXPR in a TARGET_EXPR which is used in the
context of copy elision.  If I have

  M m[2] = { M{""}, M{""} };

then we don't invoke the M(const M&) copy-ctor.  I think the fix is
to detect such a case in potential_prvalue_result_of.

PR c++/109966

gcc/cp/ChangeLog:

* typeck2.cc (potential_prvalue_result_of): Add walk_subtrees
parameter.  Handle initializing an array from a
brace-enclosed-initializer.
(replace_placeholders_for_class_temp_r): Pass walk_subtrees down to
potential_prvalue_result_of.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/nsdmi-aggr20.C: New test.
* g++.dg/cpp1y/nsdmi-aggr21.C: New test.
---
 gcc/cp/typeck2.cc | 27 ---
 gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr20.C | 17 +++
 gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr21.C | 59 +++
 3 files changed, 96 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr20.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr21.C

diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 31198b2f9f5..8b99ce78e9a 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -1406,46 +1406,59 @@ digest_init_flags (tree type, tree init, int flags, 
tsubst_flags_t complain)
  A a = (A{});// initializer
  A a = (1, A{}); // initializer
  A a = true ? A{} : A{};  // initializer
+ A arr[1] = { A{} };  // initializer
  auto x = A{}.x; // temporary materialization
  auto x = foo(A{});  // temporary materialization
 
FULL_EXPR is the whole expression, SUBOB is its TARGET_EXPR subobject.  */
 
 static bool
-potential_prvalue_result_of (tree subob, tree full_expr)
+potential_prvalue_result_of (tree subob, tree full_expr, int *walk_subtrees)
 {
+#define RECUR(t) potential_prvalue_result_of (subob, t, walk_subtrees)
   if (subob == full_expr)
 return true;
   else if (TREE_CODE (full_expr) == TARGET_EXPR)
 {
   tree init = TARGET_EXPR_INITIAL (full_expr);
   if (TREE_CODE (init) == COND_EXPR)
-   return (potential_prvalue_result_of (subob, TREE_OPERAND (init, 1))
-   || potential_prvalue_result_of (subob, TREE_OPERAND (init, 2)));
+   return (RECUR (TREE_OPERAND (init, 1))
+   || RECUR (TREE_OPERAND (init, 2)));
   else if (TREE_CODE (init) == COMPOUND_EXPR)
-   return potential_prvalue_result_of (subob, TREE_OPERAND (init, 1));
+   return RECUR (TREE_OPERAND (init, 1));
   /* ??? I don't know if this can be hit.  */
   else if (TREE_CODE (init) == PAREN_EXPR)
{
  gcc_checking_assert (false);
- return potential_prvalue_result_of (subob, TREE_OPERAND (init, 0));
+ return RECUR (TREE_OPERAND (init, 0));
}
 }
+  /* The array case listed above.  */
+  else if (TREE_CODE (full_expr) == CONSTRUCTOR
+  && TREE_CODE (TREE_TYPE (full_expr)) == ARRAY_TYPE)
+for (constructor_elt &e : CONSTRUCTOR_ELTS (full_expr))
+  if (e.value == subob)
+   {
+ *walk_subtrees = 0;
+ return true;
+   }
+
   return false;
+#undef RECUR
 }
 
 /* Callback to replace PLACEHOLDER_EXPRs in a TARGET_EXPR (which isn't used
in the context of guaranteed copy elision).  */
 
 static tree
-replace_placeholders_for_class_temp_r (tree *tp, int *, void *data)
+replace_placeholders_for_class_temp_r (tree *tp, int *walk_subtrees, void 
*data)
 {
   tree t = *tp;
   tree full_expr = *static_cast<tree *>(data);
 
   /* We're looking for a TARGET_EXPR nested in the whole expression.  */
   if (TREE_CODE (t) == TARGET_EXPR
-  && !potential_prvalue_result_of (t, full_expr))
+  && !potential_prvalue_result_of (t, full_expr, walk_subtrees))
 {
   tree init = TARGET_EXPR_INITIAL (t);
   while (TREE_CODE (init) == COMPOUND_EXPR)
diff --git a/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr20.C 
b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr20.C
new file mode 100644
index 000..4796d861e83
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr20.C
@@ -0,0 +1,17 @@
+// PR c++/109966
+// { dg-do compile { target c++14 } }
+
+#define SA(X) static_assert ((X),#X)
+
+struct A {
+  int a;
+  int b = a;
+};
+
+struct B {
+  int x = 0;
+  int y[1]{A{x}.b};
+};
+
+constexpr B b = { };
+SA(b.y[0] == 0);
diff --git a/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr21.C 
b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr21.C
new file mode 100644
index 000..efec45bc1a8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr21.C
@@ -0,0 +1,59 @@
+// PR c++/109966
+// { dg-do compile { target c++14 } }
+
+struct k {
+  k(const

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-03-11 Thread Jan Hubicka

> [Public]
> 
> 
> Hi all,
> 
> 
> 
> PFA, the patch that enables support for the next generation AMD Zen5 CPU via 
> -march=znver5 with basic znver5 scheduler Model.
> 
> We may update the scheduler model going forward.
> 
> 
> 
> Good for trunk?
> 
> Thanks and Regards
> Karthiban
> 
> 
> Patch is inline here.
> From 6230938c1420604c8d0af27b0d080970d9b54ac5 Mon Sep 17 00:00:00 2001
> From: karthiban 
> karthiban.anbazha...@amd.com
> Date: Fri, 9 Feb 2024 15:03:09 +0530
> Subject: [PATCH] Add AMD znver5 processor enablement with scheduler model
> 
> gcc/ChangeLog:
> * common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver5.
> * common/config/i386/i386-common.cc (processor_names): Add znver5.
> (processor_alias_table): Likewise.
> * common/config/i386/i386-cpuinfo.h (processor_types): Add new zen
> family.
> (processor_subtypes): Add znver5.
> * config.gcc (x86_64-*-* |...): Likewise.
> * config/i386/driver-i386.cc (host_detect_local_cpu): Let
> march=native detect znver5 cpu's.
> * config/i386/i386-c.cc (ix86_target_macros_internal): Add znver5.
> * config/i386/i386-options.cc (m_ZNVER5): New definition
> (processor_cost_table): Add znver5.
> * config/i386/i386.cc (ix86_reassociation_width): Likewise.
> * config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER5
> (PTA_ZNVER5): New definition.
> * config/i386/i386.md (define_attr "cpu"): Add znver5.
> (Scheduling descriptions) Add znver5.md.
> * config/i386/x86-tune-costs.h (znver5_cost): New definition.
> * config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver5.
> (ix86_adjust_cost): Likewise.
> * config/i386/x86-tune.def (avx512_move_by_pieces): Add m_ZNVER5.
> (avx512_store_by_pieces): Add m_ZNVER5.
> * doc/extend.texi: Add znver5.
> * doc/invoke.texi: Likewise.
> * config/i386/znver5.md: New.
> 
> gcc/testsuite/ChangeLog:
> * g++.target/i386/mv29.C: Handle znver5 arch.
> * gcc.target/i386/funcspec-56.inc: Likewise.

Hi,
I went through the scheduler description and found some places where
patterns can be shared.  Most often it is the vector path instructions,
which block all execution cores, so those patterns can be shared between
znver3 and znver5 (blocking the new cores for znver3 does not change
anything since they are not used anyway).  The insn automata growth is
now about 5%, which I hope is acceptable.  I also tried a completely
separate model, and with it the growth was about 7%.

I plan to commit the patch tomorrow if there are no further ideas for
improvement.

Honza

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index a595ee537a8..017a952a5db 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -310,6 +310,22 @@ get_amd_cpu (struct __processor_model *cpu_model,
  cpu_model->__cpu_subtype = AMDFAM19H_ZNVER3;
}
   break;
+case 0x1a:
+  cpu_model->__cpu_type = AMDFAM1AH;
+  if (model <= 0x77)
+   {
+ cpu = "znver5";
+ CHECK___builtin_cpu_is ("znver5");
+ cpu_model->__cpu_subtype = AMDFAM1AH_ZNVER5;
+   }
+  else if (has_cpu_feature (cpu_model, cpu_features2,
+   FEATURE_AVX512VP2INTERSECT))
+   {
+ cpu = "znver5";
+ CHECK___builtin_cpu_is ("znver5");
+ cpu_model->__cpu_subtype = AMDFAM1AH_ZNVER5;
+   }
+  break;
 default:
   break;
 }
diff --git a/gcc/common/config/i386/i386-common.cc 
b/gcc/common/config/i386/i386-common.cc
index c35191e6925..f814df8385b 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -2166,7 +2166,8 @@ const char *const processor_names[] =
   "znver1",
   "znver2",
   "znver3",
-  "znver4"
+  "znver4",
+  "znver5"
 };
 
 /* Guarantee that the array is aligned with enum processor_type.  */
@@ -2435,6 +2436,9 @@ const pta processor_alias_table[] =
   {"znver4", PROCESSOR_ZNVER4, CPU_ZNVER4,
 PTA_ZNVER4,
 M_CPU_SUBTYPE (AMDFAM19H_ZNVER4), P_PROC_AVX512F},
+  {"znver5", PROCESSOR_ZNVER5, CPU_ZNVER5,
+PTA_ZNVER5,
+M_CPU_SUBTYPE (AMDFAM1AH_ZNVER5), P_PROC_AVX512F},
   {"btver1", PROCESSOR_BTVER1, CPU_GENERIC,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
   | PTA_SSSE3 | PTA_SSE4A | PTA_ABM | PTA_CX16 | PTA_PRFCHW
diff --git a/gcc/common/config/i386/i386-cpuinfo.h 
b/gcc/common/config/i386/i386-cpuinfo.h
index 2ee7470c8da..73131657eab 100644
--- a/gcc/common/config/i386/i386-cpuinfo.h
+++ b/gcc/common/config/i386/i386-cpuinfo.h
@@ -63,6 +63,7 @@ enum processor_types
   INTEL_SIERRAFOREST,
   INTEL_GRANDRIDGE,
   INTEL_CLEARWATERFOREST,
+  AMDFAM1AH,
   CPU_TYPE_MAX,
   BUILTIN_CPU_TYPE_MAX = CPU_TYPE_MAX
 };
@@ -104,6 +105,7 @@ enum processor_subtypes
   INTEL_COREI7_ARROWLAKE_S,
   INTEL_COREI7_PANTHERLAKE,

[PATCH] Fix PR ipa/113996

2024-03-11 Thread Eric Botcazou

Hi,

this is a regression present on all active branches: the attached Ada testcase 
triggers an assertion failure when compiled with -O2 -gnatp -flto:

  /* Initialize the static chain.  */
  p = DECL_STRUCT_FUNCTION (fn)->static_chain_decl;
  gcc_assert (fn != current_function_decl);
  if (p)
{
  /* No static chain?  Seems like a bug in tree-nested.cc.  */
  gcc_assert (static_chain);  <--- here

  setup_one_parameter (id, p, static_chain, fn, bb, &vars);
}

The problem is that the ICF pass identifies two functions, one of which has a 
static chain but the other does not.  The proposed fix is just to prevent this 
identification from occurring.

Tested on x86-64/Linux, OK for all active branches?


2024-03-11  Eric Botcazou  

PR ipa/113996
* ipa-icf.h (sem_function): Add static_chain_present member.
* ipa-icf.cc (sem_function::get_hash): Hash it.
(sem_function::equals_wpa): Compare it.
(sem_function::equals_private): Likewise.
(sem_function::init): Initialize it.


2024-03-11  Eric Botcazou  

* gnat.dg/lto27.adb: New test.

-- 
Eric Botcazou
-- { dg-do link }
-- { dg-options "-O2 -gnatp -flto" { target lto } }

with Ada.Containers.Hashed_Maps;
with Ada.Strings.Hash;

procedure Lto27 is
   subtype Node_Name is String (1 .. 4);

   package Node_Maps is new Ada.Containers.Hashed_Maps
     (Key_Type        => Node_Name,
      Element_Type    => Integer,
      Hash            => Ada.Strings.Hash,
      Equivalent_Keys => "=");

begin
   null;
end;
diff --git a/gcc/ipa-icf.cc b/gcc/ipa-icf.cc
index 5d5a42f9c6c..dff7ad6efda 100644
--- a/gcc/ipa-icf.cc
+++ b/gcc/ipa-icf.cc
@@ -284,6 +284,7 @@ sem_function::get_hash (void)
   hstate.add_int (177454); /* Random number for function type.  */
 
   hstate.add_int (arg_count);
+  hstate.add_int (static_chain_present);
   hstate.add_int (cfg_checksum);
   hstate.add_int (gcode_hash);
 
@@ -655,7 +656,10 @@ sem_function::equals_wpa (sem_item *item,
 }
 
   if (list1 || list2)
-return return_false_with_msg ("Mismatched number of parameters");
+return return_false_with_msg ("mismatched number of parameters");
+
+  if (static_chain_present != m_compared_func->static_chain_present)
+return return_false_with_msg ("static chain mismatch");
 
   if (node->num_references () != item->node->num_references ())
 return return_false_with_msg ("different number of references");
@@ -876,7 +880,10 @@ sem_function::equals_private (sem_item *item)
 return return_false ();
 }
   if (arg1 || arg2)
-return return_false_with_msg ("Mismatched number of arguments");
+return return_false_with_msg ("mismatched number of arguments");
+
+  if (static_chain_present != m_compared_func->static_chain_present)
+return return_false_with_msg ("static chain mismatch");
 
   if (!dyn_cast <cgraph_node *> (node)->has_gimple_body_p ())
 return true;
@@ -1368,6 +1375,8 @@ sem_function::init (ipa_icf_gimple::func_checker *checker)
   /* iterating all function arguments.  */
   arg_count = count_formal_params (fndecl);
 
+  static_chain_present = func->static_chain_decl != NULL_TREE;
+
   edge_count = n_edges_for_fn (func);
   cgraph_node *cnode = dyn_cast <cgraph_node *> (node);
   if (!cnode->thunk)
diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h
index ef7e41bfa88..bd9fd9fb294 100644
--- a/gcc/ipa-icf.h
+++ b/gcc/ipa-icf.h
@@ -355,6 +355,9 @@ public:
  parameters.  */
   bool compatible_parm_types_p (tree, tree);
 
+  /* Return true if parameter I may be used.  */
+  bool param_used_p (unsigned int i);
+
   /* Exception handling region tree.  */
   eh_region region_tree;
 
@@ -379,6 +382,9 @@ public:
   /* Total number of SSA names used in the function.  */
   unsigned ssa_names_size;
 
+  /* Whether the special PARM_DECL for the static chain is present.  */
+  bool static_chain_present;
+
   /* Array of structures for all basic blocks.  */
   vec <ipa_icf_gimple::sem_bb *> bb_sorted;
 
@@ -386,9 +392,6 @@ public:
  function.  */
   hashval_t m_alias_sets_hash;
 
-  /* Return true if parameter I may be used.  */
-  bool param_used_p (unsigned int i);
-
 private:
   /* Calculates hash value based on a BASIC_BLOCK.  */
   hashval_t get_bb_hash (const ipa_icf_gimple::sem_bb *basic_block);

[PATCH] Fortran: handle procedure pointer component in DT array [PR110826]

2024-03-11 Thread Harald Anlauf

Dear all,

the attached patch fixes an ICE on valid code that occurs when assigning
a procedure pointer that is a component of a derived-type (DT) array and
the function in question is array-valued.  (The procedure
pointer itself cannot be an array.)

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From a9be17cf987b796c49684cde2f20dac3839c736c Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 11 Mar 2024 22:05:51 +0100
Subject: [PATCH] Fortran: handle procedure pointer component in DT array
 [PR110826]

gcc/fortran/ChangeLog:

	PR fortran/110826
	* array.cc (gfc_array_dimen_size): When walking the ref chain of an
	array and the ultimate component is a procedure pointer, do not try
	to figure out its dimension even if it is an array-valued function.

gcc/testsuite/ChangeLog:

	PR fortran/110826
	* gfortran.dg/proc_ptr_comp_53.f90: New test.
---
 gcc/fortran/array.cc  |  7 
 .../gfortran.dg/proc_ptr_comp_53.f90  | 41 +++
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/proc_ptr_comp_53.f90

diff --git a/gcc/fortran/array.cc b/gcc/fortran/array.cc
index 3a6e3a7c95b..e9934f1491b 100644
--- a/gcc/fortran/array.cc
+++ b/gcc/fortran/array.cc
@@ -2597,6 +2597,13 @@ gfc_array_dimen_size (gfc_expr *array, int dimen, mpz_t *result)
 case EXPR_FUNCTION:
   for (ref = array->ref; ref; ref = ref->next)
 	{
+	  /* Ultimate component is a procedure pointer.  */
+	  if (ref->type == REF_COMPONENT
+	  && !ref->next
+	  && ref->u.c.component->attr.function
+	  && IS_PROC_POINTER (ref->u.c.component))
+	return false;
+
 	  if (ref->type != REF_ARRAY)
 	continue;

diff --git a/gcc/testsuite/gfortran.dg/proc_ptr_comp_53.f90 b/gcc/testsuite/gfortran.dg/proc_ptr_comp_53.f90
new file mode 100644
index 000..881ddd3558f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/proc_ptr_comp_53.f90
@@ -0,0 +1,41 @@
+! { dg-do compile }
+! PR fortran/110826 - procedure pointer component in DT array
+
+module m
+  implicit none
+
+  type pp
+procedure(func_template), pointer, nopass :: f =>null()
+  end type pp
+
+  abstract interface
+ function func_template(state) result(dstate)
+   implicit none
+   real, dimension(:,:), intent(in)  :: state
+   real, dimension(size(state,1), size(state,2)) :: dstate
+ end function
+  end interface
+
+contains
+
+  function zero_state(state) result(dstate)
+real, dimension(:,:), intent(in)  :: state
+real, dimension(size(state,1), size(state,2)) :: dstate
+dstate = 0.
+  end function zero_state
+
+end module m
+
+program test_func_array
+  use m
+  implicit none
+
+  real, dimension(4,6) :: state
+  type(pp) :: func_scalar
+  type(pp) :: func_array(4)
+
+  func_scalar  %f => zero_state
+  func_array(1)%f => zero_state
+  print *, func_scalar  %f(state)
+  print *, func_array(1)%f(state)
+end program test_func_array
--
2.35.3

Re: [PATCH v2] c++: Check module attachment instead of just purview when necessary [PR112631]

2024-03-11 Thread Jason Merrill


On 3/8/24 18:18, Nathaniel Shead wrote:

On Fri, Mar 08, 2024 at 10:19:52AM -0500, Jason Merrill wrote:

On 3/7/24 21:55, Nathaniel Shead wrote:

On Mon, Nov 27, 2023 at 03:59:39PM +1100, Nathaniel Shead wrote:

On Thu, Nov 23, 2023 at 03:03:37PM -0500, Nathan Sidwell wrote:

On 11/20/23 04:47, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu. I don't have write
access.

-- >8 --

Block-scope declarations of functions or extern values are not allowed
when attached to a named module. Similarly, class member functions are
not inline if attached to a named module. However, in both these cases
we currently only check if the declaration is within the module purview;
it is possible for such a declaration to occur within the module purview
but not be attached to a named module (e.g. in an 'extern "C++"' block).
This patch makes the required adjustments.



Ah I'd been puzzling over the default inlinedness of  member-fns of
block-scope structs.  Could you augment the testcase to make sure that's
right too?

Something like:

// dg-module-do link
export module Mod;

export auto Get () {
struct X { void Fn () {} };
return X();
}


///
import Mod;
void Frob () { Get().Fn(); }



I gave this a try and it indeed doesn't work correctly; 'Fn' needs to be
marked 'inline' for this to link (whether or not 'Get' itself is
inline). I've tried tracing the code to work out what's going on but
I've been struggling to work out how all the different flags (e.g.
TREE_PUBLIC, TREE_EXTERNAL, TREE_COMDAT, DECL_NOT_REALLY_EXTERN)
interact, which flags we want to be set where, and where the decision of
what function definitions to emit is actually made.

I did find that doing 'mark_used(decl)' on all member functions in
block-scope structs seems to work however, but I wonder if that's maybe
too aggressive or if there's something else we should be doing?


I got around to looking at this again, here's an updated version of this
patch. Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

(I'm not sure if 'start_preparsed_function' is the right place to be
putting this kind of logic or if it should instead be going in
'grokfndecl', e.g. decl.cc:10761 where the rules for making local
functions have no linkage are initially determined, but I found this
easier to implement: happy to move around though if preferred.)

-- >8 --

Block-scope declarations of functions or extern values are not allowed
when attached to a named module. Similarly, class member functions are
not inline if attached to a named module. However, in both these cases
we currently only check if the declaration is within the module purview;
it is possible for such a declaration to occur within the module purview
but not be attached to a named module (e.g. in an 'extern "C++"' block).
This patch makes the required adjustments.

While implementing this we discovered that block-scope local functions
are not correctly emitted, causing link failures; this patch also
corrects some assumptions here and ensures that they are emitted when
needed.

PR c++/112631

gcc/cp/ChangeLog:

* cp-tree.h (named_module_attach_p): New function.
* decl.cc (start_decl): Check for attachment not purview.
(grokmethod): Likewise.


These changes are OK; the others I want to consider more.


Thanks, I can commit this as a separate commit if you prefer?


Please.


+export auto n_n() {
+  internal();
+  struct X { void f() { internal(); } };
+  return X{};


Hmm, is this not a prohibited exposure?  Seems like X has no linkage because
it's at block scope, and the deduced return type names it.

I know we try to support this "voldemort" pattern, but is that actually
correct?


I had similar doubts, but this is not an especially uncommon pattern in
the wild either. I also asked some other people for their thoughts and
got told:

   "no linkage" doesn't mean it's ill-formed to name it in other scopes.
   It means a declaration in another scope cannot correspond to it

And that the wording in [basic.link] p2.4 is imprecise. (Apparently they
were going to raise a core issue about this too, I think?)

As for whether it's an exposure, looking at [basic.link] p15, the entity
'X' doesn't actually appear to be TU-local: it doesn't have a name with
internal linkage (no linkage is different) and is not declared or
introduced within the definition of a TU-local entity (n_n is not
TU-local).


Hmm, I think you're right.  And this rule:


-/* DR 757: A type without linkage shall not be used as the type of a
-   variable or function with linkage, unless
-   o the variable or function has extern "C" linkage (7.5 [dcl.link]), or
-   o the variable or function is not used (3.2 [basic.def.odr]) or is
-   defined in the same translation unit.


is no longer part of the standard since C++20; the remnant of this rule 
is the example in


https://eel.is/c++draft/basic#def.odr-11


auto f() {
  struct A {};
  return A{};
}

Re: [PATCH v6 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-03-11 Thread Siddhesh Poyarekar





On 2024-02-16 14:47, Qing Zhao wrote:

gcc/c-family/ChangeLog:

* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---
  gcc/c-family/c-ubsan.cc   | 42 +
  .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++
  .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++
  .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
  4 files changed, 167 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..164b29845b3a 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, 
data));
  }
  
+/* Get the tree that represented the number of counted_by, i.e, the maximum

+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int type_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));


Again for consistency, this should probably be class_of_size.


+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+  build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+   build_zero_cst (type), size);
+  }
+
+  /* Only when type_of_size is 1,i.e, the number of the elements of
+ the object type, return the size.  */
+  if (type_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
  /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
 that gets expanded in the sanopt pass, and make an array dimension
 of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
  && COMPLETE_TYPE_P (type)
  && integer_zerop (TYPE_SIZE (type)))
bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+  && is_access_with_size_p ((TREE_OPERAND (array, 0))))
+   {
+ bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+ bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+  bound,
+  build_int_cst (TREE_TYPE (bound), 1));
+   }


This will wrap if bound == 0, maybe that needs to be special-cased.  And 
maybe also add a test for it below.



else
return NULL_TREE;
  }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index ..148934975ee5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include <stdlib.h>
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void __attribute__((__noinline__)) setup_and_test_vla_1 (int n1, int n2, int m)
+{
+  struct foo {
+int n;
+int p[][n2][n1] __attribute__((counted_by(n)));
+  } *f;
+
+  f

Re: [PATCH v6 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-03-11 Thread Siddhesh Poyarekar





On 2024-02-16 14:47, Qing Zhao wrote:

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.
---
  .../gcc.dg/builtin-object-size-common.h   |  11 ++
  .../gcc.dg/flex-array-counted-by-3.c  |  63 +++
  .../gcc.dg/flex-array-counted-by-4.c  | 178 ++
  .../gcc.dg/flex-array-counted-by-5.c  |  48 +
  gcc/tree-object-size.cc   |  59 ++
  5 files changed, 359 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h 
b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
__builtin_abort ();   \
  return 0;   \
} while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v; \
+  if (p == v)\
+__builtin_printf ("ok:  %s == %zd\n", #p, p);  \
+  else   \
+{\
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);  \
+  FAIL ();   \
+}\
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index ..0066c32ca808
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f; 
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
++ normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+ + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
++ attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+  array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+  array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index ..3ce7f3545549
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+we should always use the latest value that is held by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 10
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/* In general, Due to type casting, the

Re: [PATCH v6 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-03-11 Thread Siddhesh Poyarekar





On 2024-02-16 14:47, Qing Zhao wrote:

Including the following changes:
* The definition of the new internal function .ACCESS_WITH_SIZE
   in internal-fn.def.
* C FE converts every reference to a FAM with a "counted_by" attribute
   to a call to the internal function .ACCESS_WITH_SIZE.
   (build_component_ref in c_typeck.cc)

   This includes the case when the object is statically allocated and
   initialized.
   In order to make this work, the routines initializer_constant_valid_p_1
   and output_constant in varasm.cc are updated to handle calls to
   .ACCESS_WITH_SIZE.
   (initializer_constant_valid_p_1 and output_constant in varasm.cc)

   However, for the reference inside "offsetof", the "counted_by" attribute is
   ignored since it's not useful at all.
   (c_parser_postfix_expression in c/c-parser.cc)

   In addition to "offsetof", for references inside the operators "typeof"
   and "alignof", we ignore the counted_by attribute too.

   When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
   replace the call with its first argument.

* Convert every call to .ACCESS_WITH_SIZE to its first argument.
   (expand_ACCESS_WITH_SIZE in internal-fn.cc)
* Adjust alias analysis to exclude the new internal from clobbering anything.
   (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in 
tree-ssa-alias.cc)
* Adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when
   its LHS is eliminated as dead code.
   (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
* Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
   get the reference from the call to .ACCESS_WITH_SIZE.
   (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)

gcc/c/ChangeLog:

* c-parser.cc (c_parser_postfix_expression): Ignore the counted-by
attribute when build_component_ref inside offsetof operator.
* c-tree.h (build_component_ref): Add one more parameter.
* c-typeck.cc (build_counted_by_ref): New function.
(build_access_with_size_for_counted_by): New function.
(build_component_ref): Check the counted-by attribute and build
call to .ACCESS_WITH_SIZE.
(build_unary_op): When building ADDR_EXPR for
 .ACCESS_WITH_SIZE, use its first argument.
 (lvalue_p): Accept call to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

* internal-fn.cc (expand_ACCESS_WITH_SIZE): New function.
* internal-fn.def (ACCESS_WITH_SIZE): New internal function.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Special case
IFN_ACCESS_WITH_SIZE.
(call_may_clobber_ref_p_1): Special case IFN_ACCESS_WITH_SIZE.
* tree-ssa-dce.cc (eliminate_unnecessary_stmts): Eliminate the call
to .ACCESS_WITH_SIZE when its LHS is dead.
* tree.cc (process_call_operands): Adjust side effect for function
.ACCESS_WITH_SIZE.
(is_access_with_size_p): New function.
(get_ref_from_access_with_size): New function.
* tree.h (is_access_with_size_p): New prototype.
(get_ref_from_access_with_size): New prototype.
* varasm.cc (initializer_constant_valid_p_1): Handle call to
.ACCESS_WITH_SIZE.
(output_constant): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-2.c: New test.
---
  gcc/c/c-parser.cc |  10 +-
  gcc/c/c-tree.h|   2 +-
  gcc/c/c-typeck.cc | 128 +-
  gcc/internal-fn.cc|  36 +
  gcc/internal-fn.def   |   4 +
  .../gcc.dg/flex-array-counted-by-2.c  | 112 +++
  gcc/tree-ssa-alias.cc |   2 +
  gcc/tree-ssa-dce.cc   |   5 +-
  gcc/tree.cc   |  25 +++-
  gcc/tree.h|   8 ++
  gcc/varasm.cc |  10 ++
  11 files changed, 332 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index c31349dae2ff..a6ed5ac43bb1 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10850,9 +10850,12 @@ c_parser_postfix_expression (c_parser *parser)
if (c_parser_next_token_is (parser, CPP_NAME))
  {
c_token *comp_tok = c_parser_peek_token (parser);
+   /* Ignore the counted_by attribute for reference inside
+  offsetof since the information is not useful at all.  */
offsetof_ref
  = build_component_ref (loc, offsetof_ref, comp_tok->value,
-comp_tok->location, UNKNOWN_LOCATION);
+comp_tok->location, UNKNOWN_LOCATION,
+false);
c_parser_consume_token (parser);

[PATCH] c++: recalculating local specs via build_extra_args [PR114303]

2024-03-11 Thread Patrick Palka

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk and release branches?

-- >8 --

r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated contexts
first so that we prefer processing a local specialization in an evaluated
context even if its first use is in an unevaluated context.  But this
means we need to avoid walking a tree that already has extra args/specs
saved because the list of saved specs appears to be an evaluated
context.  It seems then that we should be calculating the saved specs
from scratch each time, rather than potentially walking the saved specs
list from an earlier partial instantiation when calling build_extra_args
a second time around.

PR c++/114303

gcc/cp/ChangeLog:

* constraint.cc (tsubst_requires_expr): Clear
REQUIRES_EXPR_EXTRA_ARGS before calling build_extra_args.
* pt.cc (tsubst_stmt) <case IF_STMT>: Call build_extra_args
on the new IF_STMT instead of t which might already have
IF_STMT_EXTRA_ARGS.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda6.C: New test.
---
 gcc/cp/constraint.cc |  1 +
 gcc/cp/pt.cc |  2 +-
 .../g++.dg/cpp1z/constexpr-if-lambda6.C  | 16 
 3 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda6.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 49de3211d4c..8a3b5d80ba7 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2362,6 +2362,7 @@ tsubst_requires_expr (tree t, tree args, sat_info info)
 matching or dguide constraint rewriting), in which case we need
 to partially substitute.  */
   t = copy_node (t);
+  REQUIRES_EXPR_EXTRA_ARGS (t) = NULL_TREE;
   REQUIRES_EXPR_EXTRA_ARGS (t) = build_extra_args (t, args, info.complain);
   return t;
 }
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8cf0d5b7a8d..37f2392d035 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -18718,7 +18718,7 @@ tsubst_stmt (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
  IF_COND (stmt) = IF_COND (t);
  THEN_CLAUSE (stmt) = THEN_CLAUSE (t);
  ELSE_CLAUSE (stmt) = ELSE_CLAUSE (t);
- IF_STMT_EXTRA_ARGS (stmt) = build_extra_args (t, args, complain);
+ IF_STMT_EXTRA_ARGS (stmt) = build_extra_args (stmt, args, complain);
  add_stmt (stmt);
  break;
}
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda6.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda6.C
new file mode 100644
index 000..038c2a41210
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda6.C
@@ -0,0 +1,16 @@
+// PR c++/114303
+// { dg-do compile { target c++17 } }
+
+struct A { static constexpr bool value = true; };
+
+int main() {
+  [](auto x1) {
+return [&](auto) {
+  return [&](auto x3) {
+if constexpr (decltype(x3)::value) {
+  static_assert(decltype(x1)::value);
+}
+  }(A{});
+}(0);
+  }(A{});
+}
-- 
2.44.0.165.ge09f1254c5

Re: [PATCH v6 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-03-11 Thread Siddhesh Poyarekar


On 2024-02-16 14:47, Qing Zhao wrote:

'counted_by (COUNT)'
  The 'counted_by' attribute may be attached to the C99 flexible
  array member of a structure.  It indicates that the number of the
  elements of the array is given by the field named "COUNT" in the
  same structure as the flexible array member.  GCC uses this
  information to improve the results of the array bound sanitizer and
  the '__builtin_dynamic_object_size'.

  For instance, the following code:

   struct P {
 size_t count;
 char other;
 char array[] __attribute__ ((counted_by (count)));
   } *p;

  specifies that the 'array' is a flexible array member whose number
  of elements is given by the field 'count' in the same structure.

  The field that represents the number of the elements should have an
  integer type.  Otherwise, the compiler will report a warning and
  ignore the attribute.

  When the field that represents the number of the elements is assigned a
  negative integer value, the compiler will treat the value as zero.

  An explicit 'counted_by' annotation defines a relationship between
  two objects, 'p->array' and 'p->count', and there are the following
  requirements on the relationship between this pair:

 * 'p->count' should be initialized before the first reference to
   'p->array';

 * 'p->array' has _at least_ 'p->count' number of elements
   available all the time.  This relationship must hold even
   after any of these related objects are updated during the
   program.

  It is the user's responsibility to make sure the above requirements
  are kept at all times.  Otherwise the compiler will report
  warnings, and the results of the array bound
  sanitizer and the '__builtin_dynamic_object_size' are undefined.

  One important feature of the attribute is that a reference to the
  flexible array member field will use the latest value assigned to
  the field that represents the number of the elements before that
  reference.  For example,

 p->count = val1;
 p->array[20] = 0;  // ref1 to p->array
 p->count = val2;
 p->array[30] = 0;  // ref2 to p->array

  in the above, 'ref1' will use 'val1' as the number of the elements
  in 'p->array', and 'ref2' will use 'val2' as the number of elements
  in 'p->array'.
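
The two requirements above suggest a simple discipline when allocating or
growing such an object: store the new count first, then touch the array. The
following C sketch shows one way to do that. The names struct buf, buf_alloc
and buf_grow are hypothetical, and the COUNTED_BY macro is an assumption that
lets the code build on compilers without the attribute.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Use counted_by where the compiler knows it; otherwise expand to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(field) __attribute__((counted_by(field)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(field)
#endif

struct buf {
  size_t count;
  char other;
  char array[] COUNTED_BY(count);
};

/* Allocate a zero-filled buffer with N elements.  */
static struct buf *
buf_alloc (size_t n)
{
  struct buf *p = malloc (sizeof *p + n);
  if (p == NULL)
    return NULL;
  p->count = n;               /* Initialize before the first reference.  */
  memset (p->array, 0, n);
  return p;
}

/* Grow P to N elements; the new tail is zero-filled.  Returns NULL and
   leaves P valid if the reallocation fails.  Requests that would shrink
   the buffer are ignored in this sketch.  */
static struct buf *
buf_grow (struct buf *p, size_t n)
{
  if (n <= p->count)
    return p;
  struct buf *q = realloc (p, sizeof *q + n);
  if (q == NULL)
    return NULL;
  size_t old = q->count;
  q->count = n;               /* Update the counter first ...  */
  memset (q->array + old, 0, n - old);   /* ... then touch new elements.  */
  return q;
}
```

Writing the new tail before updating the counter would be a reference that
still uses the old, smaller count, which is exactly the situation the
"latest value assigned" rule describes.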


I can't approve of course, but here's a review of the code that should 
hopefully make it easier for the C frontend maintainers.




gcc/c-family/ChangeLog:

PR C/108896
* c-attribs.cc (handle_counted_by_attribute): New function.
(attribute_takes_identifier_p): Add counted_by attribute to the list.
* c-common.cc (c_flexible_array_member_type_p): ...To this.
* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

PR C/108896
* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
(add_flexible_array_elts_to_size): Use renamed function.
(is_flexible_array_member_p): Use renamed function.
(verify_counted_by_attribute): New function.
(finish_struct): Use renamed function and verify counted_by
attribute.
* c-tree.h (lookup_field): New prototype.
* c-typeck.cc (lookup_field): Expose as extern function.

gcc/ChangeLog:

PR C/108896
* doc/extend.texi: Document attribute counted_by.

gcc/testsuite/ChangeLog:

PR C/108896
* gcc.dg/flex-array-counted-by.c: New test.
---
  gcc/c-family/c-attribs.cc| 54 -
  gcc/c-family/c-common.cc | 13 +++
  gcc/c-family/c-common.h  |  1 +
  gcc/c/c-decl.cc  | 85 
  gcc/c/c-tree.h   |  1 +
  gcc/c/c-typeck.cc|  3 +-
  gcc/doc/extend.texi  | 64 +++
  gcc/testsuite/gcc.dg/flex-array-counted-by.c | 40 +
  8 files changed, 241 insertions(+), 20 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 40a0cf90295d..4395c0656b14 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -105,6 +105,8 @@ static tree handle_warn_if_not_aligned_attribute (tree *, 
tree, tree,
  int, bool *);
  static tree handle_strict_flex_array_attribute (tree *, tree, tree,
 int, bool *);
+static tree handle_counted_by_attribute (tree *, tree, tree,
+  int, bool *);
  static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
  static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;

[PATCH] tree-optimization/114297 - SLP reduction with early break fix

2024-03-11 Thread Richard Biener

The following makes sure to pass in the SLP node for the live stmts
we are generating the reduction epilogue for to
vect_create_epilog_for_reduction.  This follows the previous fix for
the non-SLP path.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

PR tree-optimization/114297
* tree-vect-loop.cc (vectorizable_live_operation): Pass in the
live stmts SLP node to vect_create_epilog_for_reduction.

* gcc.dg/vect/vect-early-break_123-pr114297.c: New testcase.
---
 .../vect/vect-early-break_123-pr114297.c  | 22 +++
 gcc/tree-vect-loop.cc |  7 +++---
 2 files changed, 26 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_123-pr114297.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_123-pr114297.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_123-pr114297.c
new file mode 100644
index 000..84487b7903b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_123-pr114297.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+
+void h() __attribute__((__noreturn__));
+struct Extremes {
+  int w;
+  int h;
+};
+struct Extremes *array;
+int f(int num, int size1)
+{
+  int sw = 0, sh = 0;
+  for (int i = 0; i < size1; ++i)
+  {
+if (num - i == 0)
+  h();
+sw += array[i].w;
+sh += array[i].h;
+  }
+  return (sw) +  (sh);
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 20ee0aad932..4375ebdcb49 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10729,17 +10729,18 @@ vectorizable_live_operation (vec_info *vinfo, 
stmt_vec_info stmt_info,
 block, but we have to find an alternate exit first.  */
   if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
{
+ slp_tree phis_node = slp_node ? slp_node_instance->reduc_phis : NULL;
  for (auto exit : get_loop_exit_edges (LOOP_VINFO_LOOP (loop_vinfo)))
if (exit != LOOP_VINFO_IV_EXIT (loop_vinfo))
  {
vect_create_epilog_for_reduction (loop_vinfo, reduc_info,
- slp_node, slp_node_instance,
+ phis_node, slp_node_instance,
  exit);
break;
  }
  if (LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
-   vect_create_epilog_for_reduction (loop_vinfo, reduc_info, slp_node,
- slp_node_instance,
+   vect_create_epilog_for_reduction (loop_vinfo, reduc_info,
+ phis_node, slp_node_instance,
  LOOP_VINFO_IV_EXIT (loop_vinfo));
}
 
-- 
2.35.3

Re: [PATCH] c++/modules: Support target-specific nodes with streaming [PR111224]

2024-03-11 Thread Patrick Palka

On Sun, 10 Mar 2024, Nathaniel Shead wrote:

> Bootstrapped and regtested on x86_64-pc-linux-gnu and
> aarch64-unknown-linux-gnu, OK for trunk?
> 
> It's worth noting that the AArch64 machines I had available to test with
> didn't have a new enough glibc to reproduce the ICEs in the PR, but this
> patch will be necessary (albeit possibly not sufficient) to fix it.
> 
> -- >8 --
> 
> Some targets make use of POLY_INT_CSTs and other custom builtin types,
> which currently violate some assumptions when streaming. This patch adds
> support for them, specifically AArch64 SVE types like __fp16.

It seems other built-in types are handled by adding them to the
fixed_trees vector in init_modules (and then we install them first
during streaming).  Could we just add all the target-specific types to
fixed_trees too?

> 
> This patch doesn't provide "full" support of AArch64 SVE, however, since
> for that we would need to support 'target' nodes (tracked in PR108080).
> 
>   PR c++/111224
> 
> gcc/cp/ChangeLog:
> 
>   * module.cc (enum tree_tag): Add new tag for builtin types.
>   (trees_out::start): POLY_INT_CSTs can be emitted.
>   (trees_in::start): Likewise.
>   (trees_out::core_vals): Stream POLY_INT_CSTs.
>   (trees_in::core_vals): Likewise.
>   (trees_out::type_node): Handle target-specific builtin types,
>   and vectors with NUM_POLY_INT_COEFFS > 1.
>   (trees_in::tree_node): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/modules/pr111224_a.C: New test.
>   * g++.dg/modules/pr111224_b.C: New test.
> 
> Signed-off-by: Nathaniel Shead 
> ---
>  gcc/cp/module.cc  | 70 +++
>  gcc/testsuite/g++.dg/modules/pr111224_a.C | 17 ++
>  gcc/testsuite/g++.dg/modules/pr111224_b.C | 13 +
>  3 files changed, 90 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/modules/pr111224_a.C
>  create mode 100644 gcc/testsuite/g++.dg/modules/pr111224_b.C
> 
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index 99055523d91..0b5e2e67053 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -2718,6 +2718,7 @@ enum tree_tag {
>tt_typedef_type,   /* A (possibly implicit) typedefed type.  */
>tt_derived_type,   /* A type derived from another type.  */
>tt_variant_type,   /* A variant of another type.  */
> +  tt_builtin_type,  /* A custom builtin type.  */
>  
>tt_tinfo_var,  /* Typeinfo object. */
>tt_tinfo_typedef,  /* Typeinfo typedef.  */
> @@ -2732,7 +2733,7 @@ enum tree_tag {
>tt_binfo,  /* A BINFO.  */
>tt_vtable, /* A vtable.  */
>tt_thunk,  /* A thunk.  */
> -  tt_clone_ref,
> +  tt_clone_ref, /* A cloned function.  */
>  
>tt_entity, /* A extra-cluster entity.  */
>  
> @@ -5173,7 +5174,6 @@ trees_out::start (tree t, bool code_streamed)
>break;
>  
>  case FIXED_CST:
> -case POLY_INT_CST:
>gcc_unreachable (); /* Not supported in C++.  */
>break;
>  
> @@ -5259,7 +5259,6 @@ trees_in::start (unsigned code)
>  
>  case FIXED_CST:
>  case IDENTIFIER_NODE:
> -case POLY_INT_CST:
>  case SSA_NAME:
>  case TARGET_MEM_REF:
>  case TRANSLATION_UNIT_DECL:
> @@ -6106,7 +6105,10 @@ trees_out::core_vals (tree t)
>break;
>  
>  case POLY_INT_CST:
> -  gcc_unreachable (); /* Not supported in C++.  */
> +  if (streaming_p ())
> + for (unsigned ix = 0; ix != NUM_POLY_INT_COEFFS; ix++)
> +   WT (POLY_INT_CST_COEFF (t, ix));
> +  break;
>  
>  case REAL_CST:
>if (streaming_p ())
> @@ -6615,8 +6617,9 @@ trees_in::core_vals (tree t)
>break;
>  
>  case POLY_INT_CST:
> -  /* Not suported in C++.  */
> -  return false;
> +  for (unsigned ix = 0; ix != NUM_POLY_INT_COEFFS; ix++)
> + RT (POLY_INT_CST_COEFF (t, ix));
> +  break;
>  
>  case REAL_CST:
>if (const void *bytes = buf (sizeof (real_value)))
> @@ -8930,6 +8933,32 @@ trees_out::type_node (tree type)
>return;
>  }
>  
> +  if (tree name = TYPE_NAME (type))
> +if (TREE_CODE (name) == TYPE_DECL && DECL_ARTIFICIAL (name))
> +  {
> + /* Potentially a custom machine- or OS-specific builtin type.  */
> + bool found = false;
> + unsigned ix = 0;
> + for (tree t = registered_builtin_types; t; t = TREE_CHAIN (t), ix++)
> +   if (TREE_VALUE (t) == type)
> + {
> +   found = true;
> +   break;
> + }
> + if (found)
> +   {
> + int type_tag = insert (type);
> + if (streaming_p ())
> +   {
> + i (tt_builtin_type);
> + u (ix);
> + dump (dumper::TREE)
> +   && dump ("Wrote:%d builtin type %N", type_tag, name);
> +   }
> + return;
> +   }
> +  }
> +
>if (streaming_p ())
>  {
>u (tt_derived_type);
> @@ -9068,8 +9097,8 @@ trees_out::type_node (tree type)
>

Re: [PATCH] testsuite: vect: Require vect_hw_misalign in gcc.dg/vect/vect-cost-model-1.c etc. [PR98238]

2024-03-11 Thread Richard Biener

On Mon, 11 Mar 2024, Rainer Orth wrote:

> Several gcc.dg/vect/vect-cost-model-?.c tests FAIL on 32 and 64-bit
> Solaris/SPARC:
> 
> FAIL: gcc.dg/vect/vect-cost-model-1.c -flto -ffat-lto-objects  scan-tree-dump 
> vect "LOOP VECTORIZED"
> FAIL: gcc.dg/vect/vect-cost-model-1.c scan-tree-dump vect "LOOP VECTORIZED"
> FAIL: gcc.dg/vect/vect-cost-model-3.c -flto -ffat-lto-objects  scan-tree-dump 
> vect "LOOP VECTORIZED"
> FAIL: gcc.dg/vect/vect-cost-model-3.c scan-tree-dump vect "LOOP VECTORIZED"
> FAIL: gcc.dg/vect/vect-cost-model-5.c -flto -ffat-lto-objects  scan-tree-dump 
> vect "LOOP VECTORIZED"
> FAIL: gcc.dg/vect/vect-cost-model-5.c scan-tree-dump vect "LOOP VECTORIZED"
> 
> The dumps show
> 
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:7:30:
>  note:   ==> examining statement: _3 = *_2;
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:7:30:
>  missed:   unsupported unaligned access
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:8:6:
>  missed:   not vectorized: relevant stmt not supported: _3 = *_2;
> 
> so I think the tests need to require vect_hw_misalign.  This is what
> this patch does.
> 
> Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.
> 
> Ok for trunk?

OK.

Thanks,
Richard.

>   Rainer
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] testsuite: vect: Require vect_perm in several tests [PR114071, PR113557, PR96109]

2024-03-11 Thread Richard Biener

On Mon, 11 Mar 2024, Rainer Orth wrote:

> Several vectorization tests FAIL on 32 and 64-bit Solaris/SPARC:
> 
> FAIL: gcc.dg/vect/pr37027.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/pr37027.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorizing stmts 
> using SLP" 1
> FAIL: gcc.dg/vect/pr67790.c -flto -ffat-lto-objects  scan-tree-dump vect 
> "vectorizing stmts using SLP"
> FAIL: gcc.dg/vect/pr67790.c scan-tree-dump vect "vectorizing stmts using SLP"
> FAIL: gcc.dg/vect/slp-47.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
> "vectorizing stmts using SLP" 2
> FAIL: gcc.dg/vect/slp-47.c scan-tree-dump-times vect "vectorizing stmts using 
> SLP" 2
> FAIL: gcc.dg/vect/slp-48.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
> "vectorizing stmts using SLP" 2
> FAIL: gcc.dg/vect/slp-48.c scan-tree-dump-times vect "vectorizing stmts using 
> SLP" 2
> FAIL: gcc.dg/vect/slp-reduc-1.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/slp-reduc-1.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorized 1 
> loops" 1
> FAIL: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorizing stmts 
> using SLP" 1
> FAIL: gcc.dg/vect/slp-reduc-2.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/slp-reduc-2.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorized 1 
> loops" 1
> FAIL: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorizing stmts 
> using SLP" 1
> FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorized 1 
> loops" 1
> FAIL: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorizing stmts 
> using SLP" 1
> FAIL: gcc.dg/vect/slp-reduc-8.c -flto -ffat-lto-objects  scan-tree-dump vect 
> "vectorized 1 loops"
> FAIL: gcc.dg/vect/slp-reduc-8.c scan-tree-dump vect "vectorized 1 loops"
> FAIL: gcc.dg/vect/vect-multi-peel-gaps.c -flto -ffat-lto-objects  
> scan-tree-dump vect "LOOP VECTORIZED"
> FAIL: gcc.dg/vect/vect-multi-peel-gaps.c scan-tree-dump vect "LOOP VECTORIZED"
> 
> The dumps show variations of
> 
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17: note: 
>   ==> examining statement: _4 = a[i_19].f2;
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17: 
> missed:   unsupported vect permute { 1 0 3 2 5 4 }
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17: 
> missed:   unsupported load permutation
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:27:17: 
> missed:   not vectorized: relevant stmt not supported: _4 = a[i_19].f2;
> 
> so I think the tests should require vect_perm.  This is what this patch does.
> 
> Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.
> 
> Ok for trunk?

OK.

Thanks,
Richard.

>   Rainer
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH] testsuite: vect: Require vect_hw_misalign in gcc.dg/vect/vect-cost-model-1.c etc. [PR98238]

2024-03-11 Thread Rainer Orth

Several gcc.dg/vect/vect-cost-model-?.c tests FAIL on 32 and 64-bit
Solaris/SPARC:

FAIL: gcc.dg/vect/vect-cost-model-1.c -flto -ffat-lto-objects  scan-tree-dump 
vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-1.c scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-3.c -flto -ffat-lto-objects  scan-tree-dump 
vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-3.c scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-5.c -flto -ffat-lto-objects  scan-tree-dump 
vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-cost-model-5.c scan-tree-dump vect "LOOP VECTORIZED"

The dumps show

/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:7:30:
 note:   ==> examining statement: _3 = *_2;
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:7:30:
 missed:   unsupported unaligned access
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c:8:6: 
missed:   not vectorized: relevant stmt not supported: _3 = *_2;

so I think the tests need to require vect_hw_misalign.  This is what
this patch does.
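
For readers unfamiliar with the failure mode, the following C sketch shows the
kind of loop involved, called with pointers that are int-aligned but not
aligned to a 16-byte vector boundary. The function body follows the hunk
context shown in the patch; the trip count VECT_N, the buffers, and the helper
names are assumptions made for illustration. A target without hardware
support for misaligned vector accesses may decline to vectorize such a loop,
which is what the "unsupported unaligned access" note in the dump reports.

```c
#include <stdint.h>

#define VECT_N 100   /* Assumed trip count; the patch context does not show it.  */

/* Same shape as the loop in the tests: nothing is known about the
   alignment of X and Y beyond that of int.  */
void
f (int *x, int *y)
{
  for (int i = 0; i < VECT_N; ++i)
    x[i] += y[i];
}

/* Return nonzero if P is a multiple of ALIGN bytes.  */
static int
is_aligned (const void *p, uintptr_t align)
{
  return ((uintptr_t) p % align) == 0;
}

/* 16-byte aligned storage; element 1 is therefore not 16-byte aligned.  */
static _Alignas (16) int xbuf[VECT_N + 1];
static _Alignas (16) int ybuf[VECT_N + 1];

/* Call f on pointers offset by one int and return the last element.  */
static int
run_on_offset_pointers (void)
{
  int *x = xbuf + 1;
  int *y = ybuf + 1;
  for (int i = 0; i < VECT_N; ++i)
    {
      x[i] = i;
      y[i] = 2 * i;
    }
  f (x, y);
  return x[VECT_N - 1];
}
```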

Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-02-22  Rainer Orth  

gcc/testsuite:
PR tree-optimization/98238
* gcc.dg/vect/vect-cost-model-1.c (scan-tree-dump): Also require
vect_hw_misalign.
* gcc.dg/vect/vect-cost-model-3.c: Likewise.
* gcc.dg/vect/vect-cost-model-5.c: Likewise.

# HG changeset patch
# Parent  7238442252360e699145461779d03baf86bc3a7a
testsuite: vect: Require vect_hw_misalign in gcc.dg/vect/vect-cost-model-1.c etc. [PR98238]

diff --git a/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c b/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c
--- a/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cost-model-1.c
@@ -8,4 +8,4 @@ f (int *x, int *y)
 x[i] += y[i];
 }
 
-/* { dg-final { scan-tree-dump {LOOP VECTORIZED} vect { target vect_int } } } */
+/* { dg-final { scan-tree-dump {LOOP VECTORIZED} vect { target { vect_int && vect_hw_misalign } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cost-model-3.c b/gcc/testsuite/gcc.dg/vect/vect-cost-model-3.c
--- a/gcc/testsuite/gcc.dg/vect/vect-cost-model-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cost-model-3.c
@@ -8,4 +8,4 @@ f (int *restrict x, int *restrict y)
 x[i] += y[i];
 }
 
-/* { dg-final { scan-tree-dump {LOOP VECTORIZED} vect { target vect_int } } } */
+/* { dg-final { scan-tree-dump {LOOP VECTORIZED} vect { target { vect_int && vect_hw_misalign } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cost-model-5.c b/gcc/testsuite/gcc.dg/vect/vect-cost-model-5.c
--- a/gcc/testsuite/gcc.dg/vect/vect-cost-model-5.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-cost-model-5.c
@@ -8,4 +8,4 @@ f (int *restrict x, int *restrict y)
 x[i] += y[i];
 }
 
-/* { dg-final { scan-tree-dump {LOOP VECTORIZED} vect { target vect_int } } } */
+/* { dg-final { scan-tree-dump {LOOP VECTORIZED} vect { target { vect_int && vect_hw_misalign } } } } */

[PATCH] testsuite: vect: Require vect_perm in several tests [PR114071, PR113557, PR96109]

2024-03-11 Thread Rainer Orth

Several vectorization tests FAIL on 32 and 64-bit Solaris/SPARC:

FAIL: gcc.dg/vect/pr37027.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorized 1 loops" 1
FAIL: gcc.dg/vect/pr37027.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/pr37027.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 1
FAIL: gcc.dg/vect/pr67790.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorizing stmts using SLP"
FAIL: gcc.dg/vect/pr67790.c scan-tree-dump vect "vectorizing stmts using SLP"
FAIL: gcc.dg/vect/slp-47.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorizing stmts using SLP" 2
FAIL: gcc.dg/vect/slp-47.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 2
FAIL: gcc.dg/vect/slp-48.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorizing stmts using SLP" 2
FAIL: gcc.dg/vect/slp-48.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 2
FAIL: gcc.dg/vect/slp-reduc-1.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-1.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-1.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-2.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-2.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-2.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-reduc-7.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-8.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorized 1 loops"
FAIL: gcc.dg/vect/slp-reduc-8.c scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/vect-multi-peel-gaps.c -flto -ffat-lto-objects  
scan-tree-dump vect "LOOP VECTORIZED"
FAIL: gcc.dg/vect/vect-multi-peel-gaps.c scan-tree-dump vect "LOOP VECTORIZED"

The dumps show variations of

/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17: note:   
==> examining statement: _4 = a[i_19].f2;
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17: missed: 
  unsupported vect permute { 1 0 3 2 5 4 }
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17: missed: 
  unsupported load permutation
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:27:17: missed: 
  not vectorized: relevant stmt not supported: _4 = a[i_19].f2;

so I think the tests should require vect_perm.  This is what this patch does.
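
To make the selector in the dump concrete: { 1 0 3 2 5 4 } swaps each adjacent
pair of lanes. The scalar C sketch below applies such a selector to an array.
It only illustrates what the permutation means, not how the vectorizer
implements it, and the function name apply_permute is hypothetical.

```c
#include <stddef.h>

/* Scalar model of a vector permute: lane I of OUT receives lane SEL[I]
   of IN.  All three arrays have N elements, and every SEL[I] must be
   less than N.  */
static void
apply_permute (const int *in, const size_t *sel, int *out, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = in[sel[i]];
}
```

A target that cannot perform this shuffle on vector registers rejects the
load permutation, so the SLP scans in these tests only make sense where
vect_perm holds.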

Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-02-22  Rainer Orth  

gcc/testsuite:
PR tree-optimization/114071
* gcc.dg/vect/pr37027.c: Require vect_perm.
* gcc.dg/vect/pr67790.c: Likewise.
* gcc.dg/vect/slp-reduc-1.c: Likewise.
* gcc.dg/vect/slp-reduc-2.c: Likewise.
* gcc.dg/vect/slp-reduc-7.c: Likewise.
* gcc.dg/vect/slp-reduc-8.c: Likewise.

PR tree-optimization/113557
* gcc.dg/vect/vect-multi-peel-gaps.c (scan-tree-dump): Also
require vect_perm.

PR testsuite/96109
* gcc.dg/vect/slp-47.c: Require vect_perm.
* gcc.dg/vect/slp-48.c: Likewise.

# HG changeset patch
# Parent  7af365ac71939e435bbc93927bcfa439568af4e5
testsuite: vect: Require vect_perm in several tests [PR??]

diff --git a/gcc/testsuite/gcc.dg/vect/pr37027.c b/gcc/testsuite/gcc.dg/vect/pr37027.c
--- a/gcc/testsuite/gcc.dg/vect/pr37027.c
+++ b/gcc/testsuite/gcc.dg/vect/pr37027.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/vect/pr67790.c b/gcc/testsuite/gcc.dg/vect/pr67790.c
--- a/gcc/testsuite/gcc.dg/vect/pr67790.c
+++ b/gcc/testsuite/gcc.dg/vect/pr67790.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_perm } */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-47.c b/gcc/testsuite/gcc.dg/vect/slp-47.c
--- a/gcc/testsuite/gcc.dg/vect/slp-47.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-47.c
@@ -1,4 +1,5 @@
 /* {

Re: [PATCH] bitint, v2: Avoid rewriting large/huge _BitInt vars into SSA after bitint lowering [PR114278]

2024-03-11 Thread Richard Biener

On Mon, 11 Mar 2024, Jakub Jelinek wrote:

> On Mon, Mar 11, 2024 at 11:31:51AM +0100, Richard Biener wrote:
> > On Mon, 11 Mar 2024, Jakub Jelinek wrote:
> > 
> > > On Sat, Mar 09, 2024 at 12:25:42PM +0100, Richard Biener wrote:
> > > > Ideally we'd clear TREE_ADDRESSABLE but set DECL_NOT_GIMPLE_REG,
> > > > I think the analysis where we check the base would be a more
> > > > appropriate place to enforce that.
> > > 
> > > So like this?
> > 
> > Hm, I was thinking of non_rewritable_lvalue_p/non_rewritable_mem_ref_base
> > though that requires duplicating, so I guess handling in maybe_optimize_var 
> > would work.
> 
> I was considering it, but it looked like a waste to me, using bitmap bits
> for something that is always the case, we don't want to rewrite any
> large/huge _BitInt to SSA form after the lowering, not just some of them.
> 
> > I do now wonder whether setting DECL_NOT_GIMPLE_REG_P in bitfield
> > lowering would prevail?
> 
> Guess I can certainly try to set DECL_NOT_GIMPLE_REG_P on the large/huge
> _BitInt PARM_DECLs/RESULT_DECLs during bitint lowering even when they are
> TREE_ADDRESSABLE at that point; the VAR_DECLs have array types of limbs and
> so shouldn't be a problem.

Hmm, looking I think we're going to clear DECL_NOT_GIMPLE_REG_P since
we still have is_gimple_reg_type.

> > (sorry for approving the earlier patch now, I was too quick and didn't
> > remember the discussion)
> 
> Sorry, already committed, I can revert or incrementally adjust.

No problem, I think both patches are OK, the 2nd maybe a bit better
for alias analysis.

Richard.

Re: [PATCH] bitint, v2: Avoid rewriting large/huge _BitInt vars into SSA after bitint lowering [PR114278]

2024-03-11 Thread Jakub Jelinek

On Mon, Mar 11, 2024 at 11:31:51AM +0100, Richard Biener wrote:
> On Mon, 11 Mar 2024, Jakub Jelinek wrote:
> 
> > On Sat, Mar 09, 2024 at 12:25:42PM +0100, Richard Biener wrote:
> > > Ideally we'd clear TREE_ADDRESSABLE but set DECL_NOT_GIMPLE_REG,
> > > I think the analysis where we check the base would be a more
> > > appropriate place to enforce that.
> > 
> > So like this?
> 
> Hm, I was thinking of non_rewritable_lvalue_p/non_rewritable_mem_ref_base
> though that requires duplicating, so I guess handling in maybe_optimize_var 
> would work.

I was considering it, but it looked like a waste to me, using bitmap bits
for something that is always the case, we don't want to rewrite any
large/huge _BitInt to SSA form after the lowering, not just some of them.

> I do now wonder whether setting DECL_NOT_GIMPLE_REG_P in bitfield
> lowering would prevail?

Guess I can certainly try to set DECL_NOT_GIMPLE_REG_P on the large/huge
_BitInt PARM_DECLs/RESULT_DECLs during bitint lowering even when they are
TREE_ADDRESSABLE at that point; the VAR_DECLs have array types of limbs and
so shouldn't be a problem.

> (sorry for approving the earlier patch now, I was too quick and didn't
> remember the discussion)

Sorry, already committed, I can revert or incrementally adjust.

Jakub

Re: [PATCH] bitint, v2: Avoid rewriting large/huge _BitInt vars into SSA after bitint lowering [PR114278]

2024-03-11 Thread Richard Biener

On Mon, 11 Mar 2024, Jakub Jelinek wrote:

> On Sat, Mar 09, 2024 at 12:25:42PM +0100, Richard Biener wrote:
> > Ideally we'd clear TREE_ADDRESSABLE but set DECL_NOT_GIMPLE_REG,
> > I think the analysis where we check the base would be a more
> > appropriate place to enforce that.
> 
> So like this?

Hm, I was thinking of non_rewritable_lvalue_p/non_rewritable_mem_ref_base
though that requires duplicating, so I guess handling in maybe_optimize_var 
would work.

I do now wonder whether setting DECL_NOT_GIMPLE_REG_P in bitfield
lowering would prevail?

(sorry for approving the earlier patch now, I was too quick and didn't
remember the discussion)

Richard.

> Bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> 2024-03-11  Jakub Jelinek  
> 
>   PR tree-optimization/114278
>   * tree-ssa.cc (maybe_optimize_var): If large/huge _BitInt vars are no
>   longer addressable, set DECL_NOT_GIMPLE_REG_P on them.
> 
>   * gcc.dg/bitint-99.c: New test.
> 
> --- gcc/tree-ssa.cc.jj2024-01-03 11:51:39.902615009 +0100
> +++ gcc/tree-ssa.cc   2024-03-09 23:34:12.469223987 +0100
> @@ -1785,6 +1785,20 @@ maybe_optimize_var (tree var, bitmap add
> fprintf (dump_file, "\n");
>   }
>   }
> +  else if (TREE_CODE (TREE_TYPE (var)) == BITINT_TYPE
> +&& (cfun->curr_properties & PROP_gimple_lbitint) != 0
> +&& TYPE_PRECISION (TREE_TYPE (var)) > MAX_FIXED_MODE_SIZE)
> + {
> +   /* Don't rewrite large/huge _BitInt vars after _BitInt lowering
> +  into SSA form.  */
> +   DECL_NOT_GIMPLE_REG_P (var) = 1;
> +   if (dump_file)
> + {
> +   fprintf (dump_file, "_BitInt var after its lowering: ");
> +   print_generic_expr (dump_file, var);
> +   fprintf (dump_file, "\n");
> + }
> + }
>else if (DECL_NOT_GIMPLE_REG_P (var))
>   {
> maybe_reg = true;
> --- gcc/testsuite/gcc.dg/bitint-99.c.jj   2024-03-08 14:26:17.658069942 
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-99.c  2024-03-08 14:25:36.292645965 +0100
> @@ -0,0 +1,26 @@
> +/* PR tree-optimization/114278 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2 -fno-tree-dce -fno-tree-dse -fno-tree-ccp" } */
> +/* { dg-additional-options "-mavx2" { target i?86-*-* x86_64-*-* } } */
> +
> +void
> +foo (void *p)
> +{
> +  _BitInt(64) b = *(_BitInt(64) *) __builtin_memmove (, p, sizeof 
> (_BitInt(64)));
> +}
> +
> +#if __BITINT_MAXWIDTH__ >= 128
> +void
> +bar (void *p)
> +{
> +  _BitInt(128) b = *(_BitInt(128) *) __builtin_memmove (, p, sizeof 
> (_BitInt(128)));
> +}
> +#endif
> +
> +#if __BITINT_MAXWIDTH__ >= 256
> +void
> +baz (void *p)
> +{
> +  _BitInt(256) b = *(_BitInt(256) *) __builtin_memmove (, p, sizeof 
> (_BitInt(256)));
> +}
> +#endif
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH v2] testsuite: xfail test for short_enums

2024-03-11 Thread Torbjörn SVENSSON

Changes compared to v1:
- Added reference to r14-6517-gb7e4a4c626e in dg-bogus comment
- Changed arm-*-* to short_enums in target selector
- Updated commit message to align with above changes


As the entire block generating the warning was removed in
r14-6517-gb7e4a4c626e, does it still make sense to add something to
trunk for the same line?
Do you want me to add the dg-bogus, but change "xfail" to "target" for
trunk?

Is this patch ok for releases/gcc-13?

--

On arm-none-eabi, the test case fails with
.../null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:63:65: warning: 
converting a packed 'enum obj_type' pointer (alignment 1) to a 'struct 
connection' pointer (alignment 4) may result in an unaligned pointer value 
[-Waddress-of-packed-member]

The error was fixed in basepoints/gcc-14-6517-gb7e4a4c626e, but that
was considered too big a change to backport, so the failing test is
marked xfail in GCC 13.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
Added dg-bogus with xfail on offending line for short_enums.

Signed-off-by: Torbjörn SVENSSON 
---
 .../null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/gcc/testsuite/gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c
 
b/gcc/testsuite/gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c
index 2a9c715c32c..e8cde7338a0 100644
--- 
a/gcc/testsuite/gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c
+++ 
b/gcc/testsuite/gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c
@@ -60,7 +60,7 @@ static inline enum obj_type obj_type(const enum obj_type *t)
 }
 static inline struct connection *__objt_conn(enum obj_type *t)
 {
- return ((struct connection *)(((void *)(t)) - ((long)&((struct connection 
*)0)->obj_type)));
+ return ((struct connection *)(((void *)(t)) - ((long)&((struct connection 
*)0)->obj_type))); /* { dg-bogus "may result in an unaligned pointer value" 
"Fixed in r14-6517-gb7e4a4c626e" { xfail short_enums } */
 }
 static inline struct connection *objt_conn(enum obj_type *t)
 {
-- 
2.25.1

Re: [PATCH] Revert "Pass GUILE down to subdirectories"

2024-03-11 Thread Andrew Burgess

Tom Tromey  writes:

>> "Andrew" == Andrew Burgess  writes:
>
> Andrew> Tom Tromey  writes:
>>> This reverts commit b7e5a29602143b53267efcd9c8d5ecc78cd5a62f.
>>> 
>>> This patch caused problems for some users when building gdb, because
>>> it would cause 'guild' to be invoked with the wrong version of guile.
>>> On the whole it seems simpler to just back this out.
>>> 
>>> * Makefile.in: Rebuild.
>>> * Makefile.tpl (BASE_EXPORTS): Remove GUILE.
>>> (GUILE): Remove.
>>> * Makefile.def (flags_to_pass): Remove GUILE.
>
> Andrew> Is it going to be possible to merge this with GCC in stage 4?  Would be
> Andrew> super useful if we could as this is still causing problems.
>
> We can always check it in to gdb now and then to gcc at some later date.
> If that sounds ok to you, I'll go ahead & do it.

Thanks, that would be great, and would certainly fix the build problems
I see.

Sorry for the late reply.

Thanks,
Andrew

Re: [Patch] OpenMP/Fortran: Fix defaultmap(none) issue with dummy procedures [PR114283]

2024-03-11 Thread Jakub Jelinek

On Mon, Mar 11, 2024 at 11:07:46AM +0100, Tobias Burnus wrote:
> Using dummy procedures in a target region with 'defaultmap(none)' leads to:
> 
>   Error: 'g' not specified in enclosing 'target'
> 
> and this cannot be fixed by using 'firstprivate' as non-pointer dummy routines
> are rejected as "Error: Object 'g' is not a variable".
> 
> Fixed by doing the same for mapping as for data sharing: using predetermined
> firstprivate.
> 
> BTW: Only since GCC 14, 'declare target indirect' makes it possible to
> simply use dummy procedures and procedure pointers in a target region.

So firstprivate clause handling remaps them then if declare target indirect
is used?
If so, the patch looks reasonable to me.

Jakub

[PATCH] middle-end/114299 - missing error recovery from gimplify failure

2024-03-11 Thread Richard Biener

When internal_get_tmp_var fails to gimplify the value the temporary
SSA name is supposed to be initialized with we can leak SSA names
with a NULL SSA_NAME_DEF_STMT into the IL.  That's bad, so recover
from this by instead returning a decl in that case.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR middle-end/114299
* gimplify.cc (internal_get_tmp_var): When gimplification
of VAL failed, return a decl.

* gcc.target/i386/pr114299.c: New testcase.
---
 gcc/gimplify.cc  |  5 +
 gcc/testsuite/gcc.target/i386/pr114299.c | 14 ++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr114299.c

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 81f06ad91bd..f6078386cdf 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -656,6 +656,11 @@ internal_get_tmp_var (tree val, gimple_seq *pre_p, 
gimple_seq *post_p,
   gimplify_and_add (mod, pre_p);
   ggc_free (mod);
 
+  /* If we failed to gimplify VAL then we can end up with the temporary
+ SSA name not having a definition.  In this case return a decl.  */
+  if (TREE_CODE (t) == SSA_NAME && ! SSA_NAME_DEF_STMT (t))
+return lookup_tmp_var (val, is_formal, not_gimple_reg);
+
   return t;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/pr114299.c 
b/gcc/testsuite/gcc.target/i386/pr114299.c
new file mode 100644
index 000..b4f30b7a95f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr114299.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-mgeneral-regs-only" } */
+
+typedef __attribute__((__vector_size__(8))) __bf16 V;
+typedef __attribute__((__vector_size__(16))) __bf16 W;
+
+V v;
+_Atomic V a;
+
+W
+foo(void) /* { dg-error "SSE" } */
+{
+  return __builtin_shufflevector(v, a, 1, 2, 5, 0, 1, 6, 6, 4); /* { dg-error 
"invalid" } */
+}
-- 
2.35.3

Re: [PATCH] s390: Deprecate some vector builtins

2024-03-11 Thread Andreas Krebbel

On 3/1/24 16:57, Stefan Schulze Frielinghaus wrote:
> According to IBM Open XL C/C++ for z/OS version 1.1 builtins
> 
> - vec_permi
> - vec_ctd
> - vec_ctsl
> - vec_ctul
> - vec_ld2f
> - vec_st2f
> 
> are deprecated.  Also deprecate helper builtins vec_ctd_s64 and
> vec_ctd_u64.
> 
> Furthermore, the overloads of vec_insert which make use of a bool vector
> are deprecated, too.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtins.def (vec_permi): Deprecate.
>   (vec_ctd): Deprecate.
>   (vec_ctd_s64): Deprecate.
>   (vec_ctd_u64): Deprecate.
>   (vec_ctsl): Deprecate.
>   (vec_ctul): Deprecate.
>   (vec_ld2f): Deprecate.
>   (vec_st2f): Deprecate.
>   (vec_insert): Deprecate overloads with bool vectors.

Ok. Thanks!

Andreas

Re: [PATCH] s390: Streamline vector builtins with LLVM

2024-03-11 Thread Andreas Krebbel

On 3/1/24 10:29, Stefan Schulze Frielinghaus wrote:
> Similar to s390_lcbb, s390_vll, s390_vstl, et al., make use of a
> signed vector type for vlbb.  Furthermore, a const void pointer seems
> more common and an integer for the mask.
> 
> For s390_vfi(s,d)b make use of integers for masks, too.
> 
> Use unsigned integers for all s390_vlbr/vstbr variants.
> 
> Make use of type UV16QI for the length operand of s390_vstrs(,z)(h,f).
> 
> Following the Principles of Operation, change from signed to unsigned
> type for s390_va(c,cc,ccc)q and s390_vs(,c,bc)biq and s390_vmslg.
> 
> Make use of scalar type UINT128 instead of UV16QI for s390_vgfm(,a)g,
> and s390_vsumq(f,g).
> 
> Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390-builtin-types.def: Update to reflect latest
>   changes.
>   * config/s390/s390-builtins.def: Streamline vector builtins with
>   LLVM.

Ok. Thanks!

Andreas

Re: [PATCH] s390: Fix test vector/long-double-to-i64.c

2024-03-11 Thread Andreas Krebbel

On 2/29/24 13:15, Stefan Schulze Frielinghaus wrote:
> Starting with r14-8319-g86de9b66480b71 fwprop improved so that vpdi is
> no longer required.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/long-double-to-i64.c: Fix scan
>   assembler directive.

Should we perhaps rather turn the scan-assembler directives into something
which checks for the absence of vpdi then? In order to get notified once
this really useful optimization breaks?

Andreas

> ---
>  .../gcc.target/s390/vector/long-double-to-i64.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c 
> b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> index 2dbbb5d1c03..ed89878e6ee 100644
> --- a/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> +++ b/gcc/testsuite/gcc.target/s390/vector/long-double-to-i64.c
> @@ -1,19 +1,24 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
>  /* { dg-do run { target { s390_z14_hw } } } */
> +/* { dg-final { check-function-bodies "**" "" "" { target { lp64 } } } } */
> +
>  #include 
>  #include 
>  
> +/*
> +** long_double_to_i64:
> +**   ld  %f0,0\(%r2\)
> +**   ld  %f2,8\(%r2\)
> +**   cgxbr   %r2,5,%f0
> +**   br  %r14
> +*/
>  __attribute__ ((noipa)) static int64_t
>  long_double_to_i64 (long double x)
>  {
>return x;
>  }
>  
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,1\n} 1 } } 
> */
> -/* { dg-final { scan-assembler-times {\n\tvpdi\t%v\d+,%v\d+,%v\d+,5\n} 1 } } 
> */
> -/* { dg-final { scan-assembler-times {\n\tcgxbr\t} 1 } } */
> -
>  int
>  main (void)
>  {

[Patch] OpenMP/Fortran: Fix defaultmap(none) issue with dummy procedures [PR114283]

2024-03-11 Thread Tobias Burnus


Using dummy procedures in a target region with 'defaultmap(none)' leads to:

  Error: 'g' not specified in enclosing 'target'

and this cannot be fixed by using 'firstprivate' as non-pointer dummy routines
are rejected as "Error: Object 'g' is not a variable".

Fixed by doing the same for mapping as for data sharing: using predetermined
firstprivate.

BTW: Only since GCC 14, 'declare target indirect' makes it possible to
simply use dummy procedures and procedure pointers in a target region.

Comments? Suggestions?

Tobias

PS: Procedure pointers aren't variables either, but they act even more like
variables as they permit changing pointer association such that '(first)private'
vs. 'shared'/'map' can both make sense. — GCC accepts those in (nearly) all 
clauses,
ifort only in (first)private while flang not at all. The spec is somewhat silent
about it. This is tracked in the same PR (PR114283) and in the specification
issue #3823.
OpenMP/Fortran: Fix defaultmap(none) issue with dummy procedures [PR114283]

Dummy procedures look similar to variables but aren't - neither in Fortran
nor in OpenMP. As the middle end sees PARM_DECLs, mark them as predetermined
firstprivate for mapping (as already done in gfc_omp_predetermined_sharing).

This does not address the issues related to procedure pointers, which are
still discussed on spec level [see PR].

	PR fortran/114283

gcc/fortran/ChangeLog:

	* trans-openmp.cc (gfc_omp_predetermined_mapping): Map dummy
	procedures as firstprivate.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/target4.f90: New test.

 gcc/fortran/trans-openmp.cc|  9 +
 gcc/testsuite/gfortran.dg/gomp/target4.f90 | 18 ++
 2 files changed, 27 insertions(+)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index a2bf15665b3..1dba47126ed 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -343,6 +343,15 @@ gfc_omp_predetermined_mapping (tree decl)
 	&& GFC_DECL_SAVED_DESCRIPTOR (decl)))
 return OMP_CLAUSE_DEFAULTMAP_TO;
 
+  /* Dummy procedures aren't considered variables by OpenMP, thus are
+ disallowed in OpenMP clauses.  They are represented as PARM_DECLs
+ in the middle-end, so return OMP_CLAUSE_DEFAULTMAP_FIRSTPRIVATE here
+ to avoid complaining about their uses with defaultmap(none).  */
+  if (TREE_CODE (decl) == PARM_DECL
+  && TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
+  && TREE_CODE (TREE_TYPE (TREE_TYPE (decl))) == FUNCTION_TYPE)
+return OMP_CLAUSE_DEFAULTMAP_FIRSTPRIVATE;
+
   /* These are either array or derived parameters, or vtables.  */
   if (VAR_P (decl) && TREE_READONLY (decl)
   && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)))
diff --git a/gcc/testsuite/gfortran.dg/gomp/target4.f90 b/gcc/testsuite/gfortran.dg/gomp/target4.f90
new file mode 100644
index 000..09364e707f1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target4.f90
@@ -0,0 +1,18 @@
+! { dg-additional-options "-fdump-tree-gimple" }
+
+! PR fortran/114283
+
+! { dg-final { scan-tree-dump "#pragma omp parallel default\\(none\\) firstprivate\\(g\\)" "gimple" } }
+! { dg-final { scan-tree-dump "#pragma omp target num_teams\\(-2\\) thread_limit\\(0\\) defaultmap\\(none\\) firstprivate\\(g\\)" "gimple" } }
+
+subroutine f(g)
+procedure() :: g
+
+!$omp parallel default(none)
+  call g
+!$omp end parallel
+
+!$omp target defaultmap(none)
+  call g
+!$omp end target
+end

Re: [PATCH] s390: Fix tests rosbg_si_srl and rxsbg_si_srl

2024-03-11 Thread Andreas Krebbel

On 2/29/24 13:14, Stefan Schulze Frielinghaus wrote:
> Starting with r14-2047-gd0e891406b16dc two SI mode tests are optimized
> into DI mode.  Thus, the scan-assembler directives fail.  For example
> RTL expression
> 
> (ior:SI (subreg:SI (lshiftrt:DI (reg:DI 69)
> (const_int 2 [0x2])) 4)
> (subreg:SI (reg:DI 68) 4))
> 
> is optimized into
> 
> (ior:DI (lshiftrt:DI (reg:DI 69)
> (const_int 2 [0x2]))
> (reg:DI 68))
> 
> Fixed by moving operands into memory in order to enforce SI mode
> computation.
> 
> Furthermore, in r9-6056-g290dfd9bc7bea2 the starting bit position of the
> scan-assembler directive for rosbg was incorrectly set to 32 which
> actually should be 32+SHIFT_AMOUNT, i.e., in this particular case 34.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/md/rXsbg_mode_sXl.c: Fix tests rosbg_si_srl
>   and rxsbg_si_srl.

Ok, thanks!

Andreas
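
The 32+SHIFT_AMOUNT arithmetic from the quoted description can be checked
with a small C sketch.  It assumes the bit numbering in which bit 0 is the
most significant bit of a 64-bit register, so that a 32-bit value occupies
bits 32..63.  The helper name first_possible_bit is made up for this
illustration and is not part of the test or of GCC.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.  Returns the position, in
   big-endian bit numbering (bit 0 = most significant bit of a 64-bit
   register), of the first bit that can still be set after a 32-bit value
   has been logically shifted right by SHIFT.  Assumes 0 <= SHIFT < 32.  */
static int
first_possible_bit (unsigned int shift)
{
  /* All bits a 32-bit value can contribute after the shift,
     zero-extended into a 64-bit register.  */
  uint64_t mask = (uint64_t) (UINT32_MAX >> shift);
  int pos = 0;

  /* Walk from the most significant bit down to the first set bit.  */
  while (pos < 64 && (mask & (UINT64_C (1) << (63 - pos))) == 0)
    pos++;
  return pos;
}
```

With a shift amount of 2 the helper yields 34, which matches the corrected
start position; with no shift it yields 32.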

Re: [PATCH] bitint: Avoid rewriting large/huge _BitInt vars into SSA after bitint lowering [PR114278]

2024-03-11 Thread Richard Biener

On Sat, 9 Mar 2024, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs, because update-address-taken subpass of
> fre5 rewrites
>   _BitInt(128) b;
>   vector(16) unsigned char _3;
> 
>[local count: 1073741824]:
>   _3 = MEM  [(char * {ref-all})p_2(D)];
>   MEM  [(char * {ref-all})] = _3;
>   b ={v} {CLOBBER(eos)};
> to
>   _BitInt(128) b;
>   vector(16) unsigned char _3;
> 
>[local count: 1073741824]:
>   _3 = MEM  [(char * {ref-all})p_2(D)];
>   b_5 = VIEW_CONVERT_EXPR<_BitInt(128)>(_3);
> but we can't have large/huge _BitInt vars in SSA form after the bitint
> lowering except for function arguments loaded from memory, as expansion
> isn't able to deal with those, it relies on bitint lowering to lower
> those operations.
> The following patch fixes that by not clearing TREE_ADDRESSABLE for
> large/huge _BitInt vars after bitint lowering, such that we don't
> rewrite them into SSA form.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2024-03-09  Jakub Jelinek  
> 
>   PR tree-optimization/114278
>   * tree-ssa.cc (maybe_optimize_var): Punt on large/huge _BitInt
>   vars after bitint lowering.
> 
>   * gcc.dg/bitint-99.c: New test.
> 
> --- gcc/tree-ssa.cc.jj2024-01-03 11:51:39.902615009 +0100
> +++ gcc/tree-ssa.cc   2024-03-08 14:24:11.844821915 +0100
> @@ -1753,7 +1753,11 @@ maybe_optimize_var (tree var, bitmap add
>/* Global Variables, result decls cannot be changed.  */
>if (is_global_var (var)
>|| TREE_CODE (var) == RESULT_DECL
> -  || bitmap_bit_p (addresses_taken, DECL_UID (var)))
> +  || bitmap_bit_p (addresses_taken, DECL_UID (var))
> +  || (TREE_CODE (TREE_TYPE (var)) == BITINT_TYPE
> +   /* Don't change large/huge _BitInt vars after _BitInt lowering.  */
> +   && (cfun->curr_properties & PROP_gimple_lbitint) != 0
> +   && TYPE_PRECISION (TREE_TYPE (var)) > MAX_FIXED_MODE_SIZE))
>  return;
>  
>bool maybe_reg = false;
> --- gcc/testsuite/gcc.dg/bitint-99.c.jj   2024-03-08 14:26:17.658069942 
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-99.c  2024-03-08 14:25:36.292645965 +0100
> @@ -0,0 +1,26 @@
> +/* PR tree-optimization/114278 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2 -fno-tree-dce -fno-tree-dse -fno-tree-ccp" } */
> +/* { dg-additional-options "-mavx2" { target i?86-*-* x86_64-*-* } } */
> +
> +void
> +foo (void *p)
> +{
> +  _BitInt(64) b = *(_BitInt(64) *) __builtin_memmove (, p, sizeof 
> (_BitInt(64)));
> +}
> +
> +#if __BITINT_MAXWIDTH__ >= 128
> +void
> +bar (void *p)
> +{
> +  _BitInt(128) b = *(_BitInt(128) *) __builtin_memmove (, p, sizeof 
> (_BitInt(128)));
> +}
> +#endif
> +
> +#if __BITINT_MAXWIDTH__ >= 256
> +void
> +baz (void *p)
> +{
> +  _BitInt(256) b = *(_BitInt(256) *) __builtin_memmove (, p, sizeof 
> (_BitInt(256)));
> +}
> +#endif
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] s390: Fix TARGET_SECONDARY_RELOAD for non-SYMBOL_REFs

2024-03-11 Thread Andreas Krebbel

On 2/29/24 13:13, Stefan Schulze Frielinghaus wrote:
> RTX X need not be a SYMBOL_REF and may e.g. be an
> UNSPEC_GOTENT for which SYMBOL_FLAG_NOTALIGN2_P fails.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_secondary_reload): Guard
>   SYMBOL_FLAG_NOTALIGN2_P.
Ok. Thanks!

Andreas

> ---
>  gcc/config/s390/s390.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 943fc9bfd72..12430d77786 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -4778,7 +4778,7 @@ s390_secondary_reload (bool in_p, rtx x, reg_class_t 
> rclass_i,
>if (in_p
> && s390_loadrelative_operand_p (x, , )
> && mode == Pmode
> -   && !SYMBOL_FLAG_NOTALIGN2_P (symref)
> +   && (!SYMBOL_REF_P (symref) || !SYMBOL_FLAG_NOTALIGN2_P (symref))
> && (offset & 1) == 1)
>   sri->icode = ((mode == DImode) ? CODE_FOR_reloaddi_larl_odd_addend_z10
> : CODE_FOR_reloadsi_larl_odd_addend_z10);
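
The guard in the quoted hunk follows a common pattern: a flag accessor that
is only meaningful for one kind of node has to be preceded by a check of the
node kind.  A minimal, self-contained sketch in C follows.  The types and
names used here (struct node, NODE_SYMBOL, FLAG_NOTALIGN2,
needs_odd_addend_reload) are hypothetical stand-ins and not GCC's RTL API.

```c
#include <stdbool.h>

/* Hypothetical node kinds standing in for SYMBOL_REF and other RTXes.  */
enum node_kind { NODE_SYMBOL, NODE_UNSPEC };

#define FLAG_NOTALIGN2 0x1u

struct node
{
  enum node_kind kind;
  unsigned int symbol_flags;  /* Only meaningful when kind == NODE_SYMBOL.  */
};

/* Mirrors the shape of the patched condition: the flag is consulted only
   for symbol nodes; any other node kind is treated as "flag not set".  */
static bool
needs_odd_addend_reload (const struct node *n, long offset)
{
  return (n->kind != NODE_SYMBOL || !(n->symbol_flags & FLAG_NOTALIGN2))
	 && (offset & 1) == 1;
}
```

Because || short-circuits, symbol_flags is never read for a non-symbol node,
which is the property the one-line fix relies on.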

Fix PR debug/113519 and debug/113777

2024-03-11 Thread Eric Botcazou

They both come from an oversight of mine in the placement of the DIE created 
for an enumeration type with reverse scalar storage order.

Tested on x86-64/Linux, both GCC and GDB, applied on mainline as obvious.


2024-03-11  Eric Botcazou  

PR debug/113519
PR debug/113777
* dwarf2out.cc (gen_enumeration_type_die): In the reverse case,
generate the DIE with the same parent as in the regular case.


2024-03-11  Eric Botcazou  

* gcc.dg/sso-20.c: New test.
* gcc.dg/sso-21.c: Likewise.

-- 
Eric Botcazou
diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 03d73f9eecd..9b1548e4ae3 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -22868,18 +22868,19 @@ gen_enumeration_type_die (tree type, dw_die_ref context_die, bool reverse)
 
   if (type_die == NULL || reverse)
 {
+  dw_die_ref scope_die = scope_die_for (type, context_die);
+
   /* The DIE with DW_AT_endianity is placed right after the naked DIE.  */
   if (reverse)
 	{
 	  gcc_assert (type_die);
 	  dw_die_ref after_die = type_die;
 	  type_die = new_die_raw (DW_TAG_enumeration_type);
-	  add_child_die_after (context_die, type_die, after_die);
+	  add_child_die_after (scope_die, type_die, after_die);
 	}
   else
 	{
-	  type_die = new_die (DW_TAG_enumeration_type,
-			  scope_die_for (type, context_die), type);
+	  type_die = new_die (DW_TAG_enumeration_type, scope_die, type);
 	  equate_type_number_to_die (type, type_die);
 	}
   add_name_attribute (type_die, type_tag (type));
/* PR debug/113519 */
/* Reported by Zdenek Sojka  */

/* { dg-do compile } */
/* { dg-options "-g -fdebug-types-section" } */

enum E { X };

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
struct __attribute__((scalar_storage_order("big-endian")))
{
  enum E e;
} S;
#else
struct __attribute__((scalar_storage_order("little-endian")))
{
  enum E e;
} S;
#endif
/* PR debug/113777 */
/* Reported by Zdenek Sojka  */

/* { dg-do compile } */
/* { dg-options "-g" } */

typedef short __attribute__((__hardbool__)) hbool;

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
struct __attribute__((scalar_storage_order("big-endian")))
{
  hbool a[2];
} S;
#else
struct __attribute__((scalar_storage_order("little-endian")))
{
  hbool a[2];
} S;
#endif

Re: [RFC] [PR tree-optimization/92539] Optimize away tests against invalid pointers

2024-03-11 Thread Richard Biener

On Mon, Mar 11, 2024 at 8:46 AM Richard Biener
 wrote:
>
> On Sun, Mar 10, 2024 at 10:09 PM Jeff Law  wrote:
> >
> >
> >
> > On 3/10/24 3:05 PM, Andrew Pinski wrote:
> > > On Sun, Mar 10, 2024 at 2:04 PM Jeff Law  wrote:
> > >>
> > >> Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false
> > >> positive triggered by loop unrolling.
> > >>
> > >> As I speculated a couple years ago, we could eliminate the comparisons
> > >> against bogus pointers.  Consider:
> > >>
> > >>> [local count: 30530247]:
> > >>>if (last_12 !=   [(void *)"aa" + 3B])
> > >>>  goto ; [54.59%]
> > >>>else
> > >>>  goto ; [45.41%]
> > >>
> > >>
> > >> That's a valid comparison as ISO allows us to generate, but not
> > >> dereference, a pointer one element past the end of the object.
> > >>
> > >> But +4B is a bogus pointer.  So given an EQ comparison against that
> > >> pointer we could always return false and for NE always return true.
> > >>
> > >> VRP and DOM seem to be the most natural choices for this kind of
> > >> optimization on the surface.  However DOM is actually not viable because
> > >> the out-of-bounds pointer warning pass is run at the end of VRP.  So
> > >> we've got to take care of this prior to the end of VRP.
> > >>
> > >>
> > >>
> > >> I haven't done a bootstrap or regression test with this.  But if it
> > >> looks reasonable I can certainly push on it further. I have confirmed it
> > >> does eliminate the tests and shuts up the bogus warning.
> > >>
> > >> The downside is this would also shut up valid warnings if user code did
> > >> this kind of test.
> > >>
> > >> Comments/Suggestions?
> > >
> > > ENOPATCH
> > Yea, realized it as I pushed the send button.  Then t-bird crashed,
> > repeatedly.
> >
> > Attached this time..
>
> There's fold-const.cc:address_compare and
> tree-ssa-alias.cc:ptrs_compare_unequal, both eventually used by match.pd,
> that could see this change; the former already special-cases STRING_CST to
> some extent.
>
> I'll note that the value we simplify such a comparison to is arbitrary.
> Doing such simplification directly (as opposed to only benefiting from its
> undefinedness indirectly) always gives me the creeps ;)
>
> IMO we should instead simplify the condition to __builtin_unreachable/trap aka
> isolate the path as unreachable.

I see for the testcase we get to see the invalid compares in forwprop while
the diagnostic happens in VRP which is way before path isolation.

IMO it's sensible to make forwprop do the path isolation, but this is probably
nothing for stage4.

Scheduling another path isolation pass before array-bound diagnostics might
also make sense (it would be nice to disentangle -Warray-bounds from VRP
itself).

Richard.

> Richard.
>
> > jeff
> >

[PATCH] bitint, v2: Avoid rewriting large/huge _BitInt vars into SSA after bitint lowering [PR114278]

2024-03-11 Thread Jakub Jelinek

On Sat, Mar 09, 2024 at 12:25:42PM +0100, Richard Biener wrote:
> Ideally we’d clear TREE_ADDRESSABLE but set DECL_NOT_GIMPLE_REG,
> I think the analysis where we check the base would be a more
> appropriate place to enforce that.

So like this?

Bootstrapped/regtested on x86_64-linux and i686-linux.

2024-03-11  Jakub Jelinek  

PR tree-optimization/114278
* tree-ssa.cc (maybe_optimize_var): If large/huge _BitInt vars are no
longer addressable, set DECL_NOT_GIMPLE_REG_P on them.

* gcc.dg/bitint-99.c: New test.

--- gcc/tree-ssa.cc.jj  2024-01-03 11:51:39.902615009 +0100
+++ gcc/tree-ssa.cc 2024-03-09 23:34:12.469223987 +0100
@@ -1785,6 +1785,20 @@ maybe_optimize_var (tree var, bitmap add
  fprintf (dump_file, "\n");
}
}
+  else if (TREE_CODE (TREE_TYPE (var)) == BITINT_TYPE
+  && (cfun->curr_properties & PROP_gimple_lbitint) != 0
+  && TYPE_PRECISION (TREE_TYPE (var)) > MAX_FIXED_MODE_SIZE)
+   {
+ /* Don't rewrite large/huge _BitInt vars after _BitInt lowering
+into SSA form.  */
+ DECL_NOT_GIMPLE_REG_P (var) = 1;
+ if (dump_file)
+   {
+ fprintf (dump_file, "_BitInt var after its lowering: ");
+ print_generic_expr (dump_file, var);
+ fprintf (dump_file, "\n");
+   }
+   }
   else if (DECL_NOT_GIMPLE_REG_P (var))
{
  maybe_reg = true;
--- gcc/testsuite/gcc.dg/bitint-99.c.jj 2024-03-08 14:26:17.658069942 +0100
+++ gcc/testsuite/gcc.dg/bitint-99.c2024-03-08 14:25:36.292645965 +0100
@@ -0,0 +1,26 @@
+/* PR tree-optimization/114278 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-O2 -fno-tree-dce -fno-tree-dse -fno-tree-ccp" } */
+/* { dg-additional-options "-mavx2" { target i?86-*-* x86_64-*-* } } */
+
+void
+foo (void *p)
+{
+  _BitInt(64) b = *(_BitInt(64) *) __builtin_memmove (, p, sizeof 
(_BitInt(64)));
+}
+
+#if __BITINT_MAXWIDTH__ >= 128
+void
+bar (void *p)
+{
+  _BitInt(128) b = *(_BitInt(128) *) __builtin_memmove (, p, sizeof 
(_BitInt(128)));
+}
+#endif
+
+#if __BITINT_MAXWIDTH__ >= 256
+void
+baz (void *p)
+{
+  _BitInt(256) b = *(_BitInt(256) *) __builtin_memmove (, p, sizeof 
(_BitInt(256)));
+}
+#endif

Jakub

Re: [RFC] [PR tree-optimization/92539] Optimize away tests against invalid pointers

2024-03-11 Thread Richard Biener

On Sun, Mar 10, 2024 at 10:09 PM Jeff Law  wrote:
>
>
>
> On 3/10/24 3:05 PM, Andrew Pinski wrote:
> > On Sun, Mar 10, 2024 at 2:04 PM Jeff Law  wrote:
> >>
> >> Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false
> >> positive triggered by loop unrolling.
> >>
> >> As I speculated a couple years ago, we could eliminate the comparisons
> >> against bogus pointers.  Consider:
> >>
> >>> [local count: 30530247]:
> >>>if (last_12 !=   [(void *)"aa" + 3B])
> >>>  goto ; [54.59%]
> >>>else
> >>>  goto ; [45.41%]
> >>
> >>
> >> That's a valid comparison as ISO allows us to generate, but not
> >> dereference, a pointer one element past the end of the object.
> >>
> >> But +4B is a bogus pointer.  So given an EQ comparison against that
> >> pointer we could always return false and for NE always return true.
> >>
> >> VRP and DOM seem to be the most natural choices for this kind of
> >> optimization on the surface.  However DOM is actually not viable because
> >> the out-of-bounds pointer warning pass is run at the end of VRP.  So
> >> we've got to take care of this prior to the end of VRP.
> >>
> >>
> >>
> >> I haven't done a bootstrap or regression test with this.  But if it
> >> looks reasonable I can certainly push on it further. I have confirmed it
> >> does eliminate the tests and shuts up the bogus warning.
> >>
> >> The downside is this would also shut up valid warnings if user code did
> >> this kind of test.
> >>
> >> Comments/Suggestions?
> >
> > ENOPATCH
> Yea, realized it as I pushed the send button.  Then t-bird crashed,
> repeatedly.
>
> Attached this time..

There's fold-const.cc:address_compare and
tree-ssa-alias.cc:ptrs_compare_unequal, both eventually used by match.pd,
that could see this change; the former already special-cases STRING_CST to
some extent.

I'll note that the value we simplify such a comparison to is arbitrary.
Doing such simplification directly (as opposed to only benefiting from its
undefinedness indirectly) always gives me the creeps ;)

IMO we should instead simplify the condition to __builtin_unreachable/trap aka
isolate the path as unreachable.

Richard.

> jeff
>
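
For readers following the thread, the one-past-the-end rule discussed above
can be sketched in plain C.  This is only an illustration written for this
archive, not code from the patch or from GCC; the helper names count_byte and
count_a_in_aa are hypothetical.  It shows that s + 3 for the 3-byte object
"aa" may be formed and compared against, while s + 4 is never computed
because forming it would already be undefined.

```c
#include <stddef.h>

/* Count the bytes in [first, last) equal to C.  LAST may be the
   one-past-the-end address: it is compared against, never dereferenced.  */
size_t
count_byte (const char *first, const char *last, char c)
{
  size_t n = 0;
  for (; first != last; ++first)
    if (*first == c)
      ++n;
  return n;
}

/* "aa" occupies 3 bytes ('a', 'a', '\0'), so s + 3 is the last address
   that may legally be formed; s + 4 is deliberately never computed.  */
size_t
count_a_in_aa (void)
{
  static const char s[] = "aa";
  const char *end = s + sizeof s;   /* s + 3: valid to form and compare */
  return count_byte (s, end, 'a');
}
```

A comparison such as first != last is meaningful only while both pointers
stay within that range, which is why a test against "aa" + 4 carries no
defined result to preserve.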

Re: [PATCH] [strub] improve handling of indirected volatile parms [PR112938]

2024-03-11 Thread Richard Biener

On Sat, Mar 9, 2024 at 10:10 AM Alexandre Oliva  wrote:
>
>
> The earlier patch for PR112938 arranged for volatile parms to be made
> indirect in internal strub wrapped bodies.
>
> The first problem that remained, more evident, was that the indirected
> parameter remained volatile, despite the indirection, but it wasn't
> regimplified, so indirecting it was malformed gimple.
>
> Regimplifying turned out not to be needed.  The best course of action
> was to drop the volatility from the by-reference parm, that was being
> unexpectedly inherited from the original volatile parm.
>
> That exposed another problem: the dereferences would then lose their
> volatile status, so we had to bring volatile back to them.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK

>
> for  gcc/ChangeLog
>
> PR middle-end/112938
> * ipa-strub.cc (pass_ipa_strub::execute): Drop volatility from
> indirected parm.
> (maybe_make_indirect): Restore volatility in dereferences.
>
> for  gcc/testsuite/ChangeLog
>
> PR middle-end/112938
> * g++.dg/strub-internal-pr112938.cc: New.
> ---
>  gcc/ipa-strub.cc|7 +++
>  gcc/testsuite/g++.dg/strub-internal-pr112938.cc |   12 
>  2 files changed, 19 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/strub-internal-pr112938.cc
>
> diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
> index dff94222351ad..8fa7bdf530023 100644
> --- a/gcc/ipa-strub.cc
> +++ b/gcc/ipa-strub.cc
> @@ -1940,6 +1940,9 @@ maybe_make_indirect (indirect_parms_t _parms, 
> tree op, int *rec)
>   TREE_TYPE (TREE_TYPE (op)),
>   op,
>   build_int_cst (TREE_TYPE (op), 0));
> + if (TYPE_VOLATILE (TREE_TYPE (TREE_TYPE (op)))
> + && !TREE_THIS_VOLATILE (ret))
> +   TREE_SIDE_EFFECTS (ret) = TREE_THIS_VOLATILE (ret) = 1;
>   return ret;
> }
>  }
> @@ -2894,6 +2897,10 @@ pass_ipa_strub::execute (function *)
>  probably drop the TREE_ADDRESSABLE and keep the TRUE.  */
>   tree ref_type = build_ref_type_for (nparm);
>
> + if (TREE_THIS_VOLATILE (nparm)
> + && TYPE_VOLATILE (TREE_TYPE (nparm))
> + && !TYPE_VOLATILE (ref_type))
> +   TREE_SIDE_EFFECTS (nparm) = TREE_THIS_VOLATILE (nparm) = 0;
>   DECL_ARG_TYPE (nparm) = TREE_TYPE (nparm) = ref_type;
>   relayout_decl (nparm);
>   TREE_ADDRESSABLE (nparm) = 0;
> diff --git a/gcc/testsuite/g++.dg/strub-internal-pr112938.cc 
> b/gcc/testsuite/g++.dg/strub-internal-pr112938.cc
> new file mode 100644
> index 0..5a74becc2697e
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/strub-internal-pr112938.cc
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fdump-tree-optimized -O2" } */
> +/* { dg-require-effective-target strub } */
> +
> +bool __attribute__ ((__strub__ ("internal")))
> +f(bool i, volatile bool j)
> +{
> +  return (i ^ j) == j;
> +}
> +
> +/* Check for two dereferences of the indirected volatile j parm.  */
> +/* { dg-final { scan-tree-dump-times {={v} \*j_[0-9][0-9]*(D)} 2 "optimized" } } */
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving "normal" is *not* inclusive
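
As a rough illustration of the property the patch above is after, here is a
hand-written C analogue.  It is not what ipa-strub generates, and the names
f_direct and f_indirect are hypothetical.  The first function has the same
body as the new testcase; the second passes j by reference, with a
non-volatile pointer parm whose pointee type is volatile, so that each
dereference is still a volatile access.

```c
#include <stdbool.h>

/* Same body as the testcase in the patch: J is a volatile by-value parm,
   so each use of J is a separate volatile read.  */
bool
f_direct (bool i, volatile bool j)
{
  return (i ^ j) == j;
}

/* Hand-written analogue of the indirected form (hypothetical name).  The
   pointer parm itself is not volatile; the pointee type is, so each *J
   remains a volatile access.  */
bool
f_indirect (bool i, volatile bool *j)
{
  return (i ^ *j) == *j;
}
```

Both functions read j twice and return the same value for every input; the
testcase's scan for two {v} dereferences checks the corresponding property in
the optimized dump.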

Re: [PATCH] middle-end/113680 - Optimize (x - y) CMP 0 as x CMP y

2024-03-11 Thread Richard Biener

On Fri, Mar 8, 2024 at 6:50 PM Ken Matsui  wrote:
>
> On Thu, Mar 7, 2024 at 10:49 PM Richard Biener
>  wrote:
> >
> > On Thu, Mar 7, 2024 at 8:29 PM Ken Matsui  wrote:
> > >
> > > On Tue, Mar 5, 2024 at 7:58 AM Richard Biener
> > >  wrote:
> > > >
> > > > On Tue, Mar 5, 2024 at 1:51 PM Ken Matsui  
> > > > wrote:
> > > > >
> > > > > On Tue, Mar 5, 2024 at 12:38 AM Richard Biener
> > > > >  wrote:
> > > > > >
> > > > > > On Mon, Mar 4, 2024 at 9:40 PM Ken Matsui  
> > > > > > wrote:
> > > > > > >
> > > > > > > > (x - y) CMP 0 is equivalent to x CMP y where x and y are signed
> > > > > > > > integers and CMP is <, <=, >, or >=.  Similarly, 0 CMP (x - y) is
> > > > > > > > equivalent to y CMP x.  As reported in PR middle-end/113680, this
> > > > > > > > equivalence does not hold for types other than signed integers.
> > > > > > > > When it comes to conditions, the former was translated to a
> > > > > > > > combination of sub and test, whereas the latter was translated to
> > > > > > > > a single cmp.  Thus, this optimization pass tries to optimize the
> > > > > > > > former to the latter.
> > > > > > > >
> > > > > > > > When `-fwrapv` is enabled, GCC treats the overflow of signed
> > > > > > > > integers as defined behavior, specifically, wrapping around
> > > > > > > > according to two's complement arithmetic.  This has implications
> > > > > > > > for optimizations that rely on the standard behavior of signed
> > > > > > > > integers, where overflow is undefined.  Consider the example given:
> > > > > > > >
> > > > > > > > long long llmax = __LONG_LONG_MAX__;
> > > > > > > > long long llmin = -llmax - 1;
> > > > > > > >
> > > > > > > > Here, `llmax - llmin` effectively becomes `llmax - (-llmax - 1)`,
> > > > > > > > which simplifies to `2 * llmax + 1`.  Given that `llmax` is the
> > > > > > > > maximum value for a `long long`, this calculation overflows in a
> > > > > > > > defined manner (wrapping around), which under `-fwrapv` is a legal
> > > > > > > > operation that produces a negative value due to two's complement
> > > > > > > > wraparound.  Therefore, `llmax - llmin < 0` is true.
> > > > > > > >
> > > > > > > > However, the direct comparison `llmax < llmin` is false since
> > > > > > > > `llmax` is the maximum possible value and `llmin` is the minimum.
> > > > > > > > Hence, optimizations that rely on the equivalence of
> > > > > > > > `(x - y) CMP 0` to `x CMP y` (and vice versa) cannot be safely
> > > > > > > > applied when `-fwrapv` is enabled.  This is why this optimization
> > > > > > > > pass is disabled under `-fwrapv`.
> > > > > > > >
> > > > > > > > This optimization pass must run before the Jump Threading pass
> > > > > > > > and the VRP pass, as it may modify conditions. For example, in the
> > > > > > > > VRP pass:
> > > > > > > >
> > > > > > > > (1)
> > > > > > > >   int diff = x - y;
> > > > > > > >   if (diff > 0)
> > > > > > > >     foo();
> > > > > > > >   if (diff < 0)
> > > > > > > >     bar();
> > > > > > > >
> > > > > > > > The second condition would be converted to diff != 0 in the VRP
> > > > > > > > pass because we know the postcondition of the first condition is
> > > > > > > > diff <= 0, and then diff != 0 is cheaper than diff < 0. If we
> > > > > > > > apply this pass after this VRP, we get:
> > > > > > > >
> > > > > > > > (2)
> > > > > > > >   int diff = x - y;
> > > > > > > >   if (x > y)
> > > > > > > >     foo();
> > > > > > > >   if (diff != 0)
> > > > > > > >     bar();
> > > > > > > >
> > > > > > > > This generates sub and test for the second condition and cmp for
> > > > > > > > the first condition. However, if we apply this pass beforehand, we
> > > > > > > > simply get:
> > > > > > > >
> > > > > > > > (3)
> > > > > > > >   int diff = x - y;
> > > > > > > >   if (x > y)
> > > > > > > >     foo();
> > > > > > > >   if (x < y)
> > > > > > > >     bar();
> > > > > > > >
> > > > > > > > In this code, diff will be eliminated as dead code, and sub and
> > > > > > > > test will not be generated, which is more efficient.
> > > > > > > >
> > > > > > > > For the Jump Threading pass, without this optimization pass, (1)
> > > > > > > > and (3) above are recognized as different, which prevents TCO.
> > > > > > >
> > > > > > > PR middle-end/113680
> > > > > >
> > > > > > This shouldn't be done as a new optimization pass.  It fits either
> > > > > > the explicit code present in the forwprop pass or a new match.pd
> > > > > > pattern.  There's possible interaction with x - y value being used
> > > > > > elsewhere and thus exposing a CSE opportunity as well as
> > > > > > a comparison against zero being possibly implemented by
> > > > > > a flag setting subtraction instruction.
> > > > > >
> > > > >
>
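
The -fwrapv counterexample quoted above can be checked with a small C sketch.
This is an illustration added for this archive, not part of the proposed
patch; the function names are hypothetical.  To avoid depending on -fwrapv
itself, the wrapped subtraction is done in unsigned arithmetic, where ISO C
defines wraparound, and the sign is read from the top bit (assuming unsigned
long long has no padding bits).

```c
#include <limits.h>
#include <stdbool.h>

/* Sign of x - y computed with two's complement wraparound, as it would be
   under -fwrapv.  The subtraction is done on unsigned operands, so this
   file does not itself need -fwrapv.  */
bool
wrapped_diff_is_negative (long long x, long long y)
{
  unsigned long long d = (unsigned long long) x - (unsigned long long) y;
  /* The top bit of the wrapped difference is its sign bit.  */
  return (d >> (sizeof d * CHAR_BIT - 1)) != 0;
}

/* The form the transformation would produce: a direct comparison.  */
bool
direct_less_than (long long x, long long y)
{
  return x < y;
}
```

For x = LLONG_MAX and y = LLONG_MIN the two functions disagree, which is the
case the quoted description uses to justify disabling the transformation
under -fwrapv; for small operands such as 1 and 2 they agree.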

[PATCH v1 1/3] LoongArch: Remove unused/useless definitions.

2024-03-11 Thread Chenghui Pan

This patch removes some unnecessary definitions of target hook
functions according to the documentation of GCC.

gcc/ChangeLog:

* config/loongarch/loongarch-protos.h 
(loongarch_cfun_has_cprestore_slot_p): Delete.
(loongarch_adjust_insn_length): Delete.
(current_section_name): Delete.
(loongarch_split_symbol_type): Delete.
* config/loongarch/loongarch.cc (loongarch_case_values_threshold): 
Delete.
(loongarch_spill_class): Delete.
(TARGET_OPTAB_SUPPORTED_P): Delete.
(TARGET_CASE_VALUES_THRESHOLD): Delete.
(TARGET_SPILL_CLASS): Delete.

Change-Id: I115b3f5e45170d67dcbd7afb5d73e5ac4108b647
---
 gcc/config/loongarch/loongarch-protos.h |  5 -
 gcc/config/loongarch/loongarch.cc   | 26 -
 2 files changed, 31 deletions(-)

diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 1fdfda9af01..871544f760c 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -93,7 +93,6 @@ extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx 
(*)(rtx, rtx, rtx));
 extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
 extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
-extern bool loongarch_cfun_has_cprestore_slot_p (void);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
 extern bool loongarch_expand_vec_cmp (rtx *);
@@ -135,7 +134,6 @@ extern int loongarch_class_max_nregs (enum reg_class, 
machine_mode);
 extern machine_mode loongarch_hard_regno_caller_save_mode (unsigned int,
   unsigned int,
   machine_mode);
-extern int loongarch_adjust_insn_length (rtx_insn *, int);
 extern const char *loongarch_output_conditional_branch (rtx_insn *, rtx *,
const char *,
const char *);
@@ -157,7 +155,6 @@ extern bool loongarch_global_symbol_noweak_p (const_rtx);
 extern bool loongarch_weak_symbol_p (const_rtx);
 extern bool loongarch_symbol_binds_local_p (const_rtx);
 
-extern const char *current_section_name (void);
 extern unsigned int current_section_flags (void);
 extern bool loongarch_use_ins_ext_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 extern bool loongarch_check_zero_div_p (void);
@@ -198,8 +195,6 @@ extern bool loongarch_epilogue_uses (unsigned int);
 extern bool loongarch_load_store_bonding_p (rtx *, machine_mode, bool);
 extern bool loongarch_split_symbol_type (enum loongarch_symbol_type);
 
-typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
-
 extern void loongarch_register_frame_header_opt (void);
 extern void loongarch_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
 extern void loongarch_expand_vec_cond_mask_expr (machine_mode, machine_mode,
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 0428b6e65d5..8f13d9ca264 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10797,23 +10797,6 @@ loongarch_expand_vec_cmp (rtx operands[])
   return true;
 }
 
-/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
-
-unsigned int
-loongarch_case_values_threshold (void)
-{
-  return default_case_values_threshold ();
-}
-
-/* Implement TARGET_SPILL_CLASS.  */
-
-static reg_class_t
-loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
-  machine_mode mode ATTRIBUTE_UNUSED)
-{
-  return NO_REGS;
-}
-
 /* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
 
 /* This function is equivalent to default_promote_function_mode_always_promote
@@ -11268,9 +11251,6 @@ loongarch_asm_code_end (void)
 #undef TARGET_FUNCTION_ARG_BOUNDARY
 #define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
 
-#undef TARGET_OPTAB_SUPPORTED_P
-#define TARGET_OPTAB_SUPPORTED_P loongarch_optab_supported_p
-
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P loongarch_vector_mode_supported_p
 
@@ -11340,18 +11320,12 @@ loongarch_asm_code_end (void)
 #undef TARGET_SCHED_REASSOCIATION_WIDTH
 #define TARGET_SCHED_REASSOCIATION_WIDTH loongarch_sched_reassociation_width
 
-#undef TARGET_CASE_VALUES_THRESHOLD
-#define TARGET_CASE_VALUES_THRESHOLD loongarch_case_values_threshold
-
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV loongarch_atomic_assign_expand_fenv
 
 #undef TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
 #define TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS true
 
-#undef TARGET_SPILL_CLASS
-#define TARGET_SPILL_CLASS loongarch_spill_class
-
 #undef TARGET_HARD_REGNO_NREGS
 #define TARGET_HARD_REGNO_NREGS loongarch_hard_regno_nregs
 #undef TARGET_HARD_REGNO_MODE_OK
-- 
2.39.3

[PATCH v1 0/3] LoongArch: Cleanup unused/redundant codes.

2024-03-11 Thread Chenghui Pan

There are some unused/useless definitions in the LoongArch target support
code; these patches perform a simple cleanup.  Regression tests passed.

Chenghui Pan (3):
  LoongArch: Remove unused/useless definitions.
  LoongArch: Change loongarch_expand_vec_cmp()'s return type from bool
to void.
  LoongArch: Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.

 gcc/config/loongarch/lasx.md|  6 ++--
 gcc/config/loongarch/loongarch-protos.h |  7 +
 gcc/config/loongarch/loongarch.cc   | 39 -
 gcc/config/loongarch/loongarch.h|  7 ++---
 gcc/config/loongarch/lsx.md |  6 ++--
 5 files changed, 13 insertions(+), 52 deletions(-)

-- 
2.39.3

[PATCH v1 3/3] LoongArch: Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.

2024-03-11 Thread Chenghui Pan

These macros have identical definitions, so we can keep the former
(UNITS_PER_FP_REG) and eliminate the latter (UNITS_PER_FPREG).

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_hard_regno_mode_ok_uncached):
Combine UNITS_PER_FP_REG and UNITS_PER_FPREG macros.
(loongarch_hard_regno_nregs): Ditto.
(loongarch_class_max_nregs): Ditto.
(loongarch_get_separate_components): Ditto.
(loongarch_process_components): Ditto.
* config/loongarch/loongarch.h (UNITS_PER_FPREG): Ditto.
(UNITS_PER_HWFPVALUE): Ditto.
(UNITS_PER_FPVALUE): Ditto.

Change-Id: Ic3b846bcc5af710a3bfd9ae1db771ae93411ea13
---
 gcc/config/loongarch/loongarch.cc | 10 +-
 gcc/config/loongarch/loongarch.h  |  7 ++-
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index e1073c9debd..519e29db1c3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6757,7 +6757,7 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int 
regno, machine_mode mode)
 and TRUNC.  There's no point allowing sizes smaller than a word,
 because the FPU has no appropriate load/store instructions.  */
   if (mclass == MODE_INT)
-   return size >= MIN_UNITS_PER_WORD && size <= UNITS_PER_FPREG;
+   return size >= MIN_UNITS_PER_WORD && size <= UNITS_PER_FP_REG;
 }
 
   return false;
@@ -6800,7 +6800,7 @@ loongarch_hard_regno_nregs (unsigned int regno, 
machine_mode mode)
   if (LASX_SUPPORTED_MODE_P (mode))
return 1;
 
-  return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+  return (GET_MODE_SIZE (mode) + UNITS_PER_FP_REG - 1) / UNITS_PER_FP_REG;
 }
 
   /* All other registers are word-sized.  */
@@ -6835,7 +6835,7 @@ loongarch_class_max_nregs (enum reg_class rclass, 
machine_mode mode)
  else if (LSX_SUPPORTED_MODE_P (mode))
size = MIN (size, UNITS_PER_LSX_REG);
  else
-   size = MIN (size, UNITS_PER_FPREG);
+   size = MIN (size, UNITS_PER_FP_REG);
}
   left &= ~reg_class_contents[FP_REGS];
 }
@@ -8209,7 +8209,7 @@ loongarch_get_separate_components (void)
if (IMM12_OPERAND (offset))
  bitmap_set_bit (components, regno);
 
-   offset -= UNITS_PER_FPREG;
+   offset -= UNITS_PER_FP_REG;
   }
 
   /* Don't mess with the hard frame pointer.  */
@@ -8288,7 +8288,7 @@ loongarch_process_components (sbitmap components, 
loongarch_save_restore_fn fn)
if (bitmap_bit_p (components, regno))
  loongarch_save_restore_reg (mode, regno, offset, fn);
 
-   offset -= UNITS_PER_FPREG;
+   offset -= UNITS_PER_FP_REG;
   }
 }
 
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index bf2351f0968..888a633961d 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -138,19 +138,16 @@ along with GCC; see the file COPYING3.  If not see
 /* Width of a LASX vector register in bits.  */
 #define BITS_PER_LASX_REG (UNITS_PER_LASX_REG * BITS_PER_UNIT)
 
-/* For LARCH, width of a floating point register.  */
-#define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
-
 /* The largest size of value that can be held in floating-point
registers and moved with a single instruction.  */
 #define UNITS_PER_HWFPVALUE \
-  (TARGET_SOFT_FLOAT ? 0 : UNITS_PER_FPREG)
+  (TARGET_SOFT_FLOAT ? 0 : UNITS_PER_FP_REG)
 
 /* The largest size of value that can be held in floating-point
registers.  */
 #define UNITS_PER_FPVALUE \
   (TARGET_SOFT_FLOAT ? 0 \
-   : TARGET_SINGLE_FLOAT ? UNITS_PER_FPREG \
+   : TARGET_SINGLE_FLOAT ? UNITS_PER_FP_REG \
 : LONG_DOUBLE_TYPE_SIZE / BITS_PER_UNIT)
 
 /* The number of bytes in a double.  */
-- 
2.39.3

[PATCH v1 2/3] LoongArch: Change loongarch_expand_vec_cmp()'s return type from bool to void.

2024-03-11 Thread Chenghui Pan

This function always returns true at the end of its implementation,
so the return value is useless.

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove checking of
loongarch_expand_vec_cmp()'s return value.
* config/loongarch/loongarch-protos.h (loongarch_expand_vec_cmp): Change
loongarch_expand_vec_cmp()'s return type from bool to void.
* config/loongarch/loongarch.cc (loongarch_expand_vec_cmp): Ditto.
* config/loongarch/lsx.md: Remove checking of
loongarch_expand_vec_cmp()'s return value.

Change-Id: I4925ec9c7355125a231f2ab8b8b00c14b72739a2
---
 gcc/config/loongarch/lasx.md| 6 ++
 gcc/config/loongarch/loongarch-protos.h | 2 +-
 gcc/config/loongarch/loongarch.cc   | 3 +--
 gcc/config/loongarch/lsx.md | 6 ++
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index ac84db7f0ce..8d4c6b4ec35 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1383,8 +1383,7 @@
   (match_operand:LASX 3 "register_operand")]))]
   "ISA_HAS_LASX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
@@ -1395,8 +1394,7 @@
   (match_operand:ILASX 3 "register_operand")]))]
   "ISA_HAS_LASX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 871544f760c..e3ed2b912a5 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -95,7 +95,7 @@ extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
-extern bool loongarch_expand_vec_cmp (rtx *);
+extern void loongarch_expand_vec_cmp (rtx *);
 extern void loongarch_expand_conditional_branch (rtx *);
 extern void loongarch_expand_conditional_move (rtx *);
 extern void loongarch_expand_conditional_trap (rtx);
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 8f13d9ca264..e1073c9debd 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10788,13 +10788,12 @@ loongarch_expand_vec_cond_mask_expr (machine_mode 
mode, machine_mode vimode,
 }
 
 /* Expand integer vector comparison */
-bool
+void
 loongarch_expand_vec_cmp (rtx operands[])
 {
 
   rtx_code code = GET_CODE (operands[1]);
   loongarch_expand_lsx_cmp (operands[0], code, operands[2], operands[3]);
-  return true;
 }
 
 /* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index b9b94b9079c..87d3e7c5d9f 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -518,8 +518,7 @@
   (match_operand:LSX 3 "register_operand")]))]
   "ISA_HAS_LSX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
@@ -530,8 +529,7 @@
   (match_operand:ILSX 3 "register_operand")]))]
   "ISA_HAS_LSX"
 {
-  bool ok = loongarch_expand_vec_cmp (operands);
-  gcc_assert (ok);
+  loongarch_expand_vec_cmp (operands);
   DONE;
 })
 
-- 
2.39.3
