Re: [PATCH v1] RISC-V: Remove -Wno-psabi for test build option [NFC]

2024-04-10 Thread juzhe.zh...@rivai.ai
LGTM



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2024-04-11 11:51
To: gcc-patches
CC: juzhe.zhong; kito.cheng; Pan Li
Subject: [PATCH v1] RISC-V: Remove -Wno-psabi for test build option [NFC]
From: Pan Li 
 
Just noticed that some test cases still have the -Wno-psabi option,
which is deprecated now.  Remove it from all riscv test cases.
 
The below test passed for this patch.
* The riscv rvv regression test.
 
gcc/testsuite/ChangeLog:
 
* g++.target/riscv/rvv/base/pr109244.C: Remove deprecated
-Wno-psabi option.
* g++.target/riscv/rvv/base/pr109535.C: Ditto.
* gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Ditto.
 
Signed-off-by: Pan Li 
---
gcc/testsuite/g++.target/riscv/rvv/base/pr109244.C  | 2 +-
gcc/testsuite/g++.target/riscv/rvv/base/pr109535.C  | 2 +-
gcc/testsuite/gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c  | 2 +-
.../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c | 2 +-
.../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.c | 2 +-
.../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-3.c | 2 +-
.../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-4.c | 2 +-
.../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-5.c | 2 +-
.../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-6.c | 2 +-

[PATCH v1] RISC-V: Remove -Wno-psabi for test build option [NFC]

2024-04-10 Thread pan2 . li
From: Pan Li 

Just noticed that some test cases still have the -Wno-psabi option,
which is deprecated now.  Remove it from all riscv test cases.

The below test passed for this patch.
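
For illustration, a test file header after this change might look like the sketch below. The option string and the function body are assumptions; the thread shows only the diffstat, not the directives themselves. DejaGnu reads the `dg-do` and `dg-options` comments, and the only point here is that `-Wno-psabi` no longer appears in the option list.

```cpp
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */

/* Hypothetical test body, for illustration only; the real tests in the
   ChangeLog exercise RVV auto-vectorization patterns.  */
void
add_arrays (int *dst, const int *a, const int *b, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = a[i] + b[i];
}
```

On a non-RISC-V host the directive comments are inert, so the function can still be compiled and exercised as ordinary code.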
* The riscv rvv regression test.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr109244.C: Remove deprecated
-Wno-psabi option.
* g++.target/riscv/rvv/base/pr109535.C: Ditto.
* gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/testsuite/g++.target/riscv/rvv/base/pr109244.C  | 2 +-
 gcc/testsuite/g++.target/riscv/rvv/base/pr109535.C  | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c  | 2 +-
 .../gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c | 2 +-
 

Re: [PATCH] c++: recalculating local specs via build_extra_args [PR114303]

2024-04-10 Thread Jason Merrill

On 4/10/24 20:00, Patrick Palka wrote:

On Wed, 10 Apr 2024, Jason Merrill wrote:


On 4/10/24 17:39, Patrick Palka wrote:

On Wed, 10 Apr 2024, Jason Merrill wrote:


On 3/12/24 10:51, Patrick Palka wrote:

On Tue, 12 Mar 2024, Patrick Palka wrote:

On Tue, 12 Mar 2024, Jason Merrill wrote:

On 3/11/24 12:53, Patrick Palka wrote:


r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated
contexts
first so that we prefer processing a local specialization in an
evaluated
context even if its first use is in an unevaluated context.  But
this
means we need to avoid walking a tree that already has extra
args/specs
saved because the list of saved specs appears to be an evaluated
context.  It seems then that we should be calculating the saved
specs
from scratch each time, rather than potentially walking the saved
specs
list from an earlier partial instantiation when calling
build_extra_args
a second time around.


Makes sense, but I wonder if we want to approach that by avoiding
walking into
*_EXTRA_ARGS in extract_locals_r?  Or do we still want to walk into
any
nested
extra args?  And if so, will we run into this same problem then?


I'm not sure totally but I'd expect a nested extra-args tree to always
have empty *_EXTRA_ARGS since the outer extra-args tree should
intercept
any substitution before the inner extra-args tree can see it?


... and so in extract_locals_r I think we can assume *_EXTRA_ARGS is
empty, and not have to explicitly avoid walking it.


It seems more robust to me to handle _EXTRA_ARGS appropriately in
build_extra_args rather than expect callers to know that they shouldn't
pass
in a tree with _EXTRA_ARGS set.  At least check and abort in that case?


Sounds good.  That IMHO seems simpler than actually avoiding walking
into *_EXTRA_ARGS from extract_locals_r because we'd have to repeat
the walking logic from cp_walk_subtree modulo the *_EXTRA_ARGS walk.

How does the following look? Bootstrapped and regtested on
x86_64-pc-linux-gnu.

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c38594cd862..6cc9b95fc06 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13310,6 +13310,19 @@ extract_locals_r (tree *tp, int *walk_subtrees,
void *data_)
   /* Remember local typedefs (85214).  */
   tp = &TYPE_NAME (*tp);
   


Please add a comment explaining why it needs to be null.

Also, how about a generic _EXTRA_ARGS accessor so places like this don't need
to check each code themselves?


Sounds good.




+  if (has_extra_args_mechanism_p (*tp))
+{
+  if (PACK_EXPANSION_P (*tp))
+   gcc_checking_assert (!PACK_EXPANSION_EXTRA_ARGS (*tp));
+  else if (TREE_CODE (*tp) == REQUIRES_EXPR)
+   gcc_checking_assert (!REQUIRES_EXPR_EXTRA_ARGS (*tp));
+  else if (TREE_CODE (*tp) == IF_STMT
+  && IF_STMT_CONSTEXPR_P (*tp))
+   gcc_checking_assert (!IF_STMT_EXTRA_ARGS (*tp));
+  else
+   gcc_unreachable ();
+}
+
 if (TREE_CODE (*tp) == DECL_EXPR)
   {
 tree decl = DECL_EXPR_DECL (*tp);
@@ -18738,7 +18751,8 @@ tsubst_stmt (tree t, tree args, tsubst_flags_t
complain, tree in_decl)
  IF_COND (stmt) = IF_COND (t);
  THEN_CLAUSE (stmt) = THEN_CLAUSE (t);
  ELSE_CLAUSE (stmt) = ELSE_CLAUSE (t);
- IF_STMT_EXTRA_ARGS (stmt) = build_extra_args (t, args, complain);
+ IF_SCOPE (stmt) = NULL_TREE;


What does IF_SCOPE have to do with this?


IF_SCOPE is the same field as IF_STMT_EXTRA_ARGS so we need to clear it
before calling build_extra_args to avoid tripping over the added assert.
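
The aliasing described here can be pictured with a small toy model. This is only a sketch: `ToyIfStmt` and the `toy_*` helpers are hypothetical stand-ins, not GCC's tree accessors. It shows why a leftover scope value would trip an "extra args must be empty" check unless the shared slot is cleared first.

```cpp
#include <cassert>

// Toy node with one operand slot that two accessors share, loosely
// modelled on how the thread describes IF_SCOPE and IF_STMT_EXTRA_ARGS.
// Everything here is hypothetical and for illustration only.
struct ToyIfStmt {
  const void *operand3 = nullptr;
};

// Both accessors return a reference to the same underlying slot.
inline const void *&toy_if_scope (ToyIfStmt &s) { return s.operand3; }
inline const void *&toy_if_stmt_extra_args (ToyIfStmt &s) { return s.operand3; }

// Stand-in for the new checking assert: true when no extra args are saved.
inline bool toy_extra_args_empty (ToyIfStmt &s)
{
  return toy_if_stmt_extra_args (s) == nullptr;
}
```

Storing a scope through `toy_if_scope` makes `toy_extra_args_empty` return false, and resetting the slot to null makes it true again, which matches the ordering concern raised in the thread.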


Let's clear it a few lines earlier, then, immediately after the 
poplevel; OK with that change.


finish_if_stmt clears it even before calling poplevel, but that doesn't 
seem necessary.


Jason



RE: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread Li, Pan2
Committed, thanks Juzhe and Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, April 11, 2024 10:50 AM
To: juzhe.zh...@rivai.ai
Cc: Li, Pan2 ; gcc-patches 
Subject: Re: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode 
switch

I was thinking we might guard it with TARGET_VECTOR and TARGET_HARD_FLOAT,
or check the ABI in riscv_function_value_regno_p; however, I think the
current implementation (no checking) is fine after checking all
use sites of `targetm.calls.function_value_regno_p`, so LGTM :)

Thanks Pan for fixing this issue!

On Thu, Apr 11, 2024 at 10:23 AM juzhe.zh...@rivai.ai
 wrote:
>
> Thanks for fixing it. LGTM from my side.
>
> I prefer to wait for Kito for another ACK.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: pan2.li
> Date: 2024-04-11 10:16
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; Pan Li
> Subject: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode 
> switch
> From: Pan Li 
>
> This patch fixes an ICE in the mode_sw pass for the example code below.
>
> during RTL pass: mode_sw
> test.c: In function ‘vbool16_t j(vuint64m4_t)’:
> test.c:15:1: internal compiler error: in create_pre_exit, at
> mode-switching.cc:451
>15 | }
>   | ^
> 0x3978f12 create_pre_exit
> __RISCV_BUILD__/../gcc/mode-switching.cc:451
> 0x3979e9e optimize_mode_switching
> __RISCV_BUILD__/../gcc/mode-switching.cc:849
> 0x397b9bc execute
> __RISCV_BUILD__/../gcc/mode-switching.cc:1324
>
> extern size_t get_vl ();
>
> vbool16_t
> test (vuint64m4_t a)
> {
>   unsigned long b;
>   return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
> }
>
> create_pre_exit expects to find a return value copy.  If there is
> none, the assert expects a plausible reason for that, but no such
> reason is available for the above sample code when the vector calling
> convention is enabled by default.  This patch overrides
> TARGET_FUNCTION_VALUE_REGNO_P to accept the vector return register, so
> that copy_num is taken from hard_regno_nregs, i.e. there is a return
> value copy.
>
> As a side effect of allowing vector registers in
> TARGET_FUNCTION_VALUE_REGNO_P, TARGET_GET_RAW_RESULT_MODE will see a
> vector mode, which is sizeless and cannot be converted to
> fixed_size_mode.  Thus override the hook TARGET_GET_RAW_RESULT_MODE and
> return VOIDmode when the raw mode of the regno is not a fixed_size_mode.
>
> The below tests passed for this patch.
> * The full riscv regression tests.
> * The reproducing test in bugzilla PR114639.
>
> PR target/114639
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_function_value_regno_p): New func
> impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
> (riscv_get_raw_result_mode): New func impl for hook
> TARGET_GET_RAW_RESULT_MODE.
> (TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
> (TARGET_GET_RAW_RESULT_MODE): Ditto.
> * config/riscv/riscv.h (V_RETURN): New macro for vector return.
> (GP_RETURN_FIRST): New macro for the first GPR in return.
> (GP_RETURN_LAST): New macro for the last GPR in return.
> (FP_RETURN_FIRST): Ditto but for FPR.
> (FP_RETURN_LAST): Ditto.
> (FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by
> TARGET_FUNCTION_VALUE_REGNO_P.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/riscv/rvv/base/pr114639-1.C: New test.
> * gcc.target/riscv/rvv/base/pr114639-1.c: New test.
>
> Signed-off-by: Pan Li 
> ---
> gcc/config/riscv/riscv.cc | 34 +++
> gcc/config/riscv/riscv.h  |  8 +++--
> .../g++.target/riscv/rvv/base/pr114639-1.C| 25 ++
> .../gcc.target/riscv/rvv/base/pr114639-1.c| 14 
> 4 files changed, 79 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/pr114639-1.C
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114639-1.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 00defa69fd8..91f017dd52a 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -10997,6 +10997,34 @@ riscv_vector_mode_supported_any_target_p 
> (machine_mode)
>return true;
> }
> +/* Implements hook TARGET_FUNCTION_VALUE_REGNO_P.  */
> +
> +static bool
> +riscv_function_value_regno_p (const unsigned regno)
> +{
> +  if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST)
> +return true;
> +
> +  if (FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST)
> +return true;
> +
> +  if (regno == V_RETURN)
> +return true;
> +
> +  return false;
> +}
> +
> +/* Implements hook TARGET_GET_RAW_RESULT_MODE.  */
> +
> +static fixed_size_mode
> +riscv_get_raw_result_mode (int regno)
> +{
> +  if (!is_a <fixed_size_mode> (reg_raw_mode[regno]))
> +return as_a <fixed_size_mode> (VOIDmode);
> +
> +  return default_get_reg_raw_mode (regno);
> +}
> +
> /* Initialize the GCC target structure.  */
> #undef TARGET_ASM_ALIGNED_HI_OP
> #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> @@ -11343,6 +11371,12 @@ riscv_vector_mode_supported_any_target_p 
> (machine_mode)
> #undef TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P
> #define 

Re: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread Kito Cheng
I was thinking we might guard it with TARGET_VECTOR and TARGET_HARD_FLOAT,
or check the ABI in riscv_function_value_regno_p; however, I think the
current implementation (no checking) is fine after checking all
use sites of `targetm.calls.function_value_regno_p`, so LGTM :)

Thanks Pan for fixing this issue!

On Thu, Apr 11, 2024 at 10:23 AM juzhe.zh...@rivai.ai
 wrote:
>
> Thanks for fixing it. LGTM from my side.
>
> I prefer to wait for Kito for another ACK.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: pan2.li
> Date: 2024-04-11 10:16
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; Pan Li
> Subject: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode 
> switch
> From: Pan Li 
>
> This patch fixes an ICE in the mode_sw pass for the example code below.
>
> during RTL pass: mode_sw
> test.c: In function ‘vbool16_t j(vuint64m4_t)’:
> test.c:15:1: internal compiler error: in create_pre_exit, at
> mode-switching.cc:451
>15 | }
>   | ^
> 0x3978f12 create_pre_exit
> __RISCV_BUILD__/../gcc/mode-switching.cc:451
> 0x3979e9e optimize_mode_switching
> __RISCV_BUILD__/../gcc/mode-switching.cc:849
> 0x397b9bc execute
> __RISCV_BUILD__/../gcc/mode-switching.cc:1324
>
> extern size_t get_vl ();
>
> vbool16_t
> test (vuint64m4_t a)
> {
>   unsigned long b;
>   return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
> }
>
> create_pre_exit expects to find a return value copy.  If there is
> none, the assert expects a plausible reason for that, but no such
> reason is available for the above sample code when the vector calling
> convention is enabled by default.  This patch overrides
> TARGET_FUNCTION_VALUE_REGNO_P to accept the vector return register, so
> that copy_num is taken from hard_regno_nregs, i.e. there is a return
> value copy.
>
> As a side effect of allowing vector registers in
> TARGET_FUNCTION_VALUE_REGNO_P, TARGET_GET_RAW_RESULT_MODE will see a
> vector mode, which is sizeless and cannot be converted to
> fixed_size_mode.  Thus override the hook TARGET_GET_RAW_RESULT_MODE and
> return VOIDmode when the raw mode of the regno is not a fixed_size_mode.
>
> The below tests passed for this patch.
> * The full riscv regression tests.
> * The reproducing test in bugzilla PR114639.
>
> PR target/114639
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_function_value_regno_p): New func
> impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
> (riscv_get_raw_result_mode): New func impl for hook
> TARGET_GET_RAW_RESULT_MODE.
> (TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
> (TARGET_GET_RAW_RESULT_MODE): Ditto.
> * config/riscv/riscv.h (V_RETURN): New macro for vector return.
> (GP_RETURN_FIRST): New macro for the first GPR in return.
> (GP_RETURN_LAST): New macro for the last GPR in return.
> (FP_RETURN_FIRST): Ditto but for FPR.
> (FP_RETURN_LAST): Ditto.
> (FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by
> TARGET_FUNCTION_VALUE_REGNO_P.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/riscv/rvv/base/pr114639-1.C: New test.
> * gcc.target/riscv/rvv/base/pr114639-1.c: New test.
>
> Signed-off-by: Pan Li 
> ---
> gcc/config/riscv/riscv.cc | 34 +++
> gcc/config/riscv/riscv.h  |  8 +++--
> .../g++.target/riscv/rvv/base/pr114639-1.C| 25 ++
> .../gcc.target/riscv/rvv/base/pr114639-1.c| 14 
> 4 files changed, 79 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/pr114639-1.C
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114639-1.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 00defa69fd8..91f017dd52a 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -10997,6 +10997,34 @@ riscv_vector_mode_supported_any_target_p 
> (machine_mode)
>return true;
> }
> +/* Implements hook TARGET_FUNCTION_VALUE_REGNO_P.  */
> +
> +static bool
> +riscv_function_value_regno_p (const unsigned regno)
> +{
> +  if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST)
> +return true;
> +
> +  if (FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST)
> +return true;
> +
> +  if (regno == V_RETURN)
> +return true;
> +
> +  return false;
> +}
> +
> +/* Implements hook TARGET_GET_RAW_RESULT_MODE.  */
> +
> +static fixed_size_mode
> +riscv_get_raw_result_mode (int regno)
> +{
> +  if (!is_a <fixed_size_mode> (reg_raw_mode[regno]))
> +return as_a <fixed_size_mode> (VOIDmode);
> +
> +  return default_get_reg_raw_mode (regno);
> +}
> +
> /* Initialize the GCC target structure.  */
> #undef TARGET_ASM_ALIGNED_HI_OP
> #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> @@ -11343,6 +11371,12 @@ riscv_vector_mode_supported_any_target_p 
> (machine_mode)
> #undef TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P
> #define TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P 
> riscv_vector_mode_supported_any_target_p
> +#undef TARGET_FUNCTION_VALUE_REGNO_P
> +#define TARGET_FUNCTION_VALUE_REGNO_P riscv_function_value_regno_p
> +
> +#undef TARGET_GET_RAW_RESULT_MODE
> +#define 

Re: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread juzhe.zh...@rivai.ai
Thanks for fixing it. LGTM from my side.

I prefer to wait for Kito for another ACK.



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2024-04-11 10:16
To: gcc-patches
CC: juzhe.zhong; kito.cheng; Pan Li
Subject: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch
From: Pan Li 
 
This patch fixes an ICE in the mode_sw pass for the example code below.
 
during RTL pass: mode_sw
test.c: In function ‘vbool16_t j(vuint64m4_t)’:
test.c:15:1: internal compiler error: in create_pre_exit, at
mode-switching.cc:451
   15 | }
  | ^
0x3978f12 create_pre_exit
__RISCV_BUILD__/../gcc/mode-switching.cc:451
0x3979e9e optimize_mode_switching
__RISCV_BUILD__/../gcc/mode-switching.cc:849
0x397b9bc execute
__RISCV_BUILD__/../gcc/mode-switching.cc:1324
 
extern size_t get_vl ();
 
vbool16_t
test (vuint64m4_t a)
{
  unsigned long b;
  return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
}
 
create_pre_exit expects to find a return value copy.  If there is
none, the assert expects a plausible reason for that, but no such
reason is available for the above sample code when the vector calling
convention is enabled by default.  This patch overrides
TARGET_FUNCTION_VALUE_REGNO_P to accept the vector return register, so
that copy_num is taken from hard_regno_nregs, i.e. there is a return
value copy.
 
As a side effect of allowing vector registers in
TARGET_FUNCTION_VALUE_REGNO_P, TARGET_GET_RAW_RESULT_MODE will see a
vector mode, which is sizeless and cannot be converted to
fixed_size_mode.  Thus override the hook TARGET_GET_RAW_RESULT_MODE and
return VOIDmode when the raw mode of the regno is not a fixed_size_mode.
 
The below tests passed for this patch.
* The full riscv regression tests.
* The reproducing test in bugzilla PR114639.
 
PR target/114639
 
gcc/ChangeLog:
 
* config/riscv/riscv.cc (riscv_function_value_regno_p): New func
impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
(riscv_get_raw_result_mode): New func impl for hook
TARGET_GET_RAW_RESULT_MODE.
(TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
(TARGET_GET_RAW_RESULT_MODE): Ditto.
* config/riscv/riscv.h (V_RETURN): New macro for vector return.
(GP_RETURN_FIRST): New macro for the first GPR in return.
(GP_RETURN_LAST): New macro for the last GPR in return.
(FP_RETURN_FIRST): Ditto but for FPR.
(FP_RETURN_LAST): Ditto.
(FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by
TARGET_FUNCTION_VALUE_REGNO_P.
 
gcc/testsuite/ChangeLog:
 
* g++.target/riscv/rvv/base/pr114639-1.C: New test.
* gcc.target/riscv/rvv/base/pr114639-1.c: New test.
 
Signed-off-by: Pan Li 
---
gcc/config/riscv/riscv.cc | 34 +++
gcc/config/riscv/riscv.h  |  8 +++--
.../g++.target/riscv/rvv/base/pr114639-1.C| 25 ++
.../gcc.target/riscv/rvv/base/pr114639-1.c| 14 
4 files changed, 79 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/pr114639-1.C
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114639-1.c
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 00defa69fd8..91f017dd52a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -10997,6 +10997,34 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
   return true;
}
+/* Implements hook TARGET_FUNCTION_VALUE_REGNO_P.  */
+
+static bool
+riscv_function_value_regno_p (const unsigned regno)
+{
+  if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST)
+return true;
+
+  if (FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST)
+return true;
+
+  if (regno == V_RETURN)
+return true;
+
+  return false;
+}
+
+/* Implements hook TARGET_GET_RAW_RESULT_MODE.  */
+
+static fixed_size_mode
+riscv_get_raw_result_mode (int regno)
+{
+  if (!is_a <fixed_size_mode> (reg_raw_mode[regno]))
+return as_a <fixed_size_mode> (VOIDmode);
+
+  return default_get_reg_raw_mode (regno);
+}
+
/* Initialize the GCC target structure.  */
#undef TARGET_ASM_ALIGNED_HI_OP
#define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11343,6 +11371,12 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
#undef TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P
#define TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P 
riscv_vector_mode_supported_any_target_p
+#undef TARGET_FUNCTION_VALUE_REGNO_P
+#define TARGET_FUNCTION_VALUE_REGNO_P riscv_function_value_regno_p
+
+#undef TARGET_GET_RAW_RESULT_MODE
+#define TARGET_GET_RAW_RESULT_MODE riscv_get_raw_result_mode
+
struct gcc_target targetm = TARGET_INITIALIZER;
#include "gt-riscv.h"
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 269b8c1f076..7797e67317a 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -683,6 +683,12 @@ enum reg_class
#define GP_RETURN GP_ARG_FIRST
#define FP_RETURN (UNITS_PER_FP_ARG == 0 ? GP_RETURN : FP_ARG_FIRST)
+#define V_RETURN  V_REG_FIRST
+
+#define GP_RETURN_FIRST GP_ARG_FIRST
+#define GP_RETURN_LAST  GP_ARG_FIRST + 1
+#define FP_RETURN_FIRST FP_RETURN
+#define FP_RETURN_LAST  FP_RETURN + 1
#define MAX_ARGS_IN_REGISTERS \
   (riscv_abi == ABI_ILP32E || 

[PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread pan2 . li
From: Pan Li 

This patch fixes an ICE in the mode_sw pass for the example code below.

during RTL pass: mode_sw
test.c: In function ‘vbool16_t j(vuint64m4_t)’:
test.c:15:1: internal compiler error: in create_pre_exit, at
mode-switching.cc:451
   15 | }
  | ^
0x3978f12 create_pre_exit
__RISCV_BUILD__/../gcc/mode-switching.cc:451
0x3979e9e optimize_mode_switching
__RISCV_BUILD__/../gcc/mode-switching.cc:849
0x397b9bc execute
__RISCV_BUILD__/../gcc/mode-switching.cc:1324

extern size_t get_vl ();

vbool16_t
test (vuint64m4_t a)
{
  unsigned long b;
  return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
}

create_pre_exit expects to find a return value copy.  If there is
none, the assert expects a plausible reason for that, but no such
reason is available for the above sample code when the vector calling
convention is enabled by default.  This patch overrides
TARGET_FUNCTION_VALUE_REGNO_P to accept the vector return register, so
that copy_num is taken from hard_regno_nregs, i.e. there is a return
value copy.
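
The shape of the new predicate can be sketched in isolation as below. The register numbers are assumptions chosen for illustration (the real values come from riscv.h and are not shown in full in this message), and `toy_function_value_regno_p` is a hypothetical name, not the hook itself.

```cpp
#include <cassert>

// Hypothetical register numbers, for illustration only.
constexpr unsigned TOY_GP_RETURN_FIRST = 10;                       // assumed a0
constexpr unsigned TOY_GP_RETURN_LAST  = TOY_GP_RETURN_FIRST + 1;  // assumed a1
constexpr unsigned TOY_FP_RETURN_FIRST = 42;                       // assumed fa0
constexpr unsigned TOY_FP_RETURN_LAST  = TOY_FP_RETURN_FIRST + 1;  // assumed fa1
constexpr unsigned TOY_V_RETURN        = 96;                       // assumed v0

// Same structure as riscv_function_value_regno_p in the patch: two
// general return registers, two floating-point return registers, and a
// single vector return register are accepted.
inline bool toy_function_value_regno_p (unsigned regno)
{
  if (TOY_GP_RETURN_FIRST <= regno && regno <= TOY_GP_RETURN_LAST)
    return true;

  if (TOY_FP_RETURN_FIRST <= regno && regno <= TOY_FP_RETURN_LAST)
    return true;

  return regno == TOY_V_RETURN;
}
```

With these assumed numbers, only 10, 11, 42, 43 and 96 are accepted as return registers.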

As a side effect of allowing vector registers in
TARGET_FUNCTION_VALUE_REGNO_P, TARGET_GET_RAW_RESULT_MODE will see a
vector mode, which is sizeless and cannot be converted to
fixed_size_mode.  Thus override the hook TARGET_GET_RAW_RESULT_MODE and
return VOIDmode when the raw mode of the regno is not a fixed_size_mode.
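
The fallback can be modelled with a toy enum, as a sketch only. GCC's real machine_mode, fixed_size_mode, is_a and as_a are much richer; `ToyMode`, the register numbers and every `toy_*` helper below are hypothetical.

```cpp
#include <cassert>

// Toy stand-in for machine modes; ScalableVector plays the role of a
// sizeless RVV mode.
enum class ToyMode { Void, DI, DF, ScalableVector };

// A scalable vector mode has no compile-time-constant size.
inline bool toy_is_fixed_size (ToyMode m)
{
  return m != ToyMode::ScalableVector;
}

// Hypothetical raw mode table indexed by register number.
inline ToyMode toy_reg_raw_mode (int regno)
{
  if (regno == 96)                  // assumed vector return register
    return ToyMode::ScalableVector;
  if (regno == 42 || regno == 43)   // assumed FP return registers
    return ToyMode::DF;
  return ToyMode::DI;
}

// Same shape as riscv_get_raw_result_mode: report the void mode when the
// raw mode cannot be represented as a fixed-size mode.
inline ToyMode toy_get_raw_result_mode (int regno)
{
  ToyMode m = toy_reg_raw_mode (regno);
  if (!toy_is_fixed_size (m))
    return ToyMode::Void;
  return m;
}
```

The design point is that callers expecting a fixed-size mode never see the sizeless one; they get the void mode for the vector register and the ordinary mode otherwise.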

The below tests passed for this patch.
* The full riscv regression tests.
* The reproducing test in bugzilla PR114639.

PR target/114639

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_function_value_regno_p): New func
impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
(riscv_get_raw_result_mode): New func impl for hook
TARGET_GET_RAW_RESULT_MODE.
(TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
(TARGET_GET_RAW_RESULT_MODE): Ditto.
* config/riscv/riscv.h (V_RETURN): New macro for vector return.
(GP_RETURN_FIRST): New macro for the first GPR in return.
(GP_RETURN_LAST): New macro for the last GPR in return.
(FP_RETURN_FIRST): Ditto but for FPR.
(FP_RETURN_LAST): Ditto.
(FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by
TARGET_FUNCTION_VALUE_REGNO_P.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr114639-1.C: New test.
* gcc.target/riscv/rvv/base/pr114639-1.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 34 +++
 gcc/config/riscv/riscv.h  |  8 +++--
 .../g++.target/riscv/rvv/base/pr114639-1.C| 25 ++
 .../gcc.target/riscv/rvv/base/pr114639-1.c| 14 
 4 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/pr114639-1.C
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114639-1.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 00defa69fd8..91f017dd52a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -10997,6 +10997,34 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
   return true;
 }
 
+/* Implements hook TARGET_FUNCTION_VALUE_REGNO_P.  */
+
+static bool
+riscv_function_value_regno_p (const unsigned regno)
+{
+  if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST)
+return true;
+
+  if (FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST)
+return true;
+
+  if (regno == V_RETURN)
+return true;
+
+  return false;
+}
+
+/* Implements hook TARGET_GET_RAW_RESULT_MODE.  */
+
+static fixed_size_mode
+riscv_get_raw_result_mode (int regno)
+{
+  if (!is_a <fixed_size_mode> (reg_raw_mode[regno]))
+return as_a <fixed_size_mode> (VOIDmode);
+
+  return default_get_reg_raw_mode (regno);
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11343,6 +11371,12 @@ riscv_vector_mode_supported_any_target_p (machine_mode)
 #undef TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P
 #define TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P 
riscv_vector_mode_supported_any_target_p
 
+#undef TARGET_FUNCTION_VALUE_REGNO_P
+#define TARGET_FUNCTION_VALUE_REGNO_P riscv_function_value_regno_p
+
+#undef TARGET_GET_RAW_RESULT_MODE
+#define TARGET_GET_RAW_RESULT_MODE riscv_get_raw_result_mode
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-riscv.h"
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 269b8c1f076..7797e67317a 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -683,6 +683,12 @@ enum reg_class
 
 #define GP_RETURN GP_ARG_FIRST
 #define FP_RETURN (UNITS_PER_FP_ARG == 0 ? GP_RETURN : FP_ARG_FIRST)
+#define V_RETURN  V_REG_FIRST
+
+#define GP_RETURN_FIRST GP_ARG_FIRST
+#define GP_RETURN_LAST  GP_ARG_FIRST + 1
+#define FP_RETURN_FIRST FP_RETURN
+#define FP_RETURN_LAST  FP_RETURN + 1
 
 #define MAX_ARGS_IN_REGISTERS \
   (riscv_abi == ABI_ILP32E || riscv_abi == ABI_LP64E \
@@ -714,8 +720,6 @@ enum reg_class
 #define FUNCTION_VALUE(VALTYPE, FUNC) \
   riscv_function_value 

Re: [PATCH] c++: recalculating local specs via build_extra_args [PR114303]

2024-04-10 Thread Patrick Palka
On Wed, 10 Apr 2024, Jason Merrill wrote:

> On 4/10/24 17:39, Patrick Palka wrote:
> > On Wed, 10 Apr 2024, Jason Merrill wrote:
> > 
> > > On 3/12/24 10:51, Patrick Palka wrote:
> > > > On Tue, 12 Mar 2024, Patrick Palka wrote:
> > > > > On Tue, 12 Mar 2024, Jason Merrill wrote:
> > > > > > On 3/11/24 12:53, Patrick Palka wrote:
> > > > > > > 
> > > > > > > r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated
> > > > > > > contexts
> > > > > > > first so that we prefer processing a local specialization in an
> > > > > > > evaluated
> > > > > > > context even if its first use is in an unevaluated context.  But
> > > > > > > this
> > > > > > > means we need to avoid walking a tree that already has extra
> > > > > > > args/specs
> > > > > > > saved because the list of saved specs appears to be an evaluated
> > > > > > > context.  It seems then that we should be calculating the saved
> > > > > > > specs
> > > > > > > from scratch each time, rather than potentially walking the saved
> > > > > > > specs
> > > > > > > list from an earlier partial instantiation when calling
> > > > > > > build_extra_args
> > > > > > > a second time around.
> > > > > > 
> > > > > > Makes sense, but I wonder if we want to approach that by avoiding
> > > > > > walking into
> > > > > > *_EXTRA_ARGS in extract_locals_r?  Or do we still want to walk into
> > > > > > any
> > > > > > nested
> > > > > > extra args?  And if so, will we run into this same problem then?
> > > > > 
> > > > > I'm not sure totally but I'd expect a nested extra-args tree to always
> > > > > have empty *_EXTRA_ARGS since the outer extra-args tree should
> > > > > intercept
> > > > > any substitution before the inner extra-args tree can see it?
> > > > 
> > > > ... and so in extract_locals_r I think we can assume *_EXTRA_ARGS is
> > > > empty, and not have to explicitly avoid walking it.
> > > 
> > > It seems more robust to me to handle _EXTRA_ARGS appropriately in
> > > build_extra_args rather than expect callers to know that they shouldn't
> > > pass
> > > in a tree with _EXTRA_ARGS set.  At least check and abort in that case?
> > 
> > Sounds good.  That IMHO seems simpler than actually avoiding walking
> > into *_EXTRA_ARGS from extract_locals_r because we'd have to repeat
> > the walking logic from cp_walk_subtree modulo the *_EXTRA_ARGS walk.
> > 
> > How does the following look? Bootstrapped and regtested on
> > x86_64-pc-linux-gnu.
> > 
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index c38594cd862..6cc9b95fc06 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -13310,6 +13310,19 @@ extract_locals_r (tree *tp, int *walk_subtrees,
> > void *data_)
> >   /* Remember local typedefs (85214).  */
> >   tp = &TYPE_NAME (*tp);
> >   
> 
> Please add a comment explaining why it needs to be null.
> 
> Also, how about a generic _EXTRA_ARGS accessor so places like this don't need
> to check each code themselves?

Sounds good.

> 
> > +  if (has_extra_args_mechanism_p (*tp))
> > +{
> > +  if (PACK_EXPANSION_P (*tp))
> > +   gcc_checking_assert (!PACK_EXPANSION_EXTRA_ARGS (*tp));
> > +  else if (TREE_CODE (*tp) == REQUIRES_EXPR)
> > +   gcc_checking_assert (!REQUIRES_EXPR_EXTRA_ARGS (*tp));
> > +  else if (TREE_CODE (*tp) == IF_STMT
> > +  && IF_STMT_CONSTEXPR_P (*tp))
> > +   gcc_checking_assert (!IF_STMT_EXTRA_ARGS (*tp));
> > +  else
> > +   gcc_unreachable ();
> > +}
> > +
> > if (TREE_CODE (*tp) == DECL_EXPR)
> >   {
> > tree decl = DECL_EXPR_DECL (*tp);
> > @@ -18738,7 +18751,8 @@ tsubst_stmt (tree t, tree args, tsubst_flags_t
> > complain, tree in_decl)
> >   IF_COND (stmt) = IF_COND (t);
> >   THEN_CLAUSE (stmt) = THEN_CLAUSE (t);
> >   ELSE_CLAUSE (stmt) = ELSE_CLAUSE (t);
> > - IF_STMT_EXTRA_ARGS (stmt) = build_extra_args (t, args, complain);
> > + IF_SCOPE (stmt) = NULL_TREE;
> 
> What does IF_SCOPE have to do with this?

IF_SCOPE is the same field as IF_STMT_EXTRA_ARGS so we need to clear it
before calling build_extra_args to avoid tripping over the added assert.
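
To make the aliasing point concrete, here is a minimal sketch, not the GCC code: Node, if_scope, extra_args and build_extra_args_sketch are hypothetical stand-ins. It assumes only what is stated above, namely that two accessors name the same storage and that the builder refuses to run on a node whose extra-args slot is already populated.

```cpp
#include <cassert>

// Hypothetical stand-in for an IF_STMT: one slot shared by two accessors.
struct Node { const void *slot = nullptr; };

// Two names for the same storage, like IF_SCOPE and IF_STMT_EXTRA_ARGS.
inline const void *&if_scope (Node &n) { return n.slot; }
inline const void *&extra_args (Node &n) { return n.slot; }

// Marker object standing in for a freshly built extra-args list.
static const int saved_marker = 0;

// Hypothetical stand-in for build_extra_args: it reports failure where the
// real code would trip the checking assert on a non-empty slot.
inline bool build_extra_args_sketch (Node &n)
{
  if (extra_args (n) != nullptr)
    return false;
  extra_args (n) = &saved_marker;
  return true;
}

// Clear the aliased field first, then build, as the reply above describes.
inline bool defer_if_stmt_sketch (Node &stmt)
{
  if_scope (stmt) = nullptr;
  return build_extra_args_sketch (stmt);
}
```

Calling build_extra_args_sketch on a node whose if_scope is still set fails, while defer_if_stmt_sketch succeeds because it clears the shared slot first.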

How does the following look?

-- >8 --

Subject: [PATCH] c++: build_extra_args recapturing local specs [PR114303]

PR c++/114303

gcc/cp/ChangeLog:

* constraint.cc (tsubst_requires_expr): Clear
REQUIRES_EXPR_EXTRA_ARGS before calling build_extra_args.
* pt.cc (tree_extra_args): Define.
(extract_locals_r): Assert *_EXTRA_ARGS is empty.
(tsubst_stmt) <case IF_STMT>: Clear IF_SCOPE on the new
IF_STMT.  Call build_extra_args on the new IF_STMT instead
of t which might already have IF_STMT_EXTRA_ARGS.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda6.C: New test.
---
 gcc/cp/constraint.cc  |  1 +
 gcc/cp/pt.cc  | 31 ++-
 .../g++.dg/cpp1z/constexpr-if-lambda6.C   | 16 ++
 3 files changed, 47 insertions(+), 1 deletion(-)
 create mode 100644 

[REVERTED] testsuite/gcc.target/cris/pr93372-2.c: Handle xpass from combine improvement

2024-04-10 Thread Hans-Peter Nilsson
> Date: Tue, 9 Apr 2024 15:18:10 -0500
> From: Segher Boessenkool 

> All (target-specific) new testsuite failures are just like that: bad
> testcases!

With a touch of bad assumptions by port-specific code, no
doubt.  Maybe also rtx costs including my pet peeve, the
default implementation of insn_costs (the one that doesn't
look at the destination of setters and which when you try
fixing it, pulls you down a rabbit-hole of cost-related
regressions that even Bernd S. backed away from).

> So no, no reversion.

(...)

> > That's the only test that's improved to the point of
> > affecting test-patterns.  E.g. pr93372-5.c (which references
> > pr93372-2.c) is also improved, though it retains a redundant
> > compare insn.  (PR 93372 was about regressions from the cc0
> > representation; not further improvement like here, thus it's
> > not tagged.  Though, I did not double-check whether this
> > actually *was* a regression from cc0.)
> 
> Interesting that this improved tests for you.  Huh.  Do you have an
> explanation how this happened?

Just a hunch: less combine churn (more straightforward code)
made cmpelim's job easier, same thing you wrote in other
words:

>  I suspect that as usual it is just a
> side effect of random factors: combine is opportunistic, always does the
> first change it thinks good, not considering what this then does for
> other possible combinations; it is greedy.  It would be nice to see
> written out what happens in this example though :-)

Yes it would, but I have other things on my plate.  Besides,
it's your patch, can't rob you of the fun.

I committed the revert below, but hope to re-apply
(re-revert) it in stage 1, when as per Richard B's message
the combine improvement will reappear.

brgds, H-P

-- >8 --
From: Hans-Peter Nilsson 
Date: Wed, 10 Apr 2024 17:24:10 +0200
Subject: [PATCH] Revert "testsuite/gcc.target/cris/pr93372-2.c: Handle xpass
 from combine improvement"

This reverts commit 4c8b3600c4856f7915281ae3ff4d97271c83a540.
---
 gcc/testsuite/gcc.target/cris/pr93372-2.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/cris/pr93372-2.c 
b/gcc/testsuite/gcc.target/cris/pr93372-2.c
index 2ef6471a990b..912069c018d5 100644
--- a/gcc/testsuite/gcc.target/cris/pr93372-2.c
+++ b/gcc/testsuite/gcc.target/cris/pr93372-2.c
@@ -1,20 +1,19 @@
 /* Check that eliminable compare-instructions are eliminated. */
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
-/* { dg-final { scan-assembler-not "\tcmp|\ttest" } } */
-/* { dg-final { scan-assembler-not "\tnot" } } */
-/* { dg-final { scan-assembler-not "\tlsr" } } */
-/* We should get just one move, storing the result into *d.  */
-/* { dg-final { scan-assembler-times "\tmove" 1 } } */
+/* { dg-final { scan-assembler-not "\tcmp|\ttest" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "\tnot" { xfail cc0 } } } */
+/* { dg-final { scan-assembler-not "\tlsr" { xfail cc0 } } } */
 
 int f(int a, int b, int *d)
 {
   int c = a - b;
 
-  /* We used to get a cmp.d with the original operands here. */
+  /* Whoops!  We get a cmp.d with the original operands here. */
   *d = (c == 0);
 
-  /* We used to get a suboptimal sequence, but now we get the optimal "sge"
- (a.k.a "spl") re-using flags from the subtraction. */
+  /* Whoops!  While we don't get a test.d for the result here for cc0,
+ we get a sequence of insns: a move, a "not" and a shift of the
+ subtraction-result, where a simple "spl" would have done. */
   return c >= 0;
 }
-- 
2.30.2



brgds, H-P


Re: [PATCH] c++/modules: local class merging [PR99426]

2024-04-10 Thread Jason Merrill

On 4/10/24 14:48, Patrick Palka wrote:

On Tue, 9 Apr 2024, Jason Merrill wrote:


On 3/5/24 10:31, Patrick Palka wrote:

On Tue, 27 Feb 2024, Patrick Palka wrote:

Subject: [PATCH] c++/modules: local type merging [PR99426]

One known missing piece in the modules implementation is merging of a
streamed-in local type (class or enum) with the corresponding in-TU
version of the local type.  This missing piece turns out to cause a
hard-to-reduce use-after-free GC issue due to the entity_ary not being
marked as a GC root (deliberately), and manifests as a serialization
error on stream-in as in PR99426 (see comment #6 for a reduction).  It's
also reproducible on trunk when running the xtreme-header tests without
-fno-module-lazy.

This patch makes us merge such local types according to their position
within the containing function's definition, analogous to how we merge
FIELD_DECLs of a class according to their index in the TYPE_FIELDS
list.

PR c++/99426

gcc/cp/ChangeLog:

* module.cc (merge_kind::MK_local_type): New enumerator.
(merge_kind_name): Update.
(trees_out::chained_decls): Move BLOCK-specific handling
of DECL_LOCAL_DECL_P decls to ...
(trees_out::core_vals) <case BLOCK>: ... here.  Stream
BLOCK_VARS manually.
(trees_in::core_vals) <case BLOCK>: Stream BLOCK_VARS
manually.  Handle deduplicated local types.
(trees_out::key_local_type): Define.
(trees_in::key_local_type): Define.
(trees_out::get_merge_kind) : Return
MK_local_type for a local type.
(trees_out::key_mergeable) : Use
key_local_type.
(trees_in::key_mergeable) : Likewise.
(trees_in::is_matching_decl): Be flexible with type mismatches
for local entities.

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 80b63a70a62..d9e34e9a4b9 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -6714,7 +6720,37 @@ trees_in::core_vals (tree t)
   case BLOCK:
 t->block.locus = state->read_location (*this);
 t->block.end_locus = state->read_location (*this);
-  t->block.vars = chained_decls ();
+
+  for (tree *chain = &t->block.vars;;)
+   if (tree decl = tree_node ())
+ {
+   /* For a deduplicated local type or enumerator, chain the
+  duplicate decl instead of the canonical in-TU decl.  Seeing
+  a duplicate here means the containing function whose body
+  we're streaming in is a duplicate too, so we'll end up
+  discarding this BLOCK (and the rest of the duplicate function
+  body) anyway.  */
+   if (is_duplicate (decl))
+ decl = maybe_duplicate (decl);
+   else if (DECL_IMPLICIT_TYPEDEF_P (decl)
+&& TYPE_TEMPLATE_INFO (TREE_TYPE (decl)))
+ {
+   tree tmpl = TYPE_TI_TEMPLATE (TREE_TYPE (decl));
+   if (DECL_TEMPLATE_RESULT (tmpl) == decl && is_duplicate (tmpl))
+ decl = DECL_TEMPLATE_RESULT (maybe_duplicate (tmpl));
+ }


This seems like a lot of generally-applicable code for finding the duplicate,
which other calls to maybe_duplicate/odr_duplicate don't use.  If the template
is a duplicate, why isn't its result?  If there's a good reason for that,
should this template handling go into maybe_duplicate?


Ah yeah, that makes sense.

Some context: IIUC modules treats the TEMPLATE_DECL instead of the
DECL_TEMPLATE_RESULT as the canonical decl, which in turn means we'll
register_duplicate only the TEMPLATE_DECL.  But BLOCK_VARS never contains
a TEMPLATE_DECL, always the DECL_TEMPLATE_RESULT (i.e. a TYPE_DECL),
hence the extra handling.

Given that it's relatively more difficult to get at the TEMPLATE_DECL
from the DECL_TEMPLATE_RESULT rather than vice versa, maybe we should
just register both as duplicates from register_duplicate?  That way
callers can just simply pass the DECL_TEMPLATE_RESULT to maybe_duplicate
and it'll do the right thing.


Sounds good.
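
The idea agreed on above can be sketched as follows. This is an illustration under stated assumptions, not module.cc: Decl, DupMap, register_duplicate_sketch and maybe_duplicate_sketch are hypothetical names, and a plain hash map stands in for whatever table the real register_duplicate uses.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for a decl; template_result is non-null only for a
// template, mirroring DECL_TEMPLATE_RESULT.
struct Decl { Decl *template_result = nullptr; };

using DupMap = std::unordered_map<const Decl *, Decl *>;

// Register DECL as a duplicate of EXISTING.  When both are templates, also
// register their results, so lookups by the result need no extra handling.
inline void register_duplicate_sketch (DupMap &dups, Decl *decl, Decl *existing)
{
  dups[decl] = existing;
  if (decl->template_result && existing->template_result)
    dups[decl->template_result] = existing->template_result;
}

// Return the registered counterpart of DECL, or DECL itself if none exists.
inline Decl *maybe_duplicate_sketch (const DupMap &dups, Decl *decl)
{
  auto it = dups.find (decl);
  return it == dups.end () ? decl : it->second;
}
```

With both entries registered, a caller holding only the template result can pass it straight to the lookup, which is the simplification suggested in the reply.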


@@ -10337,6 +10373,83 @@ trees_in::fn_parms_fini (int tag, tree fn, tree
existing, bool is_defn)
   }
   }
   +/* Encode into KEY the position of the local type (class or enum)
+   declaration DECL within FN.  The position is encoded as the
+   index of the innermost BLOCK (numbered in BFS order) along with
+   the index within its BLOCK_VARS list.  */


Since we already set DECL_DISCRIMINATOR for mangling, could we use it+name for
the key as well?


We could (and IIUC that'd be more robust to ODR violations), but
wouldn't it mean we'd have to do a linear walk over all BLOCK_VARs of
all BLOCKS in order to find the one with the matching
name+discriminator?  That'd be slower than the current approach which
lets us skip to the correct BLOCK and walk only its BLOCK_VARS.


Ah, good point.  How about block number + name instead of the index?
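
A rough sketch of the "block number + name" key, for illustration only. Block, LocalTypeKey and both functions are hypothetical, and the sketch assumes that names are unique within a single block. For simplicity it recomputes the BFS order on every lookup, but names are only ever compared within the one numbered block.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for a BLOCK: the local type names declared directly
// in it, plus its nested blocks.
struct Block {
  std::vector<std::string> vars;
  std::vector<const Block *> subblocks;
};

// Hypothetical key: BFS number of the containing block plus the type's name.
struct LocalTypeKey {
  std::size_t block_number;
  std::string name;
};

// Return the blocks under ROOT in breadth-first order; ROOT gets number 0.
inline std::vector<const Block *> blocks_in_bfs_order (const Block &root)
{
  std::vector<const Block *> order;
  std::queue<const Block *> pending;
  pending.push (&root);
  while (!pending.empty ())
    {
      const Block *b = pending.front ();
      pending.pop ();
      order.push_back (b);
      for (const Block *sub : b->subblocks)
        pending.push (sub);
    }
  return order;
}

// Look KEY up: jump to the numbered block, then search only its vars.
// Returns nullptr when the block or the name is missing.
inline const std::string *find_local_type (const Block &root,
                                           const LocalTypeKey &key)
{
  std::vector<const Block *> order = blocks_in_bfs_order (root);
  if (key.block_number >= order.size ())
    return nullptr;
  for (const std::string &name : order[key.block_number]->vars)
    if (name == key.name)
      return &name;
  return nullptr;
}
```

Compared with an index into the vars list, matching by name within the block would presumably tolerate a differing number or order of other declarations in that block.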

Jason



Re: [PATCH] c++: recalculating local specs via build_extra_args [PR114303]

2024-04-10 Thread Jason Merrill

On 4/10/24 17:39, Patrick Palka wrote:

On Wed, 10 Apr 2024, Jason Merrill wrote:


On 3/12/24 10:51, Patrick Palka wrote:

On Tue, 12 Mar 2024, Patrick Palka wrote:

On Tue, 12 Mar 2024, Jason Merrill wrote:

On 3/11/24 12:53, Patrick Palka wrote:


r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated contexts
first so that we prefer processing a local specialization in an
evaluated
context even if its first use is in an unevaluated context.  But this
means we need to avoid walking a tree that already has extra
args/specs
saved because the list of saved specs appears to be an evaluated
context.  It seems then that we should be calculating the saved specs
from scratch each time, rather than potentially walking the saved
specs
list from an earlier partial instantiation when calling
build_extra_args
a second time around.


Makes sense, but I wonder if we want to approach that by avoiding
walking into
*_EXTRA_ARGS in extract_locals_r?  Or do we still want to walk into any
nested
extra args?  And if so, will we run into this same problem then?


I'm not sure totally but I'd expect a nested extra-args tree to always
have empty *_EXTRA_ARGS since the outer extra-args tree should intercept
any substitution before the inner extra-args tree can see it?


... and so in extract_locals_r I think we can assume *_EXTRA_ARGS is
empty, and not have to explicitly avoid walking it.


It seems more robust to me to handle _EXTRA_ARGS appropriately in
build_extra_args rather than expect callers to know that they shouldn't pass
in a tree with _EXTRA_ARGS set.  At least check and abort in that case?


Sounds good.  That IMHO seems simpler than actually avoiding walking
into *_EXTRA_ARGS from extract_locals_r because we'd have to repeat
the walking logic from cp_walk_subtree modulo the *_EXTRA_ARGS walk.

How does the following look? Bootstrapped and regtested on
x86_64-pc-linux-gnu.

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c38594cd862..6cc9b95fc06 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13310,6 +13310,19 @@ extract_locals_r (tree *tp, int *walk_subtrees, void 
*data_)
  /* Remember local typedefs (85214).  */
  tp = &TYPE_NAME (*tp);
  


Please add a comment explaining why it needs to be null.

Also, how about a generic _EXTRA_ARGS accessor so places like this don't 
need to check each code themselves?



+  if (has_extra_args_mechanism_p (*tp))
+{
+  if (PACK_EXPANSION_P (*tp))
+   gcc_checking_assert (!PACK_EXPANSION_EXTRA_ARGS (*tp));
+  else if (TREE_CODE (*tp) == REQUIRES_EXPR)
+   gcc_checking_assert (!REQUIRES_EXPR_EXTRA_ARGS (*tp));
+  else if (TREE_CODE (*tp) == IF_STMT
+  && IF_STMT_CONSTEXPR_P (*tp))
+   gcc_checking_assert (!IF_STMT_EXTRA_ARGS (*tp));
+  else
+   gcc_unreachable ();
+}
+
if (TREE_CODE (*tp) == DECL_EXPR)
  {
tree decl = DECL_EXPR_DECL (*tp);
@@ -18738,7 +18751,8 @@ tsubst_stmt (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
  IF_COND (stmt) = IF_COND (t);
  THEN_CLAUSE (stmt) = THEN_CLAUSE (t);
  ELSE_CLAUSE (stmt) = ELSE_CLAUSE (t);
- IF_STMT_EXTRA_ARGS (stmt) = build_extra_args (t, args, complain);
+ IF_SCOPE (stmt) = NULL_TREE;


What does IF_SCOPE have to do with this?

Jason



Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Joseph Myers
On Wed, 10 Apr 2024, Qing Zhao wrote:

> Okay, the above is very clear, thanks a lot for the explanation.
> So, basically, for “counted-by” attribute:
> **The following is good:
> struct f { 
>   int b;
>   int c;
>   int a[]  __attribute__ ((counted_by (b))) };
> struct f {
>   int b;
>   int c;
>   int a[] __attribute__ ((counted_by (b))) };
> 
> **The following should error:
> 
> struct f { 
>   int b;
>   int c;
>   int a[]  __attribute__ ((counted_by (b))) };
> struct f {
>   int b;
>   int c;
>   int a[] __attribute__ ((counted_by (c))) };  /* error here */
> 
> For the same tag in different scopes case:
> 
> struct f { 
>   int b;
>   int c;
>   int a[]  __attribute__ ((counted_by (b))) }  y0;
> 
> void test1(void) 
> {   
> struct f {
>   int b;
>   int c;
>   int a[] __attribute__ ((counted_by (c))) } x;
> 
>   y0 = x;  /* will report incompatible type error here */
> }
> 
> Are the above complete?

Yes, that looks like what should be tested (with the addition of the case 
of same tag, different scopes, same counted_by so compatible).

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH v8 5/5] Add the 6th argument to .ACCESS_WITH_SIZE

2024-04-10 Thread Siddhesh Poyarekar

On 2024-03-29 12:07, Qing Zhao wrote:

to carry the TYPE of the flexible array.

Such information is needed during tree-object-size.cc.

We cannot use the result type or the type of the 1st argument
of the routine .ACCESS_WITH_SIZE to decide the element type
of the original array due to possible type casting in the
source code.
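
The arithmetic behind this can be sketched as below, using the numbers from the new test: a char array counted by data_len = 24, read through a cast to a struct of two ints. counted_size is a hypothetical helper, not anything in tree-object-size.cc, and Foo mirrors struct foo under the assumption of 4-byte members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors struct foo from the test, assuming two 4-byte members (8 bytes).
struct Foo { std::int32_t a; std::int32_t b; };

// Hypothetical helper: byte size of COUNT elements of type Elem.
template <typename Elem>
constexpr std::size_t counted_size (std::size_t count)
{
  return count * sizeof (Elem);
}
```

Taking the element type from the original array (char) gives counted_size<char> (24) == 24 bytes, which is what the test expects before subtracting offsets. Taking it from the casted pointer type instead would give counted_size<Foo> (24) == 192, which overstates the object.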

gcc/c/ChangeLog:

* c-typeck.cc (build_access_with_size_for_counted_by): Add the 6th
argument to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): Use the type
of the 6th argument for the type of the element.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-6.c: New test.


This version looks fine to me for stage 1, but I'm not a maintainer so 
you'll need an ack from one to commit.


Thanks,
Sid


---
  gcc/c/c-typeck.cc | 11 +++--
  gcc/internal-fn.cc|  2 +
  .../gcc.dg/flex-array-counted-by-6.c  | 46 +++
  gcc/tree-object-size.cc   | 16 ---
  4 files changed, 66 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-6.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index f7b0e08459b0..05948f76039e 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2608,7 +2608,8 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
  
 to:
  
-   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1))

+   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1,
+   (TYPE_OF_ARRAY *)0))
  
 NOTE: The return type of this function is the POINTER type pointing

 to the original flexible array type.
@@ -2620,6 +2621,9 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
 The 4th argument of the call is a constant 0 with the TYPE of the
 object pointed by COUNTED_BY_REF.
  
+   The 6th argument of the call is a constant 0 with the pointer TYPE

+   to the original flexible array type.
+
*/
  static tree
  build_access_with_size_for_counted_by (location_t loc, tree ref,
@@ -2632,12 +2636,13 @@ build_access_with_size_for_counted_by (location_t loc, 
tree ref,
  
tree call

  = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE,
-   result_type, 5,
+   result_type, 6,
array_to_pointer_conversion (loc, ref),
counted_by_ref,
build_int_cst (integer_type_node, 1),
build_int_cst (counted_by_type, 0),
-   build_int_cst (integer_type_node, -1));
+   build_int_cst (integer_type_node, -1),
+   build_int_cst (result_type, 0));
/* Wrap the call with an INDIRECT_REF with the flexible array type.  */
call = build1 (INDIRECT_REF, TREE_TYPE (ref), call);
SET_EXPR_LOCATION (call, loc);
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index e744080ee670..34e4a4aea534 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3411,6 +3411,8 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
   1: read_only
   2: write_only
   3: read_write
+   6th argument: A constant 0 with the pointer TYPE to the original flexible
+ array type.
  
 Both the return type and the type of the first argument of this

 function have been converted from the incomplete array type to
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
new file mode 100644
index ..65fa01443d95
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
@@ -0,0 +1,46 @@
+/* Test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size: when the type of the flexible array member
+ * is casting to another type.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+typedef unsigned short u16;
+
+struct info {
+   u16 data_len;
+   char data[] __attribute__((counted_by(data_len)));
+};
+
+struct foo {
+   int a;
+   int b;
+};
+
+static __attribute__((__noinline__))
+struct info *setup ()
+{
+ struct info *p;
+ size_t bytes = 3 * sizeof(struct foo);
+
+ p = (struct info *)malloc (sizeof (struct info) + bytes);
+ p->data_len = bytes;
+
+ return p;
+}
+
+static void
+__attribute__((__noinline__)) report (struct info *p)
+{
+ struct foo *bar = (struct foo *)p->data;
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 1), 1), 16);
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 2), 1), 8);
+}
+
+int main(int argc, char *argv[])
+{
+ struct info *p = setup();
+ report(p);
+ return 0;
+}
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index 8de264d1dee2..4c1fa9b555fa 100644
--- 

Re: [PATCH v8 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-04-10 Thread Siddhesh Poyarekar

On 2024-03-29 12:07, Qing Zhao wrote:

gcc/c-family/ChangeLog:

* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-4.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---


This version looks fine to me for stage 1, but I'm not a maintainer so 
you'll need an ack from one to commit.


Thanks,
Sid


  gcc/c-family/c-ubsan.cc   | 42 +
  .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++
  .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++
  .../ubsan/flex-array-counted-by-bounds-4.c| 34 ++
  .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
  5 files changed, 201 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-4.c
  create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..7cd3c6aa5b88 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, 
data));
  }
  
+/* Get the tree that represented the number of counted_by, i.e, the maximum

+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int class_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));
+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+  build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+   build_zero_cst (type), size);
+  }
+
+  /* Only when class_of_size is 1, i.e, the number of the elements of
+ the object type, return the size.  */
+  if (class_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
  /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
 that gets expanded in the sanopt pass, and make an array dimension
 of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
  && COMPLETE_TYPE_P (type)
  && integer_zerop (TYPE_SIZE (type)))
bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+  && is_access_with_size_p ((TREE_OPERAND (array, 0))))
+   {
+ bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+ bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+  bound,
+  build_int_cst (TREE_TYPE (bound), 1));
+   }
else
return NULL_TREE;
  }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index ..b503320628d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* Test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include <stdlib.h>
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void 

Re: [PATCH v8 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-04-10 Thread Siddhesh Poyarekar

On 2024-03-29 12:07, Qing Zhao wrote:

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.


This version looks fine to me for stage 1, but I'm not a maintainer so 
you'll need an ack from one to commit.


Thanks,
Sid


---
  .../gcc.dg/builtin-object-size-common.h   |  11 ++
  .../gcc.dg/flex-array-counted-by-3.c  |  63 +++
  .../gcc.dg/flex-array-counted-by-4.c  | 178 ++
  .../gcc.dg/flex-array-counted-by-5.c  |  48 +
  gcc/tree-object-size.cc   |  60 ++
  5 files changed, 360 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
  create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h 
b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
__builtin_abort ();   \
  return 0;   \
} while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v; \
+  if (p == v)\
+__builtin_printf ("ok:  %s == %zd\n", #p, p);  \
+  else   \
+{\
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);  \
+  FAIL ();   \
+}\
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index ..78f50230e891
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* Test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f; 
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
++ normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+ + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
++ attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+  array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+  array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index ..20103d58ef51
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* Test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+We should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline 

Re: [PATCH] c++: recalculating local specs via build_extra_args [PR114303]

2024-04-10 Thread Patrick Palka
On Wed, 10 Apr 2024, Jason Merrill wrote:

> On 3/12/24 10:51, Patrick Palka wrote:
> > On Tue, 12 Mar 2024, Patrick Palka wrote:
> > > On Tue, 12 Mar 2024, Jason Merrill wrote:
> > > > On 3/11/24 12:53, Patrick Palka wrote:
> > > > > 
> > > > > r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated contexts
> > > > > first so that we prefer processing a local specialization in an
> > > > > evaluated
> > > > > context even if its first use is in an unevaluated context.  But this
> > > > > means we need to avoid walking a tree that already has extra
> > > > > args/specs
> > > > > saved because the list of saved specs appears to be an evaluated
> > > > > context.  It seems then that we should be calculating the saved specs
> > > > > from scratch each time, rather than potentially walking the saved
> > > > > specs
> > > > > list from an earlier partial instantiation when calling
> > > > > build_extra_args
> > > > > a second time around.
> > > > 
> > > > Makes sense, but I wonder if we want to approach that by avoiding
> > > > walking into
> > > > *_EXTRA_ARGS in extract_locals_r?  Or do we still want to walk into any
> > > > nested
> > > > extra args?  And if so, will we run into this same problem then?
> > > 
> > > I'm not totally sure, but I'd expect a nested extra-args tree to always
> > > have empty *_EXTRA_ARGS since the outer extra-args tree should intercept
> > > any substitution before the inner extra-args tree can see it?
> > 
> > ... and so in extract_locals_r I think we can assume *_EXTRA_ARGS is
> > empty, and not have to explicitly avoid walking it.
> 
> It seems more robust to me to handle _EXTRA_ARGS appropriately in
> build_extra_args rather than expect callers to know that they shouldn't pass
> in a tree with _EXTRA_ARGS set.  At least check and abort in that case?

Sounds good.  That IMHO seems simpler than actually avoiding walking
into *_EXTRA_ARGS from extract_locals_r because we'd have to repeat
the walking logic from cp_walk_subtree modulo the *_EXTRA_ARGS walk.

How does the following look? Bootstrapped and regtested on
x86_64-pc-linux-gnu.

-- >8 --

Subject: [PATCH] c++: build_extra_args recapturing local specs [PR114303]

PR c++/114303

gcc/cp/ChangeLog:

* constraint.cc (tsubst_requires_expr): Clear
REQUIRES_EXPR_EXTRA_ARGS before calling build_extra_args.
* pt.cc (extract_locals_r): Assert *_EXTRA_ARGS is empty.
(tsubst_stmt) <case IF_STMT>: Call build_extra_args
on the new IF_STMT instead of t which might already have
IF_STMT_EXTRA_ARGS.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda6.C: New test.
---
 gcc/cp/constraint.cc |  1 +
 gcc/cp/pt.cc | 16 +++-
 .../g++.dg/cpp1z/constexpr-if-lambda6.C  | 16 
 3 files changed, 32 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda6.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 49de3211d4c..8a3b5d80ba7 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2362,6 +2362,7 @@ tsubst_requires_expr (tree t, tree args, sat_info info)
 matching or dguide constraint rewriting), in which case we need
 to partially substitute.  */
   t = copy_node (t);
+  REQUIRES_EXPR_EXTRA_ARGS (t) = NULL_TREE;
   REQUIRES_EXPR_EXTRA_ARGS (t) = build_extra_args (t, args, info.complain);
   return t;
 }
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c38594cd862..6cc9b95fc06 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13310,6 +13310,19 @@ extract_locals_r (tree *tp, int *walk_subtrees, void 
*data_)
 /* Remember local typedefs (85214).  */
 tp = &TYPE_NAME (*tp);
 
+  if (has_extra_args_mechanism_p (*tp))
+{
+  if (PACK_EXPANSION_P (*tp))
+   gcc_checking_assert (!PACK_EXPANSION_EXTRA_ARGS (*tp));
+  else if (TREE_CODE (*tp) == REQUIRES_EXPR)
+   gcc_checking_assert (!REQUIRES_EXPR_EXTRA_ARGS (*tp));
+  else if (TREE_CODE (*tp) == IF_STMT
+  && IF_STMT_CONSTEXPR_P (*tp))
+   gcc_checking_assert (!IF_STMT_EXTRA_ARGS (*tp));
+  else
+   gcc_unreachable ();
+}
+
   if (TREE_CODE (*tp) == DECL_EXPR)
 {
   tree decl = DECL_EXPR_DECL (*tp);
@@ -18738,7 +18751,8 @@ tsubst_stmt (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
  IF_COND (stmt) = IF_COND (t);
  THEN_CLAUSE (stmt) = THEN_CLAUSE (t);
  ELSE_CLAUSE (stmt) = ELSE_CLAUSE (t);
- IF_STMT_EXTRA_ARGS (stmt) = build_extra_args (t, args, complain);
+ IF_SCOPE (stmt) = NULL_TREE;
+ IF_STMT_EXTRA_ARGS (stmt) = build_extra_args (stmt, args, complain);
  add_stmt (stmt);
  break;
}
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda6.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda6.C
new file mode 100644
index 000..038c2a41210
--- /dev/null
+++ 

[wwwdocs] Document more C++ changes

2024-04-10 Thread Marek Polacek
I went through all cp/ commits in GCC 14 and documented a few interesting
user-visible changes, modulo Modules.

W3 validated.  Pushed.
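
One of the changes documented in this commit is that the named return
value optimization can now be performed for a variable declared in an
inner block of a function.  A minimal sketch of that situation follows;
the type `tracked` and the function `make_in_inner_block` are
hypothetical names made up for illustration and are not taken from the
GCC testsuite.  Whether the copy is actually elided depends on the
compiler and its version, so the sketch only counts copies and moves
rather than claiming a particular result.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied or moved.
struct tracked
{
  static int copies_or_moves;
  int value;

  explicit tracked (int v) : value (v) {}
  tracked (const tracked &o) : value (o.value) { ++copies_or_moves; }
  tracked (tracked &&o) noexcept : value (o.value) { ++copies_or_moves; }
};

int tracked::copies_or_moves = 0;

// The returned variable lives in an inner block rather than in the
// outermost block of the function.  A compiler that performs NRVO here
// constructs T directly in the return slot, so no copy or move happens;
// a compiler that does not will move T once.
tracked
make_in_inner_block ()
{
  {
    tracked t (42);
    return t;
  }
}
```

Calling `make_in_inner_block ()` and then inspecting
`tracked::copies_or_moves` shows 0 when the optimization was applied and
1 when it was not.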

commit d65752191baaa137eb6d604b802e7b9170a39752
Author: Marek Polacek 
Date:   Wed Apr 10 17:21:09 2024 -0400

gcc-14/changes: Document more C++ changes

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 4a063346..5c2439ab 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -273,6 +273,9 @@ a work-in-progress.
   
   Several C++23 features have been implemented:
 
+  https://wg21.link/P0847R7;>P0847R7, Deducing this
+  (https://gcc.gnu.org/PR102609;>PR102609)
+  
   https://wg21.link/P2280R4;>P2280R4, Using unknown
   references in constant expressions
   (https://gcc.gnu.org/PR106650;>PR106650)
@@ -289,12 +292,26 @@ a work-in-progress.
   
   Several C++ Defect Reports have been resolved, e.g.:
 
+  https://wg21.link/cwg532;>DR 532,
+  Member/nonmember operator template partial ordering
   https://wg21.link/cwg976;>DR 976,
   Deduction for const T& conversion operators
+  https://wg21.link/cwg2262;>DR 2262,
+   Attributes for asm-definition
+  https://wg21.link/cwg2359;>DR 2359,
+  Unintended copy initialization with designated initializers
+  https://wg21.link/cwg2386;>DR 2386,
+  tuple_size requirements for structured binding
   https://wg21.link/cwg2406;>DR 2406,
   [[fallthrough]] attribute and iteration statements
   https://wg21.link/cwg2543;>DR 2543,
   constinit and optimized dynamic initialization
+  https://wg21.link/cwg2586;>DR 2586,
+  Explicit object parameter for assignment and comparison
+  https://wg21.link/cwg2735;>DR 2735,
+  List-initialization and conversions in overload resolution
+  https://wg21.link/cwg2799;>DR 2799,
+  Inheriting default constructors
 
   
   
@@ -304,6 +321,85 @@ a work-in-progress.
 the template is instantiated ("required from here"),
 rather than just print filename and line/column numbers.
   
+  New built-in __type_pack_element to speed up traits
+  such as std::tuple_element
+  (https://gcc.gnu.org/PR100157;>PR100157)
+  goto can cross the initialization of a trivially initialized
+  object with a non-trivial destructor
+  (https://cplusplus.github.io/CWG/issues/2256.html;>DR 
2256)
+  -Wdangling-reference false positives have been reduced.  The
+  warning does not warn about std::span-like classes; there is
+  also a new attribute gnu::no_dangling to suppress the
+  warning.  See
+  https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html#index-Wdangling-reference;>the
 manual
+  for more info.
+  noexcept(expr) is now mangled as per the Itanium ABI
+  the named return value optimization can now be performed even for
+  variables declared in an inner block of a function, see the
+  https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/testsuite/g%2B%2B.dg/opt/nrv23.C;h=9e1253cd830a84ad4de5ff3076a07c543afe344f;hb=7e0b65b239c3a0d68ce94896b236b03de666ffd6;>
+  test
+  New -Wnrvo warning, to warn if the named return value
+  optimization is not performed although it is allowed by
+  [class.copy.elision].  See
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wnrvo;>the 
manual
+  for more info.
+  The backing array for std::initializer_list has been made
+  static, allowing combining multiple equivalent initializer-lists
+  (https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=4d935f52b0d5c00fcc154461b87415ebd8791a94;>git)
+  
+  New -Welaborated-enum-base warning, to warn if an additional
+  enum-base is used in an elaborated-type-specifier
+  Better #include hints for missing headers
+  (https://gcc.gnu.org/PR110164;>PR110164)
+  The arguments of a variable template-id are coerced earlier than
+  before, so various problems are detected earlier
+  (https://gcc.gnu.org/PR89442;>PR89442)
+  -Wmissing-field-initializers is no longer emitted for
+  empty classes
+  (https://gcc.gnu.org/PR110064;>PR110064)
+  The constexpr code now tracks lifetimes in constant evaluation; this
+  change helps to detect bugs such as accessing a variable whose
+  lifetime has ended
+  (https://gcc.gnu.org/PR70331;>PR70331,
+  https://gcc.gnu.org/PR96630;>PR96630,
+  https://gcc.gnu.org/PR98675;>PR98675)
+  
+  Array destruction can now be devirtualized
+  In-class member variable template partial specializations are now
+  accepted (https://gcc.gnu.org/PR71954;>PR71954)
+  Improved diagnostic for explicit conversion functions: when a conversion
+  doesn't work out only because the conversion function necessary to do the
+  conversion couldn't be used because it was marked explicit, explain that
+  to the user
+  (https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=85ad41a494e31311f8a6b2dbe930a128c5e85840;>git)
+  

[pushed] analyzer: add SARIF property bag to -Wanalyzer-infinite-recursion

2024-04-10 Thread David Malcolm
Tested lightly by hand.
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9900-g960e07d73a5295.
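
For readers unfamiliar with the pattern used in the patch below, here
is a rough, self-contained sketch of how a diagnostic can publish
internal indices under a common key prefix.  It assumes a simplified
`property_bag` built on `std::map`; that class, `recursion_diagnostic`
and the "demo/recursion/" prefix are hypothetical stand-ins and not
GCC's sarif_property_bag API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a SARIF property bag: a flat map from
// property name to integer value.  Not GCC's sarif_property_bag.
class property_bag
{
public:
  void set_integer (const std::string &name, long value)
  {
    m_props[name] = value;
  }

  long get_integer (const std::string &name) const
  {
    return m_props.at (name);
  }

  bool contains (const std::string &name) const
  {
    return m_props.count (name) != 0;
  }

private:
  std::map<std::string, long> m_props;
};

// Hypothetical diagnostic that records two node indices.
struct recursion_diagnostic
{
  int prev_entry_index;
  int new_entry_index;

  void add_properties (property_bag &props) const
  {
    // Adjacent string literals are concatenated by the compiler, so
    // PROPERTY_PREFIX "x" forms the single literal "demo/recursion/x".
#define PROPERTY_PREFIX "demo/recursion/"
    props.set_integer (PROPERTY_PREFIX "prev_entry_enode", prev_entry_index);
    props.set_integer (PROPERTY_PREFIX "new_entry_enode", new_entry_index);
#undef PROPERTY_PREFIX
  }
};
```

Defining and undefining the prefix macro around its uses keeps the key
names short at the call sites without leaking the macro into the rest
of the translation unit.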

gcc/analyzer/ChangeLog:
* infinite-recursion.cc: Include "diagnostic-format-sarif.h".
(infinite_recursion_diagnostic::maybe_add_sarif_properties): New.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/infinite-recursion.cc | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/analyzer/infinite-recursion.cc 
b/gcc/analyzer/infinite-recursion.cc
index 112e4bd08f28..65f136ddad6b 100644
--- a/gcc/analyzer/infinite-recursion.cc
+++ b/gcc/analyzer/infinite-recursion.cc
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "make-unique.h"
 #include "analyzer/checker-path.h"
 #include "analyzer/feasible-graph.h"
+#include "diagnostic-format-sarif.h"
 
 /* A subclass of pending_diagnostic for complaining about suspected
infinite recursion.  */
@@ -236,6 +237,18 @@ public:
 return false;
   }
 
+  void maybe_add_sarif_properties (sarif_object &result_obj)
+const final override
+  {
+sarif_property_bag &props = result_obj.get_or_create_properties ();
+#define PROPERTY_PREFIX "gcc/analyzer/infinite_recursion_diagnostic/"
+props.set_integer (PROPERTY_PREFIX "prev_entry_enode",
+  m_prev_entry_enode->m_index);
+props.set_integer (PROPERTY_PREFIX "new_entry_enode",
+  m_new_entry_enode->m_index);
+#undef PROPERTY_PREFIX
+  }
+
 private:
   /* Return true iff control flow along FEDGE was affected by
  a conjured_svalue.  */
-- 
2.26.3



[pushed] analyzer: fixes to internal docs

2024-04-10 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9897-g7f6599a201be2a.

gcc/ChangeLog:
* doc/analyzer.texi: Various tweaks.

Signed-off-by: David Malcolm 
---
 gcc/doc/analyzer.texi | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/analyzer.texi b/gcc/doc/analyzer.texi
index 8eb40272cb71..b53096e7b7d9 100644
--- a/gcc/doc/analyzer.texi
+++ b/gcc/doc/analyzer.texi
@@ -21,6 +21,9 @@
 
 @subsection Overview
 
+At a high-level, we're doing coverage-guided symbolic execution of the
+user's code.
+
 The analyzer implementation works on the gimple-SSA representation.
 (I chose this in the hopes of making it easy to work with LTO to
 do whole-program analysis).
@@ -55,7 +58,9 @@ Next is the heart of the analyzer: we use a worklist to 
explore state
 within the supergraph, building an "exploded graph".
 Nodes in the exploded graph correspond to <point, state> pairs, as in
  "Precise Interprocedural Dataflow Analysis via Graph Reachability"
- (Thomas Reps, Susan Horwitz and Mooly Sagiv).
+ (Thomas Reps, Susan Horwitz and Mooly Sagiv) - but note that
+we're not using the algorithm described in that paper, just the
+``exploded graph'' terminology.
 
 We reuse nodes for <point, state> pairs we've already seen, and avoid
 tracking state too closely, so that (hopefully) we rapidly converge
@@ -499,7 +504,8 @@ which dumps a @file{SRC.eg.txt} file containing the full 
@code{exploded_graph}.
 
 Assuming that you have the
 
@uref{https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html,,python 
support scripts for gdb}
-installed, you can use:
+installed (which you should do, it makes debugging GCC much easier),
+you can use:
 
 @smallexample
 (gdb) break-on-saved-diagnostic
-- 
2.26.3



[pushed] analyzer: fix ICE on negative values for size_t [PR114472]

2024-04-10 Thread David Malcolm
I made several attempts to fix this properly, but for now apply
a band-aid to at least prevent crashing on such cases.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9902-g4a94551d7eaaf7.
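
The store.cc part of this change replaces assertions with tests that
reject inconsistent input.  The idea can be sketched outside of GCC as
follows, assuming a simplified `range` type with 64-bit fields; the
type and function here are hypothetical and only mirror the shape of
the check, not the wide-integer types the analyzer really uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical half-open range [start, start + size); a simplified
// stand-in, not GCC's bit_range.
struct range
{
  std::int64_t start;
  std::int64_t size;

  std::int64_t next () const { return start + size; }
};

// Return true and write the overlap size to *OUT_OVERLAP if A and B
// intersect.  Return false if they are disjoint, and also if the
// computed overlap is empty or negative, which can only happen when
// the input is inconsistent (for example, a negative size).
bool
intersects_p (const range &a, const range &b, std::int64_t *out_overlap)
{
  if (a.start < b.next () && b.start < a.next ())
    {
      std::int64_t overlap_start = std::max (a.start, b.start);
      std::int64_t overlap_next = std::min (a.next (), b.next ());
      if (overlap_next <= overlap_start)
        /* Previously this would have been an assertion failure;
           reject the case instead of crashing.  */
        return false;
      *out_overlap = overlap_next - overlap_start;
      return true;
    }
  return false;
}
```

With well-formed ranges the inner test never fires; a range such as
{10, -4} passes the outer comparison against {0, 20} but yields an
overlap end before its start, and is now rejected rather than tripping
an assertion.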

gcc/analyzer/ChangeLog:
PR analyzer/114472
* access-diagram.cc (bit_size_expr::maybe_get_formatted_str):
Reject attempts to print sizes that are too large.
* region.cc (region_offset::calc_symbolic_bit_offset): Use a
typeless svalue for the bit offset.
* store.cc (bit_range::intersects_p): Replace assertion with
test.
(bit_range::exceeds_p): Likewise.
(bit_range::falls_short_of_p): Likewise.

gcc/testsuite/ChangeLog:
* c-c++-common/analyzer/out-of-bounds-pr114472.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/access-diagram.cc|  4 
 gcc/analyzer/region.cc|  2 +-
 gcc/analyzer/store.cc | 20 +++
 .../analyzer/out-of-bounds-pr114472.c | 17 
 4 files changed, 38 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/analyzer/out-of-bounds-pr114472.c

diff --git a/gcc/analyzer/access-diagram.cc b/gcc/analyzer/access-diagram.cc
index 4cb6570e90b9..873205c46499 100644
--- a/gcc/analyzer/access-diagram.cc
+++ b/gcc/analyzer/access-diagram.cc
@@ -373,6 +373,8 @@ bit_size_expr::maybe_get_formatted_str 
(text_art::style_manager &sm,
   if (tree cst = num_bytes->maybe_get_constant ())
{
  byte_size_t concrete_num_bytes = wi::to_offset (cst);
+ if (!wi::fits_uhwi_p (concrete_num_bytes))
+   return nullptr;
  if (concrete_num_bytes == 1)
return ::make_unique 
  (fmt_styled_string (sm, concrete_single_byte_fmt,
@@ -396,6 +398,8 @@ bit_size_expr::maybe_get_formatted_str 
(text_art::style_manager &sm,
   else if (tree cst = m_num_bits.maybe_get_constant ())
 {
   bit_size_t concrete_num_bits = wi::to_offset (cst);
+  if (!wi::fits_uhwi_p (concrete_num_bits))
+   return nullptr;
   if (concrete_num_bits == 1)
return ::make_unique 
  (fmt_styled_string (sm, concrete_single_bit_fmt,
diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index 705816b62454..7d79b45563fd 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -89,7 +89,7 @@ region_offset::calc_symbolic_bit_offset (region_model_manager 
*mgr) const
m_sym_offset, bits_per_byte);
 }
   else
-return *mgr->get_or_create_int_cst (size_type_node, m_offset);
+return *mgr->get_or_create_int_cst (NULL_TREE, m_offset);
 }
 
 const svalue *
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index e85a19647f7e..a36de13c1743 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -290,7 +290,10 @@ bit_range::intersects_p (const bit_range &other,
   bit_offset_t overlap_next
= MIN (get_next_bit_offset (),
   other.get_next_bit_offset ());
-  gcc_assert (overlap_next > overlap_start);
+  if (overlap_next <= overlap_start)
+   /* If this has happened, some kind of overflow has happened in
+  our arithmetic.  For now, reject such cases.  */
+   return false;
   bit_range abs_overlap_bits (overlap_start, overlap_next - overlap_start);
   *out_this = abs_overlap_bits - get_start_bit_offset ();
   *out_other = abs_overlap_bits - other.get_start_bit_offset ();
@@ -316,7 +319,10 @@ bit_range::intersects_p (const bit_range &other,
 other.get_start_bit_offset ());
   bit_offset_t overlap_next = MIN (get_next_bit_offset (),
other.get_next_bit_offset ());
-  gcc_assert (overlap_next > overlap_start);
+  if (overlap_next <= overlap_start)
+   /* If this has happened, some kind of overflow has happened in
+  our arithmetic.  For now, reject such cases.  */
+   return false;
   *out_num_overlap_bits = overlap_next - overlap_start;
   return true;
 }
@@ -339,7 +345,10 @@ bit_range::exceeds_p (const bit_range &other,
   bit_offset_t start = MAX (get_start_bit_offset (),
 other.get_next_bit_offset ());
   bit_offset_t size = get_next_bit_offset () - start;
-  gcc_assert (size > 0);
+  if (size <= 0)
+   /* If this has happened, some kind of overflow has happened in
+  our arithmetic.  For now, reject such cases.  */
+   return false;
   out_overhanging_bit_range->m_start_bit_offset = start;
   out_overhanging_bit_range->m_size_in_bits = size;
   return true;
@@ -362,7 +371,10 @@ bit_range::falls_short_of_p (bit_offset_t offset,
   /* THIS falls short of OFFSET.  */
   bit_offset_t start = get_start_bit_offset ();
   bit_offset_t size = MIN (offset, 

[pushed] analyzer: show size in SARIF property bag for -Wanalyzer-tainted-allocation-size

2024-04-10 Thread David Malcolm
Tested lightly by hand.
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9898-g115d5c6b009456.

gcc/analyzer/ChangeLog:
* sm-taint.cc (tainted_allocation_size::tainted_allocation_size):
Add "size_in_bytes" param.
(tainted_allocation_size::maybe_add_sarif_properties): New.
(tainted_allocation_size::m_size_in_bytes): New field.
(region_model::check_dynamic_size_for_taint): Pass size_in_bytes
to tainted_allocation_size ctor.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/sm-taint.cc | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc
index 1d1e208fdf49..a9c6d4db43f4 100644
--- a/gcc/analyzer/sm-taint.cc
+++ b/gcc/analyzer/sm-taint.cc
@@ -645,8 +645,10 @@ class tainted_allocation_size : public taint_diagnostic
 {
 public:
   tainted_allocation_size (const taint_state_machine &sm, tree arg,
+  const svalue *size_in_bytes,
   enum bounds has_bounds, enum memory_space mem_space)
   : taint_diagnostic (sm, arg, has_bounds),
+m_size_in_bytes (size_in_bytes),
 m_mem_space (mem_space)
   {
   }
@@ -781,7 +783,18 @@ public:
}
   }
 
+  void maybe_add_sarif_properties (sarif_object &result_obj)
+const final override
+  {
+taint_diagnostic::maybe_add_sarif_properties (result_obj);
+sarif_property_bag &props = result_obj.get_or_create_properties ();
+#define PROPERTY_PREFIX "gcc/analyzer/tainted_allocation_size/"
+props.set (PROPERTY_PREFIX "size_in_bytes", m_size_in_bytes->to_json ());
+#undef PROPERTY_PREFIX
+  }
+
 private:
+  const svalue *m_size_in_bytes;
   enum memory_space m_mem_space;
 };
 
@@ -1678,7 +1691,7 @@ region_model::check_dynamic_size_for_taint (enum 
memory_space mem_space,
 {
   tree arg = get_representative_tree (size_in_bytes);
   ctxt->warn (make_unique<tainted_allocation_size>
-   (taint_sm, arg, b, mem_space));
+   (taint_sm, arg, size_in_bytes, b, mem_space));
 }
 }
 
-- 
2.26.3



[pushed] analyzer, testsuite: comment fixes

2024-04-10 Thread David Malcolm
Successfully regrtested on x86_64-pc-linux-gnu.

Pushed to trunk as r14-9896-g082374f6570a31.
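
The comment being clarified concerns a memset whose length is only
known at run time.  As a small illustration of why the value of
buf[42] is unknown to a static analysis in that situation, consider
the following sketch; `byte_42_after_memset` is a hypothetical helper
written for this note and is not part of the testsuite.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Fill a buffer with 'A', zero the first N bytes, and report what is
// left at index 42.  The result depends entirely on N.
char
byte_42_after_memset (std::size_t n)
{
  char buf[256];
  std::memset (buf, 'A', sizeof buf);
  if (n > sizeof buf)
    n = sizeof buf;  // Clamp so the sketch never writes out of bounds.
  std::memset (buf, 0, n);
  return buf[42];
}
```

For N up to 42 the byte is still 'A'; for N of 43 or more it has been
overwritten with zero.  Without knowing N, neither comparison in the
test can be decided, hence the UNKNOWN results.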

gcc/testsuite/ChangeLog:
* c-c++-common/analyzer/memset-1.c: Clarify some comments.

Signed-off-by: David Malcolm 
---
 gcc/testsuite/c-c++-common/analyzer/memset-1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/analyzer/memset-1.c 
b/gcc/testsuite/c-c++-common/analyzer/memset-1.c
index 75aef53d3487..d6695d494111 100644
--- a/gcc/testsuite/c-c++-common/analyzer/memset-1.c
+++ b/gcc/testsuite/c-c++-common/analyzer/memset-1.c
@@ -58,7 +58,7 @@ void test_5 (int n)
   __analyzer_eval (buf[42] == 'A'); /* { dg-warning "TRUE" } */
   memset (buf, 0, n);
 
-  /* We can't know if buf[42] was written to or not.  */
+  /* We can't know if buf[42] was overwritten by the memset or not.  */
   __analyzer_eval (buf[42] == 'A'); /* { dg-warning "UNKNOWN" } */
   __analyzer_eval (buf[42] == '\0'); /* { dg-warning "UNKNOWN" } */
 }
@@ -72,7 +72,7 @@ void test_5a (int n)
   __analyzer_eval (buf[42] == 'A'); /* { dg-warning "TRUE" } */
   __builtin___memset_chk (buf, 0, n, __builtin_object_size (buf, 0));
 
-  /* We can't know if buf[42] was written to or not.  */
+  /* We can't know if buf[42] was overwritten by the memset or not.  */
   __analyzer_eval (buf[42] == 'A'); /* { dg-warning "UNKNOWN" } */
   __analyzer_eval (buf[42] == '\0'); /* { dg-warning "UNKNOWN" } */
 }
-- 
2.26.3



[pushed] testsuite: add some missing -fanalyzer to plugin tests

2024-04-10 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9895-gd09d70cdb2a4bc.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/copy_from_user-1.c: Add missing directives for an
analyzer test.
* gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c: Add missing
-fanalyzer to options.
* gcc.dg/plugin/taint-CVE-2011-0521-1.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c: Likewise.
(dvb_usercopy): Add default case to avoid complaints about NULL
derefs.
* gcc.dg/plugin/taint-CVE-2011-0521-2.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c: Add missing
-fanalyzer to options.
* gcc.dg/plugin/taint-CVE-2011-0521-3.c: Likewise.  Drop
xfail.

Signed-off-by: David Malcolm 
---
 gcc/testsuite/gcc.dg/plugin/copy_from_user-1.c| 4 
 gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c | 2 +-
 gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1.c   | 2 +-
 gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c | 4 +++-
 gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2.c   | 4 +++-
 gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c | 2 +-
 gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-3.c   | 5 ++---
 7 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/plugin/copy_from_user-1.c 
b/gcc/testsuite/gcc.dg/plugin/copy_from_user-1.c
index a1415f38aa65..1acedc2e2ce8 100644
--- a/gcc/testsuite/gcc.dg/plugin/copy_from_user-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/copy_from_user-1.c
@@ -1,3 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-fanalyzer" } */
+/* { dg-require-effective-target analyzer } */
+
 typedef __SIZE_TYPE__ size_t;
 
 #define __user
diff --git a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c 
b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c
index 51526b831c02..9ad05ff670a2 100644
--- a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c
+++ b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target analyzer } */
-/* { dg-additional-options "-Wno-pedantic" } */
+/* { dg-additional-options "-fanalyzer -Wno-pedantic" } */
 
 /* See notes in this header.  */
 #include "taint-CVE-2011-0521.h"
diff --git a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1.c 
b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1.c
index 3d11a75073c1..688d014956ec 100644
--- a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-Wno-pedantic" } */
+/* { dg-additional-options "-fanalyzer -Wno-pedantic" } */
 /* { dg-require-effective-target analyzer } */
 
 /* See notes in this header.  */
diff --git a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c 
b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c
index d035266b16ad..7e597037ec24 100644
--- a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c
+++ b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-Wno-pedantic" } */
+/* { dg-additional-options "-fanalyzer -Wno-pedantic" } */
 /* { dg-require-effective-target analyzer } */
 
 /* See notes in this header.  */
@@ -67,6 +67,8 @@ int dvb_usercopy(struct file *file,
if (copy_from_user(parg, (void __user *)arg, _IOC_SIZE(cmd)))
goto out;
break;
+   default:
+   goto out;
}
 
/* call driver */
diff --git a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2.c 
b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2.c
index 5270e22f1a32..9189cdb2c37c 100644
--- a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2.c
+++ b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-2.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target analyzer } */
-/* { dg-additional-options "-Wno-pedantic" } */
+/* { dg-additional-options "-fanalyzer -Wno-pedantic" } */
 
 /* See notes in this header.  */
 #include "taint-CVE-2011-0521.h"
@@ -67,6 +67,8 @@ int dvb_usercopy(struct file *file,
if (copy_from_user(parg, (void __user *)arg, _IOC_SIZE(cmd)))
goto out;
break;
+   default:
+   goto out;
}
 
/* call driver */
diff --git a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c 
b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c
index b8268fa4a826..d10ce28b40e2 100644
--- a/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c
+++ b/gcc/testsuite/gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target analyzer } */
-/* { dg-additional-options "-Wno-pedantic" } */
+/* { dg-additional-options 

[pushed] analyzer: add SARIF property bag to -Wanalyzer-infinite-loop

2024-04-10 Thread David Malcolm
Tested lightly by hand.
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9901-g107b0e63be023c.

gcc/analyzer/ChangeLog:
* infinite-loop.cc: Include "diagnostic-format-sarif.h".
(infinite_loop::to_json): New.
(infinite_loop_diagnostic::maybe_add_sarif_properties): New.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/infinite-loop.cc | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/gcc/analyzer/infinite-loop.cc b/gcc/analyzer/infinite-loop.cc
index 296489b1146d..e277a8384a04 100644
--- a/gcc/analyzer/infinite-loop.cc
+++ b/gcc/analyzer/infinite-loop.cc
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "analyzer/checker-path.h"
 #include "analyzer/feasible-graph.h"
 #include "make-unique.h"
+#include "diagnostic-format-sarif.h"
 
 /* A bundle of data characterizing a particular infinite loop
identified within the exploded graph.  */
@@ -105,6 +106,18 @@ struct infinite_loop
&& m_loc == other.m_loc);
   }
 
+  json::object *
+  to_json () const
+  {
+json::object *loop_obj = new json::object ();
+loop_obj->set_integer ("enode", m_enode.m_index);
+json::array *edge_arr = new json::array ();
+for (auto eedge : m_eedge_vec)
+  edge_arr->append (eedge->to_json ());
+loop_obj->set ("eedges", edge_arr);
+return loop_obj;
+  }
+
   const exploded_node &m_enode;
   location_t m_loc;
   std::vector m_eedge_vec;
@@ -297,6 +310,15 @@ public:
   }
   }
 
+  void maybe_add_sarif_properties (sarif_object &result_obj)
+const final override
+  {
+sarif_property_bag &props = result_obj.get_or_create_properties ();
+#define PROPERTY_PREFIX "gcc/analyzer/infinite_loop_diagnostic/"
+props.set (PROPERTY_PREFIX "inf_loop", m_inf_loop->to_json ());
+#undef PROPERTY_PREFIX
+  }
+
 private:
   std::unique_ptr<infinite_loop> m_inf_loop;
 };
-- 
2.26.3



[pushed] analyzer: add SARIF property bags to -Wanalyzer-overlapping-buffers

2024-04-10 Thread David Malcolm
Tested lightly by hand.
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9899-g7a49d5dc0ef345.

gcc/analyzer/ChangeLog:
* call-details.cc: Include "diagnostic-format-sarif.h".
(overlapping_buffers::overlapping_buffers): Add params for new
fields.
(overlapping_buffers::maybe_add_sarif_properties): New.
(overlapping_buffers::m_byte_range_a): New field.
(overlapping_buffers::byte_range_b): New field.
(overlapping_buffers::m_num_bytes_read_sval): New field.
(call_details::complain_about_overlap): Pass new params to
overlapping_buffers ctor.
* ranges.cc (symbolic_byte_offset::to_json): New.
(symbolic_byte_range::to_json): New.
* ranges.h (symbolic_byte_offset::to_json): New decl.
(symbolic_byte_range::to_json): New decl.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/call-details.cc | 33 ++---
 gcc/analyzer/ranges.cc   | 15 +++
 gcc/analyzer/ranges.h|  4 
 3 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/gcc/analyzer/call-details.cc b/gcc/analyzer/call-details.cc
index 5b145a2ce638..ca47953f1461 100644
--- a/gcc/analyzer/call-details.cc
+++ b/gcc/analyzer/call-details.cc
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "make-unique.h"
+#include "diagnostic-format-sarif.h"
 
 #if ENABLE_ANALYZER
 
@@ -425,8 +426,14 @@ class overlapping_buffers
 : public pending_diagnostic_subclass<overlapping_buffers>
 {
 public:
-  overlapping_buffers (tree fndecl)
-  : m_fndecl (fndecl)
+  overlapping_buffers (tree fndecl,
+  const symbolic_byte_range &byte_range_a,
+  const symbolic_byte_range &byte_range_b,
+  const svalue *num_bytes_read_sval)
+  : m_fndecl (fndecl),
+m_byte_range_a (byte_range_a),
+m_byte_range_b (byte_range_b),
+m_num_bytes_read_sval (num_bytes_read_sval)
   {
   }
 
@@ -469,8 +476,25 @@ public:
m_fndecl);
   }
 
+  void maybe_add_sarif_properties (sarif_object &result_obj)
+const final override
+  {
+sarif_property_bag &props = result_obj.get_or_create_properties ();
+#define PROPERTY_PREFIX "gcc/analyzer/overlapping_buffers/"
+props.set (PROPERTY_PREFIX "bytes_range_a",
+  m_byte_range_a.to_json ());
+props.set (PROPERTY_PREFIX "bytes_range_b",
+  m_byte_range_b.to_json ());
+props.set (PROPERTY_PREFIX "num_bytes_read_sval",
+  m_num_bytes_read_sval->to_json ());
+#undef PROPERTY_PREFIX
+  }
+
 private:
   tree m_fndecl;
+  symbolic_byte_range m_byte_range_a;
+  symbolic_byte_range m_byte_range_b;
+  const svalue *m_num_bytes_read_sval;
 };
 
 
@@ -517,7 +541,10 @@ call_details::complain_about_overlap (unsigned arg_idx_a,
   if (!byte_range_a.intersection (byte_range_b, *model).is_true ())
 return;
 
-  ctxt->warn (make_unique<overlapping_buffers> (get_fndecl_for_call ()));
+  ctxt->warn (make_unique<overlapping_buffers> (get_fndecl_for_call (),
+   byte_range_a,
+   byte_range_b,
+   num_bytes_read_sval));
 }
 
 } // namespace ana
diff --git a/gcc/analyzer/ranges.cc b/gcc/analyzer/ranges.cc
index ffdd0d4c5722..659ada7609d6 100644
--- a/gcc/analyzer/ranges.cc
+++ b/gcc/analyzer/ranges.cc
@@ -103,6 +103,12 @@ symbolic_byte_offset::dump (bool simple) const
   pp_flush ();
 }
 
+json::value *
+symbolic_byte_offset::to_json () const
+{
+  return m_num_bytes_sval->to_json ();
+}
+
 tree
 symbolic_byte_offset::maybe_get_constant () const
 {
@@ -156,6 +162,15 @@ symbolic_byte_range::dump (bool simple, 
region_model_manager ) const
   pp_flush ();
 }
 
+json::value *
+symbolic_byte_range::to_json () const
+{
+  json::object *obj = new json::object ();
+  obj->set ("start", m_start.to_json ());
+  obj->set ("size", m_size.to_json ());
+  return obj;
+}
+
 bool
 symbolic_byte_range::empty_p () const
 {
diff --git a/gcc/analyzer/ranges.h b/gcc/analyzer/ranges.h
index 92d963b7a2bc..aca4554bde69 100644
--- a/gcc/analyzer/ranges.h
+++ b/gcc/analyzer/ranges.h
@@ -39,6 +39,8 @@ public:
   void dump_to_pp (pretty_printer *pp, bool) const;
   void dump (bool) const;
 
+  json::value *to_json () const;
+
   bool operator== (const symbolic_byte_offset &other) const
   {
return m_num_bytes_sval == other.m_num_bytes_sval;
@@ -70,6 +72,8 @@ public:
   region_model_manager ) const;
   void dump (bool, region_model_manager ) const;
 
+  json::value *to_json () const;
+
   bool empty_p () const;
 
   symbolic_byte_offset get_start_byte_offset () const
-- 
2.26.3



Re: [Patch, fortran] PR113363 - ICE on ASSOCIATE and unlimited polymorphic function

2024-04-10 Thread Harald Anlauf

Hi Paul!

On 4/10/24 10:25, Paul Richard Thomas wrote:

Hi All,

This patch corrects incorrect results from assignment of unlimited
polymorphic function results both in assignment statements and allocation
with source.

The first chunk in trans-array.cc ensures that the array dtype is set to
the source dtype. The second chunk ensures that the lhs _len field does not
default to zero and so is specific to dynamic types of character.

The addition to trans-stmt.cc transforms the source expression, aka expr3,
from a derived type of type "STAR" into a proper unlimited polymorphic
expression ready for assignment to the newly allocated entity.


I am wondering about the following snippet in trans-stmt.cc:

+ /* Copy over the lhs _data component ref followed by the
+full array reference for source expressions with rank.
+Otherwise, just copy the _data component ref.  */
+ if (code->expr3->rank
+ && ref && ref->next && !ref->next->next)
+   {
+ rhs->ref = gfc_copy_ref (ref);
+ rhs->ref->next = gfc_copy_ref (ref->next);
+ break;
+   }

Why the two gfc_copy_ref?  valgrind pointed me to the tail
of gfc_copy_ref which already has:

  dest->next = gfc_copy_ref (src->next);

so this looks redundant and leaks frontend memory?
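
For illustration only, here is a minimal stand-alone sketch of why the
second call looks redundant.  The "struct ref" and "copy_ref" below are
hypothetical stand-ins, not the gfortran data structures; the only
assumption carried over is the one visible in the quoted tail of
gfc_copy_ref, namely that the copy recurses into ->next.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference chain; this is not gfc_ref.  */
struct ref
{
  int kind;
  struct ref *next;
};

/* Deep copy: like the quoted tail of gfc_copy_ref, the copy recurses
   into SRC->next, so the whole chain is duplicated by one call.  */
static struct ref *
copy_ref (const struct ref *src)
{
  if (src == NULL)
    return NULL;

  struct ref *dest = malloc (sizeof *dest);
  if (dest == NULL)
    abort ();
  dest->kind = src->kind;
  dest->next = copy_ref (src->next);
  return dest;
}

/* Count the elements of a chain.  */
static int
chain_length (const struct ref *r)
{
  int n = 0;
  for (; r != NULL; r = r->next)
    n++;
  return n;
}

/* Release every element of a chain.  */
static void
free_chain (struct ref *r)
{
  while (r != NULL)
    {
      struct ref *next = r->next;
      free (r);
      r = next;
    }
}
```

Under that assumption a single copy_ref call on a two-element chain
already returns a two-element copy, so assigning a second copy to
->next would orphan the tail made by the first call.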

***

Playing with the testcase, I find several invalid writes with
valgrind, or a heap buffer overflow with -fsanitize=address .

It is sufficient to look at a mini-test where the class(*) function
result is assigned to the class(*), allocatable in the main:

  x = foo ()
  deallocate (x)

The dump tree suggests that array bounds in foo() are read before
they are properly set.

These invalid writes do not occur with 13-branch, so this might
be a regression.

Can you have a look yourself?

Thanks,
Harald


OK for mainline?

Paul

Fortran: Fix wrong code in unlimited polymorphic assignment [PR113363]

2024-04-10  Paul Thomas  

gcc/fortran
PR fortran/113363
* trans-array.cc (gfc_array_init_size): Use the expr3 dtype so
that the correct element size is used.
(gfc_alloc_allocatable_for_assignment): Set the _len field for
unlimited polymorphic assignments.
* trans-stmt.cc (gfc_trans_allocate): Build a correct rhs for
the assignment of an unlimited polymorphic 'source'.

gcc/testsuite/
PR fortran/113363
* gfortran.dg/pr113363.f90: New test.






Re: [PATCH v8 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-04-10 Thread Qing Zhao
Thanks a lot for your review.

Will fix these typos before committing to GCC15.

Qing

> On Apr 10, 2024, at 14:36, Joseph Myers  wrote:
> 
> On Fri, 29 Mar 2024, Qing Zhao wrote:
> 
>> +/* For a SUBDATUM field of a structure or union DATUM, generate a REF to
>> +   the object that represents its counted_by per the attribute counted_by
>> +   attached to this field if it's a flexible array member field, otherwise
>> +   return NULL_TREE.
>> +   set COUNTED_BY_TYPE to the TYPE of the counted_by field.
> 
> Use an uppercase letter at the start of a sentence, "Set".
> 
>> +static tree
>> +build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
>> +{
>> +  tree type = TREE_TYPE (datum);
>> +  if (!(c_flexible_array_member_type_p (TREE_TYPE (subdatum
>> +return NULL_TREE;
> 
> There are redundant parentheses here around the call to 
> c_flexible_array_member_type_p.
> 
> The C front-end changes in this patch are OK for GCC 15 (after GCC 14 has 
> branched, and once a version of patch 1 has also been approved) with those 
> fixes.
> 
> -- 
> Joseph S. Myers
> josmy...@redhat.com
> 



Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Qing Zhao


> On Apr 10, 2024, at 15:05, Martin Uecker  wrote:
> 
> Am Mittwoch, dem 10.04.2024 um 20:25 +0200 schrieb Martin Uecker:
>> Am Mittwoch, dem 10.04.2024 um 17:35 + schrieb Joseph Myers:
>>> On Fri, 29 Mar 2024, Qing Zhao wrote:
>>> 
>>>> +  /* Issue error when there is a counted_by attribute with a different
>>>> + field as the argument for the same flexible array member field.  */
>>> 
>>> There's another case of this to consider, though I'm not sure where best 
>>> to check for it (Martin might have suggestions) - of course this case will 
>>> need testcases as well.
>>> 
>>> Suppose, as allowed in C23, a structure is defined twice in the same 
>>> scope, but the two definitions of the structure use inconsistent 
>>> counted_by attributes.  I'd say that, when the declarations are in the 
>>> same scope (thus required to be consistent), it should be an error for the 
>>> two definitions of what is meant to be the same structure to use 
>>> incompatible counted_by attributes (even though the member declarations 
>>> are otherwise the same).
>> 
>> I think the right place could be comp_types_attributes in
>> attributes.cc.  It may be sufficient to set the
>> affects_type_identify flag.
>> 
>> This should then give a redefinition error as it should do for
>> "packed".
> 
> Thinking about this a bit more, this will not work here, because
> the counted_by attribute is not applied to the struct type but
> one of the members.
> 
> So probably there should be a check added directly
> to tagged_types_tu_compatible_p


There are two cases we will check:

  A. Both definitions are in the same scope:
  If the 2nd definition has a counted_by attribute different from that of
the 1st definition, the 2nd definition will be given a redefinition error.

  B. The two definitions are in different scopes:
  When the two definitions are used in a way that requires them to be
compatible, an incompatible-type error needs to be issued at that point.


My question is: can the routine “tagged_types_tu_compatible_p” handle both
A and B?

Thanks.

Qing
> 
> Martin
> 
>> 
>>> 
>>> In C23 structures defined with the same tag in different scopes are 
>>> compatible given requirements including compatible types for corresponding 
>>> elements.  It would seem most appropriate to me for such structures with 
>>> incompatible counted_by attributes to be considered *not* compatible types 
>>> (but it would be valid to define structures with the same tag, different 
>>> scopes, and elements the same except for counted_by - just not to use them 
>>> in any way requiring them to be compatible).
>> 
>> Another option might be to warn about the case when those types
>> are then used together in a way where they are required to
>> be compatible.  Then comp_types_attributes would have to return 2.
>> 
>> 
>> Martin
>> 
>>> 
>>>> +The @code{counted_by} attribute may be attached to the C99 flexible array
>>>> +member of a structure.  It indicates that the number of the elements of the
>>>> +array is given by the field "@var{count}" in the same structure as the
>>> 
>>> As noted previously, the "" quotes should be removed there (or replaced by 
>>> ``'' quotes).
>>> 
>> 
> 



Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Qing Zhao


> On Apr 10, 2024, at 14:44, Joseph Myers  wrote:
> 
> On Wed, 10 Apr 2024, Qing Zhao wrote:
> 
>> A stupid question first, the same scope means the same file? (Or same 
>> function)
> 
> struct X { int a; };
> struct X { int a; };
> 
> is an example of the same scope (file scope, in this case).  The 
> structures must have the same contents (in an appropriate sense) and are 
> then considered the same type.
> 
> struct X { int a; };
> void f() { struct X { int a; }; }
> 
> is not the same scope - but C23 makes the types compatible (not the same).  
> It's OK to have incompatible types with the same tag in different scopes 
> as well
> 
> struct X { int a; };
> void f() { struct X { long b; }; }
> 
> but if you use them in a way requiring compatibility, then the contents 
> must be compatible
> 
> struct X { int a; } v;
> void f() { struct X { int a; } *p = &v; }

Okay, the above is very clear, thanks a lot for the explanation.
So, basically, for the “counted_by” attribute:
**The following is good:
struct f { 
  int b;
  int c;
  int a[]  __attribute__ ((counted_by (b))) };
struct f {
  int b;
  int c;
  int a[] __attribute__ ((counted_by (b))) };

**The following should error:

struct f { 
  int b;
  int c;
  int a[]  __attribute__ ((counted_by (b))) };
struct f {
  int b;
  int c;
  int a[] __attribute__ ((counted_by (c))) };  /* error here */

For the same tag in different scopes case:

struct f { 
  int b;
  int c;
  int a[]  __attribute__ ((counted_by (b))) }  y0;

void test1(void) 
{   
struct f {
  int b;
  int c;
  int a[] __attribute__ ((counted_by (c))) } x;

  y0 = x;  /* will report incompatible type error here */
}

Is the above complete?
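
As a rough sketch of how a single, consistent definition like the
"good" case above might be used: the COUNTED_BY macro and the alloc_f
helper are hypothetical names introduced only for this example, and the
macro expands to nothing on compilers that do not report support for
the attribute.

```c
#include <stdlib.h>

/* Use counted_by only where the compiler reports support for it;
   COUNTED_BY is a hypothetical wrapper macro for this sketch.  */
#if defined (__has_attribute)
# if __has_attribute (counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct f
{
  int b;
  int c;
  int a[] COUNTED_BY (b);
};

/* Hypothetical helper: allocate a struct f with room for N elements
   and record N in the field named by the attribute before the array
   is touched.  Returns NULL on a negative count or failed allocation.  */
static struct f *
alloc_f (int n)
{
  if (n < 0)
    return NULL;

  struct f *p = malloc (sizeof (struct f) + (size_t) n * sizeof (int));
  if (p == NULL)
    return NULL;
  p->b = n;
  p->c = 0;
  return p;
}
```

The counter field is set right after allocation so that any
bounds information derived from counted_by matches the storage that
was actually reserved.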

> 
>> Is there a testing case for this feature in current GCC source tree I can 
>> take a look? (and
>> Then I can use it to construct the new testing case for the counted-by 
>> attribute).
> 
> See gcc.dg/c23-tag-*.c for many tests of different cases involving the tag 
> compatibility rules (and gcc.dg/gnu23-tag-* where GNU extensions are 
> involved).

Got it. Will take a look at them.

thanks.

Qing

> 
> -- 
> Joseph S. Myers
> josmy...@redhat.com
> 



Re: [PATCH 2/2] btf: do not skip members of data type with type id BTF_VOID_TYPEID

2024-04-10 Thread David Faust
Hi Indu,

On 4/10/24 11:25, Indu Bhagat wrote:
> Testing the previous fix in gen_ctf_sou_type () reveals an issue in BTF
> generation, however: BTF emission was currently decrementing the vlen
> (indicating the number of members) to skip members of type CTF_K_UNKNOWN
> altogether, but still emitting the BTF for the corresponding member (in
> output_asm_btf_sou_fields ()).
> 
> One can see malformed BTF by executing the newly added CTF testcase
> (gcc.dg/debug/ctf/ctf-bitfields-5.c) with -gbtf instead or even existing
> btf-struct-2.c without this patch.
> 
> To fix the issue, it makes sense to rather _not_ skip members of data
> type of type id BTF_VOID_TYPEID.

Thank you for catching and fixing this.

FWIW, what to do in such cases for a struct with a member that has no
representation is undefined behavior in BTF.  I certainly agree it's
better not to emit something malformed, and using 'void' is a good
choice.  Better to know there was a member there that could not be
represented than to skip it altogether, and the total struct size
shall still be correct.

OK.
Thanks!
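
For readers less familiar with the encoding that the updated vlen
checks rely on, here is a small sketch of how the member count sits in
the BTF type "info" word.  The bit layout follows the Linux kernel's
BTF type header as I understand it (vlen in bits 0-15, kind in bits
24-28, kind_flag in bit 31, struct kind = 4); the function and macro
names are made up for this example and are not taken from btfout.cc.

```c
#include <stdint.h>

/* Kind value for a struct in BTF; the SKETCH_ prefix marks this as a
   local name for the example, not a GCC or kernel identifier.  */
#define SKETCH_BTF_KIND_STRUCT 4u

/* Pack kind, member count and kind_flag into a 32-bit info word:
     bits  0-15: vlen
     bits 24-28: kind
     bit     31: kind_flag  */
static uint32_t
btf_info_pack (uint32_t kind, uint32_t vlen, uint32_t kflag)
{
  return ((kflag & 1u) << 31) | ((kind & 0x1fu) << 24) | (vlen & 0xffffu);
}

/* Extract the member count.  */
static uint32_t
btf_info_vlen (uint32_t info)
{
  return info & 0xffffu;
}

/* Extract the kind.  */
static uint32_t
btf_info_kind (uint32_t info)
{
  return (info >> 24) & 0x1fu;
}

/* Extract the kind_flag bit.  */
static uint32_t
btf_info_kflag (uint32_t info)
{
  return info >> 31;
}
```

With this layout, keeping the unrepresentable member (typed as void)
simply means the vlen field counts it, so the count in the info word
and the number of member records emitted afterwards agree.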

> 
> gcc/ChangeLog:
>   * btfout.cc (btf_asm_type): Do not skip emitting members of
>   unknown type.
> 
> gcc/testsuite/ChangeLog:
>   * btf-bitfields-4.c: Update the vlen check.
>   * btf-struct-2.c: Check that member named 'f' with void data
>   type is emitted.
> ---
>  gcc/btfout.cc| 5 -
>  gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c | 6 +++---
>  gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c| 9 +
>  3 files changed, 8 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
> index 4a8ec4d1ff0..ab491f0297f 100644
> --- a/gcc/btfout.cc
> +++ b/gcc/btfout.cc
> @@ -820,11 +820,6 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
> /* Set kflag if this member is a representable bitfield.  */
> if (btf_dmd_representable_bitfield_p (ctfc, dmd))
>   btf_kflag = 1;
> -
> -   /* Struct members that refer to unsupported types or bitfield formats
> -  shall be skipped. These are marked during preprocessing.  */
> -   else if (!btf_emit_id_p (dmd->dmd_type))
> - btf_vlen -= 1;
>   }
>  }
>  
> diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c 
> b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c
> index c00c8b3d87f..d4a6ef6a1eb 100644
> --- a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c
> +++ b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c
> @@ -6,14 +6,14 @@
> In this test, we construct a structure such that the bitfield will have
> an offset so large as to be unrepresentable in BTF. We expect that the
> resulting BTF will describe the rest of the structure, ignoring the
> -   non-representable bitfield.  */
> +   non-representable bitfield by simply using void data type for the same.  
> */
>  
>  /* { dg-do compile } */
>  /* { dg-options "-O0 -gbtf -dA" } */
>  /* { dg-require-effective-target size32plus } */
>  
> -/* Struct with 3 members and no bitfield (kind_flag not set).  */
> -/* { dg-final { scan-assembler-times "\[\t \]0x403\[\t 
> \]+\[^\n\]*btt_info" 1 } } */
> +/* Struct with 4 members and no bitfield (kind_flag not set).  */
> +/* { dg-final { scan-assembler-times "\[\t \]0x404\[\t 
> \]+\[^\n\]*btt_info" 1 } } */
>  
>  struct bigly
>  {
> diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c 
> b/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c
> index e9ff06883db..fa7231be75c 100644
> --- a/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c
> +++ b/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c
> @@ -2,14 +2,15 @@
> unsupported type.
>  
> BTF does not support vector types (among other things). When
> -   generating BTF for a struct (or union) type, members which refer to
> -   unsupported types should be skipped.  */
> +   generating BTF for a struct (or union) type.  Members which refer to
> +   unsupported types should not be skipped, however.  */
>  
>  /* { dg-do compile } */
>  /* { dg-options "-O0 -gbtf -dA" } */
>  
> -/* Expect a struct with only 2 members - 'f' should not be present.  */
> -/* { dg-final { scan-assembler-times "\[\t \]0x402\[\t 
> \]+\[^\n\]*btt_info" 1 } } */
> +/* Expect a struct with 3 members - 'f' is present but is of data type void. 
>  */
> +/* { dg-final { scan-assembler-times "\[\t \]0x403\[\t 
> \]+\[^\n\]*btt_info" 1 } } */
> +/* { dg-final { scan-assembler-times " MEMBER 'f' 
> idx=1\[\\r\\n\]+\[^\\r\\n\]*0\[\t \]+\[^\n\]*btm_type: void" 1 } } */
>  
>  struct with_float
>  {


Re: [PATCH 1/2] ctf: fix PR debug/112878

2024-04-10 Thread David Faust
On 4/10/24 11:25, Indu Bhagat wrote:
> PR debug/112878: ICE: in ctf_add_slice, at ctfc.cc:499 with _BitInt > 255 in 
> a struct and -gctf1
> 
> The CTF generation in GCC does not have a mechanism to roll-back an
> already added type.  In this testcase presented in the PR, we hit a
> representation limit in CTF slices (for a member of a struct) and ICE,
> after the type for struct (CTF_K_STRUCT) has already been added to the
> container.
> 
> To exit gracefully instead, we now check for both the offset and size of
> the bitfield to be explicitly <= 255.  If the check fails, we emit the
> member with type CTF_K_UNKNOWN.  Note that, the value 255 stems from the
> existing binutils libctf checks which were motivated to guard against
> malformed inputs.
> 
> Although it is not accurate to say that this is a CTF representation
> limit, mark the code with TBD_CTF_REPRESENTATION_LIMIT for now so that
> this can be taken care of with the next format version bump, when
> libctf's checks for the slice data can be lifted as well.

OK.
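
A stand-alone sketch of the check described above, for reference.  The
limit of 255 comes from the quoted description; slice_fits_p and
SKETCH_SLICE_MAX are hypothetical names for this example and do not
appear in dwarf2ctf.cc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limit quoted in the patch description for both the slice offset
   and the slice size.  */
#define SKETCH_SLICE_MAX 255u

/* Return true if a bitfield with the given offset (relative to its
   storage unit) and size can be described by a slice; otherwise the
   caller would fall back to an "unknown" type for the member.  */
static bool
slice_fits_p (uint32_t bit_offset, uint32_t bit_size)
{
  return bit_size <= SKETCH_SLICE_MAX && bit_offset <= SKETCH_SLICE_MAX;
}
```

For the 280-bit bitfield in the new testcase the size alone exceeds
the limit, so the member would take the fallback path.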

> 
> gcc/ChangeLog:
>   PR debug/112878
>   * dwarf2ctf.cc (gen_ctf_sou_type): Check for conditions before
>   call to ctf_add_slice.  Use CTF_K_UNKNOWN type if fail.
> 
> gcc/testsuite/ChangeLog:
>   PR debug/112878
>   * gcc.dg/debug/ctf/ctf-bitfields-5.c: New test.
> ---
>  gcc/dwarf2ctf.cc| 15 ++-
>  .../gcc.dg/debug/ctf/ctf-bitfields-5.c  | 17 +
>  2 files changed, 27 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c
> 
> diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
> index 77d6bf89689..dc59569fe56 100644
> --- a/gcc/dwarf2ctf.cc
> +++ b/gcc/dwarf2ctf.cc
> @@ -606,11 +606,16 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref 
> sou, uint32_t kind)
> if (attr)
>   bitpos += AT_unsigned (attr);
>  
> -   field_type_id = ctf_add_slice (ctfc, CTF_ADD_NONROOT,
> -  field_type_id,
> -  bitpos - field_location,
> -  bitsize,
> -  c);
> +   /* This is not precisely a TBD_CTF_REPRESENTATION_LIMIT, but
> +  surely something to look at for the next format version bump
> +  for CTF.  */
> +   if (bitsize <= 255 && (bitpos - field_location) <= 255)
> + field_type_id = ctf_add_slice (ctfc, CTF_ADD_NONROOT,
> +field_type_id,
> +bitpos - field_location,
> +bitsize, c);
> +   else
> + field_type_id = gen_ctf_unknown_type (ctfc);
>   }
>  
> /* Add the field type to the struct or union type.  */
> diff --git a/gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c 
> b/gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c
> new file mode 100644
> index 000..fee8228647c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c
> @@ -0,0 +1,17 @@
> +/* Bitfield where the bit offset is > 255 is not allowed in CTF.
> +
> +   PR debug/112878.
> +   This testcase is to ensure graceful handling. No slices are expected.  */
> +
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O0 -gctf -dA" } */
> +
> +/* No slices are expected, but a struct with one member is expected.
> +   CTF_K_UNKNOWN is also expected.  */
> +/* { dg-final { scan-assembler-times "cts_type" 0 } } */
> +/* { dg-final { scan-assembler-times "\[\t \]0x1a01\[\t 
> \]+\[^\n\]*ctt_info" 1 } } */
> +/* { dg-final { scan-assembler-times "ascii \"unknown.0\"\[\t 
> \]+\[^\n\]*ctf_string" 1 } } */
> +
> +struct {
> +  _BitInt(282) a : 280;
> +} b;


Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Martin Uecker
Am Mittwoch, dem 10.04.2024 um 20:25 +0200 schrieb Martin Uecker:
> Am Mittwoch, dem 10.04.2024 um 17:35 + schrieb Joseph Myers:
> > On Fri, 29 Mar 2024, Qing Zhao wrote:
> > 
> > > +  /* Issue error when there is a counted_by attribute with a different
> > > + field as the argument for the same flexible array member field.  */
> > 
> > There's another case of this to consider, though I'm not sure where best 
> > to check for it (Martin might have suggestions) - of course this case will 
> > need testcases as well.
> > 
> > Suppose, as allowed in C23, a structure is defined twice in the same 
> > scope, but the two definitions of the structure use inconsistent 
> > counted_by attributes.  I'd say that, when the declarations are in the 
> > same scope (thus required to be consistent), it should be an error for the 
> > two definitions of what is meant to be the same structure to use 
> > incompatible counted_by attributes (even though the member declarations 
> > are otherwise the same).
> 
> I think the right place could be comp_types_attributes in
> attributes.cc.  It may be sufficient to set the
> affects_type_identify flag.
> 
> This should then give a redefinition error as it should do for
> "packed".

Thinking about this a bit more, this will not work here, because
the counted_by attribute is not applied to the struct type but
one of the members.

So probably there should be a check added directly
to tagged_types_tu_compatible_p

Martin

> 
> > 
> > In C23 structures defined with the same tag in different scopes are 
> > compatible given requirements including compatible types for corresponding 
> > elements.  It would seem most appropriate to me for such structures with 
> > incompatible counted_by attributes to be considered *not* compatible types 
> > (but it would be valid to define structures with the same tag, different 
> > scopes, and elements the same except for counted_by - just not to use them 
> > in any way requiring them to be compatible).
> 
> Another option might be to warn about the case when those types
> are then used together in a way where they are required to
> be compatible.  Then comp_types_attributes would have to return 2.
> 
> 
> Martin
> 
> > 
> > > +The @code{counted_by} attribute may be attached to the C99 flexible array
> > > +member of a structure.  It indicates that the number of the elements of 
> > > the
> > > +array is given by the field "@var{count}" in the same structure as the
> > 
> > As noted previously, the "" quotes should be removed there (or replaced by 
> > ``'' quotes).
> > 
> 



Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Andrew Carlotti
On Wed, Apr 10, 2024 at 07:51:44PM +0100, Richard Sandiford wrote:
> Andrew Carlotti  writes:
> > On Wed, Apr 10, 2024 at 05:42:05PM +0100, Richard Sandiford wrote:
> >> Andrew Carlotti  writes:
> >> > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote:
> >> >> Andrew Carlotti  writes:
> >> >> > The first three patches are trivial changes to the feature list to 
> >> >> > reflect
> >> >> > recent changes in the ACLE.  Patch 4 removes most of the FMV 
> >> >> > multiversioning
> >> >> > features that don't work at the moment, and should be entirely 
> >> >> > uncontroversial.
> >> >> >
> >> >> > Patch 5 handles the remaining cases, where there's an inconsistency 
> >> >> > in how
> >> >> > features are named in the current FMV specification compared to the 
> >> >> > existing
> >> >> > command line options.  It might be better to instead preserve the 
> >> >> > "memtag2",
> >> >> > "ssbs2" and "ls64_accdata" names for now; I'd be happy to commit 
> >> >> > either
> >> >> > version.
> >> >> 
> >> >> Yeah, I suppose patch 5 leaves things in a somewhat awkward state,
> >> >> since e.g.:
> >> >> 
> >> >> -AARCH64_OPT_FMV_EXTENSION("memtag", MEMTAG, (), (), (), "")
> >> >> +AARCH64_OPT_EXTENSION("memtag", MEMTAG, (), (), (), "")
> >> >>  
> >> >> -AARCH64_FMV_FEATURE("memtag2", MEMTAG2, (MEMTAG))
> >> >> +AARCH64_FMV_FEATURE("memtag", MEMTAG2, (MEMTAG))
> >> >> 
> >> >> seems to drop "memtag2" and FEAT_MEMTAG, but keep "memtag" and
> >> >> FEAT_MEMTAG2.  Is that right?
> >> >
> >> > That's deliberate. The FEAT_MEMTAG bit in __aarch64_cpu_features is 
> >> > defined to
> >> > match the definition of FEAT_MTE in the architecture, and likewise for
> >> > FEAT_MEMTAG2/FEAT_MTE2.  However, in Binutils the "+memtag" extension 
> >> > enables
> >> > both FEAT_MTE and FEAT_MTE2 instructions (although none of the FEAT_MTE2
> >> > instructions can be generated from GCC without inline assembly).  The FMV
> >> > specification in the ACLE currently uses names "memtag" and "memtag2" 
> >> > that
> >> > match the architecture names, but arguably don't match the command line
> >> > extension names.  I'm advocating for that to change to match the 
> >> > extension
> >> > names in command line options.
> >> 
> >> Hmm, ok.  I agree it makes sense for the user-visible FMV names to match
> >> the command line.  But shouldn't __aarch64_cpu_features either (a) use 
> >> exactly
> >> the same names as the architecture or (b) use exactly the same names as the
> >> command-line (mangled where necessary)?  It seems that we're instead
> >> using a third convention that doesn't exactly match the other two.
> >
> > I agree that the name isn't one I would choose now, but I don't think it 
> > matters much that it's inconsistent.
> 
> I kind-of think it does though.  Given...
> 
> >> That is, I can see the rationale for "memtag" => FEAT_MTE2 and
> >> "memtag" => FEAT_MEMTAG.  It just seems odd to have "memtag" => 
> >> FEAT_MEMTAG2
> >> (where MEMTAG2 is an alias of MTE2).
> >> 
> >> How much leeway do we have to change the __aarch64_cpu_features names?
> >> Is it supposed to be a public API (as opposed to ABI)?
> >
> > I think we're designing it to be capable of being a public API, but we 
> > haven't
> > yet made it one.  That's partly why I've kept the enum value names the same 
> > as
> > in LLVM so far.
> 
> ...this, I don't want to sleep-walk into a situation where we have
> one naming convention for the architecture, one for the attributes,
> and a third one for the API.  If we're not in a position to commit
> to a consistent naming scheme for the API by GCC 14 then it might be
> better to remove the FMV features in 5/5 for GCC 14 and revisit in GCC 15.
> 
> A patch to do that is pre-approved if you agree (but please say
> if you don't).

I'm happy to remove those features for GCC 14 (pending agreement on the
attribute names in particular), but I don't think that does anything to solve
the enum names issue.  I'll remove the names from my FMV documentation patch as
well.

> Thanks,
> Richard


Re: [PATCH] c++/modules: local class merging [PR99426]

2024-04-10 Thread Patrick Palka
On Tue, 9 Apr 2024, Jason Merrill wrote:

> On 3/5/24 10:31, Patrick Palka wrote:
> > On Tue, 27 Feb 2024, Patrick Palka wrote:
> > 
> > Subject: [PATCH] c++/modules: local type merging [PR99426]
> > 
> > One known missing piece in the modules implementation is merging of a
> > streamed-in local type (class or enum) with the corresponding in-TU
> > version of the local type.  This missing piece turns out to cause a
> > hard-to-reduce use-after-free GC issue due to the entity_ary not being
> > marked as a GC root (deliberately), and manifests as a serialization
> > error on stream-in as in PR99426 (see comment #6 for a reduction).  It's
> > also reproducible on trunk when running the xtreme-header tests without
> > -fno-module-lazy.
> > 
> > This patch makes us merge such local types according to their position
> > within the containing function's definition, analogous to how we merge
> > FIELD_DECLs of a class according to their index in the TYPE_FIELDS
> > list.
> > 
> > PR c++/99426
> > 
> > gcc/cp/ChangeLog:
> > 
> > * module.cc (merge_kind::MK_local_type): New enumerator.
> > (merge_kind_name): Update.
> > (trees_out::chained_decls): Move BLOCK-specific handling
> > of DECL_LOCAL_DECL_P decls to ...
> > (trees_out::core_vals) : ... here.  Stream
> > BLOCK_VARS manually.
> > (trees_in::core_vals) : Stream BLOCK_VARS
> > manually.  Handle deduplicated local types.
> > (trees_out::key_local_type): Define.
> > (trees_in::key_local_type): Define.
> > (trees_out::get_merge_kind) : Return
> > MK_local_type for a local type.
> > (trees_out::key_mergeable) : Use
> > key_local_type.
> > (trees_in::key_mergeable) : Likewise.
> > (trees_in::is_matching_decl): Be flexible with type mismatches
> > for local entities.
> > 
> > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > index 80b63a70a62..d9e34e9a4b9 100644
> > --- a/gcc/cp/module.cc
> > +++ b/gcc/cp/module.cc
> > @@ -6714,7 +6720,37 @@ trees_in::core_vals (tree t)
> >   case BLOCK:
> > t->block.locus = state->read_location (*this);
> > t->block.end_locus = state->read_location (*this);
> > -  t->block.vars = chained_decls ();
> > +
> > +  for (tree *chain = &t->block.vars;;)
> > +   if (tree decl = tree_node ())
> > + {
> > +   /* For a deduplicated local type or enumerator, chain the
> > +  duplicate decl instead of the canonical in-TU decl.  Seeing
> > +  a duplicate here means the containing function whose body
> > +  we're streaming in is a duplicate too, so we'll end up
> > +  discarding this BLOCK (and the rest of the duplicate function
> > +  body) anyway.  */
> > +   if (is_duplicate (decl))
> > + decl = maybe_duplicate (decl);
> > +   else if (DECL_IMPLICIT_TYPEDEF_P (decl)
> > +&& TYPE_TEMPLATE_INFO (TREE_TYPE (decl)))
> > + {
> > +   tree tmpl = TYPE_TI_TEMPLATE (TREE_TYPE (decl));
> > +   if (DECL_TEMPLATE_RESULT (tmpl) == decl && is_duplicate
> > (tmpl))
> > + decl = DECL_TEMPLATE_RESULT (maybe_duplicate (tmpl));
> > + }
> 
> This seems like a lot of generally-applicable code for finding the duplicate,
> which other calls to maybe_duplicate/odr_duplicate don't use.  If the template
> is a duplicate, why isn't its result?  If there's a good reason for that,
> should this template handling go into maybe_duplicate?

Ah yeah, that makes sense.

Some context: IIUC modules treats the TEMPLATE_DECL instead of the
DECL_TEMPLATE_RESULT as the canonical decl, which in turn means we'll
register_duplicate only the TEMPLATE_DECL.  But BLOCK_VARS never contains
a TEMPLATE_DECL, always the DECL_TEMPLATE_RESULT (i.e. a TYPE_DECL),
hence the extra handling.

Given that it's relatively more difficult to get at the TEMPLATE_DECL
from the DECL_TEMPLATE_RESULT rather than vice versa, maybe we should
just register both as duplicates from register_duplicate?  That way
callers can just simply pass the DECL_TEMPLATE_RESULT to maybe_duplicate
and it'll do the right thing.

> 
> > @@ -10337,6 +10373,83 @@ trees_in::fn_parms_fini (int tag, tree fn, tree
> > existing, bool is_defn)
> >   }
> >   }
> >   +/* Encode into KEY the position of the local type (class or enum)
> > +   declaration DECL within FN.  The position is encoded as the
> > +   index of the innermost BLOCK (numbered in BFS order) along with
> > +   the index within its BLOCK_VARS list.  */
> 
> Since we already set DECL_DISCRIMINATOR for mangling, could we use it+name for
> the key as well?

We could (and IIUC that'd be more robust to ODR violations), but
wouldn't it mean we'd have to do a linear walk over all BLOCK_VARs of
all BLOCKS in order to find the one with the matching
name+discriminator?  That'd be slower than the current approach which
lets us skip to the correct BLOCK and walk only its BLOCK_VARS.
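
To make the comparison concrete, here is a rough sketch of the
"BFS index of the innermost block" half of the key.  The struct block
type, the fixed-size queue and both function names are hypothetical
simplifications for this example; they are not the module.cc code or
GCC's BLOCK representation.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tree of lexical blocks.  */
struct block
{
  struct block *subblocks;  /* First nested block, or NULL.  */
  struct block *chain;      /* Next sibling block, or NULL.  */
};

/* Capacity of the traversal queue in this sketch.  */
#define SKETCH_MAX_BLOCKS 64

/* Return the breadth-first index of TARGET in the tree rooted at ROOT,
   or -1 if TARGET is not found or the tree exceeds the queue.  */
static int
block_bfs_index (struct block *root, const struct block *target)
{
  struct block *queue[SKETCH_MAX_BLOCKS];
  int head = 0, tail = 0;

  if (root == NULL)
    return -1;
  queue[tail++] = root;
  while (head < tail)
    {
      struct block *b = queue[head];
      if (b == target)
        return head;
      head++;
      for (struct block *sub = b->subblocks; sub != NULL; sub = sub->chain)
        {
          if (tail == SKETCH_MAX_BLOCKS)
            return -1;
          queue[tail++] = sub;
        }
    }
  return -1;
}

/* Inverse lookup: return the block whose breadth-first index is IDX,
   or NULL if there is no such block.  */
static struct block *
block_at_bfs_index (struct block *root, int idx)
{
  struct block *queue[SKETCH_MAX_BLOCKS];
  int head = 0, tail = 0;

  if (root == NULL || idx < 0)
    return NULL;
  queue[tail++] = root;
  while (head < tail)
    {
      struct block *b = queue[head];
      if (head == idx)
        return b;
      head++;
      for (struct block *sub = b->subblocks; sub != NULL; sub = sub->chain)
        {
          if (tail == SKETCH_MAX_BLOCKS)
            return NULL;
          queue[tail++] = sub;
        }
    }
  return NULL;
}
```

On the reading side the stored index leads to one block, and only that
block's variable list needs to be walked to reach the recorded
position, which is the saving over searching every block for a
matching name and discriminator.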

Here's a tested patch that implements the register_duplicate idea to

Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Richard Sandiford
Andrew Carlotti  writes:
> On Wed, Apr 10, 2024 at 05:42:05PM +0100, Richard Sandiford wrote:
>> Andrew Carlotti  writes:
>> > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote:
>> >> Andrew Carlotti  writes:
>> >> > The first three patches are trivial changes to the feature list to 
>> >> > reflect
>> >> > recent changes in the ACLE.  Patch 4 removes most of the FMV 
>> >> > multiversioning
>> >> > features that don't work at the moment, and should be entirely 
>> >> > uncontroversial.
>> >> >
>> >> > Patch 5 handles the remaining cases, where there's an inconsistency in 
>> >> > how
>> >> > features are named in the current FMV specification compared to the 
>> >> > existing
>> >> > command line options.  It might be better to instead preserve the 
>> >> > "memtag2",
>> >> > "ssbs2" and "ls64_accdata" names for now; I'd be happy to commit either
>> >> > version.
>> >> 
>> >> Yeah, I suppose patch 5 leaves things in a somewhat awkward state,
>> >> since e.g.:
>> >> 
>> >> -AARCH64_OPT_FMV_EXTENSION("memtag", MEMTAG, (), (), (), "")
>> >> +AARCH64_OPT_EXTENSION("memtag", MEMTAG, (), (), (), "")
>> >>  
>> >> -AARCH64_FMV_FEATURE("memtag2", MEMTAG2, (MEMTAG))
>> >> +AARCH64_FMV_FEATURE("memtag", MEMTAG2, (MEMTAG))
>> >> 
>> >> seems to drop "memtag2" and FEAT_MEMTAG, but keep "memtag" and
>> >> FEAT_MEMTAG2.  Is that right?
>> >
>> > That's deliberate. The FEAT_MEMTAG bit in __aarch64_cpu_features is 
>> > defined to
>> > match the definition of FEAT_MTE in the architecture, and likewise for
>> > FEAT_MEMTAG2/FEAT_MTE2.  However, in Binutils the "+memtag" extension 
>> > enables
>> > both FEAT_MTE and FEAT_MTE2 instructions (although none of the FEAT_MTE2
>> > instructions can be generated from GCC without inline assembly).  The FMV
>> > specification in the ACLE currently uses names "memtag" and "memtag2" that
>> > match the architecture names, but arguably don't match the command line
>> > extension names.  I'm advocating for that to change to match the extension
>> > names in command line options.
>> 
>> Hmm, ok.  I agree it makes sense for the user-visible FMV names to match
>> the command line.  But shouldn't __aarch64_cpu_features either (a) use 
>> exactly
>> the same names as the architecture or (b) use exactly the same names as the
>> command-line (mangled where necessary)?  It seems that we're instead
>> using a third convention that doesn't exactly match the other two.
>
> I agree that the name isn't one I would choose now, but I don't think it 
> matters much that it's inconsistent.

I kind-of think it does though.  Given...

>> That is, I can see the rationale for "memtag" => FEAT_MTE2 and
>> "memtag" => FEAT_MEMTAG.  It just seems odd to have "memtag" => FEAT_MEMTAG2
>> (where MEMTAG2 is an alias of MTE2).
>> 
>> How much leeway do we have to change the __aarch64_cpu_features names?
>> Is it supposed to be a public API (as opposed to ABI)?
>
> I think we're designing it to be capable of being a public API, but we haven't
> yet made it one.  That's partly why I've kept the enum value names the same as
> in LLVM so far.

...this, I don't want to sleep-walk into a situation where we have
one naming convention for the architecture, one for the attributes,
and a third one for the API.  If we're not in a position to commit
to a consistent naming scheme for the API by GCC 14 then it might be
better to remove the FMV features in 5/5 for GCC 14 and revisit in GCC 15.

A patch to do that is pre-approved if you agree (but please say
if you don't).

Thanks,
Richard


Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Joseph Myers
On Wed, 10 Apr 2024, Qing Zhao wrote:

> A stupid question first, the same scope means the same file? (Or same 
> function)

struct X { int a; };
struct X { int a; };

is an example of the same scope (file scope, in this case).  The 
structures must have the same contents (in an appropriate sense) and are 
then considered the same type.

struct X { int a; };
void f() { struct X { int a; }; }

is not the same scope - but C23 makes the types compatible (not the same).  
It's OK to have incompatible types with the same tag in different scopes 
as well

struct X { int a; };
void f() { struct X { long b; }; }

but if you use them in a way requiring compatibility, then the contents 
must be compatible

struct X { int a; } v;
void f() { struct X { int a; } *p = &v; }
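
As a rough illustration of the different-scope case, here is a minimal sketch that is valid in every C dialect because it never needs the two types to be compatible.  The function names are made up for the example and are not taken from the GCC testsuite.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope definition of the tag.  */
struct X { int a; };

/* Size of the file-scope struct X.  */
size_t
outer_size (void)
{
  return sizeof (struct X);
}

/* Inside this function the tag X names a new, unrelated type that hides
   the file-scope one.  Nothing here requires the two types to be
   compatible, so differing contents are fine.  */
size_t
inner_size (void)
{
  struct X { long b; char c; };
  assert (sizeof (struct X) > sizeof (long));
  return sizeof (struct X);
}
```

The two functions report different sizes because each one sees a different type behind the same tag.  Only a use that mixes the two, such as the pointer initialisation from the file-scope object, brings the compatibility rules into play.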

> Is there a testing case for this feature in current GCC source tree I can 
> take a look? (and
> Then I can use it to construct the new testing case for the counted-by 
> attribute).

See gcc.dg/c23-tag-*.c for many tests of different cases involving the tag 
compatibility rules (and gcc.dg/gnu23-tag-* where GNU extensions are 
involved).

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-04-10 Thread Richard Sandiford
Evgeny Karpov  writes:
> Hello,
>
> v2 is ready for the review!
> Based on the v1 review: 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/thread.html#646203
>
> Testing for the x86_64-w64-mingw32 target is in progress to avoid
> regression due to refactoring.

Thanks for the updates and sorry again for the slow review.
I've replied to some of the patches in the series but otherwise
it looks good to me.

If you agree with the suggested changes then the series is OK for
stage 1, assuming no objections from those with an interest in the
x86 cygwin/mingw port.

Richard

> Regards,
> Evgeny
>
>
> Changes from v1 to v2:
> Adjust the target name to aarch64-*-mingw* to exclude the big-endian
> target from support.
> Exclude 64-bit ISA.
> Rename enum calling_abi to aarch64_calling_abi.
> Move AArch64 MS ABI definitions FIXED_REGISTERS,
> CALL_REALLY_USED_REGISTERS, and STATIC_CHAIN_REGNUM from aarch64.h 
> to aarch64-abi-ms.h.
> Rename TARGET_ARM64_MS_ABI to TARGET_AARCH64_MS_ABI.
> Exclude TARGET_64BIT from the aarch64 target.
> Exclude HAVE_GAS_WEAK.
> Set HAVE_GAS_ALIGNED_COMM to 1 by default.
> Use a reference from "x86 Windows Options" to 
> "Cygwin and MinGW Options".
> Update commit descriptions to follow standard style.
> Rebase from 4th March 2024.


Re: [PATCH v8 5/5] Add the 6th argument to .ACCESS_WITH_SIZE

2024-04-10 Thread Joseph Myers
The C front-end changes in this patch are OK for GCC 15.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v8 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-04-10 Thread Joseph Myers
On Fri, 29 Mar 2024, Qing Zhao wrote:

> +/* For a SUBDATUM field of a structure or union DATUM, generate a REF to
> +   the object that represents its counted_by per the attribute counted_by
> +   attached to this field if it's a flexible array member field, otherwise
> +   return NULL_TREE.
> +   set COUNTED_BY_TYPE to the TYPE of the counted_by field.

Use an uppercase letter at the start of a sentence, "Set".

> +static tree
> +build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
> +{
> +  tree type = TREE_TYPE (datum);
> +  if (!(c_flexible_array_member_type_p (TREE_TYPE (subdatum))))
> +return NULL_TREE;

There are redundant parentheses here around the call to 
c_flexible_array_member_type_p.

The C front-end changes in this patch are OK for GCC 15 (after GCC 14 has 
branched, and once a version of patch 1 has also been approved) with those 
fixes.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v8 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-04-10 Thread Joseph Myers
The C front-end changes in this patch are OK for GCC 15.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-04-10 Thread Richard Sandiford
Evgeny Karpov  writes:
> From: Zac Walker 
> Date: Fri, 1 Mar 2024 02:17:39 +0100
> Subject: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW
>  Options"
>
> Rename "x86 Windows Options" to "Cygwin and MinGW Options".
> It will be used also for AArch64.
>
> gcc/ChangeLog:
>
>   * config/i386/mingw-w64.opt.urls: Rename options' name and
>   regenerate option URLs.
>   * config/lynx.opt.urls: Likewise.
>   * config/mingw/cygming.opt.urls: Likewise.
>   * config/mingw/mingw.opt.urls: Likewise.
>   * doc/invoke.texi: Likewise.
> ---
>  gcc/config/i386/mingw-w64.opt.urls |  2 +-
>  gcc/config/lynx.opt.urls   |  2 +-
>  gcc/config/mingw/cygming.opt.urls  | 18 +-
>  gcc/config/mingw/mingw.opt.urls|  2 +-
>  gcc/doc/invoke.texi| 12 ++--
>  5 files changed, 22 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/config/i386/mingw-w64.opt.urls 
> b/gcc/config/i386/mingw-w64.opt.urls
> index 6bb53ef29b2..5cceba1d1a1 100644
> --- a/gcc/config/i386/mingw-w64.opt.urls
> +++ b/gcc/config/i386/mingw-w64.opt.urls
> @@ -1,5 +1,5 @@
>  ; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/mingw-w64.opt 
> and generated HTML
>  
>  municode
> -UrlSuffix(gcc/x86-Windows-Options.html#index-municode)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-municode)
>  
> diff --git a/gcc/config/lynx.opt.urls b/gcc/config/lynx.opt.urls
> index 63e7b9c4b33..b547138f7ff 100644
> --- a/gcc/config/lynx.opt.urls
> +++ b/gcc/config/lynx.opt.urls
> @@ -1,5 +1,5 @@
>  ; Autogenerated by regenerate-opt-urls.py from gcc/config/lynx.opt and 
> generated HTML
>  
>  mthreads
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mthreads-1)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1)
>  
> diff --git a/gcc/config/mingw/cygming.opt.urls 
> b/gcc/config/mingw/cygming.opt.urls
> index 87799befe3c..c624e22e442 100644
> --- a/gcc/config/mingw/cygming.opt.urls
> +++ b/gcc/config/mingw/cygming.opt.urls
> @@ -1,30 +1,30 @@
>  ; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/cygming.opt 
> and generated HTML
>  
>  mconsole
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mconsole)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mconsole)
>  
>  mdll
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mdll)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mdll)
>  
>  mnop-fun-dllimport
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mnop-fun-dllimport)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mnop-fun-dllimport)
>  
>  ; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> +;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1'
>  ;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> -;   duplicate: 'gcc/x86-Windows-Options.html#index-mthreads-1'
>  
>  mwin32
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mwin32)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mwin32)
>  
>  mwindows
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mwindows)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mwindows)
>  
>  mpe-aligned-commons
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mpe-aligned-commons)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mpe-aligned-commons)
>  
>  fset-stack-executable
> -UrlSuffix(gcc/x86-Windows-Options.html#index-fno-set-stack-executable)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-fno-set-stack-executable)
>  
>  fwritable-relocated-rdata
> -UrlSuffix(gcc/x86-Windows-Options.html#index-fno-writable-relocated-rdata)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-fno-writable-relocated-rdata)
>  
> diff --git a/gcc/config/mingw/mingw.opt.urls b/gcc/config/mingw/mingw.opt.urls
> index 2cbbaadf310..f8ee5be6a53 100644
> --- a/gcc/config/mingw/mingw.opt.urls
> +++ b/gcc/config/mingw/mingw.opt.urls
> @@ -1,7 +1,7 @@
>  ; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/mingw.opt and 
> generated HTML
>  
>  mcrtdll=
> -UrlSuffix(gcc/x86-Windows-Options.html#index-mcrtdll)
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mcrtdll)
>  
>  ; skipping UrlSuffix for 'pthread' due to multiple URLs:
>  ;   duplicate: 'gcc/Link-Options.html#index-pthread-1'
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index bdf05be387d..e2e473e095f 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1493,6 +1493,8 @@ See RS/6000 and PowerPC Options.
>  -munroll-only-small-loops -mlam=@var{choice}}
>  
>  @emph{x86 Windows Options}
> +
> +@emph{Cygwin and MinGW Options}
>  @gccoptlist{-mconsole  -mcrtdll=@var{library}  -mdll
>  -mnop-fun-dllimport  -mthread
>  -municode  -mwin32  -mwindows  -fno-set-stack-executable}
> @@ -20976,6 +20978,7 @@ platform.
>  * C6X Options::
>  * CRIS Options::
>  * C-SKY Options::
> +* Cygwin and MinGW Options::
>  * Darwin Options::
>  * DEC Alpha Options::
>  * eBPF Options::
> @@ -36112,8 +36115,13 @@ positions 62:57 can be used for metadata.
>  
>  @node x86 Windows Options
>  @subsection 

Re: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-04-10 Thread Richard Sandiford
Evgeny Karpov  writes:
> From: Zac Walker 
> Date: Fri, 1 Mar 2024 10:49:28 +0100
> Subject: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for
>  AArch64
>
> Define Cygwin and MinGW environment such as types, SEH definitions,
> shared libraries, etc.
>
> gcc/ChangeLog:
>
>   * config.gcc: Add Cygwin and MinGW definitions.
>   * config/aarch64/aarch64-protos.h
>   (mingw_pe_maybe_record_exported_symbol): Declare functions
>   which are used in Cygwin and MinGW environment.
>   (mingw_pe_section_type_flags): Likewise.
>   (mingw_pe_unique_section): Likewise.
>   (mingw_pe_encode_section_info): Likewise.
>   * config/aarch64/cygming.h: New file.
> ---
>  gcc/config.gcc  |   4 +
>  gcc/config/aarch64/aarch64-protos.h |   5 +
>  gcc/config/aarch64/cygming.h| 175 
>  3 files changed, 184 insertions(+)
>  create mode 100644 gcc/config/aarch64/cygming.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 3aca257c322..4471599454b 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1267,7 +1267,11 @@ aarch64*-*-linux*)
>  aarch64-*-mingw*)
>   tm_file="${tm_file} aarch64/aarch64-abi-ms.h"
>   tm_file="${tm_file} aarch64/aarch64-coff.h"
> + tm_file="${tm_file} aarch64/cygming.h"
> + tm_file="${tm_file} mingw/mingw32.h"
> + tm_file="${tm_file} mingw/mingw-stdint.h"
>   tmake_file="${tmake_file} aarch64/t-aarch64"
> + target_gtfiles="$target_gtfiles \$(srcdir)/config/mingw/winnt.cc"
>   case ${enable_threads} in
> "" | yes | win32)
>   thread_file='win32'
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index bd719b992a5..759e1a0f9da 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -1110,6 +1110,11 @@ extern void aarch64_output_patchable_area (unsigned int, bool);
>  
>  extern void aarch64_adjust_reg_alloc_order ();
>  
> +extern void mingw_pe_maybe_record_exported_symbol (tree, const char *, int);
> +extern unsigned int mingw_pe_section_type_flags (tree, const char *, int);
> +extern void mingw_pe_unique_section (tree, int);
> +extern void mingw_pe_encode_section_info (tree, rtx, int);
> +
>  bool aarch64_optimize_mode_switching (aarch64_mode_entity);
>  void aarch64_restore_za (rtx);
>  
> diff --git a/gcc/config/aarch64/cygming.h b/gcc/config/aarch64/cygming.h
> new file mode 100644
> index 00000000000..2f239c42a89
> --- /dev/null
> +++ b/gcc/config/aarch64/cygming.h
> @@ -0,0 +1,175 @@
> +/* Operating system specific defines to be used when targeting GCC for
> +   hosting on Windows32, using a Unix style C library and tools.
> +   Copyright (C) 1995-2024 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_AARCH64_CYGMING_H
> +#define GCC_AARCH64_CYGMING_H
> +
> +#undef PREFERRED_DEBUGGING_TYPE
> +#define PREFERRED_DEBUGGING_TYPE DINFO_TYPE_NONE
> +
> +#define FASTCALL_PREFIX '@'
> +
> +#define print_reg(rtx, code, file)

How about:

#define print_reg(rtx, code, file) (gcc_unreachable ())

so that attempts to use this are a noisy runtime failure?

> +#define SYMBOL_FLAG_DLLIMPORT 0
> +#define SYMBOL_FLAG_DLLEXPORT 0
> +
> +#define SYMBOL_REF_DLLEXPORT_P(X) \
> + ((SYMBOL_REF_FLAGS (X) & SYMBOL_FLAG_DLLEXPORT) != 0)
> +
> +/* Disable SEH and declare the required SEH-related macros that are
> +still needed for compilation.  */
> +#undef TARGET_SEH
> +#define TARGET_SEH 0
> +
> +#define SSE_REGNO_P(N) 0
> +#define GENERAL_REGNO_P(N) 0
> +#define SEH_MAX_FRAME_SIZE 0

Similarly here, how about:

#define SSE_REGNO_P(N) (gcc_unreachable (), 0)
#define GENERAL_REGNO_P(N) (gcc_unreachable (), 0)
#define SEH_MAX_FRAME_SIZE (gcc_unreachable (), 0)

Thanks,
Richard
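
To make the difference between a silent stub and a noisy one concrete, here is a small standalone sketch.  It is not GCC code: demo_unreachable is a hypothetical stand-in for gcc_unreachable () that counts and reports hits instead of aborting, so the effect can be observed, and every DEMO_ name is invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for gcc_unreachable (): it records and reports
   the hit instead of aborting.  */
static int demo_unreachable_hits;

static void
demo_unreachable (const char *file, int line)
{
  ++demo_unreachable_hits;
  fprintf (stderr, "%s:%d: reached a stub that should be unused\n",
           file, line);
}

#define DEMO_UNREACHABLE() demo_unreachable (__FILE__, __LINE__)

/* Silent stub: callers get 0 and nothing flags the misuse.  */
#define DEMO_SILENT_REGNO_P(N) 0

/* Noisy stub: the comma operator runs the report first and then yields
   0, so the macro still works wherever an int expression is expected.  */
#define DEMO_NOISY_REGNO_P(N) (DEMO_UNREACHABLE (), 0)

/* Count how many registers in [0, NREGS) the chosen stub accepts.  */
int
demo_count_accepted (int nregs, int noisy)
{
  int accepted = 0;
  assert (nregs >= 0);
  for (int regno = 0; regno < nregs; regno++)
    accepted += noisy ? DEMO_NOISY_REGNO_P (regno)
                      : DEMO_SILENT_REGNO_P (regno);
  return accepted;
}
```

Both stubs return the same value, but only the noisy one leaves evidence that the macro was evaluated.  With the real gcc_unreachable () the first such evaluation would stop the compiler with an internal error.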


Re: Combine patch ping

2024-04-10 Thread Uros Bizjak
On Wed, Apr 10, 2024 at 7:56 PM Segher Boessenkool
 wrote:
>
> On Sun, Apr 07, 2024 at 08:31:38AM +0200, Uros Bizjak wrote:
> > If there are no further comments, I plan to commit the referred patch
> > to the mainline on Wednesday. The latest version can be considered an
> > obvious patch that solves certain oversight in the original
> > implementation.
>
> This is never okay.  You cannot commit a patch without approval, *ever*.
>
> That patch is also obvious -- obviously *wrong*, that is.  There are
> big assumptions everywhere in the compiler how a CC reg can be used.
> This violates that, as explained elsewhere.

Can you please elaborate what is wrong with this concrete patch. The
part that the patch touches has several wrong assumptions, and the
fixed "???" comment just emphasizes that. I don't see what is wrong
with:

(define_insn "@pushfl<mode>2"
  [(set (match_operand:W 0 "push_operand" "=<")
        (unspec:W [(match_operand 1 "flags_reg_operand")]
                  UNSPEC_PUSHFL))]
  "GET_MODE_CLASS (GET_MODE (operands[1])) == MODE_CC"
  "pushf{<imodesuffix>}"
  [(set_attr "type" "push")
   (set_attr "mode" "<MODE>")])

it is just a push of the flags reg to the stack. If the push can't be
described in this way, then it is the middle end at fault, we can't
just change modes at will.

Feel free to revert the patch, I will unassign myself from the PR.

Uros.


Re: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF

2024-04-10 Thread Richard Sandiford
Evgeny Karpov  writes:
> From: Zac Walker 
> Date: Fri, 1 Mar 2024 01:55:47 +0100
> Subject: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF
>
> Define ASM specific for COFF format on AArch64.
>
> gcc/ChangeLog:
>
>   * config.gcc: Add COFF format support definitions.
>   * config/aarch64/aarch64-coff.h: New file.
> ---
>  gcc/config.gcc|  1 +
>  gcc/config/aarch64/aarch64-coff.h | 91 +++
>  2 files changed, 92 insertions(+)
>  create mode 100644 gcc/config/aarch64/aarch64-coff.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b762393b64c..cb6661f44ef 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1266,6 +1266,7 @@ aarch64*-*-linux*)
>   ;;
>  aarch64-*-mingw*)
>   tm_file="${tm_file} aarch64/aarch64-abi-ms.h"
> + tm_file="${tm_file} aarch64/aarch64-coff.h"
>   tmake_file="${tmake_file} aarch64/t-aarch64"
>   case ${enable_threads} in
> "" | yes | win32)
> diff --git a/gcc/config/aarch64/aarch64-coff.h 
> b/gcc/config/aarch64/aarch64-coff.h
> new file mode 100644
> index 00000000000..79c5a43b970
> --- /dev/null
> +++ b/gcc/config/aarch64/aarch64-coff.h
> @@ -0,0 +1,91 @@
> +/* Machine description for AArch64 architecture.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but
> +   WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_AARCH64_COFF_H
> +#define GCC_AARCH64_COFF_H
> +
> +#include "aarch64.h"

Is this needed?  It looks like aarch64-coff.h comes after aarch64.h
in the include list, so I'd have expected the #include to be a no-op.

If you want to emphasise that this file must be included after aarch64.h
then perhaps:

#if !defined(GCC_AARCH64_H)
#error This file must be included after aarch64.h
#endif

would work.  But it should also be ok just to drop the include without
replacing it with anything.

> +
> +#ifndef LOCAL_LABEL_PREFIX
> +# define LOCAL_LABEL_PREFIX  ""
> +#endif
> +
> +/* Using long long breaks -ansi and -std=c90, so these will need to be
> +   made conditional for an LLP64 ABI.  */
> +#undef SIZE_TYPE
> +#define SIZE_TYPE"long long unsigned int"
> +
> +#undef PTRDIFF_TYPE
> +#define PTRDIFF_TYPE "long long int"
> +
> +#undef LONG_TYPE_SIZE
> +#define LONG_TYPE_SIZE 32
> +
> +#ifndef ASM_GENERATE_INTERNAL_LABEL
> +# define ASM_GENERATE_INTERNAL_LABEL(STRING, PREFIX, NUM)  \
> +  sprintf (STRING, "*%s%s%u", LOCAL_LABEL_PREFIX, PREFIX, (unsigned int)(NUM))
> +#endif
> +
> +#define ASM_OUTPUT_ALIGN(STREAM, POWER)  \
> +  fprintf (STREAM, "\t.align\t%d\n", (int)POWER)
> +
> +/* Output a common block.  */
> +#ifndef ASM_OUTPUT_COMMON
> +# define ASM_OUTPUT_COMMON(STREAM, NAME, SIZE, ROUNDED)  \
> +{\
> +  fprintf (STREAM, "\t.comm\t"); \
> +  assemble_name (STREAM, NAME);  \
> +  asm_fprintf (STREAM, ", %d, %d\n", \
> +  (int)(ROUNDED), (int)(SIZE));  \
> +}
> +#endif
> +
> +/* Output a local common block.  /bin/as can't do this, so hack a
> +   `.space' into the bss segment.  Note that this is *bad* practice,
> +   which is guaranteed NOT to work since it doesn't define STATIC
> +   COMMON space but merely STATIC BSS space.  */
> +#ifndef ASM_OUTPUT_ALIGNED_LOCAL
> +# define ASM_OUTPUT_ALIGNED_LOCAL(STREAM, NAME, SIZE, ALIGN)          \
> +{                                                                     \
> +  switch_to_section (bss_section);                                    \
> +  ASM_OUTPUT_ALIGN (STREAM, floor_log2 (ALIGN / BITS_PER_UNIT));      \
> +  ASM_OUTPUT_LABEL (STREAM, NAME);                                    \
> +  fprintf (STREAM, "\t.space\t%d\n", (int)(SIZE));                    \
> +}
> +#endif
> +
> +#define ASM_OUTPUT_SKIP(STREAM, NBYTES)  \
> +  fprintf (STREAM, "\t.space\t%d  // skip\n", (int) (NBYTES))
> +
> +#define ASM_OUTPUT_TYPE_DIRECTIVE(STREAM, NAME, TYPE)
> +#define ASM_DECLARE_FUNCTION_SIZE(FILE, FNAME, DECL)

Just curious: are these empty definitions the final intended
definitions, or are they just temporary?  Might be worth a comment
either way.

Thanks,
Richard

> +
> +#define TEXT_SECTION_ASM_OP  "\t.text"
> +#define DATA_SECTION_ASM_OP  "\t.data"
> +#define BSS_SECTION_ASM_OP   

[PATCH 1/2] ctf: fix PR debug/112878

2024-04-10 Thread Indu Bhagat
PR debug/112878: ICE: in ctf_add_slice, at ctfc.cc:499 with _BitInt > 255 in a 
struct and -gctf1

The CTF generation in GCC does not have a mechanism to roll back an
already added type.  In the testcase presented in the PR, we hit a
representation limit in CTF slices (for a member of a struct) and ICE
after the type for the struct (CTF_K_STRUCT) has already been added to
the container.

To exit gracefully instead, we now check that both the offset and the
size of the bitfield are explicitly <= 255.  If the check fails, we emit
the member with type CTF_K_UNKNOWN.  Note that the value 255 stems from
the existing binutils libctf checks, which were motivated by guarding
against malformed inputs.

Although it is not accurate to say that this is a CTF representation
limit, mark the code with TBD_CTF_REPRESENTATION_LIMIT for now so that
this can be taken care of with the next format version bump, when
libctf's checks for the slice data can be lifted as well.
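
The check described above can be sketched in isolation as follows.  This is only an illustration under stated assumptions: the DEMO_ names and the enum are hypothetical stand-ins for emitting a slice versus emitting CTF_K_UNKNOWN, and the only value taken from the patch is the limit of 255.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest bit offset and bit size accepted for a slice; 255 is the
   limit quoted in the commit message.  */
#define DEMO_SLICE_LIMIT 255u

/* Hypothetical stand-ins for "emit a slice" and "emit CTF_K_UNKNOWN".  */
enum demo_member_kind { DEMO_MEMBER_SLICE, DEMO_MEMBER_UNKNOWN };

static bool
demo_slice_representable_p (uint32_t bit_offset, uint32_t bit_size)
{
  return bit_offset <= DEMO_SLICE_LIMIT && bit_size <= DEMO_SLICE_LIMIT;
}

/* Choose how to describe a bitfield member that starts at BITPOS and
   whose storage unit starts at FIELD_LOCATION.  */
enum demo_member_kind
demo_classify_bitfield (uint32_t bitpos, uint32_t field_location,
                        uint32_t bitsize)
{
  assert (bitpos >= field_location);
  if (demo_slice_representable_p (bitpos - field_location, bitsize))
    return DEMO_MEMBER_SLICE;
  return DEMO_MEMBER_UNKNOWN;
}
```

The point of deciding before adding anything is that there is no roll-back: once the fallback is chosen up front, the enclosing struct can still be emitted with a well-formed member.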

gcc/ChangeLog:
PR debug/112878
* dwarf2ctf.cc (gen_ctf_sou_type): Check for conditions before
call to ctf_add_slice.  Use CTF_K_UNKNOWN type if fail.

gcc/testsuite/ChangeLog:
PR debug/112878
* gcc.dg/debug/ctf/ctf-bitfields-5.c: New test.
---
 gcc/dwarf2ctf.cc| 15 ++-
 .../gcc.dg/debug/ctf/ctf-bitfields-5.c  | 17 +
 2 files changed, 27 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c

diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
index 77d6bf89689..dc59569fe56 100644
--- a/gcc/dwarf2ctf.cc
+++ b/gcc/dwarf2ctf.cc
@@ -606,11 +606,16 @@ gen_ctf_sou_type (ctf_container_ref ctfc, dw_die_ref sou, 
uint32_t kind)
  if (attr)
bitpos += AT_unsigned (attr);
 
- field_type_id = ctf_add_slice (ctfc, CTF_ADD_NONROOT,
-field_type_id,
-bitpos - field_location,
-bitsize,
-c);
+ /* This is not precisely a TBD_CTF_REPRESENTATION_LIMIT, but
+surely something to look at for the next format version bump
+for CTF.  */
+ if (bitsize <= 255 && (bitpos - field_location) <= 255)
+   field_type_id = ctf_add_slice (ctfc, CTF_ADD_NONROOT,
+  field_type_id,
+  bitpos - field_location,
+  bitsize, c);
+ else
+   field_type_id = gen_ctf_unknown_type (ctfc);
}
 
  /* Add the field type to the struct or union type.  */
diff --git a/gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c 
b/gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c
new file mode 100644
index 00000000000..fee8228647c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c
@@ -0,0 +1,17 @@
+/* Bitfield where the bit offset is > 255 is not allowed in CTF.
+
+   PR debug/112878.
+   This testcase is to ensure graceful handling. No slices are expected.  */
+
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-O0 -gctf -dA" } */
+
+/* No slices are expected, but a struct with one member is expected.
+   CTF_K_UNKNOWN is also expected.  */
+/* { dg-final { scan-assembler-times "cts_type" 0 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1a000001\[\t \]+\[^\n\]*ctt_info" 1 } } */
+/* { dg-final { scan-assembler-times "ascii \"unknown.0\"\[\t \]+\[^\n\]*ctf_string" 1 } } */
+
+struct {
+  _BitInt(282) a : 280;
+} b;
-- 
2.43.0



[PATCH 0/2] Fix PR debug/112878 and a BTF issue

2024-04-10 Thread Indu Bhagat
Hi,

The patch series includes two patches: first one is a fix for PR
debug/112878 and the second one is for an existing BTF generation issue.

Testing Notes:
 - Regression tested on x86_64-linux-gnu
 - Tested btf.exp, ctf.exp, bpf.exp for --target=bpf-unknown-none

Thanks,
Indu Bhagat (2):
  ctf: fix PR debug/112878
  btf: do not skip members of data type with type id BTF_VOID_TYPEID

 gcc/btfout.cc   |  5 -
 gcc/dwarf2ctf.cc| 15 ++-
 .../gcc.dg/debug/btf/btf-bitfields-4.c  |  6 +++---
 gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c   |  9 +
 .../gcc.dg/debug/ctf/ctf-bitfields-5.c  | 17 +
 5 files changed, 35 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-5.c

-- 
2.43.0



[PATCH 2/2] btf: do not skip members of data type with type id BTF_VOID_TYPEID

2024-04-10 Thread Indu Bhagat
Testing the previous fix in gen_ctf_sou_type () revealed an issue in BTF
generation: BTF emission was decrementing the vlen (the number of
members) to skip members of type CTF_K_UNKNOWN altogether, but was still
emitting the BTF for the corresponding member (in
output_asm_btf_sou_fields ()).

One can see the malformed BTF by compiling the newly added CTF testcase
(gcc.dg/debug/ctf/ctf-bitfields-5.c) with -gbtf instead, or even the
existing btf-struct-2.c, without this patch.

To fix the issue, do _not_ skip members whose data type has type id
BTF_VOID_TYPEID.
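
For readers less familiar with the encoding, the sketch below shows how vlen sits in the info word of a BTF type record and why it has to match the number of member records that follow.  The bit layout (vlen in bits 0-15, kind in bits 24-28, kind_flag in bit 31) and the struct kind value 4 come from the BTF format; the demo_ function names are hypothetical and are not taken from btfout.cc.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BTF_KIND_STRUCT 4u   /* BTF_KIND_STRUCT in the BTF format.  */

/* Pack the info word of a BTF type record: vlen in bits 0-15, kind in
   bits 24-28 and kind_flag in bit 31.  */
uint32_t
demo_btf_info (uint32_t kind, uint32_t vlen, int kind_flag)
{
  assert (kind < 32 && vlen <= 0xffff);
  return ((uint32_t) (kind_flag != 0) << 31) | (kind << 24) | vlen;
}

/* A reader walks exactly vlen member records after the type record, so
   the header and the emitted member count must agree.  */
int
demo_sou_consistent_p (uint32_t info, uint32_t members_emitted)
{
  return (info & 0xffff) == members_emitted;
}
```

A struct whose header claims two members while three member records are emitted fails the consistency check, which is the kind of mismatch that arises when vlen is decremented but the member is still written out.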

gcc/ChangeLog:
* btfout.cc (btf_asm_type): Do not skip emitting members of
unknown type.

gcc/testsuite/ChangeLog:
* btf-bitfields-4.c: Update the vlen check.
* btf-struct-2.c: Check that member named 'f' with void data
type is emitted.
---
 gcc/btfout.cc| 5 -
 gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c | 6 +++---
 gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c| 9 +
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 4a8ec4d1ff0..ab491f0297f 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -820,11 +820,6 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
  /* Set kflag if this member is a representable bitfield.  */
  if (btf_dmd_representable_bitfield_p (ctfc, dmd))
btf_kflag = 1;
-
- /* Struct members that refer to unsupported types or bitfield formats
-shall be skipped. These are marked during preprocessing.  */
- else if (!btf_emit_id_p (dmd->dmd_type))
-   btf_vlen -= 1;
}
 }
 
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c
index c00c8b3d87f..d4a6ef6a1eb 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-4.c
@@ -6,14 +6,14 @@
In this test, we construct a structure such that the bitfield will have
an offset so large as to be unrepresentable in BTF. We expect that the
resulting BTF will describe the rest of the structure, ignoring the
-   non-representable bitfield.  */
+   non-representable bitfield by simply using void data type for the same.  */
 
 /* { dg-do compile } */
 /* { dg-options "-O0 -gbtf -dA" } */
 /* { dg-require-effective-target size32plus } */
 
-/* Struct with 3 members and no bitfield (kind_flag not set).  */
-/* { dg-final { scan-assembler-times "\[\t \]0x4000003\[\t \]+\[^\n\]*btt_info" 1 } } */
+/* Struct with 4 members and no bitfield (kind_flag not set).  */
+/* { dg-final { scan-assembler-times "\[\t \]0x4000004\[\t \]+\[^\n\]*btt_info" 1 } } */
 
 struct bigly
 {
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c
index e9ff06883db..fa7231be75c 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-struct-2.c
@@ -2,14 +2,15 @@
unsupported type.
 
BTF does not support vector types (among other things). When
-   generating BTF for a struct (or union) type, members which refer to
-   unsupported types should be skipped.  */
+   generating BTF for a struct (or union) type.  Members which refer to
+   unsupported types should not be skipped, however.  */
 
 /* { dg-do compile } */
 /* { dg-options "-O0 -gbtf -dA" } */
 
-/* Expect a struct with only 2 members - 'f' should not be present.  */
-/* { dg-final { scan-assembler-times "\[\t \]0x4000002\[\t \]+\[^\n\]*btt_info" 1 } } */
+/* Expect a struct with 3 members - 'f' is present but is of data type void.  */
+/* { dg-final { scan-assembler-times "\[\t \]0x4000003\[\t \]+\[^\n\]*btt_info" 1 } } */
+/* { dg-final { scan-assembler-times " MEMBER 'f' idx=1\[\\r\\n\]+\[^\\r\\n\]*0\[\t \]+\[^\n\]*btm_type: void" 1 } } */
 
 struct with_float
 {
-- 
2.43.0



Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Martin Uecker
On Wednesday, 10.04.2024 at 17:35 +0000, Joseph Myers wrote:
> On Fri, 29 Mar 2024, Qing Zhao wrote:
> 
> > +  /* Issue error when there is a counted_by attribute with a different
> > + field as the argument for the same flexible array member field.  */
> 
> There's another case of this to consider, though I'm not sure where best 
> to check for it (Martin might have suggestions) - of course this case will 
> need testcases as well.
> 
> Suppose, as allowed in C23, a structure is defined twice in the same 
> scope, but the two definitions of the structure use inconsistent 
> counted_by attributes.  I'd say that, when the declarations are in the 
> same scope (thus required to be consistent), it should be an error for the 
> two definitions of what is meant to be the same structure to use 
> incompatible counted_by attributes (even though the member declarations 
> are otherwise the same).

I think the right place could be comp_types_attributes in
attributes.cc.  It may be sufficient to set the
affects_type_identity flag.

This should then give a redefinition error as it should do for
"packed".

> 
> In C23 structures defined with the same tag in different scopes are 
> compatible given requirements including compatible types for corresponding 
> elements.  It would seem most appropriate to me for such structures with 
> incompatible counted_by attributes to be considered *not* compatible types 
> (but it would be valid to define structures with the same tag, different 
> scopes, and elements the same except for counted_by - just not to use them 
> in any way requiring them to be compatible).

Another option might be to warn about the case when those types
are then used together in a way where they are required to
be compatible.  Then comp_types_attributes would have to return 2.


Martin

> 
> > +The @code{counted_by} attribute may be attached to the C99 flexible array
> > +member of a structure.  It indicates that the number of the elements of the
> > +array is given by the field "@var{count}" in the same structure as the
> 
> As noted previously, the "" quotes should be removed there (or replaced by 
> ``'' quotes).
> 
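
For context, a minimal sketch of how the attribute described in the quoted documentation text might be used.  It assumes a compiler that may or may not know counted_by, so the attribute is hidden behind a hypothetical DEMO_COUNTED_BY macro that expands to nothing when unsupported; the struct and function names are likewise made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Use the attribute only where the compiler knows it; elsewhere the
   macro expands to nothing and the code still builds.  */
#if defined (__has_attribute)
# if __has_attribute (counted_by)
#  define DEMO_COUNTED_BY(MEMBER) __attribute__ ((counted_by (MEMBER)))
# endif
#endif
#ifndef DEMO_COUNTED_BY
# define DEMO_COUNTED_BY(MEMBER)
#endif

struct demo_packet
{
  size_t count;                        /* Number of elements in data.  */
  int data[] DEMO_COUNTED_BY (count);  /* Flexible array member.  */
};

/* Allocate a packet with room for N zeroed elements.  */
struct demo_packet *
demo_packet_new (size_t n)
{
  struct demo_packet *p = malloc (sizeof *p + n * sizeof (int));
  if (p == NULL)
    return NULL;
  p->count = n;   /* Set the count before touching data.  */
  for (size_t i = 0; i < n; i++)
    p->data[i] = 0;
  return p;
}

/* Sum the elements, trusting count as the array bound.  */
int
demo_packet_sum (const struct demo_packet *p)
{
  int sum = 0;
  assert (p != NULL);
  for (size_t i = 0; i < p->count; i++)
    sum += p->data[i];
  return sum;
}
```

The inconsistent-redefinition cases discussed above would arise if a second definition of the same struct tag named a different member in the attribute.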



Re: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-04-10 Thread Richard Sandiford
Evgeny Karpov  writes:
> From: Zac Walker 
> Date: Fri, 1 Mar 2024 09:56:59 +0100
> Subject: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for
>  MS ABI
>
> Define the MS ABI for aarch64-w64-mingw32.
> Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
> STATIC_CHAIN_REGNUM for AArch64 MS ABI.
> The X18 register is reserved on Windows for the TEB.
>
> gcc/ChangeLog:
>
>   * config.gcc: Define TARGET_AARCH64_MS_ABI when
>   AArch64 MS ABI is used.
>   * config/aarch64/aarch64-abi-ms.h: New file. Adjust
>   FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
>   STATIC_CHAIN_REGNUM for AArch64 MS ABI.
> ---
>  gcc/config.gcc  |  1 +
>  gcc/config/aarch64/aarch64-abi-ms.h | 64 +
>  2 files changed, 65 insertions(+)
>  create mode 100644 gcc/config/aarch64/aarch64-abi-ms.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 2756377e50b..b762393b64c 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1265,6 +1265,7 @@ aarch64*-*-linux*)
>   TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
>   ;;
>  aarch64-*-mingw*)
> + tm_file="${tm_file} aarch64/aarch64-abi-ms.h"
>   tmake_file="${tmake_file} aarch64/t-aarch64"
>   case ${enable_threads} in
> "" | yes | win32)
> diff --git a/gcc/config/aarch64/aarch64-abi-ms.h 
> b/gcc/config/aarch64/aarch64-abi-ms.h
> new file mode 100644
> index 00000000000..90b0dcc5edf
> --- /dev/null
> +++ b/gcc/config/aarch64/aarch64-abi-ms.h
> @@ -0,0 +1,64 @@
> +/* Machine description for AArch64 MS ABI.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_AARCH64_ABI_MS_H
> +#define GCC_AARCH64_ABI_MS_H
> +
> +/* X18 reserved for the TEB on Windows.  */
> +
> +#undef FIXED_REGISTERS
> +#define FIXED_REGISTERS  \
> +  {  \
> +0, 0, 0, 0,   0, 0, 0, 0,/* R0 - R7.  */ \
> +0, 0, 0, 0,   0, 0, 0, 0,/* R8 - R15.  */\
> +0, 0, 1, 0,   0, 0, 0, 0,/* R16 - R23.  */   \
> +0, 0, 0, 0,   0, 1, 0, 1,/* R24 - R30, SP.  */   \
> +0, 0, 0, 0,   0, 0, 0, 0,/* V0 - V7.  */ \
> +0, 0, 0, 0,   0, 0, 0, 0,   /* V8 - V15.  */ \
> +0, 0, 0, 0,   0, 0, 0, 0,   /* V16 - V23.  */\
> +0, 0, 0, 0,   0, 0, 0, 0,   /* V24 - V31.  */\
> +1, 1, 1, 1,  /* SFP, AP, CC, VG.  */ \
> +0, 0, 0, 0,   0, 0, 0, 0,/* P0 - P7.  */ \
> +0, 0, 0, 0,   0, 0, 0, 0,   /* P8 - P15.  */ \
> +1, 1,/* FFR and FFRT.  */\
> +1, 1, 1, 1, 1, 1, 1, 1   /* Fake registers.  */  \
> +  }
> +
> +#undef CALL_REALLY_USED_REGISTERS
> +#define CALL_REALLY_USED_REGISTERS   \
> +  {  \
> +1, 1, 1, 1,   1, 1, 1, 1,/* R0 - R7.  */ \
> +1, 1, 1, 1,   1, 1, 1, 1,/* R8 - R15.  */\
> +1, 1, 0, 0,   0, 0, 0, 0,   /* R16 - R23.  */\
> +0, 0, 0, 0,   0, 1, 1, 1,/* R24 - R30, SP.  */   \
> +1, 1, 1, 1,   1, 1, 1, 1,/* V0 - V7.  */ \
> +0, 0, 0, 0,   0, 0, 0, 0,/* V8 - V15.  */\
> +1, 1, 1, 1,   1, 1, 1, 1,   /* V16 - V23.  */\
> +1, 1, 1, 1,   1, 1, 1, 1,   /* V24 - V31.  */\
> +1, 1, 1, 0,  /* SFP, AP, CC, VG.  */ \
> +1, 1, 1, 1,   1, 1, 1, 1,/* P0 - P7.  */ \
> +1, 1, 1, 1,   1, 1, 1, 1,/* P8 - P15.  */\
> +1, 1,/* FFR and FFRT.  */\
> +0, 0, 0, 0, 0, 0, 0, 0   /* Fake registers.  */  \
> +  }
> +
> +#undef  STATIC_CHAIN_REGNUM
> +#define STATIC_CHAIN_REGNUM R17_REGNUM
> +
> +#endif /* GCC_AARCH64_ABI_MS_H.  */
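
For readers following along, here is a minimal sketch of what the two quoted tables say about x18. It copies only the general-register rows (R0 - R30, SP) from the patch above; the array names, the enum and the two helper functions are hypothetical and exist only for illustration, they are not GCC's own interfaces.

```c
#include <assert.h>

/* Illustrative copy of the R0 - R30, SP rows of the quoted
   FIXED_REGISTERS table.  The names below are hypothetical.  */
static const char toy_ms_fixed[32] = {
  0, 0, 0, 0,   0, 0, 0, 0,   /* R0 - R7.  */
  0, 0, 0, 0,   0, 0, 0, 0,   /* R8 - R15.  */
  0, 0, 1, 0,   0, 0, 0, 0,   /* R16 - R23; R18 is fixed.  */
  0, 0, 0, 0,   0, 1, 0, 1    /* R24 - R30, SP.  */
};

/* Same rows of the quoted CALL_REALLY_USED_REGISTERS table.  */
static const char toy_ms_call_used[32] = {
  1, 1, 1, 1,   1, 1, 1, 1,   /* R0 - R7.  */
  1, 1, 1, 1,   1, 1, 1, 1,   /* R8 - R15.  */
  1, 1, 0, 0,   0, 0, 0, 0,   /* R16 - R23; R18 is not call-used.  */
  0, 0, 0, 0,   0, 1, 1, 1    /* R24 - R30, SP.  */
};

enum { TOY_X17 = 17, TOY_X18 = 18 };

/* Nonzero if REGNO is marked fixed in the table above.  */
int
toy_ms_fixed_p (int regno)
{
  assert (regno >= 0 && regno < 32);
  return toy_ms_fixed[regno];
}

/* Nonzero if REGNO is marked call-used in the table above.  */
int
toy_ms_call_used_p (int regno)
{
  assert (regno >= 0 && regno < 32);
  return toy_ms_call_used[regno];
}
```

With these rows, index 18 is the only general register below R29 that is marked fixed, and it is also not marked call-used, while x17 (the register chosen for STATIC_CHAIN_REGNUM) stays allocatable.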

Gah, I think there was a miscommunication, sorry.  The way I'd interpreted
Richard's comment:

> +/* X18 reserved for the TEB on Windows.  */
> +#ifdef TARGET_ARM64_MS_ABI
> +# define FIXED_X18 1
> +# define CALL_USED_X18 0
> +#else
> +# define FIXED_X18 0
> +# define CALL_USED_X18 1
> +#endif
>
> I'm not overly keen on ifdefs like this (and the one below), it can
> get quite confusing if we have to support more than a couple of ABIs.
> Perhaps we could create a couple of new headers, 

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-10 Thread Ajit Agarwal
Hello Alex:

On 10/04/24 7:52 pm, Alex Coplan wrote:
> Hi Ajit,
> 
> On 10/04/2024 15:31, Ajit Agarwal wrote:
>> Hello Alex:
>>
>> On 10/04/24 1:42 pm, Alex Coplan wrote:
>>> Hi Ajit,
>>>
>>> On 09/04/2024 20:59, Ajit Agarwal wrote:
 Hello Alex:

 On 09/04/24 8:39 pm, Alex Coplan wrote:
> On 09/04/2024 20:01, Ajit Agarwal wrote:
>> Hello Alex:
>>
>> On 09/04/24 7:29 pm, Alex Coplan wrote:
>>> On 09/04/2024 17:30, Ajit Agarwal wrote:


 On 05/04/24 10:03 pm, Alex Coplan wrote:
> On 05/04/2024 13:53, Ajit Agarwal wrote:
>> Hello Alex/Richard:
>>
>> All review comments are incorporated.
> 
>> @@ -2890,8 +3018,8 @@ ldp_bb_info::merge_pairs (insn_list_t 
>> _list,
>>  // of accesses.  If we find two sets of adjacent accesses, call
>>  // merge_pairs.
>>  void
>> -ldp_bb_info::transform_for_base (int encoded_lfs,
>> - access_group )
>> +pair_fusion_bb_info::transform_for_base (int encoded_lfs,
>> + access_group )
>>  {
>>const auto lfs = decode_lfs (encoded_lfs);
>>const unsigned access_size = lfs.size;
>> @@ -2909,7 +3037,7 @@ ldp_bb_info::transform_for_base (int 
>> encoded_lfs,
>> access.cand_insns,
>> lfs.load_p,
>> access_size);
>> -  skip_next = access.cand_insns.empty ();
>> +  skip_next = bb_state->cand_insns_empty_p (access.cand_insns);
>
> As above, why is this needed?

 For rs6000 we always want to return true, because otherwise load/store
 pairs are merged at 8/16, 16/32 and 32/64 on rs6000, whereas we want
 load/store pairs only at 8/16 and 32/64.  That's why we want to always
 return true for rs6000, to skip pairs as above.
>>>
>>> Hmm, sorry, I'm not sure I follow.  Are you saying that for rs6000 you 
>>> have
>>> load/store pair instructions where the two arms of the access are 
>>> storing
>>> operands of different sizes?  Or something else?
>>>
>>> As it stands the logic is to skip the next iteration only if we
>>> exhausted all the candidate insns for the current access.  In the case
>>> that we didn't exhaust all such candidates, then the idea is that when
>>> access becomes prev_access, we can attempt to use those candidates as
>>> the "left-hand side" of a pair in the next iteration since we failed to
>>> use them as the "right-hand side" of a pair in the current iteration.
>>> I don't see why you wouldn't want that behaviour.  Please can you
>>> explain?
>>>
>>
>> In merge_pairs we get 2 load candidates: one load from offset 0 and
>> the other load from offset 16.  Then in the next iteration we get a load
>> from offset 16 and another load from offset 32.  In the next iteration
>> we get a load from offset 32 and another load from offset 48.
>>
>> For example:
>>
>> Currently we get the load candidates as follows.
>>
>> pairs:
>>
>> load from 0th offset.
>> load from 16th offset.
>>
>> next pairs:
>>
>> load from 16th offset.
>> load from 32nd offset.
>>
>> next pairs:
>>
>> load from 32nd offset
>> load from 48th offset.
>>
>> Instead in rs6000 we should get:
>>
>> pairs:
>>
>> load from 0th offset
>> load from 16th offset.
>>
>> next pairs:
>>
>> load from 32nd offset
>> load from 48th offset.
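
The two pairings listed above can be sketched with a toy model. This is only an illustration of the description in this thread, assuming a sorted array of offsets and a fixed access size; `toy_form_pairs`, `struct toy_pair` and the `disjoint` flag are hypothetical names, and the real pass tracks candidate insns rather than plain offsets.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pair { int first; int second; };

/* Pair up adjacent accesses in OFFSETS (sorted, N entries) that are
   ACCESS_SIZE apart.  If DISJOINT is zero, the right-hand side of one
   pair may be reused as the left-hand side of the next (sliding
   window).  If DISJOINT is nonzero, the iteration after a successful
   pair is skipped, so no offset appears in two pairs.
   Returns the number of pairs written to OUT.  */
size_t
toy_form_pairs (const int *offsets, size_t n, int access_size,
                int disjoint, struct toy_pair *out)
{
  size_t npairs = 0;
  int skip_next = 0;

  assert (offsets != NULL && out != NULL);
  for (size_t i = 1; i < n; i++)
    {
      if (skip_next)
        {
          /* The previous access was already consumed by a pair.  */
          skip_next = 0;
          continue;
        }
      if (offsets[i] - offsets[i - 1] == access_size)
        {
          out[npairs].first = offsets[i - 1];
          out[npairs].second = offsets[i];
          npairs++;
          skip_next = disjoint;
        }
    }
  return npairs;
}
```

For offsets 0, 16, 32, 48 and an access size of 16, the sliding variant yields (0,16), (16,32), (32,48), while the disjoint variant yields only (0,16) and (32,48), which is the behaviour asked for on rs6000 above.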
>
> Hmm, so then I guess my question is: why wouldn't you consider merging
> the pair with offsets (16,32) for rs6000?  Is it because you have a
> stricter alignment requirement on the base pair offsets (i.e. they have
> to be a multiple of 32 when the operand size is 16)?  So the pair
> offsets have to be a multiple of the entire pair size rather than a
> single operand size?

 We get load pair at a certain point with (0,16) and other program
 point we get load pair (32, 48).

 In current implementation it takes offsets loads as (0, 16),
 (16, 32), (32, 48).

 But on rs6000 we want the load pairs to be merged at different points,
 as (0,16) and (32,48): for (0,16) we want to replace the loads with an lxvp
 at offset 0, and for (32,48) with an lxvp at offset 32.

 In the current case it will merge into an lxvp at offset 0 and an lxvp at
 offset 16, then an lxvp at offset 32 and an lxvp at offset 48, which
 is incorrect in our case: in the (16,32) case the load at offset 16 will
 not load from an even register, which breaks for rs6000.
>>>
>>> Sorry, I think I'm still missing something here.  Why does the address 
>>> offset
>>> affect the parity of the transfer register?  ISTM they needn't be related at
>>> all (and indeed we can't even 

Re: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-04-10 Thread Richard Sandiford
Sorry for the slow reply.

Evgeny Karpov  writes:
> From: Zac Walker 
> Date: Fri, 1 Mar 2024 01:45:13 +0100
> Subject: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements
>  the MS ABI
>
> Two ABIs for aarch64 have been defined for different platforms.
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-opts.h (enum aarch64_calling_abi):
>   Define two ABIs.
> ---
>  gcc/config/aarch64/aarch64-opts.h | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64-opts.h 
> b/gcc/config/aarch64/aarch64-opts.h
> index a05c0d3ded1..52c9e4596d6 100644
> --- a/gcc/config/aarch64/aarch64-opts.h
> +++ b/gcc/config/aarch64/aarch64-opts.h
> @@ -131,4 +131,11 @@ enum aarch64_early_ra_scope {
>AARCH64_EARLY_RA_NONE
>  };
>  
> +/* Available call ABIs.  */
> +enum aarch64_calling_abi
> +{
> +  AARCH64_CALLING_ABI_EABI,
> +  AARCH64_CALLING_ABI_MS
> +};
> +
>  #endif

Thanks for removing the MS_ABI uses.  However, I'm still a bit uneasy
about this.  We already have a way of categorising ABIs (arm_pcs)
and it's not clear how this new enum would interact with it.  We also
have infrastructure for recording the arm_pcs for each call, including
in RTL insns.  Would we need to do the same for this classification?

It seems like the enum is currently unused.  At least, I was able
to build successfully with:

diff --git a/gcc/config/aarch64/aarch64-opts.h b/gcc/config/aarch64/aarch64-opts.h
index 52c9e4596d6..a05c0d3ded1 100644
--- a/gcc/config/aarch64/aarch64-opts.h
+++ b/gcc/config/aarch64/aarch64-opts.h
@@ -131,11 +131,4 @@ enum aarch64_early_ra_scope {
   AARCH64_EARLY_RA_NONE
 };
 
-/* Available call ABIs.  */
-enum aarch64_calling_abi
-{
-  AARCH64_CALLING_ABI_EABI,
-  AARCH64_CALLING_ABI_MS
-};
-
 #endif
diff --git a/gcc/config/aarch64/cygming.h b/gcc/config/aarch64/cygming.h
index 2f239c42a89..902539763bd 100644
--- a/gcc/config/aarch64/cygming.h
+++ b/gcc/config/aarch64/cygming.h
@@ -43,9 +43,6 @@ still needed for compilation.  */
 #define GENERAL_REGNO_P(N) 0
 #define SEH_MAX_FRAME_SIZE 0
 
-#undef DEFAULT_ABI
-#define DEFAULT_ABI AARCH64_CALLING_ABI_MS
-
 #undef TARGET_PECOFF
 #define TARGET_PECOFF 1
 
diff --git a/gcc/config/mingw/mingw32.h b/gcc/config/mingw/mingw32.h
index 040c3e1e521..08f1b5f0696 100644
--- a/gcc/config/mingw/mingw32.h
+++ b/gcc/config/mingw/mingw32.h
@@ -19,9 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #undef DEFAULT_ABI
-#if defined (TARGET_AARCH64_MS_ABI)
-# define DEFAULT_ABI AARCH64_CALLING_ABI_MS
-#else
+#if !defined (TARGET_AARCH64_MS_ABI)
 # define DEFAULT_ABI MS_ABI
 #endif
 
Would you be happy with that for now?  We can then revisit this
later when the information is needed.

Thanks,
Richard


Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Qing Zhao
Thanks for the comments.

> On Apr 10, 2024, at 13:35, Joseph Myers  wrote:
> 
> On Fri, 29 Mar 2024, Qing Zhao wrote:
> 
>> +  /* Issue error when there is a counted_by attribute with a different
>> + field as the argument for the same flexible array member field.  */
> 
> There's another case of this to consider, though I'm not sure where best 
> to check for it (Martin might have suggestions) - of course this case will 
> need testcases as well.

Looks like this additional case relates to the new C23 feature.  Where is the
documentation on this new feature?  I need to study it a little bit, thanks.

> 
> Suppose, as allowed in C23, a structure is defined twice in the same 
> scope,

A stupid question first: does the same scope mean the same file?  (Or the same
function?)

Is there a test case for this feature in the current GCC source tree that I can
take a look at?  (Then I can use it to construct the new test case for the
counted_by attribute.)

> but the two definitions of the structure use inconsistent 
> counted_by attributes.

Where in the current C FE is the case of the same structure being defined twice
in the same scope handled?  Which routine in the C FE?

>  I'd say that, when the declarations are in the 
> same scope (thus required to be consistent), it should be an error for the 
> two definitions of what is meant to be the same structure to use 
> incompatible counted_by attributes (even though the member declarations 
> are otherwise the same).

Agreed.  Will add such checking.

> 
> In C23 structures defined with the same tag in different scopes are 
> compatible given requirements including compatible types for corresponding 
> elements.
Again, which routine in the C FE handles such a case?  I'd like to take a look
at the current handling and how to update it for the counted_by attribute.


>  It would seem most appropriate to me for such structures with 
> incompatible counted_by attributes to be considered *not* compatible types

Is there a utility routine for checking “compatible type”? 


> (but it would be valid to define structures with the same tag, different 
> scopes, and elements the same except for counted_by - just not to use them 
> in any way requiring them to be compatible).

Updating that routine (checking compatible type) with the new "counted_by"
attribute might be enough for this purpose, I guess.
> 
>> +The @code{counted_by} attribute may be attached to the C99 flexible array
>> +member of a structure.  It indicates that the number of the elements of the
>> +array is given by the field "@var{count}" in the same structure as the
> 
> As noted previously, the "" quotes should be removed there (or replaced by 
> ``'' quotes).

Okay, will update this.

thanks.

Qing
> 
> -- 
> Joseph S. Myers
> josmy...@redhat.com
> 



Re: Combine patch ping

2024-04-10 Thread Segher Boessenkool
On Sun, Apr 07, 2024 at 08:31:38AM +0200, Uros Bizjak wrote:
> If there are no further comments, I plan to commit the referred patch
> to the mainline on Wednesday. The latest version can be considered an
> obvious patch that solves certain oversight in the original
> implementation.

This is never okay.  You cannot commit a patch without approval, *ever*.

That patch is also obvious -- obviously *wrong*, that is.  There are
big assumptions everywhere in the compiler how a CC reg can be used.
This violates that, as explained elsewhere.


Segher


Re: [PATCH v2] target: missing -Whardened with -fcf-protection=none [PR114606]

2024-04-10 Thread Jakub Jelinek
On Fri, Apr 05, 2024 at 02:37:08PM -0400, Marek Polacek wrote:
> > This function is passed explicit opts and opts_set arguments, so it
> > shouldn't be using flag_something macros nor OPTION_SET_P, as the former
> > use global_options.x_flag_something rather than opts->x_flag_something
> > and the latter uses global_options_set.x_flag_something.
> 
> Ah right, so the other uses of OPTION_SET_P in ix86_option_override_internal
> are also wrong?

Most likely yes.

> > So, I think you want to use if (!opts_set->x_flag_cf_protection)
> > instead.
> 
> Fixed below, thanks.
> 
> New tests passed on x86_64-pc-linux-gnu, ok for trunk?

Ok, thanks.
> 
> -- >8 --
> -Whardened warns when -fhardened couldn't enable a hardening option
> because that option was disabled on the command line, e.g.:
> 
> $ ./cc1plus -quiet g.C -fhardened -O2 -fstack-protector
> cc1plus: warning: '-fstack-protector-strong' is not enabled by '-fhardened' 
> because it was specified on the command line [-Whardened]
> 
> but it doesn't work as expected with -fcf-protection=none:
> 
> $ ./cc1plus -quiet g.C -fhardened -O2 -fcf-protection=none
> 
> because we're checking == CF_NONE which doesn't distinguish between nothing
> and -fcf-protection=none.  I should have used opts_set, like below.
> 
>   PR target/114606
> 
> gcc/ChangeLog:
> 
>   * config/i386/i386-options.cc (ix86_option_override_internal): Use
>   opts_set rather than checking == CF_NONE.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/i386/fhardened-1.c: New test.
>   * gcc.target/i386/fhardened-2.c: New test.

Jakub



Re: [PATCH v8 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-04-10 Thread Joseph Myers
On Fri, 29 Mar 2024, Qing Zhao wrote:

> +  /* Issue error when there is a counted_by attribute with a different
> + field as the argument for the same flexible array member field.  */

There's another case of this to consider, though I'm not sure where best 
to check for it (Martin might have suggestions) - of course this case will 
need testcases as well.

Suppose, as allowed in C23, a structure is defined twice in the same 
scope, but the two definitions of the structure use inconsistent 
counted_by attributes.  I'd say that, when the declarations are in the 
same scope (thus required to be consistent), it should be an error for the 
two definitions of what is meant to be the same structure to use 
incompatible counted_by attributes (even though the member declarations 
are otherwise the same).

In C23 structures defined with the same tag in different scopes are 
compatible given requirements including compatible types for corresponding 
elements.  It would seem most appropriate to me for such structures with 
incompatible counted_by attributes to be considered *not* compatible types 
(but it would be valid to define structures with the same tag, different 
scopes, and elements the same except for counted_by - just not to use them 
in any way requiring them to be compatible).
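
To make the cases above concrete, here is a hedged sketch. The struct, field and macro names are made up for illustration; the attribute is guarded with __has_attribute so the code still compiles where counted_by is not available, and the same-scope redefinition is shown only in a comment because it needs a C23 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* TOY_COUNTED_BY is a hypothetical helper macro: it expands to the
   attribute only where the compiler reports support for it.  */
#if __has_attribute (counted_by)
# define TOY_COUNTED_BY(field) __attribute__ ((counted_by (field)))
#else
# define TOY_COUNTED_BY(field)
#endif

struct toy_buf
{
  size_t count;
  size_t other;
  int data[] TOY_COUNTED_BY (count);
};

/* The same-scope case discussed above would look like this in C23:

     struct toy_buf { size_t count; size_t other;
                      int data[] TOY_COUNTED_BY (other); };

   The members are otherwise identical, but the counted_by argument
   differs from the first definition, so under the proposal this
   redefinition would be rejected.  */

/* Allocate a toy_buf with room for N elements and record N in the
   field named by the attribute.  Returns NULL on allocation failure.  */
struct toy_buf *
toy_buf_alloc (size_t n)
{
  struct toy_buf *b = calloc (1, sizeof (struct toy_buf) + n * sizeof (int));
  if (b != NULL)
    b->count = n;
  return b;
}
```

The different-scope case would be the same pair of definitions placed in two separate block scopes (for example, two function bodies): both definitions would be valid on their own, but under the suggestion above the two types would not be compatible with each other.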

> +The @code{counted_by} attribute may be attached to the C99 flexible array
> +member of a structure.  It indicates that the number of the elements of the
> +array is given by the field "@var{count}" in the same structure as the

As noted previously, the "" quotes should be removed there (or replaced by 
``'' quotes).

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH] c++/114409 - ANNOTATE_EXPR and templates

2024-04-10 Thread Jason Merrill

On 4/10/24 13:10, Richard Biener wrote:

On Wed, 10 Apr 2024, Jakub Jelinek wrote:


On Wed, Apr 10, 2024 at 06:43:02PM +0200, Richard Biener wrote:

The following fixes a mismatch in COMPOUND_EXPR handling in
tsubst_expr vs tsubst_stmt where the latter allows a stmt in
operand zero but the former doesn't.  This makes a difference
for the case at hand because when the COMPOUND_EXPR is wrapped
inside an ANNOTATE_EXPR it gets handled by tsubst_expr and when
not, tsubst_stmt successfully handles it and the contained
DECL_EXPR in operand zero.

The following makes handling of COMPOUND_EXPR in tsubst_expr
consistent with that of tsubst_stmt for the operand that doesn't
specify the result and thus the reason we choose either or the
other for substing.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

PR c++/114409
gcc/cp/
* pt.cc (tsubst_expr): Recurse to COMPOUND_EXPR operand
zero using tsubst_stmt, when that returns NULL return
the subst operand one, mimicking what tsubst_stmt does.

gcc/testsuite/
* g++.dg/pr114409.C: New testcase.


I've posted https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114409#c16
for this already and Jason agreed to that version, so I just have to test it
tonight:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649165.html


Ah, I saw the bugzilla patches and wanted this version to be sent
because I think the COMPOUND_EXPR inconsistency is odd.  So Jason,
please still have a look, not necessarily because of the bug
which can be fixed in multiple ways but because of that COMPOUND_EXPR
handling oddity (there are already some cases in tsubst_expr that
explicitly recurse with tsubst_stmt).


The difference between tsubst_stmt and tsubst_expr handling of 
COMPOUND_EXPR seems consistent with the general difference between the 
two functions, so I think this change isn't needed.  The two existing 
uses of tsubst_stmt in tsubst_expr are statement-expressions (for the 
substatement) and transactions (strangely, non-statement transactions 
are handled in tsubst_stmt).


Jason



Re: [PATCH] c++/114409 - ANNOTATE_EXPR and templates

2024-04-10 Thread Jakub Jelinek
On Wed, Apr 10, 2024 at 07:10:52PM +0200, Richard Biener wrote:
> Ah, I saw the bugzilla patches and wanted this version to be sent
> because I think the COMPOUND_EXPR inconsistency is odd.  So Jason,
> please still have a look, not necessarily because of the bug
> which can be fixed in multiple ways but because of that COMPOUND_EXPR
> handling oddity (there are already some cases in tsubst_expr that
> explicitly recurse with tsubst_stmt).

I think if COMPOUND_EXPR appears in a context where only expressions but not
statements are allowed (say one of the operands of PLUS_EXPR/MINUS_EXPR/...
and hundreds of other places), then the operands of that COMPOUND_EXPR
shouldn't be statements either, so we should be using tsubst_expr rather
than tsubst_stmt on it for the recursion on the first operand and it should
never return NULL.  For statements, it can return NULL when the statement
is actually emitted with add_stmt and so nothing more needs to be kept.
tsubst_stmt ends with
default:
  gcc_assert (!STATEMENT_CODE_P (TREE_CODE (t)));
   
  RETURN (tsubst_expr (t, args, complain, in_decl));
so if something isn't handled by tsubst_stmt, it will handle it using
tsubst_expr.  But COMPOUND_EXPR is I think intentionally handled by both.
({ ... }) is handled separately in the STMT_EXPR tsubst_expr case, where
it calls tsubst_stmt after preparing stuff.

Jakub



Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Andrew Carlotti
On Wed, Apr 10, 2024 at 05:42:05PM +0100, Richard Sandiford wrote:
> Andrew Carlotti  writes:
> > On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote:
> >> Andrew Carlotti  writes:
> >> > The first three patches are trivial changes to the feature list to 
> >> > reflect
> >> > recent changes in the ACLE.  Patch 4 removes most of the FMV 
> >> > multiversioning
> >> > features that don't work at the moment, and should be entirely 
> >> > uncontroversial.
> >> >
> >> > Patch 5 handles the remaining cases, where there's an inconsistency in 
> >> > how
> >> > features are named in the current FMV specification compared to the 
> >> > existing
> >> > command line options.  It might be better to instead preserve the 
> >> > "memtag2",
> >> > "ssbs2" and "ls64_accdata" names for now; I'd be happy to commit either
> >> > version.
> >> 
> >> Yeah, I suppose patch 5 leaves things in a somewhat awkward state,
> >> since e.g.:
> >> 
> >> -AARCH64_OPT_FMV_EXTENSION("memtag", MEMTAG, (), (), (), "")
> >> +AARCH64_OPT_EXTENSION("memtag", MEMTAG, (), (), (), "")
> >>  
> >> -AARCH64_FMV_FEATURE("memtag2", MEMTAG2, (MEMTAG))
> >> +AARCH64_FMV_FEATURE("memtag", MEMTAG2, (MEMTAG))
> >> 
> >> seems to drop "memtag2" and FEAT_MEMTAG, but keep "memtag" and
> >> FEAT_MEMTAG2.  Is that right?
> >
> > That's deliberate. The FEAT_MEMTAG bit in __aarch64_cpu_features is defined 
> > to
> > match the definition of FEAT_MTE in the architecture, and likewise for
> > FEAT_MEMTAG2/FEAT_MTE2.  However, in Binutils the "+memtag" extension 
> > enables
> > both FEAT_MTE and FEAT_MTE2 instructions (although none of the FEAT_MTE2
> > instructions can be generated from GCC without inline assembly).  The FMV
> > specification in the ACLE currently uses names "memtag" and "memtag2" that
> > match the architecture names, but arguably don't match the command line
> > extension names.  I'm advocating for that to change to match the extension
> > names in command line options.
> 
> Hmm, ok.  I agree it makes sense for the user-visible FMV names to match
> the command line.  But shouldn't __aarch64_cpu_features either (a) use exactly
> the same names as the architecture or (b) use exactly the same names as the
> command-line (mangled where necessary)?  It seems that we're instead
> using a third convention that doesn't exactly match the other two.

I agree that the name isn't one I would choose now, but I don't think it 
matters much that it's inconsistent.

> That is, I can see the rationale for "memtag" => FEAT_MTE2 and
> "memtag" => FEAT_MEMTAG.  It just seems odd to have "memtag" => FEAT_MEMTAG2
> (where MEMTAG2 is an alias of MTE2).
> 
> How much leeway do we have to change the __aarch64_cpu_features names?
> Is it supposed to be a public API (as opposed to ABI)?

I think we're designing it to be capable of being a public API, but we haven't
yet made it one.  That's partly why I've kept the enum value names the same as
in LLVM so far.

> > The LS64 example is definitely an inconsistency, since GCC uses "+ls64" to
> > enable intrinsics for all of the FEAT_LS64/FEAT_LS64_V/FEAT_LS64_ACCDATA
> > intrinsics.
> 
> Ok, thanks.  If we go for option (a) above then I agree that the ls64
> change is correct.  If we go for option (b) then I suppose it should
> stay as LS64.
> 
> > There were similar issues with "sha1", "pmull" and "sve2-pmull128", but in
> > these cases their presence architecturally is implied by the presence of the
> > features checked for "sha2", "aes" and "sve2-aes" so it's fine to just 
> > delete
> > the ones without command line flags.
> >
> >> Apart from that and the comment on patch 2, the series looks good to me.
> >> 
> >> While rechecking aarch64-option-extensions.def against the ACLE list:
> >> it seems that the .def doesn't treat mops as an FMV feature.  Is that
> >> deliberate?
> >
> > "mops" was added to the ACLE list later, and libgcc doesn't yet support
> > detecting it.  I didn't think it was sensible to add new FMV feature 
> > support at
> > this stage.
> 
> Ah, ok, makes sense.
> 
> Richard


[PATCH][wwwdocs] gcc-14/changes.html: Update _BitInt to include AArch64 (little-endian)

2024-04-10 Thread Andre Vieira (lists)

Hi,

Patch to add AArch64 to the list of supported _BitInt(N) in 
gcc-14/changes.html.


OK?

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index a7ba957110183f906938d935bfa17aaed2ba20c8..55ab8c14c6d0b54e05a5f266f25c8ef1a4f959bf 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -216,7 +216,7 @@ a work-in-progress.
   Bit-precise integer types (_BitInt (N)
   and unsigned _BitInt (N)): integer types with
   a specified number of bits.  These are only supported on
-  IA-32/x86-64 at present.
+  IA-32/x86-64 and AArch64 (little-endian) at present.
   Structure, union and enumeration types may be defined more
   than once in the same scope with the same contents and the same
   tag; if such types are defined with the same contents and the


Re: [PATCH] c++/114409 - ANNOTATE_EXPR and templates

2024-04-10 Thread Richard Biener
On Wed, 10 Apr 2024, Jakub Jelinek wrote:

> On Wed, Apr 10, 2024 at 06:43:02PM +0200, Richard Biener wrote:
> > The following fixes a mismatch in COMPOUND_EXPR handling in
> > tsubst_expr vs tsubst_stmt where the latter allows a stmt in
> > operand zero but the former doesn't.  This makes a difference
> > for the case at hand because when the COMPOUND_EXPR is wrapped
> > inside an ANNOTATE_EXPR it gets handled by tsubst_expr and when
> > not, tsubst_stmt successfully handles it and the contained
> > DECL_EXPR in operand zero.
> > 
> > The following makes handling of COMPOUND_EXPR in tsubst_expr
> > consistent with that of tsubst_stmt for the operand that doesn't
> > specify the result and thus the reason we choose either or the
> > other for substing.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> > 
> > Thanks,
> > Richard.
> > 
> > PR c++/114409
> > gcc/cp/
> > * pt.cc (tsubst_expr): Recurse to COMPOUND_EXPR operand
> > zero using tsubst_stmt, when that returns NULL return
> > the subst operand one, mimicking what tsubst_stmt does.
> > 
> > gcc/testsuite/
> > * g++.dg/pr114409.C: New testcase.
> 
> I've posted https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114409#c16
> for this already and Jason agreed to that version, so I just have to test it
> tonight:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649165.html

Ah, I saw the bugzilla patches and wanted this version to be sent
because I think the COMPOUND_EXPR inconsistency is odd.  So Jason,
please still have a look, not necessarily because of the bug
which can be fixed in multiple ways but because of that COMPOUND_EXPR
handling oddity (there are already some cases in tsubst_expr that
explicitly recurse with tsubst_stmt).

Richard.

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] c++/114409 - ANNOTATE_EXPR and templates

2024-04-10 Thread Jakub Jelinek
On Wed, Apr 10, 2024 at 06:43:02PM +0200, Richard Biener wrote:
> The following fixes a mismatch in COMPOUND_EXPR handling in
> tsubst_expr vs tsubst_stmt where the latter allows a stmt in
> operand zero but the former doesn't.  This makes a difference
> for the case at hand because when the COMPOUND_EXPR is wrapped
> inside an ANNOTATE_EXPR it gets handled by tsubst_expr and when
> not, tsubst_stmt successfully handles it and the contained
> DECL_EXPR in operand zero.
> 
> The following makes handling of COMPOUND_EXPR in tsubst_expr
> consistent with that of tsubst_stmt for the operand that doesn't
> specify the result and thus the reason we choose either or the
> other for substing.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> 
> Thanks,
> Richard.
> 
>   PR c++/114409
> gcc/cp/
>   * pt.cc (tsubst_expr): Recurse to COMPOUND_EXPR operand
>   zero using tsubst_stmt, when that returns NULL return
>   the subst operand one, mimicking what tsubst_stmt does.
> 
> gcc/testsuite/
>   * g++.dg/pr114409.C: New testcase.

I've posted https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114409#c16
for this already and Jason agreed to that version, so I just have to test it
tonight:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649165.html

Jakub



[PATCH] c++/114409 - ANNOTATE_EXPR and templates

2024-04-10 Thread Richard Biener
The following fixes a mismatch in COMPOUND_EXPR handling in
tsubst_expr vs tsubst_stmt where the latter allows a stmt in
operand zero but the former doesn't.  This makes a difference
for the case at hand because when the COMPOUND_EXPR is wrapped
inside an ANNOTATE_EXPR it gets handled by tsubst_expr and when
not, tsubst_stmt successfully handles it and the contained
DECL_EXPR in operand zero.

The following makes handling of COMPOUND_EXPR in tsubst_expr
consistent with that of tsubst_stmt for the operand that doesn't
specify the result and thus the reason we choose either or the
other for substing.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

PR c++/114409
gcc/cp/
* pt.cc (tsubst_expr): Recurse to COMPOUND_EXPR operand
zero using tsubst_stmt, when that returns NULL return
the subst operand one, mimicking what tsubst_stmt does.

gcc/testsuite/
* g++.dg/pr114409.C: New testcase.
---
 gcc/cp/pt.cc| 5 -
 gcc/testsuite/g++.dg/pr114409.C | 8 
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/pr114409.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index bf4b89d8413..dae423a751f 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -20635,8 +20635,11 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
 
 case COMPOUND_EXPR:
   {
-   tree op0 = tsubst_expr (TREE_OPERAND (t, 0), args,
+   tree op0 = tsubst_stmt (TREE_OPERAND (t, 0), args,
complain & ~tf_decltype, in_decl);
+   if (op0 == NULL_TREE)
+ /* If the first operand was a statement, we're done with it.  */
+ RETURN (RECUR (TREE_OPERAND (t, 1)));
RETURN (build_x_compound_expr (EXPR_LOCATION (t),
   op0,
   RECUR (TREE_OPERAND (t, 1)),
diff --git a/gcc/testsuite/g++.dg/pr114409.C b/gcc/testsuite/g++.dg/pr114409.C
new file mode 100644
index 000..6343fe8d9f3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr114409.C
@@ -0,0 +1,8 @@
+// { dg-do compile }
+
+template <int> int t() {
+#pragma GCC unroll 4
+while (int ThisEntry = 0) { } // { dg-bogus "ignoring loop annotation" "" { xfail *-*-* } }
+return 0;
+}
+int tt = t<1>();
-- 
2.35.3


Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Richard Sandiford
Andrew Carlotti  writes:
> On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote:
>> Andrew Carlotti  writes:
>> > The first three patches are trivial changes to the feature list to reflect
>> > recent changes in the ACLE.  Patch 4 removes most of the FMV 
>> > multiversioning
>> > features that don't work at the moment, and should be entirely 
>> > uncontroversial.
>> >
>> > Patch 5 handles the remaining cases, where there's an inconsistency in how
>> > features are named in the current FMV specification compared to the 
>> > existing
>> > command line options.  It might be better to instead preserve the 
>> > "memtag2",
>> > "ssbs2" and "ls64_accdata" names for now; I'd be happy to commit either
>> > version.
>> 
>> Yeah, I suppose patch 5 leaves things in a somewhat awkward state,
>> since e.g.:
>> 
>> -AARCH64_OPT_FMV_EXTENSION("memtag", MEMTAG, (), (), (), "")
>> +AARCH64_OPT_EXTENSION("memtag", MEMTAG, (), (), (), "")
>>  
>> -AARCH64_FMV_FEATURE("memtag2", MEMTAG2, (MEMTAG))
>> +AARCH64_FMV_FEATURE("memtag", MEMTAG2, (MEMTAG))
>> 
>> seems to drop "memtag2" and FEAT_MEMTAG, but keep "memtag" and
>> FEAT_MEMTAG2.  Is that right?
>
> That's deliberate. The FEAT_MEMTAG bit in __aarch64_cpu_features is defined to
> match the definition of FEAT_MTE in the architecture, and likewise for
> FEAT_MEMTAG2/FEAT_MTE2.  However, in Binutils the "+memtag" extension enables
> both FEAT_MTE and FEAT_MTE2 instructions (although none of the FEAT_MTE2
> instructions can be generated from GCC without inline assembly).  The FMV
> specification in the ACLE currently uses names "memtag" and "memtag2" that
> match the architecture names, but arguably don't match the command line
> extension names.  I'm advocating for that to change to match the extension
> names in command line options.

Hmm, ok.  I agree it makes sense for the user-visible FMV names to match
the command line.  But shouldn't __aarch64_cpu_features either (a) use exactly
the same names as the architecture or (b) use exactly the same names as the
command-line (mangled where necessary)?  It seems that we're instead
using a third convention that doesn't exactly match the other two.

That is, I can see the rationale for "memtag" => FEAT_MTE2 and
"memtag" => FEAT_MEMTAG.  It just seems odd to have "memtag" => FEAT_MEMTAG2
(where MEMTAG2 is an alias of MTE2).

How much leeway do we have to change the __aarch64_cpu_features names?
Is it supposed to be a public API (as opposed to ABI)?

> The LS64 example is definitely an inconsistency, since GCC uses "+ls64" to
> enable intrinsics for all of the FEAT_LS64/FEAT_LS64_V/FEAT_LS64_ACCDATA
> intrinsics.

Ok, thanks.  If we go for option (a) above then I agree that the ls64
change is correct.  If we go for option (b) then I suppose it should
stay as LS64.

> There were similar issues with "sha1", "pmull" and "sve2-pmull128", but in
> these cases their presence architecturally is implied by the presence of the
> features checked for "sha2", "aes" and "sve2-aes" so it's fine to just delete
> the ones without command line flags.
>
>> Apart from that and the comment on patch 2, the series looks good to me.
>> 
>> While rechecking aarch64-option-extensions.def against the ACLE list:
>> it seems that the .def doesn't treat mops as an FMV feature.  Is that
>> deliberate?
>
>
> "mops" was added to the ACLE list later, and libgcc doesn't yet support
> detecting it.  I didn't think it was sensible to add new FMV feature
> support at this stage.

Ah, ok, makes sense.

Richard


Re: [PATCH] c++: Fix ANNOTATE_EXPR instantiation [PR114409]

2024-04-10 Thread Jason Merrill

On 4/10/24 09:06, Jakub Jelinek wrote:

Hi!

The following testcase ICEs starting with the r14-4229 PR111529
change which moved ANNOTATE_EXPR handling from tsubst_expr to
tsubst_copy_and_build.
ANNOTATE_EXPR is only allowed in the IL to wrap a loop condition,
and the loop condition of while/for loops can be a COMPOUND_EXPR
with DECL_EXPR in the first operand and the corresponding VAR_DECL
in the second, as created by finish_cond
   else if (!empty_expr_stmt_p (cond))
expr = build2 (COMPOUND_EXPR, TREE_TYPE (expr), cond, expr);
Since then Patrick reworked the instantiation, so that we have now
tsubst_stmt and tsubst_expr and ANNOTATE_EXPR ended up in the latter,
while only tsubst_stmt can handle DECL_EXPR.

Now, the reason why while/for loops with a variable declaration
in the condition work in templates without the pragmas (i.e. without
ANNOTATE_EXPR) is that both the FOR_STMT and WHILE_STMT handling use
RECUR aka tsubst_stmt in handling of the *_COND operand:
 case FOR_STMT:
   stmt = begin_for_stmt (NULL_TREE, NULL_TREE);
   RECUR (FOR_INIT_STMT (t));
   finish_init_stmt (stmt);
   tmp = RECUR (FOR_COND (t));
   finish_for_cond (tmp, stmt, false, 0, false);
and
 case WHILE_STMT:
   stmt = begin_while_stmt ();
   tmp = RECUR (WHILE_COND (t));
   finish_while_stmt_cond (tmp, stmt, false, 0, false);
Therefore, it will handle DECL_EXPR embedded in COMPOUND_EXPR of the
{WHILE,FOR}_COND just fine.
But if that COMPOUND_EXPR with DECL_EXPR is wrapped with one or more
ANNOTATE_EXPRs, because ANNOTATE_EXPR is now done solely in tsubst_expr
and uses RECUR there (i.e. tsubst_expr), it will ICE on DECL_EXPR in there.
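
For readers less familiar with declarations in loop conditions, here is a minimal standalone sketch of the source-level construct that leads to the DECL_EXPR-inside-COMPOUND_EXPR condition described above. It is not taken from the patch: next_value, count_with_decl and count_expanded are made-up names, and the second function only spells out by hand what the first one means.

```cpp
#include <cassert>

// Hypothetical helper: yields state, state-1, ..., 1 and then 0.
static int
next_value (int &state)
{
  return state > 0 ? state-- : 0;
}

// A loop whose condition declares a variable.  In a template, the
// condition carries both the declaration and a use of the variable.
template <typename T>
int
count_with_decl (T start)
{
  int state = start;
  int n = 0;
  while (T y = next_value (state))
    {
      assert (y != 0);  // the body only runs while y is nonzero
      ++n;
    }
  return n;
}

// Hand-written equivalent: declare, test, then run the body.
template <typename T>
int
count_expanded (T start)
{
  int state = start;
  int n = 0;
  for (;;)
    {
      T y = next_value (state);
      if (!y)
        break;
      ++n;
    }
  return n;
}
```

Both functions perform the same number of iterations for a given start value; adding a loop pragma in front of the while in the first one is what wraps the condition in ANNOTATE_EXPR.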

Here are 2 possible fixes for this.
The first one keeps ANNOTATE_EXPR handling in tsubst_expr but uses
tsubst_stmt for the first operand.
The second one moves ANNOTATE_EXPR handling to tsubst_stmt (and uses
tsubst_expr for the second/third operand (it could just RECUR too if you
prefer that)).
Yet another possibility could be to duplicate the ANNOTATE_EXPR handling
from tsubst_expr to tsubst_stmt, where both would just RECUR on its
operands, so if one arrives at ANNOTATE_EXPR from tsubst_stmt, it will
tsubst_stmt recursively, and if from tsubst_expr (when?) then it would
handle it using tsubst_expr.

So far just lightly tested (but g++.dg/ext/unroll-4.C and the new test
both pass with both versions of the patch), what do you prefer?  I'd like
to avoid testing too many variants...


Let's go with the second.


2024-04-10  Jakub Jelinek  

PR c++/114409
* pt.cc (tsubst_expr) <case ANNOTATE_EXPR>: Use tsubst_stmt rather
than tsubst_expr aka RECUR on op1.

* g++.dg/ext/pr114409-2.C: New test.

--- gcc/cp/pt.cc.jj 2024-04-09 09:29:04.721521726 +0200
+++ gcc/cp/pt.cc 2024-04-10 14:38:43.591554947 +0200
@@ -21774,7 +21774,10 @@ tsubst_expr (tree t, tree args, tsubst_f
  
  case ANNOTATE_EXPR:

{
-   op1 = RECUR (TREE_OPERAND (t, 0));
+   /* ANNOTATE_EXPR should only appear in WHILE_COND, DO_COND or
+  FOR_COND expressions, which are tsubsted using tsubst_stmt
+  rather than tsubst_expr and can contain DECL_EXPRs.  */
+   op1 = tsubst_stmt (TREE_OPERAND (t, 0), args, complain, in_decl);
tree op2 = RECUR (TREE_OPERAND (t, 1));
tree op3 = RECUR (TREE_OPERAND (t, 2));
if (TREE_CODE (op2) == INTEGER_CST
--- gcc/testsuite/g++.dg/ext/pr114409-2.C.jj 2024-04-10 14:35:19.693300552 +0200
+++ gcc/testsuite/g++.dg/ext/pr114409-2.C 2024-04-10 14:35:13.513383766 +0200
@@ -0,0 +1,36 @@
+// PR c++/114409
+// { dg-do compile }
+// { dg-options "-O2" }
+
+template <typename T>
+T
+foo (T)
+{
+  static T t;
+  return 42 - ++t;
+}
+
+template <typename T>
+void
+bar (T x)
+{
+  #pragma GCC novector
+  while (T y = foo (x))
+    ++y;
+}
+
+template <typename T>
+void
+baz (T x)
+{
+  #pragma GCC novector
+  for (; T y = foo (x); )
+    ++y;
+}
+
+void
+qux ()
+{
+  bar (0);
+  baz (0);
+}

Jakub




Re: [PATCH] c++/modules: optimize tree flag streaming

2024-04-10 Thread Jason Merrill

On 4/10/24 11:26, Patrick Palka wrote:

On Wed, 10 Apr 2024, Patrick Palka wrote:



On Tue, 9 Apr 2024, Jason Merrill wrote:


On 2/16/24 10:06, Patrick Palka wrote:

On Thu, 15 Feb 2024, Patrick Palka wrote:


One would expect consecutive calls to bytes_in/out::b for streaming
adjacent bits, as we do for tree flag streaming, to at least be
optimized by the compiler into individual bit operations using
statically known bit positions (and ideally merged into larger sized
reads/writes).


Did you have any thoughts about how feasible it would be to use
data-streamer.h?  I didn't see a response to richi's mail.


IIUC the workhorses of data-streamer.h are

   for streaming out: bitpack_create / bp_pack_value / streamer_write_bitpack
   for streaming in:  streamer_read_bitpack / bp_unpack_value

which use a locally constructed bitpack_d struct for state management,
much like what this patch proposes, which is a sign that this is a good
approach I suppose.

The bit twiddling code is unsurprisingly pretty similar except
data-streamer.h can stream more than one bit at a time whereas
bits_in/out::b from this patch can only handle one bit at a time
(which is by far the common case).  Another difference is that the
data-streamer.h buffer is a HOST_WIDE_INT while the modules bit buffer
is uint32_t (this patch doesn't change that).

Unfortunately it seems data-streamer.h is currently hardcoded for
LTO streaming since bitpack_d::stream must be an lto_input_block and it
uses streamer_write_uhwi_stream and streamer_read_uhwi under the hood.
So we can't use it for modules streaming currently without abstracting
away this hardcoding AFAICT.




Unfortunately this doesn't happen because the compiler has trouble
tracking the values of this->bit_pos and this->bit_val across such
calls, likely because the compiler doesn't know 'this' and so it's
treated as global memory.  This means for each consecutive bit stream
operation, bit_pos and bit_val are loaded from memory, checked if
buffering is needed, and finally the bit is extracted from bit_val
according to the (unknown) bit_pos, even though relative to the previous
operation (if we didn't need to buffer) bit_val is unchanged and bit_pos
is just 1 larger.  This ends up being quite slow, with tree_node_bools
taking 10% of time when streaming in parts of the std module.

This patch optimizes this by making tracking of bit_pos and bit_val
easier for the compiler.  Rather than bit_pos and bit_val being members
of the (effectively global) bytes_in/out objects, this patch factors out
the bit streaming code/state into separate classes bits_in/out that get
constructed locally as needed for bit streaming.  Since these objects
are now clearly local, the compiler can more easily track their values.


Please add this rationale to the bits_in comment.


Will do.




And since bit streaming is intended to be batched it's natural for these
new classes to be RAII-enabled such that the bit stream is flushed upon
destruction.
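
To make the idea concrete, here is a rough standalone sketch of a locally constructed, RAII-flushed bit buffer. It is not the code from the patch: byte_sink, bit_writer and write_flags are hypothetical names, and the flush policy (emit only the bytes that hold bits) is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the byte-oriented stream.
struct byte_sink
{
  std::vector<std::uint8_t> data;
};

// Hypothetical local bit packer.  Its state lives in the object itself,
// so when the object is a local variable the compiler can keep bit_pos
// and bit_val in registers across consecutive b() calls.
class bit_writer
{
  byte_sink &out;
  std::uint32_t bit_val = 0;
  unsigned bit_pos = 0;

public:
  explicit bit_writer (byte_sink &s) : out (s) {}
  bit_writer (const bit_writer &) = delete;
  bit_writer &operator= (const bit_writer &) = delete;

  // RAII: pending bits are written out when the object dies.
  ~bit_writer () { flush (); }

  // Append one bit, flushing first if the 32-bit buffer is full.
  void b (bool x)
  {
    if (bit_pos == 32)
      flush ();
    assert (bit_pos < 32);
    bit_val |= static_cast<std::uint32_t> (x) << bit_pos;
    ++bit_pos;
  }

  // Write only the bytes that actually contain bits, then reset.
  void flush ()
  {
    unsigned bytes = (bit_pos + 7) / 8;
    for (unsigned i = 0; i < bytes; ++i)
      out.data.push_back (static_cast<std::uint8_t> ((bit_val >> (8 * i)) & 0xff));
    bit_val = 0;
    bit_pos = 0;
  }
};

// Streams three flags; the destructor of 'bits' flushes them.
static void
write_flags (byte_sink &s, bool a, bool b, bool c)
{
  bit_writer bits (s);
  bits.b (a);
  bits.b (b);
  bits.b (c);
}
```

In this sketch the three flags written by write_flags end up in a single byte, since the flush emits only as many bytes as are needed for the bits written so far.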

In order to make the most of this improved tracking of bit position,
this patch changes parts where we conditionally stream a tree flag
to unconditionally stream (the flag or a dummy value).  That way
the number of bits streamed and the respective bit positions are as
statically known as reasonably possible.  In lang_decl_bools and
lang_type_bools we flush the current bit buffer at the start so that
subsequent bit positions are statically known.  And in core bools, we
can add explicit early exits utilizing invariants that the compiler
can't figure out itself (e.g. a tree code can't have both TS_TYPE_COMMON
and TS_DECL_COMMON, and if a tree code doesn't have TS_DECL_COMMON then
it doesn't have TS_DECL_WITH_VIS).  Finally if we're streaming fewer
than 4 bits, it's more space efficient to stream them as individual
bytes rather than as packed bits (due to the 32-bit buffer).


Oops, this last sentence is wrong.  Although the size of the bit buffer
is 32 bits, upon flushing we rewind unused bytes within the buffer,
which means streaming 2-8 bits ends up using only one byte not all four.
So v2 below undoes this pessimization.


This patch also moves the definitions of the relevant streaming classes
into anonymous namespaces so that the compiler can make more informed
decisions about inlining their member functions.


I'm curious why you decided to put namespace { } around each class rather than
a larger part of the file?  Not asking for a change, just curious.


I don't feel strongly about it, but to me using a separate namespace { }
for each class is consistent with how we use 'static' instead of
namespace { } to give (consecutively defined) free functions internal
linkage, i.e. instead of

 namespace {
   void f() { }
   void g() { }
 }

we do

static void f() { }
static void g() { }

Using a separate namespace { } for each class is the closest thing to
'static' for types.  And it makes it obvious whether a class is TU-local
or not.
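
A small standalone sketch of that convention follows. The names twice, counter, doubler and demo are made up for the example and have nothing to do with module.cc.

```cpp
#include <cassert>

// Free function with internal linkage, marked individually with 'static'.
static int
twice (int x)
{
  return 2 * x;
}

// 'static' cannot be applied to a type, so each TU-local class gets
// its own unnamed namespace instead.
namespace {
class counter
{
  int n = 0;

public:
  int bump () { return ++n; }
};
} // anon namespace

namespace {
class doubler
{
public:
  int apply (int x) const { return twice (x); }
};
} // anon namespace

// Uses both TU-local classes: bumps the counter twice, doubles the result.
static int
demo ()
{
  counter c;
  int first = c.bump ();
  assert (first == 1);
  return doubler ().apply (c.bump ());
}
```

Reading top to bottom, every entity carries its own marker ('static' or namespace { }) right at its definition, which is the property being argued for.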



I'm also 

[PATCH] tree-optimization/114672 - WIDEN_MULT_PLUS_EXPR type mismatch

2024-04-10 Thread Richard Biener
The following makes sure to restrict WIDEN_MULT*_EXPR to a mode
precision final compute type as the mode is used to find the optab
and type checking chokes when seeing bit-precisions later which
would likely also not be properly expanded to RTL.
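
As a rough illustration of what "mode precision" means here, the sketch below models the check with a hard-coded set of integer mode widths. This is a simplification and not GCC's type_has_mode_precision_p, which as far as I understand compares the type's precision against the precision of its machine mode; has_mode_precision, widen_mult_plus_allowed and the list of widths are assumptions for the example.

```cpp
#include <cassert>

// Simplified, hypothetical model: assume the target only has integer
// modes of these widths.  Real targets define their own mode sets.
constexpr bool
has_mode_precision (unsigned precision)
{
  return precision == 8 || precision == 16 || precision == 32
         || precision == 64 || precision == 128;
}

// Mirrors the shape of the testcase: the bit-field has 60 bits of
// precision but would be computed in a 64-bit mode.
struct bitfield_holder
{
  long long m : 60;
};

// True if a widening multiply-add producing a value of this precision
// would pass the modelled check.
constexpr bool
widen_mult_plus_allowed (unsigned result_precision)
{
  return has_mode_precision (result_precision);
}

static_assert (!widen_mult_plus_allowed (60), "60-bit result is rejected");
static_assert (widen_mult_plus_allowed (64), "64-bit result is accepted");
```

Under this model the 60-bit bit-field from the testcase is rejected while a plain 64-bit result is accepted.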

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/114672
* tree-ssa-math-opts.cc (convert_plusminus_to_widen): Only
allow mode-precision results.

* gcc.dg/torture/pr114672.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr114672.c | 14 ++
 gcc/tree-ssa-math-opts.cc   |  5 +++--
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr114672.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr114672.c b/gcc/testsuite/gcc.dg/torture/pr114672.c
new file mode 100644
index 000..b69511fe8db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr114672.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+struct {
+  __INT64_TYPE__ m : 60;
+} s;
+
+short a;
+short b;
+
+void
+foo ()
+{
+  s.m += a * b;
+}
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index a8d25c2de48..705f4a4695a 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -2918,8 +2918,9 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple *stmt,
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
-  if (TREE_CODE (type) != INTEGER_TYPE
-  && TREE_CODE (type) != FIXED_POINT_TYPE)
+  if ((TREE_CODE (type) != INTEGER_TYPE
+   && TREE_CODE (type) != FIXED_POINT_TYPE)
+  || !type_has_mode_precision_p (type))
 return false;
 
   if (code == MINUS_EXPR)
-- 
2.35.3


Re: [PATCH] wwwdocs: gcc-14: Add RISC-V changes

2024-04-10 Thread Palmer Dabbelt

On Wed, 10 Apr 2024 00:58:00 PDT (-0700), kito.ch...@sifive.com wrote:

---
 htdocs/gcc-14/changes.html | 155 -
 1 file changed, 154 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 2d8968cf..6cbb2e8f 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -739,7 +739,160 @@ __asm (".global __flmap_lock"  "\n\t"

 

-
+RISC-V
+
+  The SLP and loop vectorizer is now enabled for RISC-V when the vector


I think "are now enabled"?


+  extension is enabled, thanks to Ju-Zhe Zhong from
+  RiVAI,
+  Pan Li from Intel, and Robin Dapp
+  from Ventana Micro for
+  contributing most of the implementation!
+  The -mrvv-max-lmul= option has been introduced for
+  performance tuning of the loop vectorizer. The default value is
+  -mrvv-max-lmul=m1, which limits the maximum LMUL to 1.
+  The -mrvv-max-lmul=dynamic setting can dynamically select
+  the maximum LMUL value based on register pressure.
+  Atomic code generation has been improved and is now in conformance with
+  the latest psABI specification, thanks to Patrick O'Neill from
+  Rivos.
+  Support for the vector intrinsics as specified in
+  
+  version 1.0 of the RISC-V vector intrinsic specification.
+  Support for the experimental vector crypto intrinsics as specified in
+  
+  RISC-V vector intrinsic specification, thanks to Feng Wang et al.
+  from ESWIN Computing (https://eswincomputing.com/)
+  Support for the T-head vector intrinsics.
+  Support for the scalar bitmanip and scalar crypto  intrinsics, thanks to
+  Liao Shihua from PLCT (https://plctlab.org/).
+  Support for the large code model via option -mcmodel=large,
+  thanks to Kuan-Lin Chen from
+  Andes Technology (https://www.andestech.com/).
+  Support for the standard vector calling convention variant, thanks to
+  Lehua Ding from RiVAI.
+  Supports the target attribute, which allows users to compile
+  a function with specific extensions.
+  -march= option no longer requires the architecture string
+  to be in canonical order, with only a few constraints remaining: the
+  architecture string must start with rv[32|64][i|g|e], and
+  must use an underscore as the separator after a multi-letter extension.
+  
+  -march=help option has been introduced to dump all
+  supported extensions.
+  Added experimental support for the -mrvv-vector-bits=zvl
+  option and the riscv_rvv_vector_bits attribute, which
+  specify a fixed length for scalable vector types. This option is
+  optimized for specific vector core implementations; however, the code
+  generated with this option is NOT portable,


IIUC the code is just optimized for a specific vector length, not any 
specific core.  It's portable to other cores, just not portable to cores 
with different vector lengths.


So I think we can soften the language a bit there, as it's not like 
we're emitting vendor-specific code on this one.



+  thanks to Pan Li from Intel (https://www.intel.com/).
+  
+  Support for TLS descriptors has been introduced, which can be enabled by
+  the -mtls-dialect=desc option. The default behavior can be
+  configured with --with-tls=[trad|desc].
+  Support for the TLS descriptors, this can be enabled by
+  -mtls-dialect=desc and the default behavior can be configure
+  by --with-tls=[trad|desc], thanks to Tatsuyuki Ishi from
+  Blue Whale Systems (https://bluewhale.systems/)


Maybe should call out that this will require the next glibc release to 
function correctly?



+  
+  Support for the following standard extensions has been added:
+
+  Vector crypto extensions:
+   
+ Zvbb
+ Zvkb
+ Zvbc
+ Zvkg
+ Zvkned
+ Zvkhna
+ Zvkhnb
+ Zvksed
+ Zvksh
+ Zvkn
+ Zvknc
+ Zvkng
+ Zvks
+ Zvksc
+ Zvksg
+ Zvkt
+   
+  
+  Code size reduction extensions:
+   
+ Zca
+ Zcb
+ Zce
+ Zcf
+ Zcd
+ Zcmp
+ Zcmt
+   
+  
+  Zicond
+  Zfa
+  Ztso
+  Zvfbfmin
+  Zvfhmin
+  Zvfh
+  Za64rs
+  Za128rs
+  Ziccif
+  Ziccrse
+  Ziccamoa
+  Zicclsm
+  Zic64b
+  Smaia
+  Smepmp
+  Smstateen
+  Ssaia
+  Sscofpmf
+  Ssstateen
+  Sstc
+  Svinval
+  Svnapot
+  Svpbmt
+
+  
+  Support for the following vendor extensions has been added:
+
+  T-Head:
+   
+ XTheadVector
+   
+  
+  CORE-V:
+   
+ XCVmac
+ XCValu
+ XCVelw
+ XCVsimd
+ XCVbi
+   
+  
+  Ventana Micro:
+   
+ XVentanaCondops
+   
+  
+
+  
+  The following new CPUs are supported through the -mcpu
+  option (GCC identifiers in 

Re: [PATCH] c++: templated substitution into lambda-expr [PR114393]

2024-04-10 Thread Jason Merrill

On 3/27/24 10:01, Patrick Palka wrote:

On Mon, 25 Mar 2024, Patrick Palka wrote:

On Mon, 25 Mar 2024, Patrick Palka wrote:


The below testcases use a lambda-expr as a template argument and they
all trip over the below added tsubst_lambda_expr sanity check ultimately
because current_template_parms is empty, which causes push_template_decl
to return error_mark_node from the call to begin_lambda_type.  Were it
not for the sanity check this silent error_mark_node result leads to
nonsensical errors down the line, or silent breakage.

In the first testcase, we hit this assert during instantiation of the
dependent alias template-id c1_t<_Data> from instantiate_template, which
clears current_template_parms via push_to_top_level.  Similar story for
the second testcase.  For the third testcase we hit the assert during
partial instantiation of the member template from instantiate_class_template
which similarly calls push_to_top_level.

These testcases illustrate that templated substitution into a lambda-expr
is not always possible, in particular when we lost the relevant template
context.  I experimented with recovering the template context by making
tsubst_lambda_expr fall back to using scope_chain->prev->template_parms if
current_template_parms is empty which worked but seemed like a hack.  I
also experimented with preserving the template context by keeping
current_template_parms set during instantiate_template for a dependent
specialization which also worked but it's at odds with the fact that we
cache dependent specializations (and so they should be independent of
the template context).


I suspect the problem comes from this bit in type_unification_real:

  /* First instatiate in template context, in case we still 
 depend on undeduced template parameters.  */

  ++processing_template_decl;
  substed = tsubst_template_arg (arg, full_targs, complain,
 NULL_TREE);
  --processing_template_decl;
  if (substed != error_mark_node
  && !uses_template_parms (substed))
/* We replaced all the tparms, substitute again out of  
   template context.  */

substed = NULL_TREE;


and perhaps we should switch to searching the argument for undeduced 
template parameters rather than doing a trial substitution.


But the pattern of setting processing_template_decl, substituting, and 
clearing it again is very widespread, so we still want to handle lambdas 
in that context.



+  if (processing_template_decl && !in_template_context)
+{
+  /* Defer templated substitution into a lambda-expr when arguments
+are dependent or when we lost the necessary template context,
+which may happen for a lambda-expr used as a template argument.  */


And this comment is stale (an earlier version of the patch also deferred
for dependent arguments even when current_template_parms is non-empty,
which I backed out to make the fix as narrow as possible).


FWIW I also experimented with unconditionally deferring templated
substitution into a lambda-expr (i.e. iff processing_template_decl)
which passed bootstrap+regtest, and turns out to also fix the
(non-regression) PR114167.  I didn't analyze the underlying issue
very closely though, there might very well be a better way to fix
that particular non-regression PR.

One downside of unconditionally deferring is that it'd mean less
ahead-of-time checking of uninvoked deeply-nested generic lambdas,
e.g.:

   int main() {
 [](auto x) {
   [](auto) {
 [](auto) { decltype(x)::fail; }; // not diagnosed anymore
   };
 }(0);
   }


Hmm, unconditionally deferring would probably also help to resolve 
issues with local classes in generic lambdas.  It might be worth going 
that way rather than continue to grapple with partial substitution problems.



diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8cf0d5b7a8d..c25bdd283f1 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -19571,6 +19572,18 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
tree oldfn = lambda_function (t);
in_decl = oldfn;
  
+  args = add_extra_args (LAMBDA_EXPR_EXTRA_ARGS (t), args, complain, in_decl);

+  if (processing_template_decl && !in_template_context)
+{
+  /* Defer templated substitution into a lambda-expr when we lost the
+necessary template context, which may happen for a lambda-expr
+used as a template argument.  */


as a *default* template argument.  OK with that tweak.

Jason



Re: [PATCH] c++/modules: optimize tree flag streaming

2024-04-10 Thread Patrick Palka
On Wed, 10 Apr 2024, Patrick Palka wrote:

> 
> On Tue, 9 Apr 2024, Jason Merrill wrote:
> 
> > On 2/16/24 10:06, Patrick Palka wrote:
> > > On Thu, 15 Feb 2024, Patrick Palka wrote:
> > > 
> > > > One would expect consecutive calls to bytes_in/out::b for streaming
> > > > adjacent bits, as we do for tree flag streaming, to at least be
> > > > optimized by the compiler into individual bit operations using
> > > > statically known bit positions (and ideally merged into larger sized
> > > > reads/writes).
> > 
> > Did you have any thoughts about how feasible it would be to use
> > data-streamer.h?  I didn't see a response to richi's mail.
> 
> IIUC the workhorses of data-streamer.h are
> 
>   for streaming out: bitpack_create / bp_pack_value / streamer_write_bitpack
>   for streaming in:  streamer_read_bitpack / bp_unpack_value
> 
> which use a locally constructed bitpack_d struct for state management,
> much like what this patch proposes, which is a sign that this is a good
> approach I suppose.
> 
> The bit twiddling code is unsurprisingly pretty similar except
> data-streamer.h can stream more than one bit at a time whereas
> bits_in/out::b from this patch can only handle one bit at a time
> (which is by far the common case).  Another difference is that the
> data-streamer.h buffer is a HOST_WIDE_INT while the modules bit buffer
> is uint32_t (this patch doesn't change that).
> 
> Unfortunately it seems data-streamer.h is currently hardcoded for
> LTO streaming since bitpack_d::stream must be an lto_input_block and it
> uses streamer_write_uhwi_stream and streamer_read_uhwi under the hood.
> So we can't use it for modules streaming currently without abstracting
> away this hardcoding AFAICT.
> 
> > 
> > > > Unfortunately this doesn't happen because the compiler has trouble
> > > > tracking the values of this->bit_pos and this->bit_val across such
> > > > calls, likely because the compiler doesn't know 'this' and so it's
> > > > treated as global memory.  This means for each consecutive bit stream
> > > > operation, bit_pos and bit_val are loaded from memory, checked if
> > > > buffering is needed, and finally the bit is extracted from bit_val
> > > > according to the (unknown) bit_pos, even though relative to the previous
> > > > operation (if we didn't need to buffer) bit_val is unchanged and bit_pos
> > > > is just 1 larger.  This ends up being quite slow, with tree_node_bools
> > > > taking 10% of time when streaming in parts of the std module.
> > > > 
> > > > This patch optimizes this by making tracking of bit_pos and bit_val
> > > > easier for the compiler.  Rather than bit_pos and bit_val being members
> > > > of the (effectively global) bytes_in/out objects, this patch factors out
> > > > the bit streaming code/state into separate classes bits_in/out that get
> > > > constructed locally as needed for bit streaming.  Since these objects
> > > > are now clearly local, the compiler can more easily track their values.
> > 
> > Please add this rationale to the bits_in comment.
> 
> Will do.
> 
> > 
> > > > And since bit streaming is intended to be batched it's natural for these
> > > > new classes to be RAII-enabled such that the bit stream is flushed upon
> > > > destruction.
> > > > 
> > > > In order to make the most of this improved tracking of bit position,
> > > > this patch changes parts where we conditionally stream a tree flag
> > > > to unconditionally stream (the flag or a dummy value).  That way
> > > > the number of bits streamed and the respective bit positions are as
> > > > statically known as reasonably possible.  In lang_decl_bools and
> > > > lang_type_bools we flush the current bit buffer at the start so that
> > > > subsequent bit positions are statically known.  And in core bools, we
> > > > can add explicit early exits utilizing invariants that the compiler
> > > > can't figure out itself (e.g. a tree code can't have both TS_TYPE_COMMON
> > > > and TS_DECL_COMMON, and if a tree code doesn't have TS_DECL_COMMON then
> > > > it doesn't have TS_DECL_WITH_VIS).  Finally if we're streaming fewer
> > > > than 4 bits, it's more space efficient to stream them as individual
> > > > bytes rather than as packed bits (due to the 32-bit buffer).
> > > 
> > > Oops, this last sentence is wrong.  Although the size of the bit buffer
> > > is 32 bits, upon flushing we rewind unused bytes within the buffer,
> > > which means streaming 2-8 bits ends up using only one byte not all four.
> > > So v2 below undoes this pessimization.
> > > 
> > > > This patch also moves the definitions of the relevant streaming classes
> > > > into anonymous namespaces so that the compiler can make more informed
> > > > decisions about inlining their member functions.
> > 
> > > I'm curious why you decided to put namespace { } around each class
> > > rather than a larger part of the file?  Not asking for a change, just
> > > curious.
> 
> I don't feel strongly about it, but to me using a separate namespace { }
> for each 

[committed] libstdc++: Adjust expected locale-dependent date formats in tests

2024-04-10 Thread Jonathan Wakely
Tested x86_64-linux and x86_64-freebsd14. Pushed to trunk.

-- >8 --

The std/time/year_month_day/io.cc test assumes that %x in the fr_FR
locale is %d/%m/%Y but on FreeBSD it is %d.%m.%Y instead. Make the test
PASS for either format.

Similarly, 27_io/manipulators/extended/get_time/char/2.cc expects that
%a in the de_DE locale is "Di" but on FreeBSD it's "Di." with a trailing
period. Adjust the input string to be "1971 Di." instead of "Di 1971"
and that way if %a doesn't expect the trailing '.' it simply won't
extract it from the stream.
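
The same trick can be tried outside the testsuite with the classic "C" locale, where %a for Tuesday is "Tue". The helper below is only a sketch with a made-up name (parse_year_then_weekday); it uses English rather than de_DE so that it does not depend on which locales are installed.

```cpp
#include <cassert>
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Parses "<year> <abbreviated weekday>" using the classic "C" locale.
// Returns true if the extraction did not set failbit.
static bool
parse_year_then_weekday (const std::string &text, std::tm &out)
{
  assert (!text.empty ());
  std::istringstream iss (text);
  iss.imbue (std::locale::classic ());
  // %a comes last, so a trailing '.' that %a does not consume simply
  // stays in the stream instead of getting in the way of %Y.
  iss >> std::get_time (&out, "%Y %a");
  return !iss.fail ();
}
```

Calling it on "1971 Tue." fills in the weekday and year fields of a zero-initialized std::tm just as the adjusted test expects for "1971 Di." in de_DE.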

This fixes:
FAIL: std/time/year_month_day/io.cc  -std=gnu++20 execution test
FAIL: 27_io/manipulators/extended/get_time/char/2.cc  -std=gnu++17 execution test

libstdc++-v3/ChangeLog:

* testsuite/27_io/manipulators/extended/get_time/char/2.cc:
Adjust input string so that it matches %a with or without a
trailing period.
* testsuite/std/time/year_month_day/io.cc: Adjust expected
format for %x in the fr_FR locale.
---
 .../27_io/manipulators/extended/get_time/char/2.cc  | 6 +++---
 libstdc++-v3/testsuite/std/time/year_month_day/io.cc| 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/manipulators/extended/get_time/char/2.cc b/libstdc++-v3/testsuite/27_io/manipulators/extended/get_time/char/2.cc
index 6104349d254..b582967fddc 100644
--- a/libstdc++-v3/testsuite/27_io/manipulators/extended/get_time/char/2.cc
+++ b/libstdc++-v3/testsuite/27_io/manipulators/extended/get_time/char/2.cc
@@ -35,9 +35,9 @@ void test01()
   VERIFY( loc_de != loc_c );
   istringstream iss;
   iss.imbue(loc_de);
-  iss.str("Di 1971");
-  tm time1;
-  iss >> get_time(&time1, "%a %Y");
+  iss.str("1971 Di."); // %a is "Di" on some targets, "Di." on others.
+  tm time1{};
+  iss >> get_time(&time1, "%Y %a");
   VERIFY(time1.tm_wday == 2);
   VERIFY(time1.tm_year == 71);
 }
diff --git a/libstdc++-v3/testsuite/std/time/year_month_day/io.cc b/libstdc++-v3/testsuite/std/time/year_month_day/io.cc
index cb82ef3b612..632b7a0fc2d 100644
--- a/libstdc++-v3/testsuite/std/time/year_month_day/io.cc
+++ b/libstdc++-v3/testsuite/std/time/year_month_day/io.cc
@@ -84,7 +84,7 @@ test_format()
   s = std::format(loc_fr, "{:%x}", 2022y/December/19);
   VERIFY( s == "12/19/22" );
   s = std::format(loc_fr, "{:L%x}", 2022y/December/19);
-  VERIFY( s == "19/12/2022" );
+  VERIFY( s == "19/12/2022" || s == "19.12.2022" ); // depends on locale defs
   s = std::format(loc_fr, "{}", 2022y/December/19);
   VERIFY( s == "2022-12-19" );
   s = std::format(loc_fr, "{:L%F}", 2022y/December/19);
-- 
2.44.0



Re: [PATCH] c++/modules: optimize tree flag streaming

2024-04-10 Thread Patrick Palka


On Tue, 9 Apr 2024, Jason Merrill wrote:

> On 2/16/24 10:06, Patrick Palka wrote:
> > On Thu, 15 Feb 2024, Patrick Palka wrote:
> > 
> > > One would expect consecutive calls to bytes_in/out::b for streaming
> > > adjacent bits, as we do for tree flag streaming, to at least be
> > > optimized by the compiler into individual bit operations using
> > > statically known bit positions (and ideally merged into larger sized
> > > reads/writes).
> 
> Did you have any thoughts about how feasible it would be to use
> data-streamer.h?  I didn't see a response to richi's mail.

IIUC the workhorses of data-streamer.h are

  for streaming out: bitpack_create / bp_pack_value / streamer_write_bitpack
  for streaming in:  streamer_read_bitpack / bp_unpack_value

which use a locally constructed bitpack_d struct for state management,
much like what this patch proposes, which is a sign that this is a good
approach I suppose.

The bit twiddling code is unsurprisingly pretty similar except
data-streamer.h can stream more than one bit at a time whereas
bits_in/out::b from this patch can only handle one bit at a time
(which is by far the common case).  Another difference is that the
data-streamer.h buffer is a HOST_WIDE_INT while the modules bit buffer
is uint32_t (this patch doesn't change that).

Unfortunately it seems data-streamer.h is currently hardcoded for
LTO streaming since bitpack_d::stream must be an lto_input_block and it
uses streamer_write_uhwi_stream and streamer_read_uhwi under the hood.
So we can't use it for modules streaming currently without abstracting
away this hardcoding AFAICT.

> 
> > > Unfortunately this doesn't happen because the compiler has trouble
> > > tracking the values of this->bit_pos and this->bit_val across such
> > > calls, likely because the compiler doesn't know 'this' and so it's
> > > treated as global memory.  This means for each consecutive bit stream
> > > operation, bit_pos and bit_val are loaded from memory, checked if
> > > buffering is needed, and finally the bit is extracted from bit_val
> > > according to the (unknown) bit_pos, even though relative to the previous
> > > operation (if we didn't need to buffer) bit_val is unchanged and bit_pos
> > > is just 1 larger.  This ends up being quite slow, with tree_node_bools
> > > taking 10% of time when streaming in parts of the std module.
> > > 
> > > This patch optimizes this by making tracking of bit_pos and bit_val
> > > easier for the compiler.  Rather than bit_pos and bit_val being members
> > > of the (effectively global) bytes_in/out objects, this patch factors out
> > > the bit streaming code/state into separate classes bits_in/out that get
> > > constructed locally as needed for bit streaming.  Since these objects
> > > are now clearly local, the compiler can more easily track their values.
> 
> Please add this rationale to the bits_in comment.

Will do.

> 
> > > And since bit streaming is intended to be batched it's natural for these
> > > new classes to be RAII-enabled such that the bit stream is flushed upon
> > > destruction.
> > > 
> > > In order to make the most of this improved tracking of bit position,
> > > this patch changes parts where we conditionally stream a tree flag
> > > to unconditionally stream (the flag or a dummy value).  That way
> > > the number of bits streamed and the respective bit positions are as
> > > statically known as reasonably possible.  In lang_decl_bools and
> > > lang_type_bools we flush the current bit buffer at the start so that
> > > subsequent bit positions are statically known.  And in core bools, we
> > > can add explicit early exits utilizing invariants that the compiler
> > > can't figure out itself (e.g. a tree code can't have both TS_TYPE_COMMON
> > > and TS_DECL_COMMON, and if a tree code doesn't have TS_DECL_COMMON then
> > > it doesn't have TS_DECL_WITH_VIS).  Finally if we're streaming fewer
> > > than 4 bits, it's more space efficient to stream them as individual
> > > bytes rather than as packed bits (due to the 32-bit buffer).
> > 
> > Oops, this last sentence is wrong.  Although the size of the bit buffer
> > is 32 bits, upon flushing we rewind unused bytes within the buffer,
> > which means streaming 2-8 bits ends up using only one byte not all four.
> > So v2 below undoes this pessimization.
> > 
> > > This patch also moves the definitions of the relevant streaming classes
> > > into anonymous namespaces so that the compiler can make more informed
> > > decisions about inlining their member functions.
> 
> I'm curious why you decided to put namespace { } around each class rather than
> a larger part of the file?  Not asking for a change, just curious.

I don't feel strongly about it, but to me using a separate namespace { }
for each class is consistent with how we use 'static' instead of
namespace { } to give (consecutively defined) free functions internal
linkage, i.e. instead of

namespace {
  void f() { }
  void g() { }
}

we do

   static void 

[committed] libstdc++: Handle EMLINK and EFTYPE in std::filesystem::remove_all

2024-04-10 Thread Jonathan Wakely
Tested x86_64-linux and x86_64-freebsd14. Pushed to trunk.

-- >8 --

Although POSIX requires ELOOP, FreeBSD documents that openat with
O_NOFOLLOW returns EMLINK if the last component of a filename is a
symbolic link.  Check for EMLINK as well as ELOOP, so that the TOCTTOU
mitigation in remove_all works correctly.

See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=214633 or the
FreeBSD man page for reference.

According to its man page, DragonFlyBSD also uses EMLINK for this error,
and NetBSD uses its own EFTYPE. OpenBSD follows POSIX and uses ELOOP.
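
As a rough illustration of the per-platform mapping described above, the
check can be written as a single predicate.  This is only a sketch: the
helper name is_symlink_open_error is hypothetical and does not exist in
libstdc++, where the patch adds the extra case labels directly to the
existing switch.

```cpp
#include <cerrno>

// Hypothetical helper (not part of libstdc++): returns true if ERR is the
// value that openat with O_NOFOLLOW reports for a trailing symlink on the
// current platform.
constexpr bool
is_symlink_open_error (int err) noexcept
{
  switch (err)
    {
    case ELOOP:   // POSIX
#if defined __FreeBSD__ || defined __DragonFly__
    case EMLINK:  // Used instead of ELOOP
#endif
#if defined __NetBSD__ && defined EFTYPE
    case EFTYPE:  // Used instead of ELOOP
#endif
      return true;
    default:
      return false;
    }
}
```

The preprocessor guards keep EMLINK and EFTYPE out of the switch on
platforms where they retain their usual meaning.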

This fixes these failures on FreeBSD:
FAIL: 27_io/filesystem/operations/remove_all.cc  -std=gnu++17 execution test
FAIL: experimental/filesystem/operations/remove_all.cc  -std=gnu++17 execution test

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc (remove_all) [__FreeBSD__ || __DragonFly__]:
Check for EMLINK as well as ELOOP.
[__NetBSD__]: Check for EFTYPE as well as ELOOP.
---
 libstdc++-v3/src/c++17/fs_ops.cc | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 61df19753ef..07bc2a0fa88 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1312,7 +1312,13 @@ fs::remove_all(const path& p)
 // Our work here is done.
 return 0;
   case ENOTDIR:
-  case ELOOP:
+  case ELOOP:  // POSIX says openat with O_NOFOLLOW sets ELOOP for a symlink.
+#if defined __FreeBSD__ || defined __DragonFly__
+  case EMLINK: // Used instead of ELOOP
+#endif
+#if defined __NetBSD__ && defined EFTYPE
+  case EFTYPE: // Used instead of ELOOP
+#endif
 // Not a directory, will remove below.
 break;
 #endif
@@ -1352,7 +1358,13 @@ fs::remove_all(const path& p, error_code& ec)
 ec.clear();
 return 0;
   case ENOTDIR:
-  case ELOOP:
+  case ELOOP:  // POSIX says openat with O_NOFOLLOW sets ELOOP for a symlink.
+#if defined __FreeBSD__ || defined __DragonFly__
+  case EMLINK: // Used instead of ELOOP
+#endif
+#if defined __NetBSD__ && defined EFTYPE
+  case EFTYPE: // Used instead of ELOOP
+#endif
 // Not a directory, will remove below.
 break;
 #endif
-- 
2.44.0



[wwwdocs, committed] Fix link to "Feature Test Macros" in "Porting to GCC 14" page

2024-04-10 Thread Martin Jambor
Hi,

Michal Jireš found out that the link to Feature Test Macros on the
Porting to GCC 14 page was broken: it was missing a "/latest/" directory
in the middle of the path.

I'll commit the following as obvious.

Thanks,

Martin

---
 htdocs/gcc-14/porting_to.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-14/porting_to.html b/htdocs/gcc-14/porting_to.html
index 35274691..c825a68e 100644
--- a/htdocs/gcc-14/porting_to.html
+++ b/htdocs/gcc-14/porting_to.html
@@ -133,7 +133,7 @@ On GNU systems, headers described in standards (such as the 
C
 standard, or POSIX) may require the definition of certain
 macros at the start of the compilation before all required
 function declarations are made available.
-See <a href="https://sourceware.org/glibc/manual/html_node/Feature-Test-Macros.html">Feature Test Macros</a>
+See <a href="https://sourceware.org/glibc/manual/latest/html_node/Feature-Test-Macros.html">Feature Test Macros</a>
 in the GNU C Library manual for details.
 
 
-- 
2.44.0



Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-10 Thread Alex Coplan
Hi Ajit,

On 10/04/2024 15:31, Ajit Agarwal wrote:
> Hello Alex:
> 
> On 10/04/24 1:42 pm, Alex Coplan wrote:
> > Hi Ajit,
> > 
> > On 09/04/2024 20:59, Ajit Agarwal wrote:
> >> Hello Alex:
> >>
> >> On 09/04/24 8:39 pm, Alex Coplan wrote:
> >>> On 09/04/2024 20:01, Ajit Agarwal wrote:
>  Hello Alex:
> 
>  On 09/04/24 7:29 pm, Alex Coplan wrote:
> > On 09/04/2024 17:30, Ajit Agarwal wrote:
> >>
> >>
> >> On 05/04/24 10:03 pm, Alex Coplan wrote:
> >>> On 05/04/2024 13:53, Ajit Agarwal wrote:
>  Hello Alex/Richard:
> 
>  All review comments are incorporated.

>  @@ -2890,8 +3018,8 @@ ldp_bb_info::merge_pairs (insn_list_t 
>  _list,
>   // of accesses.  If we find two sets of adjacent accesses, call
>   // merge_pairs.
>   void
>  -ldp_bb_info::transform_for_base (int encoded_lfs,
>  - access_group )
>  +pair_fusion_bb_info::transform_for_base (int encoded_lfs,
>  + access_group )
>   {
> const auto lfs = decode_lfs (encoded_lfs);
> const unsigned access_size = lfs.size;
>  @@ -2909,7 +3037,7 @@ ldp_bb_info::transform_for_base (int 
>  encoded_lfs,
>  access.cand_insns,
>  lfs.load_p,
>  access_size);
>  -  skip_next = access.cand_insns.empty ();
>  +  skip_next = bb_state->cand_insns_empty_p (access.cand_insns);
> >>>
> >>> As above, why is this needed?
> >>
> >> For rs6000 we want to return always true. as load store pair
> >> that are to be merged with 8/16 16/32 32/64 is occurring for rs6000.
> >> And we want load store pair to 8/16 32/64. Thats why we want
> >> to generate always true for rs6000 to skip pairs as above.
> >
> > Hmm, sorry, I'm not sure I follow.  Are you saying that for rs6000 you 
> > have
> > load/store pair instructions where the two arms of the access are 
> > storing
> > operands of different sizes?  Or something else?
> >
> > As it stands the logic is to skip the next iteration only if we
> > exhausted all the candidate insns for the current access.  In the case
> > that we didn't exhaust all such candidates, then the idea is that when
> > access becomes prev_access, we can attempt to use those candidates as
> > the "left-hand side" of a pair in the next iteration since we failed to
> > use them as the "right-hand side" of a pair in the current iteration.
> > I don't see why you wouldn't want that behaviour.  Please can you
> > explain?
> >
> 
>  In merge_pair we get the 2 load candidates one load from 0 offset and
>  other load is from 16th offset. Then in next iteration we get load
>  from 16th offset and other load from 32 offset. In next iteration
>  we get load from 32 offset and other load from 48 offset.
> 
>  For example:
> 
>  Currently we get the load candidates as follows.
> 
>  pairs:
> 
>  load from 0th offset.
>  load from 16th offset.
> 
>  next pairs:
> 
>  load from 16th offset.
>  load from 32th offset.
> 
>  next pairs:
> 
>  load from 32th offset
>  load from 48th offset.
> 
>  Instead in rs6000 we should get:
> 
>  pairs:
> 
>  load from 0th offset
>  load from 16th offset.
> 
>  next pairs:
> 
>  load from 32th offset
>  load from 48th offset.
> >>>
> >>> Hmm, so then I guess my question is: why wouldn't you consider merging
> >>> the pair with offsets (16,32) for rs6000?  Is it because you have a
> >>> stricter alignment requirement on the base pair offsets (i.e. they have
> >>> to be a multiple of 32 when the operand size is 16)?  So the pair
> >>> offsets have to be a multiple of the entire pair size rather than a
> >>> single operand size?
> >>
> >> We get load pair at a certain point with (0,16) and other program
> >> point we get load pair (32, 48).
> >>
> >> In current implementation it takes offsets loads as (0, 16),
> >> (16, 32), (32, 48).
> >>
> >> But In rs6000 we want  the load pair to be merged at different points
> >> as (0,16) and (32, 48). for (0,16) we want to replace load lxvp with
> >> 0 offset and other load (32, 48) with lxvp with 32 offset.
> >>
> >> In current case it will merge with lxvp with 0 offset and lxvp with
> >> 16 offset, then lxvp with 32 offset and lxvp with 48 offset which
> >> is incorrect in our case as the (16-32) case 16 offset will not
> >> load from even register and break for rs6000.
> > 
> > Sorry, I think I'm still missing something here.  Why does the address 
> > offset
> > affect the parity of the transfer register?  ISTM they needn't be related at
> > all (and indeed we can't even know the parity of the transfer register 
> > before
> > RA, 

Re: [PATCH] Regenerate opt.urls

2024-04-10 Thread Palmer Dabbelt

On Wed, 10 Apr 2024 00:57:59 PDT (-0700), sch...@suse.de wrote:

On Apr 09 2024, Palmer Dabbelt wrote:


I didn't actually regenerate this as I can't figure out how,


make regenerate-opt-urls


Ya, that's what the CI says too.  I think I might just have a broken 
build tree, something is mixed up and it picked up a host binutils.  
Looks like there's already a patch over here 
, 
so we should be good.


Re: [PATCH] Regenerate opt.urls

2024-04-10 Thread Palmer Dabbelt

On Tue, 09 Apr 2024 07:57:24 PDT (-0700), ishitatsuy...@gmail.com wrote:

Fixes: 97069657c4e ("RISC-V: Implement TLS Descriptors.")

gcc/ChangeLog:
* config/riscv/riscv.opt.urls: Regenerated.
---
 gcc/config/riscv/riscv.opt.urls | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/riscv/riscv.opt.urls b/gcc/config/riscv/riscv.opt.urls
index da31820e234..351f7f0dda2 100644
--- a/gcc/config/riscv/riscv.opt.urls
+++ b/gcc/config/riscv/riscv.opt.urls
@@ -89,3 +89,5 @@ UrlSuffix(gcc/RISC-V-Options.html#index-minline-strncmp)
 minline-strlen
 UrlSuffix(gcc/RISC-V-Options.html#index-minline-strlen)

+; skipping UrlSuffix for 'mtls-dialect=' due to finding no URLs
+


Thanks.  I had another one over here 
, 
but let's go with yours -- I think the actual contents are the same, but 
I didn't actually run the regenerate script.  So


Reviewed-by: Palmer Dabbelt 
Acked-by: Palmer Dabbelt 


Re: [PATCH] [testsuite] Fix pretty printers regexps for GDB output

2024-04-10 Thread Christophe Lyon
ping?

On Tue, 6 Feb 2024 at 10:26, Christophe Lyon  wrote:
>
> ping?
>
> On Thu, 25 Jan 2024 at 16:54, Christophe Lyon
>  wrote:
> >
> > On Wed, 24 Jan 2024 at 12:02, Jonathan Wakely  wrote:
> > >
> > > On Wed, 24 Jan 2024 at 10:48, Christophe Lyon wrote:
> > > >
> > > > GDB emits end of lines as \r\n, we currently match the reverse \n\r,
> > >
> > > We currently match [\n\r]+ which should match any of \n, \r, \n\r or \r\n
> > >
> >
> > Hmm, right, sorry I had this patch in my tree for quite some time, but
> > wrote the description just now, so I read a bit too quickly.
> >
> > >
> > > > possibly leading to mismatches under racy conditions.
> > >
> > > What do we incorrectly match? Is the problem that a \r\n sequence
> > > might be incompletely printed, due to buffering, and so the regex only
> > > sees (and matches) the \r which then leaves an unwanted \n in the
> > > stream, which then interferes with the next match? I don't understand
> > > why that problem wouldn't just result in a failed match with your new
> > > regex though.
> > >
> > Exactly: READ1 forces read() to return 1 byte at a time, so we leave
> > an unwanted \n in front of a string that should otherwise match the
> > "got" case.
> >
> > >
> > > >
> > > > I noticed this while running the GCC testsuite using the equivalent of
> > > > > GDB's READ1 feature [1] which helps detect buffering issues.
> > > >
> > > > Adjusting the first regexp to match the right order implied fixing the
> > > > second one, to skip the empty lines.
> > >
> > > At the very least, this part of the description is misleading. The
> > > existing regex matches "the right order" already. The change is to
> > > match *exactly* \r\n instead of any mix of CR and LF characters.
> > > That's not about matching "the right order", it's being more precise
> > > in what we match.
> > >
> > > But I'm still confused about what the failure scenario is and how the
> > > change fixes it.
> > >
> >
> > I followed what the GDB testsuite does (it matches \r\n at the end of
> > many regexps), but in fact it seems it's not needed:
> > it works if I update the "got" regexp like this (ie. accept any number
> > of leading \r or \n):
> > -   -re {^(type|\$([0-9]+)) = ([^\n\r]*)[\n\r]+} {
> > +   -re {^[\n\r]*(type|\$([0-9]+)) = ([^\n\r]*)[\n\r]+} {
> > and leave the "skipping" regexp as it is currently.
> >
> > Is the new attached version OK?
> >
> > Thanks,
> >
> > Christophe
> >
> > > >
> > > > Tested on aarch64-linux-gnu.
> > > >
> > > > [1] 
> > > > > https://github.com/bminor/binutils-gdb/blob/master/gdb/testsuite/README#L269
> > > >
> > > > 2024-01-24  Christophe Lyon  
> > > >
> > > > libstdc++-v3/
> > > > * testsuite/lib/gdb-test.exp (gdb-test): Fix regexps.
> > > > ---
> > > >  libstdc++-v3/testsuite/lib/gdb-test.exp | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/libstdc++-v3/testsuite/lib/gdb-test.exp 
> > > > b/libstdc++-v3/testsuite/lib/gdb-test.exp
> > > > index 31206f2fc32..0de8d9ee153 100644
> > > > --- a/libstdc++-v3/testsuite/lib/gdb-test.exp
> > > > +++ b/libstdc++-v3/testsuite/lib/gdb-test.exp
> > > > @@ -194,7 +194,7 @@ proc gdb-test { marker {selector {}} {load_xmethods 
> > > > 0} } {
> > > >
> > > >  set test_counter 0
> > > >  remote_expect target [timeout_value] {
> > > > -   -re {^(type|\$([0-9]+)) = ([^\n\r]*)[\n\r]+} {
> > > > +   -re {^(type|\$([0-9]+)) = ([^\n\r]*)\r\n} {
> > > > send_log "got: $expect_out(buffer)"
> > > >
> > > > incr test_counter
> > > > @@ -250,7 +250,7 @@ proc gdb-test { marker {selector {}} {load_xmethods 
> > > > 0} } {
> > > > return
> > > > }
> > > >
> > > > -   -re {^[^$][^\n\r]*[\n\r]+} {
> > > > +   -re {^[\r\n]*[^$][^\n\r]*\r\n} {
> > > > send_log "skipping: $expect_out(buffer)"
> > > > exp_continue
> > > > }
> > > > --
> > > > 2.34.1
> > > >
> > >


Re: [PING^5] Re: [PATCH] analyzer: deal with -fshort-enums

2024-04-10 Thread Torbjorn SVENSSON

Ping!

Kind regards,
Torbjörn

On 2024-03-25 15:59, Yvan ROUX - foss wrote:

Ping!

Rgds,
Yvan

From: Torbjorn SVENSSON - foss
Sent: Friday, March 15, 2024 11:32 AM
To: David Malcolm; Alexandre Oliva
Cc: gcc-patches@gcc.gnu.org; Yvan ROUX - foss
Subject: [PING^3] Re: [PATCH] analyzer: deal with -fshort-enums

Ping!

Kind regards,
Torbjörn

On 2024-03-08 10:14, Torbjorn SVENSSON wrote:

Ping!

Kind regards,
Torbjörn

On 2024-02-22 09:51, Torbjorn SVENSSON wrote:

Ping!

Kind regards,
Torbjörn

On 2024-02-07 17:21, Torbjorn SVENSSON wrote:

Hi,

Is it okay to backport 3cbab07b08d2f3a3ed34b6ec12e67727c59d285c to
releases/gcc-13?

Without this backport, I see these failures on arm-none-eabi:

FAIL: gcc.dg/analyzer/switch-enum-1.c  (test for bogus messages, line 26)
FAIL: gcc.dg/analyzer/switch-enum-1.c  (test for bogus messages, line 44)
FAIL: gcc.dg/analyzer/switch-enum-2.c  (test for bogus messages, line 34)
FAIL: gcc.dg/analyzer/switch-enum-2.c  (test for bogus messages, line 52)
FAIL: gcc.dg/analyzer/torture/switch-enum-pr105273-doom-p_floor.c -O0  (test for bogus messages, line 82)
FAIL: gcc.dg/analyzer/torture/switch-enum-pr105273-doom-p_maputl.c -O0  (test for bogus messages, line 83)

Kind regards,
Torbjörn


On 2023-12-06 23:22, David Malcolm wrote:

On Wed, 2023-12-06 at 02:31 -0300, Alexandre Oliva wrote:

On Nov 22, 2023, Alexandre Oliva  wrote:


Ah, nice, that's a great idea, I wish I'd thought of that!  Will
do.


Sorry it took me so long, here it is.  I added two tests, so that,
regardless of the defaults, we get both circumstances tested, without
repetition.

Regstrapped on x86_64-linux-gnu.  Also tested on arm-eabi.  Ok to
install?


Thanks for the updated patch.

Looks good to me.

Dave




analyzer: deal with -fshort-enums

On platforms that enable -fshort-enums by default, various switch-enum
analyzer tests fail, because apply_constraints_for_gswitch doesn't
expect the integral promotion type cast.  I've arranged for the code
to cope with those casts.


for  gcc/analyzer/ChangeLog

  * region-model.cc (has_nondefault_case_for_value_p): Take
  enumerate type as a parameter.
  (region_model::apply_constraints_for_gswitch): Cope with
  integral promotion type casts.

for  gcc/testsuite/ChangeLog

  * gcc.dg/analyzer/switch-short-enum-1.c: New.
  * gcc.dg/analyzer/switch-no-short-enum-1.c: New.
---
   gcc/analyzer/region-model.cc   |   27 +++-
   .../gcc.dg/analyzer/switch-no-short-enum-1.c   |  141

   .../gcc.dg/analyzer/switch-short-enum-1.c  |  140

   3 files changed, 304 insertions(+), 4 deletions(-)
   create mode 100644 gcc/testsuite/gcc.dg/analyzer/switch-no-short-
enum-1.c
   create mode 100644 gcc/testsuite/gcc.dg/analyzer/switch-short-enum-
1.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
model.cc
index 2157ad2578b85..6a7a8bc9f4884 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5387,10 +5387,10 @@ has_nondefault_case_for_value_p (const
gswitch *switch_stmt, tree int_cst)
  has nondefault cases handling all values in the enum.  */
   static bool
-has_nondefault_cases_for_all_enum_values_p (const gswitch
*switch_stmt)
+has_nondefault_cases_for_all_enum_values_p (const gswitch
*switch_stmt,
+   tree type)
   {
 gcc_assert (switch_stmt);
-  tree type = TREE_TYPE (gimple_switch_index (switch_stmt));
 gcc_assert (TREE_CODE (type) == ENUMERAL_TYPE);
 for (tree enum_val_iter = TYPE_VALUES (type);
@@ -5426,6 +5426,23 @@ apply_constraints_for_gswitch (const
switch_cfg_superedge ,
   {
 tree index  = gimple_switch_index (switch_stmt);
 const svalue *index_sval = get_rvalue (index, ctxt);
+  bool check_index_type = true;
+
+  /* With -fshort-enum, there may be a type cast.  */
+  if (ctxt && index_sval->get_kind () == SK_UNARYOP
+  && TREE_CODE (index_sval->get_type ()) == INTEGER_TYPE)
+{
+  const unaryop_svalue *unaryop = as_a 
(index_sval);
+  if (unaryop->get_op () == NOP_EXPR
+ && is_a  (unaryop->get_arg ()))
+   if (const initial_svalue *initvalop = (as_a 
+  (unaryop->get_arg
(
+ if (TREE_CODE (initvalop->get_type ()) == ENUMERAL_TYPE)
+   {
+ index_sval = initvalop;
+ check_index_type = false;
+   }
+}
 /* If we're switching based on an enum type, assume that the user
is only
working with values from the enum.  Hence if this is an
@@ -5437,12 +5454,14 @@ apply_constraints_for_gswitch (const
switch_cfg_superedge ,
 ctxt
 /* Must be an enum value.  */
 && index_sval->get_type ()
-  && TREE_CODE (TREE_TYPE (index)) == ENUMERAL_TYPE
+  && (!check_index_type
+ || TREE_CODE (TREE_TYPE (index)) == ENUMERAL_TYPE)
 && TREE_CODE 

[PATCH] c++: Fix ANNOTATE_EXPR instantiation [PR114409]

2024-04-10 Thread Jakub Jelinek
Hi!

The following testcase ICEs starting with the r14-4229 PR111529
change which moved ANNOTATE_EXPR handling from tsubst_expr to
tsubst_copy_and_build.
ANNOTATE_EXPR is only allowed in the IL to wrap a loop condition,
and the loop condition of while/for loops can be a COMPOUND_EXPR
with DECL_EXPR in the first operand and the corresponding VAR_DECL
in the second, as created by finish_cond
  else if (!empty_expr_stmt_p (cond))
expr = build2 (COMPOUND_EXPR, TREE_TYPE (expr), cond, expr);
Since then Patrick reworked the instantiation, so that we have now
tsubst_stmt and tsubst_expr and ANNOTATE_EXPR ended up in the latter,
while only tsubst_stmt can handle DECL_EXPR.

Now, the reason why the while/for loops with variable declaration
in the condition works in templates without the pragmas (i.e. without
ANNOTATE_EXPR) is that both the FOR_STMT and WHILE_STMT handling uses
RECUR aka tsubst_stmt in handling of the *_COND operand:
case FOR_STMT:
  stmt = begin_for_stmt (NULL_TREE, NULL_TREE);
  RECUR (FOR_INIT_STMT (t));
  finish_init_stmt (stmt);
  tmp = RECUR (FOR_COND (t));
  finish_for_cond (tmp, stmt, false, 0, false);
and
case WHILE_STMT:
  stmt = begin_while_stmt ();
  tmp = RECUR (WHILE_COND (t));
  finish_while_stmt_cond (tmp, stmt, false, 0, false);
Therefore, it will handle DECL_EXPR embedded in COMPOUND_EXPR of the
{WHILE,FOR}_COND just fine.
But if that COMPOUND_EXPR with DECL_EXPR is wrapped with one or more
ANNOTATE_EXPRs, because ANNOTATE_EXPR is now done solely in tsubst_expr
and uses RECUR there (i.e. tsubst_expr), it will ICE on DECL_EXPR in there.

Here are 2 possible fixes for this.
The first one keeps ANNOTATE_EXPR handling in tsubst_expr but uses
tsubst_stmt for the first operand.
The second one moves ANNOTATE_EXPR handling to tsubst_stmt (and uses
tsubst_expr for the second/third operand (it could just RECUR too if you
prefer that)).
Yet another possibility could be to duplicate the ANNOTATE_EXPR handling
from tsubst_expr to tsubst_stmt, where both would just RECUR on its
operands, so if one arrives to ANNOTATE_EXPR from tsubst_stmt, it will
tsubst_stmt recursively, if from tsubst_expr (when?) then it would handle
it using tsubst_expr.

So far just lightly tested (but g++.dg/ext/unroll-4.C and the new test
both pass with both versions of the patch), what do you prefer?  I'd like
to avoid testing too many variants...

2024-04-10  Jakub Jelinek  

PR c++/114409
* pt.cc (tsubst_expr) <case ANNOTATE_EXPR>: Use tsubst_stmt rather
than tsubst_expr aka RECUR on op1.

* g++.dg/ext/pr114409-2.C: New test.

--- gcc/cp/pt.cc.jj 2024-04-09 09:29:04.721521726 +0200
+++ gcc/cp/pt.cc2024-04-10 14:38:43.591554947 +0200
@@ -21774,7 +21774,10 @@ tsubst_expr (tree t, tree args, tsubst_f
 
 case ANNOTATE_EXPR:
   {
-   op1 = RECUR (TREE_OPERAND (t, 0));
+   /* ANNOTATE_EXPR should only appear in WHILE_COND, DO_COND or
+  FOR_COND expressions, which are tsubsted using tsubst_stmt
+  rather than tsubst_expr and can contain DECL_EXPRs.  */
+   op1 = tsubst_stmt (TREE_OPERAND (t, 0), args, complain, in_decl);
tree op2 = RECUR (TREE_OPERAND (t, 1));
tree op3 = RECUR (TREE_OPERAND (t, 2));
if (TREE_CODE (op2) == INTEGER_CST
--- gcc/testsuite/g++.dg/ext/pr114409-2.C.jj2024-04-10 14:35:19.693300552 
+0200
+++ gcc/testsuite/g++.dg/ext/pr114409-2.C   2024-04-10 14:35:13.513383766 
+0200
@@ -0,0 +1,36 @@
+// PR c++/114409
+// { dg-do compile }
+// { dg-options "-O2" }
+
+template <typename T>
+T
+foo (T)
+{
+  static T t;
+  return 42 - ++t;
+}
+
+template <typename T>
+void
+bar (T x)
+{
+  #pragma GCC novector
+  while (T y = foo (x))
+++y;
+}
+
+template <typename T>
+void
+baz (T x)
+{
+  #pragma GCC novector
+  for (; T y = foo (x); )
+++y;
+}
+
+void
+qux ()
+{
+  bar (0);
+  baz (0);
+}

Jakub
2024-04-10  Jakub Jelinek  

PR c++/114409
* pt.cc (tsubst_expr) <case ANNOTATE_EXPR>: Move to ...
(tsubst_stmt) <case ANNOTATE_EXPR>: ... here.  Use tsubst_expr
instead of RECUR for the last 2 arguments.

* g++.dg/ext/pr114409-2.C: New test.

--- gcc/cp/pt.cc.jj 2024-04-09 09:29:04.721521726 +0200
+++ gcc/cp/pt.cc2024-04-10 14:45:25.527142692 +0200
@@ -19433,6 +19433,23 @@ tsubst_stmt (tree t, tree args, tsubst_f
 case PREDICT_EXPR:
   RETURN (add_stmt (copy_node (t)));
 
+case ANNOTATE_EXPR:
+  {
+   /* Although ANNOTATE_EXPR is an expression, it can only appear in
+  WHILE_COND, DO_COND or FOR_COND expressions, which are tsubsted
+  using tsubst_stmt rather than tsubst_expr and can contain
+  DECL_EXPRs.  */
+   tree op1 = RECUR (TREE_OPERAND (t, 0));
+   tree op2 = tsubst_expr (TREE_OPERAND (t, 1), args, complain, in_decl);
+   tree op3 = tsubst_expr (TREE_OPERAND (t, 2), args, complain, in_decl);
+   if (TREE_CODE (op2) == INTEGER_CST
+   && wi::to_widest (op2) == (int) annot_expr_unroll_kind)
+ op3 = 

Re: [PATCH 2/5] aarch64: Don't use FEAT_MAX as array length

2024-04-10 Thread Andrew Carlotti
On Tue, Apr 09, 2024 at 04:33:10PM +0100, Richard Sandiford wrote:
> Andrew Carlotti  writes:
> > There was an assumption in some places that the aarch64_fmv_feature_data
> > array contained FEAT_MAX elements.  While this assumption held up till
> > now, it is safer and more flexible to use the array size directly.
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64.cc (compare_feature_masks):
> > Use ARRAY_SIZE to determine iteration bounds.
> > (aarch64_mangle_decl_assembler_name): Ditto.
> >
> >
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index 
> > 1ea84c8bd7386e399f6ffa3a5e36408cf8831fc6..5de842fcc212c78beba1fa99639e79562d718579
> >  100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -19899,7 +19899,8 @@ compare_feature_masks (aarch64_fmv_feature_mask 
> > mask1,
> >auto diff_mask = mask1 ^ mask2;
> >if (diff_mask == 0ULL)
> >  return 0;
> > -  for (int i = FEAT_MAX - 1; i > 0; i--)
> > +  static const int num_features = ARRAY_SIZE (aarch64_fmv_feature_data);
> 
> There doesn't seem any need for this to be static (or const).  Same for
> the second hunk.

Agreed - I'll fix that, and the other instance I added in a previous patch.

I originally copied this pattern from my driver-aarch64.c:252, which was added
by Kyrill back in 2015.

> > +  for (int i = num_features - 1; i > 0; i--)
> 
> Pre-existing, but is > 0 rather than >= 0 deliberate?  Shouldn't we look
> at index 0 as well?

That was probably left over from when "default" was handled as part of the
list.  I think a different instance of this mistake was mentioned in a previous
review.  I'll fix this mistake and add a test.
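
To illustrate both review points (deriving the bound from the array itself
and including index 0), here is a small self-contained sketch.  It is not the
aarch64.cc code: the table contents, fmv_feature, feature_data and
compare_masks are hypothetical stand-ins for aarch64_fmv_feature_data and
compare_feature_masks, and the trailing assert only plays the role of an
unreachable marker.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for aarch64_fmv_feature_data.  Later entries are
// treated as higher priority, as in the loop under review.
struct fmv_feature
{
  const char *name;
  std::uint64_t feature_mask;
};

static const fmv_feature feature_data[] = {
  { "feat_a", 1ULL << 0 },
  { "feat_b", 1ULL << 1 },
  { "feat_c", 1ULL << 2 },
};

// Returns 1 if MASK1 has the highest-priority feature on which the two
// masks differ, -1 if MASK2 has it, and 0 if the masks are equal.
static int
compare_masks (std::uint64_t mask1, std::uint64_t mask2)
{
  const std::uint64_t diff_mask = mask1 ^ mask2;
  if (diff_mask == 0)
    return 0;

  // The bound comes from the array, not from a separate enum constant;
  // a plain local suffices (no need for static or const storage).
  int num_features = sizeof (feature_data) / sizeof (feature_data[0]);

  // Iterate down to and including index 0.
  for (int i = num_features - 1; i >= 0; i--)
    {
      const std::uint64_t bit_mask = feature_data[i].feature_mask;
      if (diff_mask & bit_mask)
        return (mask1 & bit_mask) ? 1 : -1;
    }

  assert (!"masks differ only in bits missing from the table");
  return 0;
}
```

With "i > 0" as the loop condition, two masks that differ only in the first
table entry would fall through the loop; with "i >= 0" they compare as
expected.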

> LGTM otherwise.
> 
> Thanks,
> Richard
> 
> >  {
> >auto bit_mask = aarch64_fmv_feature_data[i].feature_mask;
> >if (diff_mask & bit_mask)
> > @@ -19982,7 +19983,8 @@ aarch64_mangle_decl_assembler_name (tree decl, tree 
> > id)
> >  
> >name += "._";
> >  
> > -  for (int i = 0; i < FEAT_MAX; i++)
> > +  static const int num_features = ARRAY_SIZE 
> > (aarch64_fmv_feature_data);
> > +  for (int i = 0; i < num_features; i++)
> > {
> >   if (feature_mask & aarch64_fmv_feature_data[i].feature_mask)
> > {


Re: [PATCH 0/5] aarch64: FMV feature list fixes

2024-04-10 Thread Andrew Carlotti
On Tue, Apr 09, 2024 at 04:43:16PM +0100, Richard Sandiford wrote:
> Andrew Carlotti  writes:
> > The first three patches are trivial changes to the feature list to reflect
> > recent changes in the ACLE.  Patch 4 removes most of the FMV multiversioning
> > features that don't work at the moment, and should be entirely 
> > uncontroversial.
> >
> > Patch 5 handles the remaining cases, where there's an inconsistency in how
> > features are named in the current FMV specification compared to the existing
> > command line options.  It might be better to instead preserve the "memtag2",
> > "ssbs2" and "ls64_accdata" names for now; I'd be happy to commit either
> > version.
> 
> Yeah, I suppose patch 5 leaves things in a somewhat awkward state,
> since e.g.:
> 
> -AARCH64_OPT_FMV_EXTENSION("memtag", MEMTAG, (), (), (), "")
> +AARCH64_OPT_EXTENSION("memtag", MEMTAG, (), (), (), "")
>  
> -AARCH64_FMV_FEATURE("memtag2", MEMTAG2, (MEMTAG))
> +AARCH64_FMV_FEATURE("memtag", MEMTAG2, (MEMTAG))
> 
> seems to drop "memtag2" and FEAT_MEMTAG, but keep "memtag" and
> FEAT_MEMTAG2.  Is that right?

That's deliberate. The FEAT_MEMTAG bit in __aarch64_cpu_features is defined to
match the definition of FEAT_MTE in the architecture, and likewise for
FEAT_MEMTAG2/FEAT_MTE2.  However, in Binutils the "+memtag" extension enables
both FEAT_MTE and FEAT_MTE2 instructions (although none of the FEAT_MTE2
instructions can be generated from GCC without inline assembly).  The FMV
specification in the ACLE currently uses names "memtag" and "memtag2" that
match the architecture names, but arguably don't match the command line
extension names.  I'm advocating for that to change to match the extension
names in command line options.

The LS64 example is definitely an inconsistency, since GCC uses "+ls64" to
enable all of the FEAT_LS64/FEAT_LS64_V/FEAT_LS64_ACCDATA intrinsics.

There were similar issues with "sha1", "pmull" and "sve2-pmull128", but in
these cases their presence architecturally is implied by the presence of the
features checked for "sha2", "aes" and "sve2-aes" so it's fine to just delete
the ones without command line flags.

> Apart from that and the comment on patch 2, the series looks good to me.
> 
> While rechecking aarch64-option-extensions.def against the ACLE list:
> it seems that the .def doesn't treat mops as an FMV feature.  Is that
> deliberate?

"mops" was added to the ACLE list later, and libgcc doesn't yet support
detecting it.  I didn't think it was sensible to add new FMV feature support at
this stage.

> Thanks,
> Richard


C++ Patch ping^2

2024-04-10 Thread Jakub Jelinek
Hi!

On Wed, Apr 03, 2024 at 11:48:20AM +0200, Jakub Jelinek wrote:
> I'd like to ping the following patches:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647445.html
> PR111284 P2
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648215.html
> PR114409 (part of a P1)
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/648381.html
> PR114426 P1

Thanks

Jakub



Re: [PATCH 0/2] mmap: Avoid the sanitizer configure check failure

2024-04-10 Thread H.J. Lu
On Tue, Apr 9, 2024 at 10:39 PM Alan Modra  wrote:
>
> On Tue, Apr 09, 2024 at 07:24:33AM -0700, H.J. Lu wrote:
> > Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid
> > the sanitizer configure check failure.
>
> OK for binutils.  (I just fixed my local copy of autoconf so I
> wouldn't run into this again.)  The proper fix of course is to update
> autotools to something more recent.
>

This is what I am checking in with:

dnl
dnl Avoid the sanitizer run-time memory leak failure in the mmap configure
dnl test.  This should be removed when autoconf with commit:
dnl
dnl commit 09b6e78d1592ce10fdc975025d699ee41444aa3f
dnl Author: Paul Eggert 
dnl Date:   Fri Feb 5 21:06:20 2016 -0800
dnl Fix memory leak in AC_FUNC_MMAP
dnl
dnl * lib/autoconf/functions.m4 (AC_FUNC_MMAP): Fix memory leak
dnl in test case, found by configuring with gcc -fsanitize=address.
dnl
dnl is in use.
dnl

Thanks.

-- 
H.J.
From b0c2d5417fc216eeaacf7f2cd34109f438fa9aa7 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 9 Apr 2024 06:39:21 -0700
Subject: [PATCH] mmap: Avoid the sanitizer configure check failure

When -fsanitize=address,undefined is used to build, the mmap configure
check failed with

=
==231796==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4096 byte(s) in 1 object(s) allocated from:
#0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x5750c7f6d72b in main /home/alan/build/gas-san/all/bfd/conftest.c:239

Direct leak of 4096 byte(s) in 1 object(s) allocated from:
#0 0x7cdd3d0defdf in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x5750c7f6d2e1 in main /home/alan/build/gas-san/all/bfd/conftest.c:190

SUMMARY: AddressSanitizer: 8192 byte(s) leaked in 2 allocation(s).

Define GCC_AC_FUNC_MMAP with export ASAN_OPTIONS=detect_leaks=0 to avoid
the sanitizer configure check failure.

config/

	* mmap.m4 (GCC_AC_FUNC_MMAP): New.
	* no-executables.m4 (AC_FUNC_MMAP): Renamed to GCC_AC_FUNC_MMAP.
	Change AC_FUNC_MMAP to GCC_AC_FUNC_MMAP.

libiberty/

	* Makefile.in (aclocal_deps): Add $(srcdir)/../config/mmap.m4.
	* acinclude.m4: Change AC_FUNC_MMAP to GCC_AC_FUNC_MMAP.
	* aclocal.m4: Regenerated.
	* configure: Likewise.

zlib/

	* acinclude.m4: Include ../config/mmap.m4.
	* Makefile.in: Regenerated.
	* configure: Likewise.
---
 config/mmap.m4   | 22 ++
 config/no-executables.m4 |  4 ++--
 libiberty/Makefile.in|  1 +
 libiberty/acinclude.m4   |  2 +-
 libiberty/aclocal.m4 |  1 +
 libiberty/configure  |  5 +
 zlib/Makefile.in |  2 +-
 zlib/acinclude.m4|  1 +
 zlib/configure   |  7 ---
 9 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/config/mmap.m4 b/config/mmap.m4
index fba0d9d3657..326b97b91f4 100644
--- a/config/mmap.m4
+++ b/config/mmap.m4
@@ -95,3 +95,25 @@ if test $gcc_cv_func_mmap_anon = yes; then
 	[Define if mmap with MAP_ANON(YMOUS) works.])
 fi
 ])
+
+dnl
+dnl Avoid the sanitizer run-time memory leak failure in the mmap configure
+dnl test.  This should be removed when autoconf with commit:
+dnl
+dnl commit 09b6e78d1592ce10fdc975025d699ee41444aa3f
+dnl Author: Paul Eggert 
+dnl Date:   Fri Feb 5 21:06:20 2016 -0800
+dnl Fix memory leak in AC_FUNC_MMAP
+dnl
+dnl * lib/autoconf/functions.m4 (AC_FUNC_MMAP): Fix memory leak
+dnl in test case, found by configuring with gcc -fsanitize=address.
+dnl
+dnl is in use.
+dnl
+AC_DEFUN([GCC_AC_FUNC_MMAP],
+  save_ASAN_OPTIONS="$ASAN_OPTIONS"
+  ASAN_OPTIONS=detect_leaks=0
+  export ASAN_OPTIONS
+  m4_defn([AC_FUNC_MMAP])
+  ASAN_OPTIONS="$save_ASAN_OPTIONS"
+)
diff --git a/config/no-executables.m4 b/config/no-executables.m4
index 6842f84fba3..e8e2537bde5 100644
--- a/config/no-executables.m4
+++ b/config/no-executables.m4
@@ -49,14 +49,14 @@ m4_defn([AC_LINK_IFELSE]))
 
 dnl This is a shame.  We have to provide a default for some link tests,
 dnl similar to the default for run tests.
-m4_define([AC_FUNC_MMAP],
+m4_define([GCC_AC_FUNC_MMAP],
 if test x$gcc_no_link = xyes; then
   if test "x${ac_cv_func_mmap_fixed_mapped+set}" != xset; then
 ac_cv_func_mmap_fixed_mapped=no
   fi
 fi
 if test "x${ac_cv_func_mmap_fixed_mapped}" != xno; then
-  m4_defn([AC_FUNC_MMAP])
+  m4_defn([GCC_AC_FUNC_MMAP])
 fi)
 
 m4_divert_pop()dnl
diff --git a/libiberty/Makefile.in b/libiberty/Makefile.in
index 85c4b6b6ef8..b77a41c781c 100644
--- a/libiberty/Makefile.in
+++ b/libiberty/Makefile.in
@@ -508,6 +508,7 @@ aclocal_deps = \
 	$(srcdir)/../config/cet.m4 \
 	$(srcdir)/../config/enable.m4 \
 	$(srcdir)/../config/gcc-plugin.m4 \
+	$(srcdir)/../config/mmap.m4 \
 	$(srcdir)/../config/no-executables.m4 \
 	$(srcdir)/../config/override.m4 \
 	$(srcdir)/../config/picflag.m4 \
diff --git a/libiberty/acinclude.m4 b/libiberty/acinclude.m4
index 9974dcd4ec5..d08e31bc0b5 100644
--- a/libiberty/acinclude.m4
+++ 

[PATCH 5/4] libstdc++: Rewrite std::variant comparisons without macros

2024-04-10 Thread Jonathan Wakely
I think this is considerably nicer than the macro version, but it can
definitely wait for stage 1.
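
For readers skimming the thread, here is a minimal standalone sketch of the idea: one function template takes the comparison as a callable, so no macro has to be stamped out per operator. It is not the libstdc++ code. The name compare_variants is hypothetical, and it uses the public std::visit rather than the internal __raw_idx_visit. It does keep the patch's index() + 1 trick, so a valueless variant (index variant_npos) wraps to 0 and orders before every alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <variant>

// Hypothetical helper (not part of libstdc++): compare two variants of the
// same type using a caller-supplied binary operation OP.
template<typename Op, typename... Types>
bool
compare_variants (const std::variant<Types...>& lhs,
                  const std::variant<Types...>& rhs, Op op)
{
  // Different alternatives, or both valueless: compare index + 1 so that
  // variant_npos wraps around to 0.
  if (lhs.index () != rhs.index () || lhs.valueless_by_exception ())
    return op (lhs.index () + 1, rhs.index () + 1);

  // Same alternative on both sides: apply OP to the contained values.
  return std::visit ([&] (const auto& l, const auto& r) -> bool
    {
      if constexpr (std::is_same_v<decltype (l), decltype (r)>)
        return op (l, r);
      else
        return false;  // Not reached: the indices were checked above.
    }, lhs, rhs);
}
```

With this shape, each operator reduces to a one-line call such as compare_variants (a, b, std::less<>{}).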

-- >8 --

libstdc++-v3/ChangeLog:

* include/std/variant (__detail::__variant::__compare): New
function template.
(operator==, operator!=, operator<, operator>, operator<=)
(operator>=): Replace macro definition with handwritten function
calling __detail::__variant::__compare.
(operator<=>): Call __detail::__variant::__compare.
---
 libstdc++-v3/include/std/variant | 167 +--
 1 file changed, 114 insertions(+), 53 deletions(-)

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 87a119df8b5..c2277e8831a 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -48,6 +48,7 @@
 #include 
 #include  // in_place_index_t
 #if __cplusplus >= 202002L
+# include 
 # include 
 #endif
 
@@ -1237,47 +1238,119 @@ namespace __variant
 
   struct monostate { };
 
-#if __cpp_lib_concepts
-# define _VARIANT_RELATION_FUNCTION_CONSTRAINTS(TYPES, OP) \
-  requires ((requires (const TYPES& __t) { \
-   { __t OP __t } -> __detail::__boolean_testable; }) && ...)
-#else
-# define _VARIANT_RELATION_FUNCTION_CONSTRAINTS(TYPES, OP)
-#endif
+namespace __detail::__variant
+{
+  template
+constexpr _Ret
+__compare(_Ret __ret, const _Vp& __lhs, const _Vp& __rhs, _Op __op)
+{
+  __variant::__raw_idx_visit(
+   [&__ret, &__lhs, __op] (auto&& __rhs_mem, auto __rhs_index) mutable
+   {
+ if constexpr (__rhs_index != variant_npos)
+   {
+ if (__lhs.index() == __rhs_index.value)
+   {
+ auto& __this_mem = std::get<__rhs_index>(__lhs);
+ __ret = __op(__this_mem, __rhs_mem);
+ return;
+   }
+   }
+ __ret = __op(__lhs.index() + 1, __rhs_index + 1);
+   }, __rhs);
+  return __ret;
+}
+} // namespace __detail::__variant
 
-#define _VARIANT_RELATION_FUNCTION_TEMPLATE(__OP) \
-  template \
-_VARIANT_RELATION_FUNCTION_CONSTRAINTS(_Types, __OP) \
-constexpr bool \
-operator __OP [[nodiscard]] (const variant<_Types...>& __lhs, \
-const variant<_Types...>& __rhs) \
-{ \
-  bool __ret = true; \
-  __detail::__variant::__raw_idx_visit( \
-[&__ret, &__lhs] (auto&& __rhs_mem, auto __rhs_index) mutable \
-{ \
- if constexpr (__rhs_index != variant_npos) \
-   { \
- if (__lhs.index() == __rhs_index) \
-   { \
- auto& __this_mem = std::get<__rhs_index>(__lhs);  \
-  __ret = __this_mem __OP __rhs_mem; \
- return; \
-} \
-} \
- __ret = (__lhs.index() + 1) __OP (__rhs_index + 1); \
-   }, __rhs); \
-  return __ret; \
+  template
+#if __cpp_lib_concepts
+requires ((requires (const _Types& __t) {
+  { __t == __t } -> convertible_to; }) && ...)
+#endif
+constexpr bool
+operator== [[nodiscard]] (const variant<_Types...>& __lhs,
+ const variant<_Types...>& __rhs)
+{
+  return __detail::__variant::__compare(true, __lhs, __rhs,
+   [](auto&& __l, auto&& __r) {
+ return __l == __r;
+   });
 }
 
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(<)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(<=)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(==)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(!=)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(>=)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(>)
+  template
+#if __cpp_lib_concepts
+requires ((requires (const _Types& __t) {
+  { __t != __t } -> convertible_to; }) && ...)
+#endif
+constexpr bool
+operator!= [[nodiscard]] (const variant<_Types...>& __lhs,
+ const variant<_Types...>& __rhs)
+{
+  return __detail::__variant::__compare(true, __lhs, __rhs,
+   [](auto&& __l, auto&& __r) {
+ return __l != __r;
+   });
+}
 
-#undef _VARIANT_RELATION_FUNCTION_TEMPLATE
+  template
+#if __cpp_lib_concepts
+requires ((requires (const _Types& __t) {
+  { __t < __t } -> convertible_to; }) && ...)
+#endif
+constexpr bool
+operator< [[nodiscard]] (const variant<_Types...>& __lhs,
+const variant<_Types...>& __rhs)
+{
+  return __detail::__variant::__compare(true, __lhs, __rhs,
+   [](auto&& __l, auto&& __r) {
+ return __l < __r;
+   });
+}
+
+  template
+#if __cpp_lib_concepts
+requires ((requires (const _Types& __t) {
+  { __t <= __t } -> convertible_to; 

Re: [PATCHv3 2/2] aarch64: Add support for _BitInt

2024-04-10 Thread Richard Sandiford
"Andre Vieira (lists)"  writes:
> Added the target check, also had to change some of the assembly checking 
> due to changes upstream, the assembly is still valid, but we do extend 
> where not necessary, I do believe that's a general issue though.
>
> The _BitInt(N > 64) codegen for non-powers of 2 did get worse, we see 
> similar codegen with _int128 bitfields on aarch64.
> I suspect we need to improve the way we 'extend' TImode in the aarch64 
> backend to be able to operate only on the affected DImode parts of it 
> when relevant. Though I also think we may need to change how _BitInt is 
> currently expanded in such situations, right now it does the extension 
> as two shifts. Anyway I did not have too much time to look deeper into this.
>
> Bootstrapped on aarch64-unknown-linux-gnu.
>
> OK for trunk?

OK, thanks.  In truth I've not gone through the tests very thoroughly
this time around, and just gone by the internal diff between this
version and the previous one.  But we can adjust them as necessary
based on any reports that come in.

Richard

>
> On 28/03/2024 15:21, Richard Sandiford wrote:
>> Jakub Jelinek  writes:
>>> On Thu, Mar 28, 2024 at 03:00:46PM +, Richard Sandiford wrote:
>   * gcc.target/aarch64/bitint-alignments.c: New test.
>   * gcc.target/aarch64/bitint-args.c: New test.
>   * gcc.target/aarch64/bitint-sizes.c: New test.
>   * gcc.target/aarch64/bitfield-bitint-abi.h: New header.
>   * gcc.target/aarch64/bitfield-bitint-abi-align16.c: New test.
>   * gcc.target/aarch64/bitfield-bitint-abi-align8.c: New test.

 Since we don't support big-endian yet, I assume the tests should be
 conditional on aarch64_little_endian.
>>>
>>> Perhaps better on bitint effective target, then they'll become available
>>> automatically as soon as big endian aarch64 _BitInt support is turned on.
>> 
>> Ah, yeah, good point.
>> 
>> Richard
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> 81400cc666472ffeff40df14e98ae00ebc774d31..c0af4ef151a8c46f78c0c3a43c2ab1318a3f610a
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -6583,6 +6583,7 @@ aarch64_return_in_memory_1 (const_tree type)
>int count;
>  
>if (!AGGREGATE_TYPE_P (type)
> +  && TREE_CODE (type) != BITINT_TYPE
>&& TREE_CODE (type) != COMPLEX_TYPE
>&& TREE_CODE (type) != VECTOR_TYPE)
>  /* Simple scalar types always returned in registers.  */
> @@ -21996,6 +21997,11 @@ aarch64_composite_type_p (const_tree type,
>if (type && (AGGREGATE_TYPE_P (type) || TREE_CODE (type) == COMPLEX_TYPE))
>  return true;
>  
> +  if (type
> +  && TREE_CODE (type) == BITINT_TYPE
> +  && int_size_in_bytes (type) > 16)
> +return true;
> +
>if (mode == BLKmode
>|| GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
>|| GET_MODE_CLASS (mode) == MODE_COMPLEX_INT)
> @@ -28477,6 +28483,42 @@ aarch64_excess_precision (enum excess_precision_type 
> type)
>return FLT_EVAL_METHOD_UNPREDICTABLE;
>  }
>  
> +/* Implement TARGET_C_BITINT_TYPE_INFO.
> +   Return true if _BitInt(N) is supported and fill its details into *INFO.  
> */
> +bool
> +aarch64_bitint_type_info (int n, struct bitint_info *info)
> +{
> +  if (TARGET_BIG_END)
> +return false;
> +
> +  if (n <= 8)
> +info->limb_mode = QImode;
> +  else if (n <= 16)
> +info->limb_mode = HImode;
> +  else if (n <= 32)
> +info->limb_mode = SImode;
> +  else if (n <= 64)
> +info->limb_mode = DImode;
> +  else if (n <= 128)
> +info->limb_mode = TImode;
> +  else
> +/* The AAPCS for AArch64 defines _BitInt(N > 128) as an array with
> +   type {signed,unsigned} __int128[M] where M*128 >= N.  However, to be
> +   able to use libgcc's implementation to support large _BitInt's we need
> +   to use a LIMB_MODE that is no larger than 'long long'.  This is why we
> +   use DImode for our internal LIMB_MODE and we define the ABI_LIMB_MODE 
> to
> +   be TImode to ensure we are ABI compliant.  */
> +info->limb_mode = DImode;
> +
> +  if (n > 128)
> +info->abi_limb_mode = TImode;
> +  else
> +info->abi_limb_mode = info->limb_mode;
> +  info->big_endian = TARGET_BIG_END;
> +  info->extended = false;
> +  return true;
> +}
> +
>  /* Implement TARGET_SCHED_CAN_SPECULATE_INSN.  Return true if INSN can be
> scheduled for speculative execution.  Reject the long-running division
> and square-root instructions.  */
> @@ -30601,6 +30643,9 @@ aarch64_run_selftests (void)
>  #undef TARGET_C_EXCESS_PRECISION
>  #define TARGET_C_EXCESS_PRECISION aarch64_excess_precision
>  
> +#undef TARGET_C_BITINT_TYPE_INFO
> +#define TARGET_C_BITINT_TYPE_INFO aarch64_bitint_type_info
> +
>  #undef  TARGET_EXPAND_BUILTIN
>  #define TARGET_EXPAND_BUILTIN aarch64_expand_builtin
>  
> diff --git a/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c 
> b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c
> 

Re: [PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-04-10 Thread Richard Sandiford
"Andre Vieira (lists)"  writes:
> @@ -6907,6 +6938,11 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
> function_arg_info )
> && (!alignment || abi_break_gcc_9 < alignment)
> && (!abi_break_gcc_13 || alignment < abi_break_gcc_13));
>  
> +  /* _BitInt(N) was only added in GCC 14.  */
> +  bool warn_pcs_change_le_gcc14
> += warn_pcs_change && !bitint_or_aggr_of_bitint_p (type);
> +
> +

Excess blank line.

OK with that removed, thanks (no need to retest).

Richard


[PATCH v1] aarch64: Preparatory Patch to place target independent and dependent changed code in one file

2024-04-10 Thread Ajit Agarwal
Hello Alex/Richard:

All comments are addressed in this version-1 of the patch.

The common infrastructure of load/store pair fusion is divided into
target-independent and target-dependent code.

The target-independent code is generic code that uses pure virtual
functions as the interface between the target-independent and
target-dependent code.

The target-dependent code implements those pure virtual functions for the
aarch64 target and calls into the target-independent code.
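
As a rough illustration of that split (a simplified sketch, not the code in the patch), the generic pass can be pictured as a base class whose driver logic calls pure virtual hooks, with the target supplying a derived class. The names pair_fusion_sketch, toy_target_fusion and can_pair_p are hypothetical, and the offset range is only an example value.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the target-independent pass.
struct pair_fusion_sketch
{
  virtual ~pair_fusion_sketch () = default;

  // Target hook: return true if OFFSET cannot be encoded in a paired access.
  virtual bool pair_mem_out_of_range_p (long offset) const = 0;

  // Target hook with a default: return true if loads should be tracked.
  virtual bool track_loads_p () const { return true; }

  // Target-independent driver logic, written only in terms of the hooks.
  bool can_pair_p (long offset) const
  {
    return track_loads_p () && !pair_mem_out_of_range_p (offset);
  }
};

// Hypothetical target implementation; the range below is illustrative.
struct toy_target_fusion : pair_fusion_sketch
{
  bool pair_mem_out_of_range_p (long offset) const override
  {
    return offset < -512 || offset > 504;
  }
};
```

Another target would only need its own derived class; the driver logic in the base class stays untouched.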

Thanks & Regards
Ajit


aarch64: Place target independent and dependent changed code in one file

Common infrastructure of load store pair fusion is divided into target
independent and target dependent changed code.

Target independent code is the Generic code with pure virtual function
to interface between target independent and dependent code.

Target dependent code is the implementation of pure virtual function for
aarch64 target and the call to target independent code.

2024-04-10  Ajit Kumar Agarwal  

gcc/ChangeLog:

* config/aarch64/aarch64-ldp-fusion.cc: Place target
independent and dependent changed code.
---
 gcc/config/aarch64/aarch64-ldp-fusion.cc | 497 +++
 1 file changed, 337 insertions(+), 160 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 365dcf48b22..03e8572ebfd 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -138,6 +138,198 @@ struct alt_base
   poly_int64 offset;
 };
 
+// Virtual base class for load/store walkers used in alias analysis.
+struct alias_walker
+{
+  virtual bool conflict_p (int ) const = 0;
+  virtual insn_info *insn () const = 0;
+  virtual bool valid () const = 0;
+  virtual void advance () = 0;
+};
+
+// Forward declaration to be used inside the aarch64_pair_fusion class.
+bool ldp_operand_mode_ok_p (machine_mode mode);
+rtx aarch64_destructure_load_pair (rtx regs[2], rtx pattern);
+rtx aarch64_destructure_store_pair (rtx regs[2], rtx pattern);
+rtx aarch64_gen_writeback_pair (rtx wb_effect, rtx pair_mem, rtx regs[2],
+   bool load_p);
+enum class writeback{
+  WRITEBACK_PAIR_P,
+  WRITEBACK
+};
+
+struct pair_fusion {
+
+  pair_fusion ()
+  {
+calculate_dominance_info (CDI_DOMINATORS);
+df_analyze ();
+crtl->ssa = new rtl_ssa::function_info (cfun);
+  };
+  // Return true if GPR is FP or SIMD accesses, passed
+  // with GPR reg_op rtx, machine mode and load_p.
+  virtual bool fpsimd_op_p (rtx, machine_mode, bool)
+  {
+return false;
+  }
+  // Return true if pair operand mode is ok. Passed with
+  // machine mode.
+  virtual bool pair_operand_mode_ok_p (machine_mode mode) = 0;
+  // Return true if reg operand is ok, passed with load_p,
+  // reg_op rtx and machine mode.
+  virtual bool pair_reg_operand_ok_p (bool load_p, rtx reg_op,
+ machine_mode mem_mode) = 0;
+  // Return alias check limit.
+  virtual int pair_mem_alias_check_limit () = 0;
+  // Return true if there is writeback opportunities. Passed
+  // with enum writeback.
+  virtual bool handle_writeback_opportunities (enum writeback wback) = 0 ;
+  // Return true if mem ok ldp stp policy model passed with
+  // rtx mem, load_p and machine mode.
+  virtual bool pair_mem_ok_with_policy (rtx first_mem, bool load_p,
+   machine_mode mode) = 0;
+  // Gen load store mem pair. Return load store rtx passed
+  // with arguments load store pattern, writeback rtx and
+  // load_p.
+  virtual rtx gen_mem_pair (rtx *pats, rtx writeback,
+   bool load_p) = 0;
+  // Return true if memory writeback can be promoted, passed
+  // with insn, rtx pattern and load_p. load_p is set by this
+  // hook.
+  virtual bool pair_mem_promote_writeback_p (insn_info *, rtx, bool &)
+  {
+ return false;
+  }
+  // Return true if we track loads.
+  virtual bool track_loads_p ()
+  {
+return true;
+  }
+  // Return true if we track stores.
+  virtual bool track_stores_p ()
+  {
+return true;
+  }
+  // Return true if offset is out of range.
+  virtual bool pair_mem_out_of_range_p (HOST_WIDE_INT off) = 0;
+  // Return destructure pair. Passed with rtx reg, insn pattern
+  // and load_p.
+  virtual rtx gen_destructure_pair (rtx regs[2], rtx rti, bool load_p) = 0;
+  // Return writeback pair. Passed with rtx writeback effect, mem rtx
+  // regs rtx and load_p.
+  virtual rtx gen_writeback_pair (rtx wb_effect, rtx mem,
+ rtx regs[2], bool load_p) = 0;
+  // Return true if offset is aligned and multiple of 32.
+  // Passed with offset and access_size to check multiple of 32.
+  virtual bool pair_offset_alignment_ok_p (poly_int64 offset,
+  unsigned access_size) = 0;
+  void ldp_fusion_bb (bb_info *bb);
+  insn_info * find_trailing_add (insn_info *insns[2],
+const insn_range_info _range,
+  

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-10 Thread Ajit Agarwal
Hello Alex:

On 10/04/24 1:42 pm, Alex Coplan wrote:
> Hi Ajit,
> 
> On 09/04/2024 20:59, Ajit Agarwal wrote:
>> Hello Alex:
>>
>> On 09/04/24 8:39 pm, Alex Coplan wrote:
>>> On 09/04/2024 20:01, Ajit Agarwal wrote:
 Hello Alex:

 On 09/04/24 7:29 pm, Alex Coplan wrote:
> On 09/04/2024 17:30, Ajit Agarwal wrote:
>>
>>
>> On 05/04/24 10:03 pm, Alex Coplan wrote:
>>> On 05/04/2024 13:53, Ajit Agarwal wrote:
 Hello Alex/Richard:

 All review comments are incorporated.
>>>
>>> Thanks, I was kind-of expecting you to also send the renaming patch as a
>>> preparatory patch as we discussed.
>>>
>>> Sorry for another meta comment, but: I think the reason that the Linaro
>>> CI isn't running tests on your patches is actually because you're
>>> sending 1/3 of a series but not sending the rest of the series.
>>>
>>> So please can you either send this as an individual preparatory patch
>>> (not marked as a series) or if you're going to send a series (e.g. with
>>> a preparatory rename patch as 1/2 and this as 2/2) then send the entire
>>> series when you make updates.  That way the CI should test your patches,
>>> which would be helpful.
>>>
>>
>> Addressed.
>>  

 Common infrastructure of load store pair fusion is divided into target
 independent and target dependent changed code.

 Target independent code is the Generic code with pure virtual function
 to interface between target independent and dependent code.

 Target dependent code is the implementation of pure virtual function 
 for
 aarch64 target and the call to target independent code.

 Thanks & Regards
 Ajit


 aarch64: Place target independent and dependent changed code in one 
 file

 Common infrastructure of load store pair fusion is divided into target
 independent and target dependent changed code.

 Target independent code is the Generic code with pure virtual function
 to interface between target independent and dependent code.

 Target dependent code is the implementation of pure virtual function 
 for
 aarch64 target and the call to target independent code.

 2024-04-06  Ajit Kumar Agarwal  

 gcc/ChangeLog:

* config/aarch64/aarch64-ldp-fusion.cc: Place target
independent and dependent changed code.
>>>
>>> You're going to need a proper ChangeLog eventually, but I guess there's
>>> no need for that right now.
>>>
 ---
  gcc/config/aarch64/aarch64-ldp-fusion.cc | 371 +++
  1 file changed, 249 insertions(+), 122 deletions(-)

 diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
 b/gcc/config/aarch64/aarch64-ldp-fusion.cc
 index 22ed95eb743..cb21b514ef7 100644
 --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
 +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
 @@ -138,8 +138,122 @@ struct alt_base
poly_int64 offset;
  };
  
 +// Virtual base class for load/store walkers used in alias analysis.
 +struct alias_walker
 +{
 +  virtual bool conflict_p (int ) const = 0;
 +  virtual insn_info *insn () const = 0;
 +  virtual bool valid () const  = 0;
>>>
>>> Heh, looking at this made me realise there is a whitespace bug here in
>>> the existing code (double space after const).  Sorry about that!  I'll
>>> push an obvious fix for that.
>>>
 +  virtual void advance () = 0;
 +};
 +
 +struct pair_fusion {
 +
 +  pair_fusion () {};
>>>
>>> This ctor looks pointless at the moment.  Perhaps instead we could put
>>> the contents of ldp_fusion_init in here and then delete that function?
>>>
>>
>> Addressed.
>>
 +  virtual bool fpsimd_op_p (rtx reg_op, machine_mode mem_mode,
 + bool load_p) = 0;
>>>
>>> Please can we have comments above each of these virtual functions
>>> describing any parameters, what the purpose of the hook is, and the
>>> interpretation of the return value?  This will serve as the
>>> documentation for other targets that want to make use of the pass.
>>>
>>> It might make sense to have a default-false implementation for
>>> fpsimd_op_p, especially if you don't want to make use of this bit for
>>> rs6000.
>>>
>>
>> Addressed.
>>  
 +
 +  virtual bool pair_operand_mode_ok_p (machine_mode mode) = 0;
 +  virtual bool pair_trailing_writeback_p () = 0;
>>>
>>> Sorry for the run-around, but: I think this and
>>> 

Re: [PATCHv3 2/2] aarch64: Add support for _BitInt

2024-04-10 Thread Andre Vieira (lists)
Added the target check.  I also had to change some of the assembly checking
due to changes upstream.  The assembly is still valid, but we do extend
where not necessary; I believe that's a general issue though.


The _BitInt(N > 64) codegen for non-powers of 2 did get worse; we see
similar codegen with __int128 bitfields on aarch64.
I suspect we need to improve the way we 'extend' TImode in the aarch64
backend to be able to operate only on the affected DImode parts of it
when relevant.  Though I also think we may need to change how _BitInt is
currently expanded in such situations; right now it does the extension
as two shifts.  Anyway, I did not have much time to look deeper into this.
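
To make "extension as two shifts" concrete, here is the same idea scaled down to a single 64-bit word. This is only a sketch: the function names are hypothetical, and the real expansion works on TImode values in the compiler rather than in C++ source. It assumes two's complement and an arithmetic right shift on signed values, which GCC provides and C++20 guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low N bits (1 <= N <= 64) of X to a full 64-bit value.
inline std::int64_t
sign_extend_low_bits (std::uint64_t x, unsigned n)
{
  unsigned shift = 64 - n;
  // The left shift discards the bits above bit N-1; the arithmetic right
  // shift then copies bit N-1 into all the higher bits.
  return static_cast<std::int64_t> (x << shift) >> shift;
}

// Zero-extend the low N bits (1 <= N <= 64) of X to a full 64-bit value.
inline std::uint64_t
zero_extend_low_bits (std::uint64_t x, unsigned n)
{
  unsigned shift = 64 - n;
  // Same pair of shifts, but the logical right shift fills with zeros.
  return (x << shift) >> shift;
}
```

For a 128-bit value only the upper 64-bit half needs this treatment, which is the kind of DImode-only operation suggested above.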


Bootstrapped on aarch64-unknown-linux-gnu.

OK for trunk?

On 28/03/2024 15:21, Richard Sandiford wrote:

Jakub Jelinek  writes:

On Thu, Mar 28, 2024 at 03:00:46PM +, Richard Sandiford wrote:

* gcc.target/aarch64/bitint-alignments.c: New test.
* gcc.target/aarch64/bitint-args.c: New test.
* gcc.target/aarch64/bitint-sizes.c: New test.
* gcc.target/aarch64/bitfield-bitint-abi.h: New header.
* gcc.target/aarch64/bitfield-bitint-abi-align16.c: New test.
* gcc.target/aarch64/bitfield-bitint-abi-align8.c: New test.


Since we don't support big-endian yet, I assume the tests should be
conditional on aarch64_little_endian.


Perhaps better on bitint effective target, then they'll become available
automatically as soon as big endian aarch64 _BitInt support is turned on.


Ah, yeah, good point.

Richard

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
81400cc666472ffeff40df14e98ae00ebc774d31..c0af4ef151a8c46f78c0c3a43c2ab1318a3f610a
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -6583,6 +6583,7 @@ aarch64_return_in_memory_1 (const_tree type)
   int count;
 
   if (!AGGREGATE_TYPE_P (type)
+  && TREE_CODE (type) != BITINT_TYPE
   && TREE_CODE (type) != COMPLEX_TYPE
   && TREE_CODE (type) != VECTOR_TYPE)
 /* Simple scalar types always returned in registers.  */
@@ -21996,6 +21997,11 @@ aarch64_composite_type_p (const_tree type,
   if (type && (AGGREGATE_TYPE_P (type) || TREE_CODE (type) == COMPLEX_TYPE))
 return true;
 
+  if (type
+  && TREE_CODE (type) == BITINT_TYPE
+  && int_size_in_bytes (type) > 16)
+return true;
+
   if (mode == BLKmode
   || GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
   || GET_MODE_CLASS (mode) == MODE_COMPLEX_INT)
@@ -28477,6 +28483,42 @@ aarch64_excess_precision (enum excess_precision_type 
type)
   return FLT_EVAL_METHOD_UNPREDICTABLE;
 }
 
+/* Implement TARGET_C_BITINT_TYPE_INFO.
+   Return true if _BitInt(N) is supported and fill its details into *INFO.  */
+bool
+aarch64_bitint_type_info (int n, struct bitint_info *info)
+{
+  if (TARGET_BIG_END)
+return false;
+
+  if (n <= 8)
+info->limb_mode = QImode;
+  else if (n <= 16)
+info->limb_mode = HImode;
+  else if (n <= 32)
+info->limb_mode = SImode;
+  else if (n <= 64)
+info->limb_mode = DImode;
+  else if (n <= 128)
+info->limb_mode = TImode;
+  else
+/* The AAPCS for AArch64 defines _BitInt(N > 128) as an array with
+   type {signed,unsigned} __int128[M] where M*128 >= N.  However, to be
+   able to use libgcc's implementation to support large _BitInt's we need
+   to use a LIMB_MODE that is no larger than 'long long'.  This is why we
+   use DImode for our internal LIMB_MODE and we define the ABI_LIMB_MODE to
+   be TImode to ensure we are ABI compliant.  */
+info->limb_mode = DImode;
+
+  if (n > 128)
+info->abi_limb_mode = TImode;
+  else
+info->abi_limb_mode = info->limb_mode;
+  info->big_endian = TARGET_BIG_END;
+  info->extended = false;
+  return true;
+}
+
 /* Implement TARGET_SCHED_CAN_SPECULATE_INSN.  Return true if INSN can be
scheduled for speculative execution.  Reject the long-running division
and square-root instructions.  */
@@ -30601,6 +30643,9 @@ aarch64_run_selftests (void)
 #undef TARGET_C_EXCESS_PRECISION
 #define TARGET_C_EXCESS_PRECISION aarch64_excess_precision
 
+#undef TARGET_C_BITINT_TYPE_INFO
+#define TARGET_C_BITINT_TYPE_INFO aarch64_bitint_type_info
+
 #undef  TARGET_EXPAND_BUILTIN
 #define TARGET_EXPAND_BUILTIN aarch64_expand_builtin
 
diff --git a/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c 
b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c
new file mode 100644
index 
..3f292a45f955d35b802a0bd789cd39d5fa7b5860
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c
@@ -0,0 +1,384 @@
+/* { dg-do compile { target bitint } } */
+/* { dg-additional-options "-std=c23 -O2 -fno-stack-protector -save-temps 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#define ALIGN 16
+#include "bitfield-bitint-abi.h"
+
+// f1-f16 are all the same
+

Re: [PATCHv2 1/2] aarch64: Do not give ABI change diagnostics for _BitInt(N)

2024-04-10 Thread Andre Vieira (lists)

Hey,

Added the warn_pcs_change_le_gcc14 variable and changed the uses of 
warn_pcs_change to use this new variable.
Also fixed an issue with the loop through TREE_FIELDS to avoid an ICE 
during bootstrap.


OK for trunk?

Bootstrapped and regression tested on aarch64-unknown-linux-gnu.

Kind regards,
Andre

On 28/03/2024 12:54, Richard Sandiford wrote:

"Andre Vieira (lists)"  writes:

This patch makes sure we do not give ABI change diagnostics for the ABI
breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that
type did not exist before this GCC version.
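
The "any type involving _BitInt(N)" test is a recursive walk over the type. A toy model of that walk, independent of GCC's tree representation, might look like the sketch below. The types kind and toy_type and the function contains_bitint_p are hypothetical stand-ins, not GCC APIs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature type representation (not GCC's tree).
enum class kind { scalar, bitint, array, record };

struct toy_type
{
  kind k;
  // The single element type for an array, or the field types for a record.
  std::vector<toy_type> members;
};

// Return true if T is a bit-precise integer or an aggregate that
// (transitively) contains one.
inline bool
contains_bitint_p (const toy_type& t)
{
  if (t.k == kind::bitint)
    return true;

  if (t.k == kind::array || t.k == kind::record)
    for (const toy_type& m : t.members)
      if (contains_bitint_p (m))
        return true;

  return false;
}
```

A caller would then gate the ABI-change note on the negation of this predicate, as the patch does with bitint_or_aggr_of_bitint_p.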

ChangeLog:

* config/aarch64/aarch64.cc (bitint_or_aggr_of_bitint_p): New function.
(aarch64_layout_arg): Don't emit diagnostics for types involving
_BitInt(N).

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
1ea84c8bd7386e399f6ffa3a5e36408cf8831fc6..b68cf3e7cb9a6fa89b4e5826a39ffa11f64ca20a
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -6744,6 +6744,33 @@ aarch64_function_arg_alignment (machine_mode mode, 
const_tree type,
return alignment;
  }
  
+/* Return true if TYPE describes a _BitInt(N) or an aggregate that uses the

+   _BitInt(N) type.  These include ARRAY_TYPE's with an element that is a
+   _BitInt(N) or an aggregate that uses it, and a RECORD_TYPE or a UNION_TYPE
+   with a field member that is a _BitInt(N) or an aggregate that uses it.
+   Return false otherwise.  */
+
+static bool
+bitint_or_aggr_of_bitint_p (tree type)
+{
+  if (!type)
+return false;
+
+  if (TREE_CODE (type) == BITINT_TYPE)
+return true;
+
+  /* If ARRAY_TYPE, check its element type.  */
+  if (TREE_CODE (type) == ARRAY_TYPE)
+return bitint_or_aggr_of_bitint_p (TREE_TYPE (type));
+
+  /* If RECORD_TYPE or UNION_TYPE, check the fields' types.  */
+  if (RECORD_OR_UNION_TYPE_P (type))
+for (tree field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
+  if (bitint_or_aggr_of_bitint_p (TREE_TYPE (field)))
+   return true;
+  return false;
+}
+
  /* Layout a function argument according to the AAPCS64 rules.  The rule
 numbers refer to the rule numbers in the AAPCS64.  ORIG_MODE is the
 mode that was originally given to us by the target hook, whereas the
@@ -6767,12 +6794,6 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
function_arg_info )
if (pcum->aapcs_arg_processed)
  return;
  
-  bool warn_pcs_change

-= (warn_psabi
-   && !pcum->silent_p
-   && (currently_expanding_function_start
-  || currently_expanding_gimple_stmt));
-
/* HFAs and HVAs can have an alignment greater than 16 bytes.  For example:
  
 typedef struct foo {

@@ -6907,6 +6928,18 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
function_arg_info )
  && (!alignment || abi_break_gcc_9 < alignment)
  && (!abi_break_gcc_13 || alignment < abi_break_gcc_13));
  
+

+  bool warn_pcs_change
+= (warn_psabi
+   && !pcum->silent_p
+   && (currently_expanding_function_start
+  || currently_expanding_gimple_stmt)
+  /* warn_pcs_change is currently used to gate diagnostics in case of
+abi_break_gcc_{9,13,14}.  These however, do not apply to _BitInt(N)
+types as they were only introduced in GCC 14.  */
+   && (!type || !bitint_or_aggr_of_bitint_p (type)));


How about making this a new variable such as:

   /* _BitInt(N) was only added in GCC 14.  */
   bool warn_pcs_change_le_gcc14
     = warn_psabi && !bitint_or_aggr_of_bitint_p (type);

(and keeping warn_pcs_change where it is).  In principle, warn_pcs_change
is meaningful for any future ABI breaks, and we might forget that it
excludes bitints.  The name is just a suggestion.

OK with that change, thanks.

Richard


+
+
/* allocate_ncrn may be false-positive, but allocate_nvrn is quite reliable.
   The following code thus handles passing by SIMD/FP registers first.  */
  
@@ -21266,19 +21299,25 @@ aarch64_gimplify_va_arg_expr (tree valist, tree type, gimple_seq *pre_p,

rsize = ROUND_UP (size, UNITS_PER_WORD);
nregs = rsize / UNITS_PER_WORD;
  
-  if (align <= 8 && abi_break_gcc_13 && warn_psabi)

+  if (align <= 8
+ && abi_break_gcc_13
+ && warn_psabi
+ && !bitint_or_aggr_of_bitint_p (type))
inform (input_location, "parameter passing for argument of type "
"%qT changed in GCC 13.1", type);
  
if (warn_psabi

  && abi_break_gcc_14
- && (abi_break_gcc_14 > 8 * BITS_PER_UNIT) != (align > 8))
+ && (abi_break_gcc_14 > 8 * BITS_PER_UNIT) != (align > 8)
+ && !bitint_or_aggr_of_bitint_p (type))
inform (input_location, "parameter passing for argument of type "
"%qT changed in GCC 14.1", type);
  
if (align > 8)

{
- if (abi_break_gcc_9 && warn_psabi)
+ if (abi_break_gcc_9
+ && warn_psabi
+ && 

[PATCH 3/4] libstdc++: Constrain equality ops for std::pair, std::tuple, std::variant

2024-04-10 Thread Jonathan Wakely
Tested x86_64-linux.

Since this only affects C++20 and later (except for adding [[nodiscard]]
to relational ops) it seems OK for trunk now.

-- >8 --

Implement the changes from P2944R3 which add constraints to the
comparison operators of std::pair, std::tuple, and std::variant.

The paper also changes std::optional, but we already constrain its
comparisons using SFINAE on the return type. However, we need some
additional constraints on the [optional.comp.with.t] operators that
compare an optional with a value. The paper doesn't say to do that, but
I think it's needed because otherwise when the comparison for two
optional objects fails its constraints, the two overloads that are
supposed to be for comparing to a non-optional become the best overload
candidates, but are ambiguous (and we don't even get as far as checking
the constraints for satisfaction).

The paper does not change std::expected, but probably should have done.
I'll submit an LWG issue about that and implement it separately.

Also add [[nodiscard]] to all these comparison operators.

libstdc++-v3/ChangeLog:

* include/bits/stl_pair.h (operator==): Add constraint.
* include/bits/version.def (constrained_equality): Define.
* include/bits/version.h: Regenerate.
* include/std/optional: Define feature test macro.
(__optional_rep_op_t): Use is_convertible_v instead of
is_convertible.
* include/std/tuple: Define feature test macro.
(operator==, __tuple_cmp, operator<=>): Reimplement C++20
comparisons using lambdas. Add constraints.
* include/std/utility: Define feature test macro.
* include/std/variant: Define feature test macro.
(_VARIANT_RELATION_FUNCTION_TEMPLATE): Add constraints.
(variant): Remove unnecessary friend declarations for comparison
operators.
* testsuite/20_util/optional/relops/constrained.cc: New test.
* testsuite/20_util/pair/comparison_operators/constrained.cc:
New test.
* testsuite/20_util/tuple/comparison_operators/constrained.cc:
New test.
* testsuite/20_util/variant/relops/constrained.cc: New test.
* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
Disable for C++20 and later.
* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
Remove dg-error line for target c++20.
---
 libstdc++-v3/include/bits/stl_pair.h  |  16 +-
 libstdc++-v3/include/bits/version.def |   9 +
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/optional |  48 +++-
 libstdc++-v3/include/std/tuple| 102 ---
 libstdc++-v3/include/std/utility  |   1 +
 libstdc++-v3/include/std/variant  |  28 +-
 .../20_util/optional/relops/constrained.cc| 258 ++
 .../pair/comparison_operators/constrained.cc  |  48 
 .../tuple/comparison_operators/constrained.cc |  50 
 .../tuple/comparison_operators/overloaded.cc  |   6 +-
 .../tuple/comparison_operators/overloaded2.cc |   1 -
 .../20_util/variant/relops/constrained.cc | 175 
 13 files changed, 677 insertions(+), 75 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/20_util/optional/relops/constrained.cc
 create mode 100644 
libstdc++-v3/testsuite/20_util/pair/comparison_operators/constrained.cc
 create mode 100644 
libstdc++-v3/testsuite/20_util/tuple/comparison_operators/constrained.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/variant/relops/constrained.cc

diff --git a/libstdc++-v3/include/bits/stl_pair.h 
b/libstdc++-v3/include/bits/stl_pair.h
index 45317417c9c..0c1e5719a1a 100644
--- a/libstdc++-v3/include/bits/stl_pair.h
+++ b/libstdc++-v3/include/bits/stl_pair.h
@@ -1000,14 +1000,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template pair(_T1, _T2) -> pair<_T1, _T2>;
 #endif
 
-#if __cpp_lib_three_way_comparison && __cpp_lib_concepts
+#if __cpp_lib_three_way_comparison
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 3865. Sorting a range of pairs
 
   /// Two pairs are equal iff their members are equal.
   template
-inline _GLIBCXX_CONSTEXPR bool
+[[nodiscard]]
+constexpr bool
 operator==(const pair<_T1, _T2>& __x, const pair<_U1, _U2>& __y)
+requires requires {
+  { __x.first == __y.first } -> __detail::__boolean_testable;
+  { __x.second == __y.second } -> __detail::__boolean_testable;
+}
 { return __x.first == __y.first && __x.second == __y.second; }
 
   /** Defines a lexicographical order for pairs.
@@ -1018,6 +1023,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* less than `Q.second`.
   */
   template
+[[nodiscard]]
 constexpr common_comparison_category_t<__detail::__synth3way_t<_T1, _U1>,
   __detail::__synth3way_t<_T2, _U2>>
 operator<=>(const pair<_T1, _T2>& __x, const pair<_U1, _U2>& __y)
@@ -1029,6 +1035,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #else
   /// Two pairs of 

[PATCH 4/4] libstdc++: Simplify std::variant comparison operators

2024-04-10 Thread Jonathan Wakely
Tested x86_64-linux.

This is just a minor clean-up and could wait for stage 1.

-- >8 --

libstdc++-v3/ChangeLog:

* include/std/variant (_VARIANT_RELATION_FUNCTION_TEMPLATE):
Simplify.
---
 libstdc++-v3/include/std/variant | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 5ba6d9d42e3..2be0f0c1db7 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -1245,7 +1245,7 @@ namespace __variant
 # define _VARIANT_RELATION_FUNCTION_CONSTRAINTS(TYPES, OP)
 #endif
 
-#define _VARIANT_RELATION_FUNCTION_TEMPLATE(__OP, __NAME) \
+#define _VARIANT_RELATION_FUNCTION_TEMPLATE(__OP) \
   template \
 _VARIANT_RELATION_FUNCTION_CONSTRAINTS(_Types, __OP) \
 constexpr bool \
@@ -1262,22 +1262,20 @@ namespace __variant
{ \
  auto& __this_mem = std::get<__rhs_index>(__lhs);  \
   __ret = __this_mem __OP __rhs_mem; \
+ return; \
 } \
- else \
-   __ret = (__lhs.index() + 1) __OP (__rhs_index + 1); \
 } \
-  else \
-__ret = (__lhs.index() + 1) __OP (__rhs_index + 1); \
+ __ret = (__lhs.index() + 1) __OP (__rhs_index + 1); \
}, __rhs); \
   return __ret; \
 }
 
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(<, less)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(<=, less_equal)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(==, equal)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(!=, not_equal)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(>=, greater_equal)
-  _VARIANT_RELATION_FUNCTION_TEMPLATE(>, greater)
+  _VARIANT_RELATION_FUNCTION_TEMPLATE(<)
+  _VARIANT_RELATION_FUNCTION_TEMPLATE(<=)
+  _VARIANT_RELATION_FUNCTION_TEMPLATE(==)
+  _VARIANT_RELATION_FUNCTION_TEMPLATE(!=)
+  _VARIANT_RELATION_FUNCTION_TEMPLATE(>=)
+  _VARIANT_RELATION_FUNCTION_TEMPLATE(>)
 
 #undef _VARIANT_RELATION_FUNCTION_TEMPLATE
 
-- 
2.44.0



[PATCH 2/4] libstdc++: Add std::reference_wrapper comparison operators for C++26

2024-04-10 Thread Jonathan Wakely
Tested x86_64-linux.

Since this only affects C++26 it seems OK for trunk now.

-- >8 --

This C++26 change was just approved in Tokyo, in P2944R3. It adds
operator== and operator<=> overloads to std::reference_wrapper.

The operator<=> overloads in the paper cause compilation errors for any
type without <=> so they're implemented here with deduced return types
and constrained by a requires clause.

libstdc++-v3/ChangeLog:

* include/bits/refwrap.h (reference_wrapper): Add comparison
operators as proposed by P2944R3.
* include/bits/version.def (reference_wrapper): Define.
* include/bits/version.h: Regenerate.
* include/std/functional: Enable feature test macro.
* testsuite/20_util/reference_wrapper/compare.cc: New test.
---
 libstdc++-v3/include/bits/refwrap.h   | 45 +
 libstdc++-v3/include/bits/version.def |  8 ++
 libstdc++-v3/include/bits/version.h   | 10 ++
 libstdc++-v3/include/std/functional   |  1 +
 .../20_util/reference_wrapper/compare.cc  | 95 +++
 5 files changed, 159 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/20_util/reference_wrapper/compare.cc

diff --git a/libstdc++-v3/include/bits/refwrap.h 
b/libstdc++-v3/include/bits/refwrap.h
index 2d4338b718f..fd1cc2b63e6 100644
--- a/libstdc++-v3/include/bits/refwrap.h
+++ b/libstdc++-v3/include/bits/refwrap.h
@@ -38,6 +38,10 @@
 #include 
 #include  // for unary_function and binary_function
 
+#if __glibcxx_reference_wrapper >= 202403L // >= C++26
+# include 
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -358,6 +362,47 @@ _GLIBCXX_MEM_FN_TRAITS(&& noexcept, false_type, true_type)
 #endif
  return std::__invoke(get(), std::forward<_Args>(__args)...);
}
+
+#if __glibcxx_reference_wrapper >= 202403L // >= C++26
+  // [refwrap.comparisons], comparisons
+  [[nodiscard]]
+  friend constexpr bool
+  operator==(reference_wrapper __x, reference_wrapper __y)
+  requires requires { { __x.get() == __y.get() } -> convertible_to; }
+  { return __x.get() == __y.get(); }
+
+  [[nodiscard]]
+  friend constexpr bool
+  operator==(reference_wrapper __x, const _Tp& __y)
+  requires requires { { __x.get() == __y } -> convertible_to; }
+  { return __x.get() == __y; }
+
+  [[nodiscard]]
+  friend constexpr bool
+  operator==(reference_wrapper __x, reference_wrapper __y)
+  requires (!is_const_v<_Tp>)
+   && requires { { __x.get() == __y.get() } -> convertible_to; }
+  { return __x.get() == __y.get(); }
+
+  [[nodiscard]]
+  friend constexpr auto
+  operator<=>(reference_wrapper __x, reference_wrapper<_Tp> __y)
+  requires requires { __detail::__synth3way(__x.get(), __y.get()); }
+  { return __detail::__synth3way(__x.get(), __y.get()); }
+
+  [[nodiscard]]
+  friend constexpr auto
+  operator<=>(reference_wrapper __x, const _Tp& __y)
+  requires requires { __detail::__synth3way(__x.get(), __y); }
+  { return __detail::__synth3way(__x.get(), __y); }
+
+  [[nodiscard]]
+  friend constexpr auto
+  operator<=>(reference_wrapper __x, reference_wrapper __y)
+  requires (!is_const_v<_Tp>)
+   && requires { __detail::__synth3way(__x.get(), __y.get()); }
+  { return __detail::__synth3way(__x.get(), __y.get()); }
+#endif
 };
 
 #if __cpp_deduction_guides
diff --git a/libstdc++-v3/include/bits/version.def 
b/libstdc++-v3/include/bits/version.def
index 5ad44941bff..5c0477fb61e 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -1760,6 +1760,14 @@ ftms = {
   };
 };
 
+ftms = {
+  name = reference_wrapper;
+  values = {
+v = 202403;
+cxxmin = 26;
+  };
+};
+
 ftms = {
   name = saturation_arithmetic;
   values = {
diff --git a/libstdc++-v3/include/bits/version.h 
b/libstdc++-v3/include/bits/version.h
index 460a3e0116a..65e708c73fb 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -1963,6 +1963,16 @@
 #endif /* !defined(__cpp_lib_ratio) && defined(__glibcxx_want_ratio) */
 #undef __glibcxx_want_ratio
 
+#if !defined(__cpp_lib_reference_wrapper)
+# if (__cplusplus >  202302L)
+#  define __glibcxx_reference_wrapper 202403L
+#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_reference_wrapper)
+#   define __cpp_lib_reference_wrapper 202403L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_reference_wrapper) && 
defined(__glibcxx_want_reference_wrapper) */
+#undef __glibcxx_want_reference_wrapper
+
 #if !defined(__cpp_lib_saturation_arithmetic)
 # if (__cplusplus >  202302L)
 #  define __glibcxx_saturation_arithmetic 202311L
diff --git a/libstdc++-v3/include/std/functional 
b/libstdc++-v3/include/std/functional
index 766558b3ce0..99364286a72 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -83,6 +83,7 @@
 #define 

[PATCH 1/4] libstdc++: Heterogeneous std::pair comparisons [PR113386]

2024-04-10 Thread Jonathan Wakely
Tested x86_64-linux.

Since this only affects C++20 and later it seems OK for trunk now.

-- >8 --

I'm only treating this as a DR for C++20 for now, because it's less work
and only requires changes to operator== and operator<=>. To do this for
older standards would require changes to the six relational operators
used pre-C++20.

libstdc++-v3/ChangeLog:

PR libstdc++/113386
* include/bits/stl_pair.h (operator==, operator<=>): Support
heterogeneous comparisons, as per LWG 3865.
* testsuite/20_util/pair/comparison_operators/lwg3865.cc: New
test.
---
 libstdc++-v3/include/bits/stl_pair.h  | 32 ++-
 .../pair/comparison_operators/lwg3865.cc  | 15 +
 2 files changed, 39 insertions(+), 8 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/20_util/pair/comparison_operators/lwg3865.cc

diff --git a/libstdc++-v3/include/bits/stl_pair.h 
b/libstdc++-v3/include/bits/stl_pair.h
index 4f5c8389fa6..45317417c9c 100644
--- a/libstdc++-v3/include/bits/stl_pair.h
+++ b/libstdc++-v3/include/bits/stl_pair.h
@@ -1000,23 +1000,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template pair(_T1, _T2) -> pair<_T1, _T2>;
 #endif
 
-  /// Two pairs of the same type are equal iff their members are equal.
-  template
+#if __cpp_lib_three_way_comparison && __cpp_lib_concepts
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 3865. Sorting a range of pairs
+
+  /// Two pairs are equal iff their members are equal.
+  template
 inline _GLIBCXX_CONSTEXPR bool
-operator==(const pair<_T1, _T2>& __x, const pair<_T1, _T2>& __y)
+operator==(const pair<_T1, _T2>& __x, const pair<_U1, _U2>& __y)
 { return __x.first == __y.first && __x.second == __y.second; }
 
-#if __cpp_lib_three_way_comparison && __cpp_lib_concepts
-  template
-constexpr common_comparison_category_t<__detail::__synth3way_t<_T1>,
-  __detail::__synth3way_t<_T2>>
-operator<=>(const pair<_T1, _T2>& __x, const pair<_T1, _T2>& __y)
+  /** Defines a lexicographical order for pairs.
+   *
+   * For two pairs of comparable types, `P` is ordered before `Q` if
+   * `P.first` is less than `Q.first`, or if `P.first` and `Q.first`
+   * are equivalent (neither is less than the other) and `P.second` is
+   * less than `Q.second`.
+  */
+  template
+constexpr common_comparison_category_t<__detail::__synth3way_t<_T1, _U1>,
+  __detail::__synth3way_t<_T2, _U2>>
+operator<=>(const pair<_T1, _T2>& __x, const pair<_U1, _U2>& __y)
 {
   if (auto __c = __detail::__synth3way(__x.first, __y.first); __c != 0)
return __c;
   return __detail::__synth3way(__x.second, __y.second);
 }
 #else
+  /// Two pairs of the same type are equal iff their members are equal.
+  template
+inline _GLIBCXX_CONSTEXPR bool
+operator==(const pair<_T1, _T2>& __x, const pair<_T1, _T2>& __y)
+{ return __x.first == __y.first && __x.second == __y.second; }
+
   /** Defines a lexicographical order for pairs.
*
* For two pairs of the same type, `P` is ordered before `Q` if
diff --git 
a/libstdc++-v3/testsuite/20_util/pair/comparison_operators/lwg3865.cc 
b/libstdc++-v3/testsuite/20_util/pair/comparison_operators/lwg3865.cc
new file mode 100644
index 000..2bbd54af192
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/pair/comparison_operators/lwg3865.cc
@@ -0,0 +1,15 @@
+// { dg-do run { target c++20 } }
+
+// LWG 3865. Sorting a range of pairs
+
+#include 
+#include 
+
+int main()
+{
+  std::pair p(1, 2);
+  std::pair p2(p.first, p.second);
+  VERIFY( p == p2 );
+  VERIFY( p <= p2 );
+  VERIFY( p >= p2 );
+}
-- 
2.44.0



[Patch, fortran] PR113363 - ICE on ASSOCIATE and unlimited polymorphic function

2024-04-10 Thread Paul Richard Thomas
Hi All,

This patch corrects incorrect results from assignment of unlimited
polymorphic function results both in assignment statements and allocation
with source.

The first chunk in trans-array.cc ensures that the array dtype is set to
the source dtype. The second chunk ensures that the lhs _len field does not
default to zero and so is specific to dynamic types of character.

The addition to trans-stmt.cc transforms the source expression, aka expr3,
from a derived type of type "STAR" into a proper unlimited polymorphic
expression ready for assignment to the newly allocated entity.

OK for mainline?

Paul

Fortran: Fix wrong code in unlimited polymorphic assignment [PR113363]

2024-04-10  Paul Thomas  

gcc/fortran
PR fortran/113363
* trans-array.cc (gfc_array_init_size): Use the expr3 dtype so
that the correct element size is used.
(gfc_alloc_allocatable_for_assignment): Set the _len field for
unlimited polymorphic assignments.
* trans-stmt.cc (gfc_trans_allocate): Build a correct rhs for
the assignment of an unlimited polymorphic 'source'.

gcc/testsuite/
PR fortran/113363
* gfortran.dg/pr113363.f90: New test.
diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 30b84762346..2f9a32dda15 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -5957,6 +5957,11 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
   tmp = gfc_conv_descriptor_dtype (descriptor);
   gfc_add_modify (pblock, tmp, gfc_get_dtype_rank_type (rank, type));
 }
+  else if (expr3_desc && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (expr3_desc)))
+{
+  tmp = gfc_conv_descriptor_dtype (descriptor);
+  gfc_add_modify (pblock, tmp, gfc_conv_descriptor_dtype (expr3_desc));
+}
   else
 {
   tmp = gfc_conv_descriptor_dtype (descriptor);
@@ -11324,6 +11329,9 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
 	gfc_add_modify (, tmp,
 			fold_convert (TREE_TYPE (tmp),
 	  TYPE_SIZE_UNIT (type)));
+	  else if (UNLIMITED_POLY (expr2))
+	gfc_add_modify (, tmp,
+			gfc_class_len_get (TREE_OPERAND (desc, 0)));
 	  else
 	gfc_add_modify (, tmp,
 			build_int_cst (TREE_TYPE (tmp), 0));
diff --git a/gcc/fortran/trans-stmt.cc b/gcc/fortran/trans-stmt.cc
index 7997c167bae..c6953033cf4 100644
--- a/gcc/fortran/trans-stmt.cc
+++ b/gcc/fortran/trans-stmt.cc
@@ -7187,6 +7187,45 @@ gfc_trans_allocate (gfc_code * code, gfc_omp_namelist *omp_allocate)
 	  gfc_expr *rhs = e3rhs ? e3rhs : gfc_copy_expr (code->expr3);
 	  flag_realloc_lhs = 0;
 
+	  /* The handling of code->expr3 above produces a derived type of
+	 type "STAR", whose size defaults to size(void*). In order to
+	 have the right type information for the assignment, we must
+	 reconstruct an unlimited polymorphic rhs.  */
+	  if (UNLIMITED_POLY (code->expr3)
+	  && e3rhs && e3rhs->ts.type == BT_DERIVED
+	  && !strcmp (e3rhs->ts.u.derived->name, "STAR"))
+	{
+	  gfc_ref *ref;
+	  gcc_assert (TREE_CODE (expr3_vptr) == COMPONENT_REF);
+	  tmp = gfc_create_var (gfc_typenode_for_spec (>expr3->ts),
+"e3");
+	  gfc_add_modify (, tmp,
+			  gfc_get_class_from_expr (expr3_vptr));
+	  rhs->symtree->n.sym->backend_decl = tmp;
+	  rhs->ts = code->expr3->ts;
+	  rhs->symtree->n.sym->ts = rhs->ts;
+	  for (ref = init_expr->ref; ref; ref = ref->next)
+		{
+		  /* Copy over the lhs _data component ref followed by the
+		 full array reference for source expressions with rank.
+		 Otherwise, just copy the _data component ref.  */
+		  if (code->expr3->rank
+		  && ref && ref->next && !ref->next->next)
+		{
+		  rhs->ref = gfc_copy_ref (ref);
+		  rhs->ref->next = gfc_copy_ref (ref->next);
+		  break;
+		}
+		  else if ((init_expr->rank && !code->expr3->rank
+			&& ref && ref->next && !ref->next->next)
+			   || (ref && !ref->next))
+		{
+		  rhs->ref = gfc_copy_ref (ref);
+		  break;
+		}
+		}
+	}
+
 	  /* Set the symbol to be artificial so that the result is not finalized.  */
 	  init_expr->symtree->n.sym->attr.artificial = 1;
 	  tmp = gfc_trans_assignment (init_expr, rhs, true, false, true,
diff --git a/gcc/testsuite/gfortran.dg/pr113363.f90 b/gcc/testsuite/gfortran.dg/pr113363.f90
new file mode 100644
index 000..7701539fdff
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr113363.f90
@@ -0,0 +1,86 @@
+! { dg-do run }
+! Test the fix for comment 1 in PR113363, which failed as in comments below.
+! Contributed by Harald Anlauf  
+program p
+  implicit none
+  class(*), allocatable :: x(:), y
+  character(*), parameter :: arr(2) = ["hello ","bye   "], &
+ sca = "Have a nice day"
+
+! Bug was detected in polymorphic array function results
+  allocate(x, source = foo ())
+  call check1 (x, arr)  ! Wrong output "6 hello e"
+  deallocate (x)
+  x = foo ()
+  call check1 (x, arr)  ! Wrong output "0  "
+  associate 

Re: [PATCH V4 1/3] aarch64: Place target independent and dependent changed code in one file

2024-04-10 Thread Alex Coplan
Hi Ajit,

On 09/04/2024 20:59, Ajit Agarwal wrote:
> Hello Alex:
> 
> On 09/04/24 8:39 pm, Alex Coplan wrote:
> > On 09/04/2024 20:01, Ajit Agarwal wrote:
> >> Hello Alex:
> >>
> >> On 09/04/24 7:29 pm, Alex Coplan wrote:
> >>> On 09/04/2024 17:30, Ajit Agarwal wrote:
> 
> 
>  On 05/04/24 10:03 pm, Alex Coplan wrote:
> > On 05/04/2024 13:53, Ajit Agarwal wrote:
> >> Hello Alex/Richard:
> >>
> >> All review comments are incorporated.
> >
> > Thanks, I was kind-of expecting you to also send the renaming patch as a
> > preparatory patch as we discussed.
> >
> > Sorry for another meta comment, but: I think the reason that the Linaro
> > CI isn't running tests on your patches is actually because you're
> > sending 1/3 of a series but not sending the rest of the series.
> >
> > So please can you either send this as an individual preparatory patch
> > (not marked as a series) or if you're going to send a series (e.g. with
> > a preparatory rename patch as 1/2 and this as 2/2) then send the entire
> > series when you make updates.  That way the CI should test your patches,
> > which would be helpful.
> >
> 
>  Addressed.
>   
> >>
> >> Common infrastructure of load store pair fusion is divided into target
> >> independent and target dependent changed code.
> >>
> >> Target independent code is the Generic code with pure virtual function
> >> to interface between target independent and dependent code.
> >>
> >> Target dependent code is the implementation of pure virtual function 
> >> for
> >> aarch64 target and the call to target independent code.
> >>
> >> Thanks & Regards
> >> Ajit
> >>
> >>
> >> aarch64: Place target independent and dependent changed code in one 
> >> file
> >>
> >> Common infrastructure of load store pair fusion is divided into target
> >> independent and target dependent changed code.
> >>
> >> Target independent code is the Generic code with pure virtual function
> >> to interface between target independent and dependent code.
> >>
> >> Target dependent code is the implementation of pure virtual function 
> >> for
> >> aarch64 target and the call to target independent code.
> >>
> >> 2024-04-06  Ajit Kumar Agarwal  
> >>
> >> gcc/ChangeLog:
> >>
> >>* config/aarch64/aarch64-ldp-fusion.cc: Place target
> >>independent and dependent changed code.
> >
> > You're going to need a proper ChangeLog eventually, but I guess there's
> > no need for that right now.
> >
> >> ---
> >>  gcc/config/aarch64/aarch64-ldp-fusion.cc | 371 +++
> >>  1 file changed, 249 insertions(+), 122 deletions(-)
> >>
> >> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
> >> b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> >> index 22ed95eb743..cb21b514ef7 100644
> >> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> >> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> >> @@ -138,8 +138,122 @@ struct alt_base
> >>poly_int64 offset;
> >>  };
> >>  
> >> +// Virtual base class for load/store walkers used in alias analysis.
> >> +struct alias_walker
> >> +{
> >> +  virtual bool conflict_p (int ) const = 0;
> >> +  virtual insn_info *insn () const = 0;
> >> +  virtual bool valid () const  = 0;
> >
> > Heh, looking at this made me realise there is a whitespace bug here in
> > the existing code (double space after const).  Sorry about that!  I'll
> > push an obvious fix for that.
> >
> >> +  virtual void advance () = 0;
> >> +};
> >> +
> >> +struct pair_fusion {
> >> +
> >> +  pair_fusion () {};
> >
> > This ctor looks pointless at the moment.  Perhaps instead we could put
> > the contents of ldp_fusion_init in here and then delete that function?
> >
> 
>  Addressed.
> 
> >> +  virtual bool fpsimd_op_p (rtx reg_op, machine_mode mem_mode,
> >> + bool load_p) = 0;
> >
> > Please can we have comments above each of these virtual functions
> > describing any parameters, what the purpose of the hook is, and the
> > interpretation of the return value?  This will serve as the
> > documentation for other targets that want to make use of the pass.
> >
> > It might make sense to have a default-false implementation for
> > fpsimd_op_p, especially if you don't want to make use of this bit for
> > rs6000.
> >
> 
>  Addressed.
>   
> >> +
> >> +  virtual bool pair_operand_mode_ok_p (machine_mode mode) = 0;
> >> +  virtual bool pair_trailing_writeback_p () = 0;
> >
> > Sorry for the run-around, but: I think this and
> > handle_writeback_opportunities () should be the same function, either
> > returning 

Re: [PATCH] testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]

2024-04-10 Thread Kewen.Lin
on 2024/4/10 15:11, Richard Biener wrote:
> On Wed, Apr 10, 2024 at 8:24 AM Kewen.Lin  wrote:
>>
>> Hi,
>>
>> pr113359-2_*.c define a struct having unsigned long type
>> members ay and az which have 4 bytes size at -m32, while
>> the related constants CL1 and CL2 used for equality check
>> are always 8 bytes, it makes compiler consider the below
>>
>>   69   if (a.ay != CL1)
>>   70 __builtin_abort ();
>>
>> always to abort and optimize away the following call to
>> getb, which leads to the expected wpa dumping on
>> "Semantic equality" missing.
>>
>> This patch is to modify the types with unsigned long long
>> accordingly.  Tested well on powerpc64-linux-gnu.
>>
>> Is it ok for trunk?
> 
> OK

Thanks!  Pushed as r14-9886.

BR,
Kewen

> 
>> BR,
>> Kewen
>> -
>> PR testsuite/114662
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.dg/lto/pr113359-2_0.c: Use unsigned long long instead of
>> unsigned long.
>> * gcc.dg/lto/pr113359-2_1.c: Likewise.
>> ---
>>  gcc/testsuite/gcc.dg/lto/pr113359-2_0.c | 8 
>>  gcc/testsuite/gcc.dg/lto/pr113359-2_1.c | 8 
>>  2 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c 
>> b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
>> index 8b2d5bdfab2..8495667599d 100644
>> --- a/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
>> +++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_0.c
>> @@ -8,15 +8,15 @@
>>  struct SA
>>  {
>>unsigned int ax;
>> -  unsigned long ay;
>> -  unsigned long az;
>> +  unsigned long long ay;
>> +  unsigned long long az;
>>  };
>>
>>  struct SB
>>  {
>>unsigned int bx;
>> -  unsigned long by;
>> -  unsigned long bz;
>> +  unsigned long long by;
>> +  unsigned long long bz;
>>  };
>>
>>  struct ZA
>> diff --git a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c 
>> b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
>> index 61bc0547981..8320f347efe 100644
>> --- a/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
>> +++ b/gcc/testsuite/gcc.dg/lto/pr113359-2_1.c
>> @@ -5,15 +5,15 @@
>>  struct SA
>>  {
>>unsigned int ax;
>> -  unsigned long ay;
>> -  unsigned long az;
>> +  unsigned long long ay;
>> +  unsigned long long az;
>>  };
>>
>>  struct SB
>>  {
>>unsigned int bx;
>> -  unsigned long by;
>> -  unsigned long bz;
>> +  unsigned long long by;
>> +  unsigned long long bz;
>>  };
>>
>>  struct ZA
>> --
>> 2.43.0



[PATCH] wwwdocs: gcc-14: Add RISC-V changes

2024-04-10 Thread Kito Cheng
---
 htdocs/gcc-14/changes.html | 155 -
 1 file changed, 154 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 2d8968cf..6cbb2e8f 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -739,7 +739,160 @@ __asm (".global __flmap_lock"  "\n\t"
 
 
 
-
+RISC-V
+
+  The SLP and loop vectorizer is now enabled for RISC-V when the vector
+  extension is enabled, thanks to Ju-Zhe Zhong from
+  RiVAI,
+  Pan Li from Intel, and Robin Dapp
+  from Ventana Micro for
+  contributing most of the implementation!
+  The -mrvv-max-lmul= option has been introduced for
+  performance tuning of the loop vectorizer. The default value is
+  -mrvv-max-lmul=m1, which limits the maximum LMUL to 1.
+  The -mrvv-max-lmul=dynamic setting can dynamically select
+  the maximum LMUL value based on register pressure.
+  Atomic code generation has been improved and is now in conformance with
+  the latest psABI specification, thanks to Patrick O'Neill from
+  Rivos.
+  Support for the vector intrinsics as specified in
+  
+  version 1.0 of the RISC-V vector intrinsic specification.
+  Support for the experimental vector crypto intrinsics as specified in
+  
+  RISC-V vector intrinsic specification, thanks to Feng Wang et al.
+  from https://eswincomputing.com/;>ESWIN Computing
+  Support for the T-head vector intrinsics.
+  Support for the scalar bitmanip and scalar crypto  intrinsics, thanks to
+  Liao Shihua from https://plctlab.org/;>PLCT.
+  Support for the large code model via option -mcmodel=large,
+  thanks to Kuan-Lin Chen from
+  https://www.andestech.com/;>Andes Technology.
+  Support for the standard vector calling convention variant, thanks to
+  Lehua Ding from RiVAI.
+  Supports the target attribute, which allows users to compile
+  a function with specific extensions.
+  -march= option no longer requires the architecture string
+  to be in canonical order, with only a few constraints remaining: the
+  architecture string must start with rv[32|64][i|g|e], and
+  must use an underscore as the separator after a multi-letter extension.
+  
+  -march=help option has been introduced to dump all
+  supported extensions.
+  Added experimental support for the -mrvv-vector-bits=zvl
+  option and the riscv_rvv_vector_bits attribute, which
+  specify a fixed length for scalable vector types. This option is
+  optimized for specific vector core implementations; however, the code
+  generated with this option is NOT portable,
+  thanks to Pan Li from https://www.intel.com/;>Intel.
+  
+  Support for TLS descriptors has been introduced, which can be enabled by
+  the -mtls-dialect=desc option. The default behavior can be
+  configured with --with-tls=[trad|desc].
+  Support for the TLS descriptors, this can be enabled by
+  -mtls-dialect=desc and the default behavior can be configure
+  by --with-tls=[trad|desc], thanks to Tatsuyuki Ishi from
+  https://bluewhale.systems/;>Blue Whale Systems
+  
+  Support for the following standard extensions has been added:
+
+  Vector crypto extensions:
+   
+ Zvbb
+ Zvkb
+ Zvbc
+ Zvkg
+ Zvkned
+ Zvkhna
+ Zvkhnb
+ Zvksed
+ Zvksh
+ Zvkn
+ Zvknc
+ Zvkng
+ Zvks
+ Zvksc
+ Zvksg
+ Zvkt
+   
+  
+  Code size reduction extensions:
+   
+ Zca
+ Zcb
+ Zce
+ Zcf
+ Zcd
+ Zcmp
+ Zcmt
+   
+  
+  Zicond
+  Zfa
+  Ztso
+  Zvfbfmin
+  Zvfhmin
+  Zvfh
+  Za64rs
+  Za128rs
+  Ziccif
+  Ziccrse
+  Ziccamoa
+  Zicclsm
+  Zic64b
+  Smaia
+  Smepmp
+  Smstateen
+  Ssaia
+  Sscofpmf
+  Ssstateen
+  Sstc
+  Svinval
+  Svnapot
+  Svpbmt
+
+  
+  Support for the following vendor extensions has been added:
+
+  T-Head:
+   
+ XTheadVector
+   
+  
+  CORE-V:
+   
+ XCVmac
+ XCValu
+ XCVelw
+ XCVsimd
+ XCVbi
+   
+  
+  Ventana Micro:
+   
+ XVentanaCondops
+   
+  
+
+  
+  The following new CPUs are supported through the -mcpu
+  option (GCC identifiers in parentheses).
+
+  SiFive's X280 (sifive-x280).
+  SiFive's P450 (sifive-p450).
+  SiFive's P670 (sifive-p670).
+
+  
+  The following new CPUs are supported through the -mtune
+  option (GCC identifiers in parentheses).
+
+  Generic out-of-order core (generic-ooo).
+  SiFive's P400 series (sifive-p400-series).
+  SiFive's P600 series (sifive-p600-series).
+  XiangShan's Nanhu microarchitecture 
(xiangshan-nanhu).
+
+  
+
 
 
 
-- 
2.34.1



Re: [PATCH] Regenerate opt.urls

2024-04-10 Thread Andreas Schwab
On Apr 09 2024, Palmer Dabbelt wrote:

> I didn't actually regenerate this as I can't figure out how,

make regenerate-opt-urls

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Regeneration of 'gcc/config/riscv/riscv.opt.urls' (was: [PATCH v2 2/3] aarch64: Add support for aarch64-gnu (GNU/Hurd on AArch64))

2024-04-10 Thread Thomas Schwinge
Hi!

On 2024-04-09T09:24:29-0700, Palmer Dabbelt  wrote:
> On Tue, 09 Apr 2024 01:04:34 PDT (-0700), buga...@gmail.com wrote:
>> On Tue, Apr 9, 2024 at 10:27 AM Thomas Schwinge  
>> wrote:
>>> Thanks, pushed to trunk branch:
>>>
>>>   - commit 532c57f8c3a15b109a46d3e2b14d60a5c40979d5 "Move GNU/Hurd 
>>> startfile spec from config/i386/gnu.h to config/gnu.h"
>>>   - commit 9670a2326333caa8482377c00beb65723b7b4b26 "aarch64: Add support 
>>> for aarch64-gnu (GNU/Hurd on AArch64)"
>>>   - commit 46c91665f4bceba19aed56f5bd6e934c548b84ff "libgcc: Add basic 
>>> support for aarch64-gnu (GNU/Hurd on AArch64)"
>>
>> \o/ Thanks a lot!
>>
>> This will unblock merging the aarch64-gnu glibc port upstream.

\o/


>> I assume the buildbot failure that I just got an email about is
>> unrelated; it's failing on some RISC-V thing.
>
> Sorry if I missed something here, do you have a pointer?


and several more such messages, requesting:

--- a/gcc/config/riscv/riscv.opt.urls
+++ b/gcc/config/riscv/riscv.opt.urls
@@ -89,3 +89,5 @@ UrlSuffix(gcc/RISC-V-Options.html#index-minline-strncmp)
 minline-strlen
 UrlSuffix(gcc/RISC-V-Options.html#index-minline-strlen)
 
+; skipping UrlSuffix for 'mtls-dialect=' due to finding no URLs
+

To be fixed by

"Regenerate opt.urls".


Grüße
 Thomas

