[PATCH, RFC] combine: Don't truncate const operand of AND if it's no benefits

2024-03-10 Thread HAO CHEN GUI
Hi,
  This patch tries to fix the problem when a canonical form doesn't benefit
on a specific target. The const operand of AND is and with the nonzero
bits of another operand in combine pass. It's a canonical form, but it's no
benefits for the target which has rotate and mask insns. As the mask is
truncated, it can't match the insn conditions which it originally matches.
For example, the following insn condition checks the sum of two AND masks.
When one of the mask is truncated, the condition breaks.

(define_insn "*rotlsi3_insert_5"
  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
(ior:SI (and:SI (match_operand:SI 1 "gpc_reg_operand" "0,r")
(match_operand:SI 2 "const_int_operand" "n,n"))
(and:SI (match_operand:SI 3 "gpc_reg_operand" "r,0")
(match_operand:SI 4 "const_int_operand" "n,n"]
  "rs6000_is_valid_mask (operands[2], NULL, NULL, SImode)
   && UINTVAL (operands[2]) != 0 && UINTVAL (operands[4]) != 0
   && UINTVAL (operands[2]) + UINTVAL (operands[4]) + 1 == 0"
...

  This patch tries to fix the problem by comparing the rtx cost. If another
operand (varop) is not changed and rtx cost with new mask is not less than
the original one, the mask is restored to original one.

  I'm not sure if comparison of rtx cost here is proper. The outer code is
unknown and I suppose it as "SET". Also the rtx cost might not be accurate.
>From my understanding, the canonical forms should always benefit as it can't
be undo in combine pass. Do we have a perfect solution for this kind of
issues? Looking forward for your advice.

  Another similar issues for canonical forms. Whether the widen mode for
lshiftrt is always good?
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624852.html

Thanks
Gui Haochen

ChangeLog
Combine: Don't truncate const operand of AND if it's no benefits

In combine pass, the canonical form is to turn off all bits in the constant
that are know to already be zero for AND.

  /* Turn off all bits in the constant that are known to already be zero.
 Thus, if the AND isn't needed at all, we will have CONSTOP == NONZERO_BITS
 which is tested below.  */

  constop &= nonzero;

But it doesn't benefit when the target has rotate and mask insert insns.
The AND mask is truncated and lost its information.  Thus it can't match
the insn conditions.  For example, the following insn condition checks
the sum of two AND masks.

(define_insn "*rotlsi3_insert_5"
  [(set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
(ior:SI (and:SI (match_operand:SI 1 "gpc_reg_operand" "0,r")
(match_operand:SI 2 "const_int_operand" "n,n"))
(and:SI (match_operand:SI 3 "gpc_reg_operand" "r,0")
(match_operand:SI 4 "const_int_operand" "n,n"]
  "rs6000_is_valid_mask (operands[2], NULL, NULL, SImode)
   && UINTVAL (operands[2]) != 0 && UINTVAL (operands[4]) != 0
   && UINTVAL (operands[2]) + UINTVAL (operands[4]) + 1 == 0"
...

This patch restores the const operand of AND if the another operand is
not optimized and the truncated const operand doesn't save the rtx cost.

gcc/
* combine.cc (simplify_and_const_int_1): Restore the const operand
of AND if varop is not optimized and the rtx cost of the new const
operand is not reduced.

gcc/testsuite/
* gcc.target/powerpc/rlwimi-0.c: Reduced total number of insns and
adjust the number of rotate and mask insns.
* gcc.target/powerpc/rlwimi-1.c: Likewise.
* gcc.target/powerpc/rlwimi-2.c: Likewise.

patch.diff
diff --git a/gcc/combine.cc b/gcc/combine.cc
index a4479f8d836..16ff09ea854 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -10161,8 +10161,23 @@ simplify_and_const_int_1 (scalar_int_mode mode, rtx 
varop,
   if (constop == nonzero)
 return varop;

-  if (varop == orig_varop && constop == orig_constop)
-return NULL_RTX;
+  if (varop == orig_varop)
+{
+  if (constop == orig_constop)
+   return NULL_RTX;
+  else
+   {
+ rtx tmp = simplify_gen_binary (AND, mode, varop,
+gen_int_mode (constop, mode));
+ rtx orig = simplify_gen_binary (AND, mode, varop,
+ gen_int_mode (orig_constop, mode));
+ if (set_src_cost (tmp, mode, optimize_this_for_speed_p)
+ < set_src_cost (orig, mode, optimize_this_for_speed_p))
+   return tmp;
+ else
+   return NULL_RTX;
+   }
+}

   /* Otherwise, return an AND.  */
   return simplify_gen_binary (AND, mode, varop, gen_int_mode (constop, mode));
diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-0.c 
b/gcc/testsuite/gcc.target/powerpc/rlwimi-0.c
index 961be199901..d9dd4419f1d 100644
--- a/gcc/testsuite/gcc.target/powerpc/rlwimi-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-0.c
@@ -2,15 +2,15 @@
 /* { dg-options "-O2" } */

 /* { dg-final { scan-assembler-times 

Re: [RFC] [PR tree-optimization/92539] Optimize away tests against invalid pointers

2024-03-10 Thread Andrew Pinski
On Sun, Mar 10, 2024 at 2:09 PM Jeff Law  wrote:
>
>
>
> On 3/10/24 3:05 PM, Andrew Pinski wrote:
> > On Sun, Mar 10, 2024 at 2:04 PM Jeff Law  wrote:
> >>
> >> Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false
> >> positive triggered by loop unrolling.
> >>
> >> As I speculated a couple years ago, we could eliminate the comparisons
> >> against bogus pointers.  Consider:
> >>
> >>> [local count: 30530247]:
> >>>if (last_12 !=   [(void *)"aa" + 3B])
> >>>  goto ; [54.59%]
> >>>else
> >>>  goto ; [45.41%]
> >>
> >>
> >> That's a valid comparison as ISO allows us to generate, but not
> >> dereference, a pointer one element past the end of the object.
> >>
> >> But +4B is a bogus pointer.  So given an EQ comparison against that
> >> pointer we could always return false and for NE always return true.
> >>
> >> VRP and DOM seem to be the most natural choices for this kind of
> >> optimization on the surface.  However DOM is actually not viable because
> >> the out-of-bounds pointer warning pass is run at the end of VRP.  So
> >> we've got to take care of this prior to the end of VRP.
> >>
> >>
> >>
> >> I haven't done a bootstrap or regression test with this.  But if it
> >> looks reasonable I can certainly push on it further. I have confirmed it
> >> does eliminate the tests and shuts up the bogus warning.
> >>
> >> The downside is this would also shut up valid warnings if user code did
> >> this kind of test.
> >>
> >> Comments/Suggestions?
> >
> > ENOPATCH
> Yea, realized it as I pushed the send button.  Then t-bird crashed,
> repeatedly.
>
> Attached this time..


One minor comment on it

The comment:
> return true for EQ and false for NE.

Seems to be the opposite for what the code does:
> return code == EQ_EXPR ? boolean_false_node : boolean_true_node;

Thanks,
Andrew


>
> jeff
>


Re: [PATCH] fix PowerPC < 7 w/ Altivec not to default to power7

2024-03-10 Thread Kewen.Lin
Hi,

on 2024/3/8 19:33, Rene Rebe wrote:
> This might not be the best timing -short before a major release-,
> however, Sam just commented on the bug I filled years ago [1], so here
> we go:
> 
> Glibc uses .machine to determine assembler optimizations to use.
> However, since reworking the rs6000 .machine output selection in
> commit e154242724b084380e3221df7c08fcdbd8460674 22 May 2019, G5 as
> well as Cell, and even power4 w/ -maltivec currently resulted in
> power7. Mask _ALTIVEC away as the .machine selection already did for
> GFX and GPOPT.

Thanks for fixing, this fix looks reasonable to me, OPTION_MASK_ALTIVEC
is a part of POWERPC_7400_MASK so any specified cpu type which has this
POWERPC_7400_MASK by default and isn't handled early in function
rs6000_machine_from_flags can suffer from this issue.

> 
> powerpc64-t2-linux-gnu-gcc  test.c -S -o - -mcpu=G5
>   .file   "test.c"
>   .machine power7
>   .abiversion 2
>   .section".text"
>   .ident  "GCC: (GNU) 10.2.0"
>   .section.note.GNU-stack,"",@progbits
> 

Nit: Could you also add one test case for this?

btw, -mdejagnu-cpu=G5 can force the cpu type in dg-options.

> We ship this in T2/Linux [2] since 2020 and it is tested on G5, Cell
> and Power8.
> 
> Signed-of-by: René Rebe 
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97367
> [2] https://t2sde.org
> 
> --- gcc-11.1.0-RC-20210423/gcc/config/rs6000/rs6000.cc.vanilla
> 2021-04-25 22:57:16.964223106 +0200
> +++ gcc-11.1.0-RC-20210423/gcc/config/rs6000/rs6000.cc2021-04-25 
> 22:57:27.193223841 +0200
> @@ -5765,7 +5765,7 @@
>HOST_WIDE_INT flags = rs6000_isa_flags;
>  
>/* Disable the flags that should never influence the .machine selection.  
> */
> -  flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | 
> OPTION_MASK_ISEL);
> +  flags &= ~(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_PPC_GPOPT | 
> OPTION_MASK_ALTIVEC | OPTION_MASK_ISEL);

Nit: This line is too long and needs re-format.

BR,
Kewen

>  
>if ((flags & (ISA_3_1_MASKS_SERVER & ~ISA_3_0_MASKS_SERVER)) != 0)
>  return "power10";
> 



[COMMITTED] Fold: Fix up merge_truthop_with_opposite_arm for NaNs [PR95351]

2024-03-10 Thread Andrew Pinski
The problem here is that merge_truthop_with_opposite_arm would
use the type of the result of the comparison rather than the operands
of the comparison to figure out if we are honoring NaNs.
This fixes that oversight and now we get the correct results in this
case.

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.

PR middle-end/95351

gcc/ChangeLog:

* fold-const.cc (merge_truthop_with_opposite_arm): Use
the type of the operands of the comparison and not the type
of the comparison.

gcc/testsuite/ChangeLog:

* gcc.dg/float_opposite_arm-1.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/fold-const.cc   |  3 ++-
 gcc/testsuite/gcc.dg/float_opposite_arm-1.c | 17 +
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/float_opposite_arm-1.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 43105d20be3..299c22bf391 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -6420,7 +6420,6 @@ static tree
 merge_truthop_with_opposite_arm (location_t loc, tree op, tree cmpop,
 bool rhs_only)
 {
-  tree type = TREE_TYPE (cmpop);
   enum tree_code code = TREE_CODE (cmpop);
   enum tree_code truthop_code = TREE_CODE (op);
   tree lhs = TREE_OPERAND (op, 0);
@@ -6436,6 +6435,8 @@ merge_truthop_with_opposite_arm (location_t loc, tree op, 
tree cmpop,
   if (TREE_CODE_CLASS (code) != tcc_comparison)
 return NULL_TREE;
 
+  tree type = TREE_TYPE (TREE_OPERAND (cmpop, 0));
+
   if (rhs_code == truthop_code)
 {
   tree newrhs = merge_truthop_with_opposite_arm (loc, rhs, cmpop, 
rhs_only);
diff --git a/gcc/testsuite/gcc.dg/float_opposite_arm-1.c 
b/gcc/testsuite/gcc.dg/float_opposite_arm-1.c
new file mode 100644
index 000..d2dbff35066
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/float_opposite_arm-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-original -fdump-tree-optimized" } */
+/* { dg-add-options ieee } */
+/* PR middle-end/95351 */
+
+int Foo(double possiblyNAN, double b, double c)
+{
+return (possiblyNAN <= 2.0) || ((possiblyNAN  > 2.0) && (b > c));
+}
+
+/* Make sure we don't remove either >/<=  */
+
+/* { dg-final { scan-tree-dump "possiblyNAN > 2.0e.0" "original" } } */
+/* { dg-final { scan-tree-dump "possiblyNAN_\[0-9\]+.D. > 2.0e.0" "optimized" 
} } */
+
+/* { dg-final { scan-tree-dump "possiblyNAN <= 2.0e.0" "original" } } */
+/* { dg-final { scan-tree-dump "possiblyNAN_\[0-9\]+.D. <= 2.0e.0" "optimized" 
} } */
-- 
2.43.0



RE: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-10 Thread Li, Pan2
> You might want to investigate why you get mask and not Len for a particular 
> stmt.  mixing will cause variable length vectorization to fail.

Yes, the new added gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c cannot 
vectorize, will try to investigate why.

Pan

-Original Message-
From: Richard Biener  
Sent: Monday, March 11, 2024 1:05 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, 
Yanzhang ; rdapp@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and 
store are enabled



> Am 10.03.2024 um 11:02 schrieb Li, Pan2 :
> 
> Committed, thanks Richard.

You might want to investigate why you get mask and not Len for a particular 
stmt.  mixing will cause variable length vectorization to fail.

> Pan
> 
> -Original Message-
> From: Richard Biener 
> Sent: Sunday, March 10, 2024 2:53 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> Wang, Yanzhang ; rdapp@gmail.com; 
> jeffreya...@gmail.com
> Subject: Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len 
> and store are enabled
> 
> 
> 
>> Am 10.03.2024 um 04:14 schrieb pan2...@intel.com:
>> 
>> From: Pan Li 
>> 
>> This patch would like to fix one ICE in vectorizable_store when both the
>> loop_masks and loop_lens are enabled.  The ICE looks like below when build
>> with "-march=rv64gcv -O3".
>> 
>> during GIMPLE pass: vect
>> test.c: In function ‘d’:
>> test.c:6:6: internal compiler error: in vectorizable_store, at
>> tree-vect-stmts.cc:8691
>>   6 | void d() {
>> |  ^
>> 0x37a6f2f vectorizable_store
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
>> 0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
>> _slp_tree*, _slp_instance*, vec*)
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
>> 0x1db5dca vect_analyze_loop_operations
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
>> 0x1db885b vect_analyze_loop_2
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
>> 0x1dba029 vect_analyze_loop_1
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
>> 0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
>> 0x1e389d1 try_vectorize_loop_1
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
>> 0x1e38f3d try_vectorize_loop
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
>> 0x1e39230 execute
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298
>> 
>> There are two ways to reach vectorizer LD/ST, one is the analysis and
>> the other is transform.  We cannot have both the lens and the masks
>> enabled during transform but it is valid during analysis.  Given the
>> transform doesn't required cost_vec,  we can only enable the assert
>> based on cost_vec is NULL or not.
>> 
>> Below testsuites are passed for this patch:
>> * The x86 bootstrap tests.
>> * The x86 fully regression tests.
>> * The aarch64 fully regression tests.
>> * The riscv fully regressison tests.
> 
> Ok
> 
> Thanks,
> Richard
> 
>> gcc/ChangeLog:
>> 
>>   * tree-vect-stmts.cc (vectorizable_store): Enable the assert
>>   during transform process.
>>   (vectorizable_load): Ditto.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>   * gcc.target/riscv/rvv/base/pr114195-1.c: New test.
>> 
>> Signed-off-by: Pan Li 
>> ---
>> .../gcc.target/riscv/rvv/base/pr114195-1.c | 15 +++
>> gcc/tree-vect-stmts.cc | 18 ++
>> 2 files changed, 29 insertions(+), 4 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> 
>> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c 
>> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> new file mode 100644
>> index 000..a67b847112b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> @@ -0,0 +1,15 @@
>> +/* Test that we do not have ice when compile */
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
>> +
>> +long a, b;
>> +extern short c[];
>> +
>> +void d() {
>> +  for (int e = 0; e < 35; e = 2) {
>> +a = ({ a < 0 ? a : 0; });
>> +b = ({ b < 0 ? b : 0; });
>> +
>> +c[e] = 0;
>> +  }
>> +}
>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> index 14a3ffb5f02..e8617439a48 100644
>> --- a/gcc/tree-vect-stmts.cc
>> +++ b/gcc/tree-vect-stmts.cc
>> @@ -8697,8 +8697,13 @@ vectorizable_store (vec_info *vinfo,
>>   ? _VINFO_LENS (loop_vinfo)
>>   : NULL);
>> 
>> -  /* Shouldn't go with length-based approach if fully masked.  */
>> -  gcc_assert (!loop_lens || !loop_masks);
>> +  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
>> + are some difference here.  We cannot enable both the lens and masks
>> + during transform but it is allowed during analysis.
>> + Shouldn't go with 

Re: [PATCH wwwdocs] gcc-14: Some very common historic Autoconf probes that no longer work

2024-03-10 Thread Gerald Pfeifer
On Sat, 17 Feb 2024, Florian Weimer wrote:
> +
> +Older Autoconf versions (for example, Autoconf 2.13) generate core
> +probes that are incompatible with C99.  These include the basic
> +compiler functionality check:
:
:

Yes, thank you!

Gerald

PS: Feel free to copy me on wwwdocs patches.


[PATCH, v2] Fortran: use name of array component in runtime error message [PR30802]

2024-03-10 Thread Harald Anlauf

Dear all,

after playing for some time with NAG and Intel, and an off-list
discussion with Jerry, I am getting more and more convinced that
simpler runtime error messages (also simpler to parse by a human)
are superior to awkward solutions.  This is also what Intel does:
use only the name of the array (component) in the message whose
indices are out of bounds.

(NAG's solution appears also inconsistent for nested derived types.)

So no x%z, or x%_data, etc. in runtime error messages any more.

Please give it a spin...

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald


On 1/30/24 11:46, Mikael Morin wrote:

Le 30/01/2024 à 11:38, Mikael Morin a écrit :


Another (easier) way to clarify the data reference would be rephrasing 
the message so that the array part is separate from the scalar part, 
like so (there are too many 'of', but I lack inspiration):

Index '0' of dimension 1 of component 'zz' of element from 'x1%vv'
below lower bound of 1


This has the same number of 'of' but sounds better maybe:
Out of bounds accessing component 'zz' of element from 'x1%yy': index 
'0' of dimension 1 below lower bound of 1


From cdf3b197beed0ce1649661b2132643b54cbade8d Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sun, 10 Mar 2024 22:14:30 +0100
Subject: [PATCH] Fortran: use name of array component in runtime error message
 [PR30802]

gcc/fortran/ChangeLog:

	PR fortran/30802
	* trans-array.cc (trans_array_bound_check): Find name of component
	to use in runtime error message.
	(array_bound_check_elemental): Likewise.
	(gfc_conv_array_ref): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/30802
	* gfortran.dg/bounds_check_17.f90: Adjust dg-pattern.
	* gfortran.dg/bounds_check_fail_6.f90: Likewise.
	* gfortran.dg/pr92050.f90: Likewise.
	* gfortran.dg/bounds_check_fail_8.f90: New test.
---
 gcc/fortran/trans-array.cc| 60 +--
 gcc/testsuite/gfortran.dg/bounds_check_17.f90 |  2 +-
 .../gfortran.dg/bounds_check_fail_6.f90   |  7 ++-
 .../gfortran.dg/bounds_check_fail_8.f90   | 48 +++
 gcc/testsuite/gfortran.dg/pr92050.f90 |  2 +-
 5 files changed, 83 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/bounds_check_fail_8.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 3673fa40720..9c62b070c50 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -3497,6 +3497,8 @@ trans_array_bound_check (gfc_se * se, gfc_ss *ss, tree index, int n,
   tree descriptor;
   char *msg;
   const char * name = NULL;
+  gfc_expr *expr;
+  gfc_ref *ref;
 
   if (!(gfc_option.rtcheck & GFC_RTCHECK_BOUNDS))
 return index;
@@ -3509,6 +3511,24 @@ trans_array_bound_check (gfc_se * se, gfc_ss *ss, tree index, int n,
   name = ss->info->expr->symtree->n.sym->name;
   gcc_assert (name != NULL);
 
+  /* When we have a component ref, get name of the array section.
+ Note that there can only be one part ref.  */
+  expr = ss->info->expr;
+  if (expr->ref && !compname)
+{
+  for (ref = expr->ref; ref; ref = ref->next)
+	{
+	  /* Remember component name.  */
+	  if (ref->type == REF_COMPONENT)
+	{
+	  name = ref->u.c.component->name;
+	  continue;
+	}
+	  if (ref->type == REF_ARRAY && ref->u.ar.type == AR_SECTION)
+	break;
+	}
+}
+
   if (VAR_P (descriptor))
 name = IDENTIFIER_POINTER (DECL_NAME (descriptor));
 
@@ -3574,29 +3594,20 @@ array_bound_check_elemental (gfc_se * se, gfc_ss * ss, gfc_expr * expr)
   gfc_array_ref *ar;
   gfc_ref *ref;
   gfc_symbol *sym;
-  char *var_name = NULL;
-  size_t len;
+  const char *var_name = NULL;
   int dim;
 
   if (expr->expr_type == EXPR_VARIABLE)
 {
   sym = expr->symtree->n.sym;
-  len = strlen (sym->name) + 1;
-
-  for (ref = expr->ref; ref; ref = ref->next)
-	if (ref->type == REF_COMPONENT)
-	  len += 2 + strlen (ref->u.c.component->name);
-
-  var_name = XALLOCAVEC (char, len);
-  strcpy (var_name, sym->name);
+  var_name = sym->name;
 
   for (ref = expr->ref; ref; ref = ref->next)
 	{
-	  /* Append component name.  */
+	  /* Get component name.  */
 	  if (ref->type == REF_COMPONENT)
 	{
-	  strcat (var_name, "%%");
-	  strcat (var_name, ref->u.c.component->name);
+	  var_name = ref->u.c.component->name;
 	  continue;
 	}
 
@@ -4001,7 +4012,7 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
   gfc_se indexse;
   gfc_se tmpse;
   gfc_symbol * sym = expr->symtree->n.sym;
-  char *var_name = NULL;
+  const char *var_name = NULL;
 
   if (ar->dimen == 0)
 {
@@ -4035,30 +4046,17 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
 
   if (gfc_option.rtcheck & GFC_RTCHECK_BOUNDS)
 {
-  size_t len;
   gfc_ref *ref;
 
-  len = strlen (sym->name) + 1;
-  for (ref = expr->ref; ref; ref = ref->next)
-	{
-	  if (ref->type == REF_ARRAY && >u.ar == ar)
-	break;
-	  if (ref->type == 

Re: [RFC] [PR tree-optimization/92539] Optimize away tests against invalid pointers

2024-03-10 Thread Jeff Law



On 3/10/24 3:05 PM, Andrew Pinski wrote:

On Sun, Mar 10, 2024 at 2:04 PM Jeff Law  wrote:


Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false
positive triggered by loop unrolling.

As I speculated a couple years ago, we could eliminate the comparisons
against bogus pointers.  Consider:


[local count: 30530247]:
   if (last_12 !=   [(void *)"aa" + 3B])
 goto ; [54.59%]
   else
 goto ; [45.41%]



That's a valid comparison as ISO allows us to generate, but not
dereference, a pointer one element past the end of the object.

But +4B is a bogus pointer.  So given an EQ comparison against that
pointer we could always return false and for NE always return true.

VRP and DOM seem to be the most natural choices for this kind of
optimization on the surface.  However DOM is actually not viable because
the out-of-bounds pointer warning pass is run at the end of VRP.  So
we've got to take care of this prior to the end of VRP.



I haven't done a bootstrap or regression test with this.  But if it
looks reasonable I can certainly push on it further. I have confirmed it
does eliminate the tests and shuts up the bogus warning.

The downside is this would also shut up valid warnings if user code did
this kind of test.

Comments/Suggestions?


ENOPATCH
Yea, realized it as I pushed the send button.  Then t-bird crashed, 
repeatedly.


Attached this time..

jeff

diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index 3ccb77d28be..cc753e79a8a 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -325,6 +325,34 @@ simplify_using_ranges::fold_cond_with_ops (enum tree_code 
code,
   if (res == range_false (type))
return boolean_false_node;
 }
+
+  /* If we're comparing pointers and one of the pointers is
+ not a valid pointer (say   [(void *)"aa" + 4B)
+ return true for EQ and false for NE.  */
+  if ((code == EQ_EXPR || code == NE_EXPR)
+  && POINTER_TYPE_P (type)
+  && TREE_CODE (op1) == ADDR_EXPR
+  && TREE_CODE (TREE_OPERAND (op1, 0)) == MEM_REF)
+{
+  tree mem_ref = TREE_OPERAND (op1, 0);
+  if (TREE_CODE (TREE_OPERAND (mem_ref, 0)) == ADDR_EXPR)
+   {
+ tree addr_expr = TREE_OPERAND (mem_ref, 0);
+ if (TREE_CODE (TREE_OPERAND (addr_expr, 0)) == STRING_CST)
+   {
+ tree string = TREE_OPERAND (addr_expr, 0);
+
+ if (TREE_CODE (TREE_OPERAND (mem_ref, 1)) == INTEGER_CST)
+   {
+ HOST_WIDE_INT len = TREE_STRING_LENGTH (string);
+ HOST_WIDE_INT offset = tree_to_uhwi (TREE_OPERAND (mem_ref, 
1));
+ if (offset > len)
+   return code == EQ_EXPR ? boolean_false_node : 
boolean_true_node;
+   }
+   }
+   }
+}
+
   return NULL;
 }
 


Re: [RFC] [PR tree-optimization/92539] Optimize away tests against invalid pointers

2024-03-10 Thread Andrew Pinski
On Sun, Mar 10, 2024 at 2:04 PM Jeff Law  wrote:
>
> Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false
> positive triggered by loop unrolling.
>
> As I speculated a couple years ago, we could eliminate the comparisons
> against bogus pointers.  Consider:
>
> >[local count: 30530247]:
> >   if (last_12 !=   [(void *)"aa" + 3B])
> > goto ; [54.59%]
> >   else
> > goto ; [45.41%]
>
>
> That's a valid comparison as ISO allows us to generate, but not
> dereference, a pointer one element past the end of the object.
>
> But +4B is a bogus pointer.  So given an EQ comparison against that
> pointer we could always return false and for NE always return true.
>
> VRP and DOM seem to be the most natural choices for this kind of
> optimization on the surface.  However DOM is actually not viable because
> the out-of-bounds pointer warning pass is run at the end of VRP.  So
> we've got to take care of this prior to the end of VRP.
>
>
>
> I haven't done a bootstrap or regression test with this.  But if it
> looks reasonable I can certainly push on it further. I have confirmed it
> does eliminate the tests and shuts up the bogus warning.
>
> The downside is this would also shut up valid warnings if user code did
> this kind of test.
>
> Comments/Suggestions?

ENOPATCH

>
> Jeff


[RFC] [PR tree-optimization/92539] Optimize away tests against invalid pointers

2024-03-10 Thread Jeff Law
Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false 
positive triggered by loop unrolling.


As I speculated a couple years ago, we could eliminate the comparisons 
against bogus pointers.  Consider:



   [local count: 30530247]:
  if (last_12 !=   [(void *)"aa" + 3B])
goto ; [54.59%]
  else
goto ; [45.41%]



That's a valid comparison as ISO allows us to generate, but not 
dereference, a pointer one element past the end of the object.


But +4B is a bogus pointer.  So given an EQ comparison against that 
pointer we could always return false and for NE always return true.


VRP and DOM seem to be the most natural choices for this kind of 
optimization on the surface.  However DOM is actually not viable because 
the out-of-bounds pointer warning pass is run at the end of VRP.  So 
we've got to take care of this prior to the end of VRP.




I haven't done a bootstrap or regression test with this.  But if it 
looks reasonable I can certainly push on it further. I have confirmed it 
does eliminate the tests and shuts up the bogus warning.


The downside is this would also shut up valid warnings if user code did 
this kind of test.


Comments/Suggestions?

Jeff


[committed] d: Fix -fpreview=in ICEs with forward referenced parameter [PR112285]

2024-03-10 Thread Iain Buclaw
Hi,

This patch removes the early lowering of D AST types as GCC trees in
Target::preferPassByRef, fixing both PR12285 and PR12290.

The way that the target hook preferPassByRef is implemented, it relied
on the GCC "back-end" tree type to determine whether or not to use `ref'
ABI for D `in' parameters; e.g: prefer by value if it is expected that
the target will pass the type around in registers.

Building the GCC tree type depends on the AST type being complete - all
semantic processing is finished - but as this hook is called from the
front-end, this will not be the case for forward referenced or
self-referencing types.

The consensus in upstream is that `in' parameters should always be
implicitly `ref', but as the front-end does not yet support all types
being rvalue references, limit this just static arrays and structs.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32, committed
to mainline and backported to releases/gcc-13 and releases/gcc-12.

Regards,
Iain.

---
PR d/112285
PR d/112290

gcc/d/ChangeLog:

* d-target.cc (Target::preferPassByRef): Return true for all static
array and struct types.

gcc/testsuite/ChangeLog:

* gdc.dg/pr112285.d: New test.
* gdc.dg/pr112290.d: New test.
---
 gcc/d/d-target.cc   | 25 +
 gcc/testsuite/gdc.dg/pr112285.d | 13 +
 gcc/testsuite/gdc.dg/pr112290.d | 15 +++
 3 files changed, 33 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr112285.d
 create mode 100644 gcc/testsuite/gdc.dg/pr112290.d

diff --git a/gcc/d/d-target.cc b/gcc/d/d-target.cc
index b9d124422b7..9b69b0bc873 100644
--- a/gcc/d/d-target.cc
+++ b/gcc/d/d-target.cc
@@ -575,31 +575,16 @@ Target::supportsLinkerDirective (void) const
 }
 
 /* Decides whether an `in' parameter of the specified POD type PARAM_TYPE is to
-   be passed by reference or by valie.  This is used only when compiling with
+   be passed by reference or by value.  This is used only when compiling with
`-fpreview=in' enabled.  */
 
 bool
 Target::preferPassByRef (Type *param_type)
 {
-  if (param_type->size () == SIZE_INVALID)
+  /* See note in Target::isReturnOnStack.  */
+  Type *tb = param_type->toBasetype ();
+  if (tb->size () == SIZE_INVALID)
 return false;
 
-  tree type = build_ctype (param_type);
-
-  /* Prefer a `ref' if the type is an aggregate, and its size is greater than
- its alignment.  */
-  if (AGGREGATE_TYPE_P (type)
-  && (!valid_constant_size_p (TYPE_SIZE_UNIT (type))
- || compare_tree_int (TYPE_SIZE_UNIT (type), TYPE_ALIGN (type)) > 0))
-return true;
-
-  /* If the back-end is always going to pass this by invisible reference.  */
-  if (pass_by_reference (NULL, function_arg_info (type, true)))
-return true;
-
-  /* If returning the parameter means the caller will do RVO.  */
-  if (targetm.calls.return_in_memory (type, NULL_TREE))
-return true;
-
-  return false;
+  return (tb->ty == TY::Tstruct || tb->ty == TY::Tsarray);
 }
diff --git a/gcc/testsuite/gdc.dg/pr112285.d b/gcc/testsuite/gdc.dg/pr112285.d
new file mode 100644
index 000..5ca100a74a9
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr112285.d
@@ -0,0 +1,13 @@
+// { dg-do compile }
+// { dg-additional-options "-fpreview=in" }
+struct S112285
+{
+}
+
+class C112285
+{
+S112285 s;
+void f112285(in C112285)
+{
+}
+}
diff --git a/gcc/testsuite/gdc.dg/pr112290.d b/gcc/testsuite/gdc.dg/pr112290.d
new file mode 100644
index 000..7456fc21be1
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr112290.d
@@ -0,0 +1,15 @@
+// { dg-do compile }
+// { dg-additional-options "-fpreview=in" }
+struct S112290a
+{
+S112290b* p;
+bool opEquals(in S112290a)
+{
+return p == p;
+}
+}
+
+struct S112290b
+{
+string s;
+}
-- 
2.40.1



[committed] [PR tree-optimization/110199] Simplify MIN/MAX more often

2024-03-10 Thread Jeff Law

So as I mentioned in the BZ, the case of

t = MIN_EXPR (A, B)

where we know something about the relationship between A and B can be 
trivially handled by some existing code in DOM.  That existing code 
would simplify when A == B.  But by testing GE and LE instead of EQ we 
can cover more cases with minimal effort.  When applicable the MIN/MAX 
turns into a simple copy.


I made one other change.  We have other binary operations that we 
simplify when we know something about the relationship between the 
operands.  That code was not canonicalizing the order of operands when 
building the expression to lookup in the hash tables to discover that 
relationship.  Since those paths are only testing for equality, we can 
trivially reverse them and not have to worry about changing codes or 
anything like that.  So extremely safe and avoids having to come back 
and fix that code to match the MIN_EXPR/MAX_EXPR case later.


Bootstrapped on x86 and also tested on the crosses.  I briefly thought 
there was an sh regression, but that was actually the recent fwprop 
changes twiddling code generation for one test.


Pushed to the trunk.

Jeffcommit 8fe27ed193d60f6cd8b34761858a720c95eabbdb
Author: jlaw 
Date:   Sun Mar 10 11:58:00 2024 -0600

[committed] [PR tree-optimization/110199] Simplify MIN/MAX more often

So as I mentioned in the BZ, the case of

t = MIN_EXPR (A, B)

where we know something about the relationship between A and B can be 
trivially
handled by some existing code in DOM.  That existing code would simplify 
when A
== B.  But by testing GE and LE instead of EQ we can cover more cases with
minimal effort.  When applicable the MIN/MAX turns into a simple copy.

I made one other change.  We have other binary operations that we simplify 
when
we know something about the relationship between the operands.  That code 
was
not canonicalizing the order of operands when building the expression to 
lookup
in the hash tables to discover that relationship.  Since those paths are 
only
testing for equality, we can trivially reverse them and not have to worry 
about
changing codes or anything like that.  So extremely safe and avoids having 
to
come back and fix that code to match the MIN_EXPR/MAX_EXPR case later.

Bootstrapped on x86 and also tested on the crosses.  I briefly thought there
was an sh regression, but that was actually the recent fwprop changes 
twiddling
code generation for one test.

PR tree-optimization/110199
gcc/
* tree-ssa-scopedtables.cc
(avail_exprs_stack::simplify_binary_operation): Generalize handling
of MIN_EXPR/MAX_EXPR to allow additional simplifications.  
Canonicalize
comparison operands for other cases.

gcc/testsuite

* gcc.dg/tree-ssa/minmax-27.c: New test.
* gcc.dg/tree-ssa/minmax-28.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-27.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-27.c
new file mode 100644
index 000..4b94203b0d0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-27.c
@@ -0,0 +1,118 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+
+int min1(int a, int b)
+{
+if (a <= b)
+return a < b ? a : b;
+return 0;
+}
+
+int min2(int a, int b)
+{
+if (a <= b)
+return a > b ? b : a;
+return 0;
+}
+
+int min3(int a, int b)
+{
+if (a < b)
+return a < b ? a : b;
+return 0;
+}
+
+int min4(int a, int b)
+{
+if (a < b)
+return a > b ? b : a;
+return 0;
+}
+
+int min5(int a, int b)
+{
+if (a <= b)
+return a <= b ? a : b;
+return 0;
+}
+
+int min6(int a, int b)
+{
+if (a <= b)
+return a >= b ? b : a;
+return 0;
+}
+
+int min7(int a, int b)
+{
+if (a < b)
+return a <= b ? a : b;
+return 0;
+}
+
+int min8(int a, int b)
+{
+if (b > a)
+return a >= b ? b : a;
+return 0;
+}
+
+int min9(int a, int b)
+{
+if (b >= a)
+return a < b ? a : b;
+return 0;
+}
+
+int min10(int a, int b)
+{
+if (b >= a)
+return a > b ? b : a;
+return 0;
+}
+
+int min11(int a, int b)
+{
+if (b > a)
+return a < b ? a : b;
+return 0;
+}
+
+int min12(int a, int b)
+{
+if (b > a)
+return a > b ? b : a;
+return 0;
+}
+
+int min13(int a, int b)
+{
+if (b >= a)
+return a <= b ? a : b;
+return 0;
+}
+
+int min14(int a, int b)
+{
+if (b >= a)
+return a >= b ? b : a;
+return 0;
+}
+
+int min15(int a, int b)
+{
+if (b > a)
+return a <= b ? a : b;
+return 0;
+}
+
+int min16(int a, int b)
+{
+if (b > a)
+return a >= b ? b : a;
+return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "MIN_EXPR" "dom2" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-28.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-28.c
new file mode 100644
index 

Re: [wwwdocs] Add Ada's GCC 14 changelog entry

2024-03-10 Thread Fernando Oleo Blanco
Hi all,

I have a new revision of the patch. Alexandre pointed out a few issues 
with the hardening options and I agreed with the comments. I took a look 
at when the boolean hardening and stack scrubbing options became 
available within Ada. Hardbools were already available in GCC 13.1, 
stack scrubbing was already present in GCC 12.1. Which means that adding 
this changes to the changelog would be incorrect. The additional 
compiler hardening options/flags within GCC are not unique to Ada and 
they are already documented in the general compiler section and they are 
available for the C family of languages as well as Ada. Therefore, it 
made sense not to explicitly have them in the Ada section.

Nonetheless, there have been some (smaller) hardening improvements to 
Ada, so I just wrote a generic note and pointers to the documentation. I 
know this is not the pretties thing to do, but I did something similar 
in the GCC 12 changelog so...

On 2/26/24 20:36, Fernando Oleo Blanco wrote:
> Hi Mark,
> 
> On 2/26/24 10:17, Marc Poulhiès wrote:
>>
>> Fernando,
>>
>> Thank you for this work! I have a few comments, see below.
>>
>> diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
>> index 85ccc54d..e6c96c9f 100644
>> --- a/htdocs/gcc-14/changes.html
>> +++ b/htdocs/gcc-14/changes.html
>> @@ -171,7 +171,49 @@ a work-in-progress.
>>
>>New Languages and Language specific improvements
>>
>> -
>> +Ada

[... omitted for brevity ...]
> 
> I have applied your recommendations. The documentation links are still
> not up... Nonetheless, I created the URL in such a way that they should
> work once the final documentation is given a release number (which I
> guessed to be 14.1.0). If you think this can be improved just say so.
> Nonetheless, feel free to modify my patch if you see it fit.

In this newly revised patch I have not modified the URLs to point to the 
future GCC 14 documentation. I saw that the links in the changelog all 
had the unversioned "master" links, so I just followed the same convention.

> 
> Best regards,
> Fer

I squashed the different commits I had submitted and created a 
completely new patch. Hopefully this is acceptable and leads to a 
cleaner, less noisy commit history/patch. It is attached to the email. I 
think the patch should be in an acceptable state to be committed, but 
feel free to give back any feedback!

Best regards,
FerFrom 9ad2e979e921938c466de3a7868346e8b2426e49 Mon Sep 17 00:00:00 2001
From: Fernando Oleo Blanco 
Date: Sun, 10 Mar 2024 17:47:46 +0100
Subject: [PATCH] Add Ada changes for v14

---
 htdocs/gcc-14/changes.html | 43 +-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 85ccc54d..0886241a 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -171,7 +171,48 @@ a work-in-progress.
 
 New Languages and Language specific improvements
 
-
+Ada
+
+
+  Several new implementation defined aspects and contracts have been
+  added:
+
+  Exceptional_Cases may be specified for procedures and
+  functions with side effects; it can be used to list exceptions that might
+  be propagated by the subprogram with side effects in the context of its
+  precondition, and associate them with a specific postcondition. For more
+  information, refer to SPARK 2014 Reference Manual, section 6.1.9.
+  User_Aspect takes an argument that is the name of an
+  aspect defined by a User_Aspect_Definition configuration pragma.
+  Local_Restrictions is used to specify that a particular
+  subprogram does not violate one or more local restrictions, nor can it
+  call a subprogram that is not subject to the same requirements.
+  Side_Effects is equivalent to pragma
+  Side_Effecs.
+  Always_Terminates is a boolean equivalent to pragma
+  Always_Terminates
+  Ghost_Predicate introduces a subtype predicate that can
+  reference Ghost entities.
+
+For more information and usage guidelines, see
+the https://gcc.gnu.org/onlinedocs/gnat_rm/Implementation-Defined-Pragmas.html;>GNAT
+Reference Manual.
+  
+  The new attributes and contracts have been applied to the relevant parts
+of the Ada library and more code has been proven to be correct.
+  Initial support for the
+  https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/;>CHERI
+  architecture.
+  Support for the LoongArch architecture.
+  Support for vxWorks 7 Cert RTP has been removed.
+  Additional hardening improvements. For more information reltated to hardening options, refer to
+the https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fharden-compares;>GCC
+Instrumentation Options and
+the https://gcc.gnu.org/onlinedocs/gnat_rm/Security-Hardening-Features.html;>GNAT
+Reference Manual, Security and Hardening Features.
+  
+  Further clean up and improvements to the GNAT code.
+
 
 

[PATCH] testsuite: Define _POSIX_C_SOURCE for test

2024-03-10 Thread Torbjörn SVENSSON
Ok for trunk?

--

As the tests assume that strndup() is visible (only part of
POSIX.1-2008) define the guard to ensure that it's visible.  Currently,
glibc appears to always have this defined in C++, newlib does not.

Without this patch, fails like this can be seen:

Testing analyzer/strndup-1.c,  -std=c++98
.../strndup-1.c: In function 'void test_1(const char*)':
.../strndup-1.c:11:13: error: 'strndup' was not declared in this scope; did you 
mean 'strncmp'?
.../strndup-1.c: In function 'void test_2(const char*)':
.../strndup-1.c:16:13: error: 'strndup' was not declared in this scope; did you 
mean 'strncmp'?
.../strndup-1.c: In function 'void test_3(const char*)':
.../strndup-1.c:21:13: error: 'strndup' was not declared in this scope; did you 
mean 'strncmp'?

Patch has been verified on Linux.

gcc/testsuite/ChangeLog:

* c-c++-common/analyzer/strndup-1.c: Define _POSIX_C_SOURCE.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/c-c++-common/analyzer/strndup-1.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/c-c++-common/analyzer/strndup-1.c 
b/gcc/testsuite/c-c++-common/analyzer/strndup-1.c
index 85ccae85d83..577ece0cfba 100644
--- a/gcc/testsuite/c-c++-common/analyzer/strndup-1.c
+++ b/gcc/testsuite/c-c++-common/analyzer/strndup-1.c
@@ -1,4 +1,5 @@
 /* { dg-skip-if "no strndup in libc" { *-*-darwin[789]* *-*-darwin10* 
hppa*-*-hpux* *-*-mingw* *-*-vxworks* } } */
+/* { dg-additional-options "-D_POSIX_C_SOURCE=200809L" } */
 
 #include 
 #include 
-- 
2.25.1



Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-10 Thread Richard Biener



> Am 10.03.2024 um 11:02 schrieb Li, Pan2 :
> 
> Committed, thanks Richard.

You might want to investigate why you get mask and not Len for a particular 
stmt.  mixing will cause variable length vectorization to fail.

> Pan
> 
> -Original Message-
> From: Richard Biener 
> Sent: Sunday, March 10, 2024 2:53 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> Wang, Yanzhang ; rdapp@gmail.com; 
> jeffreya...@gmail.com
> Subject: Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len 
> and store are enabled
> 
> 
> 
>> Am 10.03.2024 um 04:14 schrieb pan2...@intel.com:
>> 
>> From: Pan Li 
>> 
>> This patch would like to fix one ICE in vectorizable_store when both the
>> loop_masks and loop_lens are enabled.  The ICE looks like below when build
>> with "-march=rv64gcv -O3".
>> 
>> during GIMPLE pass: vect
>> test.c: In function ‘d’:
>> test.c:6:6: internal compiler error: in vectorizable_store, at
>> tree-vect-stmts.cc:8691
>>   6 | void d() {
>> |  ^
>> 0x37a6f2f vectorizable_store
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
>> 0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
>> _slp_tree*, _slp_instance*, vec*)
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
>> 0x1db5dca vect_analyze_loop_operations
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
>> 0x1db885b vect_analyze_loop_2
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
>> 0x1dba029 vect_analyze_loop_1
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
>> 0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
>> 0x1e389d1 try_vectorize_loop_1
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
>> 0x1e38f3d try_vectorize_loop
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
>> 0x1e39230 execute
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298
>> 
>> There are two ways to reach vectorizer LD/ST, one is the analysis and
>> the other is transform.  We cannot have both the lens and the masks
>> enabled during transform but it is valid during analysis.  Given the
>> transform doesn't required cost_vec,  we can only enable the assert
>> based on cost_vec is NULL or not.
>> 
>> Below testsuites are passed for this patch:
>> * The x86 bootstrap tests.
>> * The x86 fully regression tests.
>> * The aarch64 fully regression tests.
>> * The riscv fully regressison tests.
> 
> Ok
> 
> Thanks,
> Richard
> 
>> gcc/ChangeLog:
>> 
>>   * tree-vect-stmts.cc (vectorizable_store): Enable the assert
>>   during transform process.
>>   (vectorizable_load): Ditto.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>   * gcc.target/riscv/rvv/base/pr114195-1.c: New test.
>> 
>> Signed-off-by: Pan Li 
>> ---
>> .../gcc.target/riscv/rvv/base/pr114195-1.c | 15 +++
>> gcc/tree-vect-stmts.cc | 18 ++
>> 2 files changed, 29 insertions(+), 4 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> 
>> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c 
>> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> new file mode 100644
>> index 000..a67b847112b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> @@ -0,0 +1,15 @@
>> +/* Test that we do not have ice when compile */
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
>> +
>> +long a, b;
>> +extern short c[];
>> +
>> +void d() {
>> +  for (int e = 0; e < 35; e = 2) {
>> +a = ({ a < 0 ? a : 0; });
>> +b = ({ b < 0 ? b : 0; });
>> +
>> +c[e] = 0;
>> +  }
>> +}
>> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
>> index 14a3ffb5f02..e8617439a48 100644
>> --- a/gcc/tree-vect-stmts.cc
>> +++ b/gcc/tree-vect-stmts.cc
>> @@ -8697,8 +8697,13 @@ vectorizable_store (vec_info *vinfo,
>>   ? _VINFO_LENS (loop_vinfo)
>>   : NULL);
>> 
>> -  /* Shouldn't go with length-based approach if fully masked.  */
>> -  gcc_assert (!loop_lens || !loop_masks);
>> +  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
>> + are some difference here.  We cannot enable both the lens and masks
>> + during transform but it is allowed during analysis.
>> + Shouldn't go with length-based approach if fully masked.  */
>> +  if (cost_vec == NULL)
>> +/* The cost_vec is NULL during transfrom.  */
>> +gcc_assert ((!loop_lens || !loop_masks));
>> 
>>  /* Targets with store-lane instructions must not require explicit
>> realignment.  vect_supportable_dr_alignment always returns either
>> @@ -10577,8 +10582,13 @@ vectorizable_load (vec_info *vinfo,
>>   ? _VINFO_LENS (loop_vinfo)
>>   : NULL);
>> 
>> -  /* Shouldn't go with length-based approach if fully masked.  */
>> -  gcc_assert (!loop_lens || !loop_masks);
>> +  /* The vect_transform_stmt and 

New Swedish PO file for 'gcc' (version 14.1-b20240218)

2024-03-10 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-14.1-b20240218.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[PATCH] c++/modules: Support target-specific nodes with streaming [PR111224]

2024-03-10 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu and
aarch64-unknown-linux-gnu, OK for trunk?

It's worth noting that the AArch64 machines I had available to test with
didn't have a new enough glibc to reproduce the ICEs in the PR, but this
patch will be necessary (albeit possibly not sufficient) to fix it.

-- >8 --

Some targets make use of POLY_INT_CSTs and other custom builtin types,
which currently violate some assumptions when streaming. This patch adds
support for them, specifically AArch64 SVE types like __fp16.

This patch doesn't provide "full" support of AArch64 SVE, however, since
for that we would need to support 'target' nodes (tracked in PR108080).

PR c++/111224

gcc/cp/ChangeLog:

* module.cc (enum tree_tag): Add new tag for builtin types.
(trees_out::start): POLY_INT_CSTs can be emitted.
(trees_in::start): Likewise.
(trees_out::core_vals): Stream POLY_INT_CSTs.
(trees_in::core_vals): Likewise.
(trees_out::type_node): Handle target-specific builtin types,
and vectors with NUM_POLY_INT_COEFFS > 1.
(trees_in::tree_node): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr111224_a.C: New test.
* g++.dg/modules/pr111224_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/module.cc  | 70 +++
 gcc/testsuite/g++.dg/modules/pr111224_a.C | 17 ++
 gcc/testsuite/g++.dg/modules/pr111224_b.C | 13 +
 3 files changed, 90 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr111224_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr111224_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 99055523d91..0b5e2e67053 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -2718,6 +2718,7 @@ enum tree_tag {
   tt_typedef_type, /* A (possibly implicit) typedefed type.  */
   tt_derived_type, /* A type derived from another type.  */
   tt_variant_type, /* A variant of another type.  */
+  tt_builtin_type,  /* A custom builtin type.  */
 
   tt_tinfo_var,/* Typeinfo object. */
   tt_tinfo_typedef,/* Typeinfo typedef.  */
@@ -2732,7 +2733,7 @@ enum tree_tag {
   tt_binfo,/* A BINFO.  */
   tt_vtable,   /* A vtable.  */
   tt_thunk,/* A thunk.  */
-  tt_clone_ref,
+  tt_clone_ref, /* A cloned function.  */
 
   tt_entity,   /* A extra-cluster entity.  */
 
@@ -5173,7 +5174,6 @@ trees_out::start (tree t, bool code_streamed)
   break;
 
 case FIXED_CST:
-case POLY_INT_CST:
   gcc_unreachable (); /* Not supported in C++.  */
   break;
 
@@ -5259,7 +5259,6 @@ trees_in::start (unsigned code)
 
 case FIXED_CST:
 case IDENTIFIER_NODE:
-case POLY_INT_CST:
 case SSA_NAME:
 case TARGET_MEM_REF:
 case TRANSLATION_UNIT_DECL:
@@ -6106,7 +6105,10 @@ trees_out::core_vals (tree t)
   break;
 
 case POLY_INT_CST:
-  gcc_unreachable (); /* Not supported in C++.  */
+  if (streaming_p ())
+   for (unsigned ix = 0; ix != NUM_POLY_INT_COEFFS; ix++)
+ WT (POLY_INT_CST_COEFF (t, ix));
+  break;
 
 case REAL_CST:
   if (streaming_p ())
@@ -6615,8 +6617,9 @@ trees_in::core_vals (tree t)
   break;
 
 case POLY_INT_CST:
-  /* Not suported in C++.  */
-  return false;
+  for (unsigned ix = 0; ix != NUM_POLY_INT_COEFFS; ix++)
+   RT (POLY_INT_CST_COEFF (t, ix));
+  break;
 
 case REAL_CST:
   if (const void *bytes = buf (sizeof (real_value)))
@@ -8930,6 +8933,32 @@ trees_out::type_node (tree type)
   return;
 }
 
+  if (tree name = TYPE_NAME (type))
+if (TREE_CODE (name) == TYPE_DECL && DECL_ARTIFICIAL (name))
+  {
+   /* Potentially a custom machine- or OS-specific builtin type.  */
+   bool found = false;
+   unsigned ix = 0;
+   for (tree t = registered_builtin_types; t; t = TREE_CHAIN (t), ix++)
+ if (TREE_VALUE (t) == type)
+   {
+ found = true;
+ break;
+   }
+   if (found)
+ {
+   int type_tag = insert (type);
+   if (streaming_p ())
+ {
+   i (tt_builtin_type);
+   u (ix);
+   dump (dumper::TREE)
+ && dump ("Wrote:%d builtin type %N", type_tag, name);
+ }
+   return;
+ }
+  }
+
   if (streaming_p ())
 {
   u (tt_derived_type);
@@ -9068,8 +9097,8 @@ trees_out::type_node (tree type)
   if (streaming_p ())
{
  poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (type);
- /* to_constant asserts that only coeff[0] is of interest.  */
- wu (static_cast (nunits.to_constant ()));
+ for (unsigned ix = 0; ix != NUM_POLY_INT_COEFFS; ix++)
+   wu (nunits.coeffs[ix]);
}
   break;
 }
@@ -9630,9 +9659,11 @@ trees_in::tree_node (bool is_use)
 
  case VECTOR_TYPE:
{
- 

Re: Frontend access to target features (was Re: [PATCH] libgccjit: Add ability to get CPU features)

2024-03-10 Thread Iain Buclaw
Excerpts from David Malcolm's message of März 5, 2024 4:09 pm:
> On Thu, 2023-11-09 at 19:33 -0500, Antoni Boucher wrote:
>> Hi.
>> See answers below.
>> 
>> On Thu, 2023-11-09 at 18:04 -0500, David Malcolm wrote:
>> > On Thu, 2023-11-09 at 17:27 -0500, Antoni Boucher wrote:
>> > > Hi.
>> > > This patch adds support for getting the CPU features in libgccjit
>> > > (bug
>> > > 112466)
>> > > 
>> > > There's a TODO in the test:
>> > > I'm not sure how to test that gcc_jit_target_info_arch returns
>> > > the
>> > > correct value since it is dependant on the CPU.
>> > > Any idea on how to improve this?
>> > > 
>> > > Also, I created a CStringHash to be able to have a
>> > > std::unordered_set. Is there any built-in way of
>> > > doing
>> > > this?
>> > 
>> > Thanks for the patch.
>> > 
>> > Some high-level questions:
>> > 
>> > Is this specifically about detecting capabilities of the host that
>> > libgccjit is currently running on? or how the target was configured
>> > when libgccjit was built?
>> 
>> I'm less sure about this part. I'll need to do more tests.
>> 
>> > 
>> > One of the benefits of libgccjit is that, in theory, we support all
>> > of
>> > the targets that GCC already supports.  Does this patch change
>> > that,
>> > or
>> > is this more about giving client code the ability to determine
>> > capabilities of the specific host being compiled for?
>> 
>> This should not change that. If it does, this is a bug.
>> 
>> > 
>> > I'm nervous about having per-target jit code.  Presumably there's a
>> > reason that we can't reuse existing target logic here - can you
>> > please
>> > describe what the problem is.  I see that the ChangeLog has:
>> > 
>> > > * config/i386/i386-jit.cc: New file.
>> > 
>> > where i386-jit.cc has almost 200 lines of nontrivial code.  Where
>> > did
>> > this come from?  Did you base it on existing code in our source
>> > tree,
>> > making modifications to fit the new internal API, or did you write
>> > it
>> > from scratch?  In either case, how onerous would this be for other
>> > targets?
>> 
>> This was mostly copied from the same code done for the Rust and D
>> frontends.
>> See this commit and the following:
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b1c06fd9723453dd2b2ec306684cb806dc2b4fbb
>> The equivalent to i386-jit.cc is there:
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=22e3557e2d52f129f2bbfdc98688b945dba28dc9
> 
> [CCing Iain and Arthur re those patches; for reference, the patch being
> discussed is attached to :
> https://gcc.gnu.org/pipermail/jit/2024q1/001792.html ]
> 
> One of my concerns about this patch is that we seem to be gaining code
> that's per-(frontend x config) which seems to be copied and pasted with
> a search and replace, which could lead to an M*N explosion.
> 

That's certainly the case with the configure/make rules. Itself I think
is copied originally from the {cpu_type}-protos.h machinery.

It might be worth pointing out that the c-family of front-ends don't
have separate headers because their per-target macros are defined in
{cpu_type}.h directly - for better or worse.

> Is there any real difference between the per-config code for the
> different frontends, or should there be a general "enumerate all
> features of the target" hook that's independent of the frontend? (but
> perhaps calls into it).
> 

As far as I understand, the configure parts should all be identical
between tm_p, tm_d, tm_rust, ..., so would benefit from being templated
to aid any other front-ends adding in their own per target hooks.

> Am I right in thinking that (rustc with default LLVM backend) has some
> set of feature strings that both (rustc with rustc_codegen_gcc) and
> gccrs are trying to emulate?  If so, is it presumably a goal that
> libgccjit gives identical results to gccrs?  If so, would it be crazy
> for libgccjit to consume e.g. config/i386/i386-rust.cc ?

I don't know whether libgccjit can just pull in directly the
implementation of the rust target hooks here.  The per-frontend target
hooks usually also make use of code specific to that front-end -
TARGET_CPU_CPP_BUILTINS and others can't be used by a non-c-family
front-end without adding a plethora of stubs, for example.

Whether or not libgccjit wants to give identical information as as rust
I think is a decision for you as the maintainer of its API.

Iain.


RE: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-10 Thread Li, Pan2
Committed, thanks Richard.

Pan

-Original Message-
From: Richard Biener  
Sent: Sunday, March 10, 2024 2:53 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, 
Yanzhang ; rdapp@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and 
store are enabled



> Am 10.03.2024 um 04:14 schrieb pan2...@intel.com:
> 
> From: Pan Li 
> 
> This patch would like to fix one ICE in vectorizable_store when both the
> loop_masks and loop_lens are enabled.  The ICE looks like below when build
> with "-march=rv64gcv -O3".
> 
> during GIMPLE pass: vect
> test.c: In function ‘d’:
> test.c:6:6: internal compiler error: in vectorizable_store, at
> tree-vect-stmts.cc:8691
>6 | void d() {
>  |  ^
> 0x37a6f2f vectorizable_store
>.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
> 0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
> _slp_tree*, _slp_instance*, vec*)
>.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
> 0x1db5dca vect_analyze_loop_operations
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
> 0x1db885b vect_analyze_loop_2
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
> 0x1dba029 vect_analyze_loop_1
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
> 0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
> 0x1e389d1 try_vectorize_loop_1
>.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
> 0x1e38f3d try_vectorize_loop
>.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
> 0x1e39230 execute
>.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298
> 
> There are two ways to reach vectorizer LD/ST, one is the analysis and
> the other is transform.  We cannot have both the lens and the masks
> enabled during transform but it is valid during analysis.  Given the
> transform doesn't required cost_vec,  we can only enable the assert
> based on cost_vec is NULL or not.
> 
> Below testsuites are passed for this patch:
> * The x86 bootstrap tests.
> * The x86 fully regression tests.
> * The aarch64 fully regression tests.
> * The riscv fully regressison tests.

Ok

Thanks,
Richard 

> gcc/ChangeLog:
> 
>* tree-vect-stmts.cc (vectorizable_store): Enable the assert
>during transform process.
>(vectorizable_load): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.target/riscv/rvv/base/pr114195-1.c: New test.
> 
> Signed-off-by: Pan Li 
> ---
> .../gcc.target/riscv/rvv/base/pr114195-1.c | 15 +++
> gcc/tree-vect-stmts.cc | 18 ++
> 2 files changed, 29 insertions(+), 4 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> 
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> new file mode 100644
> index 000..a67b847112b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> @@ -0,0 +1,15 @@
> +/* Test that we do not have ice when compile */
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
> +
> +long a, b;
> +extern short c[];
> +
> +void d() {
> +  for (int e = 0; e < 35; e = 2) {
> +a = ({ a < 0 ? a : 0; });
> +b = ({ b < 0 ? b : 0; });
> +
> +c[e] = 0;
> +  }
> +}
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 14a3ffb5f02..e8617439a48 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -8697,8 +8697,13 @@ vectorizable_store (vec_info *vinfo,
>? _VINFO_LENS (loop_vinfo)
>: NULL);
> 
> -  /* Shouldn't go with length-based approach if fully masked.  */
> -  gcc_assert (!loop_lens || !loop_masks);
> +  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
> + are some difference here.  We cannot enable both the lens and masks
> + during transform but it is allowed during analysis.
> + Shouldn't go with length-based approach if fully masked.  */
> +  if (cost_vec == NULL)
> +/* The cost_vec is NULL during transfrom.  */
> +gcc_assert ((!loop_lens || !loop_masks));
> 
>   /* Targets with store-lane instructions must not require explicit
>  realignment.  vect_supportable_dr_alignment always returns either
> @@ -10577,8 +10582,13 @@ vectorizable_load (vec_info *vinfo,
>? _VINFO_LENS (loop_vinfo)
>: NULL);
> 
> -  /* Shouldn't go with length-based approach if fully masked.  */
> -  gcc_assert (!loop_lens || !loop_masks);
> +  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
> + are some difference here.  We cannot enable both the lens and masks
> + during transform but it is allowed during analysis.
> + Shouldn't go with length-based approach if fully masked.  */
> +  if (cost_vec == NULL)
> +/* The cost_vec is NULL during transfrom.  */
> +