Re: [PATCH v2] c++: explicit inst of template method not generated [PR110323]

2024-03-20 Thread Jason Merrill

On 3/19/24 15:30, Marek Polacek wrote:

On Mon, Mar 18, 2024 at 09:10:27PM -0400, Jason Merrill wrote:

On 3/15/24 13:48, Marek Polacek wrote:

On Thu, Mar 14, 2024 at 03:39:04PM -0400, Jason Merrill wrote:

On 3/8/24 12:02, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Consider

 constexpr int VAL = 1;
 struct foo {
 template <int B>
 void bar(typename std::conditional<B == VAL, int, float>::type arg) { }
 };
 template void foo::bar<1>(int arg);

where we since r11-291 fail to emit the code for the explicit
instantiation.  That's because cp_walk_subtrees/TYPENAME_TYPE now
walks TYPE_CONTEXT ('conditional' here) as well, and in a template
finds the B==VAL template argument.  VAL is constexpr, which implies const,
which in the global scope implies static.  constrain_visibility_for_template
then makes "struct conditional<(B == VAL), int, float>" non-TREE_PUBLIC.
Then symtab_node::needed_p checks TREE_PUBLIC, sees it's 0, and we don't
emit any code.

I thought the fix would be some ODR-esque check to not consider
constexpr variables/fns that are used just for their value.  But
it turned out to be tricky.  For instance, we can't skip
determine_visibility in a template; we can't even skip it for value-dep
expressions.  For example, no-linkage-expr1.C has

 using P = struct {}*;
 template <int N>
 void f(int(*)[((P)0, N)]) {}

where ((P)0, N) is value-dep, but N is not relevant here: we have to
ferret out the anonymous type.  When instantiating, it's already gone.


Hmm, how is that different from the B == VAL case?  In both cases we're
naming an internal entity that gets folded away.

I guess the difference is that B == VAL falls under the special allowance in
https://eel.is/c++draft/basic.def.odr#14.5.1 because it's a constant used as
a prvalue, and therefore is not odr-used under
https://eel.is/c++draft/basic.def.odr#5.2

So I would limit this change to decl_constant_var_p.  Really we should also
be checking that the lvalue-rvalue conversion is applied, but that's more
complicated.


Thanks.  My previous version had it, but it didn't handle

static constexpr int getval () { return 1; }

template <int B>
void baz(typename conditional<B == getval (), int, float>::type arg) { }

I'd say that "getval()" is one of "manifestly constant-evaluated expressions
that are not value-dependent", so it should be treated the same as B == VAL.


But it doesn't satisfy the 14.5 rule that corresponding names need to refer
to the same entity; since getval names a function, it doesn't get the
special exemption from that rule that VAL gets.

So this should not be treated the same as B == VAL.
  
Thanks for the explanation.



I don't know if this is important to handle.  Do you want me to poke further or
should we just go with decl_constant_var_p and leave it at that for now?


Just decl_constant_var_p.


Here it is:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Consider

   constexpr int VAL = 1;
   struct foo {
   template <int B>
   void bar(typename std::conditional<B == VAL, int, float>::type arg) { }
   };
   template void foo::bar<1>(int arg);

where we since r11-291 fail to emit the code for the explicit
instantiation.  That's because cp_walk_subtrees/TYPENAME_TYPE now
walks TYPE_CONTEXT ('conditional' here) as well, and in a template
finds the B==VAL template argument.  VAL is constexpr, which implies const,
which in the global scope implies static.  constrain_visibility_for_template
then makes "struct conditional<(B == VAL), int, float>" non-TREE_PUBLIC.
Then symtab_node::needed_p checks TREE_PUBLIC, sees it's 0, and we don't
emit any code.

I thought the fix would be some ODR-esque check to not consider
constexpr variables/fns that are used just for their value.  But
it turned out to be tricky.  For instance, we can't skip
determine_visibility in a template; we can't even skip it for value-dep
expressions.  For example, no-linkage-expr1.C has

   using P = struct {}*;
   template <int N>
   void f(int(*)[((P)0, N)]) {}

where ((P)0, N) is value-dep, but N is not relevant here: we have to
ferret out the anonymous type.  When instantiating, it's already gone.

This patch uses decl_constant_var_p.  This is to implement (an
approximation) [basic.def.odr]#14.5.1 and [basic.def.odr]#5.2.
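
For illustration (exposition only, not from the testsuite), the distinction
this relies on: a constexpr variable that is only read as a prvalue is not
odr-used, while taking its address is:

  constexpr int VAL = 1;
  int use_value () { return VAL; }          // only the value is read (lvalue-to-rvalue
                                            // conversion applied): VAL is not odr-used
  const int *use_addr () { return &VAL; }   // address taken: VAL is odr-used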

PR c++/110323

gcc/cp/ChangeLog:

* decl2.cc (min_vis_expr_r) <case VAR_DECL>: Do nothing for
decl_constant_var_p VAR_DECLs.

gcc/testsuite/ChangeLog:

* g++.dg/template/explicit-instantiation6.C: New test.
* g++.dg/template/explicit-instantiation7.C: New test.
---
  gcc/cp/decl2.cc   |  6 ++-
  .../g++.dg/template/explicit-instantiation6.C | 43 +++
  .../g++.dg/template/explicit-instantiation7.C | 22 ++
  3 files changed, 70 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/template/explicit-instantiation6.C
  create mode 100644 gcc/testsuite/g++.dg/template/explicit-instantiation7.C

diff --git 

[PATCH] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-03-20 Thread Jie Mei
This patch adds MIN/MAX.fmt instructions from MIPS R6
with corresponding tests.
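
For illustration only (not the new testcase), the kind of source that is
expected to use the new patterns once NaNs and signed zeros can be ignored,
e.g. at -O2 -ffast-math on an R6 target:

  double test_min_d (double a, double b) { return a < b ? a : b; }  /* expect min.d */
  float  test_max_s (float a, float b)   { return a > b ? a : b; }  /* expect max.s */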

gcc/ChangeLog:

* config/mips/i6400.md (i6400_fpu_minmax): New
define_insn_reservation.
* config/mips/mips.h (ISA_HAS_FMIN_FMAX): Define new macro.
* config/mips/mips.md (type): Add fminmax.
(smin<mode>3): Generates MIN.fmt instructions.
(smax<mode>3): Generates MAX.fmt instructions.
* config/mips/p6600.md (p6600_fpu_fabs): Include fminmax
type.

gcc/testsuite/ChangeLog:

* gcc.target/mips/mips-minmax.c: New tests for MIPS R6.
---
 gcc/config/mips/i6400.md|  6 
 gcc/config/mips/mips.h  |  2 ++
 gcc/config/mips/mips.md | 26 --
 gcc/config/mips/p6600.md|  2 +-
 gcc/testsuite/gcc.target/mips/mips-minmax.c | 40 +
 5 files changed, 73 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/mips-minmax.c

diff --git a/gcc/config/mips/i6400.md b/gcc/config/mips/i6400.md
index 9f216fe0210..d6f691ee217 100644
--- a/gcc/config/mips/i6400.md
+++ b/gcc/config/mips/i6400.md
@@ -219,6 +219,12 @@
(eq_attr "type" "fabs,fneg,fmove"))
   "i6400_fpu_short, i6400_fpu_apu")
 
+;; min, max
+(define_insn_reservation "i6400_fpu_minmax" 2
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "fminmax"))
+  "i6400_fpu_short+i6400_fpu_logic")
+
 ;; fadd, fsub, fcvt
 (define_insn_reservation "i6400_fpu_fadd" 4
   (and (eq_attr "cpu" "i6400")
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 7145d23c650..5ce984ac99b 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1259,6 +1259,8 @@ struct mips_cpu_info {
 #define ISA_HAS_9BIT_DISPLACEMENT  (mips_isa_rev >= 6  \
 || ISA_HAS_MIPS16E2)
 
+#define ISA_HAS_FMIN_FMAX  (mips_isa_rev >= 6)
+
 /* ISA has data indexed prefetch instructions.  This controls use of
'prefx', along with TARGET_HARD_FLOAT and TARGET_DOUBLE_FLOAT.
(prefx is a cop1x instruction, so can only be used if FP is
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index b0fb5850a9e..d309ebb6848 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -370,6 +370,7 @@
 ;; frsqrt   floating point reciprocal square root
 ;; frsqrt1  floating point reciprocal square root step1
 ;; frsqrt2  floating point reciprocal square root step2
+;; fminmax  floating point min/max
 ;; dspmac   DSP MAC instructions not saturating the accumulator
 ;; dspmacsatDSP MAC instructions that saturate the accumulator
 ;; accext   DSP accumulator extract instructions
@@ -387,8 +388,8 @@
prefetch,prefetchx,condmove,mtc,mfc,mthi,mtlo,mfhi,mflo,const,arith,logical,
shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt,
-   frsqrt,frsqrt1,frsqrt2,dspmac,dspmacsat,accext,accmod,dspalu,dspalusat,
-   multi,atomic,syncloop,nop,ghost,multimem,
+   frsqrt,frsqrt1,frsqrt2,fminmax,dspmac,dspmacsat,accext,accmod,dspalu,
+   dspalusat,multi,atomic,syncloop,nop,ghost,multimem,
simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd,
simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp,
simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill,
@@ -7971,6 +7972,27 @@
   [(set_attr "move_type" "load")
(set_attr "insn_count" "2")])
 
+;;
+;;  Floating-point MIN/MAX
+;;
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:SCALARF 0 "register_operand" "=f")
+   (smin:SCALARF (match_operand:SCALARF 1 "register_operand" "f")
+  (match_operand:SCALARF 2 "register_operand" "f")))]
+  "ISA_HAS_FMIN_FMAX"
+  "min.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fminmax")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:SCALARF 0 "register_operand" "=f")
+   (smax:SCALARF (match_operand:SCALARF 1 "register_operand" "f")
+  (match_operand:SCALARF 2 "register_operand" "f")))]
+  "ISA_HAS_FMIN_FMAX"
+  "max.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fminmax")
+  (set_attr "mode" "<UNITMODE>")])
 
 ;; 2 HI loads are joined.
 (define_peephole2
diff --git a/gcc/config/mips/p6600.md b/gcc/config/mips/p6600.md
index a9e3262cc18..c502f0eb5c6 100644
--- a/gcc/config/mips/p6600.md
+++ b/gcc/config/mips/p6600.md
@@ -170,7 +170,7 @@
 ;; fabs, fneg, fcmp
 (define_insn_reservation "p6600_fpu_fabs" 2
   (and (eq_attr "cpu" "p6600")
-   (ior (eq_attr "type" "fabs,fneg,fcmp,fmove")
+   (ior (eq_attr "type" "fabs,fneg,fcmp,fmove,fminmax")
   (and (eq_attr "type" "condmove")
(eq_attr "mode" "SF,DF"))))
   "p6600_fpu_short, p6600_fpu_apu")
diff --git a/gcc/testsuite/gcc.target/mips/mips-minmax.c 
b/gcc/testsuite/gcc.target/mips/mips-minmax.c
new file mode 100644
index 000..2d234ac4b1d
--- /dev/null
+++ 

Re: [PATCH] c++: direct-init of an array of class type [PR59465]

2024-03-20 Thread Jason Merrill

On 3/1/24 19:58, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?  I don't
claim that this has to go to 14 though.

-- >8 --
...from another array in a mem-initializer should not be accepted.

We already reject

   struct string {} a[1];
   string x[1](a);

but

   struct pair {
 string s[1];
 pair() : s(a) {}
   };

is wrongly accepted.

It started to be accepted with r0-110915-ga034826198b771:

which was supposed to be a cleanup, not a deliberate change to start
accepting the code.  The build_vec_init_expr code was added in r165976:
.

It appears that we do the magic copy array when we have a defaulted
constructor and we generate code for its mem-initializer which
initializes an array.  I also see that we go that path for compound
literals.  So when initializing an array member, we can limit building
up a VEC_INIT_EXPR to those special cases.
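
For illustration (exposition only, untested here), the compound-literal
extension is the other case that keeps the VEC_INIT_EXPR path:

  struct W {
    int a[2];
    W() : a((int[2]){1, 2}) {}   // GNU compound literal initializing an array member
  };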

PR c++/59465

gcc/cp/ChangeLog:

* init.cc (can_init_array_with_p): New.
(perform_member_init): Check it.

gcc/testsuite/ChangeLog:

* g++.dg/init/array62.C: New test.
* g++.dg/init/array63.C: New test.
* g++.dg/init/array64.C: New test.
---
  gcc/cp/init.cc  | 27 ++-
  gcc/testsuite/g++.dg/init/array62.C | 19 +++
  gcc/testsuite/g++.dg/init/array63.C | 13 +
  gcc/testsuite/g++.dg/init/array64.C | 22 ++
  4 files changed, 80 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/init/array62.C
  create mode 100644 gcc/testsuite/g++.dg/init/array63.C
  create mode 100644 gcc/testsuite/g++.dg/init/array64.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index d2586fad86b..fb8c0e521fb 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -934,6 +934,31 @@ find_uninit_fields (tree *t, hash_set<tree>
*uninitialized, tree member)
  }
  }
  
+/* Return true if it's OK to initialize an array from INIT.  Mere mortals
+   can't copy arrays, but the compiler can do so with a VEC_INIT_EXPR in
+   certain cases.  */
+
+static bool
+can_init_array_with_p (tree init)
+{
+  if (!init)
+return true;
+
+  /* We're called from synthesize_method, and we're processing the
+ mem-initializers of a constructor.  */
+  if (DECL_DEFAULTED_FN (current_function_decl))
+return true;
+  /* As an extension, we allow copying from a compound literal.  */
+  else if (TREE_CODE (init) == TARGET_EXPR)
+{
+  init = TARGET_EXPR_INITIAL (init);
+  if (TREE_CODE (init) == CONSTRUCTOR)
+   return CONSTRUCTOR_C99_COMPOUND_LITERAL (init);
+}
+
+  return false;
+}
+
  /* Initialize MEMBER, a FIELD_DECL, with INIT, a TREE_LIST of
 arguments.  If TREE_LIST is void_type_node, an empty initializer
 list was given; if NULL_TREE no initializer was given.  UNINITIALIZED
@@ -1085,7 +1110,7 @@ perform_member_init (tree member, tree init, hash_set<tree>
&uninitialized)
else if (type_build_ctor_call (type)
   || (init && CLASS_TYPE_P (strip_array_types (type
  {
-  if (TREE_CODE (type) == ARRAY_TYPE)
+  if (TREE_CODE (type) == ARRAY_TYPE && can_init_array_with_p (init))
{
  if (init == NULL_TREE
  || same_type_ignoring_top_level_qualifiers_p (type,


It seems like these last two existing lines also fall under "init is 
suitable to initialize type", so let's fold them into the new function.


Jason



Re: [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak

2024-03-20 Thread Vineet Gupta



On 3/18/24 21:41, Jeff Law wrote:
>> The first patch is the main change which improves SPEC cactu by 10%.
> Just to confirm.  Yup, 10% reduction in icounts and about a 3.5% 
> improvement in cycles on our target.  Which is great!

Nice.

> This also makes me wonder if cactu is the benchmark that was sensitive 
> to flushing the pending queue in the scheduler.  Jivan's data would tend 
> to indicate that is the case as several routines seem to flush the 
> pending queue often.  In particular:
>
> ML_BSSN_RHS_Body
> ML_BSSN_Advect_Body
> ML_BSSN_constraints_Body
>
> All have a high number of dynamic instructions as well as lots of 
> flushes of the pending queue.
>
> Vineet, you might want to look and see if cranking up the 
> max-pending-list-length parameter helps drive down spilling.   I think 
> it's default value is 32 insns.  I've seen it cranked up to 128 and 256 
> insns without significant ill effects on compile time.
>
> My recollection (it's been like 3 years) of the key loop was that it had 
> a few hundred instructions and we'd flush the pending list about 50 
> cycles into the loop as there just wasn't enough issue bandwidth to the 
> FP units to dispatch all the FP instructions as their inputs became 
> ready.  So you'd be looking for flushes in a big loop.

Great insight.

Fired off a cactu run with 128, will keep you posted.

Thx,
-Vineet


Pushed: [PATCH] LoongArch: Fix a typo [PR 114407]

2024-03-20 Thread Xi Ruoyao
gcc/ChangeLog:

PR target/114407
* config/loongarch/loongarch-opts.cc (loongarch_config_target):
Fix typo in diagnostic message, enabing -> enabling.
---

Pushed r14-9582 as obvious.

 gcc/config/loongarch/loongarch-opts.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/loongarch-opts.cc 
b/gcc/config/loongarch/loongarch-opts.cc
index 7eeac43ed2f..627f9148adf 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -362,7 +362,7 @@ config_target_isa:
  gcc_assert (constrained.simd);
 
  inform (UNKNOWN_LOCATION,
- "enabing %qs promotes %<%s%s%> to %<%s%s%>",
+ "enabling %qs promotes %<%s%s%> to %<%s%s%>",
  loongarch_isa_ext_strings[t.isa.simd],
  OPTSTR_ISA_EXT_FPU, loongarch_isa_ext_strings[t.isa.fpu],
  OPTSTR_ISA_EXT_FPU, loongarch_isa_ext_strings[ISA_EXT_FPU64]);
-- 
2.44.0



Re: [PATCH] cpp: new built-in __EXP_COUNTER__

2024-03-20 Thread rep . dot . nop
On 19 March 2024 18:27:13 CET, Kaz Kylheku  wrote:
>On 2024-03-18 00:30, Jonathan Wakely wrote:
>> I don't have an opinion on the implementation, or the proposal itself,
>> except that the implementation seems susprisingly simple, which is
>> nice.
>
>Hi Jonathan,
>
>Here is an updated patch.
>
>It rebased cleanly over more than 16000 newer commits, suggesting
>that the area in the cpp code is "still waters", which is good.
>
>I made the documentation change not to recommend using #if, but
>#ifdef.
>
>I got rid of the ChangeLog changes, and also tried to pay more
>attention to the log message format, where the ChangeLog pieces
>are specified.
>
>In the first test case, I had to adjust the expected warning text
>for two lines.
>

Please forgive the bike shedding, but __EXP_COUNTER__ would lead me into 
thinking about exponents or thereabouts.
__MACRO_EXPANSION_COUNTER__ is more what your patch is about, IMHO? Maybe you 
could come up with a more descriptive name, please?

And, while I can see what could possibly be done with that, I'm not really 
convinced that it would be a wise idea to (unilaterally) support that idea. 
Don't you think that this would encourage producing more spaghetti code?

Just curious about real world motivating examples I guess.
cheers


[pushed] analyzer: fix -Wanalyzer-deref-before-check false positive seen in loop header macro [PR109251]

2024-03-20 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9586-g9093f275e0a343.

gcc/analyzer/ChangeLog:
PR analyzer/109251
* sm-malloc.cc (deref_before_check::emit): Reject cases where the
check is in a loop header within a macro expansion.
(deref_before_check::loop_header_p): New.

gcc/testsuite/ChangeLog:
PR analyzer/109251
* c-c++-common/analyzer/deref-before-check-pr109251-1.c: New test.
* c-c++-common/analyzer/deref-before-check-pr109251-2.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/sm-malloc.cc | 30 ++
 .../analyzer/deref-before-check-pr109251-1.c  | 60 +++
 .../analyzer/deref-before-check-pr109251-2.c  | 37 
 3 files changed, 127 insertions(+)
 create mode 100644 
gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-1.c
 create mode 100644 
gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-2.c

diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index a518816b2b8..4e11d6dfc63 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -1563,6 +1563,21 @@ public:
 if (linemap_location_from_macro_definition_p (line_table, check_loc))
   return false;
 
+/* Reject warning if the check is in a loop header within a
+   macro expansion.  This rejects cases like:
+   |  deref of x;
+   |  [...snip...]
+   |  FOR_EACH(x) {
+   |[...snip...]
+   |  }
+   where the FOR_EACH macro tests for non-nullness of x, since
+   the user is hoping to encapsulate the details of iteration
+   in the macro, and the extra check on the first iteration
+   would just be noise if we reported it.  */
+if (loop_header_p (m_check_enode->get_point ())
+   && linemap_location_from_macro_expansion_p (line_table, check_loc))
+  return false;
+
 /* Reject if m_deref_expr is sufficiently different from m_arg
for cases where the dereference is spelled differently from
the check, which is probably two different ways to get the
@@ -1618,6 +1633,21 @@ public:
   }
 
 private:
+  static bool loop_header_p (const program_point &point)
+  {
+const supernode *snode = point.get_supernode ();
+if (!snode)
+  return false;
+for (auto &in_edge : snode->m_preds)
+  {
+   if (const cfg_superedge *cfg_in_edge
+ = in_edge->dyn_cast_cfg_superedge ())
+ if (cfg_in_edge->back_edge_p ())
+   return true;
+  }
+return false;
+  }
+
   static bool sufficiently_similar_p (tree expr_a, tree expr_b)
   {
 pretty_printer *pp_a = global_dc->printer->clone ();
diff --git 
a/gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-1.c 
b/gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-1.c
new file mode 100644
index 000..769cffae6d7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-1.c
@@ -0,0 +1,60 @@
+/* Reduced from linux-5.10.162's kernel/sched/fair.c,
+   with !CONFIG_FAIR_GROUP_SCHED.  */
+
+#define NULL ((void*)0)
+
+struct load_weight
+{
+  unsigned long weight;
+  /* [...snip...] */
+};
+
+struct sched_entity
+{
+  struct load_weight load;
+  /* [...snip...] */
+  unsigned int on_rq;
+  /* [...snip...] */
+};
+
+struct cfs_rq
+{
+  /* [...snip...] */
+  unsigned int nr_running;
+  /* [...snip...] */
+};
+
+extern int
+__calc_delta(int delta_exec, unsigned long weight /* [...snip...] */);
+
+/* !CONFIG_FAIR_GROUP_SCHED */
+#define for_each_sched_entity(se) \
+  for (; se; se = (struct sched_entity *)NULL)
+
+extern struct cfs_rq*
+cfs_rq_of(struct sched_entity* se);
+
+extern int
+__sched_period(unsigned long nr_running);
+
+int
+sched_slice(struct cfs_rq* cfs_rq, struct sched_entity* se)
+{
+  unsigned int nr_running = cfs_rq->nr_running;
+  int slice;
+
+  /* [...snip...] */
+
+  slice = __sched_period(nr_running + !se->on_rq);
+
+  for_each_sched_entity(se) {
+/* [...snip...] */
+cfs_rq = cfs_rq_of(se);
+/* [...snip...] */
+slice = __calc_delta(slice, se->load.weight);
+  }
+
+  /* [...snip...] */
+
+  return slice;
+}
diff --git 
a/gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-2.c 
b/gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-2.c
new file mode 100644
index 000..8e85a47d315
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/analyzer/deref-before-check-pr109251-2.c
@@ -0,0 +1,37 @@
+struct node
+{
+  struct node *next;
+  int val;
+};
+
+int test_loop_1 (struct node *n)
+{
+  int total = 0;
+  if (n->val = 42)
+return -1;
+  for (struct node *iter = n; iter; iter=iter->next)
+total += iter->val;
+  return total;
+}
+
+int test_loop_2 (struct node *n)
+{
+  int total = 0;
+  if (n->val = 42)
+return -1;
+  for (; n; n=n->next)
+total += n->val;
+  return total;
+}
+
+#define FOR_EACH_NODE(ITER) for (; 

Re: [PATCH] tree-optimization/113727 - bogus SRA with BIT_FIELD_REF

2024-03-20 Thread Martin Jambor
Hello,

On Tue, Mar 19 2024, Richard Biener wrote:
> When SRA analyzes BIT_FIELD_REFs it handles writes and not byte
> aligned reads differently from byte aligned reads.  Instead of
> trying to create replacements for the loaded portion the former
> cases try to replace the base object while keeping the wrapping
> BIT_FIELD_REFs.  This breaks when we have both kinds operating
> on the same base object if there's no appearant overlap conflict
> as the conflict that then nevertheless exists isn't handled with.
> The fix is to enforce what I think is part of the design handling
> the former case - that only the full base object gets replaced
> and no further sub-objects are created within as otherwise
> keeping the wrapping BIT_FIELD_REF cannot work.  The patch
> enforces this within analyze_access_subtree.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> OK?

I agree this is the best thing to do.

Thanks,

Martin

>
> Thanks,
> Richard.
>
>   PR tree-optimization/113727
>   * tree-sra.cc (analyze_access_subtree): Do not allow
>   replacements in subtrees when grp_partial_lhs.
>
>   * gcc.dg/torture/pr113727.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/torture/pr113727.c | 26 +
>  gcc/tree-sra.cc |  3 ++-
>  2 files changed, 28 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr113727.c
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr113727.c 
> b/gcc/testsuite/gcc.dg/torture/pr113727.c
> new file mode 100644
> index 000..f92ddad5c8e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr113727.c
> @@ -0,0 +1,26 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target int32plus } */
> +
> +struct f {
> +  unsigned au : 5;
> +  unsigned f3 : 21;
> +} g_994;
> +
> +int main()
> +{
> +  struct f aq1 = {};
> +{
> +  struct f aq = {9, 5};
> +  struct f as = aq;
> +  for (int y = 0 ; y <= 4; y += 1)
> + if (as.au)
> +   {
> + struct f aa[5] = {{2, 154}, {2, 154}, {2, 154}, {2, 154}, {2, 154}};
> + as = aa[0];
> +   }
> +  aq1 = as;
> +}
> +  if (aq1.f3 != 0x9a)
> +__builtin_abort();
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
> index f8e71ec48b9..dbfae5e7fdd 100644
> --- a/gcc/tree-sra.cc
> +++ b/gcc/tree-sra.cc
> @@ -2735,7 +2735,8 @@ analyze_access_subtree (struct access *root, struct 
> access *parent,
>  {
>hole |= covered_to < child->offset;
>sth_created |= analyze_access_subtree (child, root,
> -  allow_replacements && !scalar,
> +  allow_replacements && !scalar
> +  && !root->grp_partial_lhs,
>totally);
>  
>root->grp_unscalarized_data |= child->grp_unscalarized_data;
> -- 
> 2.35.3


Re: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-03-20 Thread rep . dot . nop
On 19 March 2024 14:40:57 CET, Christophe Lyon  
wrote:
>On Mon, 18 Mar 2024 at 22:35, Evgeny Karpov  
>wrote:
>>
>> Monday, March 18, 2024 2:27 PM
>> Christophe Lyon wrote:
>>
>> > > +/* Disable SEH and declare the required SEH-related macros that are
>> > > +still needed for compilation.  */ #undef TARGET_SEH #define
>> > > +TARGET_SEH 0
>> > > +
>> > > +#define SSE_REGNO_P(N) 0
>> > > +#define GENERAL_REGNO_P(N) 0
>> > I think you forgot to add a comment to explain the above two lines.
>> > (it was requested during v1 review)
>> >
>> > Thanks,
>> >
>> > Christophe
>>
>> Hi Christophe,
>>
>> Thank you for the review!
>> The comment regarding SEH and SEH-related macros has been added two lines 
>> above.
>> It may not be obvious, but these macros are needed to emit SEH data in 
>> mingw/winnt.cc.
>> This group is separated by an empty line; however, it still relates to 
>> SEH-related macros.
>>
>Thanks for the clarification, I thought that comment only applied to
>the two lines about TARGET_SEH.

So, for avoidance of doubt, please drop the vertical space before SSE_REGNO_P 
to be gentle to the casual/inattentive reader?

Or add /* SEH-related */ after the vertical space, to make it clear?

thanks


Re: [PATCH V2] Document -fexcess-precision=16.

2024-03-20 Thread Joseph Myers
On Wed, 20 Mar 2024, liuhongt wrote:

> gcc/ChangeLog:
> 
>   * doc/invoke.texi: Document -fexcess-precision=16.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH 3/3] bpf: Corrected index computation when present with unnamed struct fields

2024-03-20 Thread Cupertino Miranda


David Faust writes:

> On 3/13/24 07:24, Cupertino Miranda wrote:
>> Any unnamed structure field if not a member of the BTF_KIND_STRUCT.
> typo: if -> is
>
> I'd suggest to clarify that "any unnamed structure field" is really
> any unnamed non-struct-or-union field, since anonymous inner structs
> and unions certainly are present in BTF (and you handle them here).
>
>> For that reason, CO-RE access strings indexes should take that in
>> consideration. This patch adds a condition to the incrementer that
>> computes the index for the field access.
>
> Otherwise, OK.
> Thanks.

Corrected and Pushed. Thanks
>
>>
>> gcc/ChangeLog:
>>  * config/bpf/core-builtins.cc (bpf_core_get_index): Check if
>>  field contains a DECL_NAME.
>>
>> gcc/testsuite/ChangeLog:
>>  * gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Add
>>  testcase for unnamed fields.
>> ---
>>  gcc/config/bpf/core-builtins.cc|  6 +-
>>  .../gcc.target/bpf/core-builtin-fieldinfo-offset-1.c   | 10 --
>>  2 files changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/config/bpf/core-builtins.cc 
>> b/gcc/config/bpf/core-builtins.cc
>> index 70b14e48e6e5..8333ad81d0e0 100644
>> --- a/gcc/config/bpf/core-builtins.cc
>> +++ b/gcc/config/bpf/core-builtins.cc
>> @@ -553,7 +553,11 @@ bpf_core_get_index (const tree node, bool *valid)
>>  {
>>if (l == node)
>>  return i;
>> -  i++;
>> +  /* Skip unnamed padding, not represented by BTF.  */
>> +  if (DECL_NAME(l) != NULL_TREE
>> +  || TREE_CODE (TREE_TYPE (l)) == UNION_TYPE
>> +  || TREE_CODE (TREE_TYPE (l)) == RECORD_TYPE)
>> +i++;
>>  }
>>  }
>>else if (code == ARRAY_REF || code == ARRAY_RANGE_REF || code == MEM_REF)
>> diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-fieldinfo-offset-1.c 
>> b/gcc/testsuite/gcc.target/bpf/core-builtin-fieldinfo-offset-1.c
>> index 27654205287d..8b1d8b012a2a 100644
>> --- a/gcc/testsuite/gcc.target/bpf/core-builtin-fieldinfo-offset-1.c
>> +++ b/gcc/testsuite/gcc.target/bpf/core-builtin-fieldinfo-offset-1.c
>> @@ -14,6 +14,9 @@ struct T {
>>struct S s[2];
>>char c;
>>char d;
>> +  int a: 1;
>> +  int:31;
>> +  int f;
>>  };
>>
>>  enum {
>> @@ -38,7 +41,9 @@ unsigned int foo (struct T *t)
>>unsigned e1 = __builtin_preserve_field_info (bar()->d, FIELD_BYTE_OFFSET);
>>unsigned e2 = __builtin_preserve_field_info (bar()->s[1].a4, 
>> FIELD_BYTE_OFFSET);
>>
>> -  return s0a1 + s0a4 + s0x + s1a1 + s1a4 + s1x + c + d + e1 + e2;
>> +  unsigned f1 = __builtin_preserve_field_info (t->f, FIELD_BYTE_OFFSET);
>> +
>> +  return s0a1 + s0a4 + s0x + s1a1 + s1a4 + s1x + c + d + e1 + e2 + f1;
>>  }
>>
>>  /* { dg-final { scan-assembler-times "\[\t \]mov\[\t \]%r\[0-9\],4" 2 } } */
>> @@ -65,5 +70,6 @@ unsigned int foo (struct T *t)
>>  /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0:1:1:4\"\\)" 1 } 
>> } */
>>  /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0:2\"\\)" 1 } } */
>>  /* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0:3\"\\)" 2 } } */
>> +/* { dg-final { scan-assembler-times "bpfcr_astr_off \\(\"0:5\"\\)" 1 } } */
>>
>> -/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bpfcr_kind" 10 } } */
>> +/* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bpfcr_kind" 11 } } */


Re: [PATCH 2/3] bpf: Fix access string default for CO-RE type based relocations

2024-03-20 Thread Cupertino Miranda


David Faust writes:

> On 3/13/24 07:24, Cupertino Miranda wrote:
>> Although part of all CO-RE relocation data, type based relocations do
>> not require an access string.
>> Initial implementation defined it as an empty string.
>> On the other hand, libbpf when parsing the CO-RE relocations verifies
>> that those strings would contain "0", otherwise reports an error.
>> This patch makes GCC compliant with libbpf expectations.
>
> OK, thanks.
>
Pushed! Thanks
>>
>> gcc/Changelog:
>>  * config/bpf/btfext-out.cc (cpf_core_reloc_add): Correct for new code.
>>  Add assert to validate the string is set.
>>  * config/bpf/core-builtins.cc (cr_final): Make string struct
>>  field as const.
>>  (process_enum_value): Correct for field type change.
>>  (process_type): Set access string to "0".
>>
>> gcc/testsuite/ChangeLog:
>>  * gcc.target/bpf/core-builtin-type-based.c: Correct.
>>  * gcc.target/bpf/core-builtin-type-id.c: Correct.
>> ---
>>  gcc/config/bpf/btfext-out.cc   |  5 +++--
>>  gcc/config/bpf/core-builtins.cc| 10 ++
>>  gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c |  1 +
>>  gcc/testsuite/gcc.target/bpf/core-builtin-type-id.c|  1 +
>>  4 files changed, 11 insertions(+), 6 deletions(-)
>>
>> diff --git a/gcc/config/bpf/btfext-out.cc b/gcc/config/bpf/btfext-out.cc
>> index 57c0dc323812..ff1fd0739f1e 100644
>> --- a/gcc/config/bpf/btfext-out.cc
>> +++ b/gcc/config/bpf/btfext-out.cc
>> @@ -299,8 +299,9 @@ bpf_core_reloc_add (const tree type, const char * 
>> section_name,
>>
>>/* Buffer the access string in the auxiliary strtab.  */
>>bpfcr->bpfcr_astr_off = 0;
>> -  if (accessor != NULL)
>> -bpfcr->bpfcr_astr_off = btf_ext_add_string (accessor);
>> +  gcc_assert (accessor != NULL);
>> +  bpfcr->bpfcr_astr_off = btf_ext_add_string (accessor);
>> +
>>bpfcr->bpfcr_type = get_btf_id (ctf_lookup_tree_type (ctfc, type));
>>bpfcr->bpfcr_insn_label = label;
>>bpfcr->bpfcr_kind = kind;
>> diff --git a/gcc/config/bpf/core-builtins.cc 
>> b/gcc/config/bpf/core-builtins.cc
>> index 4256fea15e49..70b14e48e6e5 100644
>> --- a/gcc/config/bpf/core-builtins.cc
>> +++ b/gcc/config/bpf/core-builtins.cc
>> @@ -205,7 +205,7 @@ struct cr_local
>>  /* Core Relocation Final data */
>>  struct cr_final
>>  {
>> -  char *str;
>> +  const char *str;
>>tree type;
>>enum btf_core_reloc_kind kind;
>>  };
>> @@ -868,8 +868,10 @@ process_enum_value (struct cr_builtins *data)
>>  {
>>if (TREE_VALUE (l) == expr)
>>  {
>> -  ret.str = (char *) ggc_alloc_atomic ((index / 10) + 1);
>> -  sprintf (ret.str, "%d", index);
>> +  char *tmp = (char *) ggc_alloc_atomic ((index / 10) + 1);
>> +  sprintf (tmp, "%d", index);
>> +  ret.str = (const char *) tmp;
>> +
>>break;
>>  }
>>index++;
>> @@ -987,7 +989,7 @@ process_type (struct cr_builtins *data)
>>|| data->kind == BPF_RELO_TYPE_MATCHES);
>>
>>struct cr_final ret;
>> -  ret.str = NULL;
>> +  ret.str = ggc_strdup ("0");
>>ret.type = data->type;
>>ret.kind = data->kind;
>>
>> diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c 
>> b/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
>> index 74a8d5a14d9d..9d818133c084 100644
>> --- a/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
>> +++ b/gcc/testsuite/gcc.target/bpf/core-builtin-type-based.c
>> @@ -56,3 +56,4 @@ int foo(void *data)
>>  /* { dg-final { scan-assembler-times "0x8\[\t \]+\[^\n\]*bpfcr_kind" 13 } } 
>> BPF_TYPE_EXISTS */
>>  /* { dg-final { scan-assembler-times "0x9\[\t \]+\[^\n\]*bpfcr_kind" 11 } } 
>> BPF_TYPE_SIZE */
>>  /* { dg-final { scan-assembler-times "0xc\[\t \]+\[^\n\]*bpfcr_kind" 13 } } 
>> BPF_TYPE_MATCHES */
>> +/* { dg-final { scan-assembler-times "bpfcr_astr_off \[(\"\]+0\[(\"\]+" 37 
>> } } */
>> diff --git a/gcc/testsuite/gcc.target/bpf/core-builtin-type-id.c 
>> b/gcc/testsuite/gcc.target/bpf/core-builtin-type-id.c
>> index 4b23288eac08..9576b91bc940 100644
>> --- a/gcc/testsuite/gcc.target/bpf/core-builtin-type-id.c
>> +++ b/gcc/testsuite/gcc.target/bpf/core-builtin-type-id.c
>> @@ -38,3 +38,4 @@ int foo(void *data)
>>  /* { dg-final { scan-assembler-times "0\[\t \]+\[^\n\]*bpfcr_type" 0  { 
>> xfail *-*-* } } } */
>>  /* { dg-final { scan-assembler-times "0x6\[\t \]+\[^\n\]*bpfcr_kind" 13 } } 
>> BPF_TYPE_ID_LOCAL */
>>  /* { dg-final { scan-assembler-times "0x7\[\t \]+\[^\n\]*bpfcr_kind" 7 } } 
>> BPF_TYPE_ID_TARGET */
>> +/* { dg-final { scan-assembler-times "bpfcr_astr_off \[(\"\]+0\[(\"\]+" 20 
>> } } */


Re: [PATCH 1/3] bpf: Fix CO-RE field expression builtins

2024-03-20 Thread Cupertino Miranda


> This patch corrects bugs within the CO-RE builtin field expression
> related builtins.
> The following bugs were identified and corrected based on the expected
> results of bpf-next selftests testsuite.
> It addresses the following problems:
>  - Expressions with pointer dereferencing now point to the BTF structure
>type, instead of the structure pointer type.
>  - Pointer addition to structure root is now identified and constructed
>in CO-RE relocations as if it is an array access. For example,
>   "&(s+2)->b" generates "2:1" as an access string where "2" is
>   referring to the access for "s+2".
>
> gcc/ChangeLog:
>   * config/bpf/core-builtins.cc (core_field_info): Add
>   support for POINTER_PLUS_EXPR in the root of the field expression.
>   (bpf_core_get_index): Likewise.
>   (pack_field_expr): Make the BTF type to point to the structure
>   related node, instead of its pointer type.
>   (make_core_safe_access_index): Correct to new code.
>
> gcc/testsuite/ChangeLog:
>   * gcc.target/bpf/core-attr-5.c: Correct.
>   * gcc.target/bpf/core-attr-6.c: Likewise.
>   * gcc.target/bpf/core-attr-struct-as-array.c: Add test case for
>   pointer arithmetics as array access use case.
> ---
>  gcc/config/bpf/core-builtins.cc   | 54 +++
>  gcc/testsuite/gcc.target/bpf/core-attr-5.c|  4 +-
>  gcc/testsuite/gcc.target/bpf/core-attr-6.c|  4 +-
>  .../bpf/core-attr-struct-as-array.c   | 35 
>  4 files changed, 82 insertions(+), 15 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/bpf/core-attr-struct-as-array.c
>
> diff --git a/gcc/config/bpf/core-builtins.cc 
> b/gcc/config/bpf/core-builtins.cc
> index 8d8c54c1fb3d..4256fea15e49 100644
> --- a/gcc/config/bpf/core-builtins.cc
> +++ b/gcc/config/bpf/core-builtins.cc
> @@ -388,8 +388,8 @@ core_field_info (tree src, enum btf_core_reloc_kind 
> kind)
>
>src = root_for_core_field_info (src);
>
> -  get_inner_reference (src, &bitsize, &bitpos, &var_off, &mode, &unsignedp,
> -   &reversep, &volatilep);
> +  tree root = get_inner_reference (src, &bitsize, &bitpos, &var_off, &mode,
> +   &unsignedp, &reversep, &volatilep);
>
>/* Note: Use DECL_BIT_FIELD_TYPE rather than DECL_BIT_FIELD here, 
> because it
>   remembers whether the field in question was originally declared as a
> @@ -414,6 +414,23 @@ core_field_info (tree src, enum btf_core_reloc_kind 
> kind)
>  {
>  case BPF_RELO_FIELD_BYTE_OFFSET:
>{
> + result = 0;
> + if (var_off == NULL_TREE
> + && TREE_CODE (root) == INDIRECT_REF
> + && TREE_CODE (TREE_OPERAND (root, 0)) == POINTER_PLUS_EXPR)
> +   {
> + tree node = TREE_OPERAND (root, 0);
> + tree offset = TREE_OPERAND (node, 1);
> + tree type = TREE_TYPE (TREE_OPERAND (node, 0));
> + type = TREE_TYPE (type);
> +
> + gcc_assert (TREE_CODE (offset) == INTEGER_CST && tree_fits_shwi_p 
> (offset)
> + && COMPLETE_TYPE_P (type) && tree_fits_shwi_p (TYPE_SIZE 
> (type)));

 What if an expression with a non-constant offset (something like s+foo)
 is passed to the builtin?  Wouldn't it be better to error there instead
 of ICEing?

>>> In that case, var_off == NULL_TREE, and it did not reach the assert.
>>> In any case, please notice that this code was copied from some different
>>> code in the same file which in that case would actually produce the
>>> error earlier.  The assert is there as a safe guard just in case the
>>> other function stops detecting this case.
>>>
>>> In core-builtins.cc:572
>>>
>>> else if (code == POINTER_PLUS_EXPR)
>>>   {
>>> tree offset = TREE_OPERAND (node, 1);
>>> tree type = TREE_TYPE (TREE_OPERAND (node, 0));
>>> type = TREE_TYPE (type);
>>>
>>> if (TREE_CODE (offset) == INTEGER_CST && tree_fits_shwi_p (offset)
>>> && COMPLETE_TYPE_P (type) && tree_fits_shwi_p (TYPE_SIZE 
>>> (type)))
>>>   {
>>> HOST_WIDE_INT offset_i = tree_to_shwi (offset);
>>> HOST_WIDE_INT type_size_i = tree_to_shwi (TYPE_SIZE_UNIT 
>>> (type));
>>> if ((offset_i % type_size_i) == 0)
>>>   return offset_i / type_size_i;
>>>   }
>>>   }
>>>
>>> if (valid != NULL)
>>>   *valid = false;
>>> return -1;
>>>   }
>>>
>>> Because the code, although similar, is actually having different
>>> purposes, I decided not to abstract this in an independent function. My
>>> perception was that it would be more confusing.
>>>
>>> Without wanting to paste too much code, please notice that the function
>>> with the assert is only called if the above function, does not return
>>> with error (i.e. valid != false).
>>
>> Ok understood.
>> Please submit upstream.
>> Thanks!
>
> Heh 

[PATCH, v3] Fortran: improve array component description in runtime error message [PR30802]

2024-03-20 Thread Harald Anlauf

Hi Mikael, all,

here's now the third version of the patch that implements the following
scheme:

On 3/15/24 20:29, Mikael Morin wrote:

Le 15/03/2024 à 18:26, Harald Anlauf a écrit :

OK, that sounds interesting.  To clarify the options:

- for ordinary array x it would stay 'x'

- when z is a DT scalar, and z%x is the array in question, use 'z%x'
   (here z...%x would look strange to me)


Yes, the ellipsis would look strange to me as well.


- when z is a DT array, and x some component further down, 'z...%x'


This case also applies when z is a DT scalar and x is more than one
level deep.


I would rather not make the error message text vary too much to avoid
to run into issues with translation.  Would it be fine with you to have

... dimension 1 of array 'z...%x' above array bound ...

only?


OK, let's drop "component".


Anything else?


No, I think you covered everything.


I've created a new helper function that centralizes the generation of
the abbreviated name of the array (component) and use it to simplify
related code in multiple places.  If we change our mind how a bounds
violation error message should look like, it will be easier to adjust
in the future.

Is this OK for 14-mainline?

Thanks,
Harald


From 30d7cef086d440262b206bc39bcbcac89491b792 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 20 Mar 2024 20:59:24 +0100
Subject: [PATCH] Fortran: improve array component description in runtime error
 message [PR30802]

Runtime error messages for array bounds violation shall use the following
scheme for a coherent, abridged description of arrays or array components
of derived types:
(1) If x is an ordinary array variable, use "x"
(2) if z is a DT scalar and x an array component at level 1, use "z%x"
(3) if z is a DT scalar and x an array component at level > 1, or
if z is a DT array and x an array (at any level), use "z...%x"
Use a new helper function abridged_ref_name for construction of that name.

gcc/fortran/ChangeLog:

	PR fortran/30802
	* trans-array.cc (abridged_ref_name): New helper function.
	(trans_array_bound_check): Use it.
	(array_bound_check_elemental): Likewise.
	(gfc_conv_array_ref): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/30802
	* gfortran.dg/bounds_check_17.f90: Adjust pattern.
	* gfortran.dg/bounds_check_fail_8.f90: New test.
---
 gcc/fortran/trans-array.cc| 132 +++---
 gcc/testsuite/gfortran.dg/bounds_check_17.f90 |   2 +-
 .../gfortran.dg/bounds_check_fail_8.f90   |  56 
 3 files changed, 142 insertions(+), 48 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/bounds_check_fail_8.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 0a453828bad..30b84762346 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -3485,6 +3485,78 @@ gfc_conv_array_ubound (tree descriptor, int dim)
 }
 
 
+/* Generate abridged name of a part-ref for use in bounds-check message.
+   Cases:
+   (1) for an ordinary array variable x return "x"
+   (2) for z a DT scalar and array component x (at level 1) return "z%%x"
+   (3) for z a DT scalar and array component x (at level > 1) or
+   for z a DT array and array x (at any number of levels): "z...%%x"
+ */
+
+static char *
+abridged_ref_name (gfc_expr * expr, gfc_array_ref * ar)
+{
+  gfc_ref *ref;
+  gfc_symbol *sym;
+  char *ref_name = NULL;
+  const char *comp_name = NULL;
+  int len_sym, last_len = 0, level = 0;
+  bool sym_is_array;
+
+  gcc_assert (expr->expr_type == EXPR_VARIABLE && expr->ref != NULL);
+
+  sym = expr->symtree->n.sym;
+  sym_is_array = (sym->ts.type != BT_CLASS
+		  ? sym->as != NULL
+		  : IS_CLASS_ARRAY (sym));
+  len_sym = strlen (sym->name);
+
+  /* Scan ref chain to get name of the array component (when ar != NULL) or
+ array section, determine depth and remember its component name.  */
+  for (ref = expr->ref; ref; ref = ref->next)
+{
+  if (ref->type == REF_COMPONENT
+	  && strcmp (ref->u.c.component->name, "_data") != 0)
+	{
+	  level++;
+	  comp_name = ref->u.c.component->name;
+	  continue;
+	}
+
+  if (ref->type != REF_ARRAY)
+	continue;
+
+  if (ar)
+	{
+	  if (&ref->u.ar == ar)
+	break;
+	}
+  else if (ref->u.ar.type == AR_SECTION)
+	break;
+}
+
+  if (level > 0)
+last_len = strlen (comp_name);
+
+  /* Provide a buffer sufficiently large to hold "x...%%z".  */
+  ref_name = XNEWVEC (char, len_sym + last_len + 6);
+  strcpy (ref_name, sym->name);
+
+  if (level == 1 && !sym_is_array)
+{
+  strcat (ref_name, "%%");
+  strcat (ref_name, comp_name);
+}
+  else if (level > 0)
+{
+  strcat (ref_name, "...%%");
+  strcat (ref_name, comp_name);
+}
+
+  return ref_name;
+}
+
+
 /* Generate code to perform an array index bound check.  */
 
 static tree
@@ -3496,7 +3568,9 @@ trans_array_bound_check (gfc_se * se, gfc_ss *ss, tree index, int n,
   tree tmp_lo, tmp_up;
   tree descriptor;
   char *msg;
+  char *ref_name = NULL;
   const char 

Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-03-20 Thread NightStrike
On Wed, Mar 20, 2024, 07:50 Radek Barton  wrote:

> Hello, everyone.
>
> I've re-run the `x86_64-w64-mingw32` target tests with all the languages
> enabled, except of Ada and JIT, which are harder to bootstrap, and Go,
> which is not supported by MinGW. The summarized results are:
>
> 536176 of expected passes
> 13859 of unexpected failures
> 188 of unexpected successes
> 4537 of expected failures
> 8271 of unresolved testcases
> 20958 of unsupported tests
> 97.40% of tests completes in expected way
>

You can generate an integrated summary using dejagnu report card:

https://www.gnu.org/software/dejagnu/manual/Invoking-dejagnu-report-card.html


> and, again, zero regressions were detected.
>
> The detailed results can be reviewed at
> https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-build/actions/runs/8349019387
>
> Best regards,
>
> Radek Bartoň
>
> 
> From: Radek Barton 
> Sent: Monday, March 18, 2024 10:05 PM
> To: Evgeny Karpov; gcc-patches@gcc.gnu.org
> Cc: richard.sandif...@arm.com; Richard Earnshaw (lists); Andrew Pinski
> (QUIC); Maxim Kuvyrkov
> Subject: Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target
>
> Hello, everyone.
>
> Currently, we are able to provide results of regression testing for
> `x86_64-w64-mingw32` target with `--enable-languages=c,lto,c++,fortran`
> running in WSL only.
>
> The summarized results, both for the branch with patch set applied and its
> corresponding base branch, show:
>
> 517501 expected passes
> 4537 of expected failures
> 10828 unexpected failures
> 180 of unexpected successes
> 5934 of unresolved testcases
> 19113 of unsupported tests
>
> which means that 98% of the tests ends in an expected way and we haven't
> detected a single regression between the branches.
>
> The detailed results can be downloaded and reviewed at
> https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-build/actions/runs/8327889403
>
> Best regards,
>
> Radek Bartoň
>


Re: [PATCH gcc] Hurd x86_64: add unwind support for signal trampoline code

2024-03-20 Thread Thomas Schwinge
Hi!

Please note that emails to , or
 don't reach me anymore, and, at least for
the time being, likewise for  --
 is the new thing; see
.
(Or use , ,
, as before.)


On 2024-03-01T02:33:10+0100, Samuel Thibault  wrote:
> Flavio Cruz, le mer. 28 févr. 2024 22:59:09 -0500, a ecrit:
>> Tested with some simple toy examples where an exception is thrown in the
>> signal handler.
>> 
>> libgcc/ChangeLog:
>>  * config/i386/gnu-unwind.h: Support unwinding x86_64 signal frames.
>> 
>> Signed-off-by: Flavio Cruz 
>
> Reviewed-by: Samuel Thibault 

Thanks, pushed as commit b7c4ae5ace82b81dafffbc50e8026adfa3cc76e7.


Grüße
 Thomas


Re: RISC-V: Use convert instructions instead of calling library functions

2024-03-20 Thread Palmer Dabbelt

On Wed, 20 Mar 2024 11:54:34 PDT (-0700), Jeff Law wrote:



On 3/19/24 10:23 AM, Palmer Dabbelt wrote:

On Mon, 18 Mar 2024 20:50:14 PDT (-0700), jeffreya...@gmail.com wrote:



On 3/18/24 3:09 AM, Jivan Hakobyan wrote:

As RV has round instructions it is reasonable to use them instead of
calling the library functions.

With my patch for the following C code:
double foo(double a) {
     return ceil(a);
}

GCC generates the following ASM code (before it was tail call)
foo:
         fabs.d  fa4,fa0
         lui     a5,%hi(.LC0)
         fld     fa3,%lo(.LC0)(a5)
         flt.d   a5,fa4,fa3
         beq     a5,zero,.L3
         fcvt.l.d a5,fa0,rup


I'm not sure exactly what context this is in, but my reading of
"according to the current rounding mode" means we'd usually use the
dynamic rounding mode.

As Andrew W. noted, we're dealing with ceil and thus rup is the correct
rounding mode to use here.





My only worry here is that when we were doing the other patterns we
decided not to do rint.  I can't remember exactly why, but from reading
the docs we probably just skipped it due to the inexact handling and Zfa
having an instruction that just does this.  FP stuff is always a bit of
a time sink, so at that point it probably just fell off the priority list.

rint is supposed to raise FE_INEXACT, so it's actually a good match for
RISC-V fcvt semantics as they appropriately raise FE_INEXACT.

nearby* do not raise FE_INEXACT and thus would rely on the new Zfa
instructions where we have ones that do not raise FE_INEXACT or they
need to be conditional on flag_fp_int_builtin_inexact.  One could
reasonably argue that when flag_fp_int_builtin_inexact is enabled that a
call to nearby* ought to be converted into a call to rint*.
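
For illustration, a sketch (not from the patch) of the semantic difference
being discussed:

  #include <math.h>
  double r (double x) { return rint (x); }       /* rounds in the current mode and may raise
                                                    FE_INEXACT: a direct fit for fcvt        */
  double n (double x) { return nearbyint (x); }  /* same rounding, but must not raise
                                                    FE_INEXACT                               */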




I'm not really an FP guy, so I usually just poke around what the other
ports generate and try to figure out what's going on.  arm64 has the Zfa
instruction and x86 FP is complicated, so I'm not sure exactly who else
to look at for this sort of stuff.  From just looking at the code,
though, I think there's two issues -- I'm not really an FP person,
though, so take this with a grain of salt:

Right.  And the condition under which we use the new sequence for
ceil/round actually borrows from x86.  Essentially we only use the new
sequence when we've been told we don't care about FE_INEXACT or fp
exceptions in general.



IIUC what we've got here doesn't actually set the inexact flag for the
bounds check failure case, as we're just loading up an exact constant
and doing the bounds check.  We're also not clamping to INT_MAX-type
values, but not sure if we're supposed to.  I think we could fix both of
those by adjusting the expansion to something like

The state of FE_INEXACT is a don't care here due to the condition on the
expansion code.




          fabs.d  fa4,fa0
          lui     a5,%hi(.LC0)
          fld     fa3,%lo(.LC0)(a5)
          flt.d   a5,fa4,fa3
          bne     a5,zero,.L3
  mv  fa0, fa3
     .L3:
          fcvt.l.d a5,fa0,rup
          fcvt.d.l        fa4,a5
          fsgnj.d fa0,fa4,fa0
          ret

and then adjusting the constant to be an epsilon larger than INT_MAX so
it'll still trigger the clamping but also inexact.

I think Jivan's sequence is more correct.  It's not just INT_MAX here
that's concerning, there's a whole class of values that cause problems.


Ya, sorry, I thought I'd replied to Andrew's email somewhere -- I'd just 
managed to confuse myself about how the FP stuff works, I also think 
Jivan's code is correct now.



There's also a pair of changes to the ISA in 2020 that added the
conversion inexact handling requirement, it was a grey area before.  I
don't remember exactly what happened there, but I did remember it
happening.  I don't think anyone cares all that much about the
performance of systems that target the older ISAs, so maybe we just
restrict the non-libcall expansion to ISAs that contain the new wording?

I think all this got sufficiently cleaned up.  The spec is explicit
about when FE_INEXACT gets raised on the fcvt instructions.  I referred
to it repeatedly when analyzing Jivan's work.


We still have support for stuff like -misa-spec=2.2 (and some 2019 
releases with clunky version numbers).  Those all predate the 
convert/inexact wording that got added.


Though if FE_INEXACT is a don't care here, then I think it doesn't 
matter if the wording got changed.  In that case I think this is fine, 
so 


Reviewed-by: Palmer Dabbelt 


We can hash through the final items in a few weeks once the trunk
re-opens for development.


I think those were all the isuses on my end ;)




Jeff


Re: [PATCH gcc 1/3] Move GNU/Hurd startfile spec from config/i386/gnu.h to config/gnu.h

2024-03-20 Thread Thomas Schwinge
Hi!

On 2024-01-03T09:49:06+, Richard Sandiford  
wrote:
> The series looks good to me FWIW, but Thomas should have the final say.

Richard, thanks for your review.

Sergey, great work on aarch64 GNU/Hurd!  (... where these GCC bits
clearly were the less complicated part...)  ;-)

Please re-submit with ChangeLog updates added to the Git commit logs; see
 ->
, and/or 'git log'
for guidance.  You may use
'contrib/gcc-changelog/git_check_commit.py --print-changelog' to verify.


Grüße
 Thomas


Re: [PATCH v7 0/5] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-03-20 Thread Kees Cook
On Wed, Mar 20, 2024 at 01:15:13PM +, Qing Zhao wrote:
> This is the 7th version of the patch.

This happily builds the Linux kernel with all its hundreds of counted_by
annotations. My behavioral regression tests all pass too:

# PASSED: 19 / 19 tests passed.
# Totals: pass:17 fail:0 xfail:2 xpass:0 skip:0 error:0
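
For reference, a minimal sketch of the annotation in question (struct and
member names are made up; the semantics are those defined by the patch
series):

  struct pkt {
    int len;
    unsigned char data[] __attribute__ ((counted_by (len)));  /* bound of the FAM is 'len' */
  };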

Thanks!

-Kees

-- 
Kees Cook


Re: [gcc-15 2/3] RISC-V: avoid LUI based const mat: keep stack offsets aligned

2024-03-20 Thread Jeff Law




On 3/19/24 2:05 PM, Vineet Gupta wrote:


Just to be clear, the changes don't misalign the stack pointer at all.
They merely have the potential to create *another* pointer into the
stack which may or may not be aligned.  Which is totally normal, it's no
different than taking the address of a char on the stack.


Right I never saw any sp,sp,2047 getting generated - not even in the
first version of patch which lacked any filtering of stack regs via
riscv_reg_frame_related () and obviously didn't have the stack variant
of splitter. I don't know if that is just being lucky and not enough
testing exposure (I only spot checked buildroot libc, vmlinux) or
something somewhere enforces that.

However given that misaligned pointer off of stack is a non-issue, I
think we can do the following:

1. keep just one splitter with 2047 based predicates and constraint (and
not 2032) for both stack-related and general regs.
2. gate the splitter on only operands[0] being not stack related
(currently it checks for either [0] or [1]) - this allows the prominent
case where SP is simply a src, and avoids when any potential shenanigans
to SP itself.

Works for me.

Jeff



Re: RISC-V: Use convert instructions instead of calling library functions

2024-03-20 Thread Jeff Law




On 3/19/24 10:23 AM, Palmer Dabbelt wrote:

On Mon, 18 Mar 2024 20:50:14 PDT (-0700), jeffreya...@gmail.com wrote:



On 3/18/24 3:09 AM, Jivan Hakobyan wrote:

As RV has round instructions it is reasonable to use them instead of
calling the library functions.

With my patch for the following C code:
double foo(double a) {
     return ceil(a);
}

GCC generates the following ASM code (before it was tail call)
foo:
         fabs.d  fa4,fa0
         lui     a5,%hi(.LC0)
         fld     fa3,%lo(.LC0)(a5)
         flt.d   a5,fa4,fa3
         beq     a5,zero,.L3
         fcvt.l.d a5,fa0,rup


I'm not sure exactly what context this is in, but my reading of 
"according to the current rounding mode" means we'd usually use the 
dynamic rounding mode.
As Andrew W. noted, we're dealing with ceil and thus rup is the correct 
rounding mode to use here.






My only worry here is that when we were doing the other patterns we 
decided not to do rint.  I can't remember exactly why, but from reading 
the docs we probably just skipped it due to the inexact handling and Zfa 
having an instruction that just does this.  FP stuff is always a bit of 
a time sink, so at that point it probably just fell off the priority list.
rint is supposed to raise FE_INEXACT, so it's actually a good match for 
RISC-V fcvt semantics as they appropriately raise FE_INEXACT.


nearby* do not raise FE_INEXACT and thus would rely on the new Zfa 
instructions where we have ones that do not raise FE_INEXACT or they 
need to be conditional on flag_fp_int_builtin_inexact.  One could 
reasonably argue that when flag_fp_int_builtin_inexact is enabled that a 
call to nearby* ought to be converted into a call to rint*.





I'm not really an FP guy, so I usually just poke around what the other 
ports generate and try to figure out what's going on.  arm64 has the Zfa 
instruction and x86 FP is complicated, so I'm not sure exactly who else 
to look at for this sort of stuff.  From just looking at the code, 
though, I think there's two issues -- I'm not really an FP person, 
though, so take this with a grain of salt:
Right.  And the condition under which we use the new sequence for 
ceil/round actually borrows from x86.  Essentially we only use the new 
sequence when we've been told we don't care about FE_INEXACT or fp 
exceptions in general.




IIUC what we've got here doesn't actually set the inexact flag for the 
bounds check failure case, as we're just loading up an exact constant 
and doing the bounds check.  We're also not clamping to INT_MAX-type 
values, but not sure if we're supposed to.  I think we could fix both of 
those by adjusting the expansion to something like
The state of FE_INEXACT is a don't care here due to the condition on the 
expansion code.





          fabs.d  fa4,fa0
          lui     a5,%hi(.LC0)
          fld     fa3,%lo(.LC0)(a5)
          flt.d   a5,fa4,fa3
          bne     a5,zero,.L3
  mv  fa0, fa3
     .L3:
          fcvt.l.d a5,fa0,rup
          fcvt.d.l        fa4,a5
          fsgnj.d fa0,fa4,fa0
          ret

and then adjusting the constant to be an epsilon larger than INT_MAX so 
it'll still trigger the clamping but also inexact.
I think Jivan's sequence is more correct.  It's not just INT_MAX here 
that's concerning, there's a whole class of values that cause problems.




There's also a pair of changes to the ISA in 2020 that added the 
conversion inexact handling requirement, it was a grey area before.  I 
don't remember exactly what happened there, but I did remember it 
happening.  I don't think anyone cares all that much about the 
performance of systems that target the older ISAs, so maybe we just 
restrict the non-libcall expansion to ISAs that contain the new wording?
I think all this got sufficiently cleaned up.  The spec is explicit 
about when FE_INEXACT gets raised on the fcvt instructions.  I referred 
to it repeatedly when analyzing Jivan's work.


We can hash through the final items in a few weeks once the trunk 
re-opens for development.


Jeff


Re: [committed] m32r: Fix up m32r_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jeff Law




On 3/20/24 11:29 AM, Jakub Jelinek wrote:

On Wed, Mar 20, 2024 at 10:58:05AM -0600, Jeff Law wrote:

On 3/20/24 10:06 AM, Jakub Jelinek wrote:

Like for x86-64, alpha or rs6000, m32r seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.


That fixed c23-stdarg-{6,8,9} execution tests.


Thanks for these checks.  All I've done with the remaining targets
today (csky-linux epiphany-elf fr30-elf frv-linux ft32-elf m32r-elf mcore-elf
mmix-knuth-mmixware nds32-linux nios2-linux sh-linux visium-elf) was
build crosses (without binutils, so stopped shortly after building cc1),
make a copy of cc1, try expected tweaks in config/*/*.cc around the
TYPE_NO_NAMED_ARGS_STDARG_P uses unless it was clearly unnecessary,
make cc1 again and compare whether the c23-stdarg-9.c assembly is any different;
if yes, try to eyeball whether the changes are desired; if the assembly is
unmodified, throw away the changes.

With the changes by others (loongarch, mips, riscv) I hope we are good on
the c23-stdarg-*.c tests on all arches.
Forgot to note, arc-elf started passing c23-stdarg-9.c yesterday. 
Started an off-cycle run of the alpha, so figure this time tomorrow 
we'll have confirmation it's working.


Jeff


Re: _LIBCXX_DEBUG value initialized singular iterators assert failures in std algorithms [PR104316]

2024-03-20 Thread François Dumont

As proposed below, I also updated the gcc-13 branch.

    libstdc++: [_GLIBCXX_DEBUG] Define __cpp_lib_null_iterators

    _GLIBCXX_DEBUG now has fully N3344-compliant iterator checks, so we
    can define the __cpp_lib_null_iterators macro like in the normal mode.

    libstdc++-v3/ChangeLog:

    * include/std/iterator (__cpp_lib_null_iterators): Define
      regardless of _GLIBCXX_DEBUG.
    * include/std/version (__cpp_lib_null_iterators): Likewise.

François


On 20/03/2024 10:02, Jonathan Wakely wrote:

On Wed, 20 Mar 2024 at 05:59, François Dumont wrote:

Thanks to your doc:

  libstdc++: [_GLIBCXX_DEBUG] Define __[glibcxx,cpp_lib]_null_iterators

  _GLIBCXX_DEBUG has now fully N3344 compliant iterator checks, we
can define
  __glibcxx_null_iterators and __cpp_lib_null_iterators macros like
the normal
  mode.

  libstdc++-v3/ChangeLog:

  * version.def (null_iterators): Remove extra_cond.
  * version.h: Regenerate.

Ok to commit?

Please don't bother talking about __glibcxx_null_iterators in the
commit message, that's an implementation detail that always mirrors
the standard-defined __cpp_lib_null_iterators one. The first line of
the commit will be much easier to read without that.

OK with that change, thanks.


I already noticed that GCC 13 has no version.h file so no backport question.

It has no version.h but it still has the macros:

include/std/iterator:# define __cpp_lib_null_iterators 201304L
include/std/version:# define __cpp_lib_null_iterators 201304L

Those definitions can be made to not depend on _GLIBCXX_DEBUG.
diff --git a/libstdc++-v3/include/std/iterator 
b/libstdc++-v3/include/std/iterator
index 695e18e2c47..a0a8eac570b 100644
--- a/libstdc++-v3/include/std/iterator
+++ b/libstdc++-v3/include/std/iterator
@@ -67,7 +67,7 @@
 #endif
 #include 
 
-#if __cplusplus >= 201402L && ! defined _GLIBCXX_DEBUG // PR libstdc++/70303
+#if __cplusplus >= 201402L
 # define __cpp_lib_null_iterators 201304L
 #endif
 
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index bd1bee0190d..ee515c4e66c 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -81,9 +81,7 @@
 #define __cpp_lib_integral_constant_callable 201304L
 #define __cpp_lib_is_final 201402L
 #define __cpp_lib_make_reverse_iterator 201402L
-#ifndef _GLIBCXX_DEBUG // PR libstdc++/70303
-# define __cpp_lib_null_iterators 201304L
-#endif
+#define __cpp_lib_null_iterators 201304L
 #define __cpp_lib_robust_nonmodifying_seq_ops 201304L
 #define __cpp_lib_transformation_trait_aliases 201304L
 #define __cpp_lib_transparent_operators 201510L
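
For reference, a small user-side sketch (my own, not part of either patch) of what advertising the macro in debug mode enables; with -D_GLIBCXX_DEBUG this is now expected to compile and run without tripping the debug-mode singular-iterator checks:

#include <iterator>
#include <vector>
#include <cassert>

int main()
{
#if defined __cpp_lib_null_iterators && __cpp_lib_null_iterators >= 201304L
  std::vector<int>::iterator a{}, b{};  // value-initialized ("null") iterators
  assert (a == b);                      // must compare equal, even in debug mode
#endif
}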


Re: [committed] csky: Fix up csky_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jeff Law




On 3/20/24 10:04 AM, Jakub Jelinek wrote:

Hi!

Like for x86-64, alpha or rs6000, csky seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/csky/csky.cc (csky_setup_incoming_varargs): Only skip
csky_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.
I'm not set up to do execution tests with glibc+qemu.  So for csky & 
nios2 we just have the build successes.


jeff



Re: [committed] m32r: Fix up m32r_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
On Wed, Mar 20, 2024 at 10:58:05AM -0600, Jeff Law wrote:
> On 3/20/24 10:06 AM, Jakub Jelinek wrote:
> > Like for x86-64, alpha or rs6000, m32r seems to be affected too.
> > 
> > Just visually checked differences in c23-stdarg-9.c assembly in a cross
> > without/with the patch, committed to trunk.
> > 
> That fixed c23-stdarg-{6,8,9} execution tests.

Thanks for these checks.  All I've done with the remaining targets
today (csky-linux epiphany-elf fr30-elf frv-linux ft32-elf m32r-elf mcore-elf
mmix-knuth-mmixware nds32-linux nios2-linux sh-linux visium-elf) was
build crosses (without binutils, so stopped shortly after building cc1),
make a copy of cc1, try expected tweaks in config/*/*.cc around the
TYPE_NO_NAMED_ARGS_STDARG_P uses unless it was clearly unnecessary,
make cc1 again and compare whether the c23-stdarg-9.c assembly is any different;
if yes, try to eyeball whether the changes are desired; if the assembly is
unmodified, throw away the changes.

With the changes by others (loongarch, mips, riscv) I hope we are good on
the c23-stdarg-*.c tests on all arches.

Jakub



Re: [committed] visium: Fix up visium_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jeff Law




On 3/20/24 10:08 AM, Jakub Jelinek wrote:

Hi!

Like for x86-64, alpha or rs6000, visium seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/visium/visium.cc (visium_setup_incoming_varargs): Only skip
TARGET_FUNCTION_ARG_ADVANCE for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

No gdbsim, so no execution results.  Builds fine though.

jeff



Re: [committed] nds32: Fix up nds32_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jeff Law




On 3/20/24 10:06 AM, Jakub Jelinek wrote:

Hi!

Like for x86-64, alpha or rs6000, nds32 seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/nds32/nds32.cc (nds32_setup_incoming_varargs): Only skip
function arg advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.
No gdbsim for the nds32 processor.  So no execution tests.  Built fine 
though.


jeff



Re: [committed] ft32: Fix up ft32_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jeff Law




On 3/20/24 10:05 AM, Jakub Jelinek wrote:

Hi!

Like for x86-64, alpha or rs6000, ft32 seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/ft32/ft32.cc (ft32_setup_incoming_varargs): Only skip
function arg advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

Built fine.  No gdbsim on this platform, so no execution test data.

jeff



Re: [committed] epiphany: Fix up epiphany_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jeff Law




On 3/20/24 10:05 AM, Jakub Jelinek wrote:

Hi!

Like for x86-64, alpha or rs6000, epiphany seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/epiphany/epiphany.cc (epiphany_setup_incoming_varargs): Only
skip function arg advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.
Built fine.  Platform has been too flakey to run tests with any 
consistency.  So no real test results.


jeff



Re: [committed] m32r: Fix up m32r_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jeff Law




On 3/20/24 10:06 AM, Jakub Jelinek wrote:

Hi!

Like for x86-64, alpha or rs6000, m32r seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.


That fixed c23-stdarg-{6,8,9} execution tests.

jeff



[committed] visium: Fix up visium_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
Hi!

Like for x86-64, alpha or rs6000, visium seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/visium/visium.cc (visium_setup_incoming_varargs): Only skip
TARGET_FUNCTION_ARG_ADVANCE for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

--- gcc/config/visium/visium.cc.jj  2024-01-03 11:51:47.166514192 +0100
+++ gcc/config/visium/visium.cc 2024-03-20 16:14:51.613422079 +0100
@@ -1482,7 +1482,8 @@ visium_setup_incoming_varargs (cumulativ
   /* The caller has advanced ARGS_SO_FAR up to, but not beyond, the last named
  argument.  Advance a local copy of ARGS_SO_FAR past the last "real" named
  argument, to find out how many registers are left over.  */
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
 TARGET_FUNCTION_ARG_ADVANCE (local_args_so_far, arg);
 
   /* Find how many registers we need to save.  */

Jakub



[committed] nios2: Fix up nios2_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
Hi!

Like for x86-64, alpha or rs6000, nios2 seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/nios2/nios2.cc (nios2_setup_incoming_varargs): Only skip
nios2_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

--- gcc/config/nios2/nios2.cc.jj  2024-01-03 12:01:15.414592108 +0100
+++ gcc/config/nios2/nios2.cc   2024-03-20 16:14:07.051020226 +0100
@@ -3525,7 +3525,8 @@ nios2_setup_incoming_varargs (cumulative
 
   cfun->machine->uses_anonymous_args = 1;
   local_cum = *cum;
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
 nios2_function_arg_advance (local_cum_v, arg);
 
   regs_to_push = NUM_ARG_REGS - local_cum.regs_used;

Jakub



[committed] nds32: Fix up nds32_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
Hi!

Like for x86-64, alpha or rs6000, nds32 seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/nds32/nds32.cc (nds32_setup_incoming_varargs): Only skip
function arg advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

--- gcc/config/nds32/nds32.cc.jj  2024-01-03 12:01:22.862487765 +0100
+++ gcc/config/nds32/nds32.cc   2024-03-20 16:13:47.460283181 +0100
@@ -2374,7 +2374,8 @@ nds32_setup_incoming_varargs (cumulative
  for varargs.  */
   total_args_regs
 = NDS32_MAX_GPR_REGS_FOR_ARGS + NDS32_GPR_ARG_FIRST_REGNUM;
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
 num_of_used_regs
   = NDS32_AVAILABLE_REGNUM_FOR_GPR_ARG (cum->gpr_offset, arg.mode, 
arg.type)
 + NDS32_NEED_N_REGS_FOR_ARG (arg.mode, arg.type);

Jakub



[committed] m32r: Fix up m32r_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
Hi!

Like for x86-64, alpha or rs6000, m32r seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/m32r/m32r.cc (m32r_setup_incoming_varargs): Only skip
function arg advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

--- gcc/config/m32r/m32r.cc.jj  2024-01-03 11:51:50.204472028 +0100
+++ gcc/config/m32r/m32r.cc 2024-03-20 16:12:15.841512952 +0100
@@ -1287,7 +1287,8 @@ m32r_setup_incoming_varargs (cumulative_
   if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
 gcc_assert (arg.mode != BLKmode);
 
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
 first_anon_arg = (ROUND_ADVANCE_CUM (*get_cumulative_args (cum),
 arg.mode, arg.type)
  + ROUND_ADVANCE_ARG (arg.mode, arg.type));

Jakub



[committed] ft32: Fix up ft32_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
Hi!

Like for x86-64, alpha or rs6000, ft32 seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/ft32/ft32.cc (ft32_setup_incoming_varargs): Only skip
function arg advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

--- gcc/config/ft32/ft32.cc.jj  2024-01-03 12:01:22.988486000 +0100
+++ gcc/config/ft32/ft32.cc 2024-03-20 16:11:41.804969813 +0100
@@ -635,9 +635,10 @@ ft32_setup_incoming_varargs (cumulative_
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
   int named_size = 0;
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
-named_size =
-  GET_MODE_SIZE (SImode) * (*cum - FT32_R0) + GET_MODE_SIZE (arg.mode);
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
+named_size
+  = GET_MODE_SIZE (SImode) * (*cum - FT32_R0) + GET_MODE_SIZE (arg.mode);
 
   if (named_size < 24)
 *pretend_size = 24 - named_size;

Jakub



[committed] epiphany: Fix up epiphany_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
Hi!

Like for x86-64, alpha or rs6000, epiphany seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/epiphany/epiphany.cc (epiphany_setup_incoming_varargs): Only
skip function arg advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

--- gcc/config/epiphany/epiphany.cc.jj  2024-01-03 11:51:47.401510931 +0100
+++ gcc/config/epiphany/epiphany.cc 2024-03-20 16:10:08.152226880 +0100
@@ -728,7 +728,8 @@ epiphany_setup_incoming_varargs (cumulat
 gcc_assert (arg.mode != BLKmode);
 
   next_cum = *get_cumulative_args (cum);
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
 next_cum = (ROUND_ADVANCE_CUM (next_cum, arg.mode, arg.type)
+ ROUND_ADVANCE_ARG (arg.mode, arg.type));
   first_anon_arg = next_cum;

Jakub



[committed] csky: Fix up csky_setup_incoming_varargs [PR114175]

2024-03-20 Thread Jakub Jelinek
Hi!

Like for x86-64, alpha or rs6000, csky seems to be affected too.

Just visually checked differences in c23-stdarg-9.c assembly in a cross
without/with the patch, committed to trunk.

2024-03-20  Jakub Jelinek  

PR target/114175
* config/csky/csky.cc (csky_setup_incoming_varargs): Only skip
csky_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P functions
if arg.type is NULL.

--- gcc/config/csky/csky.cc.jj  2024-01-03 11:51:47.818505143 +0100
+++ gcc/config/csky/csky.cc 2024-03-20 16:09:23.765822660 +0100
@@ -2090,7 +2090,8 @@ csky_setup_incoming_varargs (cumulative_
 
   cfun->machine->uses_anonymous_args = 1;
   local_cum = *pcum;
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
 csky_function_arg_advance (local_cum_v, arg);
   regs_to_push = CSKY_NPARM_REGS - local_cum.reg;
   if (regs_to_push)

Jakub



Re: [PATCH 1/1] aarch64: Sync aarch64-sys-regs.def with Binutils

2024-03-20 Thread Yury Khrustalev
Hi Richard,

> On 20 Mar 2024, at 1:55 PM, Richard Earnshaw (lists) 
>  wrote:
> 
> On 20/03/2024 11:21, Yury Khrustalev wrote:
>> This patch updates `aarch64-sys-regs.def', bringing it into sync with
>> the Binutils source.
>> 
>> gcc/ChangeLog:
>> 
>> * config/aarch64/aarch64-sys-regs.def: Copy from Binutils.
> 
> Thanks, I've pushed this.  It's trivial enough and there's value in keeping 
> it in sync with binutils.

Thanks!

> 
> One comment though, there should be one hard tab before "* config/..."; you 
> seem to have some other random characters there that looked like white space.

Thanks for the feedback, I'll make sure to avoid this in the future patches.

> 
> R.
> 
>> ---
>> gcc/config/aarch64/aarch64-sys-regs.def | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
>> b/gcc/config/aarch64/aarch64-sys-regs.def
>> index 6a948171d6e..8b65673a5d6 100644
>> --- a/gcc/config/aarch64/aarch64-sys-regs.def
>> +++ b/gcc/config/aarch64/aarch64-sys-regs.def
>> @@ -521,6 +521,7 @@
>>   SYSREG ("id_aa64isar0_el1", CPENC (3,0,0,6,0), F_REG_READ, 
>> AARCH64_NO_FEATURES)
>>   SYSREG ("id_aa64isar1_el1", CPENC (3,0,0,6,1), F_REG_READ, 
>> AARCH64_NO_FEATURES)
>>   SYSREG ("id_aa64isar2_el1", CPENC (3,0,0,6,2), F_REG_READ, 
>> AARCH64_NO_FEATURES)
>> +  SYSREG ("id_aa64isar3_el1", CPENC (3,0,0,6,3), F_REG_READ, 
>> AARCH64_NO_FEATURES)
>>   SYSREG ("id_aa64mmfr0_el1", CPENC (3,0,0,7,0), F_REG_READ, 
>> AARCH64_NO_FEATURES)
>>   SYSREG ("id_aa64mmfr1_el1", CPENC (3,0,0,7,1), F_REG_READ, 
>> AARCH64_NO_FEATURES)
>>   SYSREG ("id_aa64mmfr2_el1", CPENC (3,0,0,7,2), F_REG_READ, 
>> AARCH64_NO_FEATURES)
> 

Kind regards,
Yury




Re: [PATCH 1/1] aarch64: Sync aarch64-sys-regs.def with Binutils

2024-03-20 Thread Richard Earnshaw (lists)
On 20/03/2024 11:21, Yury Khrustalev wrote:
> This patch updates `aarch64-sys-regs.def', bringing it into sync with
> the Binutils source.
> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-sys-regs.def: Copy from Binutils.

Thanks, I've pushed this.  It's trivial enough and there's value in keeping it 
in sync with binutils.

One comment though, there should be one hard tab before "* config/..."; you 
seem to have some other random characters there that looked like white space.

R.

> ---
>  gcc/config/aarch64/aarch64-sys-regs.def | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
> b/gcc/config/aarch64/aarch64-sys-regs.def
> index 6a948171d6e..8b65673a5d6 100644
> --- a/gcc/config/aarch64/aarch64-sys-regs.def
> +++ b/gcc/config/aarch64/aarch64-sys-regs.def
> @@ -521,6 +521,7 @@
>SYSREG ("id_aa64isar0_el1",CPENC (3,0,0,6,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar1_el1",CPENC (3,0,0,6,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64isar2_el1",CPENC (3,0,0,6,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)
> +  SYSREG ("id_aa64isar3_el1",CPENC (3,0,0,6,3),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr0_el1",CPENC (3,0,0,7,0),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr1_el1",CPENC (3,0,0,7,1),  F_REG_READ, 
> AARCH64_NO_FEATURES)
>SYSREG ("id_aa64mmfr2_el1",CPENC (3,0,0,7,2),  F_REG_READ, 
> AARCH64_NO_FEATURES)



[PATCH v7 4/5] Use the .ACCESS_WITH_SIZE in bound sanitizer.

2024-03-20 Thread Qing Zhao
gcc/c-family/ChangeLog:

* c-ubsan.cc (get_bound_from_access_with_size): New function.
(ubsan_instrument_bounds): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-3.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds-4.c: New test.
* gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
---
 gcc/c-family/c-ubsan.cc   | 42 +
 .../ubsan/flex-array-counted-by-bounds-2.c| 45 ++
 .../ubsan/flex-array-counted-by-bounds-3.c| 34 ++
 .../ubsan/flex-array-counted-by-bounds-4.c| 34 ++
 .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
 5 files changed, 201 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-3.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-4.c
 create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 940982819ddf..7cd3c6aa5b88 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -376,6 +376,40 @@ ubsan_instrument_return (location_t loc)
   return build_call_expr_loc (loc, t, 1, build_fold_addr_expr_loc (loc, data));
 }
 
+/* Get the tree that represented the number of counted_by, i.e, the maximum
+   number of the elements of the object that the call to .ACCESS_WITH_SIZE
+   points to, this number will be the bound of the corresponding array.  */
+static tree
+get_bound_from_access_with_size (tree call)
+{
+  if (!is_access_with_size_p (call))
+return NULL_TREE;
+
+  tree ref_to_size = CALL_EXPR_ARG (call, 1);
+  unsigned int class_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2));
+  tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3));
+  tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size),
+  build_int_cst (ptr_type_node, 0));
+  /* If size is negative value, treat it as zero.  */
+  if (!TYPE_UNSIGNED (type))
+  {
+tree cond = fold_build2 (LT_EXPR, boolean_type_node,
+unshare_expr (size), build_zero_cst (type));
+size = fold_build3 (COND_EXPR, type, cond,
+   build_zero_cst (type), size);
+  }
+
+  /* Only when class_of_size is 1, i.e, the number of the elements of
+ the object type, return the size.  */
+  if (class_of_size != 1)
+return NULL_TREE;
+  else
+size = fold_convert (sizetype, size);
+
+  return size;
+}
+
+
 /* Instrument array bounds for ARRAY_REFs.  We create special builtin,
that gets expanded in the sanopt pass, and make an array dimension
of it.  ARRAY is the array, *INDEX is an index to the array.
@@ -401,6 +435,14 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
*index,
  && COMPLETE_TYPE_P (type)
  && integer_zerop (TYPE_SIZE (type)))
bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
+  else if (INDIRECT_REF_P (array)
+  && is_access_with_size_p ((TREE_OPERAND (array, 0))))
+   {
+ bound = get_bound_from_access_with_size ((TREE_OPERAND (array, 0)));
+ bound = fold_build2 (MINUS_EXPR, TREE_TYPE (bound),
+  bound,
+  build_int_cst (TREE_TYPE (bound), 1));
+   }
   else
return NULL_TREE;
 }
diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
new file mode 100644
index ..148934975ee5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
@@ -0,0 +1,45 @@
+/* test the attribute counted_by and its usage in
+   bounds sanitizer combined with VLA.  */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds" } */
+/* { dg-output "index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 20 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 11 out of bounds for type 'int 
\\\[\\\*\\\]\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int 
\\\[\\\*\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+
+#include <stdlib.h>
+
+void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
+{
+   struct foo {
+   int n;
+   int p[][n] __attribute__((counted_by(n)));
+   } *f;
+
+   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
+   f->n = m;
+   f->p[m][n-1]=1;
+   return;
+}
+
+void __attribute__((__noinline__)) setup_and_test_vla_1 (int n1, int n2, int m)
+{
+  struct foo {
+int n;
+int p[][n2][n1] __attribute__((counted_by(n)));
+  } *f;
+
+  f = (struct foo *) malloc 

[PATCH v7 5/5] Add the 6th argument to .ACCESS_WITH_SIZE

2024-03-20 Thread Qing Zhao
to carry the TYPE of the flexible array.

Such information is needed in tree-object-size.cc.

We cannot use the result type or the type of the 1st argument
of the routine .ACCESS_WITH_SIZE to decide the element type
of the original array due to possible type casting in the
source code.

gcc/c/ChangeLog:

* c-typeck.cc (build_access_with_size_for_counted_by): Add the 6th
argument to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): Use the type
of the 6th argument for the type of the element.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-6.c: New test.
---
 gcc/c/c-typeck.cc | 11 +++--
 gcc/internal-fn.cc|  2 +
 .../gcc.dg/flex-array-counted-by-6.c  | 46 +++
 gcc/tree-object-size.cc   | 16 ---
 4 files changed, 66 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-6.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index a29a7d7ec029..c17ac6862546 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2608,7 +2608,8 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
 
to:
 
-   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1))
+   (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1,
+   (TYPE_OF_ARRAY *)0))
 
NOTE: The return type of this function is the POINTER type pointing
to the original flexible array type.
@@ -2620,6 +2621,9 @@ build_counted_by_ref (tree datum, tree subdatum, tree 
*counted_by_type)
The 4th argument of the call is a constant 0 with the TYPE of the
object pointed by COUNTED_BY_REF.
 
+   The 6th argument of the call is a constant 0 with the pointer TYPE
+   to the original flexible array type.
+
   */
 static tree
 build_access_with_size_for_counted_by (location_t loc, tree ref,
@@ -2632,12 +2636,13 @@ build_access_with_size_for_counted_by (location_t loc, 
tree ref,
 
   tree call
 = build_call_expr_internal_loc (loc, IFN_ACCESS_WITH_SIZE,
-   result_type, 5,
+   result_type, 6,
array_to_pointer_conversion (loc, ref),
counted_by_ref,
build_int_cst (integer_type_node, 1),
build_int_cst (counted_by_type, 0),
-   build_int_cst (integer_type_node, -1));
+   build_int_cst (integer_type_node, -1),
+   build_int_cst (result_type, 0));
   /* Wrap the call with an INDIRECT_REF with the flexible array type.  */
   call = build1 (INDIRECT_REF, TREE_TYPE (ref), call);
   SET_EXPR_LOCATION (call, loc);
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index e744080ee670..34e4a4aea534 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3411,6 +3411,8 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
  1: read_only
  2: write_only
  3: read_write
+   6th argument: A constant 0 with the pointer TYPE to the original flexible
+ array type.
 
Both the return type and the type of the first argument of this
function have been converted from the incomplete array type to
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
new file mode 100644
index ..65a401796479
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-6.c
@@ -0,0 +1,46 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size. when the type of the flexible array member
+ * is casting to another type.  */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+typedef unsigned short u16;
+
+struct info {
+   u16 data_len;
+   char data[] __attribute__((counted_by(data_len)));
+};
+
+struct foo {
+   int a;
+   int b;
+};
+
+static __attribute__((__noinline__))
+struct info *setup ()
+{
+ struct info *p;
+ size_t bytes = 3 * sizeof(struct foo);
+
+ p = (struct info *)malloc (sizeof (struct info) + bytes);
+ p->data_len = bytes;
+
+ return p;
+}
+
+static void
+__attribute__((__noinline__)) report (struct info *p)
+{
+ struct foo *bar = (struct foo *)p->data;
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 1), 1), 16);
+ EXPECT(__builtin_dynamic_object_size((char *)(bar + 2), 1), 8);
+}
+
+int main(int argc, char *argv[])
+{
+ struct info *p = setup();
+ report(p);
+ return 0;
+}
diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
index d258d0947545..ee9a0415c21c 100644
--- a/gcc/tree-object-size.cc
+++ b/gcc/tree-object-size.cc
@@ -762,9 +762,11 @@ addr_object_size (struct object_size_info *osi, const_tree 
ptr,
  1: the number of the elements of the object type;
4th argument 

[PATCH v7 3/5] Use the .ACCESS_WITH_SIZE in builtin object size.

2024-03-20 Thread Qing Zhao
gcc/ChangeLog:

* tree-object-size.cc (access_with_size_object_size): New function.
(call_object_size): Call the new function.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-common.h: Add a new macro EXPECT.
* gcc.dg/flex-array-counted-by-3.c: New test.
* gcc.dg/flex-array-counted-by-4.c: New test.
* gcc.dg/flex-array-counted-by-5.c: New test.
---
 .../gcc.dg/builtin-object-size-common.h   |  11 ++
 .../gcc.dg/flex-array-counted-by-3.c  |  63 +++
 .../gcc.dg/flex-array-counted-by-4.c  | 178 ++
 .../gcc.dg/flex-array-counted-by-5.c  |  48 +
 gcc/tree-object-size.cc   |  60 ++
 5 files changed, 360 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-5.c

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-common.h 
b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
index 66ff7cdd953a..b677067c6e6b 100644
--- a/gcc/testsuite/gcc.dg/builtin-object-size-common.h
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-common.h
@@ -30,3 +30,14 @@ unsigned nfails = 0;
   __builtin_abort ();\
 return 0;\
   } while (0)
+
+#define EXPECT(p, _v) do {   \
+  size_t v = _v; \
+  if (p == v)\
+__builtin_printf ("ok:  %s == %zd\n", #p, p);\
+  else   \
+{\
+  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v);\
+  FAIL ();   \
+}\
+} while (0);
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
new file mode 100644
index ..0066c32ca808
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
@@ -0,0 +1,63 @@
+/* test the attribute counted_by and its usage in
+ * __builtin_dynamic_object_size.  */ 
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "builtin-object-size-common.h"
+
+struct flex {
+  int b;
+  int c[];
+} *array_flex;
+
+struct annotated {
+  int b;
+  int c[] __attribute__ ((counted_by (b)));
+} *array_annotated;
+
+struct nested_annotated {
+  struct {
+union {
+  int b;
+  float f; 
+};
+int n;
+  };
+  int c[] __attribute__ ((counted_by (b)));
+} *array_nested_annotated;
+
+void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
+{
+  array_flex
+= (struct flex *)malloc (sizeof (struct flex)
++ normal_count *  sizeof (int));
+  array_flex->b = normal_count;
+
+  array_annotated
+= (struct annotated *)malloc (sizeof (struct annotated)
+ + attr_count *  sizeof (int));
+  array_annotated->b = attr_count;
+
+  array_nested_annotated
+= (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
++ attr_count *  sizeof (int));
+  array_nested_annotated->b = attr_count;
+
+  return;
+}
+
+void __attribute__((__noinline__)) test ()
+{
+EXPECT(__builtin_dynamic_object_size(array_flex->c, 1), -1);
+EXPECT(__builtin_dynamic_object_size(array_annotated->c, 1),
+  array_annotated->b * sizeof (int));
+EXPECT(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
+  array_nested_annotated->b * sizeof (int));
+}
+
+int main(int argc, char *argv[])
+{
+  setup (10,10);   
+  test ();
+  DONE ();
+}
diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c 
b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
new file mode 100644
index ..3ce7f3545549
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-4.c
@@ -0,0 +1,178 @@
+/* test the attribute counted_by and its usage in
+__builtin_dynamic_object_size: what's the correct behavior when the
+allocation size mismatched with the value of counted_by attribute?
+we should always use the latest value that is hold by the counted_by
+field.  */
+/* { dg-do run } */
+/* { dg-options "-O -fstrict-flex-arrays=3" } */
+
+#include "builtin-object-size-common.h"
+
+struct annotated {
+  size_t foo;
+  char others;
+  char array[] __attribute__((counted_by (foo)));
+};
+
+#define noinline __attribute__((__noinline__))
+#define SIZE_BUMP 10 
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+
+/* In general, Due to type casting, the type for the pointee of a pointer
+   does not say 

[PATCH v7 2/5] Convert references with "counted_by" attributes to/from .ACCESS_WITH_SIZE.

2024-03-20 Thread Qing Zhao
Including the following changes:
* The definition of the new internal function .ACCESS_WITH_SIZE
  in internal-fn.def.
* C FE converts every reference to a FAM with a "counted_by" attribute
  to a call to the internal function .ACCESS_WITH_SIZE.
  (build_component_ref in c_typeck.cc)

  This includes the case when the object is statically allocated and
  initialized.
  In order to make this work, the routines initializer_constant_valid_p_1
  and output_constant in varasm.cc are updated to handle calls to
  .ACCESS_WITH_SIZE.
  (initializer_constant_valid_p_1 and output_constant in varasm.c)

  However, for the reference inside "offsetof", the "counted_by" attribute is
  ignored since it's not useful at all.
  (c_parser_postfix_expression in c/c-parser.cc)

  In addition to "offsetof", for the reference inside operator "typeof" and
  "alignof", we ignore counted_by attribute too.

  When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
  replace the call with its first argument.

* Convert every call to .ACCESS_WITH_SIZE to its first argument.
  (expand_ACCESS_WITH_SIZE in internal-fn.cc)
* Adjust alias analysis to exclude the new internal from clobbering anything.
  (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in tree-ssa-alias.cc)
* Adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE when
  its LHS is eliminated as dead code.
  (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
* Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
  get the reference from the call to .ACCESS_WITH_SIZE.
  (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)

gcc/c/ChangeLog:

* c-parser.cc (c_parser_postfix_expression): Ignore the counted-by
attribute when build_component_ref inside offsetof operator.
* c-tree.h (build_component_ref): Add one more parameter.
* c-typeck.cc (build_counted_by_ref): New function.
(build_access_with_size_for_counted_by): New function.
(build_component_ref): Check the counted-by attribute and build
call to .ACCESS_WITH_SIZE.
(build_unary_op): When building ADDR_EXPR for
.ACCESS_WITH_SIZE, use its first argument.
(lvalue_p): Accept call to .ACCESS_WITH_SIZE.

gcc/ChangeLog:

* internal-fn.cc (expand_ACCESS_WITH_SIZE): New function.
* internal-fn.def (ACCESS_WITH_SIZE): New internal function.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Special case
IFN_ACCESS_WITH_SIZE.
(call_may_clobber_ref_p_1): Special case IFN_ACCESS_WITH_SIZE.
* tree-ssa-dce.cc (eliminate_unnecessary_stmts): Eliminate the call
to .ACCESS_WITH_SIZE when its LHS is dead.
* tree.cc (process_call_operands): Adjust side effect for function
.ACCESS_WITH_SIZE.
(is_access_with_size_p): New function.
(get_ref_from_access_with_size): New function.
* tree.h (is_access_with_size_p): New prototype.
(get_ref_from_access_with_size): New prototype.
* varasm.cc (initializer_constant_valid_p_1): Handle call to
.ACCESS_WITH_SIZE.
(output_constant): Handle call to .ACCESS_WITH_SIZE.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-2.c: New test.
---
 gcc/c/c-parser.cc |  10 +-
 gcc/c/c-tree.h|   2 +-
 gcc/c/c-typeck.cc | 128 +-
 gcc/internal-fn.cc|  35 +
 gcc/internal-fn.def   |   4 +
 .../gcc.dg/flex-array-counted-by-2.c  | 112 +++
 gcc/tree-ssa-alias.cc |   2 +
 gcc/tree-ssa-dce.cc   |   5 +-
 gcc/tree.cc   |  25 +++-
 gcc/tree.h|   8 ++
 gcc/varasm.cc |  10 ++
 11 files changed, 331 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index c31349dae2ff..a6ed5ac43bb1 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10850,9 +10850,12 @@ c_parser_postfix_expression (c_parser *parser)
if (c_parser_next_token_is (parser, CPP_NAME))
  {
c_token *comp_tok = c_parser_peek_token (parser);
+   /* Ignore the counted_by attribute for reference inside
+  offsetof since the information is not useful at all.  */
offsetof_ref
  = build_component_ref (loc, offsetof_ref, comp_tok->value,
-comp_tok->location, UNKNOWN_LOCATION);
+comp_tok->location, UNKNOWN_LOCATION,
+false);
c_parser_consume_token (parser);
while (c_parser_next_token_is (parser, CPP_DOT)
   || 

[PATCH v7 1/5] Provide counted_by attribute to flexible array member field (PR108896)

2024-03-20 Thread Qing Zhao
'counted_by (COUNT)'
 The 'counted_by' attribute may be attached to the C99 flexible
 array member of a structure.  It indicates that the number of the
 elements of the array is given by the field named "COUNT" in the
 same structure as the flexible array member.
 GCC may use this information to improve detection of object size 
information
 for such structures and provide better results in compile-time diagnostics
 and runtime features like the array bound sanitizer and
 the '__builtin_dynamic_object_size'.

 For instance, the following code:

  struct P {
size_t count;
char other;
char array[] __attribute__ ((counted_by (count)));
  } *p;

 specifies that the 'array' is a flexible array member whose number
 of elements is given by the field 'count' in the same structure.

 The field that represents the number of the elements should have an
 integer type.  Otherwise, the compiler will report a warning and
 ignore the attribute.

 When the field that represents the number of the elements is assigned a
 negative integer value, the compiler will treat the value as zero.

 An explicit 'counted_by' annotation defines a relationship between
 two objects, 'p->array' and 'p->count', and there are the following
 requirements on the relationship between this pair:

* 'p->count' must be initialized before the first reference to
  'p->array';

* 'p->array' has _at least_ 'p->count' number of elements
  available all the time.  This relationship must hold even
  after any of these related objects are updated during the
  program.

 It's the user's responsibility to make sure the above requirements
 are kept all the time.  Otherwise the compiler will report
 warnings; at the same time, the results of the array bound
 sanitizer and the '__builtin_dynamic_object_size' are undefined.

 One important feature of the attribute is that a reference to the
 flexible array member field will use the latest value assigned to
 the field that represents the number of the elements before that
 reference.  For example,

p->count = val1;
p->array[20] = 0;  // ref1 to p->array
p->count = val2;
p->array[30] = 0;  // ref2 to p->array

 in the above, 'ref1' will use 'val1' as the number of the elements
 in 'p->array', and 'ref2' will use 'val2' as the number of elements
 in 'p->array'.
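
 (For illustration only, not part of the patch text: a minimal use of the
 attribute together with __builtin_dynamic_object_size.  It assumes -O2 so
 that the store to the count field is visible to the size computation.)

 #include <stdlib.h>

 struct P {
   size_t count;
   char other;
   char array[] __attribute__ ((counted_by (count)));
 };

 struct P *
 make (size_t n)
 {
   struct P *p = malloc (sizeof *p + n);
   p->count = n;   /* must happen before the first reference to p->array */
   return p;
 }

 /* With this patch series, __builtin_dynamic_object_size (make (10)->array, 1)
    is expected to evaluate to 10, since the size is taken from p->count.  */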

gcc/c-family/ChangeLog:

PR C/108896
* c-attribs.cc (handle_counted_by_attribute): New function.
(attribute_takes_identifier_p): Add counted_by attribute to the list.
* c-common.cc (c_flexible_array_member_type_p): ...To this.
* c-common.h (c_flexible_array_member_type_p): New prototype.

gcc/c/ChangeLog:

PR C/108896
* c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
(add_flexible_array_elts_to_size): Use renamed function.
(is_flexible_array_member_p): Use renamed function.
(verify_counted_by_attribute): New function.
(finish_struct): Use renamed function and verify counted_by
attribute.
* c-tree.h (lookup_field): New prototype.
* c-typeck.cc (lookup_field): Expose as extern function.

gcc/ChangeLog:

PR C/108896
* doc/extend.texi: Document attribute counted_by.

gcc/testsuite/ChangeLog:

PR C/108896
* gcc.dg/flex-array-counted-by.c: New test.
---
 gcc/c-family/c-attribs.cc| 54 +-
 gcc/c-family/c-common.cc | 13 
 gcc/c-family/c-common.h  |  1 +
 gcc/c/c-decl.cc  | 78 +++-
 gcc/c/c-tree.h   |  1 +
 gcc/c/c-typeck.cc|  3 +-
 gcc/doc/extend.texi  | 67 +
 gcc/testsuite/gcc.dg/flex-array-counted-by.c | 40 ++
 8 files changed, 237 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 40a0cf90295d..51cf91c4fbfd 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -105,6 +105,8 @@ static tree handle_warn_if_not_aligned_attribute (tree *, 
tree, tree,
  int, bool *);
 static tree handle_strict_flex_array_attribute (tree *, tree, tree,
 int, bool *);
+static tree handle_counted_by_attribute (tree *, tree, tree,
+  int, bool *);
 static tree handle_weak_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_noplt_attribute (tree *, tree, tree, int, bool *) ;
 static tree handle_alias_ifunc_attribute (bool, tree *, tree, tree, bool *);
@@ -412,6 

[PATCH v7 0/5] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2024-03-20 Thread Qing Zhao
Hi,

This is the 7th version of the patch.

Compared with the 6th version, the differences are:

updates per Siddhesh's comments:
1. update the error messages in "handle_counted_by_attribute"
   then update the testing case accordingly;
2. update the error messages in "verify_counted_by_attribute"
   then update the testing case accordingly;
3. update the documentation of "counted_by" in extend.texi
4. for the 3rd argument of ACCESS_WITH_SIZE, change it as follows:
+   3rd argument CLASS_OF_SIZE: The size referenced by the REF_TO_SIZE represents
+ 0: the number of bytes;
+ 1: the number of the elements of the object type;

Update all other places accordingly.
5. update the comments of the routine "access_with_size_object_size"
   bail out if (object_size_type & OST_DYNAMIC) == 0 for this routine.
   change the variable name of "type_of_size" to "class_of_size" for 
   consistency.
6. add one more testing case for bound sanitizer to handle the case when
   counted-by field is zero value.


It is based on the following original proposal:

https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
Represent the missing dependence for the "counted_by" attribute and its 
consumers

**The summary of the proposal is:

* Add a new internal function ".ACCESS_WITH_SIZE" to carry the size information 
for every reference to a FAM field;
* In C FE, Replace every reference to a FAM field whose TYPE has the 
"counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
* In every consumer of the size information, for example, BDOS or array bound 
sanitizer, query the size information or ACCESS_MODE information from the new 
internal function;
* When expanding to RTL, replace the internal function with the actual 
reference to the FAM field;
* Some adjustment to ipa alias analysis, and other SSA passes to mitigate the 
impact to the optimizer and code generation.


**The new internal function

  .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, TYPE_OF_SIZE, 
ACCESS_MODE, TYPE_OF_REF)

INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)

which returns the "REF_TO_OBJ" same as the 1st argument;

Both the return type and the type of the first argument of this function have 
been converted from the incomplete array type to the corresponding pointer type.

The call to .ACCESS_WITH_SIZE is wrapped with an INDIRECT_REF, whose type is 
the original incomplete array type.

Please see the following link for why:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html

1st argument "REF_TO_OBJ": The reference to the object;
2nd argument "REF_TO_SIZE": The reference to the size of the object,
3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE represents
   0: the number of bytes;
   1: the number of the elements of the object type;
4th argument "TYPE_OF_SIZE": A constant 0 with the TYPE of the object
  refed by REF_TO_SIZE
5th argument "ACCESS_MODE":
  -1: Unknown access semantics
   0: none
   1: read_only
   2: write_only
   3: read_write
6th argument "TYPE_OF_REF": A constant 0 with the pointer TYPE to
  to the original flexible array type.

** The Patch sets included:

1. Provide counted_by attribute to flexible array member field;
  which includes:
  * "counted_by" attribute documentation;
  * C FE handling of the new attribute;
syntax checking, error reporting;
  * testing cases;

2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
  which includes:
  * The definition of the new internal function .ACCESS_WITH_SIZE in 
internal-fn.def.
  * C FE converts every reference to a FAM with "counted_by" attribute to a 
call to the internal function .ACCESS_WITH_SIZE.
(build_component_ref in c_typeck.cc)
This includes the case when the object is statically allocated and 
initialized.
In order to make this work, we should update 
initializer_constant_valid_p_1 and output_constant in varasm.cc to include 
calls to .ACCESS_WITH_SIZE.

However, for the reference inside "offsetof", ignore the "counted_by" 
attribute since it's not useful at all. (c_parser_postfix_expression in 
c/c-parser.cc)
In addition to "offsetof", for the reference inside operator "typeof" and
  "alignof", we ignore counted_by attribute too.
When building ADDR_EXPR for the .ACCESS_WITH_SIZE in C FE,
  replace the call with its first argument.

  * Convert every call to .ACCESS_WITH_SIZE to its first argument.
(expand_ACCESS_WITH_SIZE in internal-fn.cc)
  * adjust alias analysis to exclude the new internal from clobbering 
anything.
(ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in 
tree-ssa-alias.cc)
  * adjust dead code elimination to eliminate the call to .ACCESS_WITH_SIZE 
when
its LHS is eliminated as dead code.
(eliminate_unnecessary_stmts in tree-ssa-dce.cc)

Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-03-20 Thread Radek Barton
Hello, everyone.

I've re-run the `x86_64-w64-mingw32` target tests with all the languages 
enabled, except for Ada and JIT, which are harder to bootstrap, and Go, which is 
not supported by MinGW. The summarized results are:

536176 of expected passes
13859 of unexpected failures
188 of unexpected successes
4537 of expected failures
8271 of unresolved testcases
20958 of unsupported tests
97.40% of tests complete in the expected way

and, again, zero regressions were detected.

The detailed results can be reviewed at 
https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-build/actions/runs/8349019387

Best regards,

Radek Bartoň


From: Radek Barton 
Sent: Monday, March 18, 2024 10:05 PM
To: Evgeny Karpov; gcc-patches@gcc.gnu.org
Cc: richard.sandif...@arm.com; Richard Earnshaw (lists); Andrew Pinski (QUIC); 
Maxim Kuvyrkov
Subject: Re: [PATCH v2 00/13] Add aarch64-w64-mingw32 target

Hello, everyone.

Currently, we are able to provide results of regression testing for 
`x86_64-w64-mingw32` target with `--enable-languages=c,lto,c++,fortran` running 
in WSL only.

The summarized results, both for the branch with patch set applied and its 
corresponding base branch, show:

517501 expected passes
4537 of expected failures
10828 unexpected failures
180 of unexpected successes
5934 of unresolved testcases
19113 of unsupported tests

which means that 98% of the tests end in an expected way and we haven't 
detected a single regression between the branches.

The detailed results can be downloaded and reviewed at 
https://github.com/Windows-on-ARM-Experiments/mingw-woarm64-build/actions/runs/8327889403

Best regards,

Radek Bartoň


[PATCH 1/1] aarch64: Sync aarch64-sys-regs.def with Binutils

2024-03-20 Thread Yury Khrustalev
This patch updates `aarch64-sys-regs.def', bringing it into sync with
the Binutils source.

gcc/ChangeLog:

* config/aarch64/aarch64-sys-regs.def: Copy from Binutils.
---
 gcc/config/aarch64/aarch64-sys-regs.def | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/aarch64/aarch64-sys-regs.def 
b/gcc/config/aarch64/aarch64-sys-regs.def
index 6a948171d6e..8b65673a5d6 100644
--- a/gcc/config/aarch64/aarch64-sys-regs.def
+++ b/gcc/config/aarch64/aarch64-sys-regs.def
@@ -521,6 +521,7 @@
   SYSREG ("id_aa64isar0_el1",  CPENC (3,0,0,6,0),  F_REG_READ, 
AARCH64_NO_FEATURES)
   SYSREG ("id_aa64isar1_el1",  CPENC (3,0,0,6,1),  F_REG_READ, 
AARCH64_NO_FEATURES)
   SYSREG ("id_aa64isar2_el1",  CPENC (3,0,0,6,2),  F_REG_READ, 
AARCH64_NO_FEATURES)
+  SYSREG ("id_aa64isar3_el1",  CPENC (3,0,0,6,3),  F_REG_READ, 
AARCH64_NO_FEATURES)
   SYSREG ("id_aa64mmfr0_el1",  CPENC (3,0,0,7,0),  F_REG_READ, 
AARCH64_NO_FEATURES)
   SYSREG ("id_aa64mmfr1_el1",  CPENC (3,0,0,7,1),  F_REG_READ, 
AARCH64_NO_FEATURES)
   SYSREG ("id_aa64mmfr2_el1",  CPENC (3,0,0,7,2),  F_REG_READ, 
AARCH64_NO_FEATURES)
-- 
2.39.3



[PATCH 0/1] aarch64: Sync aarch64-sys-regs.def with Binutils

2024-03-20 Thread Yury Khrustalev
Hi,

This patch updates `aarch64-sys-regs.def', bringing it into sync with
the Binutils source.

Regression tested on aarch64-none-linux-gnu and no regressions have
been found. Is it OK for GCC master? I don't have commit access so I
need someone to commit on my behalf.

Thanks,
Yury

Yury Khrustalev (1):
  aarch64: Sync aarch64-sys-regs.def with Binutils

 gcc/config/aarch64/aarch64-sys-regs.def | 1 +
 1 file changed, 1 insertion(+)

-- 
2.39.3



[PATCH] s390x: Implement vector cost model

2024-03-20 Thread Juergen Christ
Hi,

s390x used the basic cost model, which does not correctly model the cost of
register file crossings or the availability of certain instructions that
simplify reversed operations.  Implement our own cost model to better control
when to vectorize.

gcc/ChangeLog:

* config/s390/s390.cc (class s390_vector_costs): Implement.
(s390_vector_costs::s390_vector_costs): Ditto.
(s390_vector_costs::add_stmt_cost): Ditto.
(s390_vectorize_create_costs): Ditto.
(TARGET_VECTORIZE_CREATE_COSTS): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/loop-1.c: New test.
* gcc.target/s390/vector/slp-1.c: New test.
* gcc.target/s390/vector/slp-2.c: New test.

Signed-off-by: Juergen Christ 

Bootstrapped and tested on s390x.  Ok for master?

---
 gcc/config/s390/s390.cc   | 127 ++
 gcc/testsuite/gcc.target/s390/vector/loop-1.c |  82 +++
 gcc/testsuite/gcc.target/s390/vector/slp-1.c  |  68 ++
 gcc/testsuite/gcc.target/s390/vector/slp-2.c  |  31 +
 4 files changed, 308 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/loop-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/slp-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/slp-2.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 372a23244032..b9dab1cf8a85 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -88,6 +88,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-prop.h"
 #include "ipa-fnsummary.h"
 #include "sched-int.h"
+#include "tree-vectorizer.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -4199,6 +4200,130 @@ s390_builtin_vectorization_cost (enum 
vect_cost_for_stmt type_of_cost,
 }
 }
 
+/* s390-specific vector costs */
+class s390_vector_costs : public vector_costs
+{
+  stmt_vec_info skipfinalpart;
+public:
+  s390_vector_costs (vec_info *, bool);
+
+  unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
+ stmt_vec_info stmt_info, slp_tree node,
+ tree vectype, int misalign,
+ vect_cost_model_location where) override;
+};
+
+s390_vector_costs::s390_vector_costs(vec_info *vinfo, bool costing_for_scalar)
+  : vector_costs(vinfo, costing_for_scalar)
+{
+}
+
+unsigned int
+s390_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
+ stmt_vec_info stmt_info, slp_tree node,
+ tree vectype, int misalign,
+ vect_cost_model_location where)
+{
+  bool fp = false;
+  int costs = s390_builtin_vectorization_cost (kind, vectype, misalign);
+
+  if (vectype != NULL)
+fp = FLOAT_TYPE_P (vectype);
+
+  if ((kind == scalar_to_vec || kind == vec_construct)
+  && node
+  && SLP_TREE_DEF_TYPE (node) == vect_external_def)
+{
+  unsigned int i;
+  tree op;
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
+   if (TREE_CODE (op) == SSA_NAME)
+ TREE_VISITED (op) = 0;
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
+   {
+ if (TREE_CODE (op) != SSA_NAME
+ || TREE_VISITED (op))
+   continue;
+ TREE_VISITED (op) = 1;
+ gimple *def = SSA_NAME_DEF_STMT (op);
+ tree temp;
+ if (is_gimple_assign(def)
+ && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def))
+ && (temp = gimple_assign_rhs1(def))
+ && TREE_CODE (temp) == SSA_NAME
+ && tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (def)),
+   TREE_TYPE (temp)))
+   def = SSA_NAME_DEF_STMT (temp);
+ if (!gimple_assign_load_p (def))
+   {
+ /* For scalar_to_vec from a fp register, we might not
+cross the register files.  So keep the penalty small.
+??? If we have to cross, we actually cross twice
+leading to a huge runtime penalty.  Should we reflect
+this here?  */
+ if (kind == scalar_to_vec && fp)
+   costs += 2;
+ else
+   costs += 3;
+   }
+   }
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
+   if (TREE_CODE (op) == SSA_NAME)
+ TREE_VISITED (op) = 0;
+}
+  if (kind == scalar_stmt && stmt_info && is_gimple_assign (stmt_info->stmt))
+{
+  const gassign *assign = dyn_cast (stmt_info->stmt);
+  tree comptype = NULL_TREE;
+  if (gimple_assign_rhs_code (assign) == BIT_INSERT_EXPR)
+   comptype = TREE_TYPE (gimple_assign_rhs1 (assign));
+  if (gimple_assign_rhs_code (assign) == BIT_FIELD_REF)
+   comptype = TREE_TYPE (TREE_OPERAND (gimple_assign_rhs1 (assign), 0));
+  if (comptype != NULL_TREE && VECTOR_TYPE_P (comptype))
+   {
+ /* This will be a vlvg or vlgv that 

Re: [PATCH] Use integer_{zero,one}_node some more

2024-03-20 Thread Richard Biener
On Wed, 20 Mar 2024, Jakub Jelinek wrote:

> Hi!
> 
> When looking at the analyzer MEM_REF invalid second arg issue, I've noticed
> tons of build_int_cst (integer_type_node, {0,1}) or build_zero_cst
> (integer_type_node) cases.
> That just wastes compile time (albeit not very much).
> 
> The following patch fixes what my sed script found.
> 
> Ok for trunk (or defer for stage1, though I guess I'd need to redo it
> from scratch at that point)?

OK.

Richard.

> 2024-03-20  Jakub Jelinek  
> 
> gcc/analyzer/
>   * constraint-manager.cc (test_range, test_constraint_conditions,
>   test_constant_comparisons, test_constraint_impl, test_purging,
>   test_bits): Use integer_zero_node instead of
>   build_zero_cst (integer_type_node) or
>   build_int_cst (integer_type_node, 0) and integer_one_node instead of
>   build_int_cst (integer_type_node, 1).
>   * region-model.cc (region_model::get_store_value,
>   append_interesting_constants, test_array_1,
>   test_get_representative_tree, test_unique_constants, test_assignment,
>   test_stack_frames, test_constraint_merging, test_widening_constraints,
>   test_iteration_1, test_array_2): Likewise.
> gcc/cp/
>   * coroutines.cc (expand_one_await_expression): Use
>   integer_zero_node instead of build_int_cst (integer_type_node, 0)
>   and integer_one_node instead of build_int_cst (integer_type_node, 1).
> gcc/fortran/
>   * trans-array.cc (structure_alloc_comps): Use integer_zero_node
>   instead of build_zero_cst (integer_type_node) or
>   build_int_cst (integer_type_node, 0) and integer_one_node instead of
>   build_int_cst (integer_type_node, 1).
>   * trans-expr.cc (conv_scalar_char_value): Likewise.
>   * trans-stmt.cc (gfc_trans_form_team, gfc_trans_change_team,
>   gfc_trans_sync_team, gfc_trans_sync): Likewise.
>   * trans-decl.cc (create_main_function): Likewise.
>   * trans-intrinsic.cc (trans_this_image, conv_intrinsic_move_alloc):
>   Likewise.
>   * trans.cc (gfc_allocate_using_caf_lib, gfc_deallocate_with_status):
>   Likewise.
> gcc/objc/
>   * objc-next-runtime-abi-02.cc (build_v2_class_ro_t_initializer): Use
>   integer_zero_node instead of build_int_cst (integer_type_node, 0).
> 
> --- gcc/cp/coroutines.cc.jj   2024-02-10 09:38:25.771984306 +0100
> +++ gcc/cp/coroutines.cc  2024-03-19 13:34:36.495989031 +0100
> @@ -1750,7 +1750,7 @@ expand_one_await_expression (tree *stmt,
>   data->coro_fp);
>r = cp_build_init_expr (cond, r);
>finish_switch_cond (r, sw);
> -  r = build_case_label (build_int_cst (integer_type_node, 0), NULL_TREE,
> +  r = build_case_label (integer_zero_node, NULL_TREE,
>   create_anon_label_with_ctx (loc, actor));
>add_stmt (r); /* case 0: */
>/* Implement the suspend, a scope exit without clean ups.  */
> @@ -1758,7 +1758,7 @@ expand_one_await_expression (tree *stmt,
>   is_cont ? cont : susp);
>r = coro_build_cvt_void_expr_stmt (r, loc);
>add_stmt (r); /*   goto ret;  */
> -  r = build_case_label (build_int_cst (integer_type_node, 1), NULL_TREE,
> +  r = build_case_label (integer_one_node, NULL_TREE,
>   create_anon_label_with_ctx (loc, actor));
>add_stmt (r); /* case 1:  */
>r = build1_loc (loc, GOTO_EXPR, void_type_node, resume_label);
> --- gcc/analyzer/constraint-manager.cc.jj 2024-02-27 16:52:23.725366127 
> +0100
> +++ gcc/analyzer/constraint-manager.cc2024-03-19 13:34:36.486989157 
> +0100
> @@ -3374,8 +3374,8 @@ namespace selftest {
>  static void
>  test_range ()
>  {
> -  tree int_0 = build_int_cst (integer_type_node, 0);
> -  tree int_1 = build_int_cst (integer_type_node, 1);
> +  tree int_0 = integer_zero_node;
> +  tree int_1 = integer_one_node;
>tree int_2 = build_int_cst (integer_type_node, 2);
>tree int_5 = build_int_cst (integer_type_node, 5);
>  
> @@ -3419,7 +3419,7 @@ static void
>  test_constraint_conditions ()
>  {
>tree int_42 = build_int_cst (integer_type_node, 42);
> -  tree int_0 = build_int_cst (integer_type_node, 0);
> +  tree int_0 = integer_zero_node;
>  
>tree x = build_global_decl ("x", integer_type_node);
>tree y = build_global_decl ("y", integer_type_node);
> @@ -3874,7 +3874,7 @@ test_transitivity ()
>  static void
>  test_constant_comparisons ()
>  {
> -  tree int_1 = build_int_cst (integer_type_node, 1);
> +  tree int_1 = integer_one_node;
>tree int_3 = build_int_cst (integer_type_node, 3);
>tree int_4 = build_int_cst (integer_type_node, 4);
>tree int_5 = build_int_cst (integer_type_node, 5);
> @@ -4058,7 +4058,7 @@ static void
>  test_constraint_impl ()
>  {
>tree int_42 = build_int_cst (integer_type_node, 42);
> -  tree int_0 = build_int_cst (integer_type_node, 0);
> +  tree int_0 = integer_zero_node;
>  
>tree x = build_global_decl ("x", integer_type_node);
>tree y = build_global_decl ("y", 

Re: [PATCH] bitint: Fix handling of conditional bitfield loads [PR114365]

2024-03-20 Thread Richard Biener
On Wed, 20 Mar 2024, Jakub Jelinek wrote:

> Hi!
> 
> For the m_var_msb (aka left shift) case of large/huge _BitInt bitfield loads
> handle_load adds a PHI node, but I forgot to actually update the temporary
> the code later on uses, so the PHI result was unused and the code
> incorrectly used something that wasn't valid SSA form.
> In particular, we emitted
>   if (_29 != 2)
> goto <bb 4>; [80.00%]
>   else
> goto <bb 5>; [20.00%]
>   
>   <bb 4> [local count: 1073741824]:
>   _33 = VIEW_CONVERT_EXPR(s.D.2771)[_31];
>   
>   <bb 5> [local count: 1073741824]:
>   # _34 = PHI <_33(4), 0(3)>
>   _35 = _32 >> 31;
>   _36 = _33 << 33;
>   _37 = _36 | _35;
>   _38 = _37 << _19;
> where instead of _33 the _36 def stmt should be using _34.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Richard.

> 2024-03-20  Jakub Jelinek  
> 
>   PR tree-optimization/114365
>   * gimple-lower-bitint.cc (bitint_large_huge::handle_load): When adding
>   a PHI node, set iv2 to its result afterwards.
> 
>   * gcc.dg/bitint-102.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2024-03-16 15:15:22.0 +0100
> +++ gcc/gimple-lower-bitint.cc2024-03-19 10:45:31.006649671 +0100
> @@ -2026,6 +2026,7 @@ bitint_large_huge::handle_load (gimple *
> add_phi_arg (phi, build_zero_cst (m_limb_type),
>  edge_false, UNKNOWN_LOCATION);
> m_gsi = gsi_after_labels (edge_true->dest);
> +   iv2 = iv3;
>   }
>   }
>g = gimple_build_assign (make_ssa_name (m_limb_type), RSHIFT_EXPR,
> --- gcc/testsuite/gcc.dg/bitint-102.c.jj  2024-03-19 10:54:15.317327543 
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-102.c 2024-03-19 10:53:57.162580193 +0100
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/114365 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +struct S {
> +  int : 31;
> +#if __BITINT_MAXWIDTH__ >= 129
> +  _BitInt(129) b : 129;
> +#else
> +  _BitInt(63) b : 63;
> +#endif
> +} s;
> +
> +void
> +foo (int a)
> +{
> +  s.b <<= a;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] testsuite: add the case to cover vectorization of A[(i+x)*stride] [PR114322]

2024-03-20 Thread Hao Liu OS
> So - OK with using { target vect_int } instead.

Sure, it's much better to be target independent.

Refactored and committed in r14-9569-g4c276896

Thanks,
- Hao


From: Richard Biener 
Sent: Wednesday, March 20, 2024 16:21
To: Hao Liu OS
Cc: GCC-patches@gcc.gnu.org
Subject: Re: [PATCH] testsuite: add the case to cover vectorization of 
A[(i+x)*stride] [PR114322]

On Wed, 20 Mar 2024, Hao Liu OS wrote:

> Hi Richard,
>
> As mentioned in the comments of PR114322 (which has been fixed by PR114151
> r14-9540-ge0e9499a), this patch is to cover the case.
>
> Bootstrapped and regression tested on aarch64-linux-gnu, OK for trunk?
>
> gcc/testsuite/ChangeLog:
>
>   PR tree-optimization/114322
>   * gcc.dg/vect/pr114322.c: New testcase.
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr114322.c 
> b/gcc/testsuite/gcc.dg/vect/pr114322.c
> new file mode 100644
> index 000..7c3a4ffe260
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr114322.c
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/114322 */
> +/* { dg-do compile } */
> +
> +int
> +foo (short *A, int x, int stride)
> +{
> +  int sum = 0;
> +
> +  if (stride > 1)
> +{
> +  for (int i = 0; i < 1024; ++i)
> +sum += A[(i + x) * stride];
> +}
> +
> +  return sum;
> +}
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target 
> aarch64*-*-* } } } */

I don't like the target selector very much - if you make 'A'
a 'int *', the strided access should be the important part,
vect_int should match the requirements - the strided access should
end up with VMAT_ELEMENTWISE at worst (possibly gather on aarch64?),
and VMAT_ELEMENTWISE doesn't need any alignment.

So - OK with using { target vect_int } instead.

Richard.


[committed] tree-ssa-loop-ch: Fix comment typos

2024-03-20 Thread Jakub Jelinek
Hi!

When looking at PR109596, I've noticed some comment typos.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as
obvious.

2024-03-20  Jakub Jelinek  

* tree-ssa-loop-ch.cc (update_profile_after_ch): Fix comment typo:
probabbility -> probability.
(ch_base::copy_headers): Fix comment typo: itrations -> iterations.

--- gcc/tree-ssa-loop-ch.cc.jj  2024-01-12 10:07:58.380849033 +0100
+++ gcc/tree-ssa-loop-ch.cc 2024-03-20 00:51:18.570544185 +0100
@@ -568,7 +568,7 @@ do_while_loop_p (class loop *loop)
   
// region start
  loop_header:
-  if (cond1)   <- we need to update probabbility here
+  if (cond1)   <- we need to update probability here
 goto loop_exit;
   if (cond2)   <- and determine scaling factor here.
   moreover cond2 is now always true
@@ -1049,7 +1049,7 @@ ch_base::copy_headers (function *fun)
  fprintf (dump_file, "\n");
}
 
-  /* We possibly decreased number of itrations by 1.  */
+  /* We possibly decreased number of iterations by 1.  */
   auto_vec<edge> exits = get_loop_exit_edges (loop);
   bool precise = (nexits == (int) exits.length ());
   /* Check that loop may not terminate in other way than via

Jakub



[PATCH] Use integer_{zero,one}_node some more

2024-03-20 Thread Jakub Jelinek
Hi!

When looking at the analyzer MEM_REF invalid second arg issue, I've noticed
tons of build_int_cst (integer_type_node, {0,1}) or build_zero_cst
(integer_type_node) cases.
That just wastes compile time (albeit not very much).
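
To illustrate the rewrite (a minimal sketch, not taken from the patch itself):

/* Before: every call goes through the INTEGER_CST caching/lookup path.  */
tree zero_cst = build_int_cst (integer_type_node, 0);

/* After: the preallocated global node is used directly.  */
tree zero_node = integer_zero_node;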

The following patch fixes what my sed script found.

Ok for trunk (or defer for stage1, though guess I'd need to redo it
from scratch at that point)?

2024-03-20  Jakub Jelinek  

gcc/analyzer/
* constraint-manager.cc (test_range, test_constraint_conditions,
test_constant_comparisons, test_constraint_impl, test_purging,
test_bits): Use integer_zero_node instead of
build_zero_cst (integer_type_node) or
build_int_cst (integer_type_node, 0) and integer_one_node instead of
build_int_cst (integer_type_node, 1).
* region-model.cc (region_model::get_store_value,
append_interesting_constants, test_array_1,
test_get_representative_tree, test_unique_constants, test_assignment,
test_stack_frames, test_constraint_merging, test_widening_constraints,
test_iteration_1, test_array_2): Likewise.
gcc/cp/
* coroutines.cc (expand_one_await_expression): Use
integer_zero_node instead of build_int_cst (integer_type_node, 0)
and integer_one_node instead of build_int_cst (integer_type_node, 1).
gcc/fortran/
* trans-array.cc (structure_alloc_comps): Use integer_zero_node
instead of build_zero_cst (integer_type_node) or
build_int_cst (integer_type_node, 0) and integer_one_node instead of
build_int_cst (integer_type_node, 1).
* trans-expr.cc (conv_scalar_char_value): Likewise.
* trans-stmt.cc (gfc_trans_form_team, gfc_trans_change_team,
gfc_trans_sync_team, gfc_trans_sync): Likewise.
* trans-decl.cc (create_main_function): Likewise.
* trans-intrinsic.cc (trans_this_image, conv_intrinsic_move_alloc):
Likewise.
* trans.cc (gfc_allocate_using_caf_lib, gfc_deallocate_with_status):
Likewise.
gcc/objc/
* objc-next-runtime-abi-02.cc (build_v2_class_ro_t_initializer): Use
integer_zero_node instead of build_int_cst (integer_type_node, 0).

--- gcc/cp/coroutines.cc.jj 2024-02-10 09:38:25.771984306 +0100
+++ gcc/cp/coroutines.cc2024-03-19 13:34:36.495989031 +0100
@@ -1750,7 +1750,7 @@ expand_one_await_expression (tree *stmt,
data->coro_fp);
   r = cp_build_init_expr (cond, r);
   finish_switch_cond (r, sw);
-  r = build_case_label (build_int_cst (integer_type_node, 0), NULL_TREE,
+  r = build_case_label (integer_zero_node, NULL_TREE,
create_anon_label_with_ctx (loc, actor));
   add_stmt (r); /* case 0: */
   /* Implement the suspend, a scope exit without clean ups.  */
@@ -1758,7 +1758,7 @@ expand_one_await_expression (tree *stmt,
is_cont ? cont : susp);
   r = coro_build_cvt_void_expr_stmt (r, loc);
   add_stmt (r); /*   goto ret;  */
-  r = build_case_label (build_int_cst (integer_type_node, 1), NULL_TREE,
+  r = build_case_label (integer_one_node, NULL_TREE,
create_anon_label_with_ctx (loc, actor));
   add_stmt (r); /* case 1:  */
   r = build1_loc (loc, GOTO_EXPR, void_type_node, resume_label);
--- gcc/analyzer/constraint-manager.cc.jj   2024-02-27 16:52:23.725366127 
+0100
+++ gcc/analyzer/constraint-manager.cc  2024-03-19 13:34:36.486989157 +0100
@@ -3374,8 +3374,8 @@ namespace selftest {
 static void
 test_range ()
 {
-  tree int_0 = build_int_cst (integer_type_node, 0);
-  tree int_1 = build_int_cst (integer_type_node, 1);
+  tree int_0 = integer_zero_node;
+  tree int_1 = integer_one_node;
   tree int_2 = build_int_cst (integer_type_node, 2);
   tree int_5 = build_int_cst (integer_type_node, 5);
 
@@ -3419,7 +3419,7 @@ static void
 test_constraint_conditions ()
 {
   tree int_42 = build_int_cst (integer_type_node, 42);
-  tree int_0 = build_int_cst (integer_type_node, 0);
+  tree int_0 = integer_zero_node;
 
   tree x = build_global_decl ("x", integer_type_node);
   tree y = build_global_decl ("y", integer_type_node);
@@ -3874,7 +3874,7 @@ test_transitivity ()
 static void
 test_constant_comparisons ()
 {
-  tree int_1 = build_int_cst (integer_type_node, 1);
+  tree int_1 = integer_one_node;
   tree int_3 = build_int_cst (integer_type_node, 3);
   tree int_4 = build_int_cst (integer_type_node, 4);
   tree int_5 = build_int_cst (integer_type_node, 5);
@@ -4058,7 +4058,7 @@ static void
 test_constraint_impl ()
 {
   tree int_42 = build_int_cst (integer_type_node, 42);
-  tree int_0 = build_int_cst (integer_type_node, 0);
+  tree int_0 = integer_zero_node;
 
   tree x = build_global_decl ("x", integer_type_node);
   tree y = build_global_decl ("y", integer_type_node);
@@ -4220,7 +4220,7 @@ test_many_constants ()
 static void
 test_purging (void)
 {
-  tree int_0 = build_int_cst (integer_type_node, 0);
+  tree int_0 = integer_zero_node;
   tree a = 

[PATCH] bitint: Fix handling of conditional bitfield loads [PR114365]

2024-03-20 Thread Jakub Jelinek
Hi!

For the m_var_msb (aka left shift) case of large/huge _BitInt bitfield loads
handle_load adds a PHI node, but I forgot to actually update the temporary
the code later on uses, so the PHI result was unused and the code
incorrectly used something that wasn't valid SSA form.
In particular, we emitted
  if (_29 != 2)
    goto <bb 4>; [80.00%]
  else
    goto <bb 5>; [20.00%]

  <bb 4> [local count: 1073741824]:
  _33 = VIEW_CONVERT_EXPR(s.D.2771)[_31];

  <bb 5> [local count: 1073741824]:
  # _34 = PHI <_33(4), 0(3)>
  _35 = _32 >> 31;
  _36 = _33 << 33;
  _37 = _36 | _35;
  _38 = _37 << _19;
where instead of _33 the _36 def stmt should be using _34.
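With the fix, the _36 definition consumes the PHI result instead, roughly:
  # _34 = PHI <_33(4), 0(3)>
  _35 = _32 >> 31;
  _36 = _34 << 33;
  _37 = _36 | _35;
  _38 = _37 << _19;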

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2024-03-20  Jakub Jelinek  

PR tree-optimization/114365
* gimple-lower-bitint.cc (bitint_large_huge::handle_load): When adding
a PHI node, set iv2 to its result afterwards.

* gcc.dg/bitint-102.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-03-16 15:15:22.0 +0100
+++ gcc/gimple-lower-bitint.cc  2024-03-19 10:45:31.006649671 +0100
@@ -2026,6 +2026,7 @@ bitint_large_huge::handle_load (gimple *
  add_phi_arg (phi, build_zero_cst (m_limb_type),
   edge_false, UNKNOWN_LOCATION);
  m_gsi = gsi_after_labels (edge_true->dest);
+ iv2 = iv3;
}
}
   g = gimple_build_assign (make_ssa_name (m_limb_type), RSHIFT_EXPR,
--- gcc/testsuite/gcc.dg/bitint-102.c.jj2024-03-19 10:54:15.317327543 
+0100
+++ gcc/testsuite/gcc.dg/bitint-102.c   2024-03-19 10:53:57.162580193 +0100
@@ -0,0 +1,18 @@
+/* PR tree-optimization/114365 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-std=c23 -O2" } */
+
+struct S {
+  int : 31;
+#if __BITINT_MAXWIDTH__ >= 129
+  _BitInt(129) b : 129;
+#else
+  _BitInt(63) b : 63;
+#endif
+} s;
+
+void
+foo (int a)
+{
+  s.b <<= a;
+}

Jakub



Re: _GLIBCXX_DEBUG value initialized singular iterators assert failures in std algorithms [PR104316]

2024-03-20 Thread Jonathan Wakely
On Wed, 20 Mar 2024 at 05:59, François Dumont wrote:
>
> Thanks to your doc:
>
>  libstdc++: [_GLIBCXX_DEBUG] Define __[glibcxx,cpp_lib]_null_iterators
>
>  _GLIBCXX_DEBUG now has fully N3344-compliant iterator checks, so we
> can define
>  __glibcxx_null_iterators and __cpp_lib_null_iterators macros like
> the normal
>  mode.
>
>  libstdc++-v3/ChangeLog:
>
>  * version.def (null_iterators): Remove extra_cond.
>  * version.h: Regenerate.
>
> Ok to commit ?

Please don't bother talking about __glibcxx_null_iterators in the
commit message, that's an implementation detail that always mirrors
the standard-defined __cpp_lib_null_iterators one. The first line of
the commit will be much easier to read without that.

OK with that change, thanks.

> I already noticed that GCC 13 has no version.h file so no backport question.

It has no version.h but it still has the macros:

include/std/iterator:# define __cpp_lib_null_iterators 201304L
include/std/version:# define __cpp_lib_null_iterators 201304L

Those definitions can be made to not depend on _GLIBCXX_DEBUG.
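
For completeness, a sketch of how user code consumes the macro (nothing below
is specific to this patch):

#include <version>   // C++20; older code can check it after including <iterator>

#if defined(__cpp_lib_null_iterators) && __cpp_lib_null_iterators >= 201304L
// Value-initialized iterators of the same forward iterator type may be
// compared, so an empty range can be formed without a container.
#endif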



Re: [PATCH] handle unwind tables that are embedded within unwinding code, [PR111731]

2024-03-20 Thread Richard Biener
On Fri, Mar 15, 2024 at 11:31 AM Thomas Neumann
 wrote:
>
> Original bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111731
> Given that this is a regression, is this okay for gcc 13 and mainline?

It does look straightforward but I hope Jason or Florian can provide the ACK.

Thanks,
Richard.

> The unwinding mechanism registers both the code range and the unwind
> table itself within a b-tree lookup structure. That data structure
> assumes that it consists of non-overlapping intervals. This
> becomes a problem if the unwinding table is embedded within the
> code itself, as now the intervals do overlap.
>
> To fix this problem we now keep the unwind tables in a separate
> b-tree, which prevents the overlap.
>
> libgcc/ChangeLog:
> PR libgcc/111731
> * unwind-dw2-fde.c: Split unwind ranges if they contain the
> unwind table.
> ---
>   libgcc/unwind-dw2-fde.c | 37 +
>   1 file changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
> index 61a578d097e..9d503545677 100644
> --- a/libgcc/unwind-dw2-fde.c
> +++ b/libgcc/unwind-dw2-fde.c
> @@ -48,6 +48,7 @@ typedef __UINTPTR_TYPE__ uintptr_type;
>   #include "unwind-dw2-btree.h"
>
>   static struct btree registered_frames;
> +static struct btree registered_objects;
>   static bool in_shutdown;
>
>   static void
> @@ -58,6 +59,7 @@ release_registered_frames (void)
> /* Release the b-tree and all frames. Frame releases that happen later are
>  * silently ignored */
> btree_destroy (&registered_frames);
> +  btree_destroy (&registered_objects);
> in_shutdown = true;
>   }
>
> @@ -103,6 +105,21 @@ static __gthread_mutex_t object_mutex;
>   #endif
>   #endif
>
> +#ifdef ATOMIC_FDE_FAST_PATH
> +// Register the pc range for a given object in the lookup structure.
> +static void
> +register_pc_range_for_object (uintptr_type begin, struct object *ob)
> +{
> +  // Register the object itself to know the base pointer on deregistration.
> +  btree_insert (&registered_objects, begin, 1, ob);
> +
> +  // Register the frame in the b-tree
> +  uintptr_type range[2];
> +  get_pc_range (ob, range);
> +  btree_insert (&registered_frames, range[0], range[1] - range[0], ob);
> +}
> +#endif
> +
>   /* Called from crtbegin.o to register the unwind info for an object.  */
>
>   void
> @@ -124,13 +141,7 @@ __register_frame_info_bases (const void *begin, struct 
> object *ob,
>   #endif
>
>   #ifdef ATOMIC_FDE_FAST_PATH
> -  // Register the object itself to know the base pointer on deregistration.
> -  btree_insert (&registered_frames, (uintptr_type) begin, 1, ob);
> -
> -  // Register the frame in the b-tree
> -  uintptr_type range[2];
> -  get_pc_range (ob, range);
> -  btree_insert (&registered_frames, range[0], range[1] - range[0], ob);
> +  register_pc_range_for_object ((uintptr_type) begin, ob);
>   #else
> init_object_mutex_once ();
> __gthread_mutex_lock (&object_mutex);
> @@ -178,13 +189,7 @@ __register_frame_info_table_bases (void *begin, struct 
> object *ob,
> ob->s.b.encoding = DW_EH_PE_omit;
>
>   #ifdef ATOMIC_FDE_FAST_PATH
> -  // Register the object itself to know the base pointer on deregistration.
> -  btree_insert (&registered_frames, (uintptr_type) begin, 1, ob);
> -
> -  // Register the frame in the b-tree
> -  uintptr_type range[2];
> -  get_pc_range (ob, range);
> -  btree_insert (&registered_frames, range[0], range[1] - range[0], ob);
> +  register_pc_range_for_object ((uintptr_type) begin, ob);
>   #else
> init_object_mutex_once ();
> __gthread_mutex_lock (&object_mutex);
> @@ -232,7 +237,7 @@ __deregister_frame_info_bases (const void *begin)
>
>   #ifdef ATOMIC_FDE_FAST_PATH
> // Find the originally registered object to get the base pointer.
> -  ob = btree_remove (&registered_frames, (uintptr_type) begin);
> +  ob = btree_remove (&registered_objects, (uintptr_type) begin);
>
> // Remove the corresponding PC range.
> if (ob)
> @@ -240,7 +245,7 @@ __deregister_frame_info_bases (const void *begin)
> uintptr_type range[2];
> get_pc_range (ob, range);
> if (range[0] != range[1])
> -btree_remove (&registered_frames, range[0]);
> +   btree_remove (&registered_frames, range[0]);
>   }
>
> // Deallocate the sort array if any.
> --
> 2.43.0
>


[PATCH] MIPS: Predefine __mips_strict_alignment if STRICT_ALIGNMENT

2024-03-20 Thread YunQiang Su
Arm32 reflects -mno-unaligned-access in __ARM_FEATURE_UNALIGNED (the
macro is left undefined when unaligned accesses are disabled), and
RISC-V predefines __riscv_misaligned_avoid.

Let's define __mips_strict_alignment when MIPSr6 is compiled with
-mstrict-align.

Note that this macro is always defined for pre-R6.
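
As a usage sketch (the UNALIGNED_ACCESS_OK name below is illustrative, not
part of the patch), portable code would typically fold the predefine into
its own feature knob and select byte-wise versus word-wise algorithms on it:

/* Derive a portability knob from the target predefines (illustrative).  */
#if defined (__mips_strict_alignment) || defined (__riscv_misaligned_avoid)
# define UNALIGNED_ACCESS_OK 0
#else
# define UNALIGNED_ACCESS_OK 1
#endif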

gcc/ChangeLog:

* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Predefine
__mips_strict_alignment if STRICT_ALIGNMENT.
---
 gcc/config/mips/mips.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 6444a68dfd5..616a275b918 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -694,6 +694,9 @@ struct mips_cpu_info {
builtin_define ("__mips_compact_branches_always");  \
   else \
builtin_define ("__mips_compact_branches_optimal"); \
+   \
+  if (STRICT_ALIGNMENT)\
+   builtin_define ("__mips_strict_alignment"); \
 }  \
   while (0)
 
-- 
2.39.2



Re: [PATCH] testsuite: add the case to cover vectorization of A[(i+x)*stride] [PR114322]

2024-03-20 Thread Richard Biener
On Wed, 20 Mar 2024, Hao Liu OS wrote:

> Hi Richard,
> 
> As mentioned in the comments of PR114322 (which has been fixed by PR114151
> r14-9540-ge0e9499a), this patch is to cover the case.
> 
> Bootstrapped and regression tested on aarch64-linux-gnu, OK for trunk?
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/114322
>   * gcc.dg/vect/pr114322.c: New testcase.
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr114322.c 
> b/gcc/testsuite/gcc.dg/vect/pr114322.c
> new file mode 100644
> index 000..7c3a4ffe260
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr114322.c
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/114322 */
> +/* { dg-do compile } */
> +
> +int
> +foo (short *A, int x, int stride)
> +{
> +  int sum = 0;
> +
> +  if (stride > 1)
> +{
> +  for (int i = 0; i < 1024; ++i)
> +sum += A[(i + x) * stride];
> +}
> +
> +  return sum;
> +}
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target 
> aarch64*-*-* } } } */

I don't like the target selector very much - if you make 'A'
a 'int *', the strided access should be the important part,
vect_int should match the requirements - the strided access should
end up with VMAT_ELEMENTWISE at worst (possibly gather on aarch64?),
and VMAT_ELEMENTWISE doesn't need any alignment.

So - OK with using { target vect_int } instead.
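
Concretely, the adjusted test would look something like this (a sketch only;
the committed form may differ in detail):

/* PR tree-optimization/114322 */
/* { dg-do compile } */

int
foo (int *A, int x, int stride)
{
  int sum = 0;

  if (stride > 1)
    {
      for (int i = 0; i < 1024; ++i)
        sum += A[(i + x) * stride];
    }

  return sum;
}

/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target vect_int } } } */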

Richard.


Re: [PATCH] middle-end/113396 - int128 array index and value-ranges

2024-03-20 Thread Richard Biener
On Tue, 19 Mar 2024, Jakub Jelinek wrote:

> On Tue, Mar 19, 2024 at 03:47:37PM +0100, Richard Biener wrote:
> > The following fixes bogus truncation of a value-range for an int128
> > array index when computing the maximum extent for a variable array
> > reference.  Instead of possibly slowing things down by using
> > widest_int the following makes sure the range bounds fit within
> > the constraints offset_int were designed for.
> 
> Perhaps you could use wide_int/poly_wide_int with precision
> of offset_int if it is at most 64-bit precision and twice that precision
> otherwise.
> I think large BITINT_TYPEs shouldn't be a problem since r14-7200,
> so another fix might to truncate at gimplification time
> ARRAY_REF indexes wider than sizetype to sizetype.  Maybe GCC 15-ish
> material though.

Yeah, I was thinking of instead comparing to the effective biggest
index supporting the (half) address-space limit but I guess that's what
the get_precision check effectively does as well.

But yeah, I guess truncating ARRAY_REF indices like we truncate
shift amounts is the way to go in the end.  For GCC 15 indeed.

I've installed the patch as-is.

Thanks,
Richard.

> Anyway, guess your patch is ok as is too.
> 
> > PR middle-end/113396
> > * tree-dfa.cc (get_ref_base_and_extent): Use index range
> > bounds only if they fit within the address-range constraints
> > of offset_int.
> > 
> > * gcc.dg/torture/pr113396.c: New testcase.
> > ---
> >  gcc/testsuite/gcc.dg/torture/pr113396.c | 19 +++
> >  gcc/tree-dfa.cc |  6 --
> >  2 files changed, 23 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr113396.c
> > 
> > diff --git a/gcc/testsuite/gcc.dg/torture/pr113396.c 
> > b/gcc/testsuite/gcc.dg/torture/pr113396.c
> > new file mode 100644
> > index 000..585f717bdda
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/torture/pr113396.c
> > @@ -0,0 +1,19 @@
> > +/* { dg-do run } */
> > +/* { dg-require-effective-target int128 } */
> > +
> > +unsigned char m[] = {5, 79, 79, 79, 79};
> > +__int128 p;
> > +int main()
> > +{
> > +  int g1 = 0;
> > +  p = 0;
> > +  for (int aj = 0; aj < 256; aj++)
> > +   {
> > +  m[0] = -4;
> > +  for (; p >= 0; p -= 1) {
> > +g1 = m[p];
> > +  }
> > +  }
> > +  if (g1 != 0xfc)
> > +__builtin_abort();
> > +}
> > diff --git a/gcc/tree-dfa.cc b/gcc/tree-dfa.cc
> > index cbd3774b21f..93e53b29a6d 100644
> > --- a/gcc/tree-dfa.cc
> > +++ b/gcc/tree-dfa.cc
> > @@ -549,7 +549,8 @@ get_ref_base_and_extent (tree exp, poly_int64 *poffset,
> > /* Try to constrain maxsize with range information.  */
> > offset_int omax
> >   = offset_int::from (max, TYPE_SIGN (TREE_TYPE (index)));
> > -   if (known_lt (lbound, omax))
> > +   if (wi::get_precision (max) <= ADDR_MAX_BITSIZE
> > +   && known_lt (lbound, omax))
> >   {
> > poly_offset_int rmaxsize;
> > rmaxsize = (omax - lbound + 1)
> > @@ -567,7 +568,8 @@ get_ref_base_and_extent (tree exp, poly_int64 *poffset,
> > /* Try to adjust bit_offset with range information.  */
> > offset_int omin
> >   = offset_int::from (min, TYPE_SIGN (TREE_TYPE (index)));
> > -   if (known_le (lbound, omin))
> > +   if (wi::get_precision (min) <= ADDR_MAX_BITSIZE
> > +   && known_le (lbound, omin))
> >   {
> > poly_offset_int woffset
> >   = wi::sext (omin - lbound,
> > -- 
> > 2.35.3
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PATCH] testsuite: add the case to cover vectorization of A[(i+x)*stride] [PR114322]

2024-03-20 Thread Hao Liu OS
Hi Richard,

As mentioned in the comments of PR114322 (which has been fixed by PR114151
r14-9540-ge0e9499a), this patch is to cover the case.

Bootstrapped and regression tested on aarch64-linux-gnu, OK for trunk?

gcc/testsuite/ChangeLog:

PR tree-optimization/114322
* gcc.dg/vect/pr114322.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/vect/pr114322.c 
b/gcc/testsuite/gcc.dg/vect/pr114322.c
new file mode 100644
index 000..7c3a4ffe260
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr114322.c
@@ -0,0 +1,18 @@
+/* PR tree-optimization/114322 */
+/* { dg-do compile } */
+
+int
+foo (short *A, int x, int stride)
+{
+  int sum = 0;
+
+  if (stride > 1)
+{
+  for (int i = 0; i < 1024; ++i)
+sum += A[(i + x) * stride];
+}
+
+  return sum;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target aarch64*-*-* 
} } } */
-- 
2.34.1



[PATCH] mips: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-20 Thread Xi Ruoyao
We were assuming TYPE_NO_NAMED_ARGS_STDARG_P functions don't have any
named arguments and there is nothing to advance, but that is not the
case for (...) functions returning by hidden reference, which have one
such artificial argument.  This is causing gcc.dg/c23-stdarg-{6,8,9}.c
to fail.

Fix the issue by checking if arg.type is NULL, as r14-9503 explains.
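
For reference, the shape those tests exercise is roughly the following (a
simplified sketch, not copied from the testsuite):

struct big { long x[10]; };

/* C23 fully-variadic function returning a large aggregate: the caller
   passes a hidden return-slot pointer, which mips_setup_incoming_varargs
   must still advance past.  */
struct big
f (...)
{
  struct big r = { 0 };
  return r;
}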

gcc/ChangeLog:

PR target/114175
* config/mips/mips.cc (mips_setup_incoming_varargs): Only skip
mips_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P
functions if arg.type is NULL.
---

Bootstrapped and regtested on mips64el-linux-gnuabi64.  Ok for trunk?

 gcc/config/mips/mips.cc | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 68e2ae8d8fa..ce764a5cb35 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -6834,7 +6834,13 @@ mips_setup_incoming_varargs (cumulative_args_t cum,
  argument.  Advance a local copy of CUM past the last "real" named
  argument, to find out how many registers are left over.  */
   local_cum = *get_cumulative_args (cum);
-  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
+
+  /* For a C23 variadic function w/o any named argument, and w/o an
+ artificial argument for large return value, skip advancing args.
+ There is such an artificial argument iff. arg.type is non-NULL
+ (PR 114175).  */
+  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
+  || arg.type != NULL_TREE)
mips_function_arg_advance (pack_cumulative_args (&local_cum), arg);
 
   /* Found out how many registers we need to save.  */
-- 
2.44.0



[PATCH v1] RISC-V: Bugfix function target attribute pollution

2024-03-20 Thread pan2 . li
From: Pan Li 

This patch depends on the ICE fix below.

https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647915.html

The function target attribute should apply on a per-function basis.
For example, consider the functions below:

void test_1 () {}

void __attribute__((target("arch=+v"))) test_2 () {}

void __attribute__((target("arch=+zfh"))) test_3 () {}

void test_4 () {}

The scope of the target attribute should not extend beyond the function
body.  That is, test_3 must not get the 'v' extension, and test_4 must
get neither the 'v' nor the 'zfh' extension.

Unfortunately, test_4 currently ends up with both the 'v' and the 'zfh'
extensions, which is incorrect.  This patch fixes the sticky attribute
by introducing a command-line subset_list: when parsing the arch
attribute (parse_arch), we always clone from cmdline_subset_list
instead of current_subset_list.

Meanwhile, we correct the printed arch information, like below.

.option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zbb1p0

The riscv_declare_function_name hook always runs after the
riscv_process_target_attr hook.  Thus, we introduce a hash table to
record the 1:1 mapping from fndecl to its subset_list in advance, and
later riscv_declare_function_name is able to get the right information
about the arch.

The below tests are passed for this patch.
* The riscv full regression test.

PR target/114352

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (struct riscv_func_target_info):
New struct for func decl and target name.
(struct riscv_func_target_hasher): New hasher for hash table mapping
from the fn_decl to fn_target_name.
(riscv_func_decl_hash): New func to compute the hash for fn_decl.
(riscv_func_target_hasher::hash): New func to impl hash interface.
(riscv_func_target_hasher::equal): New func to impl equal interface.
(riscv_cmdline_subset_list): New static var for cmdline subset list.
(riscv_func_target_table_lazy_init): New func to lazy init the func
target hash table.
(riscv_func_target_get): New func to get target name from hash table.
(riscv_func_target_put): New func to put target name into hash table.
(riscv_func_target_remove_and_destory): New func to remove target
info from the hash table and destroy it.
(riscv_parse_arch_string): Set the static var cmdline_subset_list.
* config/riscv/riscv-subset.h (riscv_cmdline_subset_list): New static
var for cmdline subset list.
(riscv_func_target_get): New func decl.
(riscv_func_target_put): Ditto.
(riscv_func_target_remove_and_destory): Ditto.
* config/riscv/riscv-target-attr.cc 
(riscv_target_attr_parser::parse_arch):
Take cmdline_subset_list instead of current_subset_list when clone.
(riscv_process_target_attr): Record the func target info to hash table.
(riscv_option_valid_attribute_p): Add new arg tree fndecl.
* config/riscv/riscv.cc (riscv_declare_function_name): Consume the
func target info and print the arch message.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114352-3.c: New test.

Signed-off-by: Pan Li 
---
 gcc/common/config/riscv/riscv-common.cc   | 105 +++-
 gcc/config/riscv/riscv-subset.h   |   4 +
 gcc/config/riscv/riscv-target-attr.cc |  18 ++-
 gcc/config/riscv/riscv.cc |   7 +-
 .../gcc.target/riscv/rvv/base/pr114352-3.c| 113 ++
 5 files changed, 240 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-3.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index d32bf147eca..76ec9bf846c 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -425,11 +425,108 @@ bool riscv_subset_list::parse_failed = false;
 
 static riscv_subset_list *current_subset_list = NULL;
 
+static riscv_subset_list *cmdline_subset_list = NULL;
+
+struct riscv_func_target_info
+{
+  tree fn_decl;
+  std::string fn_target_name;
+
+  riscv_func_target_info (const tree &decl, const std::string &target_name)
+: fn_decl (decl), fn_target_name (target_name)
+  {
+  }
+};
+
+struct riscv_func_target_hasher : nofree_ptr_hash <riscv_func_target_info>
+{
+  typedef tree compare_type;
+
+  static hashval_t hash (value_type);
+  static bool equal (value_type, const compare_type &);
+};
+
+static hash_table <riscv_func_target_hasher> *func_target_table = NULL;
+
+static inline hashval_t riscv_func_decl_hash (tree fn_decl)
+{
+  inchash::hash h;
+
+  h.add_ptr (fn_decl);
+
+  return h.end ();
+}
+
+inline hashval_t
+riscv_func_target_hasher::hash (value_type value)
+{
+  return riscv_func_decl_hash (value->fn_decl);
+}
+
+inline bool
+riscv_func_target_hasher::equal (value_type value, const compare_type &key)
+{
+  return value->fn_decl == key;
+}
+
 const riscv_subset_list *riscv_current_subset_list ()
 {
   return