Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-02 Thread Kewen.Lin
Hi Jakub,

on 2024/4/2 16:03, Jakub Jelinek wrote:
> On Tue, Apr 02, 2024 at 02:12:04PM +0800, Kewen.Lin wrote:
>> The old code for the unused hidden parameter (which was the 9th param)
>> would fall thru to the "return NULL_RTX;" which would make the callee
>> assume there was a parameter save area allocated.  Now instead, we'll
>> return a reg rtx, probably of r11 (r3 thru r10 are our param regs) and
>> I'm guessing we'll now see a copy of r11 into a pseudo like we do for
>> the other param regs.  Is that a problem? Given it's an unused
>> parameter, it'll probably get deleted as dead code, but could it cause
>> any issues?  What if we have more than one
>>
>> I think Peter raised one good point, not sure it would really cause some
>> issues, but the assigned reg goes beyond GP_ARG_MAX_REG, at least it is
>> confusing to people especially without DCE like at -O0.  Can we
>> aggressively remove these candidates from DECL_ARGUMENTS chain?  Does it
>> cause any assertion to fail?
> 
> I'd prefer not to remove DECL_ARGUMENTS chains, they are valid arguments
> that just some invalid code doesn't pass.  By removing them you basically
> always create an invalid case, this time in the other direction, valid
> caller passes more arguments than the callee (invalidly) expects.

Thanks for the comments.  Do you mean it can affect argument validation
when there is an explicit function declaration with an interface?  If so,
can we strip them only when we are about to expand them (e.g. by checking
currently_expanding_function_start)?  From the perspective of the
resulting assembly, with this workaround the callee can:
  1) pass the hidden args in unexpected GPRs like r11, ... at -O0;
  2) get rid of such hidden args as they are unused at -O2;
This proposal aims to make the assembly at -O0 not pass them in r11...
(same as -O2).  Compared to the assembly at -O2, the mismatch isn't
actually changed.

BR,
Kewen



Re:[PATCH v2 1/1] [RISC-V] Add support for _Bfloat16

2024-04-02 Thread Palmer Dabbelt

On Tue, 02 Apr 2024 20:19:16 PDT (-0700), ji...@linux.alibaba.com wrote:

>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/bf16_arithmetic.c: New test.
>> * gcc.target/riscv/bf16_call.c: New test.
>> * gcc.target/riscv/bf16_comparison.c: New test.
>> * gcc.target/riscv/bf16_float_libcall_convert.c: New test.
>> * gcc.target/riscv/bf16_integer_libcall_convert.c: New test.
>
> Hi, I have tested this patch and it is very good. I think we need to add some
> runnable tests to ensure that the results are right for various types of
> conversions, operations, and libfuncs.


Sorry I forgot to reply earlier, a few of us were talking about this in 
the patchwork meeting this morning.  We think this is too big for GCC-14 
this late in the cycle.  Folks are still looking at bugs, so it might 
take a bit to get reviewed for GCC-15.


I took a look and don't see anything wrong, but I'm not a floating-point 
person so I'd want to try and talk to someone who is before committing 
it.


More testing never hurts, though ;)


> BR,
> Jin


Re:[PATCH v2 1/1] [RISC-V] Add support for _Bfloat16

2024-04-02 Thread Jin Ma
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/bf16_arithmetic.c: New test.
>   * gcc.target/riscv/bf16_call.c: New test.
>   * gcc.target/riscv/bf16_comparison.c: New test.
>   * gcc.target/riscv/bf16_float_libcall_convert.c: New test.
>   * gcc.target/riscv/bf16_integer_libcall_convert.c: New test.

  Hi, I have tested this patch and it is very good. I think we need to add some
runnable tests to ensure that the results are right for various types of
conversions, operations, and libfuncs.

BR,
Jin


Re:[pushed] [PATCH] LoongArch: Remove unused code

2024-04-02 Thread Lulu Cheng

Pushed to r14-9766.

On 2024/4/2 at 2:33 PM, Jiahao Xu wrote:

For machines that satisfy ISA_HAS_LSX && !TARGET_64BIT, we do not support them
now and will not in the future, so this patch removes the unused code.

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove unused code.
* config/loongarch/loongarch-protos.h (loongarch_split_lsx_copy_d): Remove.
(loongarch_split_lsx_insert_d): Ditto.
(loongarch_split_lsx_fill_d): Ditto.
* config/loongarch/loongarch.cc (loongarch_split_lsx_copy_d): Ditto.
(loongarch_split_lsx_insert_d): Ditto.
(loongarch_split_lsx_fill_d): Ditto.
* config/loongarch/lsx.md (lsx_vpickve2gr_du): Remove splitter.
(lsx_vpickve2gr_): Ditto.
(abs2): Remove expander.
 (vabs2): Rename to abs2.

gcc/testsuite/ChangeLog:

 * gcc.target/loongarch/vector/lsx/lsx-abs.c: New test.
---
  gcc/config/loongarch/lasx.md  | 12 +--
  gcc/config/loongarch/loongarch-protos.h   |  3 -
  gcc/config/loongarch/loongarch.cc | 76 
  gcc/config/loongarch/lsx.md   | 89 ++-
  .../gcc.target/loongarch/vector/lsx/lsx-abs.c | 26 ++
  5 files changed, 35 insertions(+), 171 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-abs.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 2fa5e46c8e8..7bd61f8ed5b 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -572,12 +572,7 @@ (define_insn "lasx_xvinsgr2vr_"
  (match_operand 3 "const__operand" "")))]
"ISA_HAS_LASX"
  {
-#if 0
-  if (!TARGET_64BIT && (mode == V4DImode || mode == V4DFmode))
-return "#";
-  else
-#endif
-return "xvinsgr2vr.\t%u0,%z1,%y3";
+  return "xvinsgr2vr.\t%u0,%z1,%y3";
  }
[(set_attr "type" "simd_insert")
 (set_attr "mode" "")])
@@ -1446,10 +1441,7 @@ (define_insn "lasx_xvreplgr2vr_"
if (which_alternative == 1)
  return "xvldi.b\t%u0,0" ;
  
-  if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode))

-return "#";
-  else
-return "xvreplgr2vr.\t%u0,%z1";
+  return "xvreplgr2vr.\t%u0,%z1";
  }
[(set_attr "type" "simd_fill")
 (set_attr "mode" "")
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index e3ed2b912a5..e238d795a73 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -89,9 +89,6 @@ extern void loongarch_split_128bit_move (rtx, rtx);
  extern bool loongarch_split_128bit_move_p (rtx, rtx);
  extern void loongarch_split_256bit_move (rtx, rtx);
  extern bool loongarch_split_256bit_move_p (rtx, rtx);
-extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx));
-extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
-extern void loongarch_split_lsx_fill_d (rtx, rtx);
  extern const char *loongarch_output_move (rtx, rtx);
  #ifdef RTX_CODE
  extern void loongarch_expand_scc (rtx *);
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index a69a203fbe6..8438cc64b0d 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4756,82 +4756,6 @@ loongarch_split_256bit_move (rtx dest, rtx src)
  }
  }
  
-

-/* Split a COPY_S.D with operands DEST, SRC and INDEX.  GEN is a function
-   used to generate subregs.  */
-
-void
-loongarch_split_lsx_copy_d (rtx dest, rtx src, rtx index,
-   rtx (*gen_fn)(rtx, rtx, rtx))
-{
-  gcc_assert ((GET_MODE (src) == V2DImode && GET_MODE (dest) == DImode)
- || (GET_MODE (src) == V2DFmode && GET_MODE (dest) == DFmode));
-
-  /* Note that low is always from the lower index, and high is always
- from the higher index.  */
-  rtx low = loongarch_subword (dest, false);
-  rtx high = loongarch_subword (dest, true);
-  rtx new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0);
-
-  emit_insn (gen_fn (low, new_src, GEN_INT (INTVAL (index) * 2)));
-  emit_insn (gen_fn (high, new_src, GEN_INT (INTVAL (index) * 2 + 1)));
-}
-
-/* Split a INSERT.D with operand DEST, SRC1.INDEX and SRC2.  */
-
-void
-loongarch_split_lsx_insert_d (rtx dest, rtx src1, rtx index, rtx src2)
-{
-  int i;
-  gcc_assert (GET_MODE (dest) == GET_MODE (src1));
-  gcc_assert ((GET_MODE (dest) == V2DImode
-  && (GET_MODE (src2) == DImode || src2 == const0_rtx))
- || (GET_MODE (dest) == V2DFmode && GET_MODE (src2) == DFmode));
-
-  /* Note that low is always from the lower index, and high is always
- from the higher index.  */
-  rtx low = loongarch_subword (src2, false);
-  rtx high = loongarch_subword (src2, true);
-  rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
-  rtx new_src1 = simplify_gen_subreg (V4SImode, src1, GET_MODE (src1), 0);
-  i = exact_log2 (INTVAL (index));
-  gcc_assert (i != -1);
-
-  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, low, new_src1,
-  

Re: [PATCH] c++: Keep DECL_SAVED_TREE of destructor instantiations in modules [PR104040]

2024-04-02 Thread Nathaniel Shead
On Tue, Apr 02, 2024 at 01:18:17PM -0400, Jason Merrill wrote:
> On 3/28/24 23:21, Nathaniel Shead wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > 
> > -- >8 --
> > 
> > A template instantiation still needs to have its DECL_SAVED_TREE so that
> > its definition is emitted into the CMI. This way it can be emitted in
> > the object file of any importers that use it, in case it doesn't end up
> > getting emitted in this TU.
> > 
> > PR c++/104040
> > 
> > gcc/cp/ChangeLog:
> > 
> > * semantics.cc (expand_or_defer_fn_1): Also keep DECL_SAVED_TREE
> > for template instantiations.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/modules/pr104040_a.C: New test.
> > * g++.dg/modules/pr104040_b.C: New test.
> > 
> > Signed-off-by: Nathaniel Shead 
> > ---
> >   gcc/cp/semantics.cc   |  7 +--
> >   gcc/testsuite/g++.dg/modules/pr104040_a.C | 14 ++
> >   gcc/testsuite/g++.dg/modules/pr104040_b.C |  8 
> >   3 files changed, 27 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_a.C
> >   create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_b.C
> > 
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index adb1ba48d29..84e9901509a 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -5033,9 +5033,12 @@ expand_or_defer_fn_1 (tree fn)
> > /* We don't want to process FN again, so pretend we've written
> >  it out, even though we haven't.  */
> > TREE_ASM_WRITTEN (fn) = 1;
> > -  /* If this is a constexpr function, keep DECL_SAVED_TREE.  */
> > +  /* If this is a constexpr function, or the body might need to be
> > +exported from a module CMI, keep DECL_SAVED_TREE.  */
> > if (!DECL_DECLARED_CONSTEXPR_P (fn)
> > - && !(modules_p () && DECL_DECLARED_INLINE_P (fn)))
> > + && !(modules_p ()
> > +  && (DECL_DECLARED_INLINE_P (fn)
> > +  || DECL_TEMPLATE_INSTANTIATION (fn
> 
> How about using vague_linkage_p?
> 

Right, of course.  How about this?
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

A template instantiation still needs to have its DECL_SAVED_TREE so that
its definition is emitted into the CMI. This way it can be emitted in
the object file of any importers that use it, in case it doesn't end up
getting emitted in this TU.

PR c++/104040

gcc/cp/ChangeLog:

* semantics.cc (expand_or_defer_fn_1): Keep DECL_SAVED_TREE for
all vague linkage functions.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr104040_a.C: New test.
* g++.dg/modules/pr104040_b.C: New test.

Signed-off-by: Nathaniel Shead 
Reviewed-by: Jason Merrill 
---
 gcc/cp/semantics.cc   |  5 +++--
 gcc/testsuite/g++.dg/modules/pr104040_a.C | 14 ++
 gcc/testsuite/g++.dg/modules/pr104040_b.C |  8 
 3 files changed, 25 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_b.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index adb1ba48d29..03800a20b26 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -5033,9 +5033,10 @@ expand_or_defer_fn_1 (tree fn)
   /* We don't want to process FN again, so pretend we've written
 it out, even though we haven't.  */
   TREE_ASM_WRITTEN (fn) = 1;
-  /* If this is a constexpr function, keep DECL_SAVED_TREE.  */
+  /* If this is a constexpr function, or the body might need to be
+exported from a module CMI, keep DECL_SAVED_TREE.  */
   if (!DECL_DECLARED_CONSTEXPR_P (fn)
- && !(modules_p () && DECL_DECLARED_INLINE_P (fn)))
+ && !(modules_p () && vague_linkage_p (fn)))
DECL_SAVED_TREE (fn) = NULL_TREE;
   return false;
 }
diff --git a/gcc/testsuite/g++.dg/modules/pr104040_a.C b/gcc/testsuite/g++.dg/modules/pr104040_a.C
new file mode 100644
index 000..ea36ce0a798
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr104040_a.C
@@ -0,0 +1,14 @@
+// PR c++/104040
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi test }
+
+export module test;
+
+export template 
+struct test {
+  ~test() {}
+};
+
+test use() {
+  return {};
+}
diff --git a/gcc/testsuite/g++.dg/modules/pr104040_b.C b/gcc/testsuite/g++.dg/modules/pr104040_b.C
new file mode 100644
index 000..efe014673fb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr104040_b.C
@@ -0,0 +1,8 @@
+// PR c++/104040
+// { dg-additional-options "-fmodules-ts" }
+
+import test;
+
+int main() {
+  test t{};
+}
-- 
2.43.2



[pushed] analyzer: prevent ICEs with null types

2024-04-02 Thread David Malcolm
Fixes some ICEs seen analyzing the Linux kernel.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-9762-ge945d322fcbc68.

gcc/analyzer/ChangeLog:
* region-model-manager.cc (maybe_undo_optimize_bit_field_compare):
Guard against null types.
* region-model.cc (apply_constraints_for_gswitch): Likewise.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-manager.cc | 2 ++
 gcc/analyzer/region-model.cc | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
index 4feb349c9142..f155eeb87c0d 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -616,6 +616,8 @@ maybe_undo_optimize_bit_field_compare (tree type,
   tree cst,
   const svalue *arg1)
 {
+  if (!type)
+return nullptr;
   if (!INTEGRAL_TYPE_P (type))
 return NULL;
 
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 902b887fc074..98f287145c6c 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -5781,7 +5781,8 @@ apply_constraints_for_gswitch (const switch_cfg_superedge 
,
  && is_a  (unaryop->get_arg ()))
if (const initial_svalue *initvalop = (as_a 
   (unaryop->get_arg (
- if (TREE_CODE (initvalop->get_type ()) == ENUMERAL_TYPE)
+ if (initvalop->get_type ()
+ && TREE_CODE (initvalop->get_type ()) == ENUMERAL_TYPE)
{
  index_sval = initvalop;
  check_index_type = false;
-- 
2.26.3



Re: [C PATCH] fix aliasing for structures/unions with incomplete types

2024-04-02 Thread Martin Uecker
On Tuesday, 02.04.2024 at 20:42 +, Joseph Myers wrote:
> On Tue, 2 Apr 2024, Martin Uecker wrote:
> 
> > [C23]fix aliasing for structures/unions with incomplete types
> > 
> > When incomplete structure/union types are completed later, compatibility
> > of struct types that contain pointers to such types changes.  When forming
> > equivalence classes for TYPE_CANONICAL, we therefore need to be conservative
> > and treat all structs with the same tag which are pointer targets as
> > equivalent.
> 
> I don't see how what is done is actually about "which are pointer 
> targets".

Right, I see now that the description needs to be improved. This refers
only to targets of pointers included somewhere in the type we process
for purposes of determining the equivalence class of this type (but
not for other contexts).

> 
> > @@ -1355,6 +1356,7 @@ comptypes_internal (const_tree type1, const_tree type2,
> >/* Do not remove mode information.  */
> >if (TYPE_MODE (t1) != TYPE_MODE (t2))
> > return false;
> > +  data->pointedto = true;
> >return comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2), data);
> 
> This appears to be more like "which are the targets of pointers that *have 
> just been compared in the present comptypes call*".  Not which are targets 
> of some other pointers not involved in that call.

Correct.
> 
> Maybe some such logic based only on pointers compared in the present call 
> makes sense for some purposes, but it's not clear to me that either this 
> or any similar approach is a good approach for TYPE_CANONICAL - couldn't 
> that mean that two types are considered equivalent for TYPE_CANONICAL at 
> one point in the translation unit, but no longer equivalent at some later 
> point when the comparison takes place in the context of comparing two 
> other pointer types?

They would always be considered equivalent when pointed to from another
struct (or indirectly from a type nested in this struct) for purposes
of determining the equivalence class of this struct.

When not a pointer target, i.e. when considering the struct type itself
for which we compute TYPE_CANONICAL or another struct type directly used
for a member, then the types are compared by recursing into them. Such
types can never be incomplete at this point, so this is also a stable
property.

To summarize: for determining equivalence classes we always stop
the recursion after following pointers into other structs. We give
the same TYPE_CANONICAL to the following two structs foo:

struct foo { struct aa { int x; } *p; };
struct foo { struct aa { float x; } *p; };

while we give different TYPE_CANONICAL to

struct bar { struct aa { int x; } p; };
struct bar { struct aa { float x; } p; };

(not pointer).  The reason is that for the struct foo's there
could be a 

struct foo { struct aa *p; };

with incomplete type struct aa that later turns out to be compatible
with either of them. So we have to put them all into the same 
equivalence class.

(A potential alternative is to compute the classes only at the very end
when all types have stabilized, but this would require many more changes
and another pass over all the types.)


Note that the TYPE_CANONICAL for the aa's is not affected in any case and
is always computed based on *their* content, independent of whether they
are pointer targets or not.  (But this reminds me to double check what
happens with types that are never completed in a TU.)
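
The rule summarized above can be sketched as a toy model.  This is only an
illustration written in C++, not code from c-typeck.cc: the names `Kind`,
`Type` and `same_class` are hypothetical, each struct is assumed to have a
single member, and the `pointed_to` argument merely mimics the role of the
`pointedto` flag in the patch.

```cpp
#include <string_view>

// Toy model only; none of these names exist in GCC.
enum class Kind { Int, Float, Pointer, Struct };

struct Type {
  Kind kind;
  std::string_view tag;  // tag name, used only for Kind::Struct
  const Type *inner;     // pointee for Pointer, the single member for Struct,
                         // nullptr for Int/Float and for incomplete structs
};

// Decide whether A and B would share an equivalence class.  POINTED_TO is
// true once the walk has followed a pointer; a struct reached that way is
// matched by tag alone, because it might have been incomplete when the
// class was formed.
constexpr bool
same_class (const Type &a, const Type &b, bool pointed_to = false)
{
  if (a.kind != b.kind)
    return false;
  switch (a.kind)
    {
    case Kind::Int:
    case Kind::Float:
      return true;
    case Kind::Pointer:
      return same_class (*a.inner, *b.inner, /*pointed_to=*/true);
    case Kind::Struct:
      if (a.tag != b.tag)
        return false;
      if (pointed_to)
        return true;  // stop the recursion behind a pointer
      if (!a.inner || !b.inner)
        return a.inner == b.inner;
      // A directly used member is compared by recursing into it.
      return same_class (*a.inner, *b.inner, /*pointed_to=*/false);
    }
  return false;
}

inline constexpr Type int_t{Kind::Int, {}, nullptr};
inline constexpr Type float_t{Kind::Float, {}, nullptr};
// struct aa { int x; }, struct aa { float x; } and incomplete struct aa
inline constexpr Type aa_int{Kind::Struct, "aa", &int_t};
inline constexpr Type aa_float{Kind::Struct, "aa", &float_t};
inline constexpr Type aa_incomplete{Kind::Struct, "aa", nullptr};
inline constexpr Type p_aa_int{Kind::Pointer, {}, &aa_int};
inline constexpr Type p_aa_float{Kind::Pointer, {}, &aa_float};
inline constexpr Type p_aa_incomplete{Kind::Pointer, {}, &aa_incomplete};
// struct foo { struct aa ... *p; }
inline constexpr Type foo_int{Kind::Struct, "foo", &p_aa_int};
inline constexpr Type foo_float{Kind::Struct, "foo", &p_aa_float};
inline constexpr Type foo_incomplete{Kind::Struct, "foo", &p_aa_incomplete};
// struct bar { struct aa ... p; }
inline constexpr Type bar_int{Kind::Struct, "bar", &aa_int};
inline constexpr Type bar_float{Kind::Struct, "bar", &aa_float};
```

In this model the three foo variants land in one class, while the two bar
variants and the two complete aa variants stay apart, which matches the
examples given above.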


I hope this explanation makes sense.


Martin







Re: [PATCH] libphobos, Darwin: Enable libphobos for most Darwin.

2024-04-02 Thread Iain Buclaw
Excerpts from Iain Sandoe's message of April 2, 2024 1:51 pm:
> I have been building and testing D/libphobos for some time and over
> some GCC and OS releases.  As discussed on IRC a while ago, I think
> we're ready to enable this (it also avoids an annoying build fail at
> stage 2 if one forgets to add the enable to the command line).
> 
> Also tested on x86_64 and powerpc64 linux gnu.
> 
> OK for trunk?
> OK for backports?
> thanks,
> Iain
> 

If you're confident, OK, let's enable it.

Iain.


Re: [C PATCH] fix aliasing for structures/unions with incomplete types

2024-04-02 Thread Joseph Myers
On Tue, 2 Apr 2024, Martin Uecker wrote:

> [C23]fix aliasing for structures/unions with incomplete types
> 
> When incomplete structure/union types are completed later, compatibility
> of struct types that contain pointers to such types changes.  When forming
> equivalence classes for TYPE_CANONICAL, we therefore need to be conservative
> and treat all structs with the same tag which are pointer targets as
> equivalent.

I don't see how what is done is actually about "which are pointer 
targets".

> @@ -1355,6 +1356,7 @@ comptypes_internal (const_tree type1, const_tree type2,
>/* Do not remove mode information.  */
>if (TYPE_MODE (t1) != TYPE_MODE (t2))
>   return false;
> +  data->pointedto = true;
>return comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2), data);

This appears to be more like "which are the targets of pointers that *have 
just been compared in the present comptypes call*".  Not which are targets 
of some other pointers not involved in that call.

Maybe some such logic based only on pointers compared in the present call 
makes sense for some purposes, but it's not clear to me that either this 
or any similar approach is a good approach for TYPE_CANONICAL - couldn't 
that mean that two types are considered equivalent for TYPE_CANONICAL at 
one point in the translation unit, but no longer equivalent at some later 
point when the comparison takes place in the context of comparing two 
other pointer types?

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [C PATCH] Fix ICE with -g and -std=c23 related to incomplete types [PR114361]

2024-04-02 Thread Joseph Myers
On Tue, 2 Apr 2024, Martin Uecker wrote:

> Fix ICE with -g and -std=c23 related to incomplete types [PR114361]
> 
> We did not copy TYPE_CANONICAL to the incomplete variants when
> completing a structure.
> 
> PR c/114361
> 
> gcc/c/
> * c-decl.cc (finish_struct): Set TYPE_CANONICAL when completing
>   structure types.
> 
> gcc/testsuite/
> * gcc.dg/pr114361.c: New test.
> * gcc.dg/c23-tag-incomplete-1.c: New test.
> * gcc.dg/c23-tag-incomplete-2.c: New test.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



[committed] libstdc++: Guard uses of char8_t with __cpp_char8_t [PR114519]

2024-04-02 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

PR libstdc++/114519
* include/bits/unicode.h (_Utf8_view): Guard with check for
char8_t being enabled.
(__literal_encoding_is_unicode): Guard use of char8_t with check
for it being enabled.
* testsuite/std/format/functions/114519.cc: New test.
---
 libstdc++-v3/include/bits/unicode.h   | 10 +++---
 libstdc++-v3/testsuite/std/format/functions/114519.cc |  3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/std/format/functions/114519.cc

diff --git a/libstdc++-v3/include/bits/unicode.h b/libstdc++-v3/include/bits/unicode.h
index 51bf02e927f..0e95c86a0b0 100644
--- a/libstdc++-v3/include/bits/unicode.h
+++ b/libstdc++-v3/include/bits/unicode.h
@@ -578,8 +578,10 @@ namespace __unicode
   constexpr bool empty() const { return ranges::empty(_M_base); }
 };
 
+#ifdef __cpp_char8_t
   template
 using _Utf8_view = _Utf_view;
+#endif
   template
 using _Utf16_view = _Utf_view;
   template
@@ -991,12 +993,14 @@ inline namespace __v15_1_0
 consteval bool
 __literal_encoding_is_unicode()
 {
-  if constexpr (is_same_v<_CharT, char8_t>)
-   return true;
-  else if constexpr (is_same_v<_CharT, char16_t>)
+  if constexpr (is_same_v<_CharT, char16_t>)
return true;
   else if constexpr (is_same_v<_CharT, char32_t>)
  return true;
+#ifdef __cpp_char8_t
+  else if constexpr (is_same_v<_CharT, char8_t>)
+   return true;
+#endif
 
   const char* __enc = "";
 
diff --git a/libstdc++-v3/testsuite/std/format/functions/114519.cc b/libstdc++-v3/testsuite/std/format/functions/114519.cc
new file mode 100644
index 000..25a112a954e
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/format/functions/114519.cc
@@ -0,0 +1,3 @@
+// { dg-do compile { target c++20 } }
+// { dg-options "-fno-char8_t" }
+#include 
-- 
2.44.0
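
For readers outside libstdc++, the guard pattern used in this patch can be
shown in a standalone form.  This is a minimal sketch, assuming a C++17 or
later compiler; the helper name `is_unicode_char_type` is made up for the
example and is not the patched `__literal_encoding_is_unicode` function.

```cpp
#include <type_traits>

// Hypothetical helper, loosely modelled on the patched function; it is not
// the libstdc++ implementation.
template<typename CharT>
constexpr bool
is_unicode_char_type ()
{
  if constexpr (std::is_same_v<CharT, char16_t>)
    return true;
  else if constexpr (std::is_same_v<CharT, char32_t>)
    return true;
#ifdef __cpp_char8_t
  // Only name char8_t when the compiler actually provides the type,
  // so the code still compiles with -fno-char8_t.
  else if constexpr (std::is_same_v<CharT, char8_t>)
    return true;
#endif
  else
    return false;
}
```

Moving the char8_t branch behind the feature-test macro keeps the rest of
the chain valid whether or not the type exists, which is the same idea as
reordering the branches in the hunk above.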



Re: [PATCH] libstdc++: Allow adjacent __maybe_present_t to overlap

2024-04-02 Thread Pilar Latiesa
> > This is subjectively horrible and, more objectively, would create
> > longer mangled names and additional RTTI.

> Yeah, it's a neat trick but probably not appropriate to use within the
> standard library.

I understand. I was genuinely curious about whether this would do the trick.

In fact, if I'm not mistaken, the whole thing could be simplified to:

template
using __maybe_present_t = __conditional_t<_Present, _Tp, _Absent>;
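
To make the idea concrete, here is a minimal sketch of a conditionally
present member type.  It is an assumption for illustration and not quoted
from the patch or from libstdc++: the names `absent`, `maybe_present_t` and
`view_state` are hypothetical, and the integer discriminator is just one
possible way to give two absent members distinct empty types so that
[[no_unique_address]] is allowed to overlap them.

```cpp
#include <type_traits>

// Hypothetical stand-ins; the real libstdc++ names and parameters may differ.
// Disc only serves to make otherwise identical empty types distinct.
template<typename Tp, int Disc>
struct absent { };

// Tp when Present is true, otherwise an empty placeholder type.
template<bool Present, typename Tp, int Disc = 0>
using maybe_present_t = std::conditional_t<Present, Tp, absent<Tp, Disc>>;

// Example holder with two adjacent, conditionally present members.
template<bool HasA, bool HasB>
struct view_state
{
  [[no_unique_address]] maybe_present_t<HasA, int, 0> a;
  [[no_unique_address]] maybe_present_t<HasB, int, 1> b;
};
```

With distinct discriminators the two placeholders are different types, so an
implementation is permitted (but not required) to place them at the same
address; with the same type twice it could not.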


[C PATCH] fix aliasing for structures/unions with incomplete types

2024-04-02 Thread Martin Uecker



While fixing the other issue, I realized that the way the
equivalence classes are computed for TYPE_CANONICAL did
not take into account that completion of struct types
also affects compatibility of types that contain pointers
to them.  So the algorithm must be more conservative,
creating bigger equivalence classes.



Bootstrapped and regression tested on x86_64



[C23]fix aliasing for structures/unions with incomplete types

When incomplete structure/union types are completed later, compatibility
of struct types that contain pointers to such types changes.  When forming
equivalence classes for TYPE_CANONICAL, we therefore need to be conservative
and treat all structs with the same tag which are pointer targets as
equivalent.

gcc/c/
* c-typeck.cc (comptypes_internal): Add flag to track
whether a struct is the target of a pointer.
(tagged_types_tu_compatible): When forming equivalence
classes, treat pointed-to structs as equivalent.

gcc/testsuite/
* gcc.dg/c23-tag-incomplete-alias-1.c: New test.
---
 gcc/c/c-typeck.cc | 11 ++
 .../gcc.dg/c23-tag-incomplete-alias-1.c   | 34 +++
 2 files changed, 45 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-incomplete-alias-1.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index ddeab1e2a8a..b86450580ad 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1170,6 +1170,7 @@ struct comptypes_data {
   bool different_types_p;
   bool warning_needed;
   bool anon_field;
+  bool pointedto;
   bool equiv;
 
   const struct tagged_tu_seen_cache* cache;
@@ -1355,6 +1356,7 @@ comptypes_internal (const_tree type1, const_tree type2,
   /* Do not remove mode information.  */
   if (TYPE_MODE (t1) != TYPE_MODE (t2))
return false;
+  data->pointedto = true;
   return comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2), data);
 
 case FUNCTION_TYPE:
@@ -1513,6 +1515,14 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
   if (TYPE_NAME (t1) != TYPE_NAME (t2))
 return false;
 
+  /* When forming equivalence classes for TYPE_CANONICAL in C23, we
+ have to treat structs with the same tag as equivalent, when they
+ are targets of pointers inside other structs.  This is necessary
+ so that the relationship of types does not change when incomplete
+ types are completed.  */
+  if (data->equiv && data->pointedto)
+return true;
+
   if (!data->anon_field && NULL_TREE == TYPE_NAME (t1))
 return false;
 
@@ -1608,6 +1618,7 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
  return false;
 
data->anon_field = !DECL_NAME (s1);
+   data->pointedto = false;
 
data->cache = 
if (!comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2), data))
diff --git a/gcc/testsuite/gcc.dg/c23-tag-incomplete-alias-1.c b/gcc/testsuite/gcc.dg/c23-tag-incomplete-alias-1.c
new file mode 100644
index 000..7fb6a8513b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-tag-incomplete-alias-1.c
@@ -0,0 +1,34 @@
+/* { dg-do run } 
+ * { dg-options "-std=c23 -O2" } */
+
+[[gnu::noinline]]
+void *alias(void *ap, void *b, void *x, void *y)
+{
+   struct foo { struct bar *f; } *a = ap;
+   struct bar { long x; };
+
+   a->f = x;
+
+   {
+   struct bar;
+   struct foo { struct bar *f; };
+   struct bar { long x; };
+
+   ((struct foo*)b)->f = y;
+   }
+
+
+   return a->f;
+}
+
+int main()
+{
+   struct bar { long x; };
+   struct foo { struct bar *f; } a;
+   struct bar x, y;
+   if ( != alias(, , , ))
+   __builtin_abort();
+
+   return 0;
+}
+
-- 
2.39.2




[C PATCH] Fix ICE with -g and -std=c23 related to incomplete types [PR114361]

2024-04-02 Thread Martin Uecker



I did not copy TYPE_CANONICAL to incomplete variants
when they are completed.



Bootstrapped and regression tested on x86_64



Fix ICE with -g and -std=c23 related to incomplete types [PR114361]

We did not copy TYPE_CANONICAL to the incomplete variants when
completing a structure.

PR c/114361

gcc/c/
* c-decl.cc (finish_struct): Set TYPE_CANONICAL when completing
structure types.

gcc/testsuite/
* gcc.dg/pr114361.c: New test.
* gcc.dg/c23-tag-incomplete-1.c: New test.
* gcc.dg/c23-tag-incomplete-2.c: New test.
---
 gcc/c/c-decl.cc |  1 +
 gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c | 14 ++
 gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c | 13 +
 gcc/testsuite/gcc.dg/pr114361.c | 11 +++
 4 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr114361.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index c747abe9f4e..f2083b9d96f 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9722,6 +9722,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
   C_TYPE_VARIABLE_SIZE (x) = C_TYPE_VARIABLE_SIZE (t);
   C_TYPE_VARIABLY_MODIFIED (x) = C_TYPE_VARIABLY_MODIFIED (t);
   C_TYPE_INCOMPLETE_VARS (x) = NULL_TREE;
+  TYPE_CANONICAL (x) = TYPE_CANONICAL (t);
 }
 
   /* Update type location to the one of the definition, instead of e.g.
diff --git a/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c b/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c
new file mode 100644
index 000..82d652569e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile }
+ * { dg-options "-std=c23 -g" } */
+
+struct a;
+typedef struct a b;
+
+void g() {
+struct a { b* x; };
+}
+
+struct a { b* x; };
+
+
+
diff --git a/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c b/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c
new file mode 100644
index 000..bc47a04ece5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile }
+ * { dg-options "-std=c23 -g" } */
+
+struct a;
+typedef struct a b;
+
+void f() {
+   extern struct a { b* x; } t;
+}
+
+extern struct a { b* x; } t;
+
+
diff --git a/gcc/testsuite/gcc.dg/pr114361.c b/gcc/testsuite/gcc.dg/pr114361.c
new file mode 100644
index 000..0f3feb53566
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114361.c
@@ -0,0 +1,11 @@
+/* PR c/114361 */
+/* { dg-do compile } */
+/* { dg-options "-std=gnu23 -g" } */
+
+void f()
+{
+typedef struct foo bar;
+typedef __typeof( ({ (struct foo { bar *x; }){ }; }) ) wuz;
+struct foo { wuz *x; };
+}
+
-- 
2.39.2




[pushed] c++: binding reference to comma expr [PR114561]

2024-04-02 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

We represent a reference binding where the referent type is more qualified
by a ck_ref_bind around a ck_qual.  We performed the ck_qual and then tried
to undo it with STRIP_NOPS, but that doesn't work if the conversion is
buried in COMPOUND_EXPR.  So instead let's avoid performing that fake
conversion in the first place.

PR c++/114561
PR c++/114562

gcc/cp/ChangeLog:

* call.cc (convert_like_internal): Avoid adding qualification
conversion in direct reference binding.

gcc/testsuite/ChangeLog:

* g++.dg/conversion/ref10.C: New test.
* g++.dg/conversion/ref11.C: New test.
---
 gcc/cp/call.cc  | 23 +++--
 gcc/testsuite/g++.dg/conversion/ref10.C |  5 
 gcc/testsuite/g++.dg/conversion/ref11.C | 33 +
 3 files changed, 47 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/conversion/ref10.C
 create mode 100644 gcc/testsuite/g++.dg/conversion/ref11.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 9e4c8073600..9568b5eb2c4 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -8742,7 +8742,15 @@ convert_like_internal (conversion *convs, tree expr, tree fn, int argnum,
   break;
 };
 
-  expr = convert_like (next_conversion (convs), expr, fn, argnum,
+  conversion *nc = next_conversion (convs);
+  if (convs->kind == ck_ref_bind && nc->kind == ck_qual
+  && !convs->need_temporary_p)
+/* direct_reference_binding might have inserted a ck_qual under
+   this ck_ref_bind for the benefit of conversion sequence ranking.
+   Don't actually perform that conversion.  */
+nc = next_conversion (nc);
+
+  expr = convert_like (nc, expr, fn, argnum,
   convs->kind == ck_ref_bind
   ? issue_conversion_warnings : false,
   c_cast_p, /*nested_p=*/true, complain & ~tf_no_cleanup);
@@ -8820,19 +8828,6 @@ convert_like_internal (conversion *convs, tree expr, tree fn, int argnum,
   {
tree ref_type = totype;
 
-   /* direct_reference_binding might have inserted a ck_qual under
-  this ck_ref_bind for the benefit of conversion sequence ranking.
-  Ignore the conversion; we'll create our own below.  */
-   if (next_conversion (convs)->kind == ck_qual
-   && !convs->need_temporary_p)
- {
-   gcc_assert (same_type_p (TREE_TYPE (expr),
-next_conversion (convs)->type));
-   /* Strip the cast created by the ck_qual; cp_build_addr_expr
-  below expects an lvalue.  */
-   STRIP_NOPS (expr);
- }
-
if (convs->bad_p && !next_conversion (convs)->bad_p)
  {
tree extype = TREE_TYPE (expr);
diff --git a/gcc/testsuite/g++.dg/conversion/ref10.C 
b/gcc/testsuite/g++.dg/conversion/ref10.C
new file mode 100644
index 000..1913f733a6b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/conversion/ref10.C
@@ -0,0 +1,5 @@
+// PR c++/114561
+
+void create(void* u) {
+  const void* const& r = ( (void)0, u );
+}
diff --git a/gcc/testsuite/g++.dg/conversion/ref11.C 
b/gcc/testsuite/g++.dg/conversion/ref11.C
new file mode 100644
index 000..bb9b835034c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/conversion/ref11.C
@@ -0,0 +1,33 @@
+// PR c++/114562
+// { dg-do compile { target c++11 } }
+
+template 
+struct Optional {
+  Optional(T&&);
+};
+
+struct MyClass {
+  MyClass(Optional);
+};
+
+// const void* NONE = nullptr; // Correct Error
+void* NONE = nullptr; // Crash
+
+void beforeParam();
+
+template
+struct Create {
+  template  static T create(U &&) noexcept;
+};
+
+
+template 
+template
+T Create::create(U && u) noexcept {
+  return T( ( (beforeParam()), (u) ) ); // { dg-error "cannot bind rvalue 
reference" }
+  // return T( (u) ); // Correct Error
+}
+
+void test_func() {
+  Create::create(NONE);
+}

base-commit: 35408b3669fac104cd380582b32e32c64a603d8b
-- 
2.44.0



[PATCH] c++: constexpr error with fn redecl in local scope [PR111132]

2024-04-02 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/13?

-- >8 --
We evaluate constexpr functions on the original, pre-genericization bodies.
That means that the function body we're evaluating will not have gone
through cp_genericize_r's "Map block scope extern declarations to visible
declarations with the same name and type in outer scopes if any".  Here:

  constexpr bool bar() { return true; } // #1
  constexpr bool foo() {
constexpr bool bar(void); // #2
return bar();
  }

it means that we:
1) register_constexpr_fundef (#1)
2) cp_genericize (#1)
   nothing interesting happens
3) register_constexpr_fundef (foo)
   does copy_fn, so we have two copies of the BIND_EXPR
4) cp_genericize (foo)
   this remaps #2 to #1, but only on one copy of the BIND_EXPR
5) retrieve_constexpr_fundef (foo)
   we find it, no problem
6) retrieve_constexpr_fundef (#2)
   and here #2 isn't found in constexpr_fundef_table, because
   we're working on the BIND_EXPR copy where #2 wasn't mapped to #1
   so we fail.  We've only registered #1.

It should work to use DECL_LOCAL_DECL_ALIAS (which used to be
extern_decl_map).  We evaluate constexpr functions on pre-cp_fold
bodies to avoid diagnostic problems, but the remapping I'm proposing
should not interfere with diagnostics.

This is not a problem for a global scope redeclaration; there we go
through duplicate_decls which keeps the DECL_UID:
  DECL_UID (olddecl) = olddecl_uid;
and DECL_UID is what constexpr_fundef_hasher::hash uses.
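
To illustrate the failure in steps 5 and 6 above, here is a toy model rather than GCC code.  Decl, FundefTable and resolve are hypothetical stand-ins for a FUNCTION_DECL, constexpr_fundef_table and the proposed DECL_LOCAL_DECL_ALIAS lookup; the only property carried over from the text is that the table is keyed on the declaration's UID.

```cpp
// Toy model only: none of these names are GCC internals.
struct Decl {
  int uid;                  // models DECL_UID
  const Decl *local_alias;  // models DECL_LOCAL_DECL_ALIAS; null if none
};

// Models a table of registered constexpr definitions, keyed on UID.
struct FundefTable {
  int registered_uids[8] = {};
  int count = 0;

  constexpr void register_fundef(const Decl &d) {
    registered_uids[count++] = d.uid;
  }

  constexpr bool retrieve(const Decl &d) const {
    for (int i = 0; i < count; ++i)
      if (registered_uids[i] == d.uid)
        return true;
    return false;
  }
};

// Models the proposed lookup: prefer the outer-scope declaration when the
// block-scope one records an alias.
constexpr const Decl &resolve(const Decl &d) {
  return d.local_alias ? *d.local_alias : d;
}

constexpr Decl bar1{1, nullptr};  // #1: namespace-scope definition
constexpr Decl bar2{2, &bar1};    // #2: block-scope redeclaration, own UID

constexpr FundefTable make_table() {
  FundefTable t;
  t.register_fundef(bar1);  // only #1 is ever registered
  return t;
}

// Looking up the unmapped #2 fails; going through the alias finds #1.
static_assert(!make_table().retrieve(bar2), "unmapped #2 is not found");
static_assert(make_table().retrieve(resolve(bar2)), "alias leads to #1");
```

In this model a global-scope redeclaration would simply reuse bar1's UID, which is why that case does not hit the problem.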

PR c++/111132

gcc/cp/ChangeLog:

* constexpr.cc (get_function_named_in_call): If there's
a DECL_LOCAL_DECL_ALIAS, use it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-redeclaration3.C: New test.
* g++.dg/cpp0x/constexpr-redeclaration4.C: New test.
---
 gcc/cp/constexpr.cc   | 19 +++
 .../g++.dg/cpp0x/constexpr-redeclaration3.C   | 13 +
 .../g++.dg/cpp0x/constexpr-redeclaration4.C   | 14 ++
 3 files changed, 42 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration4.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index fa346fe01c9..b47f0e984c0 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -702,15 +702,26 @@ build_constexpr_constructor_member_initializers (tree 
type, tree body)
 
 /* We have an expression tree T that represents a call, either CALL_EXPR
or AGGR_INIT_EXPR.  If the call is lexically to a named function,
-   retrun the _DECL for that function.  */
+   return the _DECL for that function.  */
 
 static tree
 get_function_named_in_call (tree t)
 {
   tree fun = cp_get_callee (t);
-  if (fun && TREE_CODE (fun) == ADDR_EXPR
-  && TREE_CODE (TREE_OPERAND (fun, 0)) == FUNCTION_DECL)
-fun = TREE_OPERAND (fun, 0);
+  if (fun)
+{
+  if (TREE_CODE (fun) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (fun, 0)) == FUNCTION_DECL)
+   fun = TREE_OPERAND (fun, 0);
+  /* We evaluate constexpr functions on the original, pre-genericization
+bodies.  So block-scope extern declarations have not been mapped to
+declarations in outer scopes.  Use the namespace-scope declaration,
+if any, so that retrieve_constexpr_fundef can find it (PR111132).  */
+  if (TREE_CODE (fun) == FUNCTION_DECL && DECL_LOCAL_DECL_P (fun))
+   if (tree alias = DECL_LOCAL_DECL_ALIAS (fun))
+ if (alias != error_mark_node)
+   fun = alias;
+}
   return fun;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration3.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration3.C
new file mode 100644
index 000..2b41b456fc3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration3.C
@@ -0,0 +1,13 @@
+// PR c++/111132
+// { dg-do compile { target c++11 } }
+
+constexpr bool bar(void) {
+return true;
+}
+
+constexpr bool foo() {
+constexpr bool bar(void);
+return bar();
+}
+
+static_assert(foo(), "");
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration4.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration4.C
new file mode 100644
index 000..c58247218c6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-redeclaration4.C
@@ -0,0 +1,14 @@
+// PR c++/111132
+// { dg-do compile { target c++11 } }
+
+constexpr bool bar(void) {
+return true;
+}
+
+constexpr bool bar(void);
+
+constexpr bool foo() {
+return bar();
+}
+
+static_assert(foo(), "");

base-commit: 0e64bbb8823f7b3757befc878ed177dfb59943d1
-- 
2.44.0



Re: [PATCH v2 2/3] aarch64: Add support for aarch64-gnu (GNU/Hurd on AArch64)

2024-04-02 Thread Richard Sandiford
Sergey Bugaev  writes:
> Coupled with a corresponding binutils patch, this produces a toolchain that 
> can
> successfully build working binaries targeting aarch64-gnu.
>
> gcc/Changelog:
>
>   * config.gcc: Recognize aarch64*-*-gnu* targets.
>   * config/aarch64/aarch64-gnu.h: New file.
>
> Signed-off-by: Sergey Bugaev 
> ---
>  gcc/config.gcc   |  6 +++
>  gcc/config/aarch64/aarch64-gnu.h | 68 
>  2 files changed, 74 insertions(+)
>  create mode 100644 gcc/config/aarch64/aarch64-gnu.h

I don't know if you're waiting on me, but just in case: this and patch 3
still LGTM if Thomas is OK with them.

Thanks,
Richard

> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 87a5c92b6..9d935164c 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1264,6 +1264,12 @@ aarch64*-*-linux*)
>   done
>   TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
>   ;;
> +aarch64*-*-gnu*)
> +tm_file="${tm_file} elfos.h gnu-user.h gnu.h glibc-stdint.h"
> +tm_file="${tm_file} aarch64/aarch64-elf.h aarch64/aarch64-errata.h 
> aarch64/aarch64-gnu.h"
> +tmake_file="${tmake_file} aarch64/t-aarch64"
> +tm_defines="${tm_defines}  TARGET_DEFAULT_ASYNC_UNWIND_TABLES=1"
> + ;;
>  aarch64*-wrs-vxworks*)
>  tm_file="${tm_file} elfos.h aarch64/aarch64-elf.h"
>  tm_file="${tm_file} vx-common.h vxworks.h aarch64/aarch64-vxworks.h"
> diff --git a/gcc/config/aarch64/aarch64-gnu.h 
> b/gcc/config/aarch64/aarch64-gnu.h
> new file mode 100644
> index 0..ee5494034
> --- /dev/null
> +++ b/gcc/config/aarch64/aarch64-gnu.h
> @@ -0,0 +1,68 @@
> +/* Definitions for AArch64 running GNU/Hurd.
> +   Copyright (C) 2009-2024 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but
> +   WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   .  */
> +
> +#ifndef GCC_AARCH64_GNU_H
> +#define GCC_AARCH64_GNU_H
> +
> +#define GNU_USER_DYNAMIC_LINKER 
> "/lib/ld-aarch64%{mbig-endian:_be}%{mabi=ilp32:_ilp32}.so.1"
> +
> +#define CPP_SPEC "%{pthread:-D_REENTRANT}"
> +
> +#define GNU_TARGET_LINK_SPEC  "%{h*} \
> +   %{static:-Bstatic}\
> +   %{shared:-shared} \
> +   %{symbolic:-Bsymbolic}\
> +   %{!static:%{!static-pie:  \
> + %{rdynamic:-export-dynamic} \
> + %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}} \
> +   %{static-pie:-Bstatic -pie --no-dynamic-linker -z text} \
> +   -X\
> +   %{mbig-endian:-EB} %{mlittle-endian:-EL} \
> +   -maarch64gnu%{mabi=ilp32:32}%{mbig-endian:b}"
> +
> +
> +#define LINK_SPEC GNU_TARGET_LINK_SPEC AARCH64_ERRATA_LINK_SPEC
> +
> +#define GNU_USER_TARGET_MATHFILE_SPEC \
> +  "%{Ofast|ffast-math|funsafe-math-optimizations:%{!shared:crtfastmath.o%s}}"
> +
> +#undef ENDFILE_SPEC
> +#define ENDFILE_SPEC   \
> +  GNU_USER_TARGET_MATHFILE_SPEC " " \
> +  GNU_USER_TARGET_ENDFILE_SPEC
> +
> +#define TARGET_OS_CPP_BUILTINS() \
> +  do \
> +{\
> + GNU_USER_TARGET_OS_CPP_BUILTINS();  \
> +}\
> +  while (0)
> +
> +#define TARGET_ASM_FILE_END aarch64_file_end_indicate_exec_stack
> +
> +/* Uninitialized common symbols in non-PIE executables, even with
> +   strong definitions in dependent shared libraries, will resolve
> +   to COPY relocated symbol in the executable.  See PR65780.  */
> +#undef TARGET_BINDS_LOCAL_P
> +#define TARGET_BINDS_LOCAL_P default_binds_local_p_2
> +
> +/* Define this to be nonzero if static stack checking is supported.  */
> +#define STACK_CHECK_STATIC_BUILTIN 1
> +
> +#endif  /* GCC_AARCH64_GNU_H */


Re: [PATCH] c++: Keep DECL_SAVED_TREE of destructor instantiations in modules [PR104040]

2024-04-02 Thread Jason Merrill

On 3/28/24 23:21, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

A template instantiation still needs to have its DECL_SAVED_TREE so that
its definition is emitted into the CMI. This way it can be emitted in
the object file of any importers that use it, in case it doesn't end up
getting emitted in this TU.

PR c++/104040

gcc/cp/ChangeLog:

* semantics.cc (expand_or_defer_fn_1): Also keep DECL_SAVED_TREE
for template instantiations.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr104040_a.C: New test.
* g++.dg/modules/pr104040_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/semantics.cc   |  7 +--
  gcc/testsuite/g++.dg/modules/pr104040_a.C | 14 ++
  gcc/testsuite/g++.dg/modules/pr104040_b.C |  8 
  3 files changed, 27 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_a.C
  create mode 100644 gcc/testsuite/g++.dg/modules/pr104040_b.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index adb1ba48d29..84e9901509a 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -5033,9 +5033,12 @@ expand_or_defer_fn_1 (tree fn)
/* We don't want to process FN again, so pretend we've written
 it out, even though we haven't.  */
TREE_ASM_WRITTEN (fn) = 1;
-  /* If this is a constexpr function, keep DECL_SAVED_TREE.  */
+  /* If this is a constexpr function, or the body might need to be
+exported from a module CMI, keep DECL_SAVED_TREE.  */
if (!DECL_DECLARED_CONSTEXPR_P (fn)
- && !(modules_p () && DECL_DECLARED_INLINE_P (fn)))
+ && !(modules_p ()
+  && (DECL_DECLARED_INLINE_P (fn)
+  || DECL_TEMPLATE_INSTANTIATION (fn))))


How about using vague_linkage_p?


DECL_SAVED_TREE (fn) = NULL_TREE;
return false;
  }
diff --git a/gcc/testsuite/g++.dg/modules/pr104040_a.C 
b/gcc/testsuite/g++.dg/modules/pr104040_a.C
new file mode 100644
index 000..ea36ce0a798
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr104040_a.C
@@ -0,0 +1,14 @@
+// PR c++/104040
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi test }
+
+export module test;
+
+export template 
+struct test {
+  ~test() {}
+};
+
+test use() {
+  return {};
+}
diff --git a/gcc/testsuite/g++.dg/modules/pr104040_b.C 
b/gcc/testsuite/g++.dg/modules/pr104040_b.C
new file mode 100644
index 000..efe014673fb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr104040_b.C
@@ -0,0 +1,8 @@
+// PR c++/104040
+// { dg-additional-options "-fmodules-ts" }
+
+import test;
+
+int main() {
+  test t{};
+}




Re: [PATCH] libstdc++: Allow adjacent __maybe_present_t to overlap

2024-04-02 Thread Patrick Palka
On Tue, 2 Apr 2024, Jonathan Wakely wrote:

> On Tue, 2 Apr 2024 at 18:00, Pilar Latiesa wrote:
> >
> > Just out of curiosity: would this also work?
> >
> > template
> > struct _Absent {};
> >
> > template
> > using __maybe_present_t = __conditional_t<_Present, _Tp, _Absent<_Tp, 
> > _Disc>>;
> >
> > That would avoid having to type 0, 1, ... manually.
> 
> This is subjectively horrible and, more objectively, would create
> longer mangled names and additional RTTI.

Yeah, it's a neat trick but probably not appropriate to use within the
standard library.

Another reason to avoid it is that GCC's support for lambdas within
template arguments has some known bugs (e.g. PR107457 but that should
hopefully be fixed soon).



Re: [PATCH] c++: ICE with scoped enum in switch condition [PR114451]

2024-04-02 Thread Jason Merrill

On 3/29/24 18:31, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/13?


OK.


-- >8 --
Here we ICE when gimplifying

   enum class Type { Pawn };
   struct Piece {
 Type type : 4;
   };
   void foo() {
 switch (Piece().type)
   case Type::Pawn:;
   }

because we ended up with TYPE_PRECISION (cond) < TYPE_PRECISION (case).
That's because the case expr type here is the unlowered type Type,
whereas the conditional's type is the lowered .  This
is not supposed to happen: see the comment in pop_switch around the
is_bitfield_expr_with_lowered_type check.

But here we did not revert to the lowered SWITCH_STMT_TYPE, because
the conditional contains a TARGET_EXPR, which has side-effects, which
means that finish_switch_cond -> maybe_cleanup_point_expr wraps it
in a CLEANUP_POINT_EXPR.  And is_bitfield_expr_with_lowered_type does
not see through those.
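
A toy sketch of the "see through wrapper nodes" idea, assuming a much simplified expression tree.  The node kinds borrow GCC's names, but Expr, Kind and is_lowered_bitfield are hypothetical and only model the shape of the check, not is_bitfield_expr_with_lowered_type itself.

```cpp
// Toy expression tree; only the node-kind names are borrowed from GCC.
enum class Kind { ComponentRef, NonLvalueExpr, CleanupPointExpr, Other };

struct Expr {
  Kind kind;
  const Expr *operand;    // single operand for wrapper nodes, else null
  bool lowered_bitfield;  // meaningful only for ComponentRef
};

// Returns true if EXP, after looking through the handled wrappers, refers
// to a bit-field with a lowered type.  SEE_THROUGH_CLEANUP switches between
// the behaviour before the patch (false) and after it (true).
constexpr bool is_lowered_bitfield(const Expr &exp, bool see_through_cleanup) {
  switch (exp.kind) {
    case Kind::CleanupPointExpr:
      if (!see_through_cleanup)
        return false;
      return is_lowered_bitfield(*exp.operand, see_through_cleanup);
    case Kind::NonLvalueExpr:
      return is_lowered_bitfield(*exp.operand, see_through_cleanup);
    case Kind::ComponentRef:
      return exp.lowered_bitfield;
    default:
      return false;
  }
}

// Models Piece().type wrapped in a CLEANUP_POINT_EXPR.
constexpr Expr field{Kind::ComponentRef, nullptr, true};
constexpr Expr wrapped{Kind::CleanupPointExpr, &field, false};

static_assert(!is_lowered_bitfield(wrapped, false), "wrapper hides the bit-field");
static_assert(is_lowered_bitfield(wrapped, true), "wrapper is looked through");
```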

PR c++/103825

gcc/cp/ChangeLog:

* typeck.cc (is_bitfield_expr_with_lowered_type): Handle
CLEANUP_POINT_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/enum44.C: New test.
---
  gcc/cp/typeck.cc|  1 +
  gcc/testsuite/g++.dg/cpp0x/enum44.C | 30 +
  2 files changed, 31 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/enum44.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index f5a0a2273be..9a096b51d55 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -2400,6 +2400,7 @@ is_bitfield_expr_with_lowered_type (const_tree exp)
  case NEGATE_EXPR:
  case NON_LVALUE_EXPR:
  case BIT_NOT_EXPR:
+case CLEANUP_POINT_EXPR:
return is_bitfield_expr_with_lowered_type (TREE_OPERAND (exp, 0));
  
  case COMPONENT_REF:

diff --git a/gcc/testsuite/g++.dg/cpp0x/enum44.C 
b/gcc/testsuite/g++.dg/cpp0x/enum44.C
new file mode 100644
index 000..92408c92217
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/enum44.C
@@ -0,0 +1,30 @@
+// PR c++/103825
+// { dg-do compile { target c++11 } }
+
+enum class Type { Pawn };
+struct Piece {
+  Type type : 4;
+};
+
+void
+foo ()
+{
+  switch (Piece().type)
+case Type::Pawn:;
+
+  auto x = Piece().type;
+  switch (x)
+case Type::Pawn:;
+}
+
+enum class En {A};
+struct St {En field :1;};
+
+void
+bar ()
+{
+  volatile St s = {En::A};
+  switch(s.field) {
+case En::A : break;
+  }
+}

base-commit: 4c18ace1cb69a31af4ac719850a66de79ed12e93




Re: [PATCH] c++: make __is_array return false for T[0] [PR114479]

2024-04-02 Thread Jason Merrill

On 4/1/24 13:50, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
When we switched to using the __is_array built-in trait to implement
std::is_array in r14-6623-g7fd9c349e45534, we started saying that
T[0] is an array.  There are various opinions as to whether that is
the best answer, but it seems prudent to keep the GCC 13 result.
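
For context, a library-only is_array is usually written with partial specializations, so the answer comes from template argument deduction.  The sketch below uses a hypothetical name (is_array_by_deduction) and is not the libstdc++ source; it only shows the general shape of such a definition.

```cpp
#include <cstddef>

// Primary template: not an array.
template <typename T>
struct is_array_by_deduction { static constexpr bool value = false; };

// Array of unknown bound.
template <typename T>
struct is_array_by_deduction<T[]> { static constexpr bool value = true; };

// Array of known bound; N is deduced from the type.
template <typename T, std::size_t N>
struct is_array_by_deduction<T[N]> { static constexpr bool value = true; };

static_assert(is_array_by_deduction<int[2]>::value, "");
static_assert(is_array_by_deduction<int[]>::value, "");
static_assert(is_array_by_deduction<int[2][3]>::value, "");
static_assert(!is_array_by_deduction<int>::value, "");
static_assert(!is_array_by_deduction<int *>::value, "");
```

The zero-length case is deliberately not asserted here, since T[0] is an extension and the result depends on how the compiler treats it during deduction.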

PR c++/114479

gcc/cp/ChangeLog:

* semantics.cc (trait_expr_value) <case CPTK_IS_ARRAY>: Return false
for zero-sized arrays.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_array.C: Extend.
---
  gcc/cp/semantics.cc |  4 +++-
  gcc/testsuite/g++.dg/ext/is_array.C | 12 
  2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 9838331d2a9..f561c119dfd 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12439,7 +12439,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
return CP_AGGREGATE_TYPE_P (type1);
  
  case CPTK_IS_ARRAY:

-  return type_code1 == ARRAY_TYPE;
+  return (type_code1 == ARRAY_TYPE
+ /* We don't want to report T[0] as being an array type.  */


Please elaborate that this is for compatibility with an implementation 
of is_array by template argument deduction, because 
compute_array_index_type_loc rejects a zero-size array in SFINAE context.


OK with that adjustment.


+ && !(TYPE_SIZE (type1) && integer_zerop (TYPE_SIZE (type1))));
  
  case CPTK_IS_ASSIGNABLE:

return is_xible (MODIFY_EXPR, type1, type2);
diff --git a/gcc/testsuite/g++.dg/ext/is_array.C 
b/gcc/testsuite/g++.dg/ext/is_array.C
index f1a6e08b87a..84993266629 100644
--- a/gcc/testsuite/g++.dg/ext/is_array.C
+++ b/gcc/testsuite/g++.dg/ext/is_array.C
@@ -1,4 +1,5 @@
  // { dg-do compile { target c++11 } }
+// { dg-options "" }
  
  #define SA(X) static_assert((X),#X)
  
@@ -10,18 +11,29 @@
  
  class ClassType { };
  
+constexpr int sz0 = 0;

+constexpr int sz2 = 2;
+
  SA_TEST_CATEGORY(__is_array, int[2], true);
  SA_TEST_CATEGORY(__is_array, int[], true);
+SA_TEST_CATEGORY(__is_array, int[0], false);
  SA_TEST_CATEGORY(__is_array, int[2][3], true);
  SA_TEST_CATEGORY(__is_array, int[][3], true);
+SA_TEST_CATEGORY(__is_array, int[0][3], false);
+SA_TEST_CATEGORY(__is_array, int[3][0], false);
  SA_TEST_CATEGORY(__is_array, float*[2], true);
  SA_TEST_CATEGORY(__is_array, float*[], true);
  SA_TEST_CATEGORY(__is_array, float*[2][3], true);
  SA_TEST_CATEGORY(__is_array, float*[][3], true);
  SA_TEST_CATEGORY(__is_array, ClassType[2], true);
  SA_TEST_CATEGORY(__is_array, ClassType[], true);
+SA_TEST_CATEGORY(__is_array, ClassType[0], false);
  SA_TEST_CATEGORY(__is_array, ClassType[2][3], true);
  SA_TEST_CATEGORY(__is_array, ClassType[][3], true);
+SA_TEST_CATEGORY(__is_array, ClassType[0][3], false);
+SA_TEST_CATEGORY(__is_array, ClassType[2][0], false);
+SA_TEST_CATEGORY(__is_array, int[sz2], true);
+SA_TEST_CATEGORY(__is_array, int[sz0], false);
  
  // Sanity check.

  SA_TEST_CATEGORY(__is_array, ClassType, false);

base-commit: bba118db3f63cb1e3953a014aa3ac2ad89908950




Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread Jan Hubicka
> > I am a bit worried about commonly used functions getting "infected" by
> > being called once from an ifunc resolver.  I think we only use thread local
> > storage for indirect call profiling, so we may just disable indirect
> > call profiling for these functions.
> 
> Will change it.
> 
> > Also the patch will be a no-op with -flto -flto-partition=max, so probably
> > we need to compute this flag at WPA time and stream to partitions.
> >
> 
> Why is it a nop with -flto -flto-partition=max? I got
> 
> (gdb) bt
> #0  symtab_node::check_ifunc_callee_symtab_nodes ()
> at /export/gnu/import/git/gitlab/x86-gcc/gcc/symtab.cc:1440
> #1  0x00e487d3 in symbol_table::compile (this=0x7fffea006000)
> at /export/gnu/import/git/gitlab/x86-gcc/gcc/cgraphunit.cc:2320
> #2  0x00d23ecf in lto_main ()
> at /export/gnu/import/git/gitlab/x86-gcc/gcc/lto/lto.cc:687
> #3  0x015254d2 in compile_file ()
> at /export/gnu/import/git/gitlab/x86-gcc/gcc/toplev.cc:449
> #4  0x015284a4 in do_compile ()
> at /export/gnu/import/git/gitlab/x86-gcc/gcc/toplev.cc:2154
> #5  0x01528864 in toplev::main (this=0x7fffd84a, argc=16,
> argv=0x42261f0) at 
> /export/gnu/import/git/gitlab/x86-gcc/gcc/toplev.cc:2310
> #6  0x030a3fe2 in main (argc=16, argv=0x7fffd958)
> at /export/gnu/import/git/gitlab/x86-gcc/gcc/main.cc:39
> 
> Do you have a testcase to show that it is a nop?
Aha, sorry.  I thought this is run during late optimization, but it is
done early, so LTO partitioning does not mix things up.  So the current
patch, modified to disable only the instrumentation that needs TLS, should
be fine.

Honza
> 
> -- 
> H.J.


Re: [PATCH] libstdc++: Allow adjacent __maybe_present_t to overlap

2024-04-02 Thread Jonathan Wakely
On Tue, 2 Apr 2024 at 18:00, Pilar Latiesa wrote:
>
> Just out of curiosity: would this also work?
>
> template
> struct _Absent {};
>
> template
> using __maybe_present_t = __conditional_t<_Present, _Tp, _Absent<_Tp, _Disc>>;
>
> That would avoid having to type 0, 1, ... manually.

This is subjectively horrible and, more objectively, would create
longer mangled names and additional RTTI.


Re: [PATCH] libstdc++: Allow adjacent __maybe_present_t to overlap

2024-04-02 Thread Pilar Latiesa
Just out of curiosity: would this also work?

template
struct _Absent {};

template
using __maybe_present_t = __conditional_t<_Present, _Tp, _Absent<_Tp,
_Disc>>;

That would avoid having to type 0, 1, ... manually.
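
As background on why a discriminator matters at all, here is a small sketch with hypothetical names (Absent, SameType, DistinctTypes), not the libstdc++ definitions.  It assumes the members are marked [[no_unique_address]]: two empty members of the same type must have distinct addresses, whereas members of distinct empty types are allowed to overlap.

```cpp
// Hypothetical placeholder for an "absent" member; Disc makes each use a
// distinct type.
template <typename T, int Disc>
struct Absent {};

struct SameType {
  [[no_unique_address]] Absent<int, 0> a;
  [[no_unique_address]] Absent<int, 0> b;  // same type as a: needs its own address
};

struct DistinctTypes {
  [[no_unique_address]] Absent<int, 0> a;
  [[no_unique_address]] Absent<int, 1> b;  // different type: may share a's address
};

// The exact sizes are ABI-specific, so only the relation is checked.
static_assert(sizeof(DistinctTypes) <= sizeof(SameType),
              "distinct empty types never cost more than identical ones");
```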


Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread H.J. Lu
On Tue, Apr 2, 2024 at 7:50 AM Jan Hubicka  wrote:
>
> > On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu  wrote:
> > >
> > > We can't instrument an IFUNC resolver nor its callees as it may require
> > > TLS which hasn't been set up yet when the dynamic linker is resolving
> > > IFUNC symbols.
> > >
> > > Add an IFUNC resolver caller marker to cgraph_node and set it if the
> > > function is called by an IFUNC resolver.  Update tree_profiling to skip
> > > functions called by IFUNC resolver.
> > >
> > > Tested with profiledbootstrap on Fedora 39/x86-64.
> > >
> > > gcc/ChangeLog:
> > >
> > > PR tree-optimization/114115
> > > * cgraph.h (symtab_node): Add check_ifunc_callee_symtab_nodes.
> > > (cgraph_node): Add called_by_ifunc_resolver.
> > > * cgraphunit.cc (symbol_table::compile): Call
> > > symtab_node::check_ifunc_callee_symtab_nodes.
> > > * symtab.cc (check_ifunc_resolver): New.
> > > (ifunc_ref_map): Likewise.
> > > (is_caller_ifunc_resolver): Likewise.
> > > (symtab_node::check_ifunc_callee_symtab_nodes): Likewise.
> > > * tree-profile.cc (tree_profiling): Do not instrument an IFUNC
> > > resolver nor its callees.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR tree-optimization/114115
> > > * gcc.dg/pr114115.c: New test.
> >
> > PING.
>
> I am a bit worried about commonly used functions getting "infected" by
> being called once from an ifunc resolver.  I think we only use thread local
> storage for indirect call profiling, so we may just disable indirect
> call profiling for these functions.

Will change it.

> Also the patch will be a no-op with -flto -flto-partition=max, so probably
> we need to compute this flag at WPA time and stream to partitions.
>

Why is it a nop with -flto -flto-partition=max? I got

(gdb) bt
#0  symtab_node::check_ifunc_callee_symtab_nodes ()
at /export/gnu/import/git/gitlab/x86-gcc/gcc/symtab.cc:1440
#1  0x00e487d3 in symbol_table::compile (this=0x7fffea006000)
at /export/gnu/import/git/gitlab/x86-gcc/gcc/cgraphunit.cc:2320
#2  0x00d23ecf in lto_main ()
at /export/gnu/import/git/gitlab/x86-gcc/gcc/lto/lto.cc:687
#3  0x015254d2 in compile_file ()
at /export/gnu/import/git/gitlab/x86-gcc/gcc/toplev.cc:449
#4  0x015284a4 in do_compile ()
at /export/gnu/import/git/gitlab/x86-gcc/gcc/toplev.cc:2154
#5  0x01528864 in toplev::main (this=0x7fffd84a, argc=16,
argv=0x42261f0) at /export/gnu/import/git/gitlab/x86-gcc/gcc/toplev.cc:2310
#6  0x030a3fe2 in main (argc=16, argv=0x7fffd958)
at /export/gnu/import/git/gitlab/x86-gcc/gcc/main.cc:39

Do you have a testcase to show that it is a nop?

-- 
H.J.


Re: [PATCH] aarch64: Fix typo in comment about FEATURE_STRING

2024-04-02 Thread Richard Sandiford
Christophe Lyon  writes:
> Fix the comment to document FEATURE_STRING instead of FEAT_STRING.
>
> 2024-03-29  Christophe Lyon  
>
>   gcc/
>   * config/aarch64/aarch64-option-extensions.def: Fix comment.

OK, thanks.

Richard

> ---
>  gcc/config/aarch64/aarch64-option-extensions.def | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
> b/gcc/config/aarch64/aarch64-option-extensions.def
> index 061a145e9e7..aa3cd99f791 100644
> --- a/gcc/config/aarch64/aarch64-option-extensions.def
> +++ b/gcc/config/aarch64/aarch64-option-extensions.def
> @@ -54,14 +54,14 @@
>   If a feature A appears in this list then the list implicitly includes
>   any features that are transitively dependent on A (according to 
> REQUIRES).
>  
> -   - FEAT_STRING is a string containing the entries in the 'Features' field 
> of
> - /proc/cpuinfo on a GNU/Linux system that correspond to this architecture
> - extension being available.  Sometimes multiple entries are needed to 
> enable
> - the extension (for example, the 'crypto' extension depends on four
> - entries: aes, pmull, sha1, sha2 being present).  In that case this field
> - should contain a space (" ") separated list of the strings in 'Features'
> - that are required.  Their order is not important.  An empty string means
> - do not detect this feature during auto detection.
> +   - FEATURE_STRING is a string containing the entries in the 'Features' 
> field
> + of /proc/cpuinfo on a GNU/Linux system that correspond to this
> + architecture extension being available.  Sometimes multiple entries are
> + needed to enable the extension (for example, the 'crypto' extension
> + depends on four entries: aes, pmull, sha1, sha2 being present).  In that
> + case this field should contain a space (" ") separated list of the 
> strings
> + in 'Features' that are required.  Their order is not important.  An 
> empty
> + string means do not detect this feature during auto detection.
>  
> - OPT_FLAGS is a list of feature IDENTS that should be enabled (along with
>   their transitive dependencies) when the specified FMV feature is 
> present.


Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread Jan Hubicka
> On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu  wrote:
> >
> > We can't instrument an IFUNC resolver nor its callees as it may require
> > TLS which hasn't been set up yet when the dynamic linker is resolving
> > IFUNC symbols.
> >
> > Add an IFUNC resolver caller marker to cgraph_node and set it if the
> > function is called by an IFUNC resolver.  Update tree_profiling to skip
> > functions called by IFUNC resolver.
> >
> > Tested with profiledbootstrap on Fedora 39/x86-64.
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/114115
> > * cgraph.h (symtab_node): Add check_ifunc_callee_symtab_nodes.
> > (cgraph_node): Add called_by_ifunc_resolver.
> > * cgraphunit.cc (symbol_table::compile): Call
> > symtab_node::check_ifunc_callee_symtab_nodes.
> > * symtab.cc (check_ifunc_resolver): New.
> > (ifunc_ref_map): Likewise.
> > (is_caller_ifunc_resolver): Likewise.
> > (symtab_node::check_ifunc_callee_symtab_nodes): Likewise.
> > * tree-profile.cc (tree_profiling): Do not instrument an IFUNC
> > resolver nor its callees.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/114115
> > * gcc.dg/pr114115.c: New test.
> 
> PING.

I am a bit worried about commonly used functions getting "infected" by
being called once from an ifunc resolver.  I think we only use thread local
storage for indirect call profiling, so we may just disable indirect
call profiling for these functions.
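
To make the "infection" concern concrete, here is a toy call graph, not the patch's code.  All names are hypothetical, and for brevity it marks callees starting from the resolver, whereas the patch walks callers; the set of marked functions is intended to be the same.

```cpp
// Toy call graph: calls[i][j] is true when function i calls function j.
constexpr int N = 4;
enum Fn { resolver = 0, cpu_probe = 1, memcpy_like = 2, app_main = 3 };

struct Marks {
  bool called_by_resolver[N] = {};
};

// Mark every function reachable from CALLER.  The "already marked" check
// also keeps the walk from looping on recursive calls.
constexpr void mark_callees(const bool (&calls)[N][N], int caller, Marks &m) {
  for (int callee = 0; callee < N; ++callee)
    if (calls[caller][callee] && !m.called_by_resolver[callee]) {
      m.called_by_resolver[callee] = true;
      mark_callees(calls, callee, m);  // transitive: callees of callees too
    }
}

constexpr Marks compute_marks() {
  bool calls[N][N] = {};
  calls[resolver][cpu_probe] = true;
  calls[cpu_probe][memcpy_like] = true;  // one call on the resolver side
  calls[app_main][memcpy_like] = true;   // the ordinary, hot caller
  Marks m;
  mark_callees(calls, resolver, m);
  return m;
}

// memcpy_like loses instrumentation everywhere because of a single call.
static_assert(compute_marks().called_by_resolver[memcpy_like], "");
static_assert(!compute_marks().called_by_resolver[app_main], "");
```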

Also the patch will be a no-op with -flto -flto-partition=max, so probably
we need to compute this flag at WPA time and stream to partitions.

Honza


PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread H.J. Lu
On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu  wrote:
>
> We can't instrument an IFUNC resolver nor its callees as it may require
> TLS which hasn't been set up yet when the dynamic linker is resolving
> IFUNC symbols.
>
> Add an IFUNC resolver caller marker to cgraph_node and set it if the
> function is called by an IFUNC resolver.  Update tree_profiling to skip
> functions called by IFUNC resolver.
>
> Tested with profiledbootstrap on Fedora 39/x86-64.
>
> gcc/ChangeLog:
>
> PR tree-optimization/114115
> * cgraph.h (symtab_node): Add check_ifunc_callee_symtab_nodes.
> (cgraph_node): Add called_by_ifunc_resolver.
> * cgraphunit.cc (symbol_table::compile): Call
> symtab_node::check_ifunc_callee_symtab_nodes.
> * symtab.cc (check_ifunc_resolver): New.
> (ifunc_ref_map): Likewise.
> (is_caller_ifunc_resolver): Likewise.
> (symtab_node::check_ifunc_callee_symtab_nodes): Likewise.
> * tree-profile.cc (tree_profiling): Do not instrument an IFUNC
> resolver nor its callees.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/114115
> * gcc.dg/pr114115.c: New test.
> ---
>  gcc/cgraph.h|  6 +++
>  gcc/cgraphunit.cc   |  2 +
>  gcc/symtab.cc   | 89 +
>  gcc/testsuite/gcc.dg/pr114115.c | 24 +
>  gcc/tree-profile.cc |  4 ++
>  5 files changed, 125 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr114115.c
>
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 47f35e8078d..a8c3224802c 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -479,6 +479,9 @@ public:
>   Return NULL if there's no such node.  */
>static symtab_node *get_for_asmname (const_tree asmname);
>
> +  /* Check symbol table for callees of IFUNC resolvers.  */
> +  static void check_ifunc_callee_symtab_nodes (void);
> +
>/* Verify symbol table for internal consistency.  */
>static DEBUG_FUNCTION void verify_symtab_nodes (void);
>
> @@ -896,6 +899,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : 
> public symtab_node
>redefined_extern_inline (false), tm_may_enter_irr (false),
>ipcp_clone (false), declare_variant_alt (false),
>calls_declare_variant_alt (false), gc_candidate (false),
> +  called_by_ifunc_resolver (false),
>m_uid (uid), m_summary_id (-1)
>{}
>
> @@ -1495,6 +1499,8 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : 
> public symtab_node
>   is set for local SIMD clones when they are created and cleared if the
>   vectorizer uses them.  */
>unsigned gc_candidate : 1;
> +  /* Set if the function is called by an IFUNC resolver.  */
> +  unsigned called_by_ifunc_resolver : 1;
>
>  private:
>/* Unique id of the node.  */
> diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
> index d200166f7e9..2bd0289ffba 100644
> --- a/gcc/cgraphunit.cc
> +++ b/gcc/cgraphunit.cc
> @@ -2317,6 +2317,8 @@ symbol_table::compile (void)
>
>symtab_node::checking_verify_symtab_nodes ();
>
> +  symtab_node::check_ifunc_callee_symtab_nodes ();
> +
>timevar_push (TV_CGRAPHOPT);
>if (pre_ipa_mem_report)
>  dump_memory_report ("Memory consumption before IPA");
> diff --git a/gcc/symtab.cc b/gcc/symtab.cc
> index 4c7e3c135ca..3256133891d 100644
> --- a/gcc/symtab.cc
> +++ b/gcc/symtab.cc
> @@ -1369,6 +1369,95 @@ symtab_node::verify (void)
>timevar_pop (TV_CGRAPH_VERIFY);
>  }
>
> +/* Return true and set *DATA to true if NODE is an ifunc resolver.  */
> +
> +static bool
> +check_ifunc_resolver (cgraph_node *node, void *data)
> +{
> +  if (node->ifunc_resolver)
> +{
> +  bool *is_ifunc_resolver = (bool *) data;
> +  *is_ifunc_resolver = true;
> +  return true;
> +}
> +  return false;
> +}
> +
> +static auto_bitmap ifunc_ref_map;
> +
> +/* Return true if any caller of NODE is an ifunc resolver.  */
> +
> +static bool
> +is_caller_ifunc_resolver (cgraph_node *node)
> +{
> +  bool is_ifunc_resolver = false;
> +
> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> +{
> +  /* Return true if caller is known to be an IFUNC resolver.  */
> +  if (e->caller->called_by_ifunc_resolver)
> +   return true;
> +
> +  /* Check for recursive call.  */
> +  if (e->caller == node)
> +   continue;
> +
> +  /* Skip if it has been visited.  */
> +  unsigned int uid = e->caller->get_uid ();
> +  if (bitmap_bit_p (ifunc_ref_map, uid))
> +   continue;
> +  bitmap_set_bit (ifunc_ref_map, uid);
> +
> +  if (is_caller_ifunc_resolver (e->caller))
> +   {
> + /* Return true if caller is an IFUNC resolver.  */
> + e->caller->called_by_ifunc_resolver = true;
> + return true;
> +   }
> +
> +  /* Check if caller's alias is an IFUNC resolver.  */
> +  e->caller->call_for_symbol_and_aliases (check_ifunc_resolver,
> + &is_ifunc_resolver,
> + 

Re: [PATCH][Backport][GCC10] Fix SSA corruption due to widening_mul opt on conflict across an abnormal edge [PR111407]

2024-04-02 Thread Qing Zhao


On Apr 2, 2024, at 03:06, Richard Biener wrote:

On Mon, Apr 1, 2024 at 3:36 PM Qing Zhao <qing.z...@oracle.com> wrote:

This is a bug in tree-ssa-math-opts.c: when applying the widening mul
optimization, the compiler needs to check whether an operand occurs in an
abnormal PHI; if so, we should avoid the transformation.

   PR tree-optimization/111407

gcc/ChangeLog:

   * tree-ssa-math-opts.c (convert_mult_to_widen): Avoid the transform
   when one of the operands is subject to abnormal coalescing.

gcc/testsuite/ChangeLog:

   * gcc.dg/pr111407.c: New test.

(cherry picked from commit 4aca1cfd6235090e48a53dab734437740671bbf3)

Bootstrapped and regression tested on both aarch64 and x86.

Okay for commit to GCC10?

Note the GCC 10 branch is closed.  If the patch bootstraps/tests on the
11, 12 and 13 branches it is OK there.  You do not need approval to
backport fixes for _regressions_ if the patch cherry-picks without
major edits and bootstraps/tests OK.

Thanks for the info.

I will commit the patches for GCC11, 12, and 13 soon.

Qing

Thanks,
Richard.

thanks.

Qing
---
gcc/testsuite/gcc.dg/pr111407.c | 21 +
gcc/tree-ssa-math-opts.c|  8 
2 files changed, 29 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/pr111407.c

diff --git a/gcc/testsuite/gcc.dg/pr111407.c b/gcc/testsuite/gcc.dg/pr111407.c
new file mode 100644
index ..a171074753f9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111407.c
@@ -0,0 +1,21 @@
+/* PR tree-optimization/111407*/
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+enum { SEND_TOFILE } __sigsetjmp();
+void fclose();
+void foldergets();
+void sendpart_stats(int *p1, int a1, int b1) {
+ int *a = p1;
+ fclose();
+ p1 = 0;
+ long t = b1;
+ if (__sigsetjmp()) {
+   {
+ long t1 = a1;
+ a1+=1;
+ fclose(a1*(long)t1);
+   }
+ }
+ if (p1)
+   fclose();
+}
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index dd0b8c6f0577..47981da20e05 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2543,6 +2543,14 @@ convert_mult_to_widen (gimple *stmt, gimple_stmt_iterator *gsi)
  if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
return false;

+  /* if any one of rhs1 and rhs2 is subject to abnormal coalescing,
+ avoid the tranform. */
+  if ((TREE_CODE (rhs1) == SSA_NAME
+   && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1))
+  || (TREE_CODE (rhs2) == SSA_NAME
+ && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs2)))
+return false;
+
  to_mode = SCALAR_INT_TYPE_MODE (type);
  from_mode = SCALAR_INT_TYPE_MODE (type1);
  if (to_mode == from_mode)
--
2.31.1



Re: [PATCH] libgcc: Add missing HWCAP entries to aarch64/cpuinfo.c

2024-04-02 Thread Richard Sandiford
Wilco Dijkstra  writes:
> A few HWCAP entries are missing from aarch64/cpuinfo.c.  This results in
> build errors on older machines.
>
> This counts as a trivial build fix, but since it's late in stage 4 I'll let
> maintainers chip in.
> OK for commit?
>
> libgcc/
> * config/aarch64/cpuinfo.c: Add HWCAP_EVTSTRM, HWCAP_CRC32,
> HWCAP_CPUID, HWCAP_PACA and HWCAP_PACG.

OK, thanks.

Richard

> ---
>
> diff --git a/libgcc/config/aarch64/cpuinfo.c b/libgcc/config/aarch64/cpuinfo.c
> index 3c6fb8a575b423c2aff71a1a9f40812b154ee284..4b94fca869507145ec690c825f637abbc82a3493 100644
> --- a/libgcc/config/aarch64/cpuinfo.c
> +++ b/libgcc/config/aarch64/cpuinfo.c
> @@ -52,15 +52,15 @@ struct {
>  #ifndef AT_HWCAP
>  #define AT_HWCAP 16
>  #endif
> -#ifndef HWCAP_CPUID
> -#define HWCAP_CPUID (1 << 11)
> -#endif
>  #ifndef HWCAP_FP
>  #define HWCAP_FP (1 << 0)
>  #endif
>  #ifndef HWCAP_ASIMD
>  #define HWCAP_ASIMD (1 << 1)
>  #endif
> +#ifndef HWCAP_EVTSTRM
> +#define HWCAP_EVTSTRM (1 << 2)
> +#endif
>  #ifndef HWCAP_AES
>  #define HWCAP_AES (1 << 3)
>  #endif
> @@ -73,6 +73,9 @@ struct {
>  #ifndef HWCAP_SHA2
>  #define HWCAP_SHA2 (1 << 6)
>  #endif
> +#ifndef HWCAP_CRC32
> +#define HWCAP_CRC32 (1 << 7)
> +#endif
>  #ifndef HWCAP_ATOMICS
>  #define HWCAP_ATOMICS (1 << 8)
>  #endif
> @@ -82,6 +85,9 @@ struct {
>  #ifndef HWCAP_ASIMDHP
>  #define HWCAP_ASIMDHP (1 << 10)
>  #endif
> +#ifndef HWCAP_CPUID
> +#define HWCAP_CPUID (1 << 11)
> +#endif
>  #ifndef HWCAP_ASIMDRDM
>  #define HWCAP_ASIMDRDM (1 << 12)
>  #endif
> @@ -133,6 +139,12 @@ struct {
>  #ifndef HWCAP_SB
>  #define HWCAP_SB (1 << 29)
>  #endif
> +#ifndef HWCAP_PACA
> +#define HWCAP_PACA (1 << 30)
> +#endif
> +#ifndef HWCAP_PACG
> +#define HWCAP_PACG (1UL << 31)
> +#endif
>
>  #ifndef HWCAP2_DCPODP
>  #define HWCAP2_DCPODP (1 << 0)


Re: [PATCH] libquadmath: printf: fix misaligned access on args

2024-04-02 Thread Florian Weimer
* Simon Chopin:

> On x86, this compiles into movdqa which segfaults on unaligned access.
>
> This kind of failure has been seen when running against glibc 2.39,
> which incidentally changed the printf implementation to move away from
> alloca() for this data to instead append it at the end of an existing
> "scratch buffer", with arbitrary alignment, whereas alloca() was
> probably more likely to be naturally aligned.

This glibc change appears to be incorrect.  I think we need to preserve
ABI alignment for types that can be passed through the vararg interface.
I'm not sure if this is easily possible, though.  It certainly needs a
discussion on libc-alpha.
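
For readers unfamiliar with the failure mode Simon describes, here is a minimal sketch in C of reading a 16-byte value out of a buffer with arbitrary alignment.  It copies the bytes with memcpy instead of dereferencing a cast pointer, so the compiler cannot assume 16-byte alignment for the load.  The type `wide16` and the function `load_wide16` are hypothetical stand-ins for illustration; they are not taken from libquadmath, glibc, or the patch under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte argument type with 16-byte
   alignment, such as __float128.  */
typedef struct
{
  _Alignas (16) unsigned char bytes[16];
} wide16;

/* Copy a wide16 out of SRC, which may have any alignment.  Using
   memcpy avoids promising the compiler that SRC is 16-byte aligned.  */
static wide16
load_wide16 (const void *src)
{
  wide16 value;
  assert (src != NULL);
  memcpy (&value, src, sizeof value);
  return value;
}
```

Whether the right fix belongs in the consumer (as sketched) or in the producer of the buffer is exactly the question raised above.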

Thanks,
Florian



Re: [PATCH] tree-optimization/114557 - reduce ehcleanup peak memory use

2024-04-02 Thread Jakub Jelinek
On Tue, Apr 02, 2024 at 02:06:27PM +0200, Richard Biener wrote:
> On Tue, 2 Apr 2024, Richard Biener wrote:
> 
> > The following reduces peak memory use for the PR114480 testcase at -O1
> > which is almost exclusively spent by the ehcleanup pass in allocating
> > PHI nodes.  The free_phinodes cache we maintain isn't very effective
> > since it has effectively two slots, one for 4 and one for 9 argument
> > PHIs and it is only ever used for allocations up to 9 arguments but
> > we put all larger PHIs in the 9 argument bucket.  This proves
> > > ineffective, resulting in much garbage being kept when incrementally
> > growing PHI nodes by edge redirection.
> > 
> > The mitigation is to rely on the GC freelist for larger sizes and
> > thus immediately return all larger bucket sized PHIs to it via ggc_free.
> > 
> > This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
> > from 359s to 168s.
> > 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > 
> > OK for trunk?  I'll leave more surgery for stage1.
> 
> Testing revealed another use-after-free.  Revised patch as follows.

LGTM if there aren't similar further issues.

> This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
> from 359s to 168s.
> 
>   PR tree-optimization/114557
>   PR tree-optimization/114480
>   * tree-phinodes.cc (release_phi_node): Return PHIs from
>   allocation buckets not covered by free_phinodes to GC.
>   (remove_phi_node): Release the PHI LHS before freeing the
>   PHI node.
>   * tree-vect-loop.cc (vectorizable_live_operation): Get PHI lhs
>   before releasing it.

Jakub



[PATCH] libgcc: Add missing HWCAP entries to aarch64/cpuinfo.c

2024-04-02 Thread Wilco Dijkstra

A few HWCAP entries are missing from aarch64/cpuinfo.c.  This results in build
errors on older machines.

This counts as a trivial build fix, but since it's late in stage 4 I'll let
maintainers chip in.
OK for commit?

libgcc/
* config/aarch64/cpuinfo.c: Add HWCAP_EVTSTRM, HWCAP_CRC32,
HWCAP_CPUID, HWCAP_PACA and HWCAP_PACG.
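
As an illustration of the fallback pattern this patch extends, here is a minimal sketch in C.  The bit values are the ones defined in the diff below; the helper `hwcap_has` is hypothetical and is not part of cpuinfo.c.

```c
#include <assert.h>

/* Fallback definitions, used only when the system headers are too old
   to provide them.  Values are the ones listed in the patch.  */
#ifndef HWCAP_EVTSTRM
#define HWCAP_EVTSTRM (1 << 2)
#endif
#ifndef HWCAP_CRC32
#define HWCAP_CRC32 (1 << 7)
#endif
#ifndef HWCAP_PACG
#define HWCAP_PACG (1UL << 31)
#endif

/* Hypothetical helper (not in cpuinfo.c): test one capability bit in a
   hwcap word, such as the value returned by getauxval (AT_HWCAP).  */
static int
hwcap_has (unsigned long hwcap, unsigned long bit)
{
  assert (bit != 0);
  return (hwcap & bit) != 0;
}
```

Bit 31 is written with 1UL, presumably because shifting a signed int 1 into the sign bit is not well defined in C when int is 32 bits wide.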

---

diff --git a/libgcc/config/aarch64/cpuinfo.c b/libgcc/config/aarch64/cpuinfo.c
index 3c6fb8a575b423c2aff71a1a9f40812b154ee284..4b94fca869507145ec690c825f637abbc82a3493 100644
--- a/libgcc/config/aarch64/cpuinfo.c
+++ b/libgcc/config/aarch64/cpuinfo.c
@@ -52,15 +52,15 @@ struct {
 #ifndef AT_HWCAP
 #define AT_HWCAP 16
 #endif
-#ifndef HWCAP_CPUID
-#define HWCAP_CPUID (1 << 11)
-#endif
 #ifndef HWCAP_FP
 #define HWCAP_FP (1 << 0)
 #endif
 #ifndef HWCAP_ASIMD
 #define HWCAP_ASIMD (1 << 1)
 #endif
+#ifndef HWCAP_EVTSTRM
+#define HWCAP_EVTSTRM (1 << 2)
+#endif
 #ifndef HWCAP_AES
 #define HWCAP_AES (1 << 3)
 #endif
@@ -73,6 +73,9 @@ struct {
 #ifndef HWCAP_SHA2
 #define HWCAP_SHA2 (1 << 6)
 #endif
+#ifndef HWCAP_CRC32
+#define HWCAP_CRC32 (1 << 7)
+#endif
 #ifndef HWCAP_ATOMICS
 #define HWCAP_ATOMICS (1 << 8)
 #endif
@@ -82,6 +85,9 @@ struct {
 #ifndef HWCAP_ASIMDHP
 #define HWCAP_ASIMDHP (1 << 10)
 #endif
+#ifndef HWCAP_CPUID
+#define HWCAP_CPUID (1 << 11)
+#endif
 #ifndef HWCAP_ASIMDRDM
 #define HWCAP_ASIMDRDM (1 << 12)
 #endif
@@ -133,6 +139,12 @@ struct {
 #ifndef HWCAP_SB
 #define HWCAP_SB (1 << 29)
 #endif
+#ifndef HWCAP_PACA
+#define HWCAP_PACA (1 << 30)
+#endif
+#ifndef HWCAP_PACG
+#define HWCAP_PACG (1UL << 31)
+#endif
 
 #ifndef HWCAP2_DCPODP
 #define HWCAP2_DCPODP (1 << 0)



Re: [PATCH] tree-optimization/114557 - reduce ehcleanup peak memory use

2024-04-02 Thread Richard Biener
On Tue, 2 Apr 2024, Richard Biener wrote:

> The following reduces peak memory use for the PR114480 testcase at -O1
> which is almost exclusively spent by the ehcleanup pass in allocating
> PHI nodes.  The free_phinodes cache we maintain isn't very effective
> since it has effectively two slots, one for 4 and one for 9 argument
> PHIs and it is only ever used for allocations up to 9 arguments but
> we put all larger PHIs in the 9 argument bucket.  This proves
> ineffective, resulting in much garbage being kept when incrementally
> growing PHI nodes by edge redirection.
> 
> The mitigation is to rely on the GC freelist for larger sizes and
> thus immediately return all larger bucket sized PHIs to it via ggc_free.
> 
> This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
> from 359s to 168s.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> OK for trunk?  I'll leave more surgery for stage1.

Testing revealed another use-after-free.  Revised patch as follows.
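
The ordering problem can be illustrated with a minimal C sketch: once releasing a node may free it immediately, any field that is still needed has to be read before the release.  The `struct node` and both functions are hypothetical and only model the idea; they are not GCC's PHI node API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node with one field that callers still need after the
   node itself has been released.  */
struct node
{
  int result;
};

static struct node *
make_node (int result)
{
  struct node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->result = result;
  return n;
}

/* Read the field first, then free the node.  Swapping the two
   statements would read freed memory.  */
static int
take_result_and_release (struct node *n)
{
  assert (n != NULL);
  int result = n->result;
  free (n);
  return result;
}
```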

Richard.

From 3507c14d05994eba5396492f08a919847b9e54ab Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Tue, 2 Apr 2024 12:31:04 +0200
Subject: [PATCH] tree-optimization/114557 - reduce ehcleanup peak memory use
To: gcc-patches@gcc.gnu.org

The following reduces peak memory use for the PR114480 testcase at -O1
which is almost exclusively spent by the ehcleanup pass in allocating
PHI nodes.  The free_phinodes cache we maintain isn't very effective
since it has effectively two slots, one for 4 and one for 9 argument
PHIs and it is only ever used for allocations up to 9 arguments but
we put all larger PHIs in the 9 argument bucket.  This proves
ineffective, resulting in much garbage being kept when incrementally
growing PHI nodes by edge redirection.

The mitigation is to rely on the GC freelist for larger sizes and
thus immediately return all larger bucket sized PHIs to it via ggc_free.

This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
from 359s to 168s.

PR tree-optimization/114557
PR tree-optimization/114480
* tree-phinodes.cc (release_phi_node): Return PHIs from
allocation buckets not covered by free_phinodes to GC.
(remove_phi_node): Release the PHI LHS before freeing the
PHI node.
* tree-vect-loop.cc (vectorizable_live_operation): Get PHI lhs
before releasing it.
---
 gcc/ggc-page.cc   |  6 ++
 gcc/tree-phinodes.cc  | 10 +-
 gcc/tree-vect-loop.cc |  2 +-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-phinodes.cc b/gcc/tree-phinodes.cc
index ddd731323e1..5a7e4a94e57 100644
--- a/gcc/tree-phinodes.cc
+++ b/gcc/tree-phinodes.cc
@@ -223,6 +223,14 @@ release_phi_node (gimple *phi)
   delink_imm_use (imm);
 }
 
+  /* Immediately return the memory to the allocator when we would
+ only ever re-use it for a smaller size allocation.  */
+  if (len - 2 >= NUM_BUCKETS - 2)
+{
+  ggc_free (phi);
+  return;
+}
+
   bucket = len > NUM_BUCKETS - 1 ? NUM_BUCKETS - 1 : len;
   bucket -= 2;
   vec_safe_push (free_phinodes[bucket], phi);
@@ -445,9 +453,9 @@ remove_phi_node (gimple_stmt_iterator *gsi, bool release_lhs_p)
 
   /* If we are deleting the PHI node, then we should release the
  SSA_NAME node so that it can be reused.  */
-  release_phi_node (phi);
   if (release_lhs_p)
 release_ssa_name (gimple_phi_result (phi));
+  release_phi_node (phi);
 }
 
 /* Remove all the phi nodes from BB.  */
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index f33629e9b04..984636edbc5 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10962,8 +10962,8 @@ vectorizable_live_operation (vec_info *vinfo, stmt_vec_info stmt_info,
 lhs_type, _gsi);
 
  auto gsi = gsi_for_stmt (use_stmt);
- remove_phi_node (&gsi, false);
  tree lhs_phi = gimple_phi_result (use_stmt);
+ remove_phi_node (&gsi, false);
  gimple *copy = gimple_build_assign (lhs_phi, new_tree);
  gsi_insert_before (_gsi, copy, GSI_SAME_STMT);
  break;
-- 
2.35.3



[PATCH] libphobos, Darwin: Enable libphobos for most Darwin.

2024-04-02 Thread Iain Sandoe
I have been building and testing D/libphobos for some time and over
some GCC and OS releases.  As discussed on IRC a while ago, I think
we're ready to enable this (it also avoids an annoying build fail at
stage 2 if one forgets to add the enable to the command line).

Also tested on x86_64 and powerpc64 linux gnu.

OK for trunk?
OK for backports?
thanks,
Iain

--- 8< ---

Earlier Darwin systems can be made to work too - but they need non-
standard 'binutils', so for now these must be enabled specifically.

libphobos/ChangeLog:

* configure.tgt: Enable libphobos for Darwin >= 12.

Signed-off-by: Iain Sandoe 
---
 libphobos/configure.tgt | 9 +
 1 file changed, 9 insertions(+)

diff --git a/libphobos/configure.tgt b/libphobos/configure.tgt
index 13879380416..7159688 100644
--- a/libphobos/configure.tgt
+++ b/libphobos/configure.tgt
@@ -27,6 +27,9 @@ case "${target}" in
   *-*-dragonfly*)
LIBPHOBOS_SUPPORTED=yes
;;
+  aarch64-*-darwin2*)
+   LIBPHOBOS_SUPPORTED=yes
+   ;;
   aarch64*-*-linux*)
LIBPHOBOS_SUPPORTED=yes
;;
@@ -58,6 +61,12 @@ case "${target}" in
   sparc*-*-solaris2.11*)
LIBPHOBOS_SUPPORTED=yes
;;
+  *-*-darwin9* | *-*-darwin1[01]*)
+   LIBDRUNTIME_ONLY=yes
+   ;;
+  x86_64-*-darwin1[2-9]* | x86_64-*-darwin2* | i?86-*-darwin1[2-7])
+   LIBPHOBOS_SUPPORTED=yes
+   ;;
   x86_64-*-freebsd* | i?86-*-freebsd*)
LIBPHOBOS_SUPPORTED=yes
;;
-- 
2.39.2 (Apple Git-143)



Ping: [PATCH] jit: Ensure ssize_t is defined.

2024-04-02 Thread Iain Sandoe


> On 29 Jan 2024, at 11:26, Iain Sandoe  wrote:

> I guess the solution here depends on the scope over which we expect
> the header to be used.
> 
>> On 28 Jan 2024, at 23:13, Iain Sandoe  wrote:
>>> On 28 Jan 2024, at 21:25, Eric Gallager  wrote:
>>> On Sun, Jan 28, 2024 at 6:45 AM Iain Sandoe  wrote:
 
>>>> Tested on i686, x86_64 Darwin, x86_64 Linux,
>>>> OK for trunk?
>>>> 
>>>> --- 8< ---
>>>> 
>>>> On some targets it seems that ssize_t is not defined by any of the
>>>> headers transitively included by <stdio.h>.  This leads to a bootstrap
>>>> fail when jit is enabled.
>>>> 
>>>> The fix proposed here is to include sys/types.h when it is available
>>>> since that is where POSIX specifies that ssize_t is defined.
>>>> 
>>>> gcc/jit/ChangeLog:
>>>> 
>>>>  * libgccjit.h: Conditionally include <sys/types.h> where it is
>>>>  available to ensure declaration of ssize_t.
>>>> 
>>>> Signed-off-by: Iain Sandoe 
>>>> ---
>>>> gcc/jit/libgccjit.h | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>> 
>>>> diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
>>>> index 235cab053e0..db4f27a48bf 100644
>>>> --- a/gcc/jit/libgccjit.h
>>>> +++ b/gcc/jit/libgccjit.h
>>>> @@ -21,6 +21,9 @@ along with GCC; see the file COPYING3.  If not see
>>>> #define LIBGCCJIT_H
>>>> 
>>>> #include <stdio.h>
>>>> +#if __has_include(<sys/types.h>)
>>> 
>>> Is __has_include() something that we can use unconditionally?
>> 
>> Hmm.. maybe we cannot, it seems it was introduced in gcc-4.9 and we only ask
>> for 4.8, IIRC.
>> 
>> I guess HAVE_SYS_TYPES_H might be an alternative (I’ll have to retest)
> 
> Answering my own question; no that is not going to work either since the
> header is installed and config.h is not.
> 
> I guess the question is “is this header ever [meaningfully] consumed by a
> compiler other than the current GCC that it supports”?
> 
> e.g. if we expected we could build libgccjit with clang in a
> “--disable-bootstrap” configuration and expect that to work?
> 

this … (as attached)

> The fallback is
> #ifdef __APPLE__
> # include <sys/types.h>  /* For ssize_t.  */
> #endif


> 
> (which I will test on a number of platform versions).
> 
> since this breaks bootstrap at stage 2 on affected platform versions, so we
> need some fix.
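
For reference, one possible shape of a guarded include is sketched below in C.  This is not the attached patch (which is not shown here); it only illustrates how `__has_include` can be probed so that a compiler lacking it never sees the call syntax.  The fallback branch for `__APPLE__` and `__unix__` and the function `count_bytes` are assumptions added for illustration, and the sketch assumes a host where <sys/types.h> exists.

```c
#include <assert.h>
#include <stdio.h>

/* Probe for __has_include with defined() first, and use it only in a
   nested #if, so older preprocessors do not choke on the syntax.  */
#if defined (__has_include)
# if __has_include (<sys/types.h>)
#  include <sys/types.h>  /* For ssize_t.  */
# endif
#elif defined (__APPLE__) || defined (__unix__)
# include <sys/types.h>   /* Assumed fallback for hosts without the probe.  */
#endif

/* Hypothetical function returning a byte count, or -1 on error, which
   is the usual reason an interface needs a signed size type.  */
static ssize_t
count_bytes (const char *s)
{
  if (s == NULL)
    return -1;
  ssize_t n = 0;
  while (s[n] != '\0')
    n++;
  assert (n >= 0);
  return n;
}
```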



0001-jit-Ensure-ssize_t-is-defined.patch
Description: Binary data




Re: [PATCH] jit, Darwin: Implement library exports list.

2024-04-02 Thread Iain Sandoe
Hi David,

> On 25 Jan 2024, at 10:16, Iain Sandoe  wrote:
> 

>> On 24 Jan 2024, at 18:31, David Malcolm  wrote:
>> 
>> On Tue, 2024-01-16 at 11:10 +, Iain Sandoe wrote:
>>> Tested on x86_64, i686 Darwin and x86_64 Linux,
>>> OK for trunk? when ?
>>> thanks,
>>> Iain
>> 
>> Hi Iain, thanks for the patch.
>> 
>> I'll have to defer to your Darwin expertise here; given that you've
>> tested it on the above configurations I'll assume it's correct, but...
>> 
>>> 
>>> --- 8< ---
>>> 
>>> Currently, we have no exports list for libgccjit, which means that
>>> all symbols are exported, including those from libstdc++ which is
>>> linked statically into the lib.  This causes failures when the
>>> shared libstdc++ is used but some c++ symbols are satisfied from
>>> libgccjit.
>>> 
>>> This implements an export file for Darwin (which is currently
>>> manually created by cross-checking libgccjit.map).
>> 
>> ...I'm a little nervous about this; Antoyo has a number of out-of-tree
>> patches we're working towards merging, and almost all of these touch
>> libgccjit.map.
>> 
>> 
>>>  Ideally we'd
>>> script this, at some point.  
>> 
>> Yes.  How about a Python 3 script (inside "contrib", or in "gcc/jit")
>> that would do that.  
> 
> I’m not sure we want to make a build dependency on Python 3.. 
> the reason I say ‘build’ is ...
> 
>> Then whenever a patch touches libgccjit.map we'd
>> run that script to regenerate libgccjit.exp in the source tree.  I can
>> have a go at writing it, if you think that's the best way to go.
> 
> … there are two other places in the current sources where ld map files
> are converted to Darwin (and Solaris) symbol export files [libgcc, libstdc++].
> 
> In these cases, the export file is created on-the-fly at build time by scripts
> (IIRC a mixture of awk, sh, perl).
> 
> This requires more surgery to the Make stuff and that we have a suitable
> script - but it does mean that we do not need to commit the Darwin (and
> potentially Solaris) versions to the source tree.
> 
> I’m actually happy with either solution - since we do not expect this to
> be a daily occurrence, we could have a maintainer’s python3 script to
> update a committed export file or we could try the mechanism used in
> the other two places (but then the script would need to use awk/sh/perl
> to avoid new build deps).
> 
>> I take it .exp is the standard extension for these exports file in the
>> Darwin world.  If so, it's a shame (but unavoidable) that it clashes
>> with the existing uses of .exp in our source tree for our
>> expect/Tcl/DejaGnu sources.
> 
> I suspect the linker will accept other extensions, although ‘exp’ is a
> convention used elsewhere, it is unfortunate that it clashes indeed.
> - let me try an alternate (e.g. .export) and report back.
> 
>> I think the patch as-is is OK for trunk now, assuming that you've
>> tested it as above.
> 
> I’m going to hold off on this for now (but do want some solution before
> 14 branches, because there are quite a few new fails from it).

It seems that we are not going to get time to implement something better
for GCC-14; this is what I applied (I renamed the extension to .exports 
which the linker is fine with) so that it is not confused with .exp files.

I guess I’ll need to do a final pass of checking I’ve copied all the syms
before 14 branches.

thanks
Iain



0001-jit-Darwin-Implement-library-exports-list.patch
Description: Binary data




[pushed] testsuite: Remove duplicate -lgcov [PR114034]

2024-04-02 Thread Iain Sandoe
Tested on x86_64, i686 Darwin and x86_64, powerpc64 linux, pushed to
trunk as obvious, thanks
Iain

--- 8< ---

Duplicate library entries now cause linker warnings with newer linker
versions on Darwin which leads to these tests regressing.  The library
is already added by the test flags so there is no need to put an extra
one in the options.

PR testsuite/114034

gcc/testsuite/ChangeLog:

* g++.dg/gcov/gcov-dump-1.C: Remove extra -lgcov.
* g++.dg/gcov/gcov-dump-2.C: Likewise.

Signed-off-by: Iain Sandoe 
---
 gcc/testsuite/g++.dg/gcov/gcov-dump-1.C | 2 +-
 gcc/testsuite/g++.dg/gcov/gcov-dump-2.C | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/g++.dg/gcov/gcov-dump-1.C b/gcc/testsuite/g++.dg/gcov/gcov-dump-1.C
index f0e81e9b042..774a7269ff2 100644
--- a/gcc/testsuite/g++.dg/gcov/gcov-dump-1.C
+++ b/gcc/testsuite/g++.dg/gcov/gcov-dump-1.C
@@ -1,4 +1,4 @@
-/* { dg-options "-fprofile-generate -ftest-coverage -lgcov" } */
+/* { dg-options "-fprofile-generate -ftest-coverage " } */
 /* { dg-do run { target native } } */
 
 int value;
diff --git a/gcc/testsuite/g++.dg/gcov/gcov-dump-2.C b/gcc/testsuite/g++.dg/gcov/gcov-dump-2.C
index 6234a81a586..e748989d2c0 100644
--- a/gcc/testsuite/g++.dg/gcov/gcov-dump-2.C
+++ b/gcc/testsuite/g++.dg/gcov/gcov-dump-2.C
@@ -1,4 +1,4 @@
-/* { dg-options "-fprofile-generate -ftest-coverage -lgcov" } */
+/* { dg-options "-fprofile-generate -ftest-coverage " } */
 /* { dg-do run { target native } } */
 
 int value;
-- 
2.39.2 (Apple Git-143)



[pushed] testsuite, Darwin: Allow for an undefined symbol [PR114036].

2024-04-02 Thread Iain Sandoe
Tested on x86_64-darwin17,21,23 and on x86_64 and powerpc64 linux gnu,
pushed to trunk, thanks
Iain

--- 8< ---

Darwin's linker defaults to requiring all symbols to be defined at
static link time (unless specifically noted or dynamic lookup is
enabled).

For this test, we just need to note that the symbol is expected to
be undefined.
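
For context, the kind of reference involved looks roughly like the C sketch below: a symbol declared weak and never defined, whose address is tested before use.  The name `Foo` is the one the test uses; the wrapper `call_foo_if_present` is hypothetical.  On ELF targets this links with the address resolving to null, whereas on Darwin the link line needs to be told that `_Foo` may stay undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Declared weak and deliberately left undefined in this program.  */
extern int Foo (void) __attribute__ ((weak));

/* Hypothetical wrapper: call Foo only if the symbol was resolved,
   otherwise report its absence with -1.  */
static int
call_foo_if_present (void)
{
  if (Foo != NULL)
    return Foo ();
  return -1;
}
```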

PR testsuite/114036

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-14.c: Allow for 'Foo' to be undefined
on Darwin link lines.

Signed-off-by: Iain Sandoe 
---
 gcc/testsuite/gcc.misc-tests/gcov-14.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-14.c b/gcc/testsuite/gcc.misc-tests/gcov-14.c
index 2bebf7e4a93..61a9191c068 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-14.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-14.c
@@ -3,7 +3,7 @@
 /* { dg-do run { target native } } */
 /* { dg-options "-O2 -fprofile-arcs -ftest-coverage -fgnu89-inline" } */
 /* The following line arranges that Darwin has behavior like elf weak import.  */
-/* { dg-additional-options "-flat_namespace -undefined suppress" { target *-*-darwin* }  } */
+/* { dg-additional-options "-Wl,-U,_Foo" { target *-*-darwin* }  } */
 /* { dg-require-weak "" } */
 /* { dg-skip-if "undefined weak not supported" { { hppa*-*-hpux* } && { ! lp64 } } } */
 /* { dg-skip-if "undefined weak not supported" { powerpc-ibm-aix* } } */
-- 
2.39.2 (Apple Git-143)



[PATCH] tree-optimization/114557 - reduce ehcleanup peak memory use

2024-04-02 Thread Richard Biener
The following reduces peak memory use for the PR114480 testcase at -O1
which is almost exclusively spent by the ehcleanup pass in allocating
PHI nodes.  The free_phinodes cache we maintain isn't very effective
since it has effectively two slots, one for 4 and one for 9 argument
PHIs and it is only ever used for allocations up to 9 arguments but
we put all larger PHIs in the 9 argument bucket.  This proves
ineffective, resulting in much garbage being kept when incrementally
growing PHI nodes by edge redirection.

The mitigation is to rely on the GC freelist for larger sizes and
thus immediately return all larger bucket sized PHIs to it via ggc_free.

This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
from 359s to 168s.
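
A small C model of the release decision described above, as a sketch only.  It mirrors the condition used in the patch (`len - 2 >= NUM_BUCKETS - 2`); the value 10 for `NUM_BUCKETS` is an assumption inferred from the "up to 9 arguments" wording, and the function `phi_release_bucket` is hypothetical, not part of tree-phinodes.cc.

```c
#include <assert.h>

/* Assumed value; the description implies free-list buckets for PHIs
   with up to 9 argument slots.  */
#define NUM_BUCKETS 10

/* Return the free-list bucket for a released PHI with LEN argument
   slots, or -1 when the patched code would return the memory to the
   garbage collector right away instead of caching it.  */
static int
phi_release_bucket (int len)
{
  assert (len >= 2);
  if (len - 2 >= NUM_BUCKETS - 2)
    return -1;
  return len - 2;
}
```

Under these assumptions a PHI with 4 slots is cached in bucket 2, one with 9 slots in bucket 7, and anything with 10 or more slots is freed immediately.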

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK for trunk?  I'll leave more surgery for stage1.

Thanks,
Richard.

PR tree-optimization/114557
PR tree-optimization/114480
* tree-phinodes.cc (release_phi_node): Return PHIs from
allocation buckets not covered by free_phinodes to GC.
(remove_phi_node): Release the PHI LHS before freeing the
PHI node.
---
 gcc/tree-phinodes.cc | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-phinodes.cc b/gcc/tree-phinodes.cc
index ddd731323e1..5a7e4a94e57 100644
--- a/gcc/tree-phinodes.cc
+++ b/gcc/tree-phinodes.cc
@@ -223,6 +223,14 @@ release_phi_node (gimple *phi)
   delink_imm_use (imm);
 }
 
+  /* Immediately return the memory to the allocator when we would
+ only ever re-use it for a smaller size allocation.  */
+  if (len - 2 >= NUM_BUCKETS - 2)
+{
+  ggc_free (phi);
+  return;
+}
+
   bucket = len > NUM_BUCKETS - 1 ? NUM_BUCKETS - 1 : len;
   bucket -= 2;
   vec_safe_push (free_phinodes[bucket], phi);
@@ -445,9 +453,9 @@ remove_phi_node (gimple_stmt_iterator *gsi, bool release_lhs_p)
 
   /* If we are deleting the PHI node, then we should release the
  SSA_NAME node so that it can be reused.  */
-  release_phi_node (phi);
   if (release_lhs_p)
 release_ssa_name (gimple_phi_result (phi));
+  release_phi_node (phi);
 }
 
 /* Remove all the phi nodes from BB.  */
-- 
2.35.3


[pushed] Darwin: Correct a version check.

2024-04-02 Thread Iain Sandoe
Tested on x86_64-darwin17,21,23, pushed to trunk, thanks,
Iain

--- 8< ---

When the version for dsymutil comes from a clang build, it is
of the form NNmm.pp.qq where NN and mm are the major and minor
LLVM version components.  We need to check for a major version
greater than or equal to 7 - so use 700 in the check.
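
The arithmetic can be sketched in C as follows.  Both helpers are hypothetical and only model the description above; they are not the code in darwin.cc, which compares the parsed leading component directly.

```c
#include <assert.h>

/* Hypothetical helper: recover the LLVM major version from the leading
   "NNmm" component of a clang-derived dsymutil version.  */
static int
clang_leading_to_major (int leading)
{
  assert (leading >= 0);
  return leading / 100;
}

/* Hypothetical helper: a major version of 7 or later corresponds to a
   leading component of at least 700.  */
static int
clang_dsymutil_is_7_plus (int leading)
{
  assert (leading >= 0);
  return leading >= 700;
}
```

With this encoding, a check against 7 would accept every clang-derived version, which is why the comparison value has to be 700.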

gcc/ChangeLog:

* config/darwin.cc (darwin_override_options): Update the
clang major version value in the dsymutil check.

Signed-off-by: Iain Sandoe 
---
 gcc/config/darwin.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/darwin.cc b/gcc/config/darwin.cc
index c37a1a4756f..63b8c509405 100644
--- a/gcc/config/darwin.cc
+++ b/gcc/config/darwin.cc
@@ -3420,7 +3420,7 @@ darwin_override_options (void)
   /* External toolchains based on LLVM or clang 7+ have support for
 dwarf-4.  */
   if ((dsymutil_version.kind == LLVM && dsymutil_version.major >= 7)
- || (dsymutil_version.kind == CLANG && dsymutil_version.major >= 7))
+ || (dsymutil_version.kind == CLANG && dsymutil_version.major >= 700))
dwarf_version = 4;
   else if (dsymutil_version.kind == DWARFUTILS
   && dsymutil_version.major >= 121)
-- 
2.39.2 (Apple Git-143)



Re: [PATCH] Fix up duplicated words mostly in comments, part 1

2024-04-02 Thread Richard Biener
On Tue, 2 Apr 2024, Jakub Jelinek wrote:

> Hi!
> 
> Like in r12-7519-g027e30414492d50feb2854aff38227b14300dc4b, I've done
> git grep -v 'long long\|optab optab\|template template\|double double' | grep ' \([a-zA-Z]\+\) \1 '
> 
> This is just part of the changes, mostly for non-gcc directories.
> I'll try to get to the rest soon.  Obviously, the above command also
> finds cases which are correct as is and shouldn't be changed, so one
> needs to manually inspect everything.
> 
> I'd hope most of it is pretty obvious, but the config/ and libstdc++-v3/
> hunks include a tweak in a license wording, though other copies of the
> similar license have the wording right.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK

> 2024-04-02  Jakub Jelinek  
> 
>   * Makefile.tpl: Fix duplicated words; returns returns ->
>   returns.
> config/
>   * lcmessage.m4: Fix duplicated words; can can -> can,
>   package package -> package.
> libdecnumber/
>   * decCommon.c (decFinalize): Fix duplicated words in
>   comment; the the -> the.
> libgcc/
>   * unwind-dw2-fde.c (struct fde_accumulator): Fix duplicated
>   words in comment; is is -> is.
> libgfortran/
>   * configure.host: Fix duplicated words; the the -> the.
> libgm2/
>   * configure.host: Fix duplicated words; the the -> the.
> libgomp/
>   * libgomp.texi (OpenMP 5.2): Fix duplicated words; with with ->
>   with.
>   (omp_target_associate_ptr): Fix duplicated words; either either ->
>   either.
>   (omp_init_allocator): Fix duplicated words; be be -> be.
>   (omp_realloc): Fix duplicated words; is is -> is.
>   (OMP_ALLOCATOR): Fix duplicated words; other other -> other.
>   * priority_queue.h (priority_queue_multi_p): Fix duplicated words;
>   to to -> to.
> libiberty/
>   * regex.c (byte_re_match_2_internal): Fix duplicated words in comment;
>   next next -> next.
>   * dyn-string.c (dyn_string_init): Fix duplicated words in comment;
>   of of -> of.
> libitm/
>   * beginend.cc (GTM::gtm_thread::begin_transaction): Fix duplicated
>   words in comment; not not -> not to.
> libobjc/
>   * init.c (duplicate_classes): Fix duplicated words in comment; in in
>   -> in.
>   * sendmsg.c (__objc_prepare_dtable_for_class): Fix duplicated words
>   in comment; the the -> the.
>   * encoding.c (objc_layout_structure): Likewise.
> libstdc++-v3/
>   * acinclude.m4: Fix duplicated words; file file -> file can.
>   * configure.host: Fix duplicated words; the the -> the.
> libvtv/
>   * vtv_rts.cc (vtv_fail): Fix duplicated words; to to -> to.
>   * vtv_fail.cc (vtv_fail): Likewise.
> 
> --- Makefile.tpl.jj   2024-01-10 12:19:07.609682386 +0100
> +++ Makefile.tpl  2024-03-28 15:38:31.471917678 +0100
> @@ -1976,7 +1976,7 @@ configure-target-[+module+]: maybe-all-g
> (define dep-maybe (lambda ()
>(if (exist? "hard") "" "maybe-")))
>  
> -   ;; dep-kind returns returns "prebootstrap" for configure or build
> +   ;; dep-kind returns "prebootstrap" for configure or build
> ;; dependencies of bootstrapped modules on a build module
> ;; (e.g. all-gcc on all-build-bison); "normal" if the dependency is
> ;; on an "install" target, or if the dependence module is not
> --- config/lcmessage.m4.jj2020-01-11 16:31:53.155321678 +0100
> +++ config/lcmessage.m4   2024-03-28 16:01:40.879060037 +0100
> @@ -6,13 +6,13 @@ dnl Public License, this file may be dis
>  dnl that contains a configuration script generated by Autoconf, under
>  dnl the same distribution terms as the rest of that program.
>  dnl
> -dnl This file can can be used in projects which are not available under
> +dnl This file can be used in projects which are not available under
>  dnl the GNU General Public License or the GNU Library General Public
>  dnl License but which still want to provide support for the GNU gettext
>  dnl functionality.
>  dnl Please note that the actual code of the GNU gettext library is covered
>  dnl by the GNU Library General Public License, and the rest of the GNU
> -dnl gettext package package is covered by the GNU General Public License.
> +dnl gettext package is covered by the GNU General Public License.
>  dnl They are *not* in the public domain.
>  
>  dnl Authors:
> --- libdecnumber/decCommon.c.jj   2024-01-03 12:07:28.096370943 +0100
> +++ libdecnumber/decCommon.c  2024-03-28 16:00:26.576068973 +0100
> @@ -388,7 +388,7 @@ static decFloat * decFinalize(decFloat *
>   UBFROMUI(ub-3, 0);   /* to  */
>   }
> /* [note ub could now be to left of msd, and it is not safe */
> -   /* to write to the the left of the msd] */
> +   /* to write to the left of the msd] */
> /* now at most 3 digits left to non-9 (usually just the one) */
> for (; ub>=umsd; *ub=0, ub--) {
>   if (*ub==9) continue;/* carry */
> --- 

[pushed] Darwin: Do not emit .macinfo when dsymutil cannot consume it.

2024-04-02 Thread Iain Sandoe
Emitting .macinfo causes quite a number of testsuite failures on systems
using Xcode 15.  More significantly, it is a serious debug regression (since
all debug info is ignored when macinfo is seen).

tested on x86_64-darwin17,21,23 with / without Xcode-15, pushed to trunk,
thanks
Iain

--- 8< ---

Some versions of dsymutil do not ignore .macinfo sections, but instead
ignore all of the debug info in the file.

To avoid this total loss of debug, when we detect that the debug level
is g3 and the dsymutil version cannot support it, we reduce the level
to g2 and issue a note.

This behaviour can be overridden by -gstrict-dwarf (although the objects
will contain macinfo; dsymutil will not produce a .dSYM with it).
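
The decision described above can be sketched as a small pure function.  This
is only a simplified model under stated assumptions: the enum, struct and
function names are hypothetical and are not GCC's real interfaces, and the
version thresholds are simply the ones that appear in the patch below.

```cpp
// Hypothetical, simplified model of the decision; none of these names
// exist in GCC.
enum class DebugLevel { None, Terse, Normal, Verbose };  // Verbose ~ -g3
enum class ToolKind { Unknown, Clang, Llvm };

struct ToolVersion {
  ToolKind kind;
  int major;
};

// True when this dsymutil is assumed to discard a file's debug info on
// seeing a .macinfo section (thresholds copied from the patch).
constexpr bool drops_debug_on_macinfo(ToolVersion v) {
  return (v.kind == ToolKind::Clang && v.major >= 1500)
      || (v.kind == ToolKind::Llvm && v.major >= 15);
}

// strict_set / strict_value model whether the user gave an explicit
// strict-DWARF option and its value.  Only an explicit, non-zero setting
// keeps -g3 when the tool cannot consume .macinfo.
constexpr DebugLevel effective_debug_level(DebugLevel requested,
                                           bool strict_set,
                                           int strict_value,
                                           ToolVersion dsymutil) {
  return (requested == DebugLevel::Verbose
          && !(strict_set && strict_value != 0)
          && drops_debug_on_macinfo(dsymutil))
             ? DebugLevel::Normal   // reduce g3 to g2
             : requested;           // leave the level unchanged
}
```

In this model the downgrade is decided before any default for strict DWARF
is applied, which appears to be why the patch moves the dwarf_strict default
below the new check.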

gcc/ChangeLog:

* config/darwin.cc (darwin_override_options): Reduce the debug
level to 2 if dsymutil cannot handle .macinfo sections.

Signed-off-by: Iain Sandoe 
---
 gcc/config/darwin.cc | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/gcc/config/darwin.cc b/gcc/config/darwin.cc
index 9e5d64e6f32..c37a1a4756f 100644
--- a/gcc/config/darwin.cc
+++ b/gcc/config/darwin.cc
@@ -3415,11 +3415,6 @@ darwin_override_options (void)
  global_options.x_flag_objc_abi);
 }
 
-  /* Limit DWARF to the chosen version, the linker and debug linker might not
- be able to consume newer structures.  */
-  if (!OPTION_SET_P (dwarf_strict))
-dwarf_strict = 1;
-
   if (!OPTION_SET_P (dwarf_version))
 {
   /* External toolchains based on LLVM or clang 7+ have support for
@@ -3442,6 +3437,24 @@ darwin_override_options (void)
   OPTION_SET_P (dwarf_split_debug_info) = 0;
 }
 
+  /* Cases where dsymutil will exclude files with .macinfo sections; we are
+ better off forcing the debug level to 2 than completely excluding the
+ files.  If strict dwarf is set, then emit the macinfo anyway.  */
+  if (debug_info_level == DINFO_LEVEL_VERBOSE
+  && (!OPTION_SET_P (dwarf_strict) || dwarf_strict == 0)
+  && ((dsymutil_version.kind == CLANG && dsymutil_version.major >= 1500)
+ || (dsymutil_version.kind == LLVM && dsymutil_version.major >= 15)))
+{
+  inform (input_location,
+ "%<-g3%> is not supported by the debug linker in use (set to 2)");
+  debug_info_level = DINFO_LEVEL_NORMAL;
+}
+
+  /* Limit DWARF to the chosen version, the linker and debug linker might not
+ be able to consume newer structures.  */
+  if (!OPTION_SET_P (dwarf_strict))
+dwarf_strict = 1;
+
   /* Do not allow unwind tables to be generated by default for m32.
  fnon-call-exceptions will override this, regardless of what we do.  */
   if (generating_for_darwin_version < 10
-- 
2.39.2 (Apple Git-143)



Re: [PATCH] Fix up postboot dependencies [PR106472]

2024-04-02 Thread Richard Biener
On Tue, 2 Apr 2024, Jakub Jelinek wrote:

> On Wed, Mar 13, 2024 at 10:13:37AM +0100, Jakub Jelinek wrote:
> > While the first Makefile.tpl hunk looks obviously ok, the others look
> > completely wrong to me.
> > There is nothing special about libgo vs. libbacktrace/libatomic
> > compared to any other target library which is not bootstrapped vs. any
> > of its dependencies which are in the bootstrapped set.
> > So, Makefile.tpl shouldn't hardcode such dependencies.
> 
> Here is my version of the fix.
> The dependencies in the toplevel Makefile simply didn't take into account
> that some target modules could be in a bootstrapped build built in some
> configurations as bootstrap modules (typically as dependencies of other
> target bootstrap modules), while in other configurations just as
> dependencies of non-bootstrap target modules and so not built during the
> bootstrap, but after it.
> Makefile.tpl arranges for those postboot target module -> target module
> dependencies to be emitted only inside of an @unless gcc-bootstrap block,
> while for @if gcc-bootstrap it just emits
> configure-target-whatever: stage_last
> dependencies which ensure those postbootstrap target modules are only built
> after everything that is bootstrapped has been.
> 
> Now, the libbacktrace/libatomic target modules have bootstrap=true
> target_modules = { module= libbacktrace; bootstrap=true; };
> target_modules = { module= libatomic; bootstrap=true; lib_path=.libs; };
> because those modules are dependencies of libphobos target module, so
> when d is included among bootstrapped languages, those are all bootstrapped
> and everything works correctly.
> While if d is not included, libphobos target module is disabled,
> libbacktrace/libatomic target modules aren't bootstrapped, nothing during
> bootstrap needs them, but post bootstrap libgo target module depends on
> the libatomic and libbacktrace target modules, libgfortran target module
> depends on the libbacktrace target module and libgm2 target module depends
> on the libatomic target module, but those dependencies were emitted only
> @unless gcc-bootstrap.  There is a similar theoretical problem for zlib
> target module if GCJ would be resurrected, libphobos as bootstrap target
> module depends on the zlib target module, but if d is not configured,
> fastjar also depends on it.
> 
> The following patch arranges for the @if gcc-bootstrap case to emit also
> target module -> target module dependencies, but conditionally on the
> on dependency not being bootstrapped.
> 
> In the generated Makefile.in you can see what the Makefile.tpl change
> produces and that it just adds extra dependencies which weren't there
> before in the @if gcc-bootstrap case.
> 
> I've bootstrapped without this patch with
> ../configure --enable-languages=c,c++,go; make
> on x86_64-linux (note, make -j2 or higher usually worked) which failed
> as described in the PR, then with this patch with the same command which
> built fine and the Makefile difference between the two builds being
> diff -up obj40{a,b}/Makefile
> --- obj40a/Makefile   2024-03-31 00:35:22.243791499 +0100
> +++ obj40b/Makefile   2024-03-31 22:40:38.143299144 +0200
> @@ -29376,6 +29376,14 @@ configure-bison: stage_last
>  configure-flex: stage_last
>  configure-m4: stage_last
>  
> +configure-target-fastjar: maybe-configure-target-zlib
> +all-target-fastjar: maybe-all-target-zlib
> +all-target-libgo: maybe-all-target-libbacktrace
> +all-target-libgo: maybe-all-target-libatomic
> +all-target-libgm2: maybe-all-target-libatomic
> +configure-target-libgfortran: maybe-all-target-libbacktrace
> +configure-target-libgo: maybe-all-target-libbacktrace
> +
>  
>  # Dependencies for target modules on other target modules are
>  # described by lang_env_dependencies; the defaults apply to anything
> 
> which I believe are exactly the extra dependencies we want.
> Plus I've done normal x86_64-linux and i686-linux bootstraps/regtests
> which in my case include 
> --enable-languages=default,ada,obj-c++,lto,go,d,rust,m2
> for x86_64 and the same except ada for i686; those with my usual make -j32.
> The Makefile difference in those builds vs. unpatched case
> is just an extra empty line.
> 
> Ok for trunk?

OK.

Richard.

> 2024-04-02  Jakub Jelinek  
> 
>   PR bootstrap/106472
>   * Makefile.tpl (make-postboot-target-dep): New lambda.
>   Use it to add --enable-bootstrap dependencies of target modules
>   on other target modules if the latter aren't bootstrapped.
>   * Makefile.in: Regenerate.
> 
> --- Makefile.tpl.jj   2024-01-09 22:40:16.812824317 +0100
> +++ Makefile.tpl  2024-03-30 14:23:51.985398859 +0100
> @@ -2013,6 +2013,25 @@ configure-target-[+module+]: maybe-all-g
>(unless (=* target "target-")
> (string-append "configure-" target ": " dep "\n"))
>  
> +   ;; Dependencies in between target modules if the dependencies
> +   ;; are bootstrap target modules and the target modules which
> +   ;; 

[pushed] testsuite, Darwin: Update bad-mapper-1 after libiberty changes.

2024-04-02 Thread Iain Sandoe
Tested on i686-darwin9, x86_64-darwin17, 21, 23, and on x86_64 and powerpc64
linux gnu, pushed to trunk, thanks,
Iain

--- 8< ---

A recent change to libiberty has improved the process spawning on
older Darwin platforms.  This patch updates the expected test output
after the changes.

gcc/testsuite/ChangeLog:

* g++.dg/modules/bad-mapper-1.C: Update expected test output
for earlier Darwin.

Signed-off-by: Iain Sandoe 
---
 gcc/testsuite/g++.dg/modules/bad-mapper-1.C | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/g++.dg/modules/bad-mapper-1.C 
b/gcc/testsuite/g++.dg/modules/bad-mapper-1.C
index b0b0b86c9cd..3dfb5a6073e 100644
--- a/gcc/testsuite/g++.dg/modules/bad-mapper-1.C
+++ b/gcc/testsuite/g++.dg/modules/bad-mapper-1.C
@@ -1,9 +1,9 @@
 //  { dg-additional-options "-fmodules-ts -fmodule-mapper=|this-will-not-work" 
}
 import unique1.bob;
-// { dg-error "-:failed (exec|CreateProcess|posix_spawn).*mapper.* 
.*this-will-not-work" "" { target { ! { *-*-darwin[89]* *-*-darwin10* 
hppa*-*-hpux* } } } 0 }
+// { dg-error "-:failed (exec|CreateProcess|posix_spawn).*mapper.* 
.*this-will-not-work" "" { target { ! { hppa*-*-hpux* } } } 0 }
 // { dg-prune-output "fatal error:" }
 // { dg-prune-output "failed to read" }
 // { dg-prune-output "compilation terminated" }
-// { dg-error "-:failed mapper handshake communication" "" { target { 
*-*-darwin[89]* *-*-darwin10* hppa*-*-hpux* } } 0 }
+// { dg-error "-:failed mapper handshake communication" "" { target { 
hppa*-*-hpux* } } 0 }
 // { dg-prune-output "trying to exec .this-will-not-work."  }
 // { dg-prune-output "unknown Compiled Module Interface"  }
-- 
2.39.2 (Apple Git-143)



[PING] Re: [PATCH 1/2] ivopts: Revert computation of address cost complexity.

2024-04-02 Thread Aleksandar Rakic
This is a reminder that the patch for the computation of complexity for
unsupported addressing modes ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109429 )
has been sent:
https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647966.html


[PATCH][wwwdocs] changes.html changes for AArch64 for GCC 14.1

2024-04-02 Thread Kyrylo Tkachov
Hi all,

Here's a writeup of the AArch64 changes to highlight in GCC 14.1.
If there's something you'd like to highlight feel free to comment or add
a patch yourself. I don't expect the list to be exhaustive.

It's been a busy release for AArch64!
Thanks,
Kyrill


gcc-14-aarch64-wwwdocs.patch
Description: gcc-14-aarch64-wwwdocs.patch


Re: [PATCH] Fix up duplicated words mostly in comments, part 1

2024-04-02 Thread Jonathan Wakely
On Tue, 2 Apr 2024 at 08:47, Jakub Jelinek wrote:
>
> Hi!
>
> Like in r12-7519-g027e30414492d50feb2854aff38227b14300dc4b, I've done
> git grep -v 'long long\|optab optab\|template template\|double double' | grep 
> ' \([a-zA-Z]\+\) \1 '
>
> This is just part of the changes, mostly for non-gcc directories.
> I'll try to get to the rest soon.  Obviously, the above command also
> finds cases which are correct as is and shouldn't be changed, so one
> needs to manually inspect everything.
>
> I'd hope most of it is pretty obvious, but the config/ and libstdc++-v3/
> hunks include a tweak in a license wording, though other copies of the
> similar license have the wording right.

Those libstdc++ parts are fine, thanks.


>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2024-04-02  Jakub Jelinek  
>
> * Makefile.tpl: Fix duplicated words; returns returns ->
> returns.
> config/
> * lcmessage.m4: Fix duplicated words; can can -> can,
> package package -> package.
> libdecnumber/
> * decCommon.c (decFinalize): Fix duplicated words in
> comment; the the -> the.
> libgcc/
> * unwind-dw2-fde.c (struct fde_accumulator): Fix duplicated
> words in comment; is is -> is.
> libgfortran/
> * configure.host: Fix duplicated words; the the -> the.
> libgm2/
> * configure.host: Fix duplicated words; the the -> the.
> libgomp/
> * libgomp.texi (OpenMP 5.2): Fix duplicated words; with with ->
> with.
> (omp_target_associate_ptr): Fix duplicated words; either either ->
> either.
> (omp_init_allocator): Fix duplicated words; be be -> be.
> (omp_realloc): Fix duplicated words; is is -> is.
> (OMP_ALLOCATOR): Fix duplicated words; other other -> other.
> * priority_queue.h (priority_queue_multi_p): Fix duplicated words;
> to to -> to.
> libiberty/
> * regex.c (byte_re_match_2_internal): Fix duplicated words in comment;
> next next -> next.
> * dyn-string.c (dyn_string_init): Fix duplicated words in comment;
> of of -> of.
> libitm/
> * beginend.cc (GTM::gtm_thread::begin_transaction): Fix duplicated
> words in comment; not not -> not to.
> libobjc/
> * init.c (duplicate_classes): Fix duplicated words in comment; in in
> -> in.
> * sendmsg.c (__objc_prepare_dtable_for_class): Fix duplicated words
> in comment; the the -> the.
> * encoding.c (objc_layout_structure): Likewise.
> libstdc++-v3/
> * acinclude.m4: Fix duplicated words; file file -> file can.
> * configure.host: Fix duplicated words; the the -> the.
> libvtv/
> * vtv_rts.cc (vtv_fail): Fix duplicated words; to to -> to.
> * vtv_fail.cc (vtv_fail): Likewise.
>
> --- Makefile.tpl.jj 2024-01-10 12:19:07.609682386 +0100
> +++ Makefile.tpl2024-03-28 15:38:31.471917678 +0100
> @@ -1976,7 +1976,7 @@ configure-target-[+module+]: maybe-all-g
> (define dep-maybe (lambda ()
>(if (exist? "hard") "" "maybe-")))
>
> -   ;; dep-kind returns returns "prebootstrap" for configure or build
> +   ;; dep-kind returns "prebootstrap" for configure or build
> ;; dependencies of bootstrapped modules on a build module
> ;; (e.g. all-gcc on all-build-bison); "normal" if the dependency is
> ;; on an "install" target, or if the dependence module is not
> --- config/lcmessage.m4.jj  2020-01-11 16:31:53.155321678 +0100
> +++ config/lcmessage.m4 2024-03-28 16:01:40.879060037 +0100
> @@ -6,13 +6,13 @@ dnl Public License, this file may be dis
>  dnl that contains a configuration script generated by Autoconf, under
>  dnl the same distribution terms as the rest of that program.
>  dnl
> -dnl This file can can be used in projects which are not available under
> +dnl This file can be used in projects which are not available under
>  dnl the GNU General Public License or the GNU Library General Public
>  dnl License but which still want to provide support for the GNU gettext
>  dnl functionality.
>  dnl Please note that the actual code of the GNU gettext library is covered
>  dnl by the GNU Library General Public License, and the rest of the GNU
> -dnl gettext package package is covered by the GNU General Public License.
> +dnl gettext package is covered by the GNU General Public License.
>  dnl They are *not* in the public domain.
>
>  dnl Authors:
> --- libdecnumber/decCommon.c.jj 2024-01-03 12:07:28.096370943 +0100
> +++ libdecnumber/decCommon.c2024-03-28 16:00:26.576068973 +0100
> @@ -388,7 +388,7 @@ static decFloat * decFinalize(decFloat *
> UBFROMUI(ub-3, 0);   /* to  */
> }
>   /* [note ub could now be to left of msd, and it is not safe */
> - /* to write to the the left of the msd] */
> + /* to write to the left of the msd] */
>   /* now at most 3 digits left to non-9 (usually just the one) */

Re: [PATCH] libstdc++: Allow adjacent __maybe_present_t to overlap

2024-04-02 Thread Jonathan Wakely
On Mon, 1 Apr 2024 at 23:16, Patrick Palka  wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

This is a layout change for some specializations of slide_view, but
better to do that now than change it between gcc 14 and 15.

OK for trunk.
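
For readers following the layout point, here is a minimal sketch of the
effect the quoted commit message describes.  The types below are
illustrative stand-ins, not the libstdc++ definitions, and the size
relations in the comments are what I would expect on ABIs that honor
[[no_unique_address]] (for example the Itanium C++ ABI used by GCC on
Linux); other ABIs may differ.

```cpp
// Illustrative stand-ins only; these are not the libstdc++ types.
struct Empty {};                 // one empty type shared by every user

template<typename T, int Disc = 0>
struct Absent {};                // a distinct empty type per (T, Disc)

struct SharedEmpty {
  void* p;
  [[no_unique_address]] Empty a;
  [[no_unique_address]] Empty b;  // same type as a: must get its own address
};

struct DistinctEmpty {
  void* p;
  [[no_unique_address]] Absent<int, 0> a;
  [[no_unique_address]] Absent<int, 1> b;  // different type: may share a's address
};

// Expected on such ABIs: sizeof(DistinctEmpty) == sizeof(void*), while
// SharedEmpty needs extra storage so that a and b have distinct addresses.
```

Two subobjects of the same type may not share an address, so the integer
discriminator is what lets both members overlap even when the absent type
is the same.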


>
> -- >8 --
>
> Currently __maybe_present_t<false, T> maps to the same empty class
> type independent of T.  This is suboptimal because it means adjacent
> __maybe_present_t<false, ...> members with the [[no_unique_address]]
> attribute can't overlap even if the conditionally present types are
> different.
>
> This patch fixes this by turning this empty class type into a template
> parameterized by the conditionally present type, so that
>
>   [[no_unique_address]] __maybe_present_t<false, T> _M_a;
>   [[no_unique_address]] __maybe_present_t<false, U> _M_b;
>
> now overlap if T and U are different.
>
> This patch goes a step further and also adds an optional integer
> discriminator parameter to allow for overlapping when T and U are
> the same.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (ranges::__detail::_Empty): Rename to ...
> (ranges::__detail::_Absent): ... this.  Turn into a template
> parameterized by the absent type _Tp and discriminator _Disc.
> (ranges::__detail::__maybe_present_t): Add an optional
> discriminator parameter.
> (slide_view::_M_cached_begin): Pass a discriminator argument to
> __maybe_present_t.
> (slide_view::_M_cached_end): Likewise.
> * testsuite/std/ranges/adaptors/sizeof.cc: Verify the size of
> slide_view is 3 instead of 4 pointers.
> ---
>  libstdc++-v3/include/std/ranges | 13 -
>  .../testsuite/std/ranges/adaptors/sizeof.cc |  4 
>  2 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index 7d739852677..afce818376b 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -886,14 +886,17 @@ namespace views
>
>  namespace __detail
>  {
> -  struct _Empty { };
> +  template
> +struct _Absent { };
>
>// Alias for a type that is conditionally present
>// (and is an empty type otherwise).
>// Data members using this alias should use [[no_unique_address]] so that
>// they take no space when not needed.
> -  template
> -using __maybe_present_t = __conditional_t<_Present, _Tp, _Empty>;
> +  // The optional template parameter _Disc is for discriminating two 
> otherwise
> +  // equivalent absent types so that even they can overlap.
> +  template
> +using __maybe_present_t = __conditional_t<_Present, _Tp, _Absent<_Tp, 
> _Disc>>;
>
>// Alias for a type that is conditionally const.
>template
> @@ -6553,10 +6556,10 @@ namespace views::__adaptor
>  range_difference_t<_Vp> _M_n;
>  [[no_unique_address]]
>__detail::__maybe_present_t<__detail::__slide_caches_first<_Vp>,
> - __detail::_CachedPosition<_Vp>> 
> _M_cached_begin;
> + __detail::_CachedPosition<_Vp>, 0> 
> _M_cached_begin;
>  [[no_unique_address]]
>__detail::__maybe_present_t<__detail::__slide_caches_last<_Vp>,
> - __detail::_CachedPosition<_Vp>> 
> _M_cached_end;
> + __detail::_CachedPosition<_Vp>, 1> 
> _M_cached_end;
>
>  template class _Iterator;
>  class _Sentinel;
> diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/sizeof.cc 
> b/libstdc++-v3/testsuite/std/ranges/adaptors/sizeof.cc
> index 12a9da3181d..08c01704d10 100644
> --- a/libstdc++-v3/testsuite/std/ranges/adaptors/sizeof.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/adaptors/sizeof.cc
> @@ -49,3 +49,7 @@ static_assert(sizeof(ranges::lazy_split_view std::string_view>) == 4*ptr);
>
>  static_assert
>   (sizeof(ranges::reverse_view>) == 
> 3*ptr);
> +
> +#if __cpp_lib_ranges_slide
> +static_assert(sizeof(ranges::slide_view) == 3*ptr);
> +#endif
> --
> 2.44.0.448.gc2cbfbd2e2
>



[PATCH v2 1/1] [RISC-V] Add support for _Bfloat16

2024-04-02 Thread Xiao Zeng
1 At point ,
  BF16 has already been completed "post public review".

2 LLVM has also added support for RISCV BF16 in
   and
  .

3 According to the discussion 
,
  this patch uses __bf16 and uses DF16b in riscv_mangle_type, as x86 does.

The following test passed for this patch:
* The full riscv regression test.

gcc/ChangeLog:

* config/riscv/iterators.md: New mode iterator HFBF.
* config/riscv/riscv-builtins.cc (riscv_init_builtin_types):
Initialize data type _Bfloat16.
* config/riscv/riscv-modes.def (FLOAT_MODE): New.
(ADJUST_FLOAT_FORMAT): New.
* config/riscv/riscv.cc (riscv_mangle_type): Support for BFmode.
(riscv_scalar_mode_supported_p): Ditto.
(riscv_libgcc_floating_mode_supported_p): Ditto.
(riscv_init_libfuncs): Set the conversion method for BFmode and
HFmode.
(riscv_block_arith_comp_libfuncs_for_mode): Set the arithmetic
and comparison libfuncs for the mode.
* config/riscv/riscv.md (mode" ): Add BF.
(movhf): Support for BFmode.
(mov): Ditto.
(*movhf_softfloat): Ditto.
(*mov_softfloat): Ditto.

libgcc/ChangeLog:

* config/riscv/sfp-machine.h (_FP_NANFRAC_B): New.
(_FP_NANSIGN_B): Ditto.
* config/riscv/t-softfp32: Add support for BF16 libfuncs.
* config/riscv/t-softfp64: Ditto.
* soft-fp/floatsibf.c: For si -> bf16.
* soft-fp/floatunsibf.c: For unsi -> bf16.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/bf16_arithmetic.c: New test.
* gcc.target/riscv/bf16_call.c: New test.
* gcc.target/riscv/bf16_comparison.c: New test.
* gcc.target/riscv/bf16_float_libcall_convert.c: New test.
* gcc.target/riscv/bf16_integer_libcall_convert.c: New test.

Co-authored-by: Jin Ma 
---
 gcc/config/riscv/iterators.md |  2 +
 gcc/config/riscv/riscv-builtins.cc| 16 
 gcc/config/riscv/riscv-modes.def  |  3 +
 gcc/config/riscv/riscv.cc | 64 ++-
 gcc/config/riscv/riscv.md | 24 +++---
 .../gcc.target/riscv/bf16_arithmetic.c| 42 ++
 gcc/testsuite/gcc.target/riscv/bf16_call.c| 12 +++
 .../gcc.target/riscv/bf16_comparison.c| 36 +
 .../riscv/bf16_float_libcall_convert.c| 57 +
 .../riscv/bf16_integer_libcall_convert.c  | 81 +++
 libgcc/config/riscv/sfp-machine.h |  3 +
 libgcc/config/riscv/t-softfp32| 10 ++-
 libgcc/config/riscv/t-softfp64|  3 +-
 libgcc/soft-fp/floatsibf.c| 45 +++
 libgcc/soft-fp/floatunsibf.c  | 45 +++
 15 files changed, 407 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_arithmetic.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_call.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_comparison.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_float_libcall_convert.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/bf16_integer_libcall_convert.c
 create mode 100644 libgcc/soft-fp/floatsibf.c
 create mode 100644 libgcc/soft-fp/floatunsibf.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index a7694137685..40bf20f42bb 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -75,6 +75,8 @@
 ;; Iterator for floating-point modes that can be loaded into X registers.
 (define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
 
+;; Iterator for floating-point modes of BF16
+(define_mode_iterator HFBF [HF BF])
 
 ;; ---
 ;; Mode attributes
diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index d457e306dd1..4c08834288a 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -230,6 +230,7 @@ static GTY(()) int riscv_builtin_decl_index[NUM_INSN_CODES];
   riscv_builtin_decls[riscv_builtin_decl_index[(CODE)]]
 
 tree riscv_float16_type_node = NULL_TREE;
+tree riscv_bfloat16_type_node = NULL_TREE;
 
 /* Return the function type associated with function prototype TYPE.  */
 
@@ -273,6 +274,21 @@ riscv_init_builtin_types (void)
   if (!maybe_get_identifier ("_Float16"))
 lang_hooks.types.register_builtin_type (riscv_float16_type_node,
"_Float16");
+
+  /* Provide the _Bfloat16 type and bfloat16_type_node if needed.  */
+  if (!bfloat16_type_node)
+{
+  riscv_bfloat16_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (riscv_bfloat16_type_node) = 16;
+  SET_TYPE_MODE (riscv_bfloat16_type_node, BFmode);
+  layout_type (riscv_bfloat16_type_node);
+ 

[PATCH v2 0/1] [RISC-V] Add support for _Bfloat16

2024-04-02 Thread Xiao Zeng
Hi all RISC-V folks:

This patch completes the support for the bf16 data type in the
riscv architecture.  On this basis, there will be a series of
patches in the future to strengthen support for BF16.

I recommend reviewing this patch starting from the test cases,
which explain the flow of data type conversion in detail.

The basis of this patch is: 


The website for the first patch submission is: 


However, because the commit information in that submission was
non-standard, this submission was made.

Patch v2 fixed failed test cases.

Xiao Zeng (1):
  [RISC-V] Add support for _Bfloat16

 gcc/config/riscv/iterators.md |  2 +
 gcc/config/riscv/riscv-builtins.cc| 16 
 gcc/config/riscv/riscv-modes.def  |  3 +
 gcc/config/riscv/riscv.cc | 64 ++-
 gcc/config/riscv/riscv.md | 24 +++---
 .../gcc.target/riscv/bf16_arithmetic.c| 42 ++
 gcc/testsuite/gcc.target/riscv/bf16_call.c| 12 +++
 .../gcc.target/riscv/bf16_comparison.c| 36 +
 .../riscv/bf16_float_libcall_convert.c| 57 +
 .../riscv/bf16_integer_libcall_convert.c  | 81 +++
 libgcc/config/riscv/sfp-machine.h |  3 +
 libgcc/config/riscv/t-softfp32| 10 ++-
 libgcc/config/riscv/t-softfp64|  3 +-
 libgcc/soft-fp/floatsibf.c| 45 +++
 libgcc/soft-fp/floatunsibf.c  | 45 +++
 15 files changed, 407 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_arithmetic.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_call.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_comparison.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_float_libcall_convert.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/bf16_integer_libcall_convert.c
 create mode 100644 libgcc/soft-fp/floatsibf.c
 create mode 100644 libgcc/soft-fp/floatunsibf.c

-- 
2.17.1



Re: [PATCH] RISC-V: Minor fix for max_point

2024-04-02 Thread juzhe.zh...@rivai.ai
It's an obvious fix for a previous typo.
So LGTM for trunk (GCC-14).

Thanks.


juzhe.zh...@rivai.ai
 
From: demin.han
Date: 2024-04-02 16:34
To: gcc-patches
CC: juzhe.zhong; kito.cheng; pan2.li; jeffreyalaw; rdapp.gcc
Subject: [PATCH] RISC-V: Minor fix for max_point
The program points start from 1, so max_point should be equal to
length().
 
Tested on RV64 with no regressions.
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-costs.cc: Use length()
 
Signed-off-by: demin.han 
---
gcc/config/riscv/riscv-vector-costs.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
 
diff --git a/gcc/config/riscv/riscv-vector-costs.cc 
b/gcc/config/riscv/riscv-vector-costs.cc
index 484196b15b4..9f7fe936a29 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -759,7 +759,7 @@ update_local_live_ranges (
We will be likely using one more vector variable.  */
  unsigned int max_point
- = (*program_points_per_bb.get (bb)).length () - 1;
+ = (*program_points_per_bb.get (bb)).length ();
  auto *live_ranges = live_ranges_per_bb.get (bb);
  bool existed_p = false;
  tree var = type == load_vec_info_type
-- 
2.44.0
 
 


[PATCH] RISC-V: Minor fix for max_point

2024-04-02 Thread demin.han
The program points start from 1, so max_point should be equal to
length().
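
A minimal sketch of the off-by-one, assuming the points of a block are
numbered consecutively and the container length equals the number of
recorded points.  Both helper names are hypothetical and are not GCC code.

```cpp
#include <cstddef>

// Hypothetical helper: with points numbered 1, 2, ..., n, the largest
// point is n itself, i.e. the container length.
constexpr std::size_t max_point_one_based(std::size_t num_points) {
  return num_points;
}

// With 0-based numbering (0 .. n-1) the largest point would be n - 1,
// which is what the old expression computed.
constexpr std::size_t max_point_zero_based(std::size_t num_points) {
  return num_points == 0 ? 0 : num_points - 1;
}
```

With 1-based numbering, using the second form would leave the last program
point of the block outside the range.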

Tested on RV64 with no regressions.

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc: Use length()

Signed-off-by: demin.han 
---
 gcc/config/riscv/riscv-vector-costs.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vector-costs.cc 
b/gcc/config/riscv/riscv-vector-costs.cc
index 484196b15b4..9f7fe936a29 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -759,7 +759,7 @@ update_local_live_ranges (
 
We will be likely using one more vector variable.  */
  unsigned int max_point
-   = (*program_points_per_bb.get (bb)).length () - 1;
+   = (*program_points_per_bb.get (bb)).length ();
  auto *live_ranges = live_ranges_per_bb.get (bb);
  bool existed_p = false;
  tree var = type == load_vec_info_type
-- 
2.44.0



Re: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model

2024-04-02 Thread juzhe.zh...@rivai.ai
Thanks for fixing it. LGTM for GCC-15, as Jeff suggested.



juzhe.zh...@rivai.ai
 
From: demin.han
Date: 2024-04-02 16:30
To: gcc-patches
CC: juzhe.zhong; kito.cheng; pan2.li; jeffreyalaw; rdapp.gcc
Subject: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV 
cost model
adjacent_dr_p is a sufficient but not a necessary condition for contiguous
access, so unnecessary live ranges are added, which results in a smaller LMUL.
 
This patch uses MEMORY_ACCESS_TYPE as the condition and constrains segment
load/store.
 
Tested on RV64 with no regressions.
 
PR target/114506
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-costs.cc (non_contiguous_memory_access_p): Rename
(need_additional_vector_vars_p): Rename and refine condition
 
gcc/testsuite/ChangeLog:
 
* gcc.dg/vect/costmodel/riscv/rvv/pr114506.c: New test.
 
Signed-off-by: demin.han 
---
V2 changes:
  1. remove max_point issue
  2. minor change in commit message
 
gcc/config/riscv/riscv-vector-costs.cc| 23 ---
.../vect/costmodel/riscv/rvv/pr114506.c   | 23 +++
2 files changed, 38 insertions(+), 8 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
 
diff --git a/gcc/config/riscv/riscv-vector-costs.cc 
b/gcc/config/riscv/riscv-vector-costs.cc
index f462c272a6e..484196b15b4 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -563,14 +563,24 @@ get_store_value (gimple *stmt)
 return gimple_assign_rhs1 (stmt);
}
-/* Return true if it is non-contiguous load/store.  */
+/* Return true if addtional vector vars needed.  */
static bool
-non_contiguous_memory_access_p (stmt_vec_info stmt_info)
+need_additional_vector_vars_p (stmt_vec_info stmt_info)
{
   enum stmt_vec_info_type type
 = STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
-  return ((type == load_vec_info_type || type == store_vec_info_type)
-   && !adjacent_dr_p (STMT_VINFO_DATA_REF (stmt_info)));
+  if (type == load_vec_info_type || type == store_vec_info_type)
+{
+  if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)
+   && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
+ return true;
+
+  machine_mode mode = TYPE_MODE (STMT_VINFO_VECTYPE (stmt_info));
+  int lmul = riscv_get_v_regno_alignment (mode);
+  if (DR_GROUP_SIZE (stmt_info) * lmul > RVV_M8)
+ return true;
+}
+  return false;
}
/* Return the LMUL of the current analysis.  */
@@ -739,10 +749,7 @@ update_local_live_ranges (
  stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
  enum stmt_vec_info_type type
= STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
-   if (non_contiguous_memory_access_p (stmt_info)
-   /* LOAD_LANES/STORE_LANES doesn't need a perm indice.  */
-   && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)
-!= VMAT_LOAD_STORE_LANES)
+   if (need_additional_vector_vars_p (stmt_info))
{
  /* For non-adjacent load/store STMT, we will potentially
convert it into:
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c 
b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
new file mode 100644
index 000..a88d24b2d2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-mrvv-max-lmul=dynamic -fdump-tree-vect-details" } */
+
+float a[32000], b[32000], c[32000], d[32000];
+float aa[256][256], bb[256][256], cc[256][256];
+
+void
+s2275 ()
+{
+  for (int i = 0; i < 256; i++)
+{
+  for (int j = 0; j < 256; j++)
+ {
+   aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i];
+ }
+  a[i] = b[i] + c[i] * d[i];
+}
+}
+
+/* { dg-final { scan-assembler-times {e32,m8} 1 } } */
+/* { dg-final { scan-assembler-not {e32,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it 
has unexpected spills" "vect" } } */
-- 
2.44.0
 
 


[PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model

2024-04-02 Thread demin.han
adjacent_dr_p is a sufficient but not a necessary condition for contiguous
access, so unnecessary live ranges are added, which results in a smaller LMUL.

This patch uses MEMORY_ACCESS_TYPE as the condition and constrains segment
load/store.
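
As a rough illustration of the refined condition, the predicate can be
modelled as below.  This is a simplified sketch: the struct, its fields and
the constant are hypothetical, and the value 8 for the M8 limit is an
assumption made for the example.

```cpp
// Simplified, hypothetical model of the refined predicate; these names do
// not mirror GCC's data structures.
struct AccessInfo {
  bool is_load_or_store;   // statement is a vector load or store
  bool is_gather_scatter;  // access is performed as a gather/scatter
  int group_size;          // number of accesses in the group (segment)
  int lmul;                // register group multiplier of the vector mode
};

constexpr int kMaxLmul = 8;  // assumed numeric value of the M8 limit

// Extra vector variables are assumed to be needed for gather/scatter, or
// when a segment access would not fit within the M8 limit.
constexpr bool needs_additional_vector_vars(AccessInfo a) {
  return a.is_load_or_store
      && (a.is_gather_scatter || a.group_size * a.lmul > kMaxLmul);
}
```

Under this model a plain load or store that is not a gather/scatter and fits
within the limit adds no extra live range, unlike with the old adjacency
test.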

Tested on RV64 with no regressions.

PR target/114506

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (non_contiguous_memory_access_p): 
Rename
(need_additional_vector_vars_p): Rename and refine condition

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/pr114506.c: New test.

Signed-off-by: demin.han 
---
V2 changes:
  1. remove max_point issue
  2. minor change in commit message

 gcc/config/riscv/riscv-vector-costs.cc| 23 ---
 .../vect/costmodel/riscv/rvv/pr114506.c   | 23 +++
 2 files changed, 38 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c

diff --git a/gcc/config/riscv/riscv-vector-costs.cc 
b/gcc/config/riscv/riscv-vector-costs.cc
index f462c272a6e..484196b15b4 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -563,14 +563,24 @@ get_store_value (gimple *stmt)
 return gimple_assign_rhs1 (stmt);
 }
 
-/* Return true if it is non-contiguous load/store.  */
+/* Return true if addtional vector vars needed.  */
 static bool
-non_contiguous_memory_access_p (stmt_vec_info stmt_info)
+need_additional_vector_vars_p (stmt_vec_info stmt_info)
 {
   enum stmt_vec_info_type type
 = STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
-  return ((type == load_vec_info_type || type == store_vec_info_type)
- && !adjacent_dr_p (STMT_VINFO_DATA_REF (stmt_info)));
+  if (type == load_vec_info_type || type == store_vec_info_type)
+{
+  if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)
+ && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
+   return true;
+
+  machine_mode mode = TYPE_MODE (STMT_VINFO_VECTYPE (stmt_info));
+  int lmul = riscv_get_v_regno_alignment (mode);
+  if (DR_GROUP_SIZE (stmt_info) * lmul > RVV_M8)
+   return true;
+}
+  return false;
 }
 
 /* Return the LMUL of the current analysis.  */
@@ -739,10 +749,7 @@ update_local_live_ranges (
  stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
  enum stmt_vec_info_type type
= STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
- if (non_contiguous_memory_access_p (stmt_info)
- /* LOAD_LANES/STORE_LANES doesn't need a perm indice.  */
- && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)
-  != VMAT_LOAD_STORE_LANES)
+ if (need_additional_vector_vars_p (stmt_info))
{
  /* For non-adjacent load/store STMT, we will potentially
 convert it into:
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c 
b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
new file mode 100644
index 000..a88d24b2d2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-mrvv-max-lmul=dynamic -fdump-tree-vect-details" } */
+
+float a[32000], b[32000], c[32000], d[32000];
+float aa[256][256], bb[256][256], cc[256][256];
+
+void
+s2275 ()
+{
+  for (int i = 0; i < 256; i++)
+{
+  for (int j = 0; j < 256; j++)
+   {
+ aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i];
+   }
+  a[i] = b[i] + c[i] * d[i];
+}
+}
+
+/* { dg-final { scan-assembler-times {e32,m8} 1 } } */
+/* { dg-final { scan-assembler-not {e32,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it 
has unexpected spills" "vect" } } */
-- 
2.44.0



Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-02 Thread Jakub Jelinek
On Tue, Apr 02, 2024 at 02:12:04PM +0800, Kewen.Lin wrote:
>  The old code for the unused hidden parameter (which was the 9th param)
>  would fall thru to the "return NULL_RTX;" which would make the callee
>  assume there was a parameter save area allocated.  Now instead, we'll
>  return a reg rtx, probably of r11 (r3 thru r10 are our param regs) and
>  I'm guessing we'll now see a copy of r11 into a pseudo like we do for
>  the other param regs.  Is that a problem? Given it's an unused
>  parameter, it'll probably get deleted as dead code, but could it cause
>  any issues?  What if we have more than one
> 
> I think Peter raised one good point, not sure it would really cause some
> issues, but the assigned reg goes beyond GP_ARG_MAX_REG, at least it is
> confusing to people especially without DCE like at -O0.  Can we
> aggressively remove these candidates from DECL_ARGUMENTS chain?  Does it
> cause any assertion to fail?

I'd prefer not to remove DECL_ARGUMENTS chains, they are valid arguments that
just some invalid code doesn't pass.  By removing them you basically always
create an invalid case, this time in the other direction, valid caller passes
more arguments than the callee (invalidly) expects.

Jakub



[PATCH] s390x: Optimize vector permute with constant indexes

2024-04-02 Thread Juergen Christ
Loop vectorizer can generate vector permutes with constant indexes
where all indexes are equal.  Optimize this case to use vector
replicate instead of vector permute.
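
The check described above can be sketched outside of GCC roughly as
follows.  This is only an illustration: the Splat struct and
classify_permute are hypothetical names, and the selector is assumed to
index into the concatenation of two input vectors of nelt elements each.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of inspecting a constant permute selector (hypothetical type).
struct Splat
{
  bool is_splat;     // true if every index selects the same element
  int operand;       // 0 or 1: which input vector holds that element
  unsigned element;  // element index within that operand
};

// Classify PERM, whose entries index into two concatenated input vectors
// of perm.size () elements each.
Splat
classify_permute (const std::vector<unsigned> &perm)
{
  const std::size_t nelt = perm.size ();
  if (nelt == 0)
    return {false, 0, 0};

  // A replicate is possible only if all indexes are equal.
  for (std::size_t i = 1; i < nelt; ++i)
    if (perm[i] != perm[0])
      return {false, 0, 0};

  // Indexes past the first vector refer to the second operand.
  unsigned elem = perm[0];
  if (elem >= nelt)
    return {true, 1, elem - static_cast<unsigned> (nelt)};
  return {true, 0, elem};
}
```

For a four-element vector, the selector {5, 5, 5, 5} is classified as a
replicate of element 1 of the second operand.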

gcc/ChangeLog:

* config/s390/s390.cc (expand_perm_as_replicate): Implement.
(vectorize_vec_perm_const_1): Call new function.
* config/s390/vx-builtins.md (vec_splat): Change to...
(@vec_splat): ...this.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-expand-replicate.c: New test.

Bootstrapped and regtested on s390x.  Ok for trunk?

Signed-off-by: Juergen Christ 
---
 gcc/config/s390/s390.cc   | 32 +++
 gcc/config/s390/vx-builtins.md|  2 +-
 .../s390/vector/vec-expand-replicate.c| 30 +
 3 files changed, 63 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-expand-replicate.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 372a23244032..4b4014ebe444 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -17923,6 +17923,35 @@ expand_perm_as_a_vlbr_vstbr_candidate (const struct expand_vec_perm_d &d)
   return false;
 }
 
+static bool expand_perm_as_replicate (const struct expand_vec_perm_d &d)
+{
+  unsigned char i;
+  unsigned char elem;
+  rtx base = d.op0;
+  rtx insn;
+  /* Needed to silence maybe-uninitialized warning.  */
+  gcc_assert(d.nelt > 0);
+  elem = d.perm[0];
+  for (i = 1; i < d.nelt; ++i)
+if (d.perm[i] != elem)
+  return false;
+  if (!d.testing_p)
+{
+  if (elem >= d.nelt)
+   {
+ base = d.op1;
+ elem -= d.nelt;
+   }
+  insn = maybe_gen_vec_splat (d.vmode, d.target, base, GEN_INT (elem));
+  if (insn == NULL_RTX)
+   return false;
+  emit_insn (insn);
+  return true;
+}
+  else
+return maybe_code_for_vec_splat (d.vmode) != CODE_FOR_nothing;
+}
+
 /* Try to find the best sequence for the vector permute operation
described by D.  Return true if the operation could be
expanded.  */
@@ -17941,6 +17970,9 @@ vectorize_vec_perm_const_1 (const struct expand_vec_perm_d &d)
   if (expand_perm_as_a_vlbr_vstbr_candidate (d))
 return true;
 
+  if (expand_perm_as_replicate(d))
+return true;
+
   return false;
 }
 
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 432d81a719fc..93c0d408a43e 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -424,7 +424,7 @@
 
 
 ; Replicate from vector element
-(define_expand "vec_splat"
+(define_expand "@vec_splat"
   [(set (match_operand:V_HW  0 "register_operand"  "")
(vec_duplicate:V_HW (vec_select:
 (match_operand:V_HW 1 "register_operand"  "")
diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-expand-replicate.c 
b/gcc/testsuite/gcc.target/s390/vector/vec-expand-replicate.c
new file mode 100644
index ..27563a00f22b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/vec-expand-replicate.c
@@ -0,0 +1,30 @@
+/* Check that the vectorize_vec_perm_const expander correctly deals with
+   replication.  Extracted from spec "nab".  */
+
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z13 -fvect-cost-model=unlimited" } */
+
+
+#define REAL_T  double
+typedef REAL_T  MATRIX_T[ 4 ][ 4 ];
+
+int concat_mat_i, concat_mat_j;
+static void concat_mat(MATRIX_T m1, MATRIX_T, MATRIX_T m3);
+MATRIX_T *rot4p() {
+  MATRIX_T mat3, mat4;
+  static MATRIX_T mat5;
+  concat_mat(mat4, mat3, mat5);
+}
+void concat_mat(MATRIX_T m1, MATRIX_T, MATRIX_T m3) {
+  int k;
+  for (;; concat_mat_i++) {
+concat_mat_j = 0;
+for (; 4; concat_mat_j++) {
+  k = 0;
+  for (; k < 4; k++)
+m3[concat_mat_i][concat_mat_j] += m1[concat_mat_i][k];
+}
+  }
+}
+
+/* { dg-final { scan-assembler-not "vperm" } } */
-- 
2.39.3



[PATCH] Fix up duplicated words mostly in comments, part 1

2024-04-02 Thread Jakub Jelinek
Hi!

Like in r12-7519-g027e30414492d50feb2854aff38227b14300dc4b, I've done
git grep -v 'long long\|optab optab\|template template\|double double' | grep ' \([a-zA-Z]\+\) \1 '

This is just part of the changes, mostly for non-gcc directories.
I'll try to get to the rest soon.  Obviously, the above command also
finds cases which are correct as is and shouldn't be changed, so one
needs to manually inspect everything.
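
For illustration only, the same kind of search can be approximated in a
few lines of C++.  This is a sketch, not the command used for the patch:
first_duplicated_word and its allow list are hypothetical, and it is
slightly more permissive than the grep pattern because it does not
require a space before the first and after the second word.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <sstream>
#include <string>

// Return true if WORD is non-empty and consists only of ASCII letters.
static bool
is_alphabetic (const std::string &word)
{
  if (word.empty ())
    return false;
  for (unsigned char c : word)
    if (!std::isalpha (c))
      return false;
  return true;
}

// Return the first purely alphabetic word that appears twice in a row in
// LINE, skipping words in ALLOWED (e.g. "long" for "long long").
// Return an empty string if there is none.
std::string
first_duplicated_word (const std::string &line,
                       const std::set<std::string> &allowed = {})
{
  std::istringstream in (line);
  std::string prev, word;
  while (in >> word)
    {
      if (word == prev && is_alphabetic (word) && allowed.count (word) == 0)
        return word;
      prev = word;
    }
  return "";
}
```

As with the grep command, hits still need manual inspection, since some
repeated words are correct as written.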

I'd hope most of it is pretty obvious, but the config/ and libstdc++-v3/
hunks include a tweak in a license wording, though other copies of the
similar license have the wording right.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-02  Jakub Jelinek  

* Makefile.tpl: Fix duplicated words; returns returns ->
returns.
config/
* lcmessage.m4: Fix duplicated words; can can -> can,
package package -> package.
libdecnumber/
* decCommon.c (decFinalize): Fix duplicated words in
comment; the the -> the.
libgcc/
* unwind-dw2-fde.c (struct fde_accumulator): Fix duplicated
words in comment; is is -> is.
libgfortran/
* configure.host: Fix duplicated words; the the -> the.
libgm2/
* configure.host: Fix duplicated words; the the -> the.
libgomp/
* libgomp.texi (OpenMP 5.2): Fix duplicated words; with with ->
with.
(omp_target_associate_ptr): Fix duplicated words; either either ->
either.
(omp_init_allocator): Fix duplicated words; be be -> be.
(omp_realloc): Fix duplicated words; is is -> is.
(OMP_ALLOCATOR): Fix duplicated words; other other -> other.
* priority_queue.h (priority_queue_multi_p): Fix duplicated words;
to to -> to.
libiberty/
* regex.c (byte_re_match_2_internal): Fix duplicated words in comment;
next next -> next.
* dyn-string.c (dyn_string_init): Fix duplicated words in comment;
of of -> of.
libitm/
* beginend.cc (GTM::gtm_thread::begin_transaction): Fix duplicated
words in comment; not not -> not to.
libobjc/
* init.c (duplicate_classes): Fix duplicated words in comment; in in
-> in.
* sendmsg.c (__objc_prepare_dtable_for_class): Fix duplicated words
in comment; the the -> the.
* encoding.c (objc_layout_structure): Likewise.
libstdc++-v3/
* acinclude.m4: Fix duplicated words; file file -> file can.
* configure.host: Fix duplicated words; the the -> the.
libvtv/
* vtv_rts.cc (vtv_fail): Fix duplicated words; to to -> to.
* vtv_fail.cc (vtv_fail): Likewise.

--- Makefile.tpl.jj 2024-01-10 12:19:07.609682386 +0100
+++ Makefile.tpl2024-03-28 15:38:31.471917678 +0100
@@ -1976,7 +1976,7 @@ configure-target-[+module+]: maybe-all-g
(define dep-maybe (lambda ()
   (if (exist? "hard") "" "maybe-")))
 
-   ;; dep-kind returns returns "prebootstrap" for configure or build
+   ;; dep-kind returns "prebootstrap" for configure or build
;; dependencies of bootstrapped modules on a build module
;; (e.g. all-gcc on all-build-bison); "normal" if the dependency is
;; on an "install" target, or if the dependence module is not
--- config/lcmessage.m4.jj  2020-01-11 16:31:53.155321678 +0100
+++ config/lcmessage.m4 2024-03-28 16:01:40.879060037 +0100
@@ -6,13 +6,13 @@ dnl Public License, this file may be dis
 dnl that contains a configuration script generated by Autoconf, under
 dnl the same distribution terms as the rest of that program.
 dnl
-dnl This file can can be used in projects which are not available under
+dnl This file can be used in projects which are not available under
 dnl the GNU General Public License or the GNU Library General Public
 dnl License but which still want to provide support for the GNU gettext
 dnl functionality.
 dnl Please note that the actual code of the GNU gettext library is covered
 dnl by the GNU Library General Public License, and the rest of the GNU
-dnl gettext package package is covered by the GNU General Public License.
+dnl gettext package is covered by the GNU General Public License.
 dnl They are *not* in the public domain.
 
 dnl Authors:
--- libdecnumber/decCommon.c.jj 2024-01-03 12:07:28.096370943 +0100
+++ libdecnumber/decCommon.c2024-03-28 16:00:26.576068973 +0100
@@ -388,7 +388,7 @@ static decFloat * decFinalize(decFloat *
UBFROMUI(ub-3, 0);   /* to  */
}
  /* [note ub could now be to left of msd, and it is not safe */
- /* to write to the the left of the msd] */
+ /* to write to the left of the msd] */
  /* now at most 3 digits left to non-9 (usually just the one) */
  for (; ub>=umsd; *ub=0, ub--) {
if (*ub==9) continue;/* carry */
--- libgcc/unwind-dw2-fde.c.jj  2024-03-23 08:22:50.622605182 +0100
+++ libgcc/unwind-dw2-fde.c 2024-03-28 15:59:32.552802535 +0100
@@ -501,7 +501,7 @@ fde_mixed_encoding_extract (struct 

Re: [pushed][PATCH] LoongArch: Fix missing plugin header

2024-04-02 Thread chenglulu

Pushed to r14-9743.

On 2024/4/2 at 9:20 AM, Yang Yujie wrote:

gcc/ChangeLog:

* config/loongarch/t-loongarch: Add loongarch-def-array.h
to OPTIONS_H_EXTRA.
---
  gcc/config/loongarch/t-loongarch | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/t-loongarch b/gcc/config/loongarch/t-loongarch
index 3dd7c4b031e..acf5da95310 100644
--- a/gcc/config/loongarch/t-loongarch
+++ b/gcc/config/loongarch/t-loongarch
@@ -18,8 +18,9 @@
  
  
  GTM_H += loongarch-multilib.h

-OPTIONS_H_EXTRA += $(srcdir)/config/loongarch/loongarch-def.h  \
-  $(srcdir)/config/loongarch/loongarch-tune.h  \
+OPTIONS_H_EXTRA += $(srcdir)/config/loongarch/loongarch-def.h  \
+  $(srcdir)/config/loongarch/loongarch-def-array.h \
+  $(srcdir)/config/loongarch/loongarch-tune.h  \
   $(srcdir)/config/loongarch/loongarch-cpucfg-map.h
  
  # Canonical target triplet from config.gcc




[PATCH] Fix up postboot dependencies [PR106472]

2024-04-02 Thread Jakub Jelinek
On Wed, Mar 13, 2024 at 10:13:37AM +0100, Jakub Jelinek wrote:
> While the first Makefile.tpl hunk looks obviously ok, the others look
> completely wrong to me.
> There is nothing special about libgo vs. libbacktrace/libatomic
> compared to any other target library which is not bootstrapped vs. any
> of its dependencies which are in the bootstrapped set.
> So, Makefile.tpl shouldn't hardcode such dependencies.

Here is my version of the fix.
The dependencies in the toplevel Makefile simply didn't take into account
that some target modules could be in a bootstrapped build built in some
configurations as bootstrap modules (typically as dependencies of other
target bootstrap modules), while in other configurations just as
dependencies of non-bootstrap target modules and so not built during the
bootstrap, but after it.
Makefile.tpl arranges for those postboot target module -> target module
dependencies to be emitted only inside of an @unless gcc-bootstrap block,
while for @if gcc-bootstrap it just emits
configure-target-whatever: stage_last
dependencies which ensure those postbootstrap target modules are only built
after everything that is bootstrapped has been.

Now, the libbacktrace/libatomic target modules have bootstrap=true
target_modules = { module= libbacktrace; bootstrap=true; };
target_modules = { module= libatomic; bootstrap=true; lib_path=.libs; };
because those modules are dependencies of libphobos target module, so
when d is included among bootstrapped languages, those are all bootstrapped
and everything works correctly.
While if d is not included, libphobos target module is disabled,
libbacktrace/libatomic target modules aren't bootstrapped, nothing during
bootstrap needs them, but post bootstrap libgo target module depends on
the libatomic and libbacktrace target modules, libgfortran target module
depends on the libbacktrace target module and libgm2 target module depends
on the libatomic target module, but those dependencies were emitted only
@unless gcc-bootstrap.  There is a similar theoretical problem for the zlib
target module if GCJ were ever resurrected: libphobos as a bootstrap target
module depends on the zlib target module, but if d is not configured,
fastjar also depends on it.

The following patch arranges for the @if gcc-bootstrap case to also emit
target module -> target module dependencies, but only when the module
being depended on is not bootstrapped.
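
The intended rule can be modelled with a small standalone sketch.  The
Dep struct and emit_target_deps are hypothetical and only mirror the
description above; the real logic lives in the Makefile.tpl lambdas, and
only "all" dependencies are modelled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One target module -> target module dependency (hypothetical model).
struct Dep
{
  std::string target;       // module that needs the dependency, e.g. "libgo"
  std::string on;           // module depended on, e.g. "libatomic"
  bool on_is_bootstrapped;  // is ON built during bootstrap in this config?
};

// Return the dependency lines that should appear in the Makefile.
// Without gcc-bootstrap every dependency is emitted; with gcc-bootstrap
// only those whose dependency is not itself bootstrapped.
std::vector<std::string>
emit_target_deps (const std::vector<Dep> &deps, bool gcc_bootstrap)
{
  std::vector<std::string> out;
  for (const Dep &d : deps)
    if (!gcc_bootstrap || !d.on_is_bootstrapped)
      out.push_back ("all-target-" + d.target
                     + ": maybe-all-target-" + d.on);
  return out;
}
```

In this model, with d disabled libatomic is not bootstrapped, so a
bootstrap build still gets the libgo -> libatomic dependency.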

In the generated Makefile.in you can see what the Makefile.tpl change
produces and that it just adds extra dependencies which weren't there
before in the @if gcc-bootstrap case.

I've bootstrapped without this patch with
../configure --enable-languages=c,c++,go; make
on x86_64-linux (note, make -j2 or higher usually worked) which failed
as described in the PR, then with this patch with the same command which
built fine and the Makefile difference between the two builds being
diff -up obj40{a,b}/Makefile
--- obj40a/Makefile 2024-03-31 00:35:22.243791499 +0100
+++ obj40b/Makefile 2024-03-31 22:40:38.143299144 +0200
@@ -29376,6 +29376,14 @@ configure-bison: stage_last
 configure-flex: stage_last
 configure-m4: stage_last
 
+configure-target-fastjar: maybe-configure-target-zlib
+all-target-fastjar: maybe-all-target-zlib
+all-target-libgo: maybe-all-target-libbacktrace
+all-target-libgo: maybe-all-target-libatomic
+all-target-libgm2: maybe-all-target-libatomic
+configure-target-libgfortran: maybe-all-target-libbacktrace
+configure-target-libgo: maybe-all-target-libbacktrace
+
 
 # Dependencies for target modules on other target modules are
 # described by lang_env_dependencies; the defaults apply to anything

which I believe are exactly the extra dependencies we want.
Plus I've done normal x86_64-linux and i686-linux bootstraps/regtests
which in my case include --enable-languages=default,ada,obj-c++,lto,go,d,rust,m2
for x86_64 and the same except ada for i686; those with my usual make -j32.
The Makefile difference in those builds vs. unpatched case
is just an extra empty line.

Ok for trunk?

2024-04-02  Jakub Jelinek  

PR bootstrap/106472
* Makefile.tpl (make-postboot-target-dep): New lambda.
Use it to add --enable-bootstrap dependencies of target modules
on other target modules if the latter aren't bootstrapped.
* Makefile.in: Regenerate.

--- Makefile.tpl.jj 2024-01-09 22:40:16.812824317 +0100
+++ Makefile.tpl2024-03-30 14:23:51.985398859 +0100
@@ -2013,6 +2013,25 @@ configure-target-[+module+]: maybe-all-g
 (unless (=* target "target-")
(string-append "configure-" target ": " dep "\n"))
 
+   ;; Dependencies in between target modules if the dependencies
+   ;; are bootstrap target modules and the target modules which
+   ;; depend on them are emitted inside of @unless gcc-bootstrap.
+   ;; Unfortunately, some target modules like libatomic or libbacktrace
+   ;; have bootstrap flag set, but whether they are actually built
+   ;; during bootstrap or after 

[PATCH] LoongArch: Remove unused code

2024-04-02 Thread Jiahao Xu
Machines that satisfy ISA_HAS_LSX && !TARGET_64BIT are not supported now and
will not be supported in the future, so this patch removes the unused code
for them.

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove unused code.
* config/loongarch/loongarch-protos.h (loongarch_split_lsx_copy_d): 
Remove.
(loongarch_split_lsx_insert_d): Ditto.
(loongarch_split_lsx_fill_d): Ditto.
* config/loongarch/loongarch.cc (loongarch_split_lsx_copy_d): Ditto.
(loongarch_split_lsx_insert_d): Ditto.
(loongarch_split_lsx_fill_d): Ditto.
* config/loongarch/lsx.md (lsx_vpickve2gr_du): Remove splitter.
(lsx_vpickve2gr_): Ditto.
(abs2): Remove expander.
(vabs2): Rename to abs2.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lsx/lsx-abs.c: New test.
---
 gcc/config/loongarch/lasx.md  | 12 +--
 gcc/config/loongarch/loongarch-protos.h   |  3 -
 gcc/config/loongarch/loongarch.cc | 76 
 gcc/config/loongarch/lsx.md   | 89 ++-
 .../gcc.target/loongarch/vector/lsx/lsx-abs.c | 26 ++
 5 files changed, 35 insertions(+), 171 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-abs.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 2fa5e46c8e8..7bd61f8ed5b 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -572,12 +572,7 @@ (define_insn "lasx_xvinsgr2vr_"
  (match_operand 3 "const__operand" "")))]
   "ISA_HAS_LASX"
 {
-#if 0
-  if (!TARGET_64BIT && (mode == V4DImode || mode == V4DFmode))
-return "#";
-  else
-#endif
-return "xvinsgr2vr.\t%u0,%z1,%y3";
+  return "xvinsgr2vr.\t%u0,%z1,%y3";
 }
   [(set_attr "type" "simd_insert")
(set_attr "mode" "")])
@@ -1446,10 +1441,7 @@ (define_insn "lasx_xvreplgr2vr_"
   if (which_alternative == 1)
 return "xvldi.b\t%u0,0" ;
 
-  if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode))
-return "#";
-  else
-return "xvreplgr2vr.\t%u0,%z1";
+  return "xvreplgr2vr.\t%u0,%z1";
 }
   [(set_attr "type" "simd_fill")
(set_attr "mode" "")
diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index e3ed2b912a5..e238d795a73 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -89,9 +89,6 @@ extern void loongarch_split_128bit_move (rtx, rtx);
 extern bool loongarch_split_128bit_move_p (rtx, rtx);
 extern void loongarch_split_256bit_move (rtx, rtx);
 extern bool loongarch_split_256bit_move_p (rtx, rtx);
-extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx));
-extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
-extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index a69a203fbe6..8438cc64b0d 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4756,82 +4756,6 @@ loongarch_split_256bit_move (rtx dest, rtx src)
 }
 }
 
-
-/* Split a COPY_S.D with operands DEST, SRC and INDEX.  GEN is a function
-   used to generate subregs.  */
-
-void
-loongarch_split_lsx_copy_d (rtx dest, rtx src, rtx index,
-   rtx (*gen_fn)(rtx, rtx, rtx))
-{
-  gcc_assert ((GET_MODE (src) == V2DImode && GET_MODE (dest) == DImode)
- || (GET_MODE (src) == V2DFmode && GET_MODE (dest) == DFmode));
-
-  /* Note that low is always from the lower index, and high is always
- from the higher index.  */
-  rtx low = loongarch_subword (dest, false);
-  rtx high = loongarch_subword (dest, true);
-  rtx new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0);
-
-  emit_insn (gen_fn (low, new_src, GEN_INT (INTVAL (index) * 2)));
-  emit_insn (gen_fn (high, new_src, GEN_INT (INTVAL (index) * 2 + 1)));
-}
-
-/* Split a INSERT.D with operand DEST, SRC1.INDEX and SRC2.  */
-
-void
-loongarch_split_lsx_insert_d (rtx dest, rtx src1, rtx index, rtx src2)
-{
-  int i;
-  gcc_assert (GET_MODE (dest) == GET_MODE (src1));
-  gcc_assert ((GET_MODE (dest) == V2DImode
-  && (GET_MODE (src2) == DImode || src2 == const0_rtx))
- || (GET_MODE (dest) == V2DFmode && GET_MODE (src2) == DFmode));
-
-  /* Note that low is always from the lower index, and high is always
- from the higher index.  */
-  rtx low = loongarch_subword (src2, false);
-  rtx high = loongarch_subword (src2, true);
-  rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
-  rtx new_src1 = simplify_gen_subreg (V4SImode, src1, GET_MODE (src1), 0);
-  i = exact_log2 (INTVAL (index));
-  gcc_assert (i != -1);
-
-  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, low, new_src1,
- GEN_INT (1 << (i * 2;
-  emit_insn (gen_lsx_vinsgr2vr_w 

Re: [PATCH][Backport][GCC10] Fix SSA corruption due to widening_mul opt on conflict across an abnormal edge [PR111407]

2024-04-02 Thread Richard Biener
On Mon, Apr 1, 2024 at 3:36 PM Qing Zhao  wrote:
>
> This is a bug in tree-ssa-math-opts.c: when applying the widening mul
> optimization, the compiler needs to check whether an operand occurs in an
> abnormal PHI and, if so, avoid the transformation.
>
> PR tree-optimization/111407
>
> gcc/ChangeLog:
>
> * tree-ssa-math-opts.c (convert_mult_to_widen): Avoid the transform
> when one of the operands is subject to abnormal coalescing.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr111407.c: New test.
>
> (cherry picked from commit 4aca1cfd6235090e48a53dab734437740671bbf3)
>
> Bootstrapped and regression tested on both aarch64 and x86.
>
> Okay for commit to GCC10?

Note the GCC 10 branch is closed.  If the patch bootstraps/tests on the
11, 12 and 13 branches it is OK there.  You do not need approval to
backport fixes for _regressions_ if the patch cherry-picks without major
edits and bootstraps/tests OK.

Thanks,
Richard.

> thanks.
>
> Qing
> ---
>  gcc/testsuite/gcc.dg/pr111407.c | 21 +
>  gcc/tree-ssa-math-opts.c|  8 
>  2 files changed, 29 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr111407.c
>
> diff --git a/gcc/testsuite/gcc.dg/pr111407.c b/gcc/testsuite/gcc.dg/pr111407.c
> new file mode 100644
> index ..a171074753f9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr111407.c
> @@ -0,0 +1,21 @@
> +/* PR tree-optimization/111407*/
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +enum { SEND_TOFILE } __sigsetjmp();
> +void fclose();
> +void foldergets();
> +void sendpart_stats(int *p1, int a1, int b1) {
> + int *a = p1;
> + fclose();
> + p1 = 0;
> + long t = b1;
> + if (__sigsetjmp()) {
> +   {
> + long t1 = a1;
> + a1+=1;
> + fclose(a1*(long)t1);
> +   }
> + }
> + if (p1)
> +   fclose();
> +}
> diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
> index dd0b8c6f0577..47981da20e05 100644
> --- a/gcc/tree-ssa-math-opts.c
> +++ b/gcc/tree-ssa-math-opts.c
> @@ -2543,6 +2543,14 @@ convert_mult_to_widen (gimple *stmt, 
> gimple_stmt_iterator *gsi)
>if (!is_widening_mult_p (stmt, , , , ))
>  return false;
>
> +  /* if any one of rhs1 and rhs2 is subject to abnormal coalescing,
> + avoid the tranform. */
> +  if ((TREE_CODE (rhs1) == SSA_NAME
> +   && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1))
> +  || (TREE_CODE (rhs2) == SSA_NAME
> + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs2)))
> +return false;
> +
>to_mode = SCALAR_INT_TYPE_MODE (type);
>from_mode = SCALAR_INT_TYPE_MODE (type1);
>if (to_mode == from_mode)
> --
> 2.31.1
>


[PATCH v1] LoongArch: Set default alignment for functions jumps and loops [PR112919].

2024-04-02 Thread Lulu Cheng
Xi Ruoyao set the alignment rules under LA464 in commit r14-1839,
but the macro ASM_OUTPUT_ALIGN_WITH_NOP was removed in r14-4674,
which affected the alignment rules.

So I tested SPEC2006 performance on LA464 and LA664 again with different
alignments and modified the alignment settings based on the test
results.
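
A simplified model of how the defaults are meant to be picked is
sketched below.  The Align struct, tuned_align and effective_alignment
are hypothetical names, and the sketch ignores the flag_align_* checks;
the per-CPU values are the ones set by this patch.

```cpp
#include <cassert>
#include <string>

// Per-CPU alignment defaults (hypothetical stand-in for loongarch_align).
struct Align
{
  const char *function;  // default for -falign-functions
  const char *loop;      // default for -falign-loops
  const char *jump;      // default for -falign-jumps
};

enum class Cpu { LA464, LA664 };

// Values as set by this patch for each microarchitecture.
Align
tuned_align (Cpu cpu)
{
  switch (cpu)
    {
    case Cpu::LA464:
      return {"32", "16", "16"};
    case Cpu::LA664:
      return {"8", "8", "32"};
    }
  return {nullptr, nullptr, nullptr};
}

// Pick the alignment string: an explicit user value always wins, and the
// tuned default is applied only when not optimizing for size.
std::string
effective_alignment (const std::string &user_value, const char *tuned,
                     bool optimize_size)
{
  if (!user_value.empty ())
    return user_value;
  if (optimize_size || tuned == nullptr)
    return "";
  return tuned;
}
```

With this model, tuning for LA664 without an explicit -falign-jumps
yields "32", while a size-optimized build leaves the alignment unset.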

gcc/ChangeLog:

PR target/112919
* config/loongarch/loongarch-def.cc (la664_align): Newly defined
function that sets alignment rules under the LA664 microarchitecture.
* config/loongarch/loongarch-opts.cc
(loongarch_target_option_override): If not optimizing for size, set
the default alignment to what the target wants.
* config/loongarch/loongarch-tune.h (struct loongarch_align): Add
new member variables jump and loop.
---
 gcc/config/loongarch/loongarch-def.cc  | 11 ---
 gcc/config/loongarch/loongarch-opts.cc | 19 +--
 gcc/config/loongarch/loongarch-tune.h  | 22 +++---
 3 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/gcc/config/loongarch/loongarch-def.cc 
b/gcc/config/loongarch/loongarch-def.cc
index e8c129ce643..63a8f108f4e 100644
--- a/gcc/config/loongarch/loongarch-def.cc
+++ b/gcc/config/loongarch/loongarch-def.cc
@@ -81,14 +81,19 @@ array_tune loongarch_cpu_cache =
 
 static inline loongarch_align la464_align ()
 {
-  return loongarch_align ().function_ ("32").label_ ("16");
+  return loongarch_align ().function_ ("32").loop_ ("16").jump_ ("16");
+}
+
+static inline loongarch_align la664_align ()
+{
+  return loongarch_align ().function_ ("8").loop_ ("8").jump_ ("32");
 }
 
 array_tune loongarch_cpu_align =
   array_tune ()
-.set (CPU_LOONGARCH64, la464_align ())
+.set (CPU_LOONGARCH64, la664_align ())
 .set (CPU_LA464, la464_align ())
-.set (CPU_LA664, la464_align ());
+.set (CPU_LA664, la664_align ());
 
 /* Default RTX cost initializer.  */
 loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
diff --git a/gcc/config/loongarch/loongarch-opts.cc 
b/gcc/config/loongarch/loongarch-opts.cc
index 2a6fc41b247..7b21cc311a8 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -922,13 +922,20 @@ loongarch_target_option_override (struct loongarch_target 
*target,
 {
   loongarch_update_gcc_opt_status (target, opts, opts_set);
 
-  /* alignments */
-  if (opts->x_flag_align_functions && !opts->x_str_align_functions)
-opts->x_str_align_functions
-  = loongarch_cpu_align[target->cpu_tune].function;
+  /* If not optimizing for size, set the default
+ alignment to what the target wants.  */
+  if (!opts->x_optimize_size)
+{
+  if (opts->x_flag_align_functions && !opts->x_str_align_functions)
+   opts->x_str_align_functions
+ = loongarch_cpu_align[target->cpu_tune].function;
+
+  if (opts->x_flag_align_loops && !opts->x_str_align_loops)
+   opts->x_str_align_loops = loongarch_cpu_align[target->cpu_tune].loop;
 
-  if (opts->x_flag_align_labels && !opts->x_str_align_labels)
-opts->x_str_align_labels = loongarch_cpu_align[target->cpu_tune].label;
+  if (opts->x_flag_align_jumps && !opts->x_str_align_jumps)
+   opts->x_str_align_jumps = loongarch_cpu_align[target->cpu_tune].jump;
+}
 
   /* Set up parameters to be used in prefetching algorithm.  */
   int simultaneous_prefetches
diff --git a/gcc/config/loongarch/loongarch-tune.h 
b/gcc/config/loongarch/loongarch-tune.h
index 72b75f6de3f..3974edf9a90 100644
--- a/gcc/config/loongarch/loongarch-tune.h
+++ b/gcc/config/loongarch/loongarch-tune.h
@@ -162,14 +162,16 @@ struct loongarch_cache {
   }
 };
 
-/* Alignment for functions and labels for best performance.  For new uarchs
-   the value should be measured via benchmarking.  See the documentation for
-   -falign-functions and -falign-labels in invoke.texi for the format.  */
+/* Alignment for functions loops and jumps for best performance.  For new
+   uarchs the value should be measured via benchmarking.  See the documentation
+   for -falign-functions -falign-loops and -falign-jumps in invoke.texi for the
+   format.  */
 struct loongarch_align {
   const char *function;/* default value for -falign-functions */
-  const char *label;   /* default value for -falign-labels */
+  const char *loop;/* default value for -falign-loops */
+  const char *jump;/* default value for -falign-jumps */
 
-  loongarch_align () : function (nullptr), label (nullptr) {}
+  loongarch_align () : function (nullptr), loop (nullptr), jump (nullptr) {}
 
   loongarch_align function_ (const char *_function)
   {
@@ -177,9 +179,15 @@ struct loongarch_align {
 return *this;
   }
 
-  loongarch_align label_ (const char *_label)
+  loongarch_align loop_ (const char *_loop)
   {
-label = _label;
+loop = _loop;
+return *this;
+  }
+
+  loongarch_align jump_ (const char *_jump)
+  {
+jump = _jump;
 return *this;
   }
 };
-- 
2.39.3



Re: [PATCH] libiberty: Invoke D demangler when --format=auto

2024-04-02 Thread Richard Biener
On Sat, Mar 30, 2024 at 9:11 PM Tom Tromey  wrote:
>
> Investigating GDB PR d/31580 showed that the libiberty demangler
> doesn't automatically demangle D mangled names.  However, I think it
> should -- like C++ and Rust (new-style), D mangled names are readily
> distinguished by the leading "_D", and so the likelihood of confusion
> is low.  The other non-"auto" cases in this code are Ada (where the
> encoded form could more easily be confused by ordinary programs) and
> Java (which is long gone, but which also shared the C++ mangling and
> thus was just an output style preference).
>
> This patch also fixed another GDB bug, though of course that part
> won't apply to the GCC repository.

OK.
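
The reasoning quoted above, that D, C++ and new-style Rust names are told
apart by their leading characters, can be sketched as below.  This is
only an illustration: guess_mangling is a hypothetical helper, not part
of libiberty, and it looks at the prefix only, whereas a real demangler
parses the whole name and can still reject it.

```cpp
#include <cassert>
#include <cstring>

enum class Mangling { Unknown, Cxx, Rust, D };

// Guess the mangling scheme of SYM from its two-character prefix:
// "_Z" for Itanium C++, "_R" for Rust v0, "_D" for D.
Mangling
guess_mangling (const char *sym)
{
  if (std::strncmp (sym, "_Z", 2) == 0)
    return Mangling::Cxx;
  if (std::strncmp (sym, "_R", 2) == 0)
    return Mangling::Rust;
  if (std::strncmp (sym, "_D", 2) == 0)
    return Mangling::D;
  return Mangling::Unknown;
}
```

Because the prefixes do not overlap, trying the D demangler in "auto"
mode should rarely conflict with the other schemes.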

> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31580
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30276
>
> libiberty
> * cplus-dem.c (cplus_demangle): Try the D demangler with
> "auto" format.
> * testsuite/d-demangle-expected: Add --format=auto test.
> ---
>  gdb/testsuite/gdb.dlang/dlang-start-2.exp | 4 +---
>  libiberty/cplus-dem.c | 2 +-
>  libiberty/testsuite/d-demangle-expected   | 5 +
>  3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/gdb/testsuite/gdb.dlang/dlang-start-2.exp b/gdb/testsuite/gdb.dlang/dlang-start-2.exp
> index 4b3163ec97d..284f841b54a 100644
> --- a/gdb/testsuite/gdb.dlang/dlang-start-2.exp
> +++ b/gdb/testsuite/gdb.dlang/dlang-start-2.exp
> @@ -79,10 +79,8 @@ if {[gdb_start_cmd] < 0} {
>  return -1
>  }
>
> -# We should probably have "D main" instead of "_Dmain" here, filed PR30276
> -# '[gdb/symtab] function name is _Dmain instead of "D main"' about that.
>  gdb_test "" \
> -"in _Dmain \\(\\)" \
> +"in D main \\(\\)" \
>  "start"
>
>  gdb_test "show language" {"auto; currently d".}
> diff --git a/libiberty/cplus-dem.c b/libiberty/cplus-dem.c
> index 8b92946981f..ee9e84f5d6b 100644
> --- a/libiberty/cplus-dem.c
> +++ b/libiberty/cplus-dem.c
> @@ -186,7 +186,7 @@ cplus_demangle (const char *mangled, int options)
>if (GNAT_DEMANGLING)
>  return ada_demangle (mangled, options);
>
> -  if (DLANG_DEMANGLING)
> +  if (DLANG_DEMANGLING || AUTO_DEMANGLING)
>  {
>ret = dlang_demangle (mangled, options);
>if (ret)
> diff --git a/libiberty/testsuite/d-demangle-expected b/libiberty/testsuite/d-demangle-expected
> index 47b059c4298..cfbdf2a52cb 100644
> --- a/libiberty/testsuite/d-demangle-expected
> +++ b/libiberty/testsuite/d-demangle-expected
> @@ -1470,3 +1470,8 @@ demangle.anonymous
>  --format=dlang
>  _D8demangle9anonymous03fooZ
>  demangle.anonymous.foo
> +#
> +# Test that 'auto' works.
> +--format=auto
> +_D8demangle9anonymous03fooZ
> +demangle.anonymous.foo
> --
> 2.43.0
>
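
The observation that D mangled names are readily distinguished by the leading "_D" can be sketched as a simple prefix test. `looks_like_d_mangled` is a hypothetical helper, not a libiberty function, and it does not call the real demangler; it only shows why the likelihood of confusion with other schemes is low.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not part of libiberty: a cheap prefix test that
// mirrors the "leading _D" observation in the message above.
inline bool looks_like_d_mangled (const char *sym)
{
  assert (sym != nullptr);
  return std::strncmp (sym, "_D", 2) == 0;
}
```

With this test, the symbol from the new testsuite entry (`_D8demangle9anonymous03fooZ`) is accepted, while an Itanium C++ name such as `_Z3foov` is not. The patch itself leaves the actual decision to dlang_demangle.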


Re: [PATCH] Prettify output of debug_dwarf_die

2024-04-02 Thread Richard Biener
On Thu, Mar 28, 2024 at 8:35 PM Tom Tromey  wrote:
>
> When debugging gcc, I tried calling debug_dwarf_die and I saw this
> output:
>
>   DW_AT_location: location descriptor:
> (0x7fffe9c2e870) DW_OP_dup 0, 0
> (0x7fffe9c2e8c0) DW_OP_bra location descriptor (0x7fffe9c2e640)
> , 0
> (0x7fffe9c2e820) DW_OP_lit4 4, 0
> (0x7fffe9c2e910) DW_OP_skip location descriptor (0x7fffe9c2e9b0)
> , 0
> (0x7fffe9c2e640) DW_OP_dup 0, 0
>
> I think those ", 0" should not appear on their own lines.  The issue
> seems to be that print_dw_val should not generally emit a newline,
> except when recursing.

OK.

> gcc/ChangeLog
>
> * dwarf2out.cc (print_dw_val) : Don't
> print newline when not recursing.
> ---
>  gcc/dwarf2out.cc | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
> index 8f18bc4fe64..1b0e8b5a5b2 100644
> --- a/gcc/dwarf2out.cc
> +++ b/gcc/dwarf2out.cc
> @@ -6651,7 +6651,7 @@ print_dw_val (dw_val_node *val, bool recurse, FILE *outfile)
>  case dw_val_class_loc:
>fprintf (outfile, "location descriptor");
>if (val->v.val_loc == NULL)
> -   fprintf (outfile, " -> \n");
> +   fprintf (outfile, " -> ");
>else if (recurse)
> {
>   fprintf (outfile, ":\n");
> @@ -6662,9 +6662,9 @@ print_dw_val (dw_val_node *val, bool recurse, FILE *outfile)
>else
> {
>   if (flag_dump_noaddr || flag_dump_unnumbered)
> -   fprintf (outfile, " #\n");
> +   fprintf (outfile, " #");
>   else
> -   fprintf (outfile, " (%p)\n", (void *) val->v.val_loc);
> +   fprintf (outfile, " (%p)", (void *) val->v.val_loc);
> }
>break;
>  case dw_val_class_loc_list:
> --
> 2.43.0
>
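
The idea behind the fix, that a value printer should not terminate the line so the caller can append the rest, can be modeled as below. This is a rough sketch: `format_loc_ref` and `format_operand_line` are hypothetical names, strings are used instead of a FILE stream, and the ", 0" suffix is assumed to be produced by the caller, based only on the output quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the printing split: the value formatter returns text
// without a trailing newline.
inline std::string format_loc_ref (const char *addr)
{
  return std::string ("location descriptor (") + addr + ")";
}

// The caller appends the remaining text and the newline, so everything for
// one operation stays on a single line.
inline std::string format_operand_line (const char *op, const char *target)
{
  return std::string (op) + " " + format_loc_ref (target) + ", 0\n";
}
```

If `format_loc_ref` ended with its own newline, the ", 0" would land on a separate line, which is the symptom described in the message.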


Re: [PATCH] LoongArch: Remove unused code and add sign/zero-extend for vpickve2gr.d

2024-04-02 Thread xujiahao
We recently discovered an issue with the sign/zero extension behavior of
the [x]vpickve2gr instructions: the QI, HI and SI elements are extended to
SI instead of DI, which may lead to the generation of additional
sign-extension instructions. We have decided to fix this issue in the next
version. I will upload a new patch later that only removes the unused code.
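
A rough C++ model of why the width of the extension matters: if an 8-bit element is only known to be sign-extended to 32 bits, a 64-bit consumer needs one more extension step. The function names are hypothetical, and modeling the unspecified upper half of the register as zero is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an 8-bit element only as far as 32 bits.  The upper 32 bits of
// the 64-bit result are modeled as zero (assumption for illustration).
inline uint64_t extend_qi_to_si (int8_t v)
{
  return static_cast<uint32_t> (static_cast<int32_t> (v));
}

// Sign-extend an 8-bit element all the way to 64 bits.
inline uint64_t extend_qi_to_di (int8_t v)
{
  return static_cast<uint64_t> (static_cast<int64_t> (v));
}

// The extra step a 64-bit user has to perform on the 32-bit result.
inline uint64_t extend_si_to_di (uint64_t si_value)
{
  return static_cast<uint64_t> (
    static_cast<int64_t> (static_cast<int32_t> (static_cast<uint32_t> (si_value))));
}
```

For a negative element the two results differ until the extra step is applied, which corresponds to the additional sign-extension instruction mentioned above.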


On 2024/3/22 16:03, Jiahao Xu wrote:

We do not support machines that satisfy ISA_HAS_LSX && !TARGET_64BIT, now or
in the future, so this patch removes the unused code for them.

This patch also adds sign/zero-extend operations to vpickve2gr.d to match the
actual instruction behavior, and integrates the template definition of
vpickve2gr.

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove unused code.
* config/loongarch/loongarch-protos.h (loongarch_split_lsx_copy_d): Remove.
(loongarch_split_lsx_insert_d): Ditto.
(loongarch_split_lsx_fill_d): Ditto.
* config/loongarch/loongarch.cc (loongarch_split_lsx_copy_d): Ditto.
(loongarch_split_lsx_insert_d): Ditto.
(loongarch_split_lsx_fill_d): Ditto.
* config/loongarch/lsx.md (lsx_vpickve2gr_): Redefine.
(lsx_vpickve2gr_du): Remove.
(lsx_vpickve2gr_): Ditto.

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 2fa5e46c8e8..7bd61f8ed5b 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -572,12 +572,7 @@ (define_insn "lasx_xvinsgr2vr_"
  (match_operand 3 "const__operand" "")))]
"ISA_HAS_LASX"
  {
-#if 0
-  if (!TARGET_64BIT && (mode == V4DImode || mode == V4DFmode))
-return "#";
-  else
-#endif
-return "xvinsgr2vr.\t%u0,%z1,%y3";
+  return "xvinsgr2vr.\t%u0,%z1,%y3";
  }
[(set_attr "type" "simd_insert")
 (set_attr "mode" "")])
@@ -1446,10 +1441,7 @@ (define_insn "lasx_xvreplgr2vr_"
if (which_alternative == 1)
  return "xvldi.b\t%u0,0" ;
  
-  if (!TARGET_64BIT && (mode == V2DImode || mode == V2DFmode))
-return "#";
-  else
-return "xvreplgr2vr.\t%u0,%z1";
+  return "xvreplgr2vr.\t%u0,%z1";
  }
[(set_attr "type" "simd_fill")
 (set_attr "mode" "")
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index e3ed2b912a5..e238d795a73 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -89,9 +89,6 @@ extern void loongarch_split_128bit_move (rtx, rtx);
  extern bool loongarch_split_128bit_move_p (rtx, rtx);
  extern void loongarch_split_256bit_move (rtx, rtx);
  extern bool loongarch_split_256bit_move_p (rtx, rtx);
-extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx));
-extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
-extern void loongarch_split_lsx_fill_d (rtx, rtx);
  extern const char *loongarch_output_move (rtx, rtx);
  #ifdef RTX_CODE
  extern void loongarch_expand_scc (rtx *);
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 030957db4e7..34850a0fc64 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4759,82 +4759,6 @@ loongarch_split_256bit_move (rtx dest, rtx src)
  }
  }
  
-

-/* Split a COPY_S.D with operands DEST, SRC and INDEX.  GEN is a function
-   used to generate subregs.  */
-
-void
-loongarch_split_lsx_copy_d (rtx dest, rtx src, rtx index,
-   rtx (*gen_fn)(rtx, rtx, rtx))
-{
-  gcc_assert ((GET_MODE (src) == V2DImode && GET_MODE (dest) == DImode)
- || (GET_MODE (src) == V2DFmode && GET_MODE (dest) == DFmode));
-
-  /* Note that low is always from the lower index, and high is always
- from the higher index.  */
-  rtx low = loongarch_subword (dest, false);
-  rtx high = loongarch_subword (dest, true);
-  rtx new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0);
-
-  emit_insn (gen_fn (low, new_src, GEN_INT (INTVAL (index) * 2)));
-  emit_insn (gen_fn (high, new_src, GEN_INT (INTVAL (index) * 2 + 1)));
-}
-
-/* Split a INSERT.D with operand DEST, SRC1.INDEX and SRC2.  */
-
-void
-loongarch_split_lsx_insert_d (rtx dest, rtx src1, rtx index, rtx src2)
-{
-  int i;
-  gcc_assert (GET_MODE (dest) == GET_MODE (src1));
-  gcc_assert ((GET_MODE (dest) == V2DImode
-  && (GET_MODE (src2) == DImode || src2 == const0_rtx))
- || (GET_MODE (dest) == V2DFmode && GET_MODE (src2) == DFmode));
-
-  /* Note that low is always from the lower index, and high is always
- from the higher index.  */
-  rtx low = loongarch_subword (src2, false);
-  rtx high = loongarch_subword (src2, true);
-  rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
-  rtx new_src1 = simplify_gen_subreg (V4SImode, src1, GET_MODE (src1), 0);
-  i = exact_log2 (INTVAL (index));
-  gcc_assert (i != -1);
-
-  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, low, new_src1,
- GEN_INT (1 << (i * 2;
-  emit_insn (gen_lsx_vinsgr2vr_w 

Re: [pushed][PATCH v5] LoongArch: Add support for TLS descriptors

2024-04-02 Thread chenglulu

Pushed to r14-9742.

Rebased onto the latest sources, and modified invoke.texi to add a
description of the TLS DESC compilation option.


On 2024/3/19 9:54 AM, mengqinggang wrote:

Add support for TLS descriptors on normal code model and extreme code model.

Normal code model instruction sequence:
   -mno-explicit-relocs:
 la.tls.desc$r4, s
 add.d  $r12, $r4, $r2
   -mexplicit-relocs:
 pcalau12i  $r4,%desc_pc_hi20(s)
 addi.d $r4,$r4,%desc_pc_lo12(s)
 ld.d   $r1,$r4,%desc_ld(s)
 jirl   $r1,$r1,%desc_call(s)
 add.d  $r12, $r4, $r2

Extreme code model instruction sequence:
   -mno-explicit-relocs:
 la.tls.desc$r4, $r12, s
 add.d  $r12, $r4, $r2
   -mexplicit-relocs:
 pcalau12i  $r4,%desc_pc_hi20(s)
 addi.d $r12,$r0,%desc_pc_lo12(s)
 lu32i.d$r12,%desc64_pc_lo20(s)
 lu52i.d$r12,$r12,%desc64_pc_hi12(s)
 add.d  $r4,$r4,$r12
 ld.d   $r1,$r4,%desc_ld(s)
 jirl   $r1,$r1,%desc_call(s)
 add.d  $r12, $r4, $r2

The default is still the traditional TLS model, but it can be configured with
--with-tls={trad,desc}. The default can change to TLS descriptors once
libc and LLVM support this.
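
A minimal source-level sketch of what triggers these sequences: any access to a thread-local variable needs its address computed at run time. The variable is named `s` to match the symbol in the listings above; the function names are hypothetical. Compiling this for LoongArch with -mtls-dialect=desc (the option is named in the ChangeLog below, while the `desc` value is an assumption based on --with-tls={trad,desc}) would presumably select one of the descriptor sequences; the code itself is portable C++.

```cpp
// Thread-local variable named `s` to match the symbol in the sequences above.
thread_local int s = 0;

// Taking the address forces the compiler to materialize the TLS address of
// `s` for the calling thread.
int *address_of_s () { return &s; }

// An ordinary read-modify-write access goes through the same address
// computation.
int bump_s () { return ++s; }
```

Each thread sees its own copy of `s`, which is why the address cannot be a link-time constant and has to be resolved through one of the TLS models.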

gcc/ChangeLog:

* config.gcc: Add --with-tls option to change TLS flavor.
* config/loongarch/genopts/loongarch.opt.in: Add -mtls-dialect to
configure TLS flavor.
* config/loongarch/loongarch-def.h (struct loongarch_target): Add
tls_dialect.
* config/loongarch/loongarch-driver.cc (la_driver_init): Add tls
flavor.
* config/loongarch/loongarch-opts.cc (loongarch_init_target): Add
tls_dialect.
(loongarch_config_target): Ditto.
(loongarch_update_gcc_opt_status): Ditto.
* config/loongarch/loongarch-opts.h (loongarch_init_target):Ditto.
(TARGET_TLS_DESC): New define.
* config/loongarch/loongarch.cc (loongarch_symbol_insns): Add TLS DESC
instructions sequence length.
(loongarch_legitimize_tls_address): New TLS DESC instruction sequence.
(loongarch_option_override_internal): Add la_opt_tls_dialect.
(loongarch_option_restore): Add la_target.tls_dialect.
* config/loongarch/loongarch.md (@got_load_tls_desc): Normal
code model for TLS DESC.
(got_load_tls_desc_off64): Extreme code model for TLS DESC.
* config/loongarch/loongarch.opt: Regenerated.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/cmodel-extreme-1.c: Add -mtls-dialect=trad.
* gcc.target/loongarch/cmodel-extreme-2.c: Ditto.
* gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c: Ditto.
* gcc.target/loongarch/explicit-relocs-medium-call36-auto-tls-ld-gd.c:
Ditto.
* gcc.target/loongarch/func-call-medium-1.c: Ditto.
* gcc.target/loongarch/func-call-medium-2.c: Ditto.
* gcc.target/loongarch/func-call-medium-3.c: Ditto.
* gcc.target/loongarch/func-call-medium-4.c: Ditto.
* gcc.target/loongarch/tls-extreme-macro.c: Ditto.
* gcc.target/loongarch/tls-gd-noplt.c: Ditto.
* gcc.target/loongarch/explicit-relocs-auto-extreme-tls-desc.c: New test.
* gcc.target/loongarch/explicit-relocs-auto-tls-desc.c: New test.
* gcc.target/loongarch/explicit-relocs-extreme-tls-desc.c: New test.
* gcc.target/loongarch/explicit-relocs-tls-desc.c: New test.

Co-authored-by: Lulu Cheng 
Co-authored-by: Xi Ruoyao 
---
Changes v4 -> v5:
- Use (reg:P 4) instead of match_operand in got_load_tls_desc and
   got_load_tls_desc_off64.
- Change instruction sequence to prevent additional white spaces in the
   output asm before tabs.

Changes v3 -> v4:
- Add TLS descriptors test cases.

Changes v2 -> v3:
- Set default to traditional TLS model.
- Add support for -mexplicit-relocs and extreme code model.

Changes v1 -> v2:
- Clobber fcc0-fcc7 registers in got_load_tls_desc template.
- Support --with-tls in configure.

v4 link: https://sourceware.org/pipermail/gcc-patches/2024-March/647597.html
v3 link: https://sourceware.org/pipermail/gcc-patches/2024-March/647578.html
v2 link: https://sourceware.org/pipermail/gcc-patches/2024-February/646817.html
v1 link: https://sourceware.org/pipermail/gcc-patches/2023-December/638907.html

  gcc/config.gcc| 19 +-
  gcc/config/loongarch/genopts/loongarch.opt.in | 14 
  gcc/config/loongarch/loongarch-def.h  |  7 ++
  gcc/config/loongarch/loongarch-driver.cc  |  2 +-
  gcc/config/loongarch/loongarch-opts.cc| 12 +++-
  gcc/config/loongarch/loongarch-opts.h |  2 +
  gcc/config/loongarch/loongarch.cc | 47 +
  gcc/config/loongarch/loongarch.md | 68 +++
  gcc/config/loongarch/loongarch.opt| 14 
  .../gcc.target/loongarch/cmodel-extreme-1.c   |  2 +-
  .../gcc.target/loongarch/cmodel-extreme-2.c   |  2 +-
  .../explicit-relocs-auto-extreme-tls-desc.c   | 10 +++
  

Re: [PATCH v2] rs6000: Stackoverflow in optimized code on PPC [PR100799]

2024-04-02 Thread Kewen.Lin
Hi!

on 2024/3/24 02:37, Ajit Agarwal wrote:
> 
> 
> On 23/03/24 9:33 pm, Peter Bergner wrote:
>> On 3/23/24 4:33 AM, Ajit Agarwal wrote:
> -  else if (align_words < GP_ARG_NUM_REG)
> +  else if (align_words < GP_ARG_NUM_REG
> +|| (cum->hidden_string_length
> +&& cum->actual_parm_length <= GP_ARG_NUM_REG))
 {
   if (TARGET_32BIT && TARGET_POWERPC64)
 return rs6000_mixed_function_arg (mode, type, align_words);

   return gen_rtx_REG (mode, GP_ARG_MIN_REG + align_words);
 }
   else
 return NULL_RTX;

 The old code for the unused hidden parameter (which was the 9th param) would
 fall thru to the "return NULL_RTX;" which would make the callee assume there
 was a parameter save area allocated.  Now instead, we'll return a reg rtx,
 probably of r11 (r3 thru r10 are our param regs) and I'm guessing we'll now
 see a copy of r11 into a pseudo like we do for the other param regs.
 Is that a problem? Given it's an unused parameter, it'll probably get deleted
 as dead code, but could it cause any issues?  What if we have more than one

I think Peter raised a good point.  I'm not sure it would really cause issues,
but the assigned reg goes beyond GP_ARG_MAX_REG, which is at least confusing
to people, especially without DCE, as at -O0.  Can we aggressively remove these
candidates from the DECL_ARGUMENTS chain?  Does it cause any assertion to fail?

BR,
Kewen


 unused hidden parameter and we return r12 and r13 which have specific uses
 in our ABIs (eg, r13 is our TCB pointer), so it may not actually look dead.
 Have you verified what the callee RTL looks like after expand for these
 unused hidden parameters?  Is there a rtx we can return that isn't a NULL_RTX
 which triggers the assumption of a parameter save area, but isn't a reg rtx
 which might lead to some rtl being generated?  Would a (const_int 0) or
 something else work?


>>> For the above use case it will return 
>>>
>>> (reg:DI 5 %r5) and below check entry_parm = 
>>> (reg:DI 5 %r5) and the following check will not return TRUE and hence
>>>parameter save area will not be allocated.
>>
>> Why r5?!?!   The 8th (integer) param would return r10, so I'd assume if
>> the next param was a hidden param, then it'd get the next gpr, so r11.
>> How does it jump back to r5 which may have been used by the 3rd param?
>>
>>
> My mistake, it's r11 only for the hidden param.
>>
>>
>>
>>> It will not generate any rtx in the callee rtl code; it is just used to
>>> check whether to allocate a parameter save area or not when the number of
>>> args > 8.
>>>
>>> /* If there is no incoming register, we need a stack.  */
>>>   entry_parm = rs6000_function_arg (args_so_far, arg);
>>>   if (entry_parm == NULL)
>>> return true;
>>>
>>>   /* Likewise if we need to pass both in registers and on the stack.  */
>>>   if (GET_CODE (entry_parm) == PARALLEL
>>>   && XEXP (XVECEXP (entry_parm, 0, 0), 0) == NULL_RTX)
>>> return true;
>>
>> Yes, this code in rs6000_parm_needs_stack() uses the rs6000_function_arg()
>> return value as a boolean to tell us whether a parameter save area is
>> required, so what we return is unimportant other than to know it's not
>> NULL_RTX.
>>
>> I'm more concerned about the use of the target hook
>> targetm.calls.function_arg in the generic parts of the compiler.  What will
>> that code do differently now that we return a reg rtx rather than NULL_RTX?
>> Might that code use the reg rtx to emit something?  I'd feel better if you
>> could verify what happens in that code when we return a reg rtx for that 9th
>> hidden param which isn't really being passed in a register.
>>
> 
> As per my understanding, and from debugging the OpenBLAS testcase, I see that
> the reg rtx returned inside the IF condition below is used to check whether a
> parameter save area is needed or not.
> 
> In the generic code where targetm.calls.function_arg is called in calls.cc,
> the returned rtx is used for the PARALLEL case, so that we can check whether
> the argument has to be passed both in registers and on the stack; in that
> case a store is emitted with respect to the returned rtx. If we identify that
> only registers are needed for the argument, nothing is emitted.
> 
> Thanks & Regards
> Ajit
>>
>> Peter
>>
>>
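
The register arithmetic discussed in this thread can be checked with a small model. This is a simplified sketch, not the rs6000 implementation: register numbers are plain integers, NO_REG stands in for NULL_RTX, the `model_` functions are hypothetical, and the constants follow the statement above that r3 thru r10 are the parameter registers. Only the patched condition and the first check quoted from rs6000_parm_needs_stack() are modeled.

```cpp
#include <cassert>

// Constants as described in the thread: r3 through r10 carry parameters.
constexpr int GP_ARG_MIN_REG = 3;
constexpr int GP_ARG_NUM_REG = 8;
constexpr int NO_REG = -1; // stands in for NULL_RTX in this model

// Hypothetical, simplified model of the patched branch.  The real
// rs6000_function_arg handles many more cases.
inline int model_function_arg (int align_words, bool hidden_string_length,
                               int actual_parm_length)
{
  assert (align_words >= 0);
  if (align_words < GP_ARG_NUM_REG
      || (hidden_string_length && actual_parm_length <= GP_ARG_NUM_REG))
    return GP_ARG_MIN_REG + align_words;
  return NO_REG;
}

// Mirrors the "no incoming register means we need a stack" check.
inline bool model_parm_needs_stack (int reg) { return reg == NO_REG; }
```

In this model a 9th, hidden parameter (align_words equal to 8) yields r11 when the new condition holds, which is the register beyond r10 that Peter and Kewen are concerned about, and yields NO_REG, and therefore a parameter save area, without it.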