Re: [PATCH] RISC-V: Support CPUs in -march.

2025-06-01 Thread Robin Dapp
I stumped across this change from 
https://github.com/riscv-non-isa/riscv-toolchain-conventions/issues/88

and I want to express my strong disagreement with this change.


Perhaps I'm accustomed to Arm's behavior, but I believe using -march= to 
target a specific CPU isn't ideal.


* -march=X: (execution domain) Generate code that can use instructions 
available in the architecture X
* -mtune=X: (optimization domain) Optimize for the microarchitecture X, but 
does not change the ABI or make assumptions about available instructions
* -mcpu=X: Specify both -march= and -mtune= but can be overridden by the two 
options. The supported values are generally the same as -mtune=. The 
architecture name is inferred from X


For execution domain settings, -march=X overrides -mcpu=X regardless of their 
positions.


In cases like `-march=LOWER -mcpu=HIGHER` or `-mcpu=HIGHER -march=LOWER`, the 
-march= option can disable certain target features.


I strongly disagree with Clang adopting this behavior.
I'm not convinced by the GCC patch explanation.


Suppose we have a Makefile that specifies -march=rv64gc by default.


In the project specifies a lower feature set, then the compiler should 
respect it or the user should fix the project build system.


There are two reasons for the change:

- Adhere to the usual "last option wins" convention and at the same time
- Make it easy to specify an overriding -march string representing a CPU 
  without spelling out the >200 char expanded arch string, a very error-prone 
  procedure.


I agree that fixing the "project build system" might be another way of having 
only a single -march in a compiler command line but it doesn't match my 
experience.  Over the years I have seen a significant number of projects where 
multiple -marchs where present.  With the "last option wins" convention this 
works and a user always has a "last ditch" way of setting their preferred 
architecture.
A typical use case is a default march that is overridden by CFLAGS.  There's 
nothing to be fixed here because it's standard procedure.  In that situation we 
would always need the full -march string and that can't be what we want?


What's your main concern?  The less clear separation between arch and CPU?
I don't think the change takes anything away from -march but just adds a 
shortcut "set the arch level to what this CPU supports".  That's fully in line 
with what other targets are doing,  also on the clang side.


Ideally (IMHO) we'd also imply an -mtune when specifying -march by default 
rather than using -mtune=rocket or so.


If we had an -mcpu=... that overrode -march and -mtune we'd already have a good 
way of achieving what I intended.  In that case -march wouldn't need to be 
changed but to my understanding this has been unwanted so far.  For me a "last 
option wins" -mcpu that overrides march and mtune would also work and would 
keep the clear separation between -march and -mtune.


--
Regards
Robin



[PING * 3][PATCH v3] Add new warning Wmissing-designated-initializers [PR39589]

2025-06-01 Thread Peter Frost

Ping https://gcc.gnu.org/pipermail/gcc-patches/2025-January/672568.html


Re: DEFAULT_PCC_STRUCT_RETURN on NetBSD vs Linux

2025-06-01 Thread Jeff Law




On 6/1/25 8:21 AM, John Paul Adrian Glaubitz wrote:



And what about the value for STACK_BOUNDARY? It seems to be 16 for many
Linux targets while it's 32 for NetBSD. Is there a reason why it's 16-bit-
aligned on Linux?
Regardless of differences in the values, I don't see changing them at 
this point as that would be an ABI change.


Jeff



Re: [PATCH v4 04/20] Remove unnecessary `record` argument from maybe_version_functions.

2025-06-01 Thread Jeff Law




On 5/28/25 4:10 AM, Alfie Richards wrote:

Previously, the `record` argument in maybe_version_function allowed the
call to cgraph_node::record_function_versions to be skipped.  However,
this was only skipped when both decls were already marked as versioned,
in which case we trigger the early exit in record_function_versions
instead. Therefore, the argument is unnecessary.

gcc/cp/ChangeLog:

* class.cc (add_method): Remove argument.
* cp-tree.h (maybe_version_functions): Ditto.
* decl.cc (decls_match): Ditto.
(maybe_version_functions): Ditto.

OK
jeff



[PATCH] c: Move checking assertions from recursion when forming composite types to avoid ICE.

2025-06-01 Thread Martin Uecker


After looking a bit more at this, it turns out that it is still
possible to trigger the assertion that checks that the composite
type is compatible to the original types when using self-referential
types where it then sees composite types which are still under
construction.

This patch now moves the checking assertion out of the recursion,
which seems better anyhow. I could not do the check unconditionally
for all types. because we sometimes call composite_type for with
mismatching qualifiers.

There is also still a remaining issue with setting
C_VARIABLY_MODIFIED_TYPE_P correctly for composite types for
such self-referential types.


Bootstrapped and regression tested for x86_64.

Martin


c: Move checking assertions from recursion when forming composite types to 
avoid ICE.

The checking assertion in composite_type_internal for structures and unions 
may
fail if there are self-referential types.  To avoid this, we move them out 
of
the recursion.  This should also be more efficient and covers other types.
We have to ignore some cases where we form composite types despite 
qualifiers
not matching.

gcc/c/ChangeLog:
* c-typeck.cc (composite_type_internal,composite_type): Move
checking assertions.

gcc/testsuite/ChangeLog:
* gcc.dg/gnu23-tag-composite-6.c: Update.

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 2f243cab8da..417db8795c5 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -846,12 +846,7 @@ composite_type_internal (tree t1, tree t2, struct 
composite_cache* cache)
  n = finish_struct (input_location, n, fields, attributes, NULL,
 &expr);
 
- n = qualify_type (n, t1);
-
- gcc_checking_assert (!TYPE_NAME (n) || comptypes (n, t1));
- gcc_checking_assert (!TYPE_NAME (n) || comptypes (n, t2));
-
- return n;
+ return qualify_type (n, t1);
}
   /* FALLTHRU */
 case ENUMERAL_TYPE:
@@ -1004,7 +999,15 @@ tree
 composite_type (tree t1, tree t2)
 {
   struct composite_cache cache = { };
-  return composite_type_internal (t1, t2, &cache);
+  tree n = composite_type_internal (t1, t2, &cache);
+  /* For function and arrays there are some exceptions where
+ qualifiers do not match.  */
+  if (FUNCTION_TYPE != TREE_CODE (n) && ARRAY_TYPE != TREE_CODE (n))
+{
+  gcc_checking_assert (comptypes (n, t1));
+  gcc_checking_assert (comptypes (n, t2));
+}
+  return n;
 }
 
 /* Return the type of a conditional expression between pointers to
diff --git a/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c 
b/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c
index 2411b04d388..076c066f9c8 100644
--- a/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c
+++ b/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c
@@ -1,11 +1,31 @@
 /* { dg-do compile } */
 /* { dg-options "-std=gnu23" } */
 
+#define NEST(...) typeof(({ (__VA_ARGS__){ }; }))
+
 int f()
 {
 typedef struct foo bar;
-struct foo { typeof(({ (struct foo { bar * x; }){ }; })) * x; } *q;
-typeof(q->x) p;
-1 ? p : q;
+struct foo { NEST(struct foo { bar *x; }) *x; } *q;
+typeof(q->x) p0;
+typeof(q->x) p1;
+1 ? p0 : q;
+1 ? p1 : q;
+1 ? p0 : p1;
+}
+
+int g()
+{
+typedef struct fo2 bar;
+struct fo2 { NEST(struct fo2 { NEST(struct fo2 { bar *x; }) * x; }) *x; } 
*q;
+typeof(q->x) p0;
+typeof(q->x->x) p1;
+typeof(q->x->x->x) p2;
+1 ? p0 : q;
+1 ? p1 : q;
+1 ? p2 : q;
+1 ? p0 : p1;
+1 ? p2 : p1;
+1 ? p0 : p2;
 }
 



Re: [PATCH] testsuite: RISC-V: Fix the typo in param-autovec-mode.c

2025-06-01 Thread Jeff Law




On 5/28/25 7:59 PM, Liao Shihua wrote:

This patch fixes the typo in the test case `param-autovec-mode.c` in the RISC-V 
autovec testsuite.
The option `autovec-mode` is changed to `riscv-autovec-mode` to match the 
expected parameter name.



gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/param-autovec-mode.c: Change 
`autovec-mode` to `riscv-autovec-mode` in dg-options.

Wrapped the overly long ChangeLog line and push this to the trunk for you.

Thanks,
jeff


Re: [PATCH] Add newlib to gitignore

2025-06-01 Thread Jeff Law




On 6/1/25 1:40 PM, Arijit Kumar Das wrote:

newlib sources are not a part of GCC so should be ignored, if present.

Signed-off-by: Arijit Kumar Das 
---
  .gitignore | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/.gitignore b/.gitignore
index 7150fc3b29c..5ae6a5a08d6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -74,3 +74,6 @@ stamp-*
  
  # ADDITIONS from GCCRS front-end

  libgrust/*/target/
+
+# newlib sources are not a part of GCC so should be ignored, if present as a 
symlink
+newlib
There's all kind of directories that would fit this description, say for 
a single tree build, but which aren't included in .gitignore (binutils, 
gas, ld, opcodes, etc etc).


Why would newlib be special here?  I'm not inclined to include this 
without a strong reason.


jeff



Re: [PATCH] or1k: Fix struct return test

2025-06-01 Thread Jeff Law




On 5/31/25 12:03 AM, Stafford Horne wrote:

In or1k structs are returned from functions using the memory address
passed in r3.  In the current version of GCC the struct stores changed
from r11 (the return value) to r3 the incoming memory address.  Both of
are valid.

Adjust the test to match what GCC is producing now.

gcc/testsuite/ChangeLog:

* gcc.target/or1k/return-2.c: Fix test.
Given you're the maintainer of this port it seems like you can 
self-approve and commit this and the long jump offset patch.


Jeff



Re: [PATCH v4] c++: Unwrap type traits defined in terms of builtins within diagnostics [PR117294]

2025-06-01 Thread Patrick Palka
On Sat, 31 May 2025, Nathaniel Shead wrote:

> On Fri, May 30, 2025 at 11:10:08AM -0400, Patrick Palka wrote:
> > On Fri, 30 May 2025, Patrick Palka wrote:
> > 
> > > On Fri, 30 May 2025, Nathaniel Shead wrote:
> > > 
> > > > On Wed, May 28, 2025 at 02:14:06PM -0400, Patrick Palka wrote:
> > > > > On Tue, 27 May 2025, Nathaniel Shead wrote:
> > > > > 
> > > > > > On Wed, Nov 27, 2024 at 11:45:40AM -0500, Patrick Palka wrote:
> > > > > > > On Fri, 8 Nov 2024, Nathaniel Shead wrote:
> > > > > > > 
> > > > > > > > Does this approach seem reasonable?  I'm pretty sure that the 
> > > > > > > > way I've
> > > > > > > > handled the templating here is unideal but I'm not sure what a 
> > > > > > > > neat way
> > > > > > > > to do what I'm trying to do here would be; any comments are 
> > > > > > > > welcome.
> > > > > > > 
> > > > > > > Clever approach, I like it!
> > > > > > > 
> > > > > > > > 
> > > > > > > > -- >8 --
> > > > > > > > 
> > > > > > > > Currently, concept failures of standard type traits just report
> > > > > > > > 'expression X evaluates to false'.  However, many type 
> > > > > > > > traits are
> > > > > > > > actually defined in terms of compiler builtins; we can do 
> > > > > > > > better here.
> > > > > > > > For instance, 'is_constructible_v' could go on to explain why 
> > > > > > > > the type
> > > > > > > > is not constructible, or 'is_invocable_v' could list potential
> > > > > > > > candidates.
> > > > > > > 
> > > > > > > That'd be great improvement.
> > > > > > > 
> > > > > > > > 
> > > > > > > > As a first step to supporting that we need to be able to map the
> > > > > > > > standard type traits to the builtins that they use.  Rather 
> > > > > > > > than adding
> > > > > > > > another list that would need to be kept up-to-date whenever a 
> > > > > > > > builtin is
> > > > > > > > added, this patch instead tries to detect any variable template 
> > > > > > > > defined
> > > > > > > > directly in terms of a TRAIT_EXPR.
> > > > > > > > 
> > > > > > > > To avoid false positives, we ignore any variable templates that 
> > > > > > > > have any
> > > > > > > > specialisations (partial or explicit), even if we wouldn't have 
> > > > > > > > chosen
> > > > > > > > that specialisation anyway.  This shouldn't affect any of the 
> > > > > > > > standard
> > > > > > > > library type traits that I could see.
> > > > > > > 
> > > > > > > You should be able to tsubst the TEMPLATE_ID_EXPR directly and 
> > > > > > > look at
> > > > > > > its TI_PARTIAL_INFO in order to determine which (if any) partial
> > > > > > > specialization was selected.  And if an explicit specialization 
> > > > > > > was
> > > > > > > selected the resulting VAR_DECL will have 
> > > > > > > DECL_TEMPLATE_SPECIALIZATION
> > > > > > > set.
> > > > > > > 
> > > > > > > > ...[snip]...
> > > > > > > 
> > > > > > > If we substituted the TEMPLATE_ID_EXPR as a whole we could use the
> > > > > > > DECL_TI_ARGS of that IIUC?
> > > > > > > 
> > > > > > 
> > > > > > Thanks for your comments, they were very helpful.  Here's a totally 
> > > > > > new
> > > > > > approach which I'm much happier with.  I've also removed the 
> > > > > > "disable in
> > > > > > case any specialisation exists" logic, as on further reflection I 
> > > > > > don't
> > > > > > imagine this to be the kind of issue I thought it might have been.
> > > > > > 
> > > > > > With this patch,
> > > > > > 
> > > > > >   template 
> > > > > >   constexpr bool is_default_constructible_v = __is_constructible(T);
> > > > > > 
> > > > > >   template 
> > > > > >   concept default_constructible = is_default_constructible_v;
> > > > > > 
> > > > > >   static_assert(default_constructible);
> > > > > > 
> > > > > > now emits the following error:
> > > > > > 
> > > > > >   test.cpp:6:15: error: static assertion failed
> > > > > >   6 | static_assert(default_constructible);
> > > > > > |   ^~~
> > > > > >   test.cpp:6:15: note: constraints not satisfied
> > > > > >   test.cpp:4:9:   required by the constraints of ‘template 
> > > > > > concept default_constructible’
> > > > > >   test.cpp:4:33: note:   ‘void’ is not default constructible
> > > > > >   4 | concept default_constructible = 
> > > > > > is_default_constructible_v;
> > > > > > | 
> > > > > > ^
> > > > > > 
> > > > > > There's still a lot of improvements to be made in this area, I 
> > > > > > think:
> > > > > > 
> > > > > > - I haven't yet looked into updating the specific diagnostics 
> > > > > > emitted by
> > > > > >   the traits; I'd like to try to avoid too much code duplication 
> > > > > > with
> > > > > >   the implementation in cp/semantics.cc.  (I also don't think the 
> > > > > > manual
> > > > > >   indentation at the start of the message is particularly helpful?)
> > > > > 
> > > > > For is_xible / is_convertible etc, perhaps they could use a 'complain'
> > > > > parameter that they propa

Re: [PATCH] RISC-V: Implement full-featured iterator for riscv_subset_list [NFC]

2025-06-01 Thread Kito Cheng
CI passed, pushed to trunk :)

On Thu, May 29, 2025 at 1:59 PM Kito Cheng  wrote:
>
> This commit implements a full-featured iterator for the
> riscv_subset_list, that it able to use range-based-for-loop.
>
> That could simplfy the code in the future, and make it more readable,
> also more compatible with standard C++ containers.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Use
> range-based-for-loop.
> * config/riscv/riscv-subset.h (riscv_subset_list::iterator):
> New.
> (riscv_subset_list::const_iterator): New.
> ---
>  gcc/config/riscv/riscv-c.cc | 18 --
>  gcc/config/riscv/riscv-subset.h | 62 +++--
>  2 files changed, 66 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> index 1ff1968f6d8..d2c0af35955 100644
> --- a/gcc/config/riscv/riscv-c.cc
> +++ b/gcc/config/riscv/riscv-c.cc
> @@ -239,26 +239,22 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
>size_t max_ext_len = 0;
>
>/* Figure out the max length of extension name for reserving buffer.   */
> -  for (const riscv_subset_t *subset = subset_list->begin ();
> -   subset != subset_list->end ();
> -   subset = subset->next)
> -max_ext_len = MAX (max_ext_len, subset->name.length ());
> +  for (auto &subset : *subset_list)
> +max_ext_len = MAX (max_ext_len, subset.name.length ());
>
>char *buf = (char *)alloca (max_ext_len + 10 /* For __riscv_ and '\0'.  
> */);
>
> -  for (const riscv_subset_t *subset = subset_list->begin ();
> -   subset != subset_list->end ();
> -   subset = subset->next)
> +  for (auto &subset : *subset_list)
>  {
> -  int version_value = riscv_ext_version_value (subset->major_version,
> -  subset->minor_version);
> +  int version_value = riscv_ext_version_value (subset.major_version,
> +  subset.minor_version);
>/* Special rule for zicsr and zifencei, it's used for ISA spec 2.2 or
>  earlier.  */
> -  if ((subset->name == "zicsr" || subset->name == "zifencei")
> +  if ((subset.name == "zicsr" || subset.name == "zifencei")
>   && version_value == 0)
> version_value = riscv_ext_version_value (2, 0);
>
> -  sprintf (buf, "__riscv_%s", subset->name.c_str ());
> +  sprintf (buf, "__riscv_%s", subset.name.c_str ());
>builtin_define_with_int_value (buf, version_value);
>  }
>  }
> diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h
> index c5d9fab4de9..a35537d7754 100644
> --- a/gcc/config/riscv/riscv-subset.h
> +++ b/gcc/config/riscv/riscv-subset.h
> @@ -109,9 +109,6 @@ public:
>static riscv_subset_list *parse (const char *, location_t);
>const char *parse_single_ext (const char *, bool exact_single_p = true);
>
> -  const riscv_subset_t *begin () const {return m_head;};
> -  const riscv_subset_t *end () const {return NULL;};
> -
>int match_score (riscv_subset_list *) const;
>
>void set_loc (location_t);
> @@ -119,6 +116,65 @@ public:
>void set_allow_adding_dup (bool v) { m_allow_adding_dup = v; }
>
>void finalize ();
> +
> +  class iterator
> +  {
> +  public:
> +explicit iterator(riscv_subset_t *node) : m_node(node) {}
> +
> +riscv_subset_t &operator*() const { return *m_node; }
> +riscv_subset_t *operator->() const { return m_node; }
> +
> +iterator &operator++()
> +{
> +  if (m_node)
> +   m_node = m_node->next;
> +  return *this;
> +}
> +
> +bool operator!=(const iterator &other) const
> +{
> +  return m_node != other.m_node;
> +}
> +
> +bool operator==(const iterator &other) const
> +{
> +  return m_node == other.m_node;
> +}
> +
> +  private:
> +riscv_subset_t *m_node;
> +  };
> +
> +  iterator begin() { return iterator(m_head); }
> +  iterator end()   { return iterator(nullptr); }
> +
> +  class const_iterator
> +  {
> +  public:
> +explicit const_iterator(const riscv_subset_t *node) : m_node(node) {}
> +
> +const riscv_subset_t &operator*() const { return *m_node; }
> +const riscv_subset_t *operator->() const { return m_node; }
> +
> +const_iterator &operator++()
> +{
> +  if (m_node)
> +   m_node = m_node->next;
> +  return *this;
> +}
> +
> +bool operator!=(const const_iterator &other) const
> +{
> +  return m_node != other.m_node;
> +}
> +
> +  private:
> +const riscv_subset_t *m_node;
> +  };
> +
> +  const_iterator begin() const { return const_iterator(m_head); }
> +  const_iterator end() const   { return const_iterator(nullptr); }
>  };
>
>  extern const riscv_subset_list *riscv_cmdline_subset_list (void);
> --
> 2.34.1
>


Re: [PATCH] c++tools: Don't check --enable-default-pie.

2025-06-01 Thread Kito Cheng
Pushed to trunk :)

On Fri, May 30, 2025 at 2:01 PM Richard Biener
 wrote:
>
> On Thu, May 29, 2025 at 8:06 AM Kito Cheng  wrote:
> >
> > `--enable-default-pie` is an option to specify whether to enable
> > position-independent executables by default for `target`.
> >
> > However c++tools is build for `host`, so it should just follow
> > `--enable-host-pie` option to determine whether to build with
> > position-independent executables or not.
> >
> > NOTE:
> >
> > I checked PR 98324 and build with same configure option
> > (`--enable-default-pie` and lto bootstrap) on x86-64 linux to make sure
> > it won't cause same problem.
>
> Makes sense to me, thus OK if nobody objects over the weekend.
>
> Richard.
>
> > c++tools/ChangeLog:
> >
> > * configure.ac: Don't check `--enable-default-pie`.
> > * configure: Regen.
> > ---
> >  c++tools/configure| 11 ---
> >  c++tools/configure.ac |  6 --
> >  2 files changed, 17 deletions(-)
> >
> > diff --git a/c++tools/configure b/c++tools/configure
> > index 1353479beca..6df4a2f0dfa 100755
> > --- a/c++tools/configure
> > +++ b/c++tools/configure
> > @@ -700,7 +700,6 @@ enable_option_checking
> >  enable_c___tools
> >  enable_maintainer_mode
> >  enable_checking
> > -enable_default_pie
> >  enable_host_pie
> >  enable_host_bind_now
> >  with_gcc_major_version_only
> > @@ -1335,7 +1334,6 @@ Optional Features:
> >enable expensive run-time checks. With LIST, 
> > enable
> >only specific categories of checks. Categories 
> > are:
> >yes,no,all,none,release.
> > -  --enable-default-pieenable Position Independent Executable as default
> >--enable-host-pie   build host code as PIE
> >--enable-host-bind-now  link host code as BIND_NOW
> >
> > @@ -2946,15 +2944,6 @@ $as_echo "#define ENABLE_ASSERT_CHECKING 1" 
> > >>confdefs.h
> >
> >  fi
> >
> > -# Check whether --enable-default-pie was given.
> > -# Check whether --enable-default-pie was given.
> > -if test "${enable_default_pie+set}" = set; then :
> > -  enableval=$enable_default_pie; PICFLAG=-fPIE
> > -else
> > -  PICFLAG=
> > -fi
> > -
> > -
> >  # Enable --enable-host-pie
> >  # Check whether --enable-host-pie was given.
> >  if test "${enable_host_pie+set}" = set; then :
> > diff --git a/c++tools/configure.ac b/c++tools/configure.ac
> > index db34ee678e0..8c4b72a8023 100644
> > --- a/c++tools/configure.ac
> > +++ b/c++tools/configure.ac
> > @@ -97,12 +97,6 @@ if test x$ac_assert_checking != x ; then
> >  [Define if you want assertions enabled.  This is a cheap check.])
> >  fi
> >
> > -# Check whether --enable-default-pie was given.
> > -AC_ARG_ENABLE(default-pie,
> > -[AS_HELP_STRING([--enable-default-pie],
> > - [enable Position Independent Executable as default])],
> > -[PICFLAG=-fPIE], [PICFLAG=])
> > -
> >  # Enable --enable-host-pie
> >  AC_ARG_ENABLE(host-pie,
> >  [AS_HELP_STRING([--enable-host-pie],
> > --
> > 2.34.1
> >


Re: [PATCH] RISC-V: Adjust build rule for gen-riscv-ext-opt and gen-riscv-ext-texi

2025-06-01 Thread Kito Cheng
committed :)

On Mon, Jun 2, 2025 at 11:28 AM Jeff Law  wrote:
>
>
>
> On 5/28/25 11:59 PM, Kito Cheng wrote:
> > Separate the build rules to compile and link stage to make sure
> > BUILD_LINKERFLAGS and BUILD_LDFLAGS are applied correctly.
> >
> > We hit this issue when we try to build GCC with non-system-default g++,
> > and it use newer libstdc++, and then got error from using older libstdc++ 
> > from
> > system, that should not happened if we link with -static-libgcc and
> > -static-libstdc++.
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/t-riscv: Adjust build rule for gen-riscv-ext-opt
> >   and gen-riscv-ext-texi.
> Seems reasonable to me.
>
> jeff
>


Re: [PATCH, libgfortran] PR119856 Part 2 Fix error handling for missing commas in format strings

2025-06-01 Thread Harald Anlauf

Hi Jerry,

On 5/31/25 19:38, Jerry D wrote:

Hi all,

The attached patch fixes a latent issue where we were saving a parsed 
and checked format string that had a missing comma. This resulted in the 
correct error on the first use of the string, but a missed error on 
subsequent uses of the string.


New test case provided.

Regression tested on x86_64

OK for trunk and eventual backport to 15?


both is OK.

Thanks for the patch!

Harald


Regards,

Jerry


[Sidenote: this is a pure libgfortran test where AFAICT torture-testing
does not make much sense.  It is cheap, though.]




Re: [PATCH v4 03/20] Add string_slice class.

2025-06-01 Thread Jeff Law




On 5/28/25 4:10 AM, Alfie Richards wrote:

The string_slice inherits from array_slice and is used to refer to a
substring of an array that is memory managed elsewhere without modifying
the underlying array.

For example, this is useful in cases such as when needing to refer to a
substring of an attribute in the syntax tree.

Adds some minimal helper functions for string_slice,
such as a strtok alternative, equality operators, strcmp, and a function
to strip whitespace from the beginning and end of a string_slice.

gcc/c-family/ChangeLog:

* c-format.cc (local_string_slice_node): New node type.
(asm_fprintf_char_table): New entry.
(init_dynamic_diag_info): Add support for string_slice.
* c-format.h (T_STRING_SLICE): New node type.

gcc/ChangeLog:

* pretty-print.cc (format_phase_2): Add support for string_slice.
* vec.cc (string_slice::tokenize): New method.
(strcmp): New implementation for string_slice.
(string_slice::strip): New method.
(test_string_slice_initializers): New test.
(test_string_slice_tokenize): Ditto.
(test_string_slice_strcmp): Ditto.
(test_string_slice_equality): Ditto.
(test_string_slice_inequality): Ditto.
(test_string_slice_invalid): Ditto.
(test_string_slice_strip): Ditto.
(vec_cc_tests): Add new tests.
* vec.h (class string_slice): New class.
(strcmp): New implementation for stirng_slice.
Implementation itself is fine.  However each function/method should have 
a block comment indicating what the function does, including brief 
descriptions of its inputs and return value.


Pre-approved after adding the appropriate function comments.

jeff

ps.  Thanks for adding testcases for the behavior of this new code. 
It's greatly appreciated.


Re: [PATCH] RISC-V: Adjust build rule for gen-riscv-ext-opt and gen-riscv-ext-texi

2025-06-01 Thread Jeff Law




On 5/28/25 11:59 PM, Kito Cheng wrote:

Separate the build rules to compile and link stage to make sure
BUILD_LINKERFLAGS and BUILD_LDFLAGS are applied correctly.

We hit this issue when we try to build GCC with non-system-default g++,
and it use newer libstdc++, and then got error from using older libstdc++ from
system, that should not happened if we link with -static-libgcc and
-static-libstdc++.

gcc/ChangeLog:

* config/riscv/t-riscv: Adjust build rule for gen-riscv-ext-opt
and gen-riscv-ext-texi.

Seems reasonable to me.

jeff



Re: [PATCH 0/3] Redirect to specific target based on TARGET_VERSION_COMPATIBLE

2025-06-01 Thread Jeff Law




On 5/28/25 3:58 AM, Alfie Richards wrote:


On 28/05/2025 03:36, Jeff Law wrote:



On 5/27/25 2:36 AM, Alfie Richards wrote:

Hi Jeff,

On 22/05/2025 21:02, Jeff Law wrote:



On 5/22/25 9:05 AM, Alfie Richards wrote:

Hi Jeff,

I sent this patch with my implementation a while ago:
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681043.html

There hasn't been any feedback on that patch yet.

These patches are still useful and I would like to go ahead with 
them. I am in favour of using my implementation as it is a bit 
stronger, but it also requires my larger FMV series to be approved 
first.
Can you ping your larger FMV series?  I strongly suspect everyone is 
digging out from everything that queued up while the trunk was in 
bugfixing stages.



Hers the series: https://gcc.gnu.org/pipermail/gcc-patches/2025- 
April/681047.html


I'd love any feedback on that and to get it moving.

You submitted the original at a bad time, right near a release ;-)

Can you repost the series, I think it's like 20 patches and things may 
have moved since then.  It's virtually certain I don't have them in my 
inbox right now ;-)  I may have thought they were primary ARM and 
punted assuming Richard S. would take care of them.



Just reposted now!

Just FYI, there is a V5 in progress addressing Jason’s C++ diagnostics. 
Only a minor diagnostics change so far.

Yea, noticed it today.  I'll switch to that version going forward.  Thanks.



Thank you for the advice and the help!
Of course, happy to help.  Just hope it doesn't get too far into code 
I'm not familiar with :-)


jeff




Re: [PATCH 3/3] OpenMP: Handle more cases in user/condition selector

2025-06-01 Thread Sandra Loosemore

On 5/29/25 22:19, Tobias Burnus wrote:

Sandra Loosemore wrote:

Like the attached V2 patch?


LGTM.

However, I think in metadirective-condition-template.C the 
"scan-tree-dump" should now be changed to "scan-tree-dump-times", given 
that there are several tests.


Yup, good idea.  I've pushed the attached version of the patch.

-Sandra

From 67072d9ec2d48c78e8e3b40ff6c42af13ad4022c Mon Sep 17 00:00:00 2001
From: Sandra Loosemore 
Date: Mon, 2 Jun 2025 03:26:42 +
Subject: [PATCH] OpenMP: Handle more cases in user/condition selector

Tobias had noted that the C front end was not treating C23 constexprs
as constant in the user/condition selector property, which led to
missed opportunities to resolve metadirectives at parse time.
Additionally neither C nor C++ was permitting the expression to have
pointer or floating-point type -- the former being a common idiom in
other C/C++ conditional expressions.  By using the existing front-end
hooks for the implicit conversion to bool in conditional expressions,
we also get free support for using a C++ class object that has a bool
conversion operator in the user/condition selector.

gcc/c/ChangeLog
	* c-parser.cc (c_parser_omp_context_selector): Call
	convert_lvalue_to_rvalue and c_objc_common_truthvalue_conversion
	on the expression for OMP_TRAIT_PROPERTY_BOOL_EXPR.

gcc/cp/ChangeLog
	* cp-tree.h (maybe_convert_cond): Declare.
	* parser.cc (cp_parser_omp_context_selector): Call
	maybe_convert_cond and fold_build_cleanup_point_expr on the
	expression for OMP_TRAIT_PROPERTY_BOOL_EXPR.
	* pt.cc (tsubst_omp_context_selector): Likewise.
	* semantics.cc (maybe_convert_cond): Remove static declaration.

gcc/testsuite/ChangeLog
	* c-c++-common/gomp/declare-variant-2.c: Update expected output.
	* c-c++-common/gomp/metadirective-condition-constexpr.c: New.
	* c-c++-common/gomp/metadirective-condition.c: New.
	* c-c++-common/gomp/metadirective-error-recovery.c: Update expected
	output.
	* g++.dg/gomp/metadirective-condition-class.C: New.
	* g++.dg/gomp/metadirective-condition-template.C: New.
---
 gcc/c/c-parser.cc | 19 ++--
 gcc/cp/cp-tree.h  |  1 +
 gcc/cp/parser.cc  | 21 +++--
 gcc/cp/pt.cc  | 30 ++---
 gcc/cp/semantics.cc   |  3 +-
 .../c-c++-common/gomp/declare-variant-2.c |  2 +-
 .../gomp/metadirective-condition-constexpr.c  | 13 ++
 .../gomp/metadirective-condition.c| 25 +++
 .../gomp/metadirective-error-recovery.c   |  9 +++-
 .../gomp/metadirective-condition-class.C  | 43 +++
 .../gomp/metadirective-condition-template.C   | 41 ++
 11 files changed, 188 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/metadirective-condition-constexpr.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/metadirective-condition.c
 create mode 100644 gcc/testsuite/g++.dg/gomp/metadirective-condition-class.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/metadirective-condition-template.C

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 4144aa17fde..e11e6034461 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -26865,17 +26865,30 @@ c_parser_omp_context_selector (c_parser *parser, enum omp_tss_code set,
 	  break;
 	case OMP_TRAIT_PROPERTY_DEV_NUM_EXPR:
 	case OMP_TRAIT_PROPERTY_BOOL_EXPR:
-	  t = c_parser_expr_no_commas (parser, NULL).value;
+	  {
+		c_expr texpr = c_parser_expr_no_commas (parser, NULL);
+		texpr = convert_lvalue_to_rvalue (token->location, texpr,
+		  true, true);
+		t = texpr.value;
+	  }
 	  if (t == error_mark_node)
 		return error_mark_node;
 	  mark_exp_read (t);
-	  t = c_fully_fold (t, false, NULL);
-	  if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
+	  if (property_kind == OMP_TRAIT_PROPERTY_BOOL_EXPR)
+		{
+		  t = c_objc_common_truthvalue_conversion (token->location,
+			   t,
+			   boolean_type_node);
+		  if (t == error_mark_node)
+		return error_mark_node;
+		}
+	  else if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
 		{
 		  error_at (token->location,
 			"property must be integer expression");
 		  return error_mark_node;
 		}
+	  t = c_fully_fold (t, false, NULL);
 	  properties = make_trait_property (NULL_TREE, t, properties);
 	  break;
 	case OMP_TRAIT_PROPERTY_CLAUSE_LIST:
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index d9fc80b92e5..b42c67872a6 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7948,6 +7948,7 @@ extern bool perform_deferred_access_checks	(tsubst_flags_t);
 extern bool perform_or_defer_access_check	(tree, tree, tree,
 		 tsubst_flags_t,
 		 access_failure_info *afi = NULL);
+extern tree maybe_convert_cond (tree);
 
 /* RAII sentinel to ensures that deferred access checks are popped before
   a function returns.  */
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 3e39

Re: [PATCH] testsuite: Add tls_link effective target

2025-06-01 Thread Jeff Law




On 5/27/25 7:36 AM, Christophe Lyon wrote:

Some tests have 'dg-do link' but currently require 'tls' which is a
compile-only check.

In some configurations of arm-none-eabi, the 'tls' effective-target
can be successful although these tests fail to link with
undefined reference to `__aeabi_read_tp'

This patch as a new tls_link effective target which makes sure we can
build an executable.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_tls_link): New.
* g++.dg/tls/pr102496-1.C: Require tls_link.
* g++.dg/tls/pr77285-1.C: Likewise.

gcc/ChangeLog:

* doc/sourcebuild.texi (tls_link): Add documentation.

OK
jeff



Re: [PATCH v5 07/24] Change make_attribute to take string_slice.

2025-06-01 Thread Jeff Law




On 5/29/25 6:46 AM, Alfie Richards wrote:

gcc/ChangeLog:

* attribs.cc (make_attribute): Change arguments.
* attribs.h (make_attribute): Change arguments.

Approved by Richard Sandiford.

Similarly for anything else already approved that is safe to commit now.

jeff


Re: [PATCH] expmed: Prevent non-canonical subreg generation in store_bit_field [PR118873]

2025-06-01 Thread Jeff Law




On 5/30/25 12:12 AM, Richard Biener wrote:

On Thu, May 29, 2025 at 12:27 PM Konstantinos Eleftheriou
 wrote:


Hi Richard, thanks for the response.

On Mon, May 26, 2025 at 11:55 AM Richard Biener  wrote:


On Mon, 26 May 2025, Konstantinos Eleftheriou wrote:


In `store_bit_field_1`, when the value to be written in the bitfield
and/or the bitfield itself have vector modes, non-canonical subregs
are generated, like `(subreg:V4SI (reg:V8SI x) 0)`. If one them is
a scalar, this happens only when the scalar mode is different than the
vector's inner mode.

This patch tries to prevent this, using vec_set patterns when
possible.


I know almost nothing about this code, but why does the patch
fixup things after the fact rather than avoid generating the
SUBREG in the first place?


That's what we are doing, we are trying to prevent the non-canonical
subreg generation (it's not always possible). But, there are cases
where these types of subregs are passed into `store_bit_field` by its
caller, in which case we choose not to touch them.


ISTR it also (unfortunately) depends on the target which forms
are considered canonical.


But, the way that we interpret the documentation, the
canonicalizations are machine-independent. Is that not true? Or,
specifically for the subregs that operate on vectors, is there any
target that considers them canonical?


I'm also not sure you got endianess right for all possible
values of SUBREG_BYTE.  One more reason to not generate such
subreg in the first place but stick to vec_select/concat.


The only way that we would generate subregs are from the calls to
`extract_bit_field` or `store_bit_field_1` and these should handle the
endianness. Also, these subregs wouldn't operate on vectors. Do you
mean that something could go wrong with these calls?


I wanted to remark that endianess WRT memory order (which is
what store/extract_bit_field deal with) isn't always the same as
endianess in register order (which is what vec_concat and friends
operate on).  If we can avoid transitioning from one to the other
this will help avoid mistakes.

In general it would be more obvious (to me) if you fixed the callers
that create those subregs.

Now, I didn't want to pretend I'm reviewing the patch - so others please
do that (as said, I'm not familiar enough with the code to tell whether
it's actually correct).
Well, I'd tend to think your core concern is correct -- if something is 
generating a non-canonical SUBREG, then that needs to be fixed.


If I understand Konstantinos's comments correctly they're tackling one 
of those paths and instead generating a correct form.  My understanding 
is also that this patch doesn't catch all the known cases.


I don't think we anyone anymore that *really* knows the code in 
question.  We're kind of slogging along as best as we can, but it's not 
an ideal situation.


WRT fixing this earlier in the callers, I think it's the right thing to 
check.  So I think that means understanding how we get into this routine 
with VALUE being a SUBREG and FIELDMODE being a vector mode.


Jeff



Re: [PATCH v5 05/24] Update is_function_default_version to work with target_version.

2025-06-01 Thread Jeff Law




On 5/29/25 6:44 AM, Alfie Richards wrote:

Notably this respects target_version semantics where an unannotated
function can be the default version.

gcc/ChangeLog:

* attribs.cc (is_function_default_version): Add target_version logic.

Approved by Richard Sandiford.
Given this is approved, please consider pushing it to the trunk if it 
doesn't depend on subsequent patches.  No sense is having it wait in 
that case.


jeff



Re: [PATCH] RISC-V: Support CPUs in -march.

2025-06-01 Thread Robin Dapp
This rule clearly applies to directly related options like -ffoo and  
-fno-foo, but it’s less obvious for unrelated pairs like -ffoo and
-fbar especially when there is traditionally strong specifics. 



In many cases, the principle of "the most specific option wins"
governs the behavior.  


Here we have a case where -ffoo and -fbar overlap so they are not unrelated.
If the most specific option wins then -mcpu should win as it specifies more 
than either -mtune or -march individually.



* -march= and -mcpu= has become less orthogonal, requiring users to
navigate more complex rules.
* We've lost the ability to use combinations like -march=LOWER
-mcpu=HIGHER or -mcpu=HIGHER -march=LOWER to disable specific target
features or detect incompatibilities.
* We landed the commit 4a182418c89666e7594bcb0e5edc5194aa147910
hastily on May 23, 2025, with far-reaching implications that weren't
fully considered.


I can buy into the first point to some degree but IMHO it's not much more 
complicated.


The second point I'm not so sure about.  Why have we lost anything?  All we now 
can do is use a shorthand CPU name instead of a lengthy arch string and 
-march=lower -mcpu=higher is still available, as well as the other way around.


And the third point is IMHO exaggerated.  We're at the beginning of stage 1 
when more experimental things hit the trunk that can easily be adjusted.  So 
neither is anything done hastily nor do I believe the implications are that far 
reaching.


You could just have said something like: "Is this set in stone already?  In 
order to achieve what you want we could also introduce an -march=unset and 
-mtune=unset instead".


IMHO there is enough difference in aarch64's and riscv's extension model that 
it warrants doing some things not in their exact same way.  aarch64's arch 
strings don't regularly exceed a line.



I strongly disagree with implicitly setting -mtune= when specifying
-march=. Adding more semantics to -march complicates the rules users
must follow and increases the risk of error-prone patterns.
The traditional behavior established by aarch64, in contrast, is easy
to remember:

  -mcpu=X: Specify both -march= and -mtune= but can be overridden by
the two options. The supported values are generally the same as
-mtune=. The architecture name is inferred from X


x86's scheme is also easy to remember:

-march=arch or cpu including its proper tune model
-mtune=different tune model

I don't have a problem changing something or reverting if there's consensus but 
so far ITSM that it's mostly a matter of personal taste.  If people prefer 
-march=unset etc. (I don't) then let's go for that?  We need a bit more input 
for that, though.  So far we just have one vote each.


--
Regards
Robin



Re: DEFAULT_PCC_STRUCT_RETURN on NetBSD vs Linux

2025-06-01 Thread John Paul Adrian Glaubitz
On Sun, 2025-06-01 at 19:44 -0600, Jeff Law wrote:
> 
> On 6/1/25 8:21 AM, John Paul Adrian Glaubitz wrote:
> 
> > 
> > And what about the value for STACK_BOUNDARY? It seems to be 16 for many
> > Linux targets while it's 32 for NetBSD. Is there a reason why it's 16-bit-
> > aligned on Linux?
>
> Regardless of differences in the values, I don't see changing them at 
> this point as that would be an ABI change.

Both Gentoo and Debian are going to switch m68k downstream to 4 bytes alignment
as just too many packages are failing to build these days with 2 bytes 
alignment.

See the (incomplete) discussion here: https://wiki.debian.org/M68k/Alignment

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


Re: PING³ — Re: PING (and v2) – [Patch] nvptx/nvptx.opt: Update -march-map= for newer sm_xxx

2025-06-01 Thread Tobias Burnus

Tobias Burnus wrote:

PING²

On May 12, 2025, Tobias Burnus wrote:

PING.

There is actually a minor update as meanwhile CUDA 12.8 was
released that added the 'f' suffix and sm_103 and sm_121.
Still, the pattern remains the same; hence, a normal PING.

On April 25, 2025, Tobias Burnus wrote:


The idea of -march-map= is to simply and future proof select the
best -march for a certain arch, without requiring that the compiler
has support for it (like having a special multilib for it) - while
-march= sets the actually used '.target' (and the compiler might
actually generate specialized code for it).

The patch updates the sm_X for the CUDA 12.8 additions, namely for
three Blackwell GPU architectures.

Cf. 
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes
or also 
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html


OK for mainline?

Tobias

PS: CUDA 12.7 seems to be an internal release, which shows up as
PTX version but was not released to the public.
PTX 8.6/CUDA 12.7 added sm_100/sm_101 - and PTX 8.7/CUDA 12.8 added 
sm120.


PPS: sm_80 (Ampere) was added in PTX ISA 7.0 (CUDA 11.0),
sm_89 (Ada) in PTX ISA 7.8 (CUDA).
As sm_90 (Hopper) + sm_100/101/120 (Blackwell) currently/now map to
sm_89, GCC generates PTX ISA .version 7.8 for them.
Otherwise, sm_80 and sm_89 produce (for now) identical code.nvptx/nvptx.opt: Update -march-map= for newer sm_xxx

Usage of the -march-map=: "Select the closest available '-march=' value
that is not more capable."

As PTX ISA 8.6/8.7 (= unreleased CUDA 12.7 + CUDA 12.8) added the
Nvidia Blackwell GPUs SM_100, SM_101, and SM_120, it makes sense to
add them as well. Note that all three come as sm_XXX and sm_XXXa.

PTX ISA 8.8 (CUDA 12.9) added SM_103 and SM_121 and the new 'f' suffix
for all SM_1xx.

Internally, GCC currently generates the same code for >= sm_80 (Ampere);
however, as GCC's -march= also supports sm_89 (Ada), the here added
sm_1xxs (Blackwell) will map to sm_89.

[Naming note: while ptx code generated for sm_X can also run with sm_Y
if Y > X, code generated for sm_XXXa can (generally) only run on
the specific hardware; and sm_XXXf implies compatibility with only
subsequent targets in the same family.]

gcc/ChangeLog:

	* config/nvptx/nvptx.opt (march-map=): Add sm_100{,f,a},
	sm_101{,f,a}, sm_103{,a,f}, sm_120{,a,f} and sm_121{,f,a}.

 gcc/config/nvptx/nvptx.opt | 45 +
 1 file changed, 45 insertions(+)

diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index d326ca4ad26..9796839f8df 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -120,6 +120,51 @@ Target RejectNegative Alias(misa=,sm_89)
 march-map=sm_90a
 Target RejectNegative Alias(misa=,sm_89)
 
+march-map=sm_100
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_100f
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_100a
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_101
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_101f
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_101a
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_103
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_103f
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_103a
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_120
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_120f
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_120a
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_121
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_121f
+Target RejectNegative Alias(misa=,sm_89)
+
+march-map=sm_121a
+Target RejectNegative Alias(misa=,sm_89)
+
 Enum
 Name(ptx_version) Type(enum ptx_version)
 Known PTX ISA versions (for use with the -mptx= option):


[PATCH v2] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-06-01 Thread Andrew Pinski
This implements a simple copy propagation for aggregates in the similar
fashion as we already do for copy prop of zeroing.

Right now this only looks at the previous vdef statement but this allows us
to catch a lot of cases that show up in C++ code.

This used to deleted aggregate copies that are to the same location (PR57361)
But that was found to delete statements that are needed for aliasing markers 
reason.
So we need to keep them around until that is solved. Note DSE will delete the 
statements
anyways so there is no testcase added since we expose the latent bug in the 
same way.
See https://gcc.gnu.org/pipermail/gcc-patches/2025-May/685003.html for the 
testcase and
explaintation there.

Also adds a variant of pr22237.c which was found while working on this patch.

Changes since v1:
* v2: change check for vuse to use default definition.
  Remove dest/src arguments for optimize_agr_copyprop
  Changed dump messages slightly.
  Added stats
  Don't delete `a = a` until aliasing markers are added.

PR tree-optimization/14295
PR tree-optimization/108358
PR tree-optimization/114169

gcc/ChangeLog:

* tree-ssa-forwprop.cc (optimize_agr_copyprop): New function.
(pass_forwprop::execute): Call optimize_agr_copyprop for load/store 
statements.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/20031106-6.c: Un-xfail. Add scan for forwprop1.
* g++.dg/opt/pr66119.C: Disable forwprop since that does
the copy prop now.
* gcc.dg/tree-ssa/pr108358-a.c: New test.
* gcc.dg/tree-ssa/pr114169-1.c: New test.
* gcc.c-torture/execute/builtins/pr22237-1-lib.c: New test.
* gcc.c-torture/execute/builtins/pr22237-1.c: New test.
* gcc.dg/tree-ssa/pr57361.c: Disable forwprop1.
* gcc.dg/tree-ssa/pr57361-1.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/testsuite/g++.dg/opt/pr66119.C|  2 +-
 .../execute/builtins/pr22237-1-lib.c  | 27 ++
 .../execute/builtins/pr22237-1.c  | 57 
 gcc/testsuite/gcc.dg/tree-ssa/20031106-6.c|  8 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr108358-a.c| 33 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr114169-1.c| 39 +
 gcc/testsuite/gcc.dg/tree-ssa/pr57361-1.c | 10 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr57361.c   |  2 +-
 gcc/tree-ssa-forwprop.cc  | 87 +++
 9 files changed, 261 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr108358-a.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr114169-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr57361-1.c

diff --git a/gcc/testsuite/g++.dg/opt/pr66119.C 
b/gcc/testsuite/g++.dg/opt/pr66119.C
index d1b1845a258..52362e44434 100644
--- a/gcc/testsuite/g++.dg/opt/pr66119.C
+++ b/gcc/testsuite/g++.dg/opt/pr66119.C
@@ -3,7 +3,7 @@
the value of MOVE_RATIO now is.  */
 
 /* { dg-do compile  { target { { i?86-*-* x86_64-*-* } && c++11 } }  }  */
-/* { dg-options "-O3 -mavx -fdump-tree-sra -march=slm -mtune=slm 
-fno-early-inlining" } */
+/* { dg-options "-O3 -mavx -fdump-tree-sra -fno-tree-forwprop -march=slm 
-mtune=slm -fno-early-inlining" } */
 // { dg-skip-if "requires hosted libstdc++ for cstdlib malloc" { ! hostedlib } 
}
 
 #include 
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c 
b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c
new file mode 100644
index 000..44032357405
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c
@@ -0,0 +1,27 @@
+extern void abort (void);
+
+void *
+memcpy (void *dst, const void *src, __SIZE_TYPE__ n)
+{
+  const char *srcp;
+  char *dstp;
+
+  srcp = src;
+  dstp = dst;
+
+  if (dst < src)
+{
+  if (dst + n > src)
+   abort ();
+}
+  else
+{
+  if (src + n > dst)
+   abort ();
+}
+
+  while (n-- != 0)
+*dstp++ = *srcp++;
+
+  return dst;
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c 
b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c
new file mode 100644
index 000..0a12b0fc9a1
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c
@@ -0,0 +1,57 @@
+extern void abort (void);
+extern void exit (int);
+struct s { unsigned char a[256]; };
+union u { struct { struct s b; int c; } d; struct { int c; struct s b; } e; };
+static union u v;
+static union u v0;
+static struct s *p = &v.d.b;
+static struct s *q = &v.e.b;
+
+struct outers
+{
+  struct s inner;
+};
+
+static inline struct s rp (void) { return *p; }
+static inline struct s rq (void) { return *q; }
+static void pq (void)
+{
+  struct outers o = {rq () };
+  *p = o.inner;
+}
+static void qp (void)
+{
+  struct outers o = {rp () };
+  *q  = o.inner;
+}
+
+static void
+init (struct s *sp)
+{

[PATCH] Move get_call_rtx_from to final.c

2025-06-01 Thread H.J. Lu
Move get_call_rtx_from to final.c and call call_from_call_insn.

PR other/120493
* final.cc (call_from_call_insn): Change the argument type to
const rtx_call_insn *.
(get_call_rtx_from): New.
* rtl.h (is_a_helper ::test): New.
(get_call_rtx_from): Moved to the final.cc section.
* rtlanal.cc (get_call_rtx_from): Removed.

Tested on x86-64.

-- 
H.J.
From 5f68d5644870e3984343240b26918ee97e3e6f17 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 1 Jun 2025 09:29:48 +0800
Subject: [PATCH] Move get_call_rtx_from to final.c

Move get_call_rtx_from to final.c and call call_from_call_insn.

	PR other/120493
	* final.cc (call_from_call_insn): Change the argument type to
	const rtx_call_insn *.
	(get_call_rtx_from): New.
	* rtl.h (is_a_helper ::test): New.
	(get_call_rtx_from): Moved to the final.cc section.
	* rtlanal.cc (get_call_rtx_from): Removed.

Signed-off-by: H.J. Lu 
---
 gcc/final.cc   | 11 ++-
 gcc/rtl.h  | 10 +-
 gcc/rtlanal.cc | 15 ---
 3 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/gcc/final.cc b/gcc/final.cc
index 12c6eb0ac09..a4dbab72914 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -2072,7 +2072,7 @@ output_alternate_entry_point (FILE *file, rtx_insn *insn)
 
 /* Given a CALL_INSN, find and return the nested CALL. */
 static rtx
-call_from_call_insn (rtx_call_insn *insn)
+call_from_call_insn (const rtx_call_insn *insn)
 {
   rtx x;
   gcc_assert (CALL_P (insn));
@@ -2098,6 +2098,15 @@ call_from_call_insn (rtx_call_insn *insn)
   return x;
 }
 
+/* Return the CALL in X if there is one.  */
+
+rtx
+get_call_rtx_from (const rtx_insn *insn)
+{
+  const rtx_call_insn *call_insn = as_a (insn);
+  return call_from_call_insn (call_insn);
+}
+
 /* Print a comment into the asm showing FILENAME, LINENUM, and the
corresponding source line, if available.  */
 
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 7049ed775e8..8a740219c2a 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -957,6 +957,14 @@ is_a_helper ::test (rtx_insn *insn)
   return CALL_P (insn);
 }
 
+template <>
+template <>
+inline bool
+is_a_helper ::test (const rtx_insn *insn)
+{
+  return CALL_P (insn);
+}
+
 template <>
 template <>
 inline bool
@@ -3682,7 +3690,6 @@ extern bool nonzero_address_p (const_rtx);
 extern bool rtx_unstable_p (const_rtx);
 extern bool rtx_varies_p (const_rtx, bool);
 extern bool rtx_addr_varies_p (const_rtx, bool);
-extern rtx get_call_rtx_from (const rtx_insn *);
 extern tree get_call_fndecl (const rtx_insn *);
 extern HOST_WIDE_INT get_integer_term (const_rtx);
 extern rtx get_related_value (const_rtx);
@@ -4572,6 +4579,7 @@ extern void simplify_using_condition (rtx, rtx *, bitmap);
 extern void compute_alignments (void);
 extern void update_alignments (vec &);
 extern int asm_str_count (const char *templ);
+extern rtx get_call_rtx_from (const rtx_insn *);
 
 struct rtl_hooks
 {
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 900f53e252a..239d6691c4c 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -813,21 +813,6 @@ rtx_addr_varies_p (const_rtx x, bool for_alias)
   return false;
 }
 
-/* Return the CALL in X if there is one.  */
-
-rtx
-get_call_rtx_from (const rtx_insn *insn)
-{
-  rtx x = PATTERN (insn);
-  if (GET_CODE (x) == PARALLEL)
-x = XVECEXP (x, 0, 0);
-  if (GET_CODE (x) == SET)
-x = SET_SRC (x);
-  if (GET_CODE (x) == CALL && MEM_P (XEXP (x, 0)))
-return x;
-  return NULL_RTX;
-}
-
 /* Get the declaration of the function called by INSN.  */
 
 tree
-- 
2.49.0



Re: [PATCH] Fix crash with constant initializer caused by IPA

2025-06-01 Thread Eric Botcazou
> If one wants to look up something in symbol table during late optimization
> one is supposed to check that ::get does not return NULL and be conservative
> otheriwse.  So your change for PR120156 was OK with me, but I did not
> explicitly write (sorty for that). Similarly for vectorizer I would simply
> test for node to exist.

That's a bit ugly since we test decl_in_symtab_p just before, but of course 
very safe for release branches (the crash for the Ada testcase is a regression 
present in GCC 12 and later).

Tested on x86-64/Linux, OK for all branches (once they reopen)?


* tree-vect-data-refs.cc (vect_can_force_dr_alignment_p): Return
false if the variable has no symtab node.

-- 
Eric Botcazou
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 9fd1ef29650..6e35f549a8c 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -7242,7 +7242,8 @@ vect_can_force_dr_alignment_p (const_tree decl, poly_uint64 alignment)
 return false;
 
   if (decl_in_symtab_p (decl)
-  && !symtab_node::get (decl)->can_increase_alignment_p ())
+  && (!symtab_node::get (decl)
+	  || !symtab_node::get (decl)->can_increase_alignment_p ()))
 return false;
 
   if (TREE_STATIC (decl))


Re: [PATCH] aarch64:sve: Use create_tmp_reg_or_ssa_name instead of create_tmp_var in the folder

2025-06-01 Thread Richard Biener
On Sat, May 31, 2025 at 8:41 PM Andrew Pinski  wrote:
>
> Currently gimple_folder::convert_and_fold calls create_tmp_var; that means 
> while in ssa form,
> the pass which calls fold_stmt will always have to update the ssa (via 
> TODO_update_ssa or otherwise).
> This seems not very useful since we know that this will always be a ssa name, 
> using create_tmp_reg_or_ssa_name
> instead is better and don't need to depend on the ssa updater. Plus this 
> should have a small compile time performance
> and memory usage improvement too since this uses an anonymous ssa name rather 
> than creating a full decl for this.

Note the gimple_in_ssa_p check in create_tmp_reg_or_ssa_name is
superfluous these days
given the gimplifier already creates SSA names for temporaries when
not in SSA form
(no PHI nodes, no CFG).  So in principle we could switch
create_tmp_reg to always yield a SSA name.

Thus for the fixup here I'd suggest using make_ssa_name (type) instead.

Richard.

> Built and tested on aarch64-linux-gnu.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-sve-builtins.cc 
> (gimple_folder::convert_and_fold): Use create_tmp_reg_or_ssa_name
> instead of create_tmp_var for the temporary.
>
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
> b/gcc/config/aarch64/aarch64-sve-builtins.cc
> index 36519262efd..329237b086d 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -3675,7 +3675,7 @@ gimple_folder::convert_and_fold (tree type,
>tree old_ty = TREE_TYPE (lhs);
>gimple_seq stmts = NULL;
>bool convert_lhs_p = !useless_type_conversion_p (type, old_ty);
> -  tree lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs;
> +  tree lhs_conv = convert_lhs_p ? create_tmp_reg_or_ssa_name (type) : lhs;
>unsigned int num_args = gimple_call_num_args (call);
>auto_vec args_conv;
>args_conv.safe_grow (num_args);
> --
> 2.43.0
>


Re: [PATCH] Move get_call_rtx_from to final.c

2025-06-01 Thread Richard Biener
On Sun, Jun 1, 2025 at 9:14 AM H.J. Lu  wrote:
>
> Move get_call_rtx_from to final.c and call call_from_call_insn.
>
> PR other/120493
> * final.cc (call_from_call_insn): Change the argument type to
> const rtx_call_insn *.
> (get_call_rtx_from): New.
> * rtl.h (is_a_helper ::test): New.
> (get_call_rtx_from): Moved to the final.cc section.
> * rtlanal.cc (get_call_rtx_from): Removed.

OK.

Thanks,
Richard.

> Tested on x86-64.
>
> --
> H.J.


Re: [PATCH] Fix crash with constant initializer caused by IPA

2025-06-01 Thread Richard Biener
On Sun, Jun 1, 2025 at 12:54 PM Eric Botcazou  wrote:
>
> > If one wants to look up something in symbol table during late optimization
> > one is supposed to check that ::get does not return NULL and be conservative
> > otheriwse.  So your change for PR120156 was OK with me, but I did not
> > explicitly write (sorty for that). Similarly for vectorizer I would simply
> > test for node to exist.
>
> That's a bit ugly since we test decl_in_symtab_p just before, but of course
> very safe for release branches (the crash for the Ada testcase is a regression
> present in GCC 12 and later).
>
> Tested on x86-64/Linux, OK for all branches (once they reopen)?

OK.

Thanks,
Richard.

>
> * tree-vect-data-refs.cc (vect_can_force_dr_alignment_p): Return
> false if the variable has no symtab node.
>
> --
> Eric Botcazou


Re: DEFAULT_PCC_STRUCT_RETURN on NetBSD vs Linux

2025-06-01 Thread John Paul Adrian Glaubitz
On Wed, 2025-05-28 at 18:10 +0200, Andreas Schwab wrote:
> On Mai 28 2025, John Paul Adrian Glaubitz wrote:
> 
> > Shouldn't the #undef in linux.h undefine DEFAULT_PCC_STRUCT_RETURN and not
> > PCC_STATIC_STRUCT_RETURN?
> 
> No, they are separate target options.  PCC_STATIC_STRUCT_RETURN is no
> longer defined by default, so this is redundant now.

OK, guess someone can drop that line then.

> > And, secondly, shouldn't the comment in linux.h be corrected since
> > apparently linux.h and netbsd-elf.h disagree on what the SVR4 ABI
> > specifies how structs and unions are returned?
> 
> This is controlled by TARGET_RETURN_IN_MEMORY if
> DEFAULT_PCC_STRUCT_RETURN is 0.

I was talking about the comments, not the code since NetBSD and Linux
disagree on what the SVR4 ABI claims.

Btw, shouldn't EMPTY_FIELD_BOUNDARY not changed by -malign-int to 32
similar to BIGGEST_ALIGNMENT (see netbsd-elf.h)?

And what about the value for STACK_BOUNDARY? It seems to be 16 for many
Linux targets while it's 32 for NetBSD. Is there a reason why it's 16-bit-
aligned on Linux?

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


Re: [PATCH] RISC-V: Support CPUs in -march.

2025-06-01 Thread Fangrui Song
On Sun, Jun 1, 2025 at 4:06 AM Robin Dapp  wrote:
> > I stumped across this change from
> > https://github.com/riscv-non-isa/riscv-toolchain-conventions/issues/88
> > and I want to express my strong disagreement with this change.
> >
> >
> > Perhaps I'm accustomed to Arm's behavior, but I believe using -march= to
> > target a specific CPU isn't ideal.
> >
> > * -march=X: (execution domain) Generate code that can use instructions
> > available in the architecture X
> > * -mtune=X: (optimization domain) Optimize for the microarchitecture X, but
> > does not change the ABI or make assumptions about available instructions
> > * -mcpu=X: Specify both -march= and -mtune= but can be overridden by the two
> > options. The supported values are generally the same as -mtune=. The
> > architecture name is inferred from X
> >
> > For execution domain settings, -march=X overrides -mcpu=X regardless of 
> > their
> > positions.
> >
> > In cases like `-march=LOWER -mcpu=HIGHER` or `-mcpu=HIGHER -march=LOWER`, 
> > the
> > -march= option can disable certain target features.
> >
> > I strongly disagree with Clang adopting this behavior.
> > I'm not convinced by the GCC patch explanation.
> >
> >> Suppose we have a Makefile that specifies -march=rv64gc by default.
> >
> > In the project specifies a lower feature set, then the compiler should
> > respect it or the user should fix the project build system.
>
> There are two reasons for the change:
>
>  - Adhere to the usual "last option wins" convention and at the same time

This rule clearly applies to directly related options like -ffoo and
-fno-foo, but it’s less obvious for unrelated pairs like -ffoo and
-fbar,
especially when there is traditionally strong specifics.

In many cases, the principle of "the most specific option wins"
governs the behavior.

>  - Make it easy to specify an overriding -march string representing a CPU
>without spelling out the >200 char expanded arch string, a very error-prone
>procedure.
>
> I agree that fixing the "project build system" might be another way of having
> only a single -march in a compiler command line but it doesn't match my
> experience.  Over the years I have seen a significant number of projects where
> multiple -marchs where present.  With the "last option wins" convention this
> works and a user always has a "last ditch" way of setting their preferred
> architecture.
> A typical use case is a default march that is overridden by CFLAGS.  There's
> nothing to be fixed here because it's standard procedure.  In that situation 
> we
> would always need the full -march string and that can't be what we want?

When multiple -march= options are specified, the last one wins.
However, this does not apply when combining -march= and -mcpu=, as
-march= is intended to restrict the feature set.
If you'd like to propose that -mcpu= should override -march=, one
approach could be introducing a special value for -march=, such as
-march=unset.

> What's your main concern?  The less clear separation between arch and CPU?
> I don't think the change takes anything away from -march but just adds a
> shortcut "set the arch level to what this CPU supports".  That's fully in line
> with what other targets are doing,  also on the clang side.

* -march= and -mcpu= has become less orthogonal, requiring users to
navigate more complex rules.
* We've lost the ability to use combinations like -march=LOWER
-mcpu=HIGHER or -mcpu=HIGHER -march=LOWER to disable specific target
features or detect incompatibilities.
* We landed the commit 4a182418c89666e7594bcb0e5edc5194aa147910
hastily on May 23, 2025, with far-reaching implications that weren't
fully considered.

> Ideally (IMHO) we'd also imply an -mtune when specifying -march by default
> rather than using -mtune=rocket or so.

I strongly disagree with implicitly setting -mtune= when specifying
-march=. Adding more semantics to -march complicates the rules users
must follow and increases the risk of error-prone patterns.
The traditional behavior established by aarch64, in contrast, is easy
to remember:

  -mcpu=X: Specify both -march= and -mtune= but can be overridden by
the two options. The supported values are generally the same as
-mtune=. The architecture name is inferred from X

> If we had an -mcpu=... that overrode -march and -mtune we'd already have a 
> good
> way of achieving what I intended.  In that case -march wouldn't need to be
> changed but to my understanding this has been unwanted so far.  For me a "last
> option wins" -mcpu that overrides march and mtune would also work and would
> keep the clear separation between -march and -mtune.


[PATCH] gcc: middle-end opt for trigonometric pi-based functions builtins

2025-06-01 Thread Yuao Ma
Hello world,

This patch partially handled PR118592.

This patch builds upon r16-710-g591d3d02664c7b and r16-711-g89935d56f768b4. It
introduces middle-end optimizations, such as constant folding, for our
trigonometric pi-based function built-ins.

For MPFR versions older than 4.2.0, we've included our own folding functions.
I've also added tests to ensure everything works as expected.

Best regards,
Yuao


0001-gcc-middle-end-opt-for-trigonometric-pi-based-functi.patch
Description: 0001-gcc-middle-end-opt-for-trigonometric-pi-based-functi.patch


Re: [PATCH] forwprop: Manually rename the virtual mem op for complex and vector loads prop

2025-06-01 Thread Richard Biener



> Am 01.06.2025 um 04:16 schrieb Andrew Pinski :
> 
> There are two places which forwprop replaces an original load to a few 
> different loads.
> Both can set the vuse manually instead of relying on update_ssa.
> One is doing a complex load followed by REAL/IMAG_PART only
> And the other is very similar but for vector loads followed by BIT_FIELD_REF.
> 
> Since this was the last place that needed to handle updating the ssa form,
> Remove the TODO_update_ssa also from the pass.

Ok

Richard 

> gcc/ChangeLog:
> 
>* tree-ssa-forwprop.cc (optimize_vector_load): Set the vuse manually
>on the new load statements. Also remove forward declaration since
>the definition is before the first use.
>(pass_forwprop::execute): Likewise for complex loads.
>(pass_data_forwprop): Remove TODO_update_ssa.
> 
> Signed-off-by: Andrew Pinski 
> ---
> gcc/tree-ssa-forwprop.cc | 8 ++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> index 81ea7d4195e..75901ecfbb0 100644
> --- a/gcc/tree-ssa-forwprop.cc
> +++ b/gcc/tree-ssa-forwprop.cc
> @@ -205,7 +205,6 @@ struct _vec_perm_simplify_seq
> typedef struct _vec_perm_simplify_seq *vec_perm_simplify_seq;
> 
> static bool forward_propagate_addr_expr (tree, tree, bool);
> -static void optimize_vector_load (gimple_stmt_iterator *);
> 
> /* Set to true if we delete dead edges during the optimization.  */
> static bool cfg_changed;
> @@ -3387,6 +3386,7 @@ optimize_vector_load (gimple_stmt_iterator *gsi)
>   gimple *stmt = gsi_stmt (*gsi);
>   tree lhs = gimple_assign_lhs (stmt);
>   tree rhs = gimple_assign_rhs1 (stmt);
> +  tree vuse = gimple_vuse (stmt);
> 
>   /* Gather BIT_FIELD_REFs to rewrite, looking through
>  VEC_UNPACK_{LO,HI}_EXPR.  */
> @@ -3495,6 +3495,7 @@ optimize_vector_load (gimple_stmt_iterator *gsi)
>  gimple *new_stmt = gimple_build_assign (tem, new_rhs);
>  location_t loc = gimple_location (use_stmt);
>  gimple_set_location (new_stmt, loc);
> +  gimple_set_vuse (new_stmt, vuse);
>  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>  /* Perform scalar promotion.  */
>  new_stmt = gimple_build_assign (gimple_assign_lhs (use_stmt),
> @@ -3514,6 +3515,7 @@ optimize_vector_load (gimple_stmt_iterator *gsi)
>  new_rhs);
>  location_t loc = gimple_location (use_stmt);
>  gimple_set_location (new_stmt, loc);
> +  gimple_set_vuse (new_stmt, vuse);
>  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
>}
>   gimple_stmt_iterator gsi2 = gsi_for_stmt (use_stmt);
> @@ -4167,7 +4169,7 @@ const pass_data pass_data_forwprop =
>   0, /* properties_provided */
>   0, /* properties_destroyed */
>   0, /* todo_flags_start */
> -  TODO_update_ssa, /* todo_flags_finish */
> +  0, /* todo_flags_finish */
> };
> 
> class pass_forwprop : public gimple_opt_pass
> @@ -4404,6 +4406,7 @@ pass_forwprop::execute (function *fun)
> component-wise loads.  */
>  use_operand_p use_p;
>  imm_use_iterator iter;
> +  tree vuse = gimple_vuse (stmt);
>  bool rewrite = true;
>  FOR_EACH_IMM_USE_FAST (use_p, iter, lhs)
>{
> @@ -4443,6 +4446,7 @@ pass_forwprop::execute (function *fun)
> 
>  location_t loc = gimple_location (use_stmt);
>  gimple_set_location (new_stmt, loc);
> +  gimple_set_vuse (new_stmt, vuse);
>  gimple_stmt_iterator gsi2 = gsi_for_stmt (use_stmt);
>  unlink_stmt_vdef (use_stmt);
>  gsi_remove (&gsi2, true);
> --
> 2.43.0
> 


[PATCH v2] c++, coroutines: CWG2563 promise lifetime extension [PR115908].

2025-06-01 Thread Iain Sandoe
Updated. I realised we no longer need to refer to
initial_await_resume_called in the ramp at all, v2 removes the setting
of the variable from there and puts it into the start of the actor.

Tested on x86_64-darwin and powerp64le-linux.
Confirmed that the sanitizer test in PR 118074 is fixed but we already
have a suitable testcase in PR 115908 which has now been moved to the
'torture' sub-directory to get wider code-gen coverage.
OK for trunk?
thanks
Iain

--- 8< ---

This implements the final piece of the revised CWG2563 wording;
"It exits the scope of promise only if the coroutine completed
 without suspending."

The change removes the need to track whether we have reached the ramp
return.  We can now assume that the coroutine frame is valid when
constructing the ramp return value.  The ramp function no longer needs
to check initial_await_resume_called, and the references to that can
be made local to the resumer.

The cleanups for promise, arg copies and frame are now adjusted to
depend only on whether a suspend has occurred.

PR c++/115908
PR c++/118074

gcc/cp/ChangeLog:

* coroutines.cc: New GTY coro_ramp_cleanup_id.
(coro_init_identifiers): Initialize coro_ramp_cleanup_id.
(build_actor_fn): Set coro_ramp_cleanup_id false when we
exit via a suspend or continuation return.  Do not clean
up if we pass the final await expression without suspending.
(cp_coroutine_transform::wrap_original_function_body):
Declare _Coro_ramp_cleanup.
(cp_coroutine_transform::build_ramp_function): Amend
cleanups to use coro_ramp_cleanup and set that to be true
on init.  Remove the use of _Coro_befoe_return.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr115908.C: Move to...
* g++.dg/coroutines/torture/pr115908.C: ...here.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc  | 121 +-
 .../coroutines/{ => torture}/pr115908.C   |   9 +-
 2 files changed, 60 insertions(+), 70 deletions(-)
 rename gcc/testsuite/g++.dg/coroutines/{ => torture}/pr115908.C (88%)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index a2b6d53805a..42f719ddde1 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -348,6 +348,7 @@ static GTY(()) tree coro_resume_index_id;
 static GTY(()) tree coro_self_handle_id;
 static GTY(()) tree coro_actor_continue_id;
 static GTY(()) tree coro_frame_i_a_r_c_id;
+static GTY(()) tree coro_ramp_cleanup_id;
 
 /* Create the identifiers used by the coroutines library interfaces and
the implementation frame state.  */
@@ -385,6 +386,7 @@ coro_init_identifiers ()
   coro_resume_index_id = get_identifier ("_Coro_resume_index");
   coro_self_handle_id = get_identifier ("_Coro_self_handle");
   coro_actor_continue_id = get_identifier ("_Coro_actor_continue");
+  coro_ramp_cleanup_id = get_identifier ("_Coro_ramp_should_cleanup");
 }
 
 /* Trees we only need to set up once.  */
@@ -2568,6 +2570,17 @@ build_actor_fn (location_t loc, tree coro_frame_type, 
tree actor, tree fnbody,
   /* Now we start building the rewritten function body.  */
   add_stmt (build_stmt (loc, LABEL_EXPR, actor_begin_label));
 
+  tree i_a_r_c = NULL_TREE;
+  if (flag_exceptions)
+{
+  i_a_r_c
+   = coro_build_frame_access_expr (actor_frame, coro_frame_i_a_r_c_id,
+   false, tf_warning_or_error);
+  tree m = cp_build_modify_expr (loc, i_a_r_c, NOP_EXPR,
+   boolean_false_node, tf_warning_or_error);
+  finish_expr_stmt (m);
+}
+
   /* actor's coroutine 'self handle'.  */
   tree ash = coro_build_frame_access_expr (actor_frame, coro_self_handle_id,
   false, tf_warning_or_error);
@@ -2597,6 +2610,16 @@ build_actor_fn (location_t loc, tree coro_frame_type, 
tree actor, tree fnbody,
   /* Add in our function body with the co_returns rewritten to final form.  */
   add_stmt (fnbody);
 
+  /* If we reach here without suspending, then the ramp should cleanup.  */
+  tree ramp_should_cleanup
+= coro_build_frame_access_expr (actor_frame, coro_ramp_cleanup_id,
+   false, tf_warning_or_error);
+  tree ramp_cu_if = begin_if_stmt ();
+  finish_if_stmt_cond (ramp_should_cleanup, ramp_cu_if);
+  finish_return_stmt (NULL_TREE);
+  finish_then_clause (ramp_cu_if);
+  finish_if_stmt (ramp_cu_if);
+
   /* Now do the tail of the function; first cleanups.  */
   r = build_stmt (loc, LABEL_EXPR, del_promise_label);
   add_stmt (r);
@@ -2638,6 +2661,9 @@ build_actor_fn (location_t loc, tree coro_frame_type, 
tree actor, tree fnbody,
   /* This is the suspend return point.  */
   add_stmt (build_stmt (loc, LABEL_EXPR, ret_label));
 
+  r = cp_build_modify_expr (loc, ramp_should_cleanup, NOP_EXPR,
+   boolean_false_node, tf_warning_or_error);
+  finish_expr_stmt (r);
   finish_return_stmt (NULL_TREE);
 
   /* Th

[PATCH v2] aarch64:sve: Use make_ssa_name instead of create_tmp_var in the folder

2025-06-01 Thread Andrew Pinski
Currently gimple_folder::convert_and_fold calls create_tmp_var; that
means while in ssa form, the pass which calls fold_stmt will always
have to update the ssa (via TODO_update_ssa or otherwise).  This seems
not very useful since we know that this will always be a ssa name, using
make_ssa_name instead is better and don't need to depend on the ssa updater.
Plus this should have a small compile time performance and memory usage
improvement too since this uses an anonymous ssa name rather than creating a
full decl for this.

Changes since v1:
* Use make_ssa_name instead of create_tmp_reg_or_ssa_name, anonymous ssa
  names are allowed early on in gimple too.

Built and tested on aarch64-linux-gnu.

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins.cc: Include value-range.h and 
tree-ssanames.h
(gimple_folder::convert_and_fold): Use make_ssa_name
instead of create_tmp_var for the temporary. Add comment about callback 
argument.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64-sve-builtins.cc | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 36519262efd..8540aef47b2 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -47,6 +47,8 @@
 #include "langhooks.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "value-range.h"
+#include "tree-ssanames.h"
 #include "aarch64-sve-builtins.h"
 #include "aarch64-sve-builtins-base.h"
 #include "aarch64-sve-builtins-sve2.h"
@@ -3664,7 +3666,8 @@ gimple_folder::fold_pfalse ()
 /* Convert the lhs and all non-boolean vector-type operands to TYPE.
Pass the converted variables to the callback FP, and finally convert the
result back to the original type. Add the necessary conversion statements.
-   Return the new call.  */
+   Return the new call. Note the tree argument to the callback FP, can only be 
set
+   once; it will always be a SSA_NAME.  */
 gimple *
 gimple_folder::convert_and_fold (tree type,
 gimple *(*fp) (gimple_folder &,
@@ -3675,7 +3678,7 @@ gimple_folder::convert_and_fold (tree type,
   tree old_ty = TREE_TYPE (lhs);
   gimple_seq stmts = NULL;
   bool convert_lhs_p = !useless_type_conversion_p (type, old_ty);
-  tree lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs;
+  tree lhs_conv = convert_lhs_p ? make_ssa_name (type) : lhs;
   unsigned int num_args = gimple_call_num_args (call);
   auto_vec args_conv;
   args_conv.safe_grow (num_args);
-- 
2.43.0



[committed] cobol: Wrap the call to fprintf in a libgcobol routine. [PR119524]

2025-06-01 Thread Robert Dubner
>From 501f95ffae0371e2335f89951a02a3a32f0cd53d Mon Sep 17 00:00:00 2001
From: Robert Dubner mailto:rdub...@symas.com
Date: Sun, 1 Jun 2025 12:32:37 -0400
Subject: [PATCH] [PR119524]

---
 gcc/cobol/gengen.cc| 16 +---
 libgcobol/libgcobol.cc | 14 ++
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/gcc/cobol/gengen.cc b/gcc/cobol/gengen.cc
index 91f67d534f6..a5f143cf234 100644
--- a/gcc/cobol/gengen.cc
+++ b/gcc/cobol/gengen.cc
@@ -2152,18 +2152,6 @@ gg_printf(const char *format_string, ...)
   int nargs = 0;
   tree args[ARG_LIMIT];
 
-  // Because this routine is intended for debugging, we are sending the
-  // text to STDERR
-
-  // Because we don't actually use stderr ourselves, we just pick it up
as a
-  // VOID_P and pass it along to fprintf()
-  tree t_stderr = gg_declare_variable(VOID_P, "stderr",
-  NULL_TREE,
-  vs_external_reference);
-
-  gg_push_context();
-
-  args[nargs++] = t_stderr;
   args[nargs++] = build_string_literal(strlen(format_string)+1,
format_string);
 
   va_list ap;
@@ -2197,7 +2185,7 @@ gg_printf(const char *format_string, ...)
   static tree function = NULL_TREE;
   if( !function )
 {
-function = gg_get_function_address(INT, "fprintf");
+function = gg_get_function_address(INT, "__gg__fprintf_stderr");
 }
 
   tree stmt = build_call_array_loc (location_from_lineno(),
@@ -2206,8 +2194,6 @@ gg_printf(const char *format_string, ...)
 nargs,
 args);
   gg_append_statement(stmt);
-
-  gg_pop_context();
   }
 
 tree
diff --git a/libgcobol/libgcobol.cc b/libgcobol/libgcobol.cc
index 6721dd20740..9fd160b7083 100644
--- a/libgcobol/libgcobol.cc
+++ b/libgcobol/libgcobol.cc
@@ -49,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 #if __has_include()
 # include  // for program_invocation_short_name
 #endif
@@ -13159,3 +13160,16 @@ __gg__set_env_value(cblc_field_t   *value,
   // And now, anticlimactically, set the variable:
   setenv(trimmed_env, trimmed_val, 1);
   }
+
+extern "C"
+void
+__gg__fprintf_stderr(const char *format_string, ...)
+  {
+  /*  This routine allows the compiler to send stuff to stderr in a way
+  that is straightforward to use..  */
+  va_list ap;
+  va_start(ap, format_string);
+  vfprintf(stderr, format_string, ap);
+  va_end(ap);
+  }
+
-- 
2.34.1



Re: [PATCH] aarch64:sve: Use create_tmp_reg_or_ssa_name instead of create_tmp_var in the folder

2025-06-01 Thread Andrew Pinski
On Sun, Jun 1, 2025 at 3:54 AM Richard Biener
 wrote:
>
> On Sat, May 31, 2025 at 8:41 PM Andrew Pinski  
> wrote:
> >
> > Currently gimple_folder::convert_and_fold calls create_tmp_var; that means 
> > while in ssa form,
> > the pass which calls fold_stmt will always have to update the ssa (via 
> > TODO_update_ssa or otherwise).
> > This seems not very useful since we know that this will always be a ssa 
> > name, using create_tmp_reg_or_ssa_name
> > instead is better and don't need to depend on the ssa updater. Plus this 
> > should have a small compile time performance
> > and memory usage improvement too since this uses an anonymous ssa name 
> > rather than creating a full decl for this.
>
> Note the gimple_in_ssa_p check in create_tmp_reg_or_ssa_name is
> superfluous these days
> given the gimplifier already creates SSA names for temporaries when
> not in SSA form
> (no PHI nodes, no CFG).  So in principle we could switch
> create_tmp_reg to always yield a SSA name.
>
> Thus for the fixup here I'd suggest using make_ssa_name (type) instead.

yes that worked, I submitted
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685263.html for
that.
I am waiting on this being approved before pushing the CCP patch that
removes update_ssa since the CCP exposes this issue.

Thanks,
Andrew

>
> Richard.
>
> > Built and tested on aarch64-linux-gnu.
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-sve-builtins.cc 
> > (gimple_folder::convert_and_fold): Use create_tmp_reg_or_ssa_name
> > instead of create_tmp_var for the temporary.
> >
> > Signed-off-by: Andrew Pinski 
> > ---
> >  gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
> > b/gcc/config/aarch64/aarch64-sve-builtins.cc
> > index 36519262efd..329237b086d 100644
> > --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> > +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> > @@ -3675,7 +3675,7 @@ gimple_folder::convert_and_fold (tree type,
> >tree old_ty = TREE_TYPE (lhs);
> >gimple_seq stmts = NULL;
> >bool convert_lhs_p = !useless_type_conversion_p (type, old_ty);
> > -  tree lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs;
> > +  tree lhs_conv = convert_lhs_p ? create_tmp_reg_or_ssa_name (type) : lhs;
> >unsigned int num_args = gimple_call_num_args (call);
> >auto_vec args_conv;
> >args_conv.safe_grow (num_args);
> > --
> > 2.43.0
> >


[PATCH] Add newlib to gitignore

2025-06-01 Thread Arijit Kumar Das
newlib sources are not a part of GCC so should be ignored, if present.

Signed-off-by: Arijit Kumar Das 
---
 .gitignore | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/.gitignore b/.gitignore
index 7150fc3b29c..5ae6a5a08d6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -74,3 +74,6 @@ stamp-*
 
 # ADDITIONS from GCCRS front-end
 libgrust/*/target/
+
+# newlib sources are not a part of GCC so should be ignored, if present as a 
symlink
+newlib
-- 
2.39.5