Re: [PATCH] RISC-V: Support CPUs in -march.
I stumped across this change from https://github.com/riscv-non-isa/riscv-toolchain-conventions/issues/88 and I want to express my strong disagreement with this change. Perhaps I'm accustomed to Arm's behavior, but I believe using -march= to target a specific CPU isn't ideal. * -march=X: (execution domain) Generate code that can use instructions available in the architecture X * -mtune=X: (optimization domain) Optimize for the microarchitecture X, but does not change the ABI or make assumptions about available instructions * -mcpu=X: Specify both -march= and -mtune= but can be overridden by the two options. The supported values are generally the same as -mtune=. The architecture name is inferred from X For execution domain settings, -march=X overrides -mcpu=X regardless of their positions. In cases like `-march=LOWER -mcpu=HIGHER` or `-mcpu=HIGHER -march=LOWER`, the -march= option can disable certain target features. I strongly disagree with Clang adopting this behavior. I'm not convinced by the GCC patch explanation. Suppose we have a Makefile that specifies -march=rv64gc by default. In the project specifies a lower feature set, then the compiler should respect it or the user should fix the project build system. There are two reasons for the change: - Adhere to the usual "last option wins" convention and at the same time - Make it easy to specify an overriding -march string representing a CPU without spelling out the >200 char expanded arch string, a very error-prone procedure. I agree that fixing the "project build system" might be another way of having only a single -march in a compiler command line but it doesn't match my experience. Over the years I have seen a significant number of projects where multiple -marchs where present. With the "last option wins" convention this works and a user always has a "last ditch" way of setting their preferred architecture. A typical use case is a default march that is overridden by CFLAGS. There's nothing to be fixed here because it's standard procedure. In that situation we would always need the full -march string and that can't be what we want? What's your main concern? The less clear separation between arch and CPU? I don't think the change takes anything away from -march but just adds a shortcut "set the arch level to what this CPU supports". That's fully in line with what other targets are doing, also on the clang side. Ideally (IMHO) we'd also imply an -mtune when specifying -march by default rather than using -mtune=rocket or so. If we had an -mcpu=... that overrode -march and -mtune we'd already have a good way of achieving what I intended. In that case -march wouldn't need to be changed but to my understanding this has been unwanted so far. For me a "last option wins" -mcpu that overrides march and mtune would also work and would keep the clear separation between -march and -mtune. -- Regards Robin
[PING * 3][PATCH v3] Add new warning Wmissing-designated-initializers [PR39589]
Ping https://gcc.gnu.org/pipermail/gcc-patches/2025-January/672568.html
Re: DEFAULT_PCC_STRUCT_RETURN on NetBSD vs Linux
On 6/1/25 8:21 AM, John Paul Adrian Glaubitz wrote: And what about the value for STACK_BOUNDARY? It seems to be 16 for many Linux targets while it's 32 for NetBSD. Is there a reason why it's 16-bit- aligned on Linux? Regardless of differences in the values, I don't see changing them at this point as that would be an ABI change. Jeff
Re: [PATCH v4 04/20] Remove unnecessary `record` argument from maybe_version_functions.
On 5/28/25 4:10 AM, Alfie Richards wrote: Previously, the `record` argument in maybe_version_function allowed the call to cgraph_node::record_function_versions to be skipped. However, this was only skipped when both decls were already marked as versioned, in which case we trigger the early exit in record_function_versions instead. Therefore, the argument is unnecessary. gcc/cp/ChangeLog: * class.cc (add_method): Remove argument. * cp-tree.h (maybe_version_functions): Ditto. * decl.cc (decls_match): Ditto. (maybe_version_functions): Ditto. OK jeff
[PATCH] c: Move checking assertions from recursion when forming composite types to avoid ICE.
After looking a bit more at this, it turns out that it is still possible to trigger the assertion that checks that the composite type is compatible to the original types when using self-referential types where it then sees composite types which are still under construction. This patch now moves the checking assertion out of the recursion, which seems better anyhow. I could not do the check unconditionally for all types. because we sometimes call composite_type for with mismatching qualifiers. There is also still a remaining issue with setting C_VARIABLY_MODIFIED_TYPE_P correctly for composite types for such self-referential types. Bootstrapped and regression tested for x86_64. Martin c: Move checking assertions from recursion when forming composite types to avoid ICE. The checking assertion in composite_type_internal for structures and unions may fail if there are self-referential types. To avoid this, we move them out of the recursion. This should also be more efficient and covers other types. We have to ignore some cases where we form composite types despite qualifiers not matching. gcc/c/ChangeLog: * c-typeck.cc (composite_type_internal,composite_type): Move checking assertions. gcc/testsuite/ChangeLog: * gcc.dg/gnu23-tag-composite-6.c: Update. diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index 2f243cab8da..417db8795c5 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -846,12 +846,7 @@ composite_type_internal (tree t1, tree t2, struct composite_cache* cache) n = finish_struct (input_location, n, fields, attributes, NULL, &expr); - n = qualify_type (n, t1); - - gcc_checking_assert (!TYPE_NAME (n) || comptypes (n, t1)); - gcc_checking_assert (!TYPE_NAME (n) || comptypes (n, t2)); - - return n; + return qualify_type (n, t1); } /* FALLTHRU */ case ENUMERAL_TYPE: @@ -1004,7 +999,15 @@ tree composite_type (tree t1, tree t2) { struct composite_cache cache = { }; - return composite_type_internal (t1, t2, &cache); + tree n = composite_type_internal (t1, t2, &cache); + /* For function and arrays there are some exceptions where + qualifiers do not match. */ + if (FUNCTION_TYPE != TREE_CODE (n) && ARRAY_TYPE != TREE_CODE (n)) +{ + gcc_checking_assert (comptypes (n, t1)); + gcc_checking_assert (comptypes (n, t2)); +} + return n; } /* Return the type of a conditional expression between pointers to diff --git a/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c b/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c index 2411b04d388..076c066f9c8 100644 --- a/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c +++ b/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c @@ -1,11 +1,31 @@ /* { dg-do compile } */ /* { dg-options "-std=gnu23" } */ +#define NEST(...) typeof(({ (__VA_ARGS__){ }; })) + int f() { typedef struct foo bar; -struct foo { typeof(({ (struct foo { bar * x; }){ }; })) * x; } *q; -typeof(q->x) p; -1 ? p : q; +struct foo { NEST(struct foo { bar *x; }) *x; } *q; +typeof(q->x) p0; +typeof(q->x) p1; +1 ? p0 : q; +1 ? p1 : q; +1 ? p0 : p1; +} + +int g() +{ +typedef struct fo2 bar; +struct fo2 { NEST(struct fo2 { NEST(struct fo2 { bar *x; }) * x; }) *x; } *q; +typeof(q->x) p0; +typeof(q->x->x) p1; +typeof(q->x->x->x) p2; +1 ? p0 : q; +1 ? p1 : q; +1 ? p2 : q; +1 ? p0 : p1; +1 ? p2 : p1; +1 ? p0 : p2; }
Re: [PATCH] testsuite: RISC-V: Fix the typo in param-autovec-mode.c
On 5/28/25 7:59 PM, Liao Shihua wrote: This patch fixes the typo in the test case `param-autovec-mode.c` in the RISC-V autovec testsuite. The option `autovec-mode` is changed to `riscv-autovec-mode` to match the expected parameter name. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/param-autovec-mode.c: Change `autovec-mode` to `riscv-autovec-mode` in dg-options. Wrapped the overly long ChangeLog line and push this to the trunk for you. Thanks, jeff
Re: [PATCH] Add newlib to gitignore
On 6/1/25 1:40 PM, Arijit Kumar Das wrote: newlib sources are not a part of GCC so should be ignored, if present. Signed-off-by: Arijit Kumar Das --- .gitignore | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.gitignore b/.gitignore index 7150fc3b29c..5ae6a5a08d6 100644 --- a/.gitignore +++ b/.gitignore @@ -74,3 +74,6 @@ stamp-* # ADDITIONS from GCCRS front-end libgrust/*/target/ + +# newlib sources are not a part of GCC so should be ignored, if present as a symlink +newlib There's all kind of directories that would fit this description, say for a single tree build, but which aren't included in .gitignore (binutils, gas, ld, opcodes, etc etc). Why would newlib be special here? I'm not inclined to include this without a strong reason. jeff
Re: [PATCH] or1k: Fix struct return test
On 5/31/25 12:03 AM, Stafford Horne wrote: In or1k structs are returned from functions using the memory address passed in r3. In the current version of GCC the struct stores changed from r11 (the return value) to r3 the incoming memory address. Both of are valid. Adjust the test to match what GCC is producing now. gcc/testsuite/ChangeLog: * gcc.target/or1k/return-2.c: Fix test. Given you're the maintainer of this port it seems like you can self-approve and commit this and the long jump offset patch. Jeff
Re: [PATCH v4] c++: Unwrap type traits defined in terms of builtins within diagnostics [PR117294]
On Sat, 31 May 2025, Nathaniel Shead wrote: > On Fri, May 30, 2025 at 11:10:08AM -0400, Patrick Palka wrote: > > On Fri, 30 May 2025, Patrick Palka wrote: > > > > > On Fri, 30 May 2025, Nathaniel Shead wrote: > > > > > > > On Wed, May 28, 2025 at 02:14:06PM -0400, Patrick Palka wrote: > > > > > On Tue, 27 May 2025, Nathaniel Shead wrote: > > > > > > > > > > > On Wed, Nov 27, 2024 at 11:45:40AM -0500, Patrick Palka wrote: > > > > > > > On Fri, 8 Nov 2024, Nathaniel Shead wrote: > > > > > > > > > > > > > > > Does this approach seem reasonable? I'm pretty sure that the > > > > > > > > way I've > > > > > > > > handled the templating here is unideal but I'm not sure what a > > > > > > > > neat way > > > > > > > > to do what I'm trying to do here would be; any comments are > > > > > > > > welcome. > > > > > > > > > > > > > > Clever approach, I like it! > > > > > > > > > > > > > > > > > > > > > > > -- >8 -- > > > > > > > > > > > > > > > > Currently, concept failures of standard type traits just report > > > > > > > > 'expression X evaluates to false'. However, many type > > > > > > > > traits are > > > > > > > > actually defined in terms of compiler builtins; we can do > > > > > > > > better here. > > > > > > > > For instance, 'is_constructible_v' could go on to explain why > > > > > > > > the type > > > > > > > > is not constructible, or 'is_invocable_v' could list potential > > > > > > > > candidates. > > > > > > > > > > > > > > That'd be great improvement. > > > > > > > > > > > > > > > > > > > > > > > As a first step to supporting that we need to be able to map the > > > > > > > > standard type traits to the builtins that they use. Rather > > > > > > > > than adding > > > > > > > > another list that would need to be kept up-to-date whenever a > > > > > > > > builtin is > > > > > > > > added, this patch instead tries to detect any variable template > > > > > > > > defined > > > > > > > > directly in terms of a TRAIT_EXPR. > > > > > > > > > > > > > > > > To avoid false positives, we ignore any variable templates that > > > > > > > > have any > > > > > > > > specialisations (partial or explicit), even if we wouldn't have > > > > > > > > chosen > > > > > > > > that specialisation anyway. This shouldn't affect any of the > > > > > > > > standard > > > > > > > > library type traits that I could see. > > > > > > > > > > > > > > You should be able to tsubst the TEMPLATE_ID_EXPR directly and > > > > > > > look at > > > > > > > its TI_PARTIAL_INFO in order to determine which (if any) partial > > > > > > > specialization was selected. And if an explicit specialization > > > > > > > was > > > > > > > selected the resulting VAR_DECL will have > > > > > > > DECL_TEMPLATE_SPECIALIZATION > > > > > > > set. > > > > > > > > > > > > > > > ...[snip]... > > > > > > > > > > > > > > If we substituted the TEMPLATE_ID_EXPR as a whole we could use the > > > > > > > DECL_TI_ARGS of that IIUC? > > > > > > > > > > > > > > > > > > > Thanks for your comments, they were very helpful. Here's a totally > > > > > > new > > > > > > approach which I'm much happier with. I've also removed the > > > > > > "disable in > > > > > > case any specialisation exists" logic, as on further reflection I > > > > > > don't > > > > > > imagine this to be the kind of issue I thought it might have been. > > > > > > > > > > > > With this patch, > > > > > > > > > > > > template > > > > > > constexpr bool is_default_constructible_v = __is_constructible(T); > > > > > > > > > > > > template > > > > > > concept default_constructible = is_default_constructible_v; > > > > > > > > > > > > static_assert(default_constructible); > > > > > > > > > > > > now emits the following error: > > > > > > > > > > > > test.cpp:6:15: error: static assertion failed > > > > > > 6 | static_assert(default_constructible); > > > > > > | ^~~ > > > > > > test.cpp:6:15: note: constraints not satisfied > > > > > > test.cpp:4:9: required by the constraints of ‘template > > > > > > concept default_constructible’ > > > > > > test.cpp:4:33: note: ‘void’ is not default constructible > > > > > > 4 | concept default_constructible = > > > > > > is_default_constructible_v; > > > > > > | > > > > > > ^ > > > > > > > > > > > > There's still a lot of improvements to be made in this area, I > > > > > > think: > > > > > > > > > > > > - I haven't yet looked into updating the specific diagnostics > > > > > > emitted by > > > > > > the traits; I'd like to try to avoid too much code duplication > > > > > > with > > > > > > the implementation in cp/semantics.cc. (I also don't think the > > > > > > manual > > > > > > indentation at the start of the message is particularly helpful?) > > > > > > > > > > For is_xible / is_convertible etc, perhaps they could use a 'complain' > > > > > parameter that they propa
Re: [PATCH] RISC-V: Implement full-featured iterator for riscv_subset_list [NFC]
CI passed, pushed to trunk :) On Thu, May 29, 2025 at 1:59 PM Kito Cheng wrote: > > This commit implements a full-featured iterator for the > riscv_subset_list, that it able to use range-based-for-loop. > > That could simplfy the code in the future, and make it more readable, > also more compatible with standard C++ containers. > > gcc/ChangeLog: > > * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Use > range-based-for-loop. > * config/riscv/riscv-subset.h (riscv_subset_list::iterator): > New. > (riscv_subset_list::const_iterator): New. > --- > gcc/config/riscv/riscv-c.cc | 18 -- > gcc/config/riscv/riscv-subset.h | 62 +++-- > 2 files changed, 66 insertions(+), 14 deletions(-) > > diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc > index 1ff1968f6d8..d2c0af35955 100644 > --- a/gcc/config/riscv/riscv-c.cc > +++ b/gcc/config/riscv/riscv-c.cc > @@ -239,26 +239,22 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile) >size_t max_ext_len = 0; > >/* Figure out the max length of extension name for reserving buffer. */ > - for (const riscv_subset_t *subset = subset_list->begin (); > - subset != subset_list->end (); > - subset = subset->next) > -max_ext_len = MAX (max_ext_len, subset->name.length ()); > + for (auto &subset : *subset_list) > +max_ext_len = MAX (max_ext_len, subset.name.length ()); > >char *buf = (char *)alloca (max_ext_len + 10 /* For __riscv_ and '\0'. > */); > > - for (const riscv_subset_t *subset = subset_list->begin (); > - subset != subset_list->end (); > - subset = subset->next) > + for (auto &subset : *subset_list) > { > - int version_value = riscv_ext_version_value (subset->major_version, > - subset->minor_version); > + int version_value = riscv_ext_version_value (subset.major_version, > + subset.minor_version); >/* Special rule for zicsr and zifencei, it's used for ISA spec 2.2 or > earlier. */ > - if ((subset->name == "zicsr" || subset->name == "zifencei") > + if ((subset.name == "zicsr" || subset.name == "zifencei") > && version_value == 0) > version_value = riscv_ext_version_value (2, 0); > > - sprintf (buf, "__riscv_%s", subset->name.c_str ()); > + sprintf (buf, "__riscv_%s", subset.name.c_str ()); >builtin_define_with_int_value (buf, version_value); > } > } > diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h > index c5d9fab4de9..a35537d7754 100644 > --- a/gcc/config/riscv/riscv-subset.h > +++ b/gcc/config/riscv/riscv-subset.h > @@ -109,9 +109,6 @@ public: >static riscv_subset_list *parse (const char *, location_t); >const char *parse_single_ext (const char *, bool exact_single_p = true); > > - const riscv_subset_t *begin () const {return m_head;}; > - const riscv_subset_t *end () const {return NULL;}; > - >int match_score (riscv_subset_list *) const; > >void set_loc (location_t); > @@ -119,6 +116,65 @@ public: >void set_allow_adding_dup (bool v) { m_allow_adding_dup = v; } > >void finalize (); > + > + class iterator > + { > + public: > +explicit iterator(riscv_subset_t *node) : m_node(node) {} > + > +riscv_subset_t &operator*() const { return *m_node; } > +riscv_subset_t *operator->() const { return m_node; } > + > +iterator &operator++() > +{ > + if (m_node) > + m_node = m_node->next; > + return *this; > +} > + > +bool operator!=(const iterator &other) const > +{ > + return m_node != other.m_node; > +} > + > +bool operator==(const iterator &other) const > +{ > + return m_node == other.m_node; > +} > + > + private: > +riscv_subset_t *m_node; > + }; > + > + iterator begin() { return iterator(m_head); } > + iterator end() { return iterator(nullptr); } > + > + class const_iterator > + { > + public: > +explicit const_iterator(const riscv_subset_t *node) : m_node(node) {} > + > +const riscv_subset_t &operator*() const { return *m_node; } > +const riscv_subset_t *operator->() const { return m_node; } > + > +const_iterator &operator++() > +{ > + if (m_node) > + m_node = m_node->next; > + return *this; > +} > + > +bool operator!=(const const_iterator &other) const > +{ > + return m_node != other.m_node; > +} > + > + private: > +const riscv_subset_t *m_node; > + }; > + > + const_iterator begin() const { return const_iterator(m_head); } > + const_iterator end() const { return const_iterator(nullptr); } > }; > > extern const riscv_subset_list *riscv_cmdline_subset_list (void); > -- > 2.34.1 >
Re: [PATCH] c++tools: Don't check --enable-default-pie.
Pushed to trunk :) On Fri, May 30, 2025 at 2:01 PM Richard Biener wrote: > > On Thu, May 29, 2025 at 8:06 AM Kito Cheng wrote: > > > > `--enable-default-pie` is an option to specify whether to enable > > position-independent executables by default for `target`. > > > > However c++tools is build for `host`, so it should just follow > > `--enable-host-pie` option to determine whether to build with > > position-independent executables or not. > > > > NOTE: > > > > I checked PR 98324 and build with same configure option > > (`--enable-default-pie` and lto bootstrap) on x86-64 linux to make sure > > it won't cause same problem. > > Makes sense to me, thus OK if nobody objects over the weekend. > > Richard. > > > c++tools/ChangeLog: > > > > * configure.ac: Don't check `--enable-default-pie`. > > * configure: Regen. > > --- > > c++tools/configure| 11 --- > > c++tools/configure.ac | 6 -- > > 2 files changed, 17 deletions(-) > > > > diff --git a/c++tools/configure b/c++tools/configure > > index 1353479beca..6df4a2f0dfa 100755 > > --- a/c++tools/configure > > +++ b/c++tools/configure > > @@ -700,7 +700,6 @@ enable_option_checking > > enable_c___tools > > enable_maintainer_mode > > enable_checking > > -enable_default_pie > > enable_host_pie > > enable_host_bind_now > > with_gcc_major_version_only > > @@ -1335,7 +1334,6 @@ Optional Features: > >enable expensive run-time checks. With LIST, > > enable > >only specific categories of checks. Categories > > are: > >yes,no,all,none,release. > > - --enable-default-pieenable Position Independent Executable as default > >--enable-host-pie build host code as PIE > >--enable-host-bind-now link host code as BIND_NOW > > > > @@ -2946,15 +2944,6 @@ $as_echo "#define ENABLE_ASSERT_CHECKING 1" > > >>confdefs.h > > > > fi > > > > -# Check whether --enable-default-pie was given. > > -# Check whether --enable-default-pie was given. > > -if test "${enable_default_pie+set}" = set; then : > > - enableval=$enable_default_pie; PICFLAG=-fPIE > > -else > > - PICFLAG= > > -fi > > - > > - > > # Enable --enable-host-pie > > # Check whether --enable-host-pie was given. > > if test "${enable_host_pie+set}" = set; then : > > diff --git a/c++tools/configure.ac b/c++tools/configure.ac > > index db34ee678e0..8c4b72a8023 100644 > > --- a/c++tools/configure.ac > > +++ b/c++tools/configure.ac > > @@ -97,12 +97,6 @@ if test x$ac_assert_checking != x ; then > > [Define if you want assertions enabled. This is a cheap check.]) > > fi > > > > -# Check whether --enable-default-pie was given. > > -AC_ARG_ENABLE(default-pie, > > -[AS_HELP_STRING([--enable-default-pie], > > - [enable Position Independent Executable as default])], > > -[PICFLAG=-fPIE], [PICFLAG=]) > > - > > # Enable --enable-host-pie > > AC_ARG_ENABLE(host-pie, > > [AS_HELP_STRING([--enable-host-pie], > > -- > > 2.34.1 > >
Re: [PATCH] RISC-V: Adjust build rule for gen-riscv-ext-opt and gen-riscv-ext-texi
committed :) On Mon, Jun 2, 2025 at 11:28 AM Jeff Law wrote: > > > > On 5/28/25 11:59 PM, Kito Cheng wrote: > > Separate the build rules to compile and link stage to make sure > > BUILD_LINKERFLAGS and BUILD_LDFLAGS are applied correctly. > > > > We hit this issue when we try to build GCC with non-system-default g++, > > and it use newer libstdc++, and then got error from using older libstdc++ > > from > > system, that should not happened if we link with -static-libgcc and > > -static-libstdc++. > > > > gcc/ChangeLog: > > > > * config/riscv/t-riscv: Adjust build rule for gen-riscv-ext-opt > > and gen-riscv-ext-texi. > Seems reasonable to me. > > jeff >
Re: [PATCH, libgfortran] PR119856 Part 2 Fix error handling for missing commas in format strings
Hi Jerry, On 5/31/25 19:38, Jerry D wrote: Hi all, The attached patch fixes a latent issue where we were saving a parsed and checked format string that had a missing comma. This resulted in the correct error on the first use of the string, but a missed error on subsequent uses of the string. New test case provided. Regression tested on x86_64 OK for trunk and eventual backport to 15? both is OK. Thanks for the patch! Harald Regards, Jerry [Sidenote: this is a pure libgfortran test where AFAICT torture-testing does not make much sense. It is cheap, though.]
Re: [PATCH v4 03/20] Add string_slice class.
On 5/28/25 4:10 AM, Alfie Richards wrote: The string_slice inherits from array_slice and is used to refer to a substring of an array that is memory managed elsewhere without modifying the underlying array. For example, this is useful in cases such as when needing to refer to a substring of an attribute in the syntax tree. Adds some minimal helper functions for string_slice, such as a strtok alternative, equality operators, strcmp, and a function to strip whitespace from the beginning and end of a string_slice. gcc/c-family/ChangeLog: * c-format.cc (local_string_slice_node): New node type. (asm_fprintf_char_table): New entry. (init_dynamic_diag_info): Add support for string_slice. * c-format.h (T_STRING_SLICE): New node type. gcc/ChangeLog: * pretty-print.cc (format_phase_2): Add support for string_slice. * vec.cc (string_slice::tokenize): New method. (strcmp): New implementation for string_slice. (string_slice::strip): New method. (test_string_slice_initializers): New test. (test_string_slice_tokenize): Ditto. (test_string_slice_strcmp): Ditto. (test_string_slice_equality): Ditto. (test_string_slice_inequality): Ditto. (test_string_slice_invalid): Ditto. (test_string_slice_strip): Ditto. (vec_cc_tests): Add new tests. * vec.h (class string_slice): New class. (strcmp): New implementation for stirng_slice. Implementation itself is fine. However each function/method should have a block comment indicating what the function does, including brief descriptions of its inputs and return value. Pre-approved after adding the appropriate function comments. jeff ps. Thanks for adding testcases for the behavior of this new code. It's greatly appreciated.
Re: [PATCH] RISC-V: Adjust build rule for gen-riscv-ext-opt and gen-riscv-ext-texi
On 5/28/25 11:59 PM, Kito Cheng wrote: Separate the build rules to compile and link stage to make sure BUILD_LINKERFLAGS and BUILD_LDFLAGS are applied correctly. We hit this issue when we try to build GCC with non-system-default g++, and it use newer libstdc++, and then got error from using older libstdc++ from system, that should not happened if we link with -static-libgcc and -static-libstdc++. gcc/ChangeLog: * config/riscv/t-riscv: Adjust build rule for gen-riscv-ext-opt and gen-riscv-ext-texi. Seems reasonable to me. jeff
Re: [PATCH 0/3] Redirect to specific target based on TARGET_VERSION_COMPATIBLE
On 5/28/25 3:58 AM, Alfie Richards wrote: On 28/05/2025 03:36, Jeff Law wrote: On 5/27/25 2:36 AM, Alfie Richards wrote: Hi Jeff, On 22/05/2025 21:02, Jeff Law wrote: On 5/22/25 9:05 AM, Alfie Richards wrote: Hi Jeff, I sent this patch with my implementation a while ago: https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681043.html There hasn't been any feedback on that patch yet. These patches are still useful and I would like to go ahead with them. I am in favour of using my implementation as it is a bit stronger, but it also requires my larger FMV series to be approved first. Can you ping your larger FMV series? I strongly suspect everyone is digging out from everything that queued up while the trunk was in bugfixing stages. Hers the series: https://gcc.gnu.org/pipermail/gcc-patches/2025- April/681047.html I'd love any feedback on that and to get it moving. You submitted the original at a bad time, right near a release ;-) Can you repost the series, I think it's like 20 patches and things may have moved since then. It's virtually certain I don't have them in my inbox right now ;-) I may have thought they were primary ARM and punted assuming Richard S. would take care of them. Just reposted now! Just FYI, there is a V5 in progress addressing Jason’s C++ diagnostics. Only a minor diagnostics change so far. Yea, noticed it today. I'll switch to that version going forward. Thanks. Thank you for the advice and the help! Of course, happy to help. Just hope it doesn't get too far into code I'm not familiar with :-) jeff
Re: [PATCH 3/3] OpenMP: Handle more cases in user/condition selector
On 5/29/25 22:19, Tobias Burnus wrote: Sandra Loosemore wrote: Like the attached V2 patch? LGTM. However, I think in metadirective-condition-template.C the "scan-tree-dump" should now be changed to "scan-tree-dump-times", given that there are several tests. Yup, good idea. I've pushed the attached version of the patch. -Sandra From 67072d9ec2d48c78e8e3b40ff6c42af13ad4022c Mon Sep 17 00:00:00 2001 From: Sandra Loosemore Date: Mon, 2 Jun 2025 03:26:42 + Subject: [PATCH] OpenMP: Handle more cases in user/condition selector Tobias had noted that the C front end was not treating C23 constexprs as constant in the user/condition selector property, which led to missed opportunities to resolve metadirectives at parse time. Additionally neither C nor C++ was permitting the expression to have pointer or floating-point type -- the former being a common idiom in other C/C++ conditional expressions. By using the existing front-end hooks for the implicit conversion to bool in conditional expressions, we also get free support for using a C++ class object that has a bool conversion operator in the user/condition selector. gcc/c/ChangeLog * c-parser.cc (c_parser_omp_context_selector): Call convert_lvalue_to_rvalue and c_objc_common_truthvalue_conversion on the expression for OMP_TRAIT_PROPERTY_BOOL_EXPR. gcc/cp/ChangeLog * cp-tree.h (maybe_convert_cond): Declare. * parser.cc (cp_parser_omp_context_selector): Call maybe_convert_cond and fold_build_cleanup_point_expr on the expression for OMP_TRAIT_PROPERTY_BOOL_EXPR. * pt.cc (tsubst_omp_context_selector): Likewise. * semantics.cc (maybe_convert_cond): Remove static declaration. gcc/testsuite/ChangeLog * c-c++-common/gomp/declare-variant-2.c: Update expected output. * c-c++-common/gomp/metadirective-condition-constexpr.c: New. * c-c++-common/gomp/metadirective-condition.c: New. * c-c++-common/gomp/metadirective-error-recovery.c: Update expected output. * g++.dg/gomp/metadirective-condition-class.C: New. * g++.dg/gomp/metadirective-condition-template.C: New. --- gcc/c/c-parser.cc | 19 ++-- gcc/cp/cp-tree.h | 1 + gcc/cp/parser.cc | 21 +++-- gcc/cp/pt.cc | 30 ++--- gcc/cp/semantics.cc | 3 +- .../c-c++-common/gomp/declare-variant-2.c | 2 +- .../gomp/metadirective-condition-constexpr.c | 13 ++ .../gomp/metadirective-condition.c| 25 +++ .../gomp/metadirective-error-recovery.c | 9 +++- .../gomp/metadirective-condition-class.C | 43 +++ .../gomp/metadirective-condition-template.C | 41 ++ 11 files changed, 188 insertions(+), 19 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/metadirective-condition-constexpr.c create mode 100644 gcc/testsuite/c-c++-common/gomp/metadirective-condition.c create mode 100644 gcc/testsuite/g++.dg/gomp/metadirective-condition-class.C create mode 100644 gcc/testsuite/g++.dg/gomp/metadirective-condition-template.C diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 4144aa17fde..e11e6034461 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -26865,17 +26865,30 @@ c_parser_omp_context_selector (c_parser *parser, enum omp_tss_code set, break; case OMP_TRAIT_PROPERTY_DEV_NUM_EXPR: case OMP_TRAIT_PROPERTY_BOOL_EXPR: - t = c_parser_expr_no_commas (parser, NULL).value; + { + c_expr texpr = c_parser_expr_no_commas (parser, NULL); + texpr = convert_lvalue_to_rvalue (token->location, texpr, + true, true); + t = texpr.value; + } if (t == error_mark_node) return error_mark_node; mark_exp_read (t); - t = c_fully_fold (t, false, NULL); - if (!INTEGRAL_TYPE_P (TREE_TYPE (t))) + if (property_kind == OMP_TRAIT_PROPERTY_BOOL_EXPR) + { + t = c_objc_common_truthvalue_conversion (token->location, + t, + boolean_type_node); + if (t == error_mark_node) + return error_mark_node; + } + else if (!INTEGRAL_TYPE_P (TREE_TYPE (t))) { error_at (token->location, "property must be integer expression"); return error_mark_node; } + t = c_fully_fold (t, false, NULL); properties = make_trait_property (NULL_TREE, t, properties); break; case OMP_TRAIT_PROPERTY_CLAUSE_LIST: diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index d9fc80b92e5..b42c67872a6 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -7948,6 +7948,7 @@ extern bool perform_deferred_access_checks (tsubst_flags_t); extern bool perform_or_defer_access_check (tree, tree, tree, tsubst_flags_t, access_failure_info *afi = NULL); +extern tree maybe_convert_cond (tree); /* RAII sentinel to ensures that deferred access checks are popped before a function returns. */ diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 3e39
Re: [PATCH] testsuite: Add tls_link effective target
On 5/27/25 7:36 AM, Christophe Lyon wrote: Some tests have 'dg-do link' but currently require 'tls' which is a compile-only check. In some configurations of arm-none-eabi, the 'tls' effective-target can be successful although these tests fail to link with undefined reference to `__aeabi_read_tp' This patch as a new tls_link effective target which makes sure we can build an executable. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_tls_link): New. * g++.dg/tls/pr102496-1.C: Require tls_link. * g++.dg/tls/pr77285-1.C: Likewise. gcc/ChangeLog: * doc/sourcebuild.texi (tls_link): Add documentation. OK jeff
Re: [PATCH v5 07/24] Change make_attribute to take string_slice.
On 5/29/25 6:46 AM, Alfie Richards wrote: gcc/ChangeLog: * attribs.cc (make_attribute): Change arguments. * attribs.h (make_attribute): Change arguments. Approved by Richard Sandiford. Similarly for anything else already approved that is safe to commit now. jeff
Re: [PATCH] expmed: Prevent non-canonical subreg generation in store_bit_field [PR118873]
On 5/30/25 12:12 AM, Richard Biener wrote: On Thu, May 29, 2025 at 12:27 PM Konstantinos Eleftheriou wrote: Hi Richard, thanks for the response. On Mon, May 26, 2025 at 11:55 AM Richard Biener wrote: On Mon, 26 May 2025, Konstantinos Eleftheriou wrote: In `store_bit_field_1`, when the value to be written in the bitfield and/or the bitfield itself have vector modes, non-canonical subregs are generated, like `(subreg:V4SI (reg:V8SI x) 0)`. If one them is a scalar, this happens only when the scalar mode is different than the vector's inner mode. This patch tries to prevent this, using vec_set patterns when possible. I know almost nothing about this code, but why does the patch fixup things after the fact rather than avoid generating the SUBREG in the first place? That's what we are doing, we are trying to prevent the non-canonical subreg generation (it's not always possible). But, there are cases where these types of subregs are passed into `store_bit_field` by its caller, in which case we choose not to touch them. ISTR it also (unfortunately) depends on the target which forms are considered canonical. But, the way that we interpret the documentation, the canonicalizations are machine-independent. Is that not true? Or, specifically for the subregs that operate on vectors, is there any target that considers them canonical? I'm also not sure you got endianess right for all possible values of SUBREG_BYTE. One more reason to not generate such subreg in the first place but stick to vec_select/concat. The only way that we would generate subregs are from the calls to `extract_bit_field` or `store_bit_field_1` and these should handle the endianness. Also, these subregs wouldn't operate on vectors. Do you mean that something could go wrong with these calls? I wanted to remark that endianess WRT memory order (which is what store/extract_bit_field deal with) isn't always the same as endianess in register order (which is what vec_concat and friends operate on). If we can avoid transitioning from one to the other this will help avoid mistakes. In general it would be more obvious (to me) if you fixed the callers that create those subregs. Now, I didn't want to pretend I'm reviewing the patch - so others please do that (as said, I'm not familiar enough with the code to tell whether it's actually correct). Well, I'd tend to think your core concern is correct -- if something is generating a non-canonical SUBREG, then that needs to be fixed. If I understand Konstantinos's comments correctly they're tackling one of those paths and instead generating a correct form. My understanding is also that this patch doesn't catch all the known cases. I don't think we anyone anymore that *really* knows the code in question. We're kind of slogging along as best as we can, but it's not an ideal situation. WRT fixing this earlier in the callers, I think it's the right thing to check. So I think that means understanding how we get into this routine with VALUE being a SUBREG and FIELDMODE being a vector mode. Jeff
Re: [PATCH v5 05/24] Update is_function_default_version to work with target_version.
On 5/29/25 6:44 AM, Alfie Richards wrote: Notably this respects target_version semantics where an unannotated function can be the default version. gcc/ChangeLog: * attribs.cc (is_function_default_version): Add target_version logic. Approved by Richard Sandiford. Given this is approved, please consider pushing it to the trunk if it doesn't depend on subsequent patches. No sense is having it wait in that case. jeff
Re: [PATCH] RISC-V: Support CPUs in -march.
This rule clearly applies to directly related options like -ffoo and -fno-foo, but it’s less obvious for unrelated pairs like -ffoo and -fbar especially when there is traditionally strong specifics. In many cases, the principle of "the most specific option wins" governs the behavior. Here we have a case where -ffoo and -fbar overlap so they are not unrelated. If the most specific option wins then -mcpu should win as it specifies more than either -mtune or -march individually. * -march= and -mcpu= has become less orthogonal, requiring users to navigate more complex rules. * We've lost the ability to use combinations like -march=LOWER -mcpu=HIGHER or -mcpu=HIGHER -march=LOWER to disable specific target features or detect incompatibilities. * We landed the commit 4a182418c89666e7594bcb0e5edc5194aa147910 hastily on May 23, 2025, with far-reaching implications that weren't fully considered. I can buy into the first point to some degree but IMHO it's not much more complicated. The second point I'm not so sure about. Why have we lost anything? All we now can do is use a shorthand CPU name instead of a lengthy arch string and -march=lower -mcpu=higher is still available, as well as the other way around. And the third point is IMHO exaggerated. We're at the beginning of stage 1 when more experimental things hit the trunk that can easily be adjusted. So neither is anything done hastily nor do I believe the implications are that far reaching. You could just have said something like: "Is this set in stone already? In order to achieve what you want we could also introduce an -march=unset and -mtune=unset instead". IMHO there is enough difference in aarch64's and riscv's extension model that it warrants doing some things not in their exact same way. aarch64's arch strings don't regularly exceed a line. I strongly disagree with implicitly setting -mtune= when specifying -march=. Adding more semantics to -march complicates the rules users must follow and increases the risk of error-prone patterns. The traditional behavior established by aarch64, in contrast, is easy to remember: -mcpu=X: Specify both -march= and -mtune= but can be overridden by the two options. The supported values are generally the same as -mtune=. The architecture name is inferred from X x86's scheme is also easy to remember: -march=arch or cpu including its proper tune model -mtune=different tune model I don't have a problem changing something or reverting if there's consensus but so far ITSM that it's mostly a matter of personal taste. If people prefer -march=unset etc. (I don't) then let's go for that? We need a bit more input for that, though. So far we just have one vote each. -- Regards Robin
Re: DEFAULT_PCC_STRUCT_RETURN on NetBSD vs Linux
On Sun, 2025-06-01 at 19:44 -0600, Jeff Law wrote: > > On 6/1/25 8:21 AM, John Paul Adrian Glaubitz wrote: > > > > > And what about the value for STACK_BOUNDARY? It seems to be 16 for many > > Linux targets while it's 32 for NetBSD. Is there a reason why it's 16-bit- > > aligned on Linux? > > Regardless of differences in the values, I don't see changing them at > this point as that would be an ABI change. Both Gentoo and Debian are going to switch m68k downstream to 4 bytes alignment as just too many packages are failing to build these days with 2 bytes alignment. See the (incomplete) discussion here: https://wiki.debian.org/M68k/Alignment Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer `. `' Physicist `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: PING³ — Re: PING (and v2) – [Patch] nvptx/nvptx.opt: Update -march-map= for newer sm_xxx
Tobias Burnus wrote: PING² On May 12, 2025, Tobias Burnus wrote: PING. There is actually a minor update as meanwhile CUDA 12.8 was released that added the 'f' suffix and sm_103 and sm_121. Still, the pattern remains the same; hence, a normal PING. On April 25, 2025, Tobias Burnus wrote: The idea of -march-map= is to simply and future proof select the best -march for a certain arch, without requiring that the compiler has support for it (like having a special multilib for it) - while -march= sets the actually used '.target' (and the compiler might actually generate specialized code for it). The patch updates the sm_X for the CUDA 12.8 additions, namely for three Blackwell GPU architectures. Cf. https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#release-notes or also https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html OK for mainline? Tobias PS: CUDA 12.7 seems to be an internal release, which shows up as PTX version but was not released to the public. PTX 8.6/CUDA 12.7 added sm_100/sm_101 - and PTX 8.7/CUDA 12.8 added sm120. PPS: sm_80 (Ampere) was added in PTX ISA 7.0 (CUDA 11.0), sm_89 (Ada) in PTX ISA 7.8 (CUDA). As sm_90 (Hopper) + sm_100/101/120 (Blackwell) currently/now map to sm_89, GCC generates PTX ISA .version 7.8 for them. Otherwise, sm_80 and sm_89 produce (for now) identical code.nvptx/nvptx.opt: Update -march-map= for newer sm_xxx Usage of the -march-map=: "Select the closest available '-march=' value that is not more capable." As PTX ISA 8.6/8.7 (= unreleased CUDA 12.7 + CUDA 12.8) added the Nvidia Blackwell GPUs SM_100, SM_101, and SM_120, it makes sense to add them as well. Note that all three come as sm_XXX and sm_XXXa. PTX ISA 8.8 (CUDA 12.9) added SM_103 and SM_121 and the new 'f' suffix for all SM_1xx. Internally, GCC currently generates the same code for >= sm_80 (Ampere); however, as GCC's -march= also supports sm_89 (Ada), the here added sm_1xxs (Blackwell) will map to sm_89. [Naming note: while ptx code generated for sm_X can also run with sm_Y if Y > X, code generated for sm_XXXa can (generally) only run on the specific hardware; and sm_XXXf implies compatibility with only subsequent targets in the same family.] gcc/ChangeLog: * config/nvptx/nvptx.opt (march-map=): Add sm_100{,f,a}, sm_101{,f,a}, sm_103{,a,f}, sm_120{,a,f} and sm_121{,f,a}. gcc/config/nvptx/nvptx.opt | 45 + 1 file changed, 45 insertions(+) diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt index d326ca4ad26..9796839f8df 100644 --- a/gcc/config/nvptx/nvptx.opt +++ b/gcc/config/nvptx/nvptx.opt @@ -120,6 +120,51 @@ Target RejectNegative Alias(misa=,sm_89) march-map=sm_90a Target RejectNegative Alias(misa=,sm_89) +march-map=sm_100 +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_100f +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_100a +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_101 +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_101f +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_101a +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_103 +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_103f +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_103a +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_120 +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_120f +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_120a +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_121 +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_121f +Target RejectNegative Alias(misa=,sm_89) + +march-map=sm_121a +Target RejectNegative Alias(misa=,sm_89) + Enum Name(ptx_version) Type(enum ptx_version) Known PTX ISA versions (for use with the -mptx= option):
[PATCH v2] gimple-fold: Implement simple copy propagation for aggregates [PR14295]
This implements a simple copy propagation for aggregates in the similar fashion as we already do for copy prop of zeroing. Right now this only looks at the previous vdef statement but this allows us to catch a lot of cases that show up in C++ code. This used to deleted aggregate copies that are to the same location (PR57361) But that was found to delete statements that are needed for aliasing markers reason. So we need to keep them around until that is solved. Note DSE will delete the statements anyways so there is no testcase added since we expose the latent bug in the same way. See https://gcc.gnu.org/pipermail/gcc-patches/2025-May/685003.html for the testcase and explaintation there. Also adds a variant of pr22237.c which was found while working on this patch. Changes since v1: * v2: change check for vuse to use default definition. Remove dest/src arguments for optimize_agr_copyprop Changed dump messages slightly. Added stats Don't delete `a = a` until aliasing markers are added. PR tree-optimization/14295 PR tree-optimization/108358 PR tree-optimization/114169 gcc/ChangeLog: * tree-ssa-forwprop.cc (optimize_agr_copyprop): New function. (pass_forwprop::execute): Call optimize_agr_copyprop for load/store statements. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/20031106-6.c: Un-xfail. Add scan for forwprop1. * g++.dg/opt/pr66119.C: Disable forwprop since that does the copy prop now. * gcc.dg/tree-ssa/pr108358-a.c: New test. * gcc.dg/tree-ssa/pr114169-1.c: New test. * gcc.c-torture/execute/builtins/pr22237-1-lib.c: New test. * gcc.c-torture/execute/builtins/pr22237-1.c: New test. * gcc.dg/tree-ssa/pr57361.c: Disable forwprop1. * gcc.dg/tree-ssa/pr57361-1.c: New test. Signed-off-by: Andrew Pinski --- gcc/testsuite/g++.dg/opt/pr66119.C| 2 +- .../execute/builtins/pr22237-1-lib.c | 27 ++ .../execute/builtins/pr22237-1.c | 57 gcc/testsuite/gcc.dg/tree-ssa/20031106-6.c| 8 +- gcc/testsuite/gcc.dg/tree-ssa/pr108358-a.c| 33 +++ gcc/testsuite/gcc.dg/tree-ssa/pr114169-1.c| 39 + gcc/testsuite/gcc.dg/tree-ssa/pr57361-1.c | 10 +++ gcc/testsuite/gcc.dg/tree-ssa/pr57361.c | 2 +- gcc/tree-ssa-forwprop.cc | 87 +++ 9 files changed, 261 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c create mode 100644 gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr108358-a.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr114169-1.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr57361-1.c diff --git a/gcc/testsuite/g++.dg/opt/pr66119.C b/gcc/testsuite/g++.dg/opt/pr66119.C index d1b1845a258..52362e44434 100644 --- a/gcc/testsuite/g++.dg/opt/pr66119.C +++ b/gcc/testsuite/g++.dg/opt/pr66119.C @@ -3,7 +3,7 @@ the value of MOVE_RATIO now is. */ /* { dg-do compile { target { { i?86-*-* x86_64-*-* } && c++11 } } } */ -/* { dg-options "-O3 -mavx -fdump-tree-sra -march=slm -mtune=slm -fno-early-inlining" } */ +/* { dg-options "-O3 -mavx -fdump-tree-sra -fno-tree-forwprop -march=slm -mtune=slm -fno-early-inlining" } */ // { dg-skip-if "requires hosted libstdc++ for cstdlib malloc" { ! hostedlib } } #include diff --git a/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c new file mode 100644 index 000..44032357405 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1-lib.c @@ -0,0 +1,27 @@ +extern void abort (void); + +void * +memcpy (void *dst, const void *src, __SIZE_TYPE__ n) +{ + const char *srcp; + char *dstp; + + srcp = src; + dstp = dst; + + if (dst < src) +{ + if (dst + n > src) + abort (); +} + else +{ + if (src + n > dst) + abort (); +} + + while (n-- != 0) +*dstp++ = *srcp++; + + return dst; +} diff --git a/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c new file mode 100644 index 000..0a12b0fc9a1 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/builtins/pr22237-1.c @@ -0,0 +1,57 @@ +extern void abort (void); +extern void exit (int); +struct s { unsigned char a[256]; }; +union u { struct { struct s b; int c; } d; struct { int c; struct s b; } e; }; +static union u v; +static union u v0; +static struct s *p = &v.d.b; +static struct s *q = &v.e.b; + +struct outers +{ + struct s inner; +}; + +static inline struct s rp (void) { return *p; } +static inline struct s rq (void) { return *q; } +static void pq (void) +{ + struct outers o = {rq () }; + *p = o.inner; +} +static void qp (void) +{ + struct outers o = {rp () }; + *q = o.inner; +} + +static void +init (struct s *sp) +{
[PATCH] Move get_call_rtx_from to final.c
Move get_call_rtx_from to final.c and call call_from_call_insn. PR other/120493 * final.cc (call_from_call_insn): Change the argument type to const rtx_call_insn *. (get_call_rtx_from): New. * rtl.h (is_a_helper ::test): New. (get_call_rtx_from): Moved to the final.cc section. * rtlanal.cc (get_call_rtx_from): Removed. Tested on x86-64. -- H.J. From 5f68d5644870e3984343240b26918ee97e3e6f17 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Sun, 1 Jun 2025 09:29:48 +0800 Subject: [PATCH] Move get_call_rtx_from to final.c Move get_call_rtx_from to final.c and call call_from_call_insn. PR other/120493 * final.cc (call_from_call_insn): Change the argument type to const rtx_call_insn *. (get_call_rtx_from): New. * rtl.h (is_a_helper ::test): New. (get_call_rtx_from): Moved to the final.cc section. * rtlanal.cc (get_call_rtx_from): Removed. Signed-off-by: H.J. Lu --- gcc/final.cc | 11 ++- gcc/rtl.h | 10 +- gcc/rtlanal.cc | 15 --- 3 files changed, 19 insertions(+), 17 deletions(-) diff --git a/gcc/final.cc b/gcc/final.cc index 12c6eb0ac09..a4dbab72914 100644 --- a/gcc/final.cc +++ b/gcc/final.cc @@ -2072,7 +2072,7 @@ output_alternate_entry_point (FILE *file, rtx_insn *insn) /* Given a CALL_INSN, find and return the nested CALL. */ static rtx -call_from_call_insn (rtx_call_insn *insn) +call_from_call_insn (const rtx_call_insn *insn) { rtx x; gcc_assert (CALL_P (insn)); @@ -2098,6 +2098,15 @@ call_from_call_insn (rtx_call_insn *insn) return x; } +/* Return the CALL in X if there is one. */ + +rtx +get_call_rtx_from (const rtx_insn *insn) +{ + const rtx_call_insn *call_insn = as_a (insn); + return call_from_call_insn (call_insn); +} + /* Print a comment into the asm showing FILENAME, LINENUM, and the corresponding source line, if available. */ diff --git a/gcc/rtl.h b/gcc/rtl.h index 7049ed775e8..8a740219c2a 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -957,6 +957,14 @@ is_a_helper ::test (rtx_insn *insn) return CALL_P (insn); } +template <> +template <> +inline bool +is_a_helper ::test (const rtx_insn *insn) +{ + return CALL_P (insn); +} + template <> template <> inline bool @@ -3682,7 +3690,6 @@ extern bool nonzero_address_p (const_rtx); extern bool rtx_unstable_p (const_rtx); extern bool rtx_varies_p (const_rtx, bool); extern bool rtx_addr_varies_p (const_rtx, bool); -extern rtx get_call_rtx_from (const rtx_insn *); extern tree get_call_fndecl (const rtx_insn *); extern HOST_WIDE_INT get_integer_term (const_rtx); extern rtx get_related_value (const_rtx); @@ -4572,6 +4579,7 @@ extern void simplify_using_condition (rtx, rtx *, bitmap); extern void compute_alignments (void); extern void update_alignments (vec &); extern int asm_str_count (const char *templ); +extern rtx get_call_rtx_from (const rtx_insn *); struct rtl_hooks { diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc index 900f53e252a..239d6691c4c 100644 --- a/gcc/rtlanal.cc +++ b/gcc/rtlanal.cc @@ -813,21 +813,6 @@ rtx_addr_varies_p (const_rtx x, bool for_alias) return false; } -/* Return the CALL in X if there is one. */ - -rtx -get_call_rtx_from (const rtx_insn *insn) -{ - rtx x = PATTERN (insn); - if (GET_CODE (x) == PARALLEL) -x = XVECEXP (x, 0, 0); - if (GET_CODE (x) == SET) -x = SET_SRC (x); - if (GET_CODE (x) == CALL && MEM_P (XEXP (x, 0))) -return x; - return NULL_RTX; -} - /* Get the declaration of the function called by INSN. */ tree -- 2.49.0
Re: [PATCH] Fix crash with constant initializer caused by IPA
> If one wants to look up something in symbol table during late optimization > one is supposed to check that ::get does not return NULL and be conservative > otheriwse. So your change for PR120156 was OK with me, but I did not > explicitly write (sorty for that). Similarly for vectorizer I would simply > test for node to exist. That's a bit ugly since we test decl_in_symtab_p just before, but of course very safe for release branches (the crash for the Ada testcase is a regression present in GCC 12 and later). Tested on x86-64/Linux, OK for all branches (once they reopen)? * tree-vect-data-refs.cc (vect_can_force_dr_alignment_p): Return false if the variable has no symtab node. -- Eric Botcazou diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 9fd1ef29650..6e35f549a8c 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -7242,7 +7242,8 @@ vect_can_force_dr_alignment_p (const_tree decl, poly_uint64 alignment) return false; if (decl_in_symtab_p (decl) - && !symtab_node::get (decl)->can_increase_alignment_p ()) + && (!symtab_node::get (decl) + || !symtab_node::get (decl)->can_increase_alignment_p ())) return false; if (TREE_STATIC (decl))
Re: [PATCH] aarch64:sve: Use create_tmp_reg_or_ssa_name instead of create_tmp_var in the folder
On Sat, May 31, 2025 at 8:41 PM Andrew Pinski wrote: > > Currently gimple_folder::convert_and_fold calls create_tmp_var; that means > while in ssa form, > the pass which calls fold_stmt will always have to update the ssa (via > TODO_update_ssa or otherwise). > This seems not very useful since we know that this will always be a ssa name, > using create_tmp_reg_or_ssa_name > instead is better and don't need to depend on the ssa updater. Plus this > should have a small compile time performance > and memory usage improvement too since this uses an anonymous ssa name rather > than creating a full decl for this. Note the gimple_in_ssa_p check in create_tmp_reg_or_ssa_name is superfluous these days given the gimplifier already creates SSA names for temporaries when not in SSA form (no PHI nodes, no CFG). So in principle we could switch create_tmp_reg to always yield a SSA name. Thus for the fixup here I'd suggest using make_ssa_name (type) instead. Richard. > Built and tested on aarch64-linux-gnu. > > gcc/ChangeLog: > > * config/aarch64/aarch64-sve-builtins.cc > (gimple_folder::convert_and_fold): Use create_tmp_reg_or_ssa_name > instead of create_tmp_var for the temporary. > > Signed-off-by: Andrew Pinski > --- > gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc > b/gcc/config/aarch64/aarch64-sve-builtins.cc > index 36519262efd..329237b086d 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins.cc > +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc > @@ -3675,7 +3675,7 @@ gimple_folder::convert_and_fold (tree type, >tree old_ty = TREE_TYPE (lhs); >gimple_seq stmts = NULL; >bool convert_lhs_p = !useless_type_conversion_p (type, old_ty); > - tree lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs; > + tree lhs_conv = convert_lhs_p ? create_tmp_reg_or_ssa_name (type) : lhs; >unsigned int num_args = gimple_call_num_args (call); >auto_vec args_conv; >args_conv.safe_grow (num_args); > -- > 2.43.0 >
Re: [PATCH] Move get_call_rtx_from to final.c
On Sun, Jun 1, 2025 at 9:14 AM H.J. Lu wrote: > > Move get_call_rtx_from to final.c and call call_from_call_insn. > > PR other/120493 > * final.cc (call_from_call_insn): Change the argument type to > const rtx_call_insn *. > (get_call_rtx_from): New. > * rtl.h (is_a_helper ::test): New. > (get_call_rtx_from): Moved to the final.cc section. > * rtlanal.cc (get_call_rtx_from): Removed. OK. Thanks, Richard. > Tested on x86-64. > > -- > H.J.
Re: [PATCH] Fix crash with constant initializer caused by IPA
On Sun, Jun 1, 2025 at 12:54 PM Eric Botcazou wrote: > > > If one wants to look up something in symbol table during late optimization > > one is supposed to check that ::get does not return NULL and be conservative > > otheriwse. So your change for PR120156 was OK with me, but I did not > > explicitly write (sorty for that). Similarly for vectorizer I would simply > > test for node to exist. > > That's a bit ugly since we test decl_in_symtab_p just before, but of course > very safe for release branches (the crash for the Ada testcase is a regression > present in GCC 12 and later). > > Tested on x86-64/Linux, OK for all branches (once they reopen)? OK. Thanks, Richard. > > * tree-vect-data-refs.cc (vect_can_force_dr_alignment_p): Return > false if the variable has no symtab node. > > -- > Eric Botcazou
Re: DEFAULT_PCC_STRUCT_RETURN on NetBSD vs Linux
On Wed, 2025-05-28 at 18:10 +0200, Andreas Schwab wrote: > On Mai 28 2025, John Paul Adrian Glaubitz wrote: > > > Shouldn't the #undef in linux.h undefine DEFAULT_PCC_STRUCT_RETURN and not > > PCC_STATIC_STRUCT_RETURN? > > No, they are separate target options. PCC_STATIC_STRUCT_RETURN is no > longer defined by default, so this is redundant now. OK, guess someone can drop that line then. > > And, secondly, shouldn't the comment in linux.h be corrected since > > apparently linux.h and netbsd-elf.h disagree on what the SVR4 ABI > > specifies how structs and unions are returned? > > This is controlled by TARGET_RETURN_IN_MEMORY if > DEFAULT_PCC_STRUCT_RETURN is 0. I was talking about the comments, not the code since NetBSD and Linux disagree on what the SVR4 ABI claims. Btw, shouldn't EMPTY_FIELD_BOUNDARY not changed by -malign-int to 32 similar to BIGGEST_ALIGNMENT (see netbsd-elf.h)? And what about the value for STACK_BOUNDARY? It seems to be 16 for many Linux targets while it's 32 for NetBSD. Is there a reason why it's 16-bit- aligned on Linux? Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer `. `' Physicist `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: [PATCH] RISC-V: Support CPUs in -march.
On Sun, Jun 1, 2025 at 4:06 AM Robin Dapp wrote: > > I stumped across this change from > > https://github.com/riscv-non-isa/riscv-toolchain-conventions/issues/88 > > and I want to express my strong disagreement with this change. > > > > > > Perhaps I'm accustomed to Arm's behavior, but I believe using -march= to > > target a specific CPU isn't ideal. > > > > * -march=X: (execution domain) Generate code that can use instructions > > available in the architecture X > > * -mtune=X: (optimization domain) Optimize for the microarchitecture X, but > > does not change the ABI or make assumptions about available instructions > > * -mcpu=X: Specify both -march= and -mtune= but can be overridden by the two > > options. The supported values are generally the same as -mtune=. The > > architecture name is inferred from X > > > > For execution domain settings, -march=X overrides -mcpu=X regardless of > > their > > positions. > > > > In cases like `-march=LOWER -mcpu=HIGHER` or `-mcpu=HIGHER -march=LOWER`, > > the > > -march= option can disable certain target features. > > > > I strongly disagree with Clang adopting this behavior. > > I'm not convinced by the GCC patch explanation. > > > >> Suppose we have a Makefile that specifies -march=rv64gc by default. > > > > In the project specifies a lower feature set, then the compiler should > > respect it or the user should fix the project build system. > > There are two reasons for the change: > > - Adhere to the usual "last option wins" convention and at the same time This rule clearly applies to directly related options like -ffoo and -fno-foo, but it’s less obvious for unrelated pairs like -ffoo and -fbar, especially when there is traditionally strong specifics. In many cases, the principle of "the most specific option wins" governs the behavior. > - Make it easy to specify an overriding -march string representing a CPU >without spelling out the >200 char expanded arch string, a very error-prone >procedure. > > I agree that fixing the "project build system" might be another way of having > only a single -march in a compiler command line but it doesn't match my > experience. Over the years I have seen a significant number of projects where > multiple -marchs where present. With the "last option wins" convention this > works and a user always has a "last ditch" way of setting their preferred > architecture. > A typical use case is a default march that is overridden by CFLAGS. There's > nothing to be fixed here because it's standard procedure. In that situation > we > would always need the full -march string and that can't be what we want? When multiple -march= options are specified, the last one wins. However, this does not apply when combining -march= and -mcpu=, as -march= is intended to restrict the feature set. If you'd like to propose that -mcpu= should override -march=, one approach could be introducing a special value for -march=, such as -march=unset. > What's your main concern? The less clear separation between arch and CPU? > I don't think the change takes anything away from -march but just adds a > shortcut "set the arch level to what this CPU supports". That's fully in line > with what other targets are doing, also on the clang side. * -march= and -mcpu= has become less orthogonal, requiring users to navigate more complex rules. * We've lost the ability to use combinations like -march=LOWER -mcpu=HIGHER or -mcpu=HIGHER -march=LOWER to disable specific target features or detect incompatibilities. * We landed the commit 4a182418c89666e7594bcb0e5edc5194aa147910 hastily on May 23, 2025, with far-reaching implications that weren't fully considered. > Ideally (IMHO) we'd also imply an -mtune when specifying -march by default > rather than using -mtune=rocket or so. I strongly disagree with implicitly setting -mtune= when specifying -march=. Adding more semantics to -march complicates the rules users must follow and increases the risk of error-prone patterns. The traditional behavior established by aarch64, in contrast, is easy to remember: -mcpu=X: Specify both -march= and -mtune= but can be overridden by the two options. The supported values are generally the same as -mtune=. The architecture name is inferred from X > If we had an -mcpu=... that overrode -march and -mtune we'd already have a > good > way of achieving what I intended. In that case -march wouldn't need to be > changed but to my understanding this has been unwanted so far. For me a "last > option wins" -mcpu that overrides march and mtune would also work and would > keep the clear separation between -march and -mtune.
[PATCH] gcc: middle-end opt for trigonometric pi-based functions builtins
Hello world, This patch partially handled PR118592. This patch builds upon r16-710-g591d3d02664c7b and r16-711-g89935d56f768b4. It introduces middle-end optimizations, such as constant folding, for our trigonometric pi-based function built-ins. For MPFR versions older than 4.2.0, we've included our own folding functions. I've also added tests to ensure everything works as expected. Best regards, Yuao 0001-gcc-middle-end-opt-for-trigonometric-pi-based-functi.patch Description: 0001-gcc-middle-end-opt-for-trigonometric-pi-based-functi.patch
Re: [PATCH] forwprop: Manually rename the virtual mem op for complex and vector loads prop
> Am 01.06.2025 um 04:16 schrieb Andrew Pinski : > > There are two places which forwprop replaces an original load to a few > different loads. > Both can set the vuse manually instead of relying on update_ssa. > One is doing a complex load followed by REAL/IMAG_PART only > And the other is very similar but for vector loads followed by BIT_FIELD_REF. > > Since this was the last place that needed to handle updating the ssa form, > Remove the TODO_update_ssa also from the pass. Ok Richard > gcc/ChangeLog: > >* tree-ssa-forwprop.cc (optimize_vector_load): Set the vuse manually >on the new load statements. Also remove forward declaration since >the definition is before the first use. >(pass_forwprop::execute): Likewise for complex loads. >(pass_data_forwprop): Remove TODO_update_ssa. > > Signed-off-by: Andrew Pinski > --- > gcc/tree-ssa-forwprop.cc | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc > index 81ea7d4195e..75901ecfbb0 100644 > --- a/gcc/tree-ssa-forwprop.cc > +++ b/gcc/tree-ssa-forwprop.cc > @@ -205,7 +205,6 @@ struct _vec_perm_simplify_seq > typedef struct _vec_perm_simplify_seq *vec_perm_simplify_seq; > > static bool forward_propagate_addr_expr (tree, tree, bool); > -static void optimize_vector_load (gimple_stmt_iterator *); > > /* Set to true if we delete dead edges during the optimization. */ > static bool cfg_changed; > @@ -3387,6 +3386,7 @@ optimize_vector_load (gimple_stmt_iterator *gsi) > gimple *stmt = gsi_stmt (*gsi); > tree lhs = gimple_assign_lhs (stmt); > tree rhs = gimple_assign_rhs1 (stmt); > + tree vuse = gimple_vuse (stmt); > > /* Gather BIT_FIELD_REFs to rewrite, looking through > VEC_UNPACK_{LO,HI}_EXPR. */ > @@ -3495,6 +3495,7 @@ optimize_vector_load (gimple_stmt_iterator *gsi) > gimple *new_stmt = gimple_build_assign (tem, new_rhs); > location_t loc = gimple_location (use_stmt); > gimple_set_location (new_stmt, loc); > + gimple_set_vuse (new_stmt, vuse); > gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); > /* Perform scalar promotion. */ > new_stmt = gimple_build_assign (gimple_assign_lhs (use_stmt), > @@ -3514,6 +3515,7 @@ optimize_vector_load (gimple_stmt_iterator *gsi) > new_rhs); > location_t loc = gimple_location (use_stmt); > gimple_set_location (new_stmt, loc); > + gimple_set_vuse (new_stmt, vuse); > gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); >} > gimple_stmt_iterator gsi2 = gsi_for_stmt (use_stmt); > @@ -4167,7 +4169,7 @@ const pass_data pass_data_forwprop = > 0, /* properties_provided */ > 0, /* properties_destroyed */ > 0, /* todo_flags_start */ > - TODO_update_ssa, /* todo_flags_finish */ > + 0, /* todo_flags_finish */ > }; > > class pass_forwprop : public gimple_opt_pass > @@ -4404,6 +4406,7 @@ pass_forwprop::execute (function *fun) > component-wise loads. */ > use_operand_p use_p; > imm_use_iterator iter; > + tree vuse = gimple_vuse (stmt); > bool rewrite = true; > FOR_EACH_IMM_USE_FAST (use_p, iter, lhs) >{ > @@ -4443,6 +4446,7 @@ pass_forwprop::execute (function *fun) > > location_t loc = gimple_location (use_stmt); > gimple_set_location (new_stmt, loc); > + gimple_set_vuse (new_stmt, vuse); > gimple_stmt_iterator gsi2 = gsi_for_stmt (use_stmt); > unlink_stmt_vdef (use_stmt); > gsi_remove (&gsi2, true); > -- > 2.43.0 >
[PATCH v2] c++, coroutines: CWG2563 promise lifetime extension [PR115908].
Updated. I realised we no longer need to refer to initial_await_resume_called in the ramp at all, v2 removes the setting of the variable from there and puts it into the start of the actor. Tested on x86_64-darwin and powerp64le-linux. Confirmed that the sanitizer test in PR 118074 is fixed but we already have a suitable testcase in PR 115908 which has now been moved to the 'torture' sub-directory to get wider code-gen coverage. OK for trunk? thanks Iain --- 8< --- This implements the final piece of the revised CWG2563 wording; "It exits the scope of promise only if the coroutine completed without suspending." The change removes the need to track whether we have reached the ramp return. We can now assume that the coroutine frame is valid when constructing the ramp return value. The ramp function no longer needs to check initial_await_resume_called, and the references to that can be made local to the resumer. The cleanups for promise, arg copies and frame are now adjusted to depend only on whether a suspend has occurred. PR c++/115908 PR c++/118074 gcc/cp/ChangeLog: * coroutines.cc: New GTY coro_ramp_cleanup_id. (coro_init_identifiers): Initialize coro_ramp_cleanup_id. (build_actor_fn): Set coro_ramp_cleanup_id false when we exit via a suspend or continuation return. Do not clean up if we pass the final await expression without suspending. (cp_coroutine_transform::wrap_original_function_body): Declare _Coro_ramp_cleanup. (cp_coroutine_transform::build_ramp_function): Amend cleanups to use coro_ramp_cleanup and set that to be true on init. Remove the use of _Coro_befoe_return. gcc/testsuite/ChangeLog: * g++.dg/coroutines/pr115908.C: Move to... * g++.dg/coroutines/torture/pr115908.C: ...here. Signed-off-by: Iain Sandoe --- gcc/cp/coroutines.cc | 121 +- .../coroutines/{ => torture}/pr115908.C | 9 +- 2 files changed, 60 insertions(+), 70 deletions(-) rename gcc/testsuite/g++.dg/coroutines/{ => torture}/pr115908.C (88%) diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc index a2b6d53805a..42f719ddde1 100644 --- a/gcc/cp/coroutines.cc +++ b/gcc/cp/coroutines.cc @@ -348,6 +348,7 @@ static GTY(()) tree coro_resume_index_id; static GTY(()) tree coro_self_handle_id; static GTY(()) tree coro_actor_continue_id; static GTY(()) tree coro_frame_i_a_r_c_id; +static GTY(()) tree coro_ramp_cleanup_id; /* Create the identifiers used by the coroutines library interfaces and the implementation frame state. */ @@ -385,6 +386,7 @@ coro_init_identifiers () coro_resume_index_id = get_identifier ("_Coro_resume_index"); coro_self_handle_id = get_identifier ("_Coro_self_handle"); coro_actor_continue_id = get_identifier ("_Coro_actor_continue"); + coro_ramp_cleanup_id = get_identifier ("_Coro_ramp_should_cleanup"); } /* Trees we only need to set up once. */ @@ -2568,6 +2570,17 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody, /* Now we start building the rewritten function body. */ add_stmt (build_stmt (loc, LABEL_EXPR, actor_begin_label)); + tree i_a_r_c = NULL_TREE; + if (flag_exceptions) +{ + i_a_r_c + = coro_build_frame_access_expr (actor_frame, coro_frame_i_a_r_c_id, + false, tf_warning_or_error); + tree m = cp_build_modify_expr (loc, i_a_r_c, NOP_EXPR, + boolean_false_node, tf_warning_or_error); + finish_expr_stmt (m); +} + /* actor's coroutine 'self handle'. */ tree ash = coro_build_frame_access_expr (actor_frame, coro_self_handle_id, false, tf_warning_or_error); @@ -2597,6 +2610,16 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody, /* Add in our function body with the co_returns rewritten to final form. */ add_stmt (fnbody); + /* If we reach here without suspending, then the ramp should cleanup. */ + tree ramp_should_cleanup += coro_build_frame_access_expr (actor_frame, coro_ramp_cleanup_id, + false, tf_warning_or_error); + tree ramp_cu_if = begin_if_stmt (); + finish_if_stmt_cond (ramp_should_cleanup, ramp_cu_if); + finish_return_stmt (NULL_TREE); + finish_then_clause (ramp_cu_if); + finish_if_stmt (ramp_cu_if); + /* Now do the tail of the function; first cleanups. */ r = build_stmt (loc, LABEL_EXPR, del_promise_label); add_stmt (r); @@ -2638,6 +2661,9 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody, /* This is the suspend return point. */ add_stmt (build_stmt (loc, LABEL_EXPR, ret_label)); + r = cp_build_modify_expr (loc, ramp_should_cleanup, NOP_EXPR, + boolean_false_node, tf_warning_or_error); + finish_expr_stmt (r); finish_return_stmt (NULL_TREE); /* Th
[PATCH v2] aarch64:sve: Use make_ssa_name instead of create_tmp_var in the folder
Currently gimple_folder::convert_and_fold calls create_tmp_var; that means while in ssa form, the pass which calls fold_stmt will always have to update the ssa (via TODO_update_ssa or otherwise). This seems not very useful since we know that this will always be a ssa name, using make_ssa_name instead is better and don't need to depend on the ssa updater. Plus this should have a small compile time performance and memory usage improvement too since this uses an anonymous ssa name rather than creating a full decl for this. Changes since v1: * Use make_ssa_name instead of create_tmp_reg_or_ssa_name, anonymous ssa names are allowed early on in gimple too. Built and tested on aarch64-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins.cc: Include value-range.h and tree-ssanames.h (gimple_folder::convert_and_fold): Use make_ssa_name instead of create_tmp_var for the temporary. Add comment about callback argument. Signed-off-by: Andrew Pinski --- gcc/config/aarch64/aarch64-sve-builtins.cc | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 36519262efd..8540aef47b2 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -47,6 +47,8 @@ #include "langhooks.h" #include "stringpool.h" #include "attribs.h" +#include "value-range.h" +#include "tree-ssanames.h" #include "aarch64-sve-builtins.h" #include "aarch64-sve-builtins-base.h" #include "aarch64-sve-builtins-sve2.h" @@ -3664,7 +3666,8 @@ gimple_folder::fold_pfalse () /* Convert the lhs and all non-boolean vector-type operands to TYPE. Pass the converted variables to the callback FP, and finally convert the result back to the original type. Add the necessary conversion statements. - Return the new call. */ + Return the new call. Note the tree argument to the callback FP, can only be set + once; it will always be a SSA_NAME. */ gimple * gimple_folder::convert_and_fold (tree type, gimple *(*fp) (gimple_folder &, @@ -3675,7 +3678,7 @@ gimple_folder::convert_and_fold (tree type, tree old_ty = TREE_TYPE (lhs); gimple_seq stmts = NULL; bool convert_lhs_p = !useless_type_conversion_p (type, old_ty); - tree lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs; + tree lhs_conv = convert_lhs_p ? make_ssa_name (type) : lhs; unsigned int num_args = gimple_call_num_args (call); auto_vec args_conv; args_conv.safe_grow (num_args); -- 2.43.0
[committed] cobol: Wrap the call to fprintf in a libgcobol routine. [PR119524]
>From 501f95ffae0371e2335f89951a02a3a32f0cd53d Mon Sep 17 00:00:00 2001 From: Robert Dubner mailto:rdub...@symas.com Date: Sun, 1 Jun 2025 12:32:37 -0400 Subject: [PATCH] [PR119524] --- gcc/cobol/gengen.cc| 16 +--- libgcobol/libgcobol.cc | 14 ++ 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/gcc/cobol/gengen.cc b/gcc/cobol/gengen.cc index 91f67d534f6..a5f143cf234 100644 --- a/gcc/cobol/gengen.cc +++ b/gcc/cobol/gengen.cc @@ -2152,18 +2152,6 @@ gg_printf(const char *format_string, ...) int nargs = 0; tree args[ARG_LIMIT]; - // Because this routine is intended for debugging, we are sending the - // text to STDERR - - // Because we don't actually use stderr ourselves, we just pick it up as a - // VOID_P and pass it along to fprintf() - tree t_stderr = gg_declare_variable(VOID_P, "stderr", - NULL_TREE, - vs_external_reference); - - gg_push_context(); - - args[nargs++] = t_stderr; args[nargs++] = build_string_literal(strlen(format_string)+1, format_string); va_list ap; @@ -2197,7 +2185,7 @@ gg_printf(const char *format_string, ...) static tree function = NULL_TREE; if( !function ) { -function = gg_get_function_address(INT, "fprintf"); +function = gg_get_function_address(INT, "__gg__fprintf_stderr"); } tree stmt = build_call_array_loc (location_from_lineno(), @@ -2206,8 +2194,6 @@ gg_printf(const char *format_string, ...) nargs, args); gg_append_statement(stmt); - - gg_pop_context(); } tree diff --git a/libgcobol/libgcobol.cc b/libgcobol/libgcobol.cc index 6721dd20740..9fd160b7083 100644 --- a/libgcobol/libgcobol.cc +++ b/libgcobol/libgcobol.cc @@ -49,6 +49,7 @@ #include #include #include +#include #if __has_include() # include // for program_invocation_short_name #endif @@ -13159,3 +13160,16 @@ __gg__set_env_value(cblc_field_t *value, // And now, anticlimactically, set the variable: setenv(trimmed_env, trimmed_val, 1); } + +extern "C" +void +__gg__fprintf_stderr(const char *format_string, ...) + { + /* This routine allows the compiler to send stuff to stderr in a way + that is straightforward to use.. */ + va_list ap; + va_start(ap, format_string); + vfprintf(stderr, format_string, ap); + va_end(ap); + } + -- 2.34.1
Re: [PATCH] aarch64:sve: Use create_tmp_reg_or_ssa_name instead of create_tmp_var in the folder
On Sun, Jun 1, 2025 at 3:54 AM Richard Biener wrote: > > On Sat, May 31, 2025 at 8:41 PM Andrew Pinski > wrote: > > > > Currently gimple_folder::convert_and_fold calls create_tmp_var; that means > > while in ssa form, > > the pass which calls fold_stmt will always have to update the ssa (via > > TODO_update_ssa or otherwise). > > This seems not very useful since we know that this will always be a ssa > > name, using create_tmp_reg_or_ssa_name > > instead is better and don't need to depend on the ssa updater. Plus this > > should have a small compile time performance > > and memory usage improvement too since this uses an anonymous ssa name > > rather than creating a full decl for this. > > Note the gimple_in_ssa_p check in create_tmp_reg_or_ssa_name is > superfluous these days > given the gimplifier already creates SSA names for temporaries when > not in SSA form > (no PHI nodes, no CFG). So in principle we could switch > create_tmp_reg to always yield a SSA name. > > Thus for the fixup here I'd suggest using make_ssa_name (type) instead. yes that worked, I submitted https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685263.html for that. I am waiting on this being approved before pushing the CCP patch that removes update_ssa since the CCP exposes this issue. Thanks, Andrew > > Richard. > > > Built and tested on aarch64-linux-gnu. > > > > gcc/ChangeLog: > > > > * config/aarch64/aarch64-sve-builtins.cc > > (gimple_folder::convert_and_fold): Use create_tmp_reg_or_ssa_name > > instead of create_tmp_var for the temporary. > > > > Signed-off-by: Andrew Pinski > > --- > > gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc > > b/gcc/config/aarch64/aarch64-sve-builtins.cc > > index 36519262efd..329237b086d 100644 > > --- a/gcc/config/aarch64/aarch64-sve-builtins.cc > > +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc > > @@ -3675,7 +3675,7 @@ gimple_folder::convert_and_fold (tree type, > >tree old_ty = TREE_TYPE (lhs); > >gimple_seq stmts = NULL; > >bool convert_lhs_p = !useless_type_conversion_p (type, old_ty); > > - tree lhs_conv = convert_lhs_p ? create_tmp_var (type) : lhs; > > + tree lhs_conv = convert_lhs_p ? create_tmp_reg_or_ssa_name (type) : lhs; > >unsigned int num_args = gimple_call_num_args (call); > >auto_vec args_conv; > >args_conv.safe_grow (num_args); > > -- > > 2.43.0 > >
[PATCH] Add newlib to gitignore
newlib sources are not a part of GCC so should be ignored, if present. Signed-off-by: Arijit Kumar Das --- .gitignore | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.gitignore b/.gitignore index 7150fc3b29c..5ae6a5a08d6 100644 --- a/.gitignore +++ b/.gitignore @@ -74,3 +74,6 @@ stamp-* # ADDITIONS from GCCRS front-end libgrust/*/target/ + +# newlib sources are not a part of GCC so should be ignored, if present as a symlink +newlib -- 2.39.5