[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702

--- Comment #19 from Segher Boessenkool  ---
(In reply to Avinash Jayakar from comment #17)
> I looked at the slp vectorization pass that converts scalar gimple code to

"straight-line parallelisation".  Some "scalar" (whatever that means) things
are most optimally implemented using "vector" instructions.  The SLP pass is
what makes this happen.

> vectorized gimple. Analysis happens in vect_slp_analyze_bb_1 before actually
> scheduling in vect_schedule_slp. There are multiple patterns written here to
> optimize simple operations such as *2 to <<1 in vect_recog_mult_pattern. 
> I have added a pattern just for detecting left shift by one and replacing it
> by add in vect_recog_lshift_by_one_pattern. Either this can be done, or I
> can move this logic in a shift pattern (vect_recog_widen_shift_pattern or
> vect_recog_vector_vector_shift_pattern). 
> 
> This does fix the original issue, where <<1 generates 2 instructions. With
> this patch it just generates 1 add instruction when code is vectorized. But
> other cases like *2 and a = a+a are not handled right now.
> 
> @Segher, I had a few questions on this 

This is bugzilla, this is not twitter, "@" means nothing (and my username is
"segher", not "Segher").

> - Do you suggest moving ahead in this direction? Since here I am
> manipulating the GIMPLE, it will affect different architectures as well,
> would this be ok?

If it does something that does make sense here, it is a good addition.  For
other archs as well (although Gimple-level optimisations are so very far away
from the eventual machine code that it is hard to talk about the machine code
there at all: you are transforming some bit of Gimple code to some nicer piece
of Gimple code!)
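
A minimal sketch of why the rewrite discussed here is value-preserving, assuming 64-bit unsigned lanes that wrap on overflow. The helper names are made up for illustration and are not GCC internals; the lists merely stand in for vector registers.

```python
MASK64 = (1 << 64) - 1  # model one 64-bit unsigned vector lane


def shift_left_by_one(lanes):
    """Reference semantics: every lane shifted left by one bit."""
    return [(x << 1) & MASK64 for x in lanes]


def add_to_self(lanes):
    """Candidate replacement: every lane added to itself, wrapping at 64 bits."""
    return [(x + x) & MASK64 for x in lanes]


def multiply_by_two(lanes):
    """The '*2' form mentioned in comment #17, also wrapping at 64 bits."""
    return [(x * 2) & MASK64 for x in lanes]


# Include values that overflow the lane to exercise the wrap-around case.
lanes = [0, 1, 0x7FFFFFFFFFFFFFFF, 0x8000000000000001, MASK64]
assert shift_left_by_one(lanes) == add_to_self(lanes) == multiply_by_two(lanes)
```

All three forms agree lane by lane, which is why recognising one of them and emitting a single add loses nothing at the value level; whether the add is actually cheaper is a target question the sketch does not address.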

> (In reply to Segher Boessenkool from comment #15)
> > Just have it recognised by a define_insn that generates an addition insn
> > when generating assembler code.  You know...  the same as always :-)
> > 
> 
> - Thank you for this suggestion, I did give this a try but ran into a few
> issues. Is there a way in define_insn to detect that one of the operands in
> rtl is dead?

Yes.

dead_or_set_p perhaps.  It all depends on context.  You can use all of DF as
well of course.

> Because we need to be sure that const_1 is not used anywhere
> further before replacing the 2 rtl insns (splat and shift), with just 1
> (plus). I checked the define_peephole2, it provides a way to check if an
> operand is dead. Would using the peephole pass for this make sense?

Peepholes make no sense ever, hehe.  Sometimes they are the most convenient
solution though.

You are thinking about peep2_reg_dead_p?

There are better, more modern, solutions almost always :-)  Text-based
peepholes have been eradicated from most places, now peephole2 should go the
way of the dodo :-)

[Bug middle-end/121453] New: [OpenMP] 'omp simd' with 'collapse' – variable '.count' uninitialized, but used as 'if (.iter.14 == .count.15)'

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121453

Bug ID: 121453
   Summary: [OpenMP] 'omp simd' with 'collapse' – variable
'.count' uninitialized, but used as 'if (.iter.14 ==
.count.15)'
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: openmp, wrong-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

This shows up with the SPEC ACCEL testcase '455.seismic' by failing with nvptx
offload as:
  libgomp: cuCtxSynchronize error: an illegal memory access was encountered
or in the debugger:
  CUDA Exception: Warp Out-of-range Address

This seems to be due to what is diagnosed by '-Wall':
  warning: ‘.count.1179’ may be used uninitialized


Simplified example – compile with '-fopenmp' (and for the warning, add some
optimization level; depending on the code, -O3 or -O1 is enough).

Dump with '-fopenmp -O0':

  unsigned int .count.34;
  unsigned int .count.22;
  if (.iter.33 == .count.34) goto ; else goto ;

That is: '.count.34' is used in a condition but never actually
assigned any value.

* * *

Both .iter and .count are generated in 'omp_extract_for_data'.

BTW: I tried gfortran-7 – and it shows the same issue according to the omplower
tree dump.

* * *

! Simplified example

implicit none
double precision, allocatable, dimension(:,:,:) :: vv
integer :: NX, NY, NZ 
integer :: k, j, i

NX = 5; NY = 5; NZ = 5

allocate (vv(NX,NY,NZ))

!$omp target teams distribute parallel do simd collapse(3)
! - simpler version:   !$omp target simd  collapse(3)
! - simplest version:  !$omp simd  collapse(3)
  do k=1, NZ
    do j=1, NY
      do i=1, NX
        vv(i,j,k) = 0.d0
      enddo
    enddo
  enddo
end
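
A conceptual sketch of what collapsing the three loops above means, to show why the total iteration count has to be assigned before it is compared against the running iteration number. This is not the code GCC generates in 'omp_extract_for_data'; the function and variable names are hypothetical.

```python
def collapsed_iterations(nz, ny, nx):
    """Yield (k, j, i) for a 3-deep loop nest executed as one linear loop."""
    count = nz * ny * nx  # must be assigned before the loop condition reads it
    it = 0
    while it != count:  # loosely analogous to the '.iter == .count' test
        # Recover the three loop indices from the linear iteration number.
        k, rem = divmod(it, ny * nx)
        j, i = divmod(rem, nx)
        yield k + 1, j + 1, i + 1  # the Fortran loops start at 1
        it += 1


# NX = NY = NZ = 5 as in the example: 125 iterations in total.
assert len(list(collapsed_iterations(5, 5, 5))) == 125
```

If 'count' were left unassigned, as the dump suggests for '.count.34', the loop bound would be arbitrary, which fits the out-of-range accesses seen on the device.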

[Bug c++/121335] Vulkan module ICE

2025-08-07 Thread kongmingd234 at proton dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121335

--- Comment #6 from kongmingd234  ---

Here is the new output of putting the includes before the imports.
As you can see, the std module still has the ICE, but the main file does not,
instead, there are other errors.


Compiling /usr/include/vulkan/vulkan.cppm
c++ -fmodules -std=c++23 -O3 /usr/include/vulkan/vulkan.cppm -c -o /dev/null
Compiling /usr/local/src/gcc/installed-here/include/c++/16.0.0/bits/std.cc
c++ -fmodules -std=c++23 -O3
/usr/local/src/gcc/installed-here/include/c++/16.0.0/bits/std.cc -c -o
/dev/null
In file included from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/bits/locale_facets_nonio.h:2071,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/locale:45,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/format:49,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/ostream:44,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/istream:43,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/sstream:42,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/complex:50,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/x86_64-pc-linux-gnu/bits/stdc++.h:141,
 from
/usr/local/src/gcc/installed-here/include/c++/16.0.0/bits/std.cc:30:
/usr/local/src/gcc/installed-here/include/c++/16.0.0/bits/locale_facets_nonio.tcc:42:12:
internal compiler error: in lookup_mark, at cp/tree.cc:2524
   42 | struct __use_cache<__moneypunct_cache<_CharT, _Intl> >
  |^~~
0x29b73cf internal_error(char const*, ...)
../.././gcc/diagnostic-global-context.cc:517
0xb1f6fb fancy_abort(char const*, int, char const*)
../.././gcc/diagnostic.cc:1810
0x895526 lookup_mark(tree_node*, bool)
../.././gcc/cp/tree.cc:2524
0xcd0bac name_lookup::dedup(bool)
../.././gcc/cp/name-lookup.cc:486
0xcd0bac name_lookup::dedup(bool)
../.././gcc/cp/name-lookup.cc:481
0xcd0bac name_lookup::search_unqualified(tree_node*, cp_binding_level*)
../.././gcc/cp/name-lookup.cc:1184
0xcd5c86 lookup_name(tree_node*, LOOK_where, LOOK_want)
../.././gcc/cp/name-lookup.cc:8131
0xce85df lookup_name(tree_node*, LOOK_want)
../.././gcc/cp/name-lookup.h:410
0xce85df cp_parser_lookup_name
../.././gcc/cp/parser.cc:33057
0xd30376 cp_parser_template_name
../.././gcc/cp/parser.cc:19945
0xd30b4a cp_parser_template_id
../.././gcc/cp/parser.cc:19552
0xd314b6 cp_parser_class_name
../.././gcc/cp/parser.cc:27449
0xd2b1c7 cp_parser_qualifying_entity
../.././gcc/cp/parser.cc:7704
0xd2b1c7 cp_parser_nested_name_specifier_opt
../.././gcc/cp/parser.cc:7390
0xd3a7a7 cp_parser_class_head
../.././gcc/cp/parser.cc:28119
0xd1b62b cp_parser_class_specifier
../.././gcc/cp/parser.cc:27570
0xd1d0cb cp_parser_type_specifier
../.././gcc/cp/parser.cc:20722
0xd3bb17 cp_parser_decl_specifier_seq
../.././gcc/cp/parser.cc:17308
0xd421c1 cp_parser_single_declaration
../.././gcc/cp/parser.cc:34027
0xd4271e cp_parser_template_declaration_after_parameters
../.././gcc/cp/parser.cc:33784
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
Compiling vulkan_first_test.cpp
c++ -fmodules -std=c++23 -O3 vulkan_first_test.cpp -c -o vulkan_first_test.o
vulkan_first_test.cpp: In member function ‘void
HelloTriangleApplication::createLogicalDevice()’:
vulkan_first_test.cpp:36:72: error: ‘physicalDevice’ was not declared in this
scope; did you mean ‘VkPhysicalDevice’?
   36 | std::vector queueFamilyProperties =
physicalDevice.getQueueFamilyProperties();
  |   
^~
  |   
VkPhysicalDevice
vulkan_first_test.cpp:92:146: error: designated initializers cannot be used
with a non-aggregate type ‘vk::DeviceQueueCreateInfo’
   92 | eInfo deviceQueueCreateInfo { .queueFamilyIndex = graphicsIndex,
.queueCount = 1, .pQueuePriorities = &queuePriority };
  |
 ^
vulkan_first_test.cpp:92:146: error: no matching function for call to
‘vk::DeviceQueueCreateInfo::DeviceQueueCreateInfo()’
vulkan_first_test.cpp:93:145: error: designated initializers cannot be used
with a non-aggregate type ‘vk::DeviceCreateInfo’
   93 | o  deviceCreateInfo{ .pNext =  &features, .queueCreateInfoCount =
1, .pQueueCreateInfos = &deviceQueueCreateInfo };
  |
 ^
vulkan_first_test.cpp:93:145: error

[Bug c/40564] Invalid -Wc++-compat warning about stringized C++ operator name

2025-08-07 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40564

Vincent Lefèvre  changed:

           What    |Removed                     |Added

                 CC|                            |vincent-gcc at vinc17 dot net

--- Comment #3 from Vincent Lefèvre  ---
A similar one:

#define S(...) #__VA_ARGS__
const char *s = S(The first, second, and third items.);

qaa:~> gcc -c -Werror=c++-compat tst.c
tst.c:2:38: error: identifier "and" is a special operator name in C++
[-Werror=c++-compat]
2 | const char *s = S(The first, second, and third items.);
  |  ^
cc1: some warnings being treated as errors

In particular, this triggers an error in autoconf's _AC_C_C99_TEST_GLOBALS:

[...]
#define showlist(...) puts (#__VA_ARGS__)
[...]
  showlist (The first, second, and third items.);

[Bug tree-optimization/121448] [16 Regression] ICE: verify_gimple failed: gimple cond condition cannot throw with -O -fsignaling-nans -ffinite-math-only -fnon-call-exceptions

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121448

--- Comment #2 from Andrew Pinski  ---

Hmm, the options `-fsignaling-nans -ffinite-math-only` don't seem to make sense
together. I am trying to figure out the best way of changing the definition
here. See https://gcc.gnu.org/pipermail/gcc/2025-August/246483.html

[Bug target/121449] [13/14/15/16 regression] Immediate offset out of range error for AArch64 SVE gather load

2025-08-07 Thread Pengfei.Li2 at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121449

--- Comment #3 from Pengfei Li  ---
My proposed fix
https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692125.html

[Bug c++/121335] Vulkan module ICE

2025-08-07 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121335

--- Comment #7 from Patrick Palka  ---
I can't reproduce the std.cc ICE. Sometimes such ICEs happen when there's stale
stuff in the ./gcm.cache/ folder. Can you remove this folder and see if the
std.cc ICE still happens?

For the vulkan_first_test.cpp errors, can you confirm that these errors aren't
present when using ordinary includes instead of modules? They seem like
legitimate errors instead of a compiler bug.

[Bug tree-optimization/121448] New: [16 Regression] ICE: verify_gimple failed: gimple cond condition cannot throw with -O -fsignaling-nans -ffinite-math-only -fnon-call-exceptions

2025-08-07 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121448

Bug ID: 121448
   Summary: [16 Regression] ICE: verify_gimple failed: gimple cond
condition cannot throw with -O -fsignaling-nans
-ffinite-math-only -fnon-call-exceptions
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 62078
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62078&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O -fsignaling-nans -ffinite-math-only
-fnon-call-exceptions testcase.c 
testcase.c: In function 'foo':
testcase.c:5:1: error: gimple cond condition cannot throw
5 | foo(double d)
  | ^~~
if (d_2(D) == 0.0)
during GIMPLE pass: dom
testcase.c:5:1: internal compiler error: verify_gimple failed
0x2b7ed21 internal_error(char const*, ...)
/repo/gcc-trunk/gcc/diagnostic-global-context.cc:534
0x1632a1d verify_gimple_in_cfg(function*, bool, bool)
/repo/gcc-trunk/gcc/tree-cfg.cc:5588
0x148724a execute_function_todo
/repo/gcc-trunk/gcc/passes.cc:2097
0x148771e execute_todo
/repo/gcc-trunk/gcc/passes.cc:2149
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-20250807085411-r16-3062-g6026a54f2faedf-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/16.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-20250807085411-r16-3062-g6026a54f2faedf-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 16.0.0 20250807 (experimental) (GCC)

[Bug c++/121445] [13/14/15/16 Regression] ICE in build_data_member_initialization when compiling constexpr constructor with nested non-literal types (C++23)

2025-08-07 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121445

Marek Polacek  changed:

           What    |Removed                     |Added

           Priority|P3                          |P2
            Summary|ICE in                      |[13/14/15/16 Regression]
                   |build_data_member_initializ |ICE in
                   |ation when compiling        |build_data_member_initializ
                   |constexpr constructor with  |ation when compiling
                   |nested non-literal types    |constexpr constructor with
                   |(C++23)                     |nested non-literal types
                   |                            |(C++23)
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2025-08-07
             Status|UNCONFIRMED                 |NEW
                 CC|                            |mpolacek at gcc dot gnu.org
           Keywords|                            |ice-on-valid-code
   Target Milestone|---                         |13.5

--- Comment #1 from Marek Polacek  ---
Confirmed.  An old regression, it seems.

[Bug target/121449] New: [13/14/15/16 regression] Immediate offset out of range error for AArch64 SVE gather load

2025-08-07 Thread Pengfei.Li2 at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121449

Bug ID: 121449
   Summary: [13/14/15/16 regression] Immediate offset out of range
error for AArch64 SVE gather load
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Pengfei.Li2 at arm dot com
  Target Milestone: ---

Errors of “immediate offset out of range” are reported during assembly:

Assembler messages:
Error: immediate offset out of range 0 to 31 at operand 3 -- `ld1b
z1.d,p0/z,[z1.d,#56]'
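
A small sketch of the range check implied by the assembler message. The bounds "0 to 31" are taken directly from the error text above; the names are hypothetical, and this is not the predicate or constraint used in the AArch64 backend.

```python
# Immediate range quoted by the assembler for this 'ld1b' gather form.
LD1B_GATHER_IMM_RANGE = range(0, 32)  # i.e. 0 to 31 inclusive


def fits_gather_immediate(offset):
    """Return True if 'offset' can be encoded as the immediate operand."""
    return offset in LD1B_GATHER_IMM_RANGE


# The offset from the failing instruction, [z1.d,#56], is rejected.
assert not fits_gather_immediate(56)
assert fits_gather_immediate(31)
```

An offset that fails such a check would have to be materialised some other way (for example added to the base vector beforehand) instead of being emitted as an immediate.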

To reproduce, point CC and CXX to a pre-built GCC trunk and do a bootstrap with
-O3 and SVE enabled in the stage1 flags on AArch64:

$ export CC=/path/to/trunk/gcc CXX=/path/to/trunk/g++
$ /path/to/gcc/configure --enable-checking=release
$ make -j$(nproc) -l$(nproc) STAGE1_C{,XX}FLAGS="-O3 -march=armv8-a+sve"

I will upload a small reproducer.

[Bug c++/117783] [C++26] P1061R10 - Structured bindings can introduce a pack

2025-08-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117783

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:dff57d76ad3c78cedaf2f8caa1686acb5059303d

commit r16-3069-gdff57d76ad3c78cedaf2f8caa1686acb5059303d
Author: Jakub Jelinek 
Date:   Thu Aug 7 16:38:51 2025 +0200

c++: Implement C++26 P1061R10 - Structured Bindings can introduce a Pack
[PR117783]

The following patch implements the C++26
P1061R10 - Structured Bindings can introduce a Pack
paper.
One thing unresolved in the patch is mangling, I've raised
https://github.com/itanium-cxx-abi/cxx-abi/issues/200
for that but no comments there yet.  One question is if it is ok
not to mention the fact that there is a structured binding pack in
the mangling of the structured bindings but more important is in case
of std::tuple* we might need to mangle individual structured binding
pack elements separately (each might need an exported name for the
var itself and perhaps its guard variable as well).  The patch just
uses the normal mangling for the whole structured bindings and emits
sorry if we need to mangle the structured binding pack elements.
The patch just marks the structured binding pack specially (considered
e.g. using some bit on it, but in the end I'm identifying it using
a made up type which causes DECL_PACK_P to be true; it is kind of
self-referential solution, because the type on the pack mentions the
DECL_DECOMPOSITION_P VAR_DECL on which the type is attached as its pack,
so it needs to be handled carefully during instantiation to avoid infinite
recursion, but it is the type that should be used if something else actually
needs to use the same type as the structured binding pack, e.g. a capture
proxy), and stores the pack elements when actually processed through
cp_finish_decomp with non-dependent initializer into a TREE_VEC used as
DECL_VALUE_EXPR of the pack; though because several spots use the
DECL_VALUE_EXPR and assume it is ARRAY_REF from which they can find out the
base variable and the index, it stores the base variable and index in the
first 2 TREE_VEC elts and has the structured binding elements only after
that.
https://eel.is/c++draft/temp.dep.expr#3.6 says the packs are type dependent
regardless of whether the initializer of the structured binding is type
dependent or not, so I hope having a dependent type on the structured
binding VAR_DECL is ok.
The paper also has an exception for sizeof... which is then not value
dependent when the structured bindings are initialized with non-dependent
initializer: https://eel.is/c++draft/temp.dep.constexpr#4
The patch special cases that in 3 spots (I've been wondering if e.g. during
parsing I couldn't just fold the sizeof... to the INTEGER_CST right away,
but guess I'd need to repeat that also during partial instantiation).

And one thing still unresolved is debug info, I've just added DECL_IGNORED_P
on the structured binding pack VAR_DECL because there were ICEs with -g
for now, hope it can be fixed incrementally but am not sure what exactly
we should emit in the debug info for that.

Speaking of which, I see
DW_TAG_GNU_template_parameter_pack
DW_TAG_GNU_formal_parameter_pack
etc. DIEs emitted regardless of DWARF version, shouldn't we try to upstream
those into DWARF 6 or check what other compilers emit for the packs?
And bet we'd need DW_TAG_GNU_structured_binding_pack as well.

2025-08-07  Jakub Jelinek  

PR c++/117783
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Change
__cpp_structured_bindings
predefined value for C++26 from 202403L to 202411L.
gcc/cp/
* parser.cc: Implement C++26 P1061R10 - Structured Bindings can
introduce a Pack.
(cp_parser_range_for): Also handle TREE_VEC as DECL_VALUE_EXPR
instead of ARRAY_REF.
(cp_parser_decomposition_declaration): Use sb-identifier-list
instead
of identifier-list in comments.  Parse structured bindings with
structured binding pack.  Don't emit pedwarn about structured
binding attributes in structured bindings inside of a condition.
(cp_convert_omp_range_for): Also handle TREE_VEC as DECL_VALUE_EXPR
instead of ARRAY_REF.
* decl.cc (get_tuple_element_type): Change i argument type from
unsigned to unsigned HOST_WIDE_INT.
(get_tuple_decomp_init): Likewise.
(set_sb_pack_name): New function.
(cp_finish_decomp): Handle structured binding packs.
* pt.cc (tsubst_pack_expansion): Handle structured binding packs
and capture proxies for them.  Formatting fixes.
(tsubst_decl): For structured binding packs don't tsubst TREE_TYPE
first, instea

[Bug preprocessor/121450] New: Generated dependency fragment for module partitions cause make failure

2025-08-07 Thread joergboe at snafu dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121450

Bug ID: 121450
   Summary: Generated dependency fragment for module partitions
cause make failure
   Product: gcc
   Version: 15.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: preprocessor
  Assignee: unassigned at gcc dot gnu.org
  Reporter: joergboe at snafu dot de
  Target Milestone: ---

I use the following Makefile to compile the following module partition:

$ cat Makefile 
modulepart1.o: modulepart1.cpp
	$(CXX) -o $@ $< -c -fmodules-ts -MMD

-include modulepart1.d

$ cat modulepart1.cpp 
export module m:part;

The first run of make generates the dependency-file:

$ cat modulepart1.d 
modulepart1.o gcm.cache/m-part.gcm: modulepart1.cpp
m:part.c++-module: gcm.cache/m-part.gcm
.PHONY: m:part.c++-module
gcm.cache/m-part.gcm:| modulepart1.o

The second run of make gives the error:
$ make
modulepart1.d:2: *** target pattern contains no '%'.  Stop.

This is due to the colon in the target name 'm:part'.
GNU make allows colons in targets and prerequisites if they are escaped with a
backslash.
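
A minimal sketch of the escaping described above, assuming only colons need handling. The helper name is hypothetical and this is not how GCC's dependency output is implemented; other characters that are special to make are deliberately ignored here.

```python
def escape_make_name(name):
    """Backslash-escape colons so make does not read them as rule separators."""
    return name.replace(":", "\\:")


# The phony target from the generated modulepart1.d fragment.
print(escape_make_name("m:part.c++-module"))  # prints m\:part.c++-module
```

With such escaping, line 2 of the fragment would name a single target instead of being parsed as a static pattern rule, which is what triggers the "target pattern contains no '%'" error.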

See also Bug 41329

[Bug fortran/121452] New: [14/15/16 Regression] Bogus 'OpenMP constructs ... may not be nested inside ‘simd’ region' due to compiler-inserted "#pragma omp __structured_block"

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121452

Bug ID: 121452
   Summary: [14/15/16 Regression] Bogus 'OpenMP constructs ... may
not be nested inside ‘simd’ region' due to
compiler-inserted "#pragma omp __structured_block"
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: openmp, rejects-valid
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: sandra at gcc dot gnu.org
  Target Milestone: ---

This shows up with SPEC Accel for 463.swim - and is a regression due to
handling intervening code with 'collapse(2)' → see list of commits below.

It is a true regression for the code shown first; for the code variants further
down (C, C++, Fortran) it is also rejects-valid, but not a regression.


The example fails with:

Error: OpenMP constructs other than ‘ordered simd’, ‘simd’, ‘loop’ or ‘atomic’
may not be nested inside ‘simd’ region


Simplified testcase:

implicit none
integer :: i, j
integer :: A(5,5), B(5,5) = 1

!$omp simd collapse(2)
   do 10 i = 1, 5
 do 20 j = 1, 5
   A(i,j) = B(i,j)
20   continue
10 continue

if (any(A /= 1)) stop 1
end


The code looks as follows - note the "#pragma omp __structured_block".
Note that CONTINUE is an old way to end a loop and nothing is actually using
the label on tree level. (On Fortran level, the '10' / '20' are used in 'DO'.)

  #pragma omp simd collapse(2)
  for (i = 1; i <= 5; i = i + 1)
    for (j = 1; j <= 5; j = j + 1)
      {
        {
          a[((integer(kind=8)) j * 5 + (integer(kind=8)) i) + -6] =
            b[((integer(kind=8)) j * 5 + (integer(kind=8)) i) + -6];
          __label_20:;
          #pragma omp __structured_block
            {
              __label_10:
            }
          L.2:;
        }
        L.1:;
      }

* * *

VARIANT - has the same issue and points at the 'x = 1' line:

integer :: x
…
!$omp simd collapse(2)
   do i = 1, 5
 do j = 1, 5
   A(i,j) = B(i,j)
 end do
 x = 1  ! Actual intervening code
   end do

* * *

C/C++ VARIANT - like last example, again with intervening code.
[Error points at a '}' after the 'C[i] = 4;' line.]

void f(int *A, int *B, int *C)
{
  #pragma omp simd collapse(2)
  for (int i=0; i < 1; i++) {
    for (int j=0; j < 1; j++)
      A[i] += B[j];
    C[i] = 4;
  }
}

* * *

Side remark 1: Using 'omp do' it works.

Side remark 2: Old Fortran also liked to share the continue lines; but that has
no issue with regard to 'omp simd':

 do 10 i = 1, 5
   do 10 j = 1, 5
 A(i,j) = B(i,j)
  10 continue 
  ! "Warning: Fortran 2018 deleted feature: Shared DO termination label 10"

* * *

The regression is caused by the combination of the following two patches:

"#pragma omp __structured_block" was added in:

commit r14-3488-ga62c8324e7e31ae6614f549bdf9d8a653233f8fc
Author: Sandra Loosemore 
Date:   Thu Aug 24 17:34:59 2023 +

OpenMP: Add OMP_STRUCTURED_BLOCK and GIMPLE_OMP_STRUCTURED_BLOCK.

In order to detect invalid jumps in and out of intervening code in
imperfectly-nested loops, the front ends need to insert some sort of
marker to identify the structured block sequences that they push into
the inner body of the loop.  The error checking happens in the
diagnose_omp_blocks pass, between gimplification and OMP lowering, so
we need both GENERIC and GIMPLE representations of these markers.
They are removed in OMP lowering so no subsequent passes need to know
about them.

but only activated for Fortran in commit:

commit r14-3492-gb7c4a12a9df3170090a431fa4364b97b30b87752
Author: Sandra Loosemore 
Date:   Thu Aug 24 17:35:01 2023 +

OpenMP: Fortran support for imperfectly-nested loops

* * *

The error is emitted by omp-low.cc's

/* Check nesting restrictions.  */
static bool   
check_omp_nesting_restrictions (gimple *stmt, omp_context *ctx)
{
...
  if (gimple_code (ctx->stmt) == GIMPLE_OMP_FOR
  && gimple_omp_for_kind (ctx->stmt) == GF_OMP_FOR_KIND_SIMD
  && !ctx->loop_p)
{
  c = NULL_TREE;
  if (ctx->order_concurrent
  && (gimple_code (stmt) == GIMPLE_OMP_ORDERED
  || gimple_code (stmt) == GIMPLE_OMP_ATOMIC_LOAD
  || gimple_code (stmt) == GIMPLE_OMP_ATOMIC_STORE))
{
  error_at (gimple_location (stmt),
"OpenMP constructs other than %, %"
" or % may not be nested inside a region with"
" the % clause");

[Bug fortran/121452] [14/15/16 Regression] Bogus 'OpenMP constructs ... may not be nested inside ‘simd’ region' due to compiler-inserted "#pragma omp __structured_block"

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121452

--- Comment #1 from Tobias Burnus  ---
Another variant (true regression, other error message):

4 | !$omp do ordered(2)
  |   1
  Error: !$OMP DO inner loops must be perfectly nested with ORDERED clause at
(1)

which is a different check / error message but the same cause.

!$omp for ordered(2)
   do i = 1, 5
 do j = 1, 5
   A(i,j) = B(i,j)
20   continue 
10 continue

[Bug target/121444] [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

Andrew Pinski  changed:

           What    |Removed                       |Added

           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org

--- Comment #3 from Andrew Pinski  ---
Maybe there should be a target hook which says DECL_MERGEABLE will do anything
here so we don't over align on targets which don't have mergeable cst sections.
(macho and elf are the only ones which have support that I know of).

Does NVPTX support mergeable constant sections/variables?

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread avinashd at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702

--- Comment #20 from Avinash Jayakar  ---
(In reply to Segher Boessenkool from comment #19)
> If it does something that does make sense here, it is a good addition.  For
> other archs as well (although Gimple-level optimisations are so very far away
> from the eventual machine code that it is hard to talk about the machine code
> there at all: you are transforming some bit of Gimple code to some nicer
> piece
> of Gimple code!)

In that case, I will update the vectorization of multiplication as well, and
send a patch.

> dead_or_set_p perhaps.  It all depends on context.  You can use all of DF as
> well of course.


> Peepholes make no sense ever, hehe.  Sometimes they are the most convenient
> solution though.
> 
> You are thinking about peep2_reg_dead_p?
Yeah. The reason for a peephole is that it looks at a window of instructions
and tries to rewrite it in a machine-dependent way.

> There are better, more modern, solutions almost always :-)  Text-based
> peepholes have been eradicated from most places, now peephole2 should go the
> way of the dodo :-)

Another way is to use the combine pass, which is machine independent and, I
think, not the right choice in this case.
I am not sure whether the "define_insn" that produces the assembly can look at
2 instructions and replace them with one. Can it?

[Bug tree-optimization/121454] New: [16 regression] ICE in nonoverlapping_refs_since_match_p, at tree-ssa-alias.cc:1684

2025-08-07 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121454

Bug ID: 121454
   Summary: [16 regression] ICE in
nonoverlapping_refs_since_match_p, at
tree-ssa-alias.cc:1684
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
  Target Milestone: ---
Target: i386-pc-solaris2.11, i686-pc-linux-gnu,
x86_64-pc-linux-gnu

Between 20250806 (54edbeeaac6eb4865fab37374fbdff3a9a2f2e12) and 20250807
(8e3239e3e92f3cd57bf3a19f10daa66c4cb45cc1),
Go bootstrap got broken on x86 (seen on Solaris/i386, Linux/i686, Linux/x86_64)
compiling the 32-bit libgo/go/crypto/elliptic/elliptic.go:

during GIMPLE pass: fre
/vol/gcc/src/hg/master/local/libgo/go/crypto/elliptic/elliptic.go: In function
‘crypto/elliptic.CurveParams.ScalarMult’:
/vol/gcc/src/hg/master/local/libgo/go/crypto/elliptic/elliptic.go:295:1:
internal compiler error: in nonoverlapping_refs_since_match_p, at
tree-ssa-alias.cc:1684
  295 | func (curve *CurveParams) ScalarMult(Bx, By *big.Int, k []byte)
(*big.Int, *big.Int) { 
  | ^
0xac6efd0 internal_error(char const*, ...)
/vol/gcc/src/hg/master/local/gcc/diagnostic-global-context.cc:534
0xac7ae19 fancy_abort(char const*, int, char const*)
/vol/gcc/src/hg/master/local/gcc/diagnostics/context.cc:1640
0x97b71f1 nonoverlapping_refs_since_match_p
/vol/gcc/src/hg/master/local/gcc/tree-ssa-alias.cc:1684
0x97b20cc decl_refs_may_alias_p
/vol/gcc/src/hg/master/local/gcc/tree-ssa-alias.cc:2073
0x97b20cc refs_may_alias_p_2
/vol/gcc/src/hg/master/local/gcc/tree-ssa-alias.cc:2467
0x97b471b refs_may_alias_p_1(ao_ref*, ao_ref*, bool)
/vol/gcc/src/hg/master/local/gcc/tree-ssa-alias.cc:2576
0x97b471b stmt_may_clobber_ref_p_1(gimple*, ao_ref*, bool)
/vol/gcc/src/hg/master/local/gcc/tree-ssa-alias.cc:3299
0x97b4bff walk_non_aliased_vuses(ao_ref*, tree_node*, bool, void* (*)(ao_ref*,
tree_node*, void*), void* (*)(ao_ref*, tree_node*, void*, translate_flags*),
tree_node* (*)(tree_node*), unsigned int&, void*)
/vol/gcc/src/hg/master/local/gcc/tree-ssa-alias.cc:3966
0x98e74e2 vn_reference_lookup(tree_node*, tree_node*, vn_lookup_kind,
vn_reference_s**, bool, tree_node**, tree_node*, bool)
/vol/gcc/src/hg/master/local/gcc/tree-ssa-sccvn.cc:4198
0x98eb08c visit_nary_op
/vol/gcc/src/hg/master/local/gcc/tree-ssa-sccvn.cc:5649
0x98ec0d1 process_bb
/vol/gcc/src/hg/master/local/gcc/tree-ssa-sccvn.cc:8331
0x98ee6cd do_rpo_vn_1
/vol/gcc/src/hg/master/local/gcc/tree-ssa-sccvn.cc:8917
0x98ef4e7 execute
/vol/gcc/src/hg/master/local/gcc/tree-ssa-sccvn.cc:9078

I suspect this is due to

commit 53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c
Author: Richard Biener 
Date:   Wed Aug 6 12:31:13 2025 +0200

tree-optimization/121405 - missed VN with aggregate copy

although I haven't verified that yet.

[Bug target/121441] [16 Regression] 5% slowdown of 519.lbm_r on aarch64

2025-08-07 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121441

Filip Kastl  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |16.0

[Bug c/121423] [ICE] nested function leads to internal compiler error: verify_gimple failed

2025-08-07 Thread njuwy at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121423

--- Comment #2 from Anonymous  ---
(In reply to Richard Biener from comment #1)
> I think we have a duplicate for this.  We do not have a way to represent
> this case in the IL after lowering nested functions since the function
> declaration (which is then global) refers to a local type in foo.
> 
> We might want to reject this case?

Do you mean that this is a dup of #120862 ?

It seems that the two trigger programs are different. That one is caused by
the introduction of attributes and is triggered at both -Os and -O2. This one
is caused by nested functions and is only triggered at -Os.

[Bug middle-end/121394] [14/15/16 Regression] Since r16-2595-gf1c80147641783: link-time error: libm_a-e_atan2.o):(.rodata.cst32): SHF_MERGE section size (56) must be a multiple of sh_entsize (32)

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121394

--- Comment #12 from Tobias Burnus  ---
(In reply to Andrew Pinski from comment #11)
> Created attachment 62073 [details]
> Patch which I am testing

Not really a surprising result, but nonetheless:
I can confirm that this solves the GCN link issue (using LLVM's lld) for
libgomp.fortran/fortran-torture_execute_math.f90 and
libgomp.oacc-fortran/fortran-torture_execute_math.f90

Thanks!

[Bug target/121440] 50% slowdown of 519.lbm_r on Zen5 since r16-2727-g09f0768b55b96c (the fix for pr120941)

2025-08-07 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121440

Filip Kastl  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[16 Regression] 50%         |50% slowdown of 519.lbm_r
                   |slowdown of 519.lbm_r on    |on Zen5 since
                   |Zen5 since                  |r16-2727-g09f0768b55b96c
                   |r16-2727-g09f0768b55b96c    |(the fix for pr120941)
                   |(the fix for pr120941)      |

--- Comment #1 from Filip Kastl  ---
This is actually not a regression against GCC 15.

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1294.477.0&plot.1=1359.477.0&plot.2=1288.477.0&;

Our graphs for Zen5 sadly don't go that far back but I'm guessing that the
introduction of the rrvl pass in r16-271-gd1cada7481420a got us the speedup
which we now lost.

[Bug target/121440] 50% slowdown of 519.lbm_r on Zen5 since r16-2727-g09f0768b55b96c (the fix for pr120941)

2025-08-07 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121440

--- Comment #2 from rguenther at suse dot de  ---
On Thu, 7 Aug 2025, pheeck at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121440
> 
> Filip Kastl  changed:
> 
>What|Removed |Added
> 
> Summary|[16 Regression] 50% |50% slowdown of 519.lbm_r
>|slowdown of 519.lbm_r on|on Zen5 since
>|Zen5 since  |r16-2727-g09f0768b55b96c
>|r16-2727-g09f0768b55b96c|(the fix for pr120941)
>|(the fix for pr120941)  |
> 
> --- Comment #1 from Filip Kastl  ---
> This is actually not a regression against GCC 15.
> 
> https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1294.477.0&plot.1=1359.477.0&plot.2=1288.477.0&;
> 
> Our graphs for Zen5 sadly don't go that far back but I'm guessing that the
> introduction of the rrvl pass in r16-271-gd1cada7481420a got us the speedup
> which we now lost.

Improving (but correctness!) hoisting to hoist from perfect nests
might be applicable, or moving rrvl to before invariant motion.

[Bug rtl-optimization/121439] [IOCCC] а case with high time complexity and high RAM usage

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121439

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2025-08-07
             Status|UNCONFIRMED                 |NEW

--- Comment #9 from Richard Biener  ---
For machine-generated code we suggest using -O1.

For me GCC 15 at -O0 takes ~3GB memory and 35s; the interesting part is -O1,
where I also see combine taking ages.

Note combine is inherently quadratic in particular with large BBs and it
does not have any limiting implemented.  In particular LOG_LINK distance
isn't limited (and we do xyz-inbetween walks) and the retry extra walks
of insns makes it even cubic.

Something like the following tackles the latter, but I suspect for this
testcase it's also the former.  See 2nd hunk below.

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 4dbc1f6a4a4..40bb93755fb 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -1239,6 +1239,18 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
label_tick_ebb_start = label_tick;
   last_bb = this_basic_block;

+  rtx_insn *tem = BB_HEAD (this_basic_block);
+  if (!NONDEBUG_INSN_P (tem))
+   tem = next_nondebug_insn (tem);
+  int bb_first_luid = DF_INSN_LUID (tem);
+  tem = BB_END (this_basic_block);
+  if (!NONDEBUG_INSN_P (tem))
+   tem = prev_nondebug_insn (tem);
+  int bb_last_luid = DF_INSN_LUID (tem);
+  int extra_walk_budget
+   = bb_first_luid < bb_last_luid ? bb_last_luid - bb_first_luid : 0;
+  extra_walk_budget += 256;
+
   rtl_profile_for_bb (this_basic_block);
   for (insn = BB_HEAD (this_basic_block);
   insn != NEXT_INSN (BB_END (this_basic_block));
@@ -1432,6 +1444,17 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
record_dead_and_set_regs (insn);

 retry:
+ if (next)
+   {
+ if (DF_INSN_LUID (next) < DF_INSN_LUID (insn))
+   {
+ if (DF_INSN_LUID (insn) - DF_INSN_LUID (next)
+ < extra_walk_budget)
+   next = NULL;
+ else
+   extra_walk_budget -= DF_INSN_LUID (insn) - DF_INSN_LUID (next);
+   }
+   }
  ;
}
 }


diff --git a/gcc/combine.cc b/gcc/combine.cc
index 4dbc1f6a4a4..c9e5c91c973 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -1079,6 +1079,10 @@ create_log_links (void)
  && asm_noperands (PATTERN (use_insn)) >= 0)
continue;

+ /* Don't add far away links.  */
+ if (DF_INSN_LUID (use_insn) - DF_INSN_LUID (insn) > 256)
+   continue;
+
  /* Don't add duplicate links between instructions.  */
  struct insn_link *links;
  FOR_EACH_LOG_LINK (links, use_insn)


That gets combine in check, but then after a few minutes LRA blows up
with >25GB memory use, so I had to kill it.

[Bug c/118592] Add builtins/const folding for the new C23 math functions

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118592

Tobias Burnus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #3 from Tobias Burnus  ---
TODO: most functions listed in comment 2 still need to be handled in GCC

However,
* contrib/download_prerequisites has been updated to download MPFR 4.2.2,
  which adds C23 math functions. → PR120237 / r16-3063-gb399a0084bc962

* For 'pi' trigonometric function support, see commits:

r16-710-g591d3d02664c7b [PATCH] gcc: add trigonometric pi-based functions as
gcc builtins
  builtins.def + test cases

r16-711-g89935d56f768b4 [PATCH] gcc: add trigonometric pi-based functions as
gcc builtins
  update of gcc/doc/extend.texi

r16-1839-g15fcb2f556f716 gcc: middle-end opt for trigonometric pi-based
functions builtins

[Bug target/117015] s390 should define spaceship4 optab

2025-08-07 Thread stefansf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117015

Stefan Schulze Frielinghaus  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Stefan Schulze Frielinghaus  ---
Fixed on trunk.

[Bug target/121432] [15/16 regression] GCC has a regression on Microblaze since r15-1619-g3b9b8d6cfdf593

2025-08-07 Thread romain.naour at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121432

Romain Naour  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |romain.naour at gmail dot com

--- Comment #10 from Romain Naour  ---
Hello,

Some architectures (aarch64 and x86) define hooks for callee-save [1].

Just for testing, I added a new hook for callee-save on microblaze and
return 1 (like [2]) to restore the old behaviour prior to the commit [3].

gcc/config/microblaze/microblaze.cc:

/* Implement TARGET_CALLEE_SAVE_COST.  */
static int
microblaze_callee_save_cost (spill_cost_type, unsigned int hard_regno,
			     machine_mode, unsigned int, int mem_cost,
			     const HARD_REG_SET &, bool)
{
  return 1;
}

#undef TARGET_CALLEE_SAVE_COST
#define TARGET_CALLEE_SAVE_COST microblaze_callee_save_cost

With that, the system boots correctly.
This is probably not the correct fix... I hope it helps.

[1]
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b191e8bdecf881d11c1544c441e38f4c18392a15
[2]
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/i386.cc;h=3128973ba79cccfc6761f451dcb716b9558cc4da;hb=d3ff498c478acefce35de04402f99171b4f64a1a#l20606
[3]
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b

Best regards,
Romain

[Bug target/71064] Offloading vs. 'long double' data type

2025-08-07 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71064

Thomas Schwinge  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|nvptx offloading: "long     |Offloading vs. 'long
                   |double" data type           |double' data type
           Keywords|                            |openmp
             Target|nvptx                       |nvptx, GCN
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2025-08-07
     Ever confirmed|0                           |1
                 CC|                            |burnus at gcc dot gnu.org,
                   |                            |kcy at codesourcery dot com

--- Comment #2 from Thomas Schwinge  ---
Kwok Cheung Yeung, May 21, 2025 at 7:06 PM:
> [...] testcases for the ,  and  headers running on 
> the offload target.
> 
> One problem I have noticed is with the std::nexttoward() function in cmath, 
> which triggers the following error on NVPTX:
> 
> lto1: fatal error: nvptx-none - 80-bit-precision floating-point numbers 
> unsupported (mode 'XF')
> 
> Looking at the definition in c_global:
> 
>   constexpr float
>   nexttoward(float __x, long double __y)
>   { return __builtin_nexttowardf(__x, __y); }
> 
>   constexpr long double
>   nexttoward(long double __x, long double __y)
>   { return __builtin_nexttowardl(__x, __y); }
> 
> There is always at least one long double in the arguments, so this function 
> cannot work on an architecture that doesn’t support XF mode. I have #ifdef’ed 
> out the test for this function.


Tobias Burnus, May 21, 2025 at 7:45 PM:
>> There is always at least one long double in the arguments, so this function 
>> cannot work on an architecture that doesn’t support XF mode
> 
> … and likewise on systems where the host’s ‘long double' is TF (like on 
> PowerPC, which has multiple types of 128bit floating-point numbers).
> 
> On the other hand, it works on systems where the host’s ‘long double’ is the 
> same as ‘double’ (= DF) (like on Arm → Grace[-Hopper])
> 
> Workaround: use nextafter — which has the same type for ‘y' as for 'x’/return 
> type, contrary to nexttoward.


For that, we've put into 'libgomp.c++/target-std__cmath.C':

#if 0
  /* TODO Due to 'std::nexttoward' using 'long double to', this triggers a
     '80-bit-precision floating-point numbers unsupported (mode ‘XF’)' error
     with x86_64 host and nvptx, GCN offload compilers, or
     '128-bit-precision floating-point numbers unsupported (mode ‘TF’)' error
     with powerpc64le host and nvptx offload compiler, for example;
     PR71064 'nvptx offloading: "long double" data type'.
     It ought to work on systems where the host's 'long double' is the same as
     'double' ('DF'): aarch64, for example?  */
  next = std::nexttoward (x, y);
  if (!(next > x && next < y))
    return false;
#endif
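
The nextafter workaround suggested above could be sketched roughly as follows. This is only an illustration, not the code from 'libgomp.c++/target-std__cmath.C'; the helper name `next_is_strictly_between` is hypothetical. It performs the same check as the disabled block, but with std::nextafter, whose two arguments share one type, so no 'long double' value is involved.

```cpp
#include <cmath>

// Hypothetical helper (not part of the libgomp test): checks that the
// representable value following x in the direction of y lies strictly
// between the two.  std::nextafter takes both arguments as 'double', so
// no 'long double' (XF or TF mode) value appears in the offloaded code.
bool next_is_strictly_between (double x, double y)
{
  double next = std::nextafter (x, y);
  return next > x && next < y;
}
```

For x = 1.0 and y = 2.0 the check holds; for x == y it does not, because std::nextafter then returns y itself.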

[Bug target/121432] [15/16 regression] GCC has a regression on Microblaze since r15-1619-g3b9b8d6cfdf593

2025-08-07 Thread thomas.petazzoni--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121432

--- Comment #11 from Thomas Petazzoni  ---
So I've been able to narrow down the issue to arch/microblaze/kernel/irq.o. I
can't say it's the only file impacted, but as soon as I have this object file
from GCC 15.x in my build, it fails.

The assembly diff between the GCC 14.x file (before) and the GCC 15.x file
(after) is as follows:

--- before  2025-08-07 11:04:24.561779219 +0200
+++ after   2025-08-07 11:04:14.775803129 +0200
@@ -5,31 +5,30 @@
 Disassembly of section .irqentry.text:

  :
-   0:  3021ffdc  addik   r1, r1, -36
-   4:  fa61001c  swi     r19, r1, 28
-   8:  f9e1      swi     r15, r1, 0
-   c:  fac10020  swi     r22, r1, 32
+   0:  3021ffe0  addik   r1, r1, -32
+   4:  f9e1      swi     r15, r1, 0
+   8:  fa61001c  swi     r19, r1, 28
+   c:  f8a10024  swi     r5, r1, 36
   10:  b000      imm     0
-  14:  eac0      lwi     r22, r0, 0
+  14:  ea60      lwi     r19, r0, 0
   18:  b000      imm     0
   1c:  f8a0      swi     r5, r0, 0
   20:  b000      imm     0
   24:  b9f4      brlid   r15, 0
-  28:  1265      addk    r19, r5, r0
+  28:  8000      or      r0, r0, r0
   2c:  b000      imm     0
   30:  e860      lwi     r3, r0, 0
   34:  99fc1800  brald   r15, r3
-  38:  10b3      addk    r5, r19, r0
+  38:  e8a10024  lwi     r5, r1, 36
   3c:  b000      imm     0
   40:  b9f4      brlid   r15, 0
   44:  8000      or      r0, r0, r0
   48:  e9e1      lwi     r15, r1, 0
   4c:  b000      imm     0
-  50:  fac0      swi     r22, r0, 0
+  50:  fa60      swi     r19, r0, 0
   54:  ea61001c  lwi     r19, r1, 28
-  58:  eac10020  lwi     r22, r1, 32
-  5c:  b60f0008  rtsd    r15, 8
-  60:  30210024  addik   r1, r1, 36
+  58:  b60f0008  rtsd    r15, 8
+  5c:  30210020  addik   r1, r1, 32

 Disassembly of section .init.text:

So to me the code is identical except for which registers are used, which
makes sense since our problematic GCC commit is about register allocation,
if I understood correctly (I am not at all a compiler expert).

The corresponding C code is pretty short:
https://elixir.bootlin.com/linux/v6.16/source/arch/microblaze/kernel/irq.c

[Bug target/121444] [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |16.0

--- Comment #1 from Richard Biener  ---
It's a generic issue that should improve merging.

[Bug c++/121445] New: ICE in build_data_member_initialization when compiling constexpr constructor with nested non-literal types (C++23)

2025-08-07 Thread jirehguo at tju dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121445

Bug ID: 121445
   Summary: ICE in build_data_member_initialization when compiling
constexpr constructor with nested non-literal types
(C++23)
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jirehguo at tju dot edu.cn
  Target Milestone: ---

When compiling the following valid code using -std=c++23 flag, the compiler
crashes with an internal compiler error.
Reproducer: https://godbolt.org/z/9c6q7TzzG

Flags: -std=c++23

Code:
```
#include <string>

struct S {
  constexpr S() {
    struct InnerStruct {
      std::string str;
    };

    struct MiddleStruct {
      InnerStruct inner;
    };

    struct OuterStruct {
      MiddleStruct middle;
    };

    OuterStruct outer{};
    outer.middle.inner.str = "";
  }
};
```

Output:
```
: In constructor 'constexpr S::S()':
:19:3: internal compiler error: in build_data_member_initialization, at
cp/constexpr.cc:485
   19 |   }
  |   ^
0x2890715 diagnostics::context::diagnostic_impl(rich_location*,
diagnostics::metadata const*, diagnostics::option_id, char const*,
__va_list_tag (*) [1], diagnostics::kind)
???:0
0x2885fb6 internal_error(char const*, ...)
???:0
0xaf7890 fancy_abort(char const*, int, char const*)
???:0
0xb785ee maybe_save_constexpr_fundef(tree_node*)
???:0
0xbedfe7 finish_function(bool)
???:0
0xd27003 c_parse_file()
???:0
0xe922c9 c_common_parse_file()
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
Compiler returned: 1
```

[Bug target/121444] New: [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

Bug ID: 121444
   Summary: [16 Regression] nvptx: Increased '.align' for 'CSWTCH'
after "Improve mergability of CSWTCH [PR120523]"
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: pinskia at gcc dot gnu.org, vries at gcc dot gnu.org
  Target Milestone: ---
Target: nvptx

For nvptx, I'm seeing increased '.align' for 'CSWTCH'es after commit
r16-2595-gf1c8014764178335e3b949e06b894ff5775beae5 "Improve mergability of
CSWTCH [PR120523]", for example:

--- "0-wo_Improve mergability of CSWTCH
[PR120523]/nvptx-none/newlib/libm/math/libm_a-e_atan2.o"2025-08-04
17:15:09.921571480 +0200
+++ "1-w_Improve mergability of CSWTCH
[PR120523]/nvptx-none/newlib/libm/math/libm_a-e_atan2.o" 2025-08-06
11:38:01.766996470 +0200
@@ -14,6 +14,6 @@
 // BEGIN VAR DEF: CSWTCH$7
-.global .align 8 .u64 CSWTCH$7[3] =
+.global .align 32 .u64 CSWTCH$7[3] =
 {-9223372036854775808,4614256656552045848,-4609115380302729960 };
 // BEGIN VAR DEF: CSWTCH$6
-.global .align 8 .u64 CSWTCH$6[3] =
+.global .align 32 .u64 CSWTCH$6[3] =
 {-4618122579557470952,4612488097114038738,-4610883939740737070 };
--- "0-wo_Improve mergability of CSWTCH
[PR120523]/nvptx-none/newlib/libm/math/libm_a-ef_atan2.o"   2025-08-04
17:15:10.049570119 +0200
+++ "1-w_Improve mergability of CSWTCH
[PR120523]/nvptx-none/newlib/libm/math/libm_a-ef_atan2.o"2025-08-06
11:38:01.834995725 +0200
@@ -14,6 +14,6 @@
 // BEGIN VAR DEF: CSWTCH$7
-.global .align 4 .u32 CSWTCH$7[3] =
+.global .align 16 .u32 CSWTCH$7[3] =
 {2147483648,1078530011,3226013659 };
 // BEGIN VAR DEF: CSWTCH$6
-.global .align 4 .u32 CSWTCH$6[3] =
+.global .align 16 .u32 CSWTCH$6[3] =
 {3209236443,1075235812,3222719460 };

Per my understanding, that shouldn't be necessary; "The default alignment for
scalar and array variables is to a multiple of the base-type size." (as it got
emitted before, explicitly: 'u32' -> 'align 4', etc.).

I've not yet attempted to understand whether that's a generic or nvptx-specific
issue.
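
For readers unfamiliar with where 'CSWTCH' tables come from: a switch that only selects among constants is the kind of code GCC's switch conversion can turn into a constant lookup array. The sketch below is hypothetical and not taken from newlib; it simply reuses the three '.u64' initializers of CSWTCH$7 from the diff above, written in hex, to show the source shape involved.

```cpp
#include <cstdint>

// Hypothetical example, not the newlib source.  The three constants are the
// CSWTCH$7 '.u64' initializers from the diff above, written in hex:
//   -9223372036854775808, 4614256656552045848, -4609115380302729960.
std::uint64_t pick_bits (int k)
{
  switch (k)
    {
    case 0: return 0x8000000000000000ULL;
    case 1: return 0x400921FB54442D18ULL;
    case 2: return 0xC00921FB54442D18ULL;
    default: return 0;
    }
}
```

Whether a given compiler version actually emits a lookup table for this function, and with which '.align', depends on the target and optimization level.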

[Bug c/121446] New: [OpenMP][OpenACC] 'atomic' directive rejects complex variables in C/C++

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121446

Bug ID: 121446
   Summary: [OpenMP][OpenACC] 'atomic' directive rejects complex
variables in C/C++
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: openacc, openmp, rejects-valid
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, tschwinge at gcc dot gnu.org
  Target Milestone: ---

Found when looking at PR121416.

This is about variables of type
  _Complex double
  _Complex int
in atomics.

In Fortran, those are accepted and quite explicitly so following the spec.

For OpenMP:
  The question is whether both or at least the first is valid for C (and C++?)
  At least clang-20 accepts both.

For OpenACC: The wording is slightly odd, but implies that at least for C, some
_Complex types are valid.

Currently, gcc/g++ reject both as follows (with -fopenmp and -fopenacc):

  error: invalid expression type for ‘#pragma omp atomic’

[For completeness: Likewise with vector types.]

* * *

OpenMP requires:
"* x, r (result), and v (as applicable) are lvalue expressions with scalar
type."
- with -
"For C/C++, a scalar-variable, as defined by the base language."

C23 defines it as:
"Arithmetic types, pointer types, and the nullptr_t type are collectively
called scalar types."
- and -
"Integer and floating types are collectively called arithmetic types. Each
arithmetic type belongs to one type domain: the real type domain comprises the
real types, the complex type domain comprises the complex types."

C++23: "Arithmetic types (6.8.2), enumeration types, pointer types,
pointer-to-member types (6.8.4), std::nullptr_t, and cv-qualified (6.8.5)
versions of these types are collectively called scalar types."
"Integral and floating-point types are collectively termed arithmetic types."

* * *

OpenACC requires:
"* x and v (as applicable) are both l-value expressions with scalar type."
- with -
"In C, scalar datatypes are char (signed or unsigned), int (signed or unsigned,
with optional short, long or long long attribute), enum, float, double, long
double, _Complex (with optional float or long attribute), or any pointer
datatype. In C++, scalar datatypes are char (signed or unsigned), wchar_t, int
(signed or unsigned, with optional short, long or long long attribute), enum,
bool, float, double, long double, or any pointer datatype."

* * *

Remarks:

For OpenACC, I wonder why for C only with 'float', 'double' and 'long' and no
'_Complex int' or '_Complex long double'.

For C, I noticed that the standard only talks about _Complex +
float/double/long double and not integral types.

For C++, the standard does not have _Complex and only provides 'complex' via
the header <complex>, which defines a class template.

* * *

In most hardware, I assume that complex float and int (intN_t, N <= 4?) should
be supported; for complex double/long double + complex long/long long/intN_t
(N >= 8) it might be more difficult.

For more complex reductions (including those involving 'complex') and for
'complex(8)', GCC uses GOMP_atomic_start/GOMP_atomic_end (i.e. locking), which
looks similar to the libatomic fallback version, except that start/end permits
more code than just, e.g., an atomic_compare_exchange_16.

[It seems as if in some cases, GOMP_atomic_start/GOMP_atomic_end is used even
though the data type is small enough, but I might have missed something.]
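
As a rough analogy for the locking fallback described above, the following sketch guards a complex update with one global lock. It is not how libgomp is implemented; `atomic_lock` and `locked_update` are hypothetical names, and std::mutex merely stands in for the start/end pair.

```cpp
#include <complex>
#include <mutex>

// Hypothetical stand-in for a start/end locking pair: a single global lock
// serialises every "atomic" update, whatever the size of the data type.
static std::mutex atomic_lock;

void locked_update (std::complex<double> &x, std::complex<double> v)
{
  std::lock_guard<std::mutex> guard (atomic_lock);  // "start"
  x += v;                     // arbitrary code may run while the lock is held
}                             // "end": the guard releases the lock here
```

The design trade-off is the one noted above: a lock permits arbitrary code in the protected region, whereas a compare-exchange only covers a single fixed-size object.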

* * *  Test case * * *

_Complex int i, j;
_Complex double d, e;

void f() {
#pragma acc atomic update
#pragma omp atomic update
 i += 1;

#pragma acc atomic update
#pragma omp atomic update
 i += d;

#pragma acc atomic update
#pragma omp atomic update
 d += e;
}

* * *

A possible patch is the following. Note, however, that the comment
above the check would have to be changed if it were accepted:

--- a/gcc/c-family/c-omp.cc
+++ b/gcc/c-family/c-omp.cc
@@ -237,12 +237,15 @@ c_finish_omp_atomic (location_t loc, enum tree_code code,
   /* ??? According to one reading of the OpenMP spec, complex type are
  supported, but there are no atomic stores for any architecture.
  But at least icc 9.0 doesn't support complex types here either.
  And lets not even talk about vector types...  */
   type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type)
   && !POINTER_TYPE_P (type)
-  && !SCALAR_FLOAT_TYPE_P (type))
+  && !SCALAR_FLOAT_TYPE_P (type)
+  && !COMPLEX_FLOAT_TYPE_P (type)
+  && !(TREE_CODE (type) == COMPLEX_TYPE
+  && INTEGRAL_TYPE_P (TREE_TYPE (type))))
 {
   error_at (loc, "invalid expression type for %<#pragma omp atomic%>");
   return error_mark_node;
 }

[Bug target/121416] [gcn][MI300][CDNA3] libgomp.oacc-c-c++-common/reduction-cplx-dbl.c produces wrong gang-reduction result

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121416

--- Comment #4 from Tobias Burnus  ---
For completeness, modifying OpenACC's reduction-cplx-dbl.c to use atomics, i.e.

#pragma acc parallel num_gangs (32) copyin(ary[0:N]) copy(tsum,tprod)
  #pragma acc loop gang
for (int ix = 0; ix < N; ix++)
  {
#pragma acc atomic update
__real__ tsum += __real__ ary[ix];
#pragma acc atomic update
__imag__ tsum += __imag__ ary[ix];

also yields the correct result.

[Here, with atomics, the data is updated on every step - and not once per
threads/worker and once per team/gang as with reductions.]
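
A minimal OpenMP analogue of the per-component update above might look like the sketch below. It assumes a hypothetical plain struct `cplx` instead of '_Complex double', so that each component is an ordinary 'double' lvalue, which the atomic directive accepts today. Without -fopenmp the pragmas are ignored and the function just runs serially.

```cpp
// Hypothetical plain-struct complex number: each component is an ordinary
// 'double', i.e. a scalar type accepted by '#pragma omp atomic'.
struct cplx { double re, im; };

void atomic_accumulate (cplx &sum, cplx v)
{
#pragma omp atomic update
  sum.re += v.re;
#pragma omp atomic update
  sum.im += v.im;
}
```

Note that each component is updated atomically on its own; the pair is not updated as one indivisible complex value, which is fine for a sum but not for every operation.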

[Bug tree-optimization/121405] [13/14/15 Regression] Another missed VN via a copy (but via an int copy)

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121405

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13/14/15/16 Regression]    |[13/14/15 Regression]
                   |Another missed VN via a     |Another missed VN via a
                   |copy (but via an int copy)  |copy (but via an int copy)
      Known to work|                            |16.0

--- Comment #11 from Richard Biener  ---
Fixed on trunk.

[Bug target/121447] New: [16 Regression] ~20% slowdown of 470.lbm since r16-1644-gaba3b9d3a48a07 on AMD Zen5

2025-08-07 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121447

Bug ID: 121447
   Summary: [16 Regression] ~20% slowdown of 470.lbm since
r16-1644-gaba3b9d3a48a07 on AMD Zen5
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pheeck at gcc dot gnu.org
CC: hjl at gcc dot gnu.org
Blocks: 26163
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1283.240.0

r16-1644-gaba3b9d3a48a07 slowed down the 470.lbm SPEC 2006 benchmark by ~20%
when compiled with -Ofast -march=native -flto -fprofile-use on AMD Zen5.

I've already mentioned this slowdown in pr120941 comment 8.  Apparently this is
a separate problem, since the fix for pr120941 didn't help with this.

I'll eventually try to find out what is causing the slowdown and extract a
testcase.  I expect analysis will be more difficult since PGO is involved this
time.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug target/121447] [16 Regression] ~20% slowdown of 470.lbm since r16-1644-gaba3b9d3a48a07 on AMD Zen5

2025-08-07 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121447

Filip Kastl  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |16.0

[Bug tree-optimization/121405] [13/14/15/16 Regression] Another missed VN via a copy (but via an int copy)

2025-08-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121405

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c

commit r16-3066-g53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c
Author: Richard Biener 
Date:   Wed Aug 6 12:31:13 2025 +0200

tree-optimization/121405 - missed VN with aggregate copy

The following handles value-numbering of a BIT_FIELD_REF of
a register that's defined by a load by looking up a subset
load similar to how we handle bit-and masked loads.  This
allows the testcase to be simplified by two FRE passes,
the first one will create the BIT_FIELD_REF.

PR tree-optimization/121405
* tree-ssa-sccvn.cc (visit_nary_op): Handle BIT_FIELD_REF
with reference def by looking up a combination of both.

* gcc.dg/tree-ssa/ssa-fre-107.c: New testcase.
* gcc.target/i386/pr90579.c: Adjust.

[Bug c++/121433] -Wredundant-move false positive

2025-08-07 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121433

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |mpolacek at gcc dot gnu.org

--- Comment #1 from Marek Polacek  ---
GCC implements CWG 1579 so the conversion to std::optional is OK
and the implicit move still kicks in.
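
The original test case is not shown in this message, so the following is only a guess at the kind of code involved; `make_value` is a hypothetical name. It returns a local std::string from a function whose return type is std::optional<std::string>. Under CWG 1579 the local is first treated as an rvalue in such a converting return.

```cpp
#include <optional>
#include <string>

// Hypothetical example: the returned local 's' has a different type than the
// function's return type.  With CWG 1579, overload resolution for the
// conversion first treats 's' as an rvalue, so the string is moved into the
// optional without an explicit std::move.
std::optional<std::string> make_value ()
{
  std::string s = "value";
  return s;   // writing 'return std::move (s);' here would add nothing
}
```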

[Bug c/121446] [OpenMP][OpenACC] 'atomic' directive rejects complex variables in C/C++

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121446

--- Comment #1 from Andrew Pinski  ---
Complex int is a GNU C extension rather than part of the standard, which is
most likely why it is not mentioned in OpenACC.

Long _Complex mentioned in openacc refers to complex long double.

[Bug c++/110338] Implement C++26 language features

2025-08-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110338
Bug 110338 depends on bug 117783, which changed state.

Bug 117783 Summary: [C++26] P1061R10 - Structured bindings can introduce a pack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117783

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

[Bug c++/117783] [C++26] P1061R10 - Structured bindings can introduce a pack

2025-08-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117783

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |16.0
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Jakub Jelinek  ---
Implemented for 16+.

[Bug target/121451] New: RISC-V: zero-stride load broadcast vs. vector-scalar

2025-08-07 Thread parras at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121451

Bug ID: 121451
   Summary: RISC-V: zero-stride load broadcast vs. vector-scalar
   Product: gcc
   Version: 15.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: parras at gcc dot gnu.org
  Target Milestone: ---

https://godbolt.org/z/brW6sG7KM
Reduced from 538.imagick topblock #0 (11 insns, 36.75%)

We get the following assembly:

        fld     fa5,0(a1)
        vfmv.v.f        v3,fa5
        vfmacc.vv       v1,v3,v2

But since PR119100 we should get:

        fld     fa5,0(a4)
        vfmacc.vf       v1,fa5,v2

What is preventing the combination here is the vec_duplicate operand being a
mem:

(set (reg:RVVM1DF 157 [ _20 ])
(vec_duplicate:RVVM1DF (mem:DF (reg/v/f:DI 153 [ g ]) [1 *g_16(D)+0 S8
A64])))

OTOH this seems to be candidate for a zero-stride load broadcast:

        vlse64.v        v3,0(a1),zero
        vfmacc.vv       v1,v3,v2

However since r16-2452-gf796f819c35cc0 this case is explicitly handled as a
regular broadcast (implying the vfmv). Is there a reason to prefer forcing
unconditionally the memory operand into a register (fld + vfmv) over a
zero-stride load (vlse)?

bool
can_be_broadcast_p (rtx op)
{
...
  if (FLOAT_MODE_P (mode)
      && (memory_operand (op, mode) || CONSTANT_P (op))
      && can_create_pseudo_p ())
    return true;

I also noticed the tunable discussed in PR118734 but the decision made here
does not involve it.
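
The reduced test case lives behind the godbolt link and is not reproduced here. As a purely hypothetical illustration of the pattern under discussion, a loop of roughly this shape multiplies each element by one scalar loaded from memory, which a vectorizer has to broadcast across a vector register before (or as part of) the multiply-accumulate. Whether it yields exactly the RTL quoted above has not been verified.

```cpp
// Hypothetical kernel, not the reduced 538.imagick code: every iteration
// uses the same scalar *g, so a vectorized version needs that value either
// as a broadcast vector (vfmv.v.f or a zero-stride vlse) or as the scalar
// operand of a vector-scalar instruction such as vfmacc.vf.
void scale_accumulate (double *acc, const double *in, const double *g, int n)
{
  for (int i = 0; i < n; ++i)
    acc[i] += *g * in[i];
}
```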

[Bug target/121414] aarch64: streaming & streaming-compatible functions should be marked as variant PCS

2025-08-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121414

--- Comment #2 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:851cbdca8848525b35cbea8d02ba75a167fc11c1

commit r16-3068-g851cbdca8848525b35cbea8d02ba75a167fc11c1
Author: Richard Sandiford 
Date:   Thu Aug 7 15:15:00 2025 +0100

aarch64: Mark SME functions as .variant_pcs [PR121414]

Unlike base PCS functions, __arm_streaming and __arm_streaming_compatible
functions allow/require PSTATE.SM to be 1 on entry, so they need to
be treated as STO_AARCH64_VARIANT_PCS.

Similarly, functions that share ZA or ZT0 with their callers require
ZA to be active on entry, whereas the base PCS requires ZA to be
dormant or off.  These functions too need to be marked as having
a variant PCS.

gcc/
PR target/121414
* config/aarch64/aarch64.cc (aarch64_is_variant_pcs): New function,
split out from...
(aarch64_asm_output_variant_pcs): ...here.  Handle various types
of SME function type.

gcc/testsuite/
PR target/121414
* gcc.target/aarch64/sme/pr121414_1.c: New test.

[Bug target/121414] [13/14/15 Backport] aarch64: streaming & streaming-compatible functions should be marked as variant PCS

2025-08-07 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121414

Richard Sandiford  changed:

   What|Removed |Added

Summary|aarch64: streaming &|[13/14/15 Backport]
   |streaming-compatible|aarch64: streaming &
   |functions should be marked  |streaming-compatible
   |as variant PCS  |functions should be marked
   ||as variant PCS

--- Comment #3 from Richard Sandiford  ---
Fixed on trunk.  It's a wrong-code issue, so I'll backport to all open release
branches.

[Bug target/120718] [15 Backport] ICE (unrecognizable insn) with const_poly_int in v2si vector

2025-08-07 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120718

Richard Sandiford  changed:

   What|Removed |Added

Summary|ICE (unrecognizable insn)   |[15 Backport] ICE
   |with const_poly_int in v2si |(unrecognizable insn) with
   |vector  |const_poly_int in v2si
   ||vector

--- Comment #7 from Richard Sandiford  ---
Fixed on trunk.  Will backport to GCC 15 for the SLP test case, but not
further.

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702

--- Comment #18 from Segher Boessenkool  ---
(In reply to Surya Kumari Jangala from comment #16)
> With the testcase in the "Description", we are seeing both a splat and a
> shift being generated. Instead, a single add instruction is more efficient.

Exactly, with the code around it, an addition works best here.  A "shift
immediate" would be fine as well, but that insn doesn't exist.
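
The equivalence behind this is that, for unsigned (modular) integers, a left
shift by one gives the same result as adding a value to itself. A minimal
sketch in C, assuming a made-up two-lane type `v2di` and made-up function names
(nothing here is taken from the patch or from the rs6000 back end):

```c
#include <stdint.h>

/* Illustrative stand-in for one vector register holding two 64-bit lanes.
   The type and both function names are hypothetical.  */
typedef struct { uint64_t lane[2]; } v2di;

/* Each lane shifted left by one; on the target discussed here this needs a
   splat of the constant 1 plus a vector shift.  */
v2di
shift_left_by_one (v2di a)
{
  for (int i = 0; i < 2; i++)
    a.lane[i] <<= 1;
  return a;
}

/* Each lane added to itself; the same result from a single vector add.
   Unsigned overflow wraps, exactly like the bit shifted out above.  */
v2di
add_to_self (v2di a)
{
  for (int i = 0; i < 2; i++)
    a.lane[i] += a.lane[i];
  return a;
}
```

Both functions return identical lanes for every input, including values with
the top bit set, which is why replacing the shift by an addition is safe.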

[Bug tree-optimization/121448] [16 Regression] ICE: verify_gimple failed: gimple cond condition cannot throw with -O -fsignaling-nans -ffinite-math-only -fnon-call-exceptions

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121448

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |16.0
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2025-08-07

--- Comment #1 from Andrew Pinski  ---
.

[Bug target/121449] [13/14/15/16 regression] Immediate offset out of range error for AArch64 SVE gather load

2025-08-07 Thread Pengfei.Li2 at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121449

--- Comment #1 from Pengfei Li  ---
Created attachment 62079
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62079&action=edit
reproducer

[Bug target/121449] [13/14/15/16 regression] Immediate offset out of range error for AArch64 SVE gather load

2025-08-07 Thread Pengfei.Li2 at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121449

--- Comment #2 from Pengfei Li  ---
Comment on attachment 62079
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62079
reproducer

Compiling the above C++ file with "-O3 -march=armv8-a+sve" on AArch64
reproduces the bug.

[Bug target/121444] [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

--- Comment #4 from Thomas Schwinge  ---
Thanks, Richi, Andrew.

(In reply to Andrew Pinski from comment #2)
> I am curious is this causing an assembly failure? Or you were just curious
> about the alignment changes?

For nvptx, this doesn't cause any test suite regressions.  I just happened to
notice the difference in code generated for GCC's target libraries, which
looked like a (minor, indeed) regression.

We don't know how exactly the (proprietary) PTX/Nvidia GPU back end code
operates, how it lays out objects in memory.  GCC/nvptx isn't ELF, it's
unlikely that any such merging of 'CSWTCH'es is able to happen.

> DECL_MERGEABLE will be set on the decls also. So if you need to lower back
> when outing the decl you can check that.

Noted, thanks.  We don't "need to lower back", but maybe we should (if it's
indeed easy; I'll have a look), as the increased alignment isn't beneficial per
my understanding.  Same issue for other non-ELF targets, I suppose?

(In reply to Andrew Pinski from comment #3)
> Maybe there should be a target hook which says DECL_MERGEABLE will do
> anything here so we don't over align on targets which don't have mergeable
> cst sections.

..., so that was my next thought, whether that's worth having general
infrastructure for?

> (macho and elf are the only ones which have support that I
> know of).
> 
> Does NVPTX support mergeable constant sections/variables?

There's no such concept, as far as I know.

[Bug tree-optimization/121454] [16 regression] ICE in nonoverlapping_refs_since_match_p, at tree-ssa-alias.cc:1684

2025-08-07 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121454

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |16.0

[Bug fortran/121435] Incorrect result with default pointer initialization

2025-08-07 Thread kargls at comcast dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121435

kargls at comcast dot net changed:

   What|Removed |Added

 CC||kargls at comcast dot net

--- Comment #1 from kargls at comcast dot net ---
It appears that default initialization is not occurring for
the main program unit.  Here's a slightly modified and expanded
test.


  program foo

    integer, target :: v(5) = [1, 2, 3, 4, 5]

    type entry
      integer, pointer :: p(:) => v   ! default initialization
    end type entry

    type(entry) test
    type(entry), allocatable :: d

    ! This causes a segfault at runtime, because default
    ! initialization is not occurring in the main program
    ! unit.
    !
    ! if (any(v /= test%p)) stop 1
    ! print *, test%p

    call bar(test)

    block
      ! def. init. occurs in block construct
      type(entry) b
      if (any(v /= b%p)) stop 3
    end block

    ! def. init. occurs in allocation
    allocate(d)
    if (any(v /= d%p)) stop 4

  contains
    ! def. init. occurs for intent(out) dummy argument
    subroutine bar(a)
      type(entry), intent(out) :: a
      if (any(v /= a%p)) stop 2
    end subroutine bar
  end program foo

[Bug target/121444] [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Keywords||missed-optimization
   Last reconfirmed||2025-08-07

--- Comment #5 from Andrew Pinski  ---
I noticed that clang puts the cst in the mergeable sections even without
increasing the alignment. Let me see if I can do that without increasing the
alignment overall. Still need to do the padding for the section though.

That will remove the extra alignment and even fix PR 121438 without any changes
to the front-end. 

But this might not happen until next week.

[Bug tree-optimization/121454] [16 regression] ICE in nonoverlapping_refs_since_match_p, at tree-ssa-alias.cc:1684

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121454

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||121405

--- Comment #1 from Andrew Pinski  ---
>although I haven't verified that yet.

https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692121.html


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121405
[Bug 121405] [13/14/15 Regression] Another missed VN via a copy (but via an int
copy)

[Bug target/121441] New: [16 Regression] 5% slowdown of 519.lbm_r on aarch64

2025-08-07 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121441

Bug ID: 121441
   Summary: [16 Regression] 5% slowdown of 519.lbm_r on aarch64
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: missed-optimization, needs-bisection
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pheeck at gcc dot gnu.org
Blocks: 26163
  Target Milestone: ---
  Host: aarch64-gnu-linux
Target: aarch64-gnu-linux

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=585.477.0

there was a 5% exec time slowdown of the 519.lbm_r SPEC 2017 benchmark between
commits

r16-2456-g556ed247adc985
r16-2619-g688f1947bd5453

when run with -Ofast (generic march) on an Ampere Altra (Neoverse N1) machine.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug target/121416] [gcn][MI300][CDNA3] libgomp.oacc-c-c++-common/reduction-cplx-dbl.c produces wrong gang-reduction result

2025-08-07 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121416

--- Comment #2 from Andrew Stubbs  ---
Let's look at why the atomic instructions that exist aren't working for us,
before we try to use the big dumb hammer fix (and does that solution *really*
work, if we don't understand the cache architecture properly?)

[Bug c++/121443] New: GCC rejects valid lambda with local struct constructor in default argument

2025-08-07 Thread jirehguo at tju dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121443

Bug ID: 121443
   Summary: GCC rejects valid lambda with local struct constructor
in default argument
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jirehguo at tju dot edu.cn
  Target Milestone: ---

GCC rejects the following code, which is accepted by Clang and MSVC.
Reproducer: https://godbolt.org/z/EWjjvdfrh
The same lambda, when used within a function body (e.g., main), is accepted
without error.

Flags: -std=c++23

Code:
```
int g(int i = [] {
  struct Node {
    int value;
    Node(int val) : value(val) {}
  };
  Node node1(45);
  return node1.value;
}()) {
  return i;
}

int main() {
  return g();
}

```

Output:
```
<source>: In lambda function:
<source>:4:10: error: expected unqualified-id before 'int'
    4 |     Node(int val) : value(val) {}
      |          ^~~
<source>:4:10: error: expected ')' before 'int'
    4 |     Node(int val) : value(val) {}
      |         ~^~~
      |          )
Compiler returned: 1
```

[Bug c++/121442] [16 Regression] Error recovery ICE since r16-2108

2025-08-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121442

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2025-08-07
   Target Milestone|--- |16.0
 Ever confirmed|0   |1

[Bug c++/121434] spurious -Wsequence-point warning

2025-08-07 Thread matthijsvanduin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121434

Matthijs van Duin  changed:

   What|Removed |Added

 CC||matthijsvanduin at gmail dot 
com

--- Comment #6 from Matthijs van Duin  ---
(In reply to Andrew Pinski from comment #1)
> Note IIRC Wsequence-point is designed for the C++98 rules rather than more
> recent rules and this is on purpose.

If this is true then that's a bad decision imho. Warning about "undefined
behaviour" when the behaviour is in fact well-defined is a bug, plain and
simple.

When I specify -std=c++17 or later it means I no longer care about the
limitations of older versions of the standard, and in fact almost certainly the
code won't even compile using those older versions, hence whether or not it
would have undefined behaviour under older rules is completely irrelevant.

[Bug target/117015] s390 should define spaceship4 optab

2025-08-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117015

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|--- |16.0

[Bug c++/121442] New: [16 Regression] Error recovery ICE since r16-2108

2025-08-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121442

Bug ID: 121442
   Summary: [16 Regression] Error recovery ICE since r16-2108
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

struct S { int a, b, c, d, e; };

void
foo ()
{
  auto [a, b, b, b, c ] = S {};
}

ICEs starting with r16-2108-gc81447d969f27a8653ebb1a450372f0d25a2e628
u3.C: In function ‘void foo()’:
u3.C:6:15: error: redeclaration of ‘auto b’
    6 |   auto [a, b, b, b, c ] = S {};
      |               ^
u3.C:6:12: note: ‘auto b’ previously declared here
    6 |   auto [a, b, b, b, c ] = S {};
      |            ^
u3.C:6:18: error: redeclaration of ‘auto b’
    6 |   auto [a, b, b, b, c ] = S {};
      |                  ^
u3.C:6:12: note: ‘auto b’ previously declared here
    6 |   auto [a, b, b, b, c ] = S {};
      |            ^
u3.C:6:23: internal compiler error: tree check: expected var_decl or
function_decl, have error_mark in cp_parser_decomposition_declaration, at
cp/parser.cc:17005
    6 |   auto [a, b, b, b, c ] = S {};
      |                       ^
0x2f7fbe9 internal_error(char const*, ...)
../../gcc/diagnostic-global-context.cc:534
0x168b7ca tree_check_failed(tree_node const*, char const*, int, char const*,
...)
../../gcc/tree.cc:9161
0x448333 tree_check2(tree_node*, char const*, int, char const*, tree_code,
tree_code)
../../gcc/tree.h:3752
0x705370 cp_parser_decomposition_declaration
../../gcc/cp/parser.cc:17005
0x704468 cp_parser_simple_declaration
../../gcc/cp/parser.cc:16617
0x70421f cp_parser_block_declaration
../../gcc/cp/parser.cc:16512
0x70208c cp_parser_declaration_statement
../../gcc/cp/parser.cc:15549
0x6fc63b cp_parser_statement
../../gcc/cp/parser.cc:13379
0x6fdaf0 cp_parser_statement_seq_opt
../../gcc/cp/parser.cc:13953
0x6fd660 cp_parser_compound_statement
../../gcc/cp/parser.cc:13800
0x719b89 cp_parser_function_body
../../gcc/cp/parser.cc:26949
0x719edf cp_parser_ctor_initializer_opt_and_function_body
../../gcc/cp/parser.cc:27000
0x7293b6 cp_parser_function_definition_after_declarator
../../gcc/cp/parser.cc:34006
0x72919c cp_parser_function_definition_from_specifiers_and_declarator
../../gcc/cp/parser.cc:33921
0x71411b cp_parser_init_declarator
../../gcc/cp/parser.cc:24231
0x7046b7 cp_parser_simple_declaration
../../gcc/cp/parser.cc:16693
0x70421f cp_parser_block_declaration
../../gcc/cp/parser.cc:16512
0x703c4f cp_parser_declaration
../../gcc/cp/parser.cc:16313
0x703d2c cp_parser_toplevel_declaration
../../gcc/cp/parser.cc:16334
0x6e84c9 cp_parser_translation_unit
../../gcc/cp/parser.cc:5503
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug target/121432] [15/16 regression] GCC has a regression on Microblaze since r15-1619-g3b9b8d6cfdf593

2025-08-07 Thread thomas.petazzoni--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121432

--- Comment #12 from Thomas Petazzoni  ---
I can confirm: if I take all object files from GCC 14.x, and just
arch/microblaze/kernel/irq.o from GCC 15.x, the issue occurs, where user-space
applications don't work. I guess they are failing when they do a syscall.

[Bug target/121416] [gcn][MI300][CDNA3] libgomp.oacc-c-c++-common/reduction-cplx-dbl.c produces wrong gang-reduction result

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121416

--- Comment #3 from Tobias Burnus  ---
(In reply to Andrew Stubbs from comment #2)
> Let's look at why the atomic instructions that exist aren't working for us,
> before we try to use the big dumb hammer fix (and does that solution
> *really* work, if we don't understand the cache architecture properly?)

I think the question is whether the OpenACC code is correct or not - or to find
a non-OpenACC code which also shows the issue and should be correct.

I find the OpenACC generated code too convoluted to really see whether it is
valid or not.

* * *

OpenACC: In any case, for the gang code, I get (cf. comment 0) with OpenACC:

sum:
104.833984 + i 109.667969
16.898438 + i 17.796875

prod:
698317287061244416.00 + i -950434920224383616.00
29.758044 + i 6.070528

That is: Both 'sum' and 'prod' are wrong.

* * *

While with OpenMP, the following Fortran code produces the correct result:

!$omp target teams distribute parallel do map(to: ary) map(tofrom: tsum, tprod)
do ix = 1, N
!$omp atomic update
tsum = tsum + ary(ix)
!$omp atomic update
tprod = tprod * ary(ix)
end do

but this code uses - again -

  GOMP_atomic_start
  GOMP_atomic_end

although I wonder whether __atomic_compare_exchange_16 shouldn't have
handled this atomically? (This is available via libatomic for Nvptx and also on
the host. I think for GCN, it is not, but I might be wrong; GCN has no
libatomic, but I am not 100% sure that it doesn't handle it intrinsically.)

* * *

For the sum, one can also put it into two atomics:

#pragma omp target teams \
distribute parallel for \
map(to: ary) map(tofrom: tsum, tprod)
for (int ix = 0; ix < N; ix++)
  {
#pragma omp atomic update
  __real__ tsum = __real__ tsum + __real__ ary[ix];

#pragma omp atomic update
  __imag__ tsum = __imag__ tsum + __imag__ ary[ix];
  }

and doing so produces in OpenMP

sum:
104.833984 + i 109.667969
104.833984 + i 109.667969

and uses
flat_atomic_cmpswap_X2

This works as RE and IM are completely independent. It also shows that there is
no generic issue with atomics.

For multiplication, real and imaginary parts get mixed. Recall that for complex
variables A and B, 'A * B' is:

  (Re A * Re B - Im A * Im B)  +  i*(Re A * Im B + Im A * Re B)

Thus, I can do an 'atomic update' for them, but as soon as one succeeds and the
other fails, I am doomed!
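
To make the coupling concrete, here is a small sequential sketch in C. It
assumes a plain struct `cplx` instead of `_Complex`, and both function names
are invented for this illustration. It does not use OpenMP or real atomics; it
only models what happens when the two halves are updated one after the other
rather than as a single unit:

```c
/* Hypothetical complex type with explicit parts, used instead of _Complex
   so that the data flow is visible.  */
struct cplx { double re, im; };

/* Correct product: both result parts are computed from the old value of A.  */
struct cplx
cplx_mul (struct cplx a, struct cplx b)
{
  struct cplx r;
  r.re = a.re * b.re - a.im * b.im;
  r.im = a.re * b.im + a.im * b.re;
  return r;
}

/* Deliberately wrong: the halves are updated as two separate steps, as two
   independent per-component updates would do.  The second step reads the
   already-updated real part, so the imaginary part comes out wrong.  */
struct cplx
cplx_mul_split (struct cplx a, struct cplx b)
{
  a.re = a.re * b.re - a.im * b.im;   /* first update */
  a.im = a.re * b.im + a.im * b.re;   /* uses the new a.re */
  return a;
}
```

For A = 1 + 2i and B = 3 + 4i the correct product is -5 + 10i, while the split
version yields -5 - 14i. For addition the two parts never read each other, which
is why splitting the sum into two atomics is fine.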

I have no idea how the generated OpenACC code handles this - nor why 'sum' is
also wrong with OpenACC.

* * *

Note that I used Fortran above because for C/C++, a complex 'omp atomic update'
is rejected with:

   error: invalid expression type for ‘#pragma omp atomic’

while Clang accepts it; I also do not see anything wrong with using complex
numbers, and in gfortran it works (using the atomic_start/atomic_end workaround).

[Bug target/121444] [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

--- Comment #2 from Andrew Pinski  ---
I am curious: is this causing an assembly failure? Or were you just curious
about the alignment changes?

So for mergeable sections in ELF, they all have an alignment and an element
size, which are the same. So we increase the alignment to the next power of 2
(up to 32 bytes max; if the object is bigger than 32 bytes, we don't change the
alignment). So there is a maximum waste of 8 bytes in some cases, which will be
zero-filled.

DECL_MERGEABLE will be set on the decls also. So if you need to lower it back
when outputting the decl, you can check that.
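
The rounding rule described above can be sketched in C. This is only an
illustration of the arithmetic as stated in this comment, not GCC's actual
code: the function name `mergeable_alignment`, its byte-based interface, and
the reading of "next power of 2" as "smallest power of two not below the size"
are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a GCC function.  Returns the alignment in bytes
   for an object of SIZE bytes whose natural alignment is ALIGN, after
   raising it to the smallest power of two that is >= SIZE.  Objects larger
   than 32 bytes keep their original alignment.  */
size_t
mergeable_alignment (size_t size, size_t align)
{
  assert (size > 0 && align > 0);

  if (size > 32)
    return align;

  size_t pow2 = 1;
  while (pow2 < size)
    pow2 <<= 1;

  /* Never lower an alignment that is already larger.  */
  return pow2 > align ? pow2 : align;
}
```

For example, a 24-byte object with 8-byte alignment would be raised to 32-byte
alignment, while a 56-byte object would keep its 8-byte alignment.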

[Bug c/121446] [OpenMP][OpenACC] 'atomic' directive rejects complex variables in C/C++

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121446

--- Comment #2 from Andrew Pinski  ---
See https://gcc.gnu.org/onlinedocs/gcc-15.1.0/gcc/Complex.html for the
extension.

[Bug ada/121316] Representation clause on enumeration type causes iterator filters to be ignored

2025-08-07 Thread liam at liampwll dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121316

--- Comment #2 from Liam Powell  ---
By crash I do mean the bug box. It's also notable that this only occurs when
the 'Image is inside the loop.

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread avinashd at linux dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702

--- Comment #17 from Avinash Jayakar  ---
Created attachment 62077
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62077&action=edit
Proposed patch for using add when vectorizing << 1

I looked at the slp vectorization pass that converts scalar gimple code to
vectorized gimple. Analysis happens in vect_slp_analyze_bb_1 before actually
scheduling in vect_schedule_slp. There are multiple patterns written here to
optimize simple operations such as *2 to <<1 in vect_recog_mult_pattern. 
I have added a pattern just for detecting left shift by one and replacing it by
add in vect_recog_lshift_by_one_pattern. Either this can be done, or I can move
this logic into a shift pattern (vect_recog_widen_shift_pattern or
vect_recog_vector_vector_shift_pattern).

This does fix the original issue, where <<1 generates 2 instructions. With this
patch it just generates 1 add instruction when code is vectorized. But other
cases like *2 and a = a+a are not handled right now.

@Segher, I had a few questions on this 
- Do you suggest moving ahead in this direction? Since here I am manipulating
the GIMPLE, it will affect different architectures as well, would this be ok?


(In reply to Segher Boessenkool from comment #15)
> Just have it recognised by a define_insn that generates an addition insn
> when generating assembler code.  You know...  the same as always :-)
> 

- Thank you for this suggestion, I did give this a try but ran into a few
issues. Is there a way in define_insn to detect that one of the operands in rtl
is dead? Because we need to be sure that const_1 is not used anywhere further
before replacing the 2 rtl insns (splat and shift) with just 1 (plus). I
checked define_peephole2; it provides a way to check if an operand is dead.
Would using the peephole pass for this make sense?

[Bug target/120718] ICE (unrecognizable insn) with const_poly_int in v2si vector

2025-08-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120718

--- Comment #6 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:8e3239e3e92f3cd57bf3a19f10daa66c4cb45cc1

commit r16-3067-g8e3239e3e92f3cd57bf3a19f10daa66c4cb45cc1
Author: Richard Sandiford 
Date:   Thu Aug 7 14:19:03 2025 +0100

Remove MODE_COMPOSITE_P test from simplify_gen_subreg [PR120718]

simplify_gen_subreg rejected subregs of literal constants if
MODE_COMPOSITE_P.  This was added by the fix for PR96648 in
g:c0f772894b6b3cd8ed5c5dd09d0c7917f51cf70f.  Jakub said:

  As for the simplify_gen_subreg change, I think it would be desirable
  to just avoid creating SUBREGs of constants on all targets and for all
  constants, if simplify_immed_subreg simplified, fine, otherwise punt,
  but as we are late in GCC11 development, the patch instead guards this
  behavior on MODE_COMPOSITE_P (outermode) - i.e. only conversions to
  powerpc{,64,64le} double double long double - and only for the cases
  where simplify_immed_subreg was called.

I'm not sure about relaxing the codes further, since subregs might
be wanted for CONST, SYMBOL_REF and LABEL_REF.  But removing the
MODE_COMPOSITE_P is needed to fix PR120718, where we get an ICE
from generating a subreg of a V2SI const_vector.

gcc/
PR rtl-optimization/120718
* simplify-rtx.cc (simplify_context::simplify_gen_subreg):
Remove MODE_COMPOSITE_P condition.

gcc/testsuite/
PR rtl-optimization/120718
* gcc.target/aarch64/sve/acle/general/pr120718.c: New test.

[Bug target/121392] [16 Regression] GCN offloading: 'libgomp.c/simd-math-1.c' execution test timeouts

2025-08-07 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121392

Andrew Stubbs  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ams at gcc dot gnu.org
   Last reconfirmed||2025-08-07
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #3 from Andrew Stubbs  ---
Patch submitted: https://sourceware.org/pipermail/newlib/2025/022077.html

[Bug other/120237] /pub/gcc/infrastructure/ + contrib/download_prerequisites: Update MPFR for C23 / Fortran 2023 functions

2025-08-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120237

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:b399a0084bc9627c3507ec0fe1bf90f4c073aacf

commit r16-3063-gb399a0084bc9627c3507ec0fe1bf90f4c073aacf
Author: Tobias Burnus 
Date:   Thu Aug 7 09:19:03 2025 +0200

contrib/download_prerequisites: Update GMP, MPFR, MPC [PR120237]

Download newer versions of GMP, MPFR and MPC (the latest); besides the
usual bug fixes and smaller features, MPFR adds new functions for C23,
some of which are already used in GCC in the middle end
(fold-const-call.cc) and in Fortran 2023 for the 'pi' trigonometric
functions, if MPFR is new enough.

contrib/ChangeLog:

PR other/120237
* download_prerequisites: Update to download GMP 6.3.0 (before 6.2.1),
MPFR 4.2.2 (before 4.1.0), and MPC 1.3.1 (before 1.2.1).
* prerequisites.md5: Update hash.
* prerequisites.sha512: Likewise.

[Bug rtl-optimization/121439] [IOCCC] а case with high time complexity and high RAM usage

2025-08-07 Thread jpegqs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121439

--- Comment #8 from Ilya Kurdyukov  ---
-O0 is already fast in GCC 14.2.1, I haven't tried this case after 13.3.0, I
ran it straight away on -O2.

$ time -p cc -O0 output.c -o test0
real 55.63
user 49.74
sys 4.23

[Bug c/121423] [ICE] nested function leads to internal compiler error: verify_gimple failed

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121423

Richard Biener  changed:

   What|Removed |Added

  Component|ipa |c
 CC||jsm28 at gcc dot gnu.org
   Keywords||ice-checking, rejects-valid

--- Comment #1 from Richard Biener  ---
I think we have a duplicate for this.  We do not have a way to represent this
case in the IL after lowering nested functions since the function declaration
(which is then global) refers to a local type in foo.

We might want to reject this case?

[Bug target/121440] New: [16 Regression] 50% slowdown of 519.lbm_r on Zen5 since r16-2727-g09f0768b55b96c (the fix for pr120941)

2025-08-07 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121440

Bug ID: 121440
   Summary: [16 Regression] 50% slowdown of 519.lbm_r on Zen5
since r16-2727-g09f0768b55b96c (the fix for pr120941)
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pheeck at gcc dot gnu.org
CC: hjl.tools at gmail dot com, rguenth at gcc dot gnu.org
Blocks: 26163
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu,
Target: x86_64-pc-linux-gnu

So the fix for pr120941 apparently slows down 519.lbm_r with -Ofast
-march=native -flto on our Zen5 machine by ~50% :(.

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1288.477.0

H.J., Richi, any ideas (intuition) about what this could be?  I'll try to find
out what is going on and produce a testcase.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug target/121440] [16 Regression] 50% slowdown of 519.lbm_r on Zen5 since r16-2727-g09f0768b55b96c (the fix for pr120941)

2025-08-07 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121440

Filip Kastl  changed:

   What|Removed |Added

   Target Milestone|--- |16.0

[Bug rtl-optimization/121424] Debug info associates return instruction with inlined function due to copying of the return during cfgcleanup after RA

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121424

--- Comment #11 from Richard Biener  ---
I'll note returns have been an issue wrt locations because we unify all returns
during gimple lowering / CFG build so we have a single edge to EXIT.

[Bug other/120237] /pub/gcc/infrastructure/ + contrib/download_prerequisites: Update MPFR for C23 / Fortran 2023 functions

2025-08-07 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120237

Tobias Burnus  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #4 from Tobias Burnus  ---
FIXED.

As noted in the patch-submission email (see there for details):

* ISL > 0.24 requires C++17, but GCC only requires C++14
  (ISL > 0.24 provides no really required features. Thus,
  no need to think about it.)

* gettext - this one has its own particularities. Thus,
  it was not included in this update.
  Note that gettext 0.26 (released 20 Jul 2025) contains
  an update for GCC format checking, which avoids some
  .po translation issues (missing/wrong %...). Thus, using
  it has advantages.

[Bug target/121455] darwin_mergeable_constant_section and machopic_select_rtx_section should be improved to support non aligned objects

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121455

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2025-08-07
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot 
gnu.org
 Status|UNCONFIRMED |ASSIGNED

[Bug c/120510] composite_type produces result not compatible with arguments

2025-08-07 Thread uecker at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120510

uecker at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |uecker at gcc dot 
gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from uecker at gcc dot gnu.org ---
fixed

[Bug target/121432] [15/16 regression] GCC has a regression on Microblaze since r15-1619-g3b9b8d6cfdf593

2025-08-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121432

--- Comment #13 from Sam James  ---
Good work.

I suspect pinskia is going to be right and it's a botched libcall impl (see
PR103383, PR107459).

Can you delete irq.o, run make V=1, and share the command line used to build
irq.o? Then run that command you found again and append -save-temps, then
upload irq.i here?

[Bug translation/93836] teach xgettext what HOST_WIDE_INT_PRINT means

2025-08-07 Thread dimitar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93836

Dimitar Dimitrov  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||dimitar at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #7 from Dimitar Dimitrov  ---
This has been fixed with r15-7699-g0bb431d0a77cf8dc790b9c61539b3eb6ab1710f0

[Bug target/121455] New: darwin_mergeable_constant_section and machopic_select_rtx_section should be improved to support non aligned objects

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121455

Bug ID: 121455
   Summary: darwin_mergeable_constant_section and
machopic_select_rtx_section should be improved to
support non aligned objects
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: *-*-darwin

Similarly to how ELF mergeable section support is being improved, the
Mach-O/Darwin support could be improved in the same way.

That is, non-aligned objects could be placed in the __TEXT,__literalN sections
(where N is 4, 8, and 16).

I will attach a patch to test in a little bit; it will depend on the patches
for PR 121444 and PR 121394, since those also fix the middle-end parts for
mergeable objects.

[Bug middle-end/121394] [14/15/16 Regression] Since r16-2595-gf1c80147641783: link-time error: libm_a-e_atan2.o):(.rodata.cst32): SHF_MERGE section size (56) must be a multiple of sh_entsize (32)

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121394

--- Comment #13 from Andrew Pinski  ---
Created attachment 62080
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62080&action=edit
New patch which I am testing

[Bug target/121444] [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

--- Comment #6 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> I noticed that clang puts the cst in the mergeable sections even without
> increasing the alignment. Let me see if I can do that without increasing the
> alignment overall. Still need to do the padding for the section though.
> 
> That will remove the extra alignment and even fix PR 121438 without any
> changes to the front-end. 
> 
> But this might not happen until next week.

I think I have this implemented. It fixes this issue and PR 121438; part of
the patch also fixes PR 121394 (I am going to extract that as a first patch),
with the second patch covering this one and PR 121438.
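
As an illustration of the section padding mentioned in the quoted comment (this
is not taken from the patch), the link error in PR 121394 reports a
.rodata.cst32 section of size 56 with sh_entsize 32. A minimal sketch of
rounding a section size up to a multiple of its entry size follows; the helper
name pad_to_entsize is hypothetical and only serves the example.

```c
#include <stddef.h>

/* Hypothetical helper, not a GCC function: round SIZE up to the next
   multiple of ENTSIZE.  ENTSIZE is assumed to be non-zero.  */
size_t pad_to_entsize(size_t size, size_t entsize)
{
    size_t rem = size % entsize;
    /* Already a multiple: nothing to add.  Otherwise add the shortfall.  */
    return rem == 0 ? size : size + (entsize - rem);
}
```

With the numbers from the PR 121394 title, pad_to_entsize(56, 32) gives 64, so
8 bytes of padding would be needed for the section size to be a multiple of
sh_entsize.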

[Bug tree-optimization/121454] [16 regression] ICE in nonoverlapping_refs_since_match_p, at tree-ssa-alias.cc:1684

2025-08-07 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121454

--- Comment #2 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> I suspect this is due to
>
> commit 53f491ccd1e59fad77fb2cb30d1a58b9e5e5f63c
> Author: Richard Biener 
> Date:   Wed Aug 6 12:31:13 2025 +0200
>
> tree-optimization/121405 - missed VN with aggregate copy
>
> although I haven't verified that yet.

Confirmed: with the patch reverted locally, an i686-pc-linux-gnu
bootstrap has now reached make check.

[Bug target/121444] [16 Regression] nvptx: Increased '.align' for 'CSWTCH' after "Improve mergability of CSWTCH [PR120523]"

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121444

--- Comment #7 from Andrew Pinski  ---
Created attachment 62081
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62081&action=edit
Patch which I am testing which reverts the increased alignment part

Note this depends on the patch for PR 121394 for correctness.

[Bug c++/121438] Improve mergability of array's initializer_list

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121438

--- Comment #4 from Andrew Pinski  ---
New patch for this is included in PR 121444.

[Bug target/121462] [meta-bug] BPF verifier issues

2025-08-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121462

Sam James  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2025-08-08
 Status|UNCONFIRMED |NEW

[Bug tree-optimization/121454] [16 regression] ICE in nonoverlapping_refs_since_match_p, at tree-ssa-alias.cc:1684 since r16-3066-g53f491ccd1e59f

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121454

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
I will have a look.

[Bug target/121461] switch to table conversion should be switched off for BPF

2025-08-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121461

--- Comment #1 from Sam James  ---
Note that the thread is about a userland parser in libv4l (which does NOT use
libbpf), but Clang doesn't do this optimisation for the testcase.

I've tried looking for why but can't find anything clear other than old
ac2e25026fa7a198c33fe521d9c02865ede12981 in llvm and
cc290a9e912e68677b8c33a3c82740435a6c04d8 as well.

https://patchwork.ozlabs.org/project/netdev/patch/b73889608508d98bbd4d58af032528626a4950b0.1554731339.git.dan...@iogearbox.net/

[Bug target/121461] switch to table conversion should be switched off for BPF

2025-08-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121461

--- Comment #2 from Sam James  ---
There is https://github.com/libbpf/libbpf/issues/274 as well which says the
support is fairly new for suffixed sections.

[Bug tree-optimization/121460] `switch (v + 20) case` should optimize away the addition

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121460

--- Comment #1 from Richard Biener  ---
More interesting for multiplications I guess.
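
For context, a minimal sketch of the transformation the bug title asks for,
using a made-up example rather than the testcase from the PR (the function
names and case values are assumptions). With unsigned arithmetic the addition
is a bijection modulo 2^N, so subtracting 20 from every case label gives an
equivalent switch on v itself:

```c
/* Hypothetical example: switch on v + 20.  */
int classify_with_add(unsigned v)
{
    switch (v + 20) {
    case 20: return 1;
    case 25: return 2;
    case 30: return 3;
    default: return 0;
    }
}

/* The same function with the addition folded into the case labels.  */
int classify_folded(unsigned v)
{
    switch (v) {
    case 0:  return 1;
    case 5:  return 2;
    case 10: return 3;
    default: return 0;
    }
}
```

A multiplication would presumably need more care, since case labels that are
not a multiple of the factor and wrap-around both have to be considered.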

[Bug target/121461] switch to table conversion should be switched off for BPF

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121461

--- Comment #3 from Richard Biener  ---
That's a weird restriction.

[Bug target/121449] [13/14/15/16 regression] Immediate offset out of range error for AArch64 SVE gather load

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121449

Richard Biener  changed:

   What|Removed |Added

 Target||aarch64
   Target Milestone|--- |13.5
   Keywords||assemble-failure

[Bug tree-optimization/121454] [16 regression] ICE in nonoverlapping_refs_since_match_p, at tree-ssa-alias.cc:1684 since r16-3066-g53f491ccd1e59f

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121454

--- Comment #4 from Richard Biener  ---
I'll note that the assert isn't ensured in the GIMPLE IL verifier AFAICS.

  /* TARGET_MEM_REF are never wrapped in handled components, so we do not need
 to handle them here at all.  */
  gcc_checking_assert (TREE_CODE (ref1) != TARGET_MEM_REF
   && TREE_CODE (ref2) != TARGET_MEM_REF);

I will make the VN lookup code more careful.

[Bug fortran/121452] [14/15/16 Regression] Bogus 'OpenMP constructs ... may not be nested inside ‘simd’ region' due to compiler-inserted "#pragma omp __structured_block"

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121452

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.4

[Bug tree-optimization/119568] [13/14/15/16 Regression] [avr] ICE: in find_widening_optab_handler_and_mode, at optabs-query.cc:498

2025-08-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119568

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.5

[Bug other/121456] New: GCC doesn't fully utilize registers with known values when making 'mov' instructions

2025-08-07 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121456

Bug ID: 121456
   Summary: GCC doesn't fully utilize registers with known values
when making 'mov' instructions
   Product: gcc
   Version: 15.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

This issue is easier to demonstrate with '-Os' optimization, but it sometimes
happens with '-O2' as well.

Test code:

```c
#include <stdint.h>

uint64_t func1a(uint64_t a, uint64_t b) {
return a == 10 ? a : b;
}

uint64_t func1b(uint64_t a, uint64_t b) {
return a == 10 ? 10 : b;
}

uint64_t func2a(uint64_t a, uint64_t b) {
return a == 0x1234abcd ? a : b;
}

uint64_t func2b(uint64_t a, uint64_t b) {
return a == 0x1234abcd ? 0x1234abcd : b;
}
```

Note that func1a and func1b are equivalent, and func2a and func2b are
equivalent.
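
The equivalence can be spot-checked with a small harness. This is only a
sketch: the four functions are repeated from the test code so the block stands
alone, while variants_agree and its sample values are hypothetical additions
that are not part of the report.

```c
#include <stddef.h>
#include <stdint.h>

/* The four functions from the test code above.  */
uint64_t func1a(uint64_t a, uint64_t b) { return a == 10 ? a : b; }
uint64_t func1b(uint64_t a, uint64_t b) { return a == 10 ? 10 : b; }
uint64_t func2a(uint64_t a, uint64_t b) { return a == 0x1234abcd ? a : b; }
uint64_t func2b(uint64_t a, uint64_t b) { return a == 0x1234abcd ? 0x1234abcd : b; }

/* Hypothetical helper: returns 1 if the 'a' and 'b' variants agree on
   every pair of sample inputs, 0 otherwise.  */
int variants_agree(void)
{
    static const uint64_t samples[] = {
        0, 9, 10, 11, 0x1234abcd, 0x1234abce, UINT64_MAX
    };
    const size_t n = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (func1a(samples[i], samples[j]) != func1b(samples[i], samples[j]))
                return 0;
            if (func2a(samples[i], samples[j]) != func2b(samples[i], samples[j]))
                return 0;
        }
    }
    return 1;
}
```

This checks only a handful of inputs, so it illustrates the equivalence rather
than proving it; the proof is simply that `a` equals the constant on the branch
where it is returned.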

(All tests below are done in Compiler Explorer)

### Test 1, for ARM64 target

The ideal result is what's generated by Clang 20.1.0 (with '-O2' flag):

```assembly
func1a:
cmp x0, #10
csel    x0, x0, x1, eq
ret

func2a:
mov w8, #43981
movk    w8, #4660, lsl #16
cmp x0, x8
csel    x0, x0, x1, eq
ret
```

In gcc 15.1 (with '-O2' flag) this is generated instead:

```assembly
func1a:
cmp x0, 10
mov x0, 10
csel    x0, x1, x0, ne
ret
// func1b assembly is same as func1a

func2a:
mov x2, 43981
movk    x2, 0x1234, lsl 16
cmp x0, x2
csel    x0, x1, x0, ne
ret
func2b:
mov x2, 43981
movk    x2, 0x1234, lsl 16
cmp x0, x2
csel    x0, x0, x1, eq
ret
```

Note that (a) there's an unneeded 'mov' instruction in func1a, and (b) although
func2a and func2b have the same code size, their use of 'csel' instruction is
not identical (I suspect that means the code is not canonicalized to be the
same).

When compiled with gcc 15.1 with '-Os' flag, func1b got the ideal result, but
then the code becomes different from func1a. func1a has an unneeded 'mov'
instruction.

### Test 2, for x86-64 target

The ideal result is what's generated by Clang 20.1.0 (with '-O2' flag):

```assembly
.intel_syntax
func1a:
mov rax, rsi
cmp rdi, 10
cmove   rax, rdi
ret

func2a:
mov rax, rsi
cmp rdi, 305441741
cmove   rax, rdi
ret
```

In gcc 15.1 (with '-Os' flag) this is generated instead:

```assembly
func1a:
cmp rdi, 10
mov eax, 10
cmovne  rax, rsi
ret
func1b:
cmp rdi, 10
mov rax, rsi
cmove   rax, rdi
ret
func2a:
cmp rdi, 305441741
mov eax, 305441741
cmovne  rax, rsi
ret
func2b:
cmp rdi, 305441741
mov rax, rsi
cmove   rax, rdi
ret
```

(a) GCC converts a register 'mov' instruction into an immediate-operand 'mov'
for func1a and func2a cases (which suggests GCC can know that the variable 'a'
has a fixed value after the equality check), however,
(b) GCC misses that the 'a' variants of both functions can convert to the 'b'
variants. The 'b' variants have the ideal size I want for '-Os'.

So for both tests for the two architectures, GCC didn't fully utilize registers
when they have known, fixed values when planning register move instructions.

[Bug target/121457] New: R_X86_64_CODE_6_GOTTPOFF configure test is broken with binutils --enable-targets=all

2025-08-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121457

Bug ID: 121457
   Summary: R_X86_64_CODE_6_GOTTPOFF configure test is broken with
binutils --enable-targets=all
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
CC: hjl.tools at gmail dot com
  Target Milestone: ---

r14-9154-g7f2cf0c45f4ba7 adds a check for R_X86_64_CODE_6_GOTTPOFF support, but
it makes a bad assumption.

+# Check if gas and gld support "addq %r23,foo@GOTTPOFF(%rip), %r15"
+# with R_X86_64_CODE_6_GOTTPOFF relocation.
+if echo "$ld_ver" | grep GNU > /dev/null; then
+  if $gcc_cv_ld -V 2>/dev/null | grep elf_x86_64_sol2 > /dev/null; then
+ld_ix86_gld_64_opt="-melf_x86_64_sol2"
+  else
+ld_ix86_gld_64_opt="-melf_x86_64"
+  fi

This will pass -melf_x86_64_sol2 on x86_64-pc-linux-gnu if binutils is built
with --enable-targets=all, and then gcc_cv_as_x86_64_code_6_gottpoff=no is set.

We should check first if $target is *solaris* (or something like that) and only
then have this fallback check.

[Bug c++/121335] Vulkan module ICE

2025-08-07 Thread kongmingd234 at proton dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121335

--- Comment #8 from kongmingd234  ---
You were right on both counts; clearing the folder worked. The errors I'm
getting are real errors I believe, as copying more code from the vulkan
tutorial reduced the errors.
Now I just have to figure out how to get vulkan to work.

[Bug rtl-optimization/121456] GCC doesn't fully utilize registers with known values when making 'mov' instructions

2025-08-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121456

Andrew Pinski  changed:

   What|Removed |Added

  Component|other   |rtl-optimization
   Keywords||missed-optimization
   Severity|normal  |enhancement
