date:20240403

[Bug c++/114569] New: GCC accepts forming pointer to function type which is ref qualified

2024-04-03 Thread jlame646 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114569

Bug ID: 114569
   Summary: GCC accepts forming pointer to function type which is
ref qualified
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jlame646 at gmail dot com
  Target Milestone: ---

In the following program, `#2` is accepted by all compilers but `#1` is rejected.
Shouldn't `#2` also be rejected for the same reason?
https://godbolt.org/z/sMraETcbx

```
#include  
template 
struct Decompose;

template 
struct Decompose {
using Type = T;
};

template 
using FTDecay = typename Decompose::Type;



// static_assert(std::is_same_v, int (*)() &>);  //#1: all reject this as expected
 FTDecay x{};  //#2: all accept this, why?
```

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Target||arm

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread clyon at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

--- Comment #2 from Christophe Lyon  ---
I think the last -march option overrides the previous one(s).

I'd say the test should use an effective-target which checks that linking is
actually OK rather than just a compile OK test. Not sure if an adequate one
already exists, but there are already plenty :-)

[Bug libquadmath/114533] libquadmath: printf: fix misaligned access on args

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114533

--- Comment #12 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:8455d6f6cd43b7b143ab9ee19437452fceba9cc9

commit r14-9769-g8455d6f6cd43b7b143ab9ee19437452fceba9cc9
Author: Jakub Jelinek 
Date:   Wed Apr 3 10:02:35 2024 +0200

libquadmath: Don't assume the storage for __float128 arguments is aligned [PR114533]

With the register_printf_type/register_printf_modifier/register_printf_specifier
APIs the C library is just told the size of the argument and is provided with
a callback to fetch the argument from va_list using va_arg into C library
provided memory.  The C library isn't told what alignment requirement it has,
but we were using direct load of a __float128 value from that memory which
assumes __alignof (__float128) alignment.

The following patch fixes that by using memcpy instead.

I haven't been able to reproduce an actual crash, tried
#include <quadmath.h>
#include <stdlib.h>
#include <stdio.h>

int main ()
{
  __float128 r;
  int prec = 20;
  int width = 46;
  char buf[128];

  r = 2.0q;
  r = sqrtq (r);
  int n = quadmath_snprintf (buf, sizeof buf, "%+-#*.20Qe", width, r);
  if ((size_t) n < sizeof buf)
    printf ("%s\n", buf);
    /* Prints: +1.41421356237309504880e+00 */
  quadmath_snprintf (buf, sizeof buf, "%Qa", r);
  if ((size_t) n < sizeof buf)
    printf ("%s\n", buf);
    /* Prints: 0x1.6a09e667f3bcc908b2fb1366ea96p+0 */
  n = quadmath_snprintf (NULL, 0, "%+-#46.*Qe", prec, r);
  if (n > -1)
    {
      char *str = malloc (n + 1);
      if (str)
        {
          quadmath_snprintf (str, n + 1, "%+-#46.*Qe", prec, r);
          printf ("%s\n", str);
          /* Prints: +1.41421356237309504880e+00 */
        }
      free (str);
    }
  printf ("%+-#*.20Qe\n", width, r);
  printf ("%Qa\n", r);
  printf ("%+-#46.*Qe\n", prec, r);
  printf ("%d %Qe %d %Qe %d %Qe\n", 1, r, 2, r, 3, r);
  return 0;
}
In any case, I think memcpy for loading from it is right.

2024-04-03  Simon Chopin  
Jakub Jelinek  

PR libquadmath/114533
* printf/printf_fp.c (__quadmath_printf_fp): Use memcpy to copy
__float128 out of args.
* printf/printf_fphex.c (__quadmath_printf_fphex): Likewise.

Signed-off-by: Simon Chopin
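
As an illustration of the technique the commit message describes (this is not the libquadmath code itself), the sketch below reads a value out of storage whose alignment is unknown by copying its bytes with memcpy rather than dereferencing a typed pointer. The names `load_unaligned` and `round_trip` are hypothetical, and `double` stands in for `__float128` so the example does not depend on a GCC extension.

```cpp
#include <cstring>

// Hypothetical helper: read a T from storage whose alignment is unknown.
// memcpy makes no alignment assumption, unlike dereferencing a T* that
// points at the same bytes.
template <typename T>
T load_unaligned(const void *storage)
{
    T value;
    std::memcpy(&value, storage, sizeof value);
    return value;
}

// Hypothetical demonstration: store a double at an odd offset inside a
// byte buffer, then read it back through load_unaligned.
inline double round_trip(double input)
{
    unsigned char buffer[sizeof(double) + 1];
    std::memcpy(buffer + 1, &input, sizeof input);  // deliberately misaligned
    return load_unaligned<double>(buffer + 1);
}
```

Calling `round_trip(1.5)` should hand back the same value, because only bytes are copied and no aligned load is ever issued on the buffer.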

[Bug gcov-profile/113765] [14 Regression] ICE: autofdo: val-profiler-threads-1.c compilation, error: probability of edge from entry block not initialized

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113765

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-04-03
   Assignee|unassigned at gcc dot gnu.org  |erozen at microsoft dot com

[Bug tree-optimization/114476] [13/14 Regression] wrong code with -fwrapv -O3 -fno-vect-cost-model (and -march=armv9-a+sve2 on aarch64 and -march=rv64gcv on riscv)

2024-04-03 Thread rdapp at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114476

--- Comment #8 from Robin Dapp  ---
I tried some things (for the related bug without -fwrapv) then got busy with
some other things.  I'm going to have another look later this week.

[Bug demangler/54254] libiberty: demangling for global constructor is broken since r167781

2024-04-03 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54254

--- Comment #6 from Eric Gallager  ---
(In reply to Andrew Pinski from comment #4)
> *** Bug 56755 has been marked as a duplicate of this bug. ***

symbol from this one was _GLOBAL__sub_I__ZN4AMOS12ContigEdge_t5NCODEE

[Bug tree-optimization/114551] [14 Regression] wrong code at -O3 on x86_64-linux-gnu since r14-2944

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114551

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #6 from Richard Biener  ---
It can be reproduced with -O2 -funswitch-loops -fsplit-loops.

Loop splitting emits

   [local count: 14598063]:
  a.0_1 = a;
  _2 = a.0_1 + -1;
  a = _2;
  _24 = _2 <= 0;
  _10 = 2147483647 - _2;
  if (_10 <= 2)

and the 2147483647 - _2 expression then overflows, so that's definitely
wrong.  This is built here:

/* Build a condition that will skip the first loop when the
   guard condition won't ever be true (or false).  */
gimple_seq stmts2;
border = force_gimple_operand (border, &stmts2, true, NULL_TREE);
if (stmts2)  

or rather in split_at_bb_p and put into '*border'
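
To make the overflow concrete, here is a small sketch, separate from the GCC sources, that evaluates `2147483647 - x` in a wider type and reports whether the result still fits in `int`. It assumes a 32-bit `int` and a 64-bit `long long`; the function name `int_max_minus_overflows` is hypothetical.

```cpp
#include <climits>

// Hypothetical helper: would INT_MAX - value overflow int?
// The subtraction is done in long long (assumed wider than int), so the
// check itself cannot overflow.
inline bool int_max_minus_overflows(int value)
{
    long long wide = static_cast<long long>(INT_MAX) - value;
    return wide > INT_MAX;
}
```

Any negative `value` makes the mathematical result exceed `INT_MAX`, which is the situation the emitted `_10 = 2147483647 - _2` can run into when `_2` is negative.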

[Bug fortran/113956] [13/14 Regression] ice in gfc_trans_pointer_assignment, at fortran/trans-expr.cc:10524

2024-04-03 Thread pault at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113956

Paul Thomas  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org

--- Comment #6 from Paul Thomas  ---
I'll take it.

Paul

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #13 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:cab32bacaea268ec062b1fb4fc662d90c9d1cfce

commit r14-9775-gcab32bacaea268ec062b1fb4fc662d90c9d1cfce
Author: H.J. Lu 
Date:   Mon Feb 26 08:38:58 2024 -0800

tree-profile: Disable indirect call profiling for IFUNC resolvers

We can't profile indirect calls to IFUNC resolvers nor their callees as
it requires TLS which hasn't been set up yet when the dynamic linker is
resolving IFUNC symbols.

Add an IFUNC resolver caller marker to cgraph_node and set it if the
function is called by an IFUNC resolver.  Disable indirect call profiling
for IFUNC resolvers and their callees.

Tested with profiledbootstrap on Fedora 39/x86-64.

gcc/ChangeLog:

PR tree-optimization/114115
* cgraph.h (symtab_node): Add check_ifunc_callee_symtab_nodes.
(cgraph_node): Add called_by_ifunc_resolver.
* cgraphunit.cc (symbol_table::compile): Call
symtab_node::check_ifunc_callee_symtab_nodes.
* symtab.cc (check_ifunc_resolver): New.
(ifunc_ref_map): Likewise.
(is_caller_ifunc_resolver): Likewise.
(symtab_node::check_ifunc_callee_symtab_nodes): Likewise.
* tree-profile.cc (gimple_gen_ic_func_profiler): Disable indirect
call profiling for IFUNC resolvers and their callees.

gcc/testsuite/ChangeLog:

PR tree-optimization/114115
* gcc.dg/pr114115.c: New test.

[Bug middle-end/111632] gcc fails to bootstrap when using libc++

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111632

--- Comment #25 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Iain D Sandoe:

https://gcc.gnu.org/g:e95ab9e60ce1d9aa7751d79291133fd5af9209d7

commit r13-8572-ge95ab9e60ce1d9aa7751d79291133fd5af9209d7
Author: Francois-Xavier Coudert 
Date:   Sat Mar 16 09:50:00 2024 +0100

libcc1: fix <vector> include

Use INCLUDE_VECTOR before including system.h, instead of directly
including <vector>, to avoid running into poisoned identifiers.

Signed-off-by: Dimitry Andric 

PR middle-end/111632

libcc1/ChangeLog:

* libcc1plugin.cc: Fix include.
* libcp1plugin.cc: Fix include.

(cherry picked from commit 5213047b1d50af63dfabb5e5649821a6cb157e33)

[Bug lto/114574] [14 regression] ICE when building curl with LTO (fld_incomplete_type_of, at ipa-free-lang-data.cc:257)

2024-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114574

--- Comment #1 from Sam James  ---
reducing

[Bug lto/114574] [14 regression] ICE when building curl with LTO (fld_incomplete_type_of, at ipa-free-lang-data.cc:257)

2024-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114574

--- Comment #2 from Sam James  ---
Reduced:
```
struct X509_algor_st sk_X509_ALGOR_copyfunc(const struct X509_algor_st *);
struct X509_algor_st {
} PKCS8_pkey_get0(const struct X509_algor_st **) {
}
```

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-04-03 Thread burnus at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

Tobias Burnus  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Tobias Burnus  ---
FIXED on mainline (= GCC 14).

Namely, the following was fixed. All of those issues involve compiling with
'-g' such that 'mkoffload' generates also a GCN .o file for which the ELF flag
has to match the other .o files.

(A) The issue of comment 0: ELF Flag mismatch if GCC was configured with a
--with-arch=... that does not match the default setting.
→ Fix: See comment 6

Earlier fixes, only vaguely related to comment 0:

(B)
* Compiler default was changed to gfx900 but mkoffload still had Fiji as
default
* Race in handling the debug files
→ Fix: See comment 1

(C)
* Fixed issues related to xnack/sram-ecc, which also lead to ELF flag
mismatches
→ Fix: See comment 4

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-03 Thread hubicka at ucw dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #15 from Jan Hubicka  ---
> Fixed for GCC 14 so far
It is a simple patch, so backporting is OK after a week in mainline.

[Bug middle-end/111632] gcc fails to bootstrap when using libc++

2024-04-03 Thread iains at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111632

--- Comment #26 from Iain Sandoe  ---
NOTE: I adjusted the PR lines in the commit header so that the commits get
reflected on the PR.

[Bug c++/114537] bit_cast does not work NSDMI of bitfields

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114537

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-04-03
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Created attachment 57863
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57863&action=edit
gcc14-pr114537.patch

Untested fix.

[Bug c++/114569] GCC accepts forming pointer to function type which is ref qualified

2024-04-03 Thread jlame646 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114569

--- Comment #2 from Jason Liam  ---
(In reply to Marek Polacek from comment #1)
> So the code should compile.

But https://timsong-cpp.github.io/cppwp/n4950/dcl.ptr#4.sentence-2 says:

> [Note 1: [...] Forming a function pointer type is ill-formed if the function 
> type has cv-qualifiers or a ref-qualifier; see [dcl.fct]. [...]]

And since `FTDecay` is `int (*)() &`, this should be ill-formed?

[Bug c++/71482] Add -Wglobal-constructors warning option

2024-04-03 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71482

Eric Gallager  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54254

--- Comment #6 from Eric Gallager  ---
Another reason this warning might be wanted: name mangling and demangling of
global constructors has been buggy for a while now; see bug 54254.

[Bug lto/114574] New: [14 regression] ICE when building curl with LTO (internal compiler error: in fld_incomplete_type_of, at ipa-free-lang-data.cc:257)

2024-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114574

Bug ID: 114574
   Summary: [14 regression] ICE when building curl with LTO
(internal compiler error: in fld_incomplete_type_of,
at ipa-free-lang-data.cc:257)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57861
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57861&action=edit
libcurl_la-curl_ntlm_core.i.xz

Tonnes of these failures. Just picked curl at random.

```
$ x86_64-pc-linux-gnu-gcc -m32 -mfpmath=sse -DHAVE_CONFIG_H
-I/var/tmp/portage/net-misc/curl-8.7.1-r1/work/curl-8.7.1/include -I../lib
-I/var/tmp/portage/net-misc/curl-8.7.1-r1/work/curl-8.7.1/lib
-DBUILDING_LIBCURL -DCURL_HIDDEN_SYMBOLS -fvisibility=hidden -O3 -pipe
-march=native -fdiagnostics-color=always -flto -fno-vect-cost-model
-fpermissive -Werror-implicit-function-declaration -c
/var/tmp/portage/net-misc/curl-8.7.1-r1/work/curl-8.7.1/lib/curl_ntlm_core.c 
-fPIC -DPIC -o .libs/libcurl_la-curl_ntlm_core.o
during IPA pass: *free_lang_data
/var/tmp/portage/net-misc/curl-8.7.1-r1/work/curl-8.7.1/lib/curl_ntlm_core.c:665:1:
internal compiler error: in fld_incomplete_type_of, at
ipa-free-lang-data.cc:257
  665 | }
  | ^
0x55f1b9e2de0f fld_incomplete_type_of
   
/usr/src/debug/sys-devel/gcc-14.0./gcc-14.0./gcc/ipa-free-lang-data.cc:257
0x55f1bb41a2ad fld_simplified_type
   
/usr/src/debug/sys-devel/gcc-14.0./gcc-14.0./gcc/ipa-free-lang-data.cc:344
0x55f1bb41a2ad free_lang_data_in_type
   
/usr/src/debug/sys-devel/gcc-14.0./gcc-14.0./gcc/ipa-free-lang-data.cc:439
0x55f1bbaa5ad0 free_lang_data_in_cgraph
   
/usr/src/debug/sys-devel/gcc-14.0./gcc-14.0./gcc/ipa-free-lang-data.cc:1072
0x55f1bbaa5ad0 free_lang_data
   
/usr/src/debug/sys-devel/gcc-14.0./gcc-14.0./gcc/ipa-free-lang-data.cc:1109
0x55f1bbaa5ad0 execute
   
/usr/src/debug/sys-devel/gcc-14.0./gcc-14.0./gcc/ipa-free-lang-data.cc:1176
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
```

'gcc -c libcurl_la-curl_ntlm_core.i -O2 -flto' is enough to reproduce. I last
built curl fine on 30th March, apparently.

```
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/14/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-14.0./work/gcc-14.0./configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/14
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/14/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/14/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=yes,extra,rtl,df
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo 14.0. p,
commit 7bbfb01a32b73842f8908de028703510a0e12057' --with-gcc-major-version-only
--enable-libstdcxx-time --enable-lto --disable-libstdcxx-pch --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
--enable-multilib --with-multilib-list=m32,m64 --disable-fixed-point
--enable-targets=all --enable-libgomp --disable-libssp --disable-libada
--disable-cet --disable-systemtap --disable-valgrind-annotations
--disable-vtable-verify --disable-libvtv --with-zstd --without-isl
--enable-default-pie --enable-host-pie --disable-host-bind-now
--enable-default-ssp --disable-fixincludes --with-build-config='bootstrap-O3
bootstrap-lto'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240403 (experimental)
8455d6f6cd43b7b143ab9ee19437452fceba9cc9 (Gentoo 14.0. p, commit
7bbfb01a32b73842f8908de028703510a0e12057)
```

[Bug tree-optimization/114485] [13/14 Regression] Wrong code with -O3 -march=rv64gcv on riscv or `-O3 -march=armv9-a` for aarch64

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114485

Richard Biener  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #8 from Richard Biener  ---
(In reply to Robin Dapp from comment #4)
> Yes, the vectorization looks ok.  The extracted live values are not used
> afterwards and therefore the whole vectorized loop is being thrown away.
> Then we do one iteration of the epilogue loop, inverting the original c and
> end up with -8 instead of 8.  This is pretty similar to what's happening in
> the related PR.
> 
> We properly populate the phi in question in
> slpeel_update_phi_nodes_for_guard1:
> 
> c_lsm.7_64 = PHI <_56(23), pretmp_34(17)>
> 
> but vect_update_ivs_after_vectorizer changes that into
> 
> c_lsm.7_64 = PHI .
> 
> Just as a test, commenting out
> 
>   if (!LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
>   vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
> update_e);
> 
> at least makes us keep the VEC_EXTRACT and not fail anymore.

I'll note that on x86_64 we do the same and not fail the testcase.  x86
cannot use partial vectors because we don't implement EXTRACT_LAST,
so that might be the "key" to the failure (partial vectors). And we
might need to "fail" vectorization of the special inductions when
using them?

This might be also out-of-sync handling of which ones we handle with
vect_update_ivs_after_vectorizer and which ones with
vectorizable_live_operation - as indeed we do generate the EXTRACT_LAST here.

[Bug tree-optimization/114555] ICE: definition in block 14 does not dominate use in block 15 at -O and above with _BitInt() bitfield

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114555

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
   Last reconfirmed||2024-04-03
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #2 from Jakub Jelinek  ---
Created attachment 57862
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57862&action=edit
gcc14-pr114555.patch

So far lightly tested patch.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-03 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

H.J. Lu  changed:

   What|Removed |Added

  Known to work||14.0

--- Comment #14 from H.J. Lu  ---
Fixed for GCC 14 so far

[Bug rtl-optimization/114515] [14 Regression] Failure to use aarch64 lane forms after PR101523

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515

Tamar Christina  changed:

   What|Removed |Added

 CC||tnfchris at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114575

--- Comment #10 from Tamar Christina  ---
This has also broken our addressing modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114575

[Bug libgomp/113867] [14 Regression][OpenMP] Wrong code with mapping pointers in structs

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113867

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-03

[Bug middle-end/114552] [13 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990 |[13 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990

--- Comment #8 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug middle-end/111632] gcc fails to bootstrap when using libc++

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111632

--- Comment #24 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Iain D Sandoe:

https://gcc.gnu.org/g:68057560ff1fc0fb2df38c2f9627a20c9a8da5c5

commit r13-8571-g68057560ff1fc0fb2df38c2f9627a20c9a8da5c5
Author: Francois-Xavier Coudert 
Date:   Thu Mar 7 14:36:03 2024 +0100

Include safe-ctype.h after C++ standard headers, to avoid over-poisoning

When building gcc's C++ sources against recent libc++, the poisoning of
the ctype macros due to including safe-ctype.h before including C++
standard headers such as , , etc, causes many compilation
errors, similar to:

  In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
  In file included from /home/dim/src/gcc/master/gcc/system.h:233:
  In file included from /usr/include/c++/v1/vector:321:
  In file included from
  /usr/include/c++/v1/__format/formatter_bool.h:20:
  In file included from
  /usr/include/c++/v1/__format/formatter_integral.h:32:
  In file included from /usr/include/c++/v1/locale:202:
  /usr/include/c++/v1/__locale:546:5: error: '__abi_tag__' attribute
  only applies to structs, variables, functions, and namespaces
546 | _LIBCPP_INLINE_VISIBILITY
| ^
  /usr/include/c++/v1/__config:813:37: note: expanded from macro
  '_LIBCPP_INLINE_VISIBILITY'
813 | #  define _LIBCPP_INLINE_VISIBILITY _LIBCPP_HIDE_FROM_ABI
| ^
  /usr/include/c++/v1/__config:792:26: note: expanded from macro
  '_LIBCPP_HIDE_FROM_ABI'
792 |
__attribute__((__abi_tag__(_LIBCPP_TOSTRING(
  _LIBCPP_VERSIONED_IDENTIFIER
|  ^
  In file included from /home/dim/src/gcc/master/gcc/gensupport.cc:23:
  In file included from /home/dim/src/gcc/master/gcc/system.h:233:
  In file included from /usr/include/c++/v1/vector:321:
  In file included from
  /usr/include/c++/v1/__format/formatter_bool.h:20:
  In file included from
  /usr/include/c++/v1/__format/formatter_integral.h:32:
  In file included from /usr/include/c++/v1/locale:202:
  /usr/include/c++/v1/__locale:547:37: error: expected ';' at end of
  declaration list
547 | char_type toupper(char_type __c) const
| ^
  /usr/include/c++/v1/__locale:553:48: error: too many arguments
  provided to function-like macro invocation
553 | const char_type* toupper(char_type* __low, const
char_type* __high) const
|^
  /home/dim/src/gcc/master/gcc/../include/safe-ctype.h:146:9: note:
  macro 'toupper' defined here
146 | #define toupper(c) do_not_use_toupper_with_safe_ctype
| ^

This is because libc++ uses different transitive includes than
libstdc++, and some of those transitive includes pull in various ctype
declarations (typically via ).

There was already a special case for including  before
safe-ctype.h, so move the rest of the C++ standard header includes to
the same location, to fix the problem.

PR middle-end/111632

gcc/ChangeLog:

* system.h: Include safe-ctype.h after C++ standard headers.

Signed-off-by: Dimitry Andric 
(cherry picked from commit 9970b576b7e4ae337af1268395ff221348c4b34a)

[Bug rtl-optimization/114575] New: [14 Regression] SVE addressing modes broken since g:839bc42772ba7af66af3bd16efed4a69511312ae

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114575

Bug ID: 114575
   Summary: [14 Regression] SVE addressing modes broken since
g:839bc42772ba7af66af3bd16efed4a69511312ae
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64*

Created attachment 57864
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57864&action=edit
addr.cc

Since the offending commit, compiling the example attached with -O3
-march=armv8.5-a+sve2 results in:

.L14:
        lsl     x0, x1, 1
        add     x1, x1, 8
        add     x2, x3, x0
        add     x6, x0, x4
        ld1h    z2.h, p7/z, [x2]
        ld1h    z22.h, p7/z, [x6]
        add     x0, x0, x5
        fmad    z22.h, p7/m, z21.h, z2.h
        ld1h    z20.h, p7/z, [x0]
        fmad    z20.h, p7/m, z26.h, z22.h
        st1h    z20.h, p7, [x2]
        cmp     x1, 808
        bne     .L14

instead of what it was before the commit:

.L14:
        ld1h    z2.h, p7/z, [x1, x0, lsl 1]
        ld1h    z22.h, p7/z, [x2, x0, lsl 1]
        ld1h    z20.h, p7/z, [x3, x0, lsl 1]
        fmad    z22.h, p7/m, z21.h, z2.h
        fmad    z20.h, p7/m, z26.h, z22.h
        st1h    z20.h, p7, [x1, x0, lsl 1]
        add     x0, x0, 8
        cmp     x0, 808
        bne     .L14

It's now no longer pushing in the register shifts. This causes significant
performance loss as it needs to now perform the integer ALU ops before doing
the load and they're on the critical path.

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #17 from Jakub Jelinek  ---
Fixed.

[Bug demangler/54254] libiberty: demangling for global constructor is broken since r167781

2024-04-03 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54254

--- Comment #7 from Eric Gallager  ---
(In reply to Andrew Pinski from comment #5)
> *** Bug 90039 has been marked as a duplicate of this bug. ***

Symbol for this one was _GLOBAL__sub_I__Z11print_tracev

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

--- Comment #5 from Maxim Kuvyrkov  ---
Looking at this problem more, I think the issue is due to ARM target trying
hard to avoid UNSUPPORTED tests, instead of embracing them.

For the vectorization NEON check we have ...
===
proc check_effective_target_arm_neon_ok_nocache { } {
    global et_arm_neon_flags
    set et_arm_neon_flags ""
    if { [check_effective_target_arm32] } {
        foreach flags {"" "-mfloat-abi=softfp" "-mfpu=neon" "-mfpu=neon -mfloat-abi=softfp" "-mfpu=neon -mfloat-abi=softfp -march=armv7-a" "-mfloat-abi=hard" "-mfpu=neon -mfloat-abi=hard" "-mfpu=neon -mfloat-abi=hard -march=armv7-a"} {
            if { [check_no_compiler_messages_nocache arm_neon_ok object {
                #include <arm_neon.h>
...
===
... where target tries to find a set of flags compatible with _any_ of the
built multilibs to run the testsuite.

I think this is excessive, since each multilib should be tested on its own
merits, and if armv7-m does not support vectorization, there should be no
effort to try and switch to armv7-a or armv8-m+mve multilib in order to run
vectorization tests.  In other words, vectorization tests should be marked
UNSUPPORTED in armv7-m, and PASS/FAIL in armv7-a and/or armv8-m+mve.

In practical terms, my proposed solution to this problem is to remove all
"foreach flags" options except for the default "".

ARM maintainers, what am I missing?

[Bug c++/114573] -Wzero-as-null-pointer-constant complains on enum with explicit cast

2024-04-03 Thread ossman at cendio dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114573

--- Comment #2 from Pierre Ossman  ---
Indeed. It is part of an effort to have a more modern C++ style in TigerVNC.
One item was preferring nullptr over NULL, and this issue became an obstacle
there.

Right now, we did a #pragma, but if there is a better workaround, then we are
all ears.

[Bug lto/114574] [14 regression] ICE when building curl with LTO (fld_incomplete_type_of, at ipa-free-lang-data.cc:257) since r14-9763

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114574

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
Summary|[14 regression] ICE when building curl with LTO (fld_incomplete_type_of, at ipa-free-lang-data.cc:257) |[14 regression] ICE when building curl with LTO (fld_incomplete_type_of, at ipa-free-lang-data.cc:257) since r14-9763
   Last reconfirmed||2024-04-03
 Status|UNCONFIRMED |NEW
   Target Milestone|--- |14.0
 CC||jakub at gcc dot gnu.org, uecker at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Jakub Jelinek  ---
Started with r14-9763-g871bb5ad2dd56343d80b6a6d269e85efdce5

[Bug demangler/59518] C++ demangler does not handle some global constructor & LTO names

2024-04-03 Thread egallager at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59518

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #2 from Eric Gallager  ---
(In reply to Andrew Pinski from comment #1)
> _GLOBAL__sub_ issue is PR 54254

How about the others?

[Bug c++/114571] -Wzero-as-null-pointer-constant does not complain about NULL

2024-04-03 Thread ossman at cendio dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114571

--- Comment #3 from Pierre Ossman  ---
And another odd case; gcc 5 complains about this:

> const char *a;
> a = NULL;

but not:

> const char *a = NULL;

gcc 13 complains about neither, and clang about both.

[Bug c++/114569] GCC accepts forming pointer to function type which is ref qualified

2024-04-03 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114569

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #1 from Marek Polacek  ---
[dcl.fct]/10:
A function type with a cv-qualifier-seq or a ref-qualifier shall appear only
as: 
-- [...]
-- the type-id of a template-argument for a type-parameter

So the code should compile.
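
The two permitted contexts that matter for this report can be sketched as follows. This is a minimal illustration, not the reporter's test case: `Holder`, `Widget` and `call_through_member_pointer` are hypothetical names, and the sketch deliberately does not form a plain pointer to the ref-qualified function type, which is the point under dispute.

```cpp
#include <type_traits>

// Hypothetical wrapper that simply hands back its type argument.
template <typename T>
struct Holder {
    using Type = T;
};

// Permitted: a ref-qualified function type used as the type-id of a
// template-argument for a type-parameter.
using RefQualified = Holder<int() &>::Type;
static_assert(std::is_function<RefQualified>::value,
              "int() & is still a function type");

struct Widget {
    int value() & { return 42; }
};

// Also permitted: the function type to which a pointer to member refers.
inline int call_through_member_pointer()
{
    int (Widget::*pm)() & = &Widget::value;
    Widget w;               // an lvalue, as the & qualifier requires
    return (w.*pm)();
}
```

Whether a template specialization that later turns such an argument into a pointer type must be diagnosed is exactly what the remaining comments on this bug debate.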

[Bug target/112397] Two persistent libstdc++ test failures on x86_64-apple-darwin

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112397

--- Comment #12 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Iain D Sandoe:

https://gcc.gnu.org/g:ae11f0154116f4e5fa8769b1ea1600b1b1c22958

commit r13-8577-gae11f0154116f4e5fa8769b1ea1600b1b1c22958
Author: Iain Sandoe 
Date:   Thu Feb 8 17:54:31 2024 +

libstdc++, Darwin: Handle a linker warning [PR112397].

Darwin's linker warns when we make a direct branch to code that is
in a weak definition (citing that if a different implementation of
the weak function is chosen by the dynamic linker this would be an
error).

As the analysis in the PR shows, this can happen when we have hot/
cold partitioning and there is an error path that is primarily cold
but makes use of epilogue code in the hot section.  In this simple
case, we can easily deduce that the code is in fact safe; however
that is not something we can realistically implement in the linker.

Since the user-replaceable allocators are implemented using weak
definitions, this is a warning that is frequently flagged up in both
the testsuite and end-user code.

The chosen solution here is to suppress the hot/cold partitioning for
these cases (it is unlikely to impact performance much c.f. the
actual allocation).

PR target/112397

libstdc++-v3/ChangeLog:

* configure: Regenerate.
* configure.ac: Detect if we are building for Darwin.
* libsupc++/Makefile.am: If we are building for Darwin, then
suppress hot/cold partitioning for the array allocators.
* libsupc++/Makefile.in: Regenerated.

Signed-off-by: Iain Sandoe 
Co-authored-by: Jonathan Wakely 
(cherry picked from commit 1609fdff16f17ead37666f6d0e801800ee3d04d2)

[Bug testsuite/112297] Failure of pr100936.c on x86_64-apple-darwin21

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112297

--- Comment #4 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Iain D Sandoe:

https://gcc.gnu.org/g:44514fde12e2a8f75fca88fdd6ff7a0e678ac966

commit r13-8573-g44514fde12e2a8f75fca88fdd6ff7a0e678ac966
Author: Francois-Xavier Coudert 
Date:   Mon Dec 11 09:26:23 2023 +0100

Testsuite: restrict test to nonpic targets

The test is currently failing on x86_64-apple-darwin.

gcc/testsuite/ChangeLog:

PR testsuite/112297
* gcc.target/i386/pr100936.c: Require nonpic target.

(cherry picked from commit 02f562484c17522d79a482ac702a5fa3c2dfdd10)

[Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:03039744f368a24a452e4ea8d946e9c2cedaf1aa

commit r14-9768-g03039744f368a24a452e4ea8d946e9c2cedaf1aa
Author: Jakub Jelinek 
Date:   Wed Apr 3 09:59:45 2024 +0200

expr: Fix up emit_push_insn [PR114552]

r13-990 added optimizations in multiple spots to optimize during
expansion storing of constant initializers into targets.
In the load_register_parameters and expand_expr_real_1 cases,
it checks it has a tree as the source and so knows we are reading
that whole decl's value, so the code is fine as is, but in the
emit_push_insn case it checks for a MEM from which something
is pushed and checks for SYMBOL_REF as the MEM's address, but
still assumes the whole object is copied, which as the following
testcase shows might not always be the case.  In the testcase,
k is 6 bytes, then 2 bytes of padding, then another 4 bytes,
while the emit_push_insn wants to store just the 6 bytes.

The following patch simply verifies it is the whole initializer
that is being stored; I think that is the best thing to do so late
in the GCC 14 cycle, as well as for backporting.

For GCC 15, perhaps the code could stop requiring it must be at offset zero,
nor that the size is equal, but could use
get_symbol_constant_value/fold_ctor_reference gimple-fold APIs to actually
extract just part of the initializer if we e.g. push just some subset
(of course, still verify that it is a subset).  For sizes which are power
of two bytes and we have some integer modes, we could use as type for
fold_ctor_reference corresponding integral types, otherwise dunno, punt
or use some structure (e.g. try to find one in the initializer?), whatever.
But even in the other spots it could perhaps handle loading of
COMPONENT_REFs or MEM_REFs from the .rodata vars.

2024-04-03  Jakub Jelinek  

PR middle-end/114552
* expr.cc (emit_push_insn): Only use store_constructor for
immediate_const_ctor_p if int_expr_size matches size.

* gcc.c-torture/execute/pr114552.c: New test.
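
The layout the commit message describes (6 bytes, 2 bytes of padding, then 4 more bytes) can be sketched as follows. The struct `Padded` and its member names are hypothetical, not the actual testcase. The offsets assume a typical ABI where a 32-bit integer is 4-byte aligned:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the object in the testcase: a 6-byte member
// followed by a 4-byte member.  On ABIs where std::int32_t is 4-byte
// aligned, the compiler inserts 2 padding bytes after `k`.
struct Padded {
    char k[6];
    std::int32_t tail;
};

// Size of the part that a partial copy of just `k` would touch.
constexpr std::size_t payload_bytes() { return sizeof(Padded::k); }

// Where the second member starts (payload plus padding).
constexpr std::size_t tail_offset() { return offsetof(Padded, tail); }

// Size of the whole object, which is what a whole-initializer store covers.
constexpr std::size_t whole_size() { return sizeof(Padded); }
```

The gap between `payload_bytes()` and `whole_size()` is the reason a store of the whole initializer is not interchangeable with a push of only the first member.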

[Bug c++/114571] New: -Wzero-as-null-pointer-constant does not complain about NULL

2024-04-03 Thread ossman at cendio dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114571

Bug ID: 114571
   Summary: -Wzero-as-null-pointer-constant does not complain
about NULL
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ossman at cendio dot se
  Target Milestone: ---

Created attachment 57857
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57857&action=edit
Test case

We are looking at bringing up the TigerVNC project to a more modern C++ style,
and one thing was using nullptr instead of NULL. We were very glad when we
found -Wzero-as-null-pointer-constant to help out with this.

Unfortunately, it doesn't seem to do much for NULL in modern¹ gcc. It spots
usage of "0", but not any "NULL". clang has no trouble finding both.

I've attached a test case with some comments on the cases we've seen.

¹ gcc 5 spots some, but not all NULL
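
As background on why such a migration matters, here is a small sketch. The names `Picked`, `which`, `pick_for_zero` and `pick_for_nullptr` are hypothetical and this is not the attached test case. With an integer overload and a pointer overload, a literal `0` selects the integer one while `nullptr` selects the pointer one:

```cpp
// Hypothetical tag describing which overload was selected.
enum class Picked { Integer, Pointer };

// Two overloads that differ only in taking an integer or a pointer.
inline Picked which(int) { return Picked::Integer; }
inline Picked which(const char*) { return Picked::Pointer; }

// A literal 0 is an int first, so the integer overload is an exact match.
inline Picked pick_for_zero() { return which(0); }

// nullptr has type std::nullptr_t and converts only to pointer types.
inline Picked pick_for_nullptr() { return which(nullptr); }
```

The sketch deliberately avoids calling `which(NULL)`, since how that call resolves depends on how the implementation defines `NULL`.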

[Bug rtl-optimization/113682] Branches in branchless binary search rather than cmov/csel/csinc

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682

--- Comment #9 from Tamar Christina  ---
(In reply to Andrew Pinski from comment #8)
> This might be the path splitting running on the gimple level causing issues
> too; see PR 112402 .

Ah that's a good shout.  It looks like Richi already agrees that we should
recognize/do some ifcvt at GIMPLE.

Guess that just leaves the where.

[Bug c++/114573] -Wzero-as-null-pointer-constant complains on enum with explicit cast

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114573

Richard Biener  changed:

   What|Removed |Added

   Keywords||diagnostic
   Last reconfirmed||2024-04-03
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Richard Biener  ---
Confirmed.  Note this diagnostic isn't enabled with -Wall or -Wextra.

[Bug c++/114573] New: -Wzero-as-null-pointer-constant complains on enum with explicit cast

2024-04-03 Thread ossman at cendio dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114573

Bug ID: 114573
   Summary: -Wzero-as-null-pointer-constant complains on enum with
explicit cast
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ossman at cendio dot se
  Target Milestone: ---

g++ complains about the following code when -Wzero-as-null-pointer-constant is
given:

> enum { ZERO, ONE, TWO };
> 
> extern int func(const char *a);
> 
> void zeroenum()
> {
>     func((const char*)ZERO);
> }

Oddly enough, it also gives the wrong location for the issue:

> nulls.cxx: In function ‘void zeroenum()’:
> nulls.cxx:8:1: warning: zero as null pointer constant 
> [-Wzero-as-null-pointer-constant]
> 8 | }
>   | ^

clang does not complain about this, and neither do older versions of gcc (tested
with 5.5.0). So it's some form of regression.


The above example is a bit contrived, but this pattern is moderately common in
that an argument can be either a pointer or an integer. E.g. CopyFromParent in
libX11, or FLTK menus.

[Bug c++/114572] [OpenMP] "internal compiler error: in assign_temp" with assignment operator and lastprivate clause

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114572

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Guess nobody expected copy assignment operator to return something that would
need to be destructed in the lastprivate clause handling.
Normally copy assignment operators return reference, not the object being
modified by value.
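
A minimal sketch of the difference, using hypothetical types `ReturnsByValue` and `ReturnsByReference` rather than the reporter's class: a copy assignment operator that returns by value produces a temporary that must be destroyed after every assignment, while the conventional reference-returning form does not.

```cpp
// Unusual form: operator= returns a copy of *this, so each assignment
// creates a temporary whose destructor runs afterwards.
struct ReturnsByValue {
    static int destroyed;
    ReturnsByValue() {}
    ReturnsByValue(const ReturnsByValue&) {}
    ~ReturnsByValue() { ++destroyed; }
    ReturnsByValue operator=(const ReturnsByValue&) { return *this; }
};
int ReturnsByValue::destroyed = 0;

// Conventional form: operator= returns a reference, no temporary involved.
struct ReturnsByReference {
    static int destroyed;
    ReturnsByReference() {}
    ReturnsByReference(const ReturnsByReference&) {}
    ~ReturnsByReference() { ++destroyed; }
    ReturnsByReference& operator=(const ReturnsByReference&) { return *this; }
};
int ReturnsByReference::destroyed = 0;

// Counts destructor calls caused by the assignment expression itself,
// measured before the two local objects go out of scope.
template <typename T>
int temporaries_destroyed_by_assignment() {
    T a;
    T b;
    const int before = T::destroyed;
    a = b;
    return T::destroyed - before;
}
```

This only shows the extra temporary in ordinary code; it makes no claim about how the lastprivate handling inside the compiler is implemented.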

[Bug tree-optimization/114555] ICE: definition in block 14 does not dominate use in block 15 at -O and above with _BitInt() bitfield

2024-04-03 Thread zsojka at seznam dot cz via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114555

--- Comment #1 from Zdenek Sojka  ---
Created attachment 57860
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57860&action=edit
another testcase, failing with -O -fno-tree-forwprop

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O -fno-tree-forwprop testcase2.c 
testcase2.c: In function 'foo':
testcase2.c:9:1: error: definition in block 14 does not dominate use in block 15
9 | foo(void)
  | ^~~
for SSA_NAME: _17 in statement:
_16 = PHI <_15(2), _17(15)>
PHI argument
_17
for PHI node
_16 = PHI <_15(2), _17(15)>
during GIMPLE pass: bitintlower
testcase2.c:9:1: internal compiler error: verify_ssa failed
0x177b02f verify_ssa(bool, bool)
/repo/gcc-trunk/gcc/tree-ssa.cc:1203
0x13cc2c5 execute_function_todo
/repo/gcc-trunk/gcc/passes.cc:2095
0x13cc72e execute_todo
/repo/gcc-trunk/gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug middle-end/114563] ggc_internal_alloc is slow

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563

--- Comment #5 from Richard Biener  ---
Btw, I'd say for the sake of avoiding virtual memory fragmentation free_unit
should be equal to GGC_QUIRE_SIZE.  But we should possibly merge adjacent
entries we don't free to power-of-two chunks and possibly have alloc_page
split a larger page when the small ones are exhausted (at least up to
GGC_QUIRE_SIZE?).

I'll also note that while we chunk G.pagesize allocs
with GGC_QUIRE_SIZE we don't do that for the larger allocations?  We also
immediately split to G.pagesize instead of also filling the larger orders
with free pages.  Maybe because we don't chunk the larger orders we shouldn't
force them to be released in large chunks only either?

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Tobias Burnus:

https://gcc.gnu.org/g:b2460d621efe740bd95ad41afef6d806ec1bd9c7

commit r14-9770-gb2460d621efe740bd95ad41afef6d806ec1bd9c7
Author: Tobias Burnus 
Date:   Wed Apr 3 12:37:39 2024 +0200

GCN: Fix --with-arch= handling in mkoffload [PR111966]

The default -march= setting used in mkoffload did not reflect the modified
default set by GCC's configure-time --with-arch=, causing issues when
generating debug code.

gcc/ChangeLog:

PR other/111966
* config/gcn/mkoffload.cc (get_arch): New; moved -march= flag
handling from ...
(main): ... here; call it to handle --with-arch config option
and -march= commandline.

[Bug sanitizer/79341] Many Asan tests fail on s390

2024-04-03 Thread iii at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #76 from Ilya Leoshkevich  ---
It's because the sanitizer runtime was copied from LLVM to GCC.  I will post a
patch removing the unsupported MSan and DFSan from the error message.

[Bug c++/114571] -Wzero-as-null-pointer-constant does not complain about NULL

2024-04-03 Thread ossman at cendio dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114571

--- Comment #2 from Pierre Ossman  ---
Found another case that neither gcc 5, gcc 13, nor clang complain about for
some odd reason:

>  assert(thing == NULL);

All three complain about:

>  assert(thing == 0);

Not sure what's going on here.

[Bug c++/114572] New: [OpenMP] "internal compiler error: in assign_temp" with assignment operator and lastprivate clause

2024-04-03 Thread j.reuter--- via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114572

Bug ID: 114572
   Summary: [OpenMP] "internal compiler error: in assign_temp"
with assignment operator and lastprivate clause
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: j.reu...@fz-juelich.de
  Target Milestone: ---

Created attachment 57859
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57859&action=edit
Preprocessed output GCC 13 (Fedora)

Using the following source code, GCC 11 and 12 on Ubuntu 22.04 and 13 on Fedora
39 abort with an internal compiler error. I was also able to reproduce the
issue on Godbolt with GCC trunk:


struct c1
{
    ~c1(){}
    c1 operator=(const c1& other)
    {
        return *this;
    }
};

int main( void )
{
    c1 c;
    #pragma omp parallel for lastprivate(c)
    for(int i = 0; i < 10; ++i){}
}


Compiling the code with:
g++ -fopenmp gcc-error.cpp


fails with:
during RTL pass: expand
gcc-error.cpp: In function ‘main._omp_fn.0’:
gcc-error.cpp:16:5: internal compiler error: in assign_temp, at function.cc:988
   16 | for(int i = 0; i < 10; ++i){}
  | ^~~
Please submit a full bug report, with preprocessed source.
See  for instructions.
Preprocessed source stored into /tmp/cck1AuEN.out file, please attach this to
your bugreport.


Removing both the copy assignment operator and the destructor fixes the issue.
Adding either of them independently causes the error to appear. The log file is
attached.

Godbolt: https://godbolt.org/z/nMve88eEo

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread clyon at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

--- Comment #4 from Christophe Lyon  ---
I'm wondering whether you missed check_effective_target_arm_arch_FUNC_link and
friends?

Maybe check_effective_target_arm_arch_v7a_neon_link would work here, but it
does not use the exact same flags.

[Bug middle-end/114570] GCC doesn't perform good loop invariant code motion for very long vector operations.

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114570

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-03

--- Comment #1 from Richard Biener  ---
There's no (gimple) invariant motion after vector operation lowering.  RTL
invariant motion should see this though but it might have a prohibiting cost
model (or doesn't handle [stack] memory?).

There are other reasons we want to move vector lowering earlier, which might then
also catch invariant motion opportunities.

[Bug c++/114480] g++: internal compiler error: Segmentation fault signal terminated program cc1plus

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480

--- Comment #17 from GCC Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e7b7188b1cf8c174f0e890d4ac279ff480b51043

commit r14-9767-ge7b7188b1cf8c174f0e890d4ac279ff480b51043
Author: Richard Biener 
Date:   Tue Apr 2 12:31:04 2024 +0200

tree-optimization/114557 - reduce ehcleanup peak memory use

The following reduces peak memory use for the PR114480 testcase at -O1
which is almost exclusively spent by the ehcleanup pass in allocating
PHI nodes.  The free_phinodes cache we maintain isn't very effective
since it has effectively two slots, one for 4 and one for 9 argument
PHIs and it is only ever used for allocations up to 9 arguments but
we put all larger PHIs in the 9 argument bucket.  This proves
uneffective resulting in much garbage to be kept when incrementally
growing PHI nodes by edge redirection.

The mitigation is to rely on the GC freelist for larger sizes and
thus immediately return all larger bucket sized PHIs to it via ggc_free.

This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
from 359s to 168s.

PR tree-optimization/114557
PR tree-optimization/114480
* tree-phinodes.cc (release_phi_node): Return PHIs from
allocation buckets not covered by free_phinodes to GC.
(remove_phi_node): Release the PHI LHS before freeing the
PHI node.
* tree-vect-loop.cc (vectorizable_live_operation): Get PHI lhs
before releasing it.

[Bug tree-optimization/114557] ehcleanup cleanup_empty_eh_merge_phis eats a lot of memory

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114557

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:e7b7188b1cf8c174f0e890d4ac279ff480b51043

commit r14-9767-ge7b7188b1cf8c174f0e890d4ac279ff480b51043
Author: Richard Biener 
Date:   Tue Apr 2 12:31:04 2024 +0200

tree-optimization/114557 - reduce ehcleanup peak memory use

The following reduces peak memory use for the PR114480 testcase at -O1
which is almost exclusively spent by the ehcleanup pass in allocating
PHI nodes.  The free_phinodes cache we maintain isn't very effective
since it has effectively two slots, one for 4 and one for 9 argument
PHIs and it is only ever used for allocations up to 9 arguments but
we put all larger PHIs in the 9 argument bucket.  This proves
uneffective resulting in much garbage to be kept when incrementally
growing PHI nodes by edge redirection.

The mitigation is to rely on the GC freelist for larger sizes and
thus immediately return all larger bucket sized PHIs to it via ggc_free.

This reduces the peak memory use from 19.8GB to 11.3GB and compile-time
from 359s to 168s.

PR tree-optimization/114557
PR tree-optimization/114480
* tree-phinodes.cc (release_phi_node): Return PHIs from
allocation buckets not covered by free_phinodes to GC.
(remove_phi_node): Release the PHI LHS before freeing the
PHI node.
* tree-vect-loop.cc (vectorizable_live_operation): Get PHI lhs
before releasing it.

[Bug target/114570] New: GCC doesn't perform good loop invariant code motion for very long vector operations.

2024-04-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114570

Bug ID: 114570
   Summary: GCC doesn't perform good loop invariant code motion
for very long vector operations.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

typedef float v128_32 __attribute__((vector_size (128 * 4), aligned(2048)));
v128_32
foo (v128_32 a, v128_32 b, v128_32 c, int n)
{
    for (int i = 0; i != 2048; i++)
    {
        a = a / c;
        a = a / b;
    }
    return a;
}

   [local count: 1063004408]:
  # a_13 = PHI 
  # ivtmp_2 = PHI 
  # DEBUG i => NULL
  # DEBUG a => NULL
  # DEBUG BEGIN_STMT
  _14 = BIT_FIELD_REF ;
  _15 = BIT_FIELD_REF ;
  _10 = _14 / _15;
  _11 = BIT_FIELD_REF ;
  _12 = BIT_FIELD_REF ;
  _16 = _11 / _12;
  _17 = BIT_FIELD_REF ;
  _18 = BIT_FIELD_REF ;
  _19 = _17 / _18;
  _20 = BIT_FIELD_REF ;
  _21 = BIT_FIELD_REF ;
  _22 = _20 / _21;
  _23 = BIT_FIELD_REF ;
  _24 = BIT_FIELD_REF ;
  _25 = _23 / _24;
  _26 = BIT_FIELD_REF ;
  _27 = BIT_FIELD_REF ;
  _28 = _26 / _27;
  _29 = BIT_FIELD_REF ;
  _30 = BIT_FIELD_REF ;
  _31 = _29 / _30;
  _32 = BIT_FIELD_REF ;
  _33 = BIT_FIELD_REF ;
  _34 = _32 / _33;
  _35 = BIT_FIELD_REF ;
  _36 = BIT_FIELD_REF ;
  _37 = _35 / _36;
  _38 = BIT_FIELD_REF ;
  _39 = BIT_FIELD_REF ;
  _40 = _38 / _39;
  _41 = BIT_FIELD_REF ;
  _42 = BIT_FIELD_REF ;
  _43 = _41 / _42;
  _44 = BIT_FIELD_REF ;
  _45 = BIT_FIELD_REF ;
  _46 = _44 / _45;
  _47 = BIT_FIELD_REF ;
  _48 = BIT_FIELD_REF ;
  _49 = _47 / _48;
  _50 = BIT_FIELD_REF ;
  _51 = BIT_FIELD_REF ;
  _52 = _50 / _51;
  _53 = BIT_FIELD_REF ;
  _54 = BIT_FIELD_REF ;
  _55 = _53 / _54;
  _56 = BIT_FIELD_REF ;
  _57 = BIT_FIELD_REF ;
  _58 = _56 / _57;
  # DEBUG a => {_10, _16, _19, _22, _25, _28, _31, _34, _37, _40, _43, _46,
_49, _52, _55, _58}
  # DEBUG BEGIN_STMT
  _59 = BIT_FIELD_REF ;
  _60 = _10 / _59;
  _61 = BIT_FIELD_REF ;
  _62 = _16 / _61;
  _63 = BIT_FIELD_REF ;
  _64 = _19 / _63;
  _65 = BIT_FIELD_REF ;
  _66 = _22 / _65;
  _67 = BIT_FIELD_REF ;
  _68 = _25 / _67;
  _69 = BIT_FIELD_REF ;
  _70 = _28 / _69;
  _71 = BIT_FIELD_REF ;
  _72 = _31 / _71;
  _73 = BIT_FIELD_REF ;
  _74 = _34 / _73;
  _75 = BIT_FIELD_REF ;
  _76 = _37 / _75;
  _77 = BIT_FIELD_REF ;
  _78 = _40 / _77;
  _79 = BIT_FIELD_REF ;
  _80 = _43 / _79;
  _81 = BIT_FIELD_REF ;
  _82 = _46 / _81;
  _83 = BIT_FIELD_REF ;
  _84 = _49 / _83;
  _85 = BIT_FIELD_REF ;
  _86 = _52 / _85;
  _87 = BIT_FIELD_REF ;
  _88 = _55 / _87;
  _89 = BIT_FIELD_REF ;
  _90 = _58 / _89;
  a_9 = {_60, _62, _64, _66, _68, _70, _72, _74, _76, _78, _80, _82, _84, _86,
_88, _90};
  # DEBUG a => a_9
  # DEBUG BEGIN_STMT
  # DEBUG i => NULL
  # DEBUG a => a_9
  # DEBUG BEGIN_STMT
  ivtmp_1 = ivtmp_2 + 4294967295;
  if (ivtmp_1 != 0)
goto ; [98.99%]
  else
goto ; [1.01%]

Ideally, those BIT_FIELD_REF can be hoisted out and 
# a_13 = PHI  can be optimized with those 256-bit vectors.

We finally generate

foo:
pushq   %rbp
movq%rdi, %rax
movl$2048, %edx
movq%rsp, %rbp
subq$408, %rsp
leaq-120(%rsp), %r8
.L2:
vmovaps 16(%rbp), %ymm15
vmovaps 48(%rbp), %ymm14
movq%r8, %rsi
vdivps  1040(%rbp), %ymm15, %ymm15
vmovaps 80(%rbp), %ymm13
vmovaps 112(%rbp), %ymm12
vdivps  528(%rbp), %ymm15, %ymm15
vdivps  1072(%rbp), %ymm14, %ymm14
vmovaps 144(%rbp), %ymm11
vmovaps 176(%rbp), %ymm10
vdivps  560(%rbp), %ymm14, %ymm14
vdivps  1104(%rbp), %ymm13, %ymm13
vmovaps 208(%rbp), %ymm9
vmovaps 240(%rbp), %ymm8
vdivps  592(%rbp), %ymm13, %ymm13
vdivps  1136(%rbp), %ymm12, %ymm12
vmovaps 272(%rbp), %ymm7
vmovaps 304(%rbp), %ymm6
vdivps  624(%rbp), %ymm12, %ymm12
vdivps  1168(%rbp), %ymm11, %ymm11
vmovaps 336(%rbp), %ymm5
vdivps  656(%rbp), %ymm11, %ymm11
vdivps  1200(%rbp), %ymm10, %ymm10
vdivps  1232(%rbp), %ymm9, %ymm9
vdivps  688(%rbp), %ymm10, %ymm10
vdivps  720(%rbp), %ymm9, %ymm9
vdivps  1264(%rbp), %ymm8, %ymm8
vdivps  1296(%rbp), %ymm7, %ymm7
vdivps  752(%rbp), %ymm8, %ymm8
vdivps  784(%rbp), %ymm7, %ymm7
vdivps  1328(%rbp), %ymm6, %ymm6
movl$64, %ecx
vdivps  816(%rbp), %ymm6, %ymm6
leaq16(%rbp), %rdi
vdivps  1360(%rbp), %ymm5, %ymm5
vdivps  848(%rbp), %ymm5, %ymm5
vmovaps 368(%rbp), %ymm4
vmovaps 400(%rbp), %ymm3
vdivps  1392(%rbp), %ymm4, %ymm4
vdivps  1424(%rbp), %ymm3, %ymm3
vmovaps 432(%rbp), %ymm2
vmovaps 464(%rbp), %ymm1
vdivps  880(%rbp), %ymm4, %ymm4
vdivps  912(%rbp), %ymm3, %ymm3
vmovaps

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org,
   ||rearnsha at gcc dot gnu.org

--- Comment #1 from Maxim Kuvyrkov  ---
The test now fails with linker error:
.../arm-eabi/bin/ld: error: /tmp/cc2Q27GE.o: conflicting architecture profiles A/M

This is due to command line having
-mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp -mfpu=auto ...
-mfpu=neon -mfloat-abi=softfp -march=armv7-a

The first part comes from toolchain configuration settings, and the second part
(-mfpu=neon -mfloat-abi=softfp -march=armv7-a) comes from
check_effective_target_arm_neon_ok_nocache().

Surprisingly (to me), GCC accepts such mixed options, which makes
check_effective_target_arm_neon_ok_nocache() succeed, since it's only doing a
compilation test.  The linker, though, fails.

Richard E., is it expected that GCC accepts conflicting -march= options?

[Bug middle-end/114563] ggc_internal_alloc is slow

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563

Richard Biener  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
Note this is likely because of release_pages keeping a large freelist when
using madvise.  After r14-9767 this improved to

   5.15% 35482  cc1plus  cc1plus   [.] ggc_internal_alloc

I've tried a quick hack to use the 'prev' field to implement a skip-list,
skipping to the next page entry with a different size.  That works
reasonably well but it also shows the freelist is heavily fragmented.

N: M P Q
11: 98767 19662 17321
21: 176918 68336 27167
31: 228676 164683 27185

that's stats after N alloc_page which M times finds a free page to re-use,
in that process P times using the skip-list to skip at least one entry and
Q times following the ->next link directly.

It does get alloc_page from the profile.

It might be worth keeping the list sorted in ascending ->bytes order with
this, making pagesize allocations O(1) and other sizes O(constant).

Of course using N buckets would be the straight-forward thing but then
release_pages would be complicated, esp. malloc page groups I guess.

But as said, this is likely a symptom of the MADVISE path keeping too many
page entries for the testcase, so another attack vector is to more
aggressively release them.  I don't know how fragmented they are;
we don't seem to try sorting them before unmapping the >= free_unit chunks.

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

--- Comment #3 from Maxim Kuvyrkov  ---
Changing from compile-only to link test is as simple as changing "object" to
"executable" in
[check_no_compiler_messages_nocache arm_neon_ok object ...].

However, this pattern of checking for ARM architectural features is shared
by 20+ check_effective_target_arm_* routines.  IMO, we should either update all
of these to be link tests (unless there is a good reason to keep them as
compile-only that we can document in the comments), or just accept this
vectorization test failure on ARM targets that don't support vectorization.

[Bug c++/114480] g++: internal compiler error: Segmentation fault signal terminated program cc1plus

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480

--- Comment #18 from Richard Biener  ---
Btw, clang is quite quick with -O0 (8s, 1GB ram) but with -O1 uses 18GB ram and
8 minutes compile-time.

[Bug middle-end/114563] ggc_internal_alloc is slow

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563

--- Comment #3 from Richard Biener  ---
Created attachment 57856
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57856&action=edit
quick skip-list patch

Before:

> /usr/bin/time ./cc1plus -quiet -o /dev/null /tmp/a-test-poly.ii -O
173.29user 3.25system 2:56.59elapsed 99%CPU (0avgtext+0avgdata
11311472maxresident)k
0inputs+0outputs (0major+2867887minor)pagefaults 0swaps

After:

> /usr/bin/time ./cc1plus -quiet -o /dev/null /tmp/a-test-poly.ii -O
161.23user 3.15system 2:44.44elapsed 99%CPU (0avgtext+0avgdata
11308852maxresident)k
0inputs+0outputs (0major+2868137minor)pagefaults 0swaps

The patch uses the ->prev pointer to point to the previous entry of the
next entry with differing ->bytes.  I re-compute the pointers from scratch
during release_pages and update during alloc/free but do not merge ranges
when allocating.

It would be possible to compute/update the skip list pointer during the
walk itself at a bit of extra cost there.

As said, if we manage to maintain a sorted free_pages list this might
speed up the walk some more as we can stop after checking the right-size
chunk.

Doing a better job in release_pages for the madvise case would be good as well.

I wonder if anybody has a preference / priority?
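
The skip-pointer idea can be sketched roughly as below. This is a simplified illustration, not the attached patch: the names `PageEntry`, `skip`, `rebuild_skip_pointers`, `find_entry` and `hops_to_find` are hypothetical, and the sketch uses a dedicated forward pointer where the patch reuses the ->prev field. Each entry points at the first later entry with a different ->bytes, so a lookup jumps over whole runs of equal-sized entries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified free-list entry.
struct PageEntry {
    std::size_t bytes = 0;
    PageEntry* next = nullptr;  // ordinary singly linked list
    PageEntry* skip = nullptr;  // first later entry with a different size
};

// Recompute every skip pointer from scratch by walking runs of equal size.
inline void rebuild_skip_pointers(PageEntry* head) {
    PageEntry* run_start = head;
    while (run_start != nullptr) {
        PageEntry* run_end = run_start;
        while (run_end != nullptr && run_end->bytes == run_start->bytes)
            run_end = run_end->next;
        for (PageEntry* p = run_start; p != run_end; p = p->next)
            p->skip = run_end;
        run_start = run_end;
    }
}

// Find the first entry of the requested size, counting skip-pointer hops.
inline PageEntry* find_entry(PageEntry* head, std::size_t bytes,
                             std::size_t& hops) {
    hops = 0;
    PageEntry* p = head;
    while (p != nullptr && p->bytes != bytes) {
        p = p->skip;
        ++hops;
    }
    return p;
}

// Build a list from the given sizes and report how many hops a lookup takes.
inline std::size_t hops_to_find(const std::vector<std::size_t>& sizes,
                                std::size_t wanted) {
    std::vector<PageEntry> entries(sizes.size());
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        entries[i].bytes = sizes[i];
        if (i + 1 < sizes.size())
            entries[i].next = &entries[i + 1];
    }
    PageEntry* head = entries.empty() ? nullptr : &entries.front();
    rebuild_skip_pointers(head);
    std::size_t hops = 0;
    find_entry(head, wanted, hops);
    return hops;
}
```

With sizes 4096, 4096, 4096, 8192, 8192, 16384, looking up 16384 takes two hops instead of five ->next steps. The sketch does not model unlinking or updating the pointers on alloc/free.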

[Bug middle-end/114563] ggc_internal_alloc is slow

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114563

--- Comment #4 from Richard Biener  ---
Created attachment 57858
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57858&action=edit
better release_pages

Ah, and it's not so much fragmentation but large free_unit (1MB) that's hard
to get to.  The attached sorts the pages before releasing contiguous spaces.

Printing all > G.pagesize occurrences yields the list below, where the first
column is the number of times (sort -n | uniq -c) and the rest is the size of
the contiguous area and how many page entries it is spread across.  (I notice
we don't merge the page entries either, but that would not obviously be a good
thing, I guess.)

  5 8192 in 1
465 8192 in 2
410 12288 in 3
317 16384 in 4
162 20480 in 5
158 24576 in 6
145 28672 in 7
 20 32768 in 1
 94 32768 in 8
 81 36864 in 9
 59 40960 in 10
 61 45056 in 11
 50 49152 in 12
  1 49152 in 6
 27 53248 in 13
 20 57344 in 14
 13 61440 in 15
  5 65536 in 1
 14 65536 in 16
  5 65536 in 2
  3 69632 in 17
  1 73728 in 18
  1 77824 in 19
  1 81920 in 20
  1 86016 in 21
  1 94208 in 23
  2 98304 in 2
  1 114688 in 14
  1 118784 in 29
  1 126976 in 31
  2 131072 in 1
  1 131072 in 16
  2 131072 in 3
  1 155648 in 38
  1 159744 in 39
  1 167936 in 41
  1 196608 in 2
  1 204800 in 50
  6 204800 in 7
  4 229376 in 3
  2 262144 in 1
  1 278528 in 34
  1 344064 in 42
  1 393216 in 48
  1 524288 in 1
  1 544768 in 133
  2 565248 in 138
  1 569344 in 139
  1 573440 in 140
  1 737280 in 90
  1 835584 in 204
  1 884736 in 54
  1 999424 in 61

-- below is then released
  2 1048576 in 1
  1 1048576 in 2
  1 1310720 in 65
  1 1359872 in 129
  1 1400832 in 149
  1 1597440 in 8
  1 1605632 in 392
  3 1605632 in 7
  1 1835008 in 7
  1 1867776 in 8
  1 2023424 in 494
  3 3145728 in 6
  3 3211264 in 7
  1 4194304 in 1
  1 4685824 in 319
  1 8388608 in 1
  1 8617984 in 439
  1 8896512 in 450
  1 9363456 in 297
  1 9732096 in 480
  1 9740288 in 508
  1 9764864 in 478
  1 10485760 in 2560
  2 14049280 in 616
  3 14336000 in 628
  1 16777216 in 1
  1 33554432 in 1
  1 85975040 in 1966
  1 210501632 in 3040
  1 275062784 in 3586
  1 302637056 in 1298
  1 339976192 in 3853
  1 429260800 in 2384
  1 476405760 in 4017
  1 556277760 in 3465
  1 561905664 in 4372
  1 563216384 in 4379
  1 645537792 in 5451
  1 1515700224 in 8173
  1 1524654080 in 8195
  1 1525112832 in 8196

[Bug libstdc++/114401] libstdc++ allocator destructor omitted when reinserting node_handle into tree- and hashtable-based containers

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114401

--- Comment #5 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:47ebdbe5bf71d9eb260359b6aceb5cb071d97acd

commit r13-8570-g47ebdbe5bf71d9eb260359b6aceb5cb071d97acd
Author: Jonathan Wakely 
Date:   Thu Mar 21 13:25:15 2024 +

libstdc++: Destroy allocators in re-inserted container nodes [PR114401]

The allocator objects in container node handles were not being destroyed
after the node was re-inserted into a container. They are stored in a
union and so need to be explicitly destroyed when the node becomes
empty. The containers were zeroing the node handle's pointer, which
makes it empty, causing the handle's destructor to think there's nothing
to clean up.

Add a new member function to the node handle which destroys the
allocator and zeros the pointer. Change the containers to call that
instead of just changing the pointer manually.

We can also remove the _M_empty member of the union which is not
necessary.

libstdc++-v3/ChangeLog:

PR libstdc++/114401
* include/bits/hashtable.h (_Hashtable::_M_reinsert_node): Call
release() on node handle instead of just zeroing its pointer.
(_Hashtable::_M_reinsert_node_multi): Likewise.
(_Hashtable::_M_merge_unique): Likewise.
(_Hashtable::_M_merge_multi): Likewise.
* include/bits/node_handle.h (_Node_handle_common::release()):
New member function.
(_Node_handle_common::_Optional_alloc::_M_empty): Remove
unnecessary union member.
(_Node_handle_common): Declare _Hashtable as a friend.
* include/bits/stl_tree.h (_Rb_tree::_M_reinsert_node_unique):
Call release() on node handle instead of just zeroing its
pointer.
(_Rb_tree::_M_reinsert_node_equal): Likewise.
(_Rb_tree::_M_reinsert_node_hint_unique): Likewise.
(_Rb_tree::_M_reinsert_node_hint_equal): Likewise.
* testsuite/23_containers/multiset/modifiers/114401.cc: New test.
* testsuite/23_containers/set/modifiers/114401.cc: New test.
* testsuite/23_containers/unordered_multiset/modifiers/114401.cc:
New test.
* testsuite/23_containers/unordered_set/modifiers/114401.cc: New
test.

(cherry picked from commit c2e28df90a1640cebadef6c6c8ab5ea964071bb1)
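
For context, the operation this commit touches is the re-insertion of an extracted node handle. The following is only a minimal sketch, assuming a C++17 standard library; the helper name `reinsert_node` is hypothetical and does not appear in the commit.

```cpp
#include <set>
#include <utility>

// Hypothetical helper: moves the node holding `key` from `from` to `to`
// without copying the element. Returns true if the node was re-inserted
// and the handle was left empty afterwards.
bool reinsert_node(std::set<int>& from, std::set<int>& to, int key)
{
    auto nh = from.extract(key);   // handle now owns the node and an allocator copy
    if (nh.empty())
        return false;              // key was not present in `from`
    auto result = to.insert(std::move(nh));
    // After a successful insert the container owns the node again and the
    // handle is empty. This is the point where the library also has to
    // destroy the allocator stored in the handle.
    return result.inserted && nh.empty();
}
```

The sketch only exercises the public interface; whether the stored allocator is actually destroyed depends on the library version in use.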

[Bug libstdc++/113841] Can't swap two std::hash

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841

--- Comment #14 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:87ec5b369eed205dfe6802afaaec3986b246ade9

commit r13-8569-g87ec5b369eed205dfe6802afaaec3986b246ade9
Author: Jonathan Wakely 
Date:   Fri Feb 9 17:06:20 2024 +

libstdc++: Constrain std::vector default constructor [PR113841]

This is needed to avoid errors outside the immediate context when
evaluating is_default_constructible_v> when A is not
default constructible.

To avoid diagnostic regressions for 23_containers/vector/48101_neg.cc we
need to make the std::allocator partial specializations default
constructible, which they probably should have been anyway.

libstdc++-v3/ChangeLog:

PR libstdc++/113841
* include/bits/allocator.h (allocator): Add default
constructor to partial specializations for cv-qualified types.
* include/bits/stl_vector.h (_Vector_impl::_Vector_impl()):
Constrain so that it's only present if the allocator is default
constructible.
* include/bits/stl_bvector.h (_Bvector_impl::_Bvector_impl()):
Likewise.
* testsuite/23_containers/vector/cons/113841.cc: New test.

(cherry picked from commit 142cc4c223d695e515ed2504501b91d8a7ac6eb8)

[Bug libstdc++/114367] std::vector constexpr initialization doesn't start lifetime of array members

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114367

--- Comment #5 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:d8d71b19f0b1e28fd6d413a6874ec55c568865b0

commit r13-8568-gd8d71b19f0b1e28fd6d413a6874ec55c568865b0
Author: Jonathan Wakely 
Date:   Mon Mar 18 13:00:17 2024 +

libstdc++: Begin lifetime of storage in std::vector [PR114367]

This doesn't cause a problem with GCC, but Clang correctly diagnoses a
bug in the code. The objects in the allocated storage need to begin
their lifetime before we start using them.

This change uses the allocator's construct function instead of using
std::construct_at directly, in order to support fancy pointers.

libstdc++-v3/ChangeLog:

PR libstdc++/114367
* include/bits/stl_bvector.h (_M_allocate): Use allocator's
construct function to begin lifetime of words.

(cherry picked from commit 16afbd9c9c4282d56062cef95e6eccfdcf3efe03)

[Bug rtl-optimization/101523] Huge number of combine attempts

2024-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #53 from Richard Biener  ---
So just to recap, with reverting the change and instead doing

diff --git a/gcc/combine.cc b/gcc/combine.cc
index a4479f8d836..ff25752cac4 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -4186,6 +4186,10 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1,
rtx_insn *i0,
   adjust_for_new_dest (i3);
 }

+  bool i2_unchanged = false;
+  if (rtx_equal_p (newi2pat, PATTERN (i2)))
+i2_unchanged = true;
+
   /* We now know that we can do this combination.  Merge the insns and
  update the status of registers and LOG_LINKS.  */

@@ -4752,6 +4756,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1,
rtx_insn *i0,
   combine_successes++;
   undo_commit ();

+  if (i2_unchanged)
+return i3;
+
   rtx_insn *ret = newi2pat ? i2 : i3;
   if (added_links_insn && DF_INSN_LUID (added_links_insn) < DF_INSN_LUID
(ret))
 ret = added_links_insn;

combine time is down from 79s (93%) to 3.5s (37%), quite a bit more than
with the currently installed patch which has combine down to 0.02s (0%).
But notably peak memory use is down from 9GB to 400MB (installed patch 340MB).

That was with a cross from x86_64-linux and a release checking build.

This change should avoid any code generation changes. I do think that if the
pattern doesn't change, what distribute_notes/links does should be a no-op
even to I2, so we can ignore added_{links,notes}_insn (not ignoring them
only provides a 50% speedup).

I like the 0% combine result of the installed patch but the regressions
observed probably mean this needs to be deferred to stage1.

[Bug target/85919] Incomplete transition to IFNs for scatter/gather support, drop vectorize.builtin_{gather,scatter} target hooks

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85919

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-03

--- Comment #3 from Andrew Pinski  ---
Confirmed.

[Bug target/85919] Incomplete transition to IFNs for scatter/gather support, drop vectorize.builtin_{gather,scatter} target hooks

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85919

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug fortran/113412] ATAN(Y,X) does not check arguments and generates wrong error message.

2024-04-03 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113412

--- Comment #6 from anlauf at gcc dot gnu.org ---
(In reply to kargls from comment #5)
> The pointers to expr->symtree is NULL.  This new patch catches your example.

It does, but behaves weird for some other cases.  Try:

program main
  complex :: c = 1.
  complex, parameter :: z = 1.
  print *, atan(c,c)
  print *, atan(z,z)
end

This gives now:

pr113412.f90:4:18:

4 |   print *, atan(c,c)
  |  1
Error: 'c' argument of 'atan' intrinsic at (1) must be the same type and kind
as 'c'
pr113412.f90:5:18:

5 |   print *, atan(z,z)
  |  1
Error: 'z' argument of 'atan' intrinsic at (1) must be the same type and kind
as 'z'


I wonder whether we can reuse existing checks for atan2 for the 2-argument
version of atan.

I tried the following:

diff --git a/gcc/fortran/intrinsic.cc b/gcc/fortran/intrinsic.cc
index c35f2bdd183..261d4229139 100644
--- a/gcc/fortran/intrinsic.cc
+++ b/gcc/fortran/intrinsic.cc
@@ -4370,6 +4370,11 @@ sort_actual (const char *name, gfc_actual_arglist **ap,
   if (a == NULL)
 goto do_sort;

+  if ((gfc_option.allow_std & GFC_STD_F2008) != 0
+  && strcmp(name, "atan") == 0
+  && !gfc_check_atan_2 (actual->expr, actual->next->expr))
+return false;
+
 whoops:
   gfc_error ("Too many arguments in call to %qs at %L", name, where);
   return false;


This is indeed sort of hackish and produces for testcase:

program main
  complex :: c = 1.
  print *, atan (c,c)
  print *, atan2(c,c)
end

pr113412.f90:3:17:

3 |   print *, atan (c,c)
  | 1
Error: 'x' argument of 'atan' intrinsic at (1) must be REAL
pr113412.f90:4:17:

4 |   print *, atan2(c,c)
  | 1
Error: 'y' argument of 'atan2' intrinsic at (1) must be REAL


Note that the name of the formal argument is now wrong, probably because
the association of actuals with formals is missing.

[Bug c++/86303] Constructor is not used for type conversion

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86303

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Andrew Pinski  ---
Note the rule change for C++17+ is the "Prvalue semantics" (which is described
here https://en.cppreference.com/w/cpp/language/copy_elision ). In C++11 and
C++14, copy elision is not required, so a copy is still needed. The code is
invalid because there is no way to do that copy in this case: the "copy
constructor" cannot bind a temporary.

Anyway, clang accepting it in C++11 and C++14 modes is a bug there and should be
reported back to them.

Note clang/msvc does not even use:
 auto_ptr(auto_ptr_ref);
for C++11: if I mark it as deleted, clang still accepts it.

Which makes me suspicious that they do the copy elision without also
checking whether the copy would be valid. In the case of GCC there was a bug
previously where the check was not done; it was fixed, and there were many
complaints back then, but that was over 10 years ago.
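
To illustrate why the copy cannot be done, here is a small sketch. The type `Holder` is a hypothetical stand-in for the class in the report, not the reporter's code; it only assumes a "copy constructor" that takes a non-const lvalue reference.

```cpp
#include <type_traits>

// Hypothetical stand-in for the auto_ptr-style class in the report:
// its "copy constructor" takes a non-const lvalue reference.
struct Holder {
    Holder() = default;
    Holder(Holder&) {}
};

// An lvalue can be copied ...
constexpr bool copy_from_lvalue = std::is_constructible<Holder, Holder&>::value;
// ... but a temporary cannot bind to Holder&, so that copy is ill-formed.
constexpr bool copy_from_temporary = std::is_constructible<Holder, Holder>::value;

static_assert(copy_from_lvalue, "copy from an lvalue is accepted");
static_assert(!copy_from_temporary, "copy from a temporary is rejected");
```

The trait check gives the same answer in every language mode, which is why C++11/C++14 need to diagnose the copy even if they would elide it.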

[Bug c/23872] .original dump weirdness

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23872

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #5 from Andrew Pinski  ---
The  issue was fixed with r0-73077-g953ff28998b59b which was
included in GCC 4.2.0.

The DECL_EXPR issue is still there.
Currently the code is:
case DECL_EXPR:
  print_declaration (pp, DECL_EXPR_DECL (node), spc, flags);
  is_stmt = false;
  break;

I will do a patch to wrap the printing in a `DECL_EXPR < ... >` so it becomes
obvious what it does.

[Bug c++/99426] [modules] failed to read compiled module cluster 1186: Bad file data

2024-04-03 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99426

--- Comment #7 from Patrick Palka  ---
There's a patch pending review at
https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647203.html

Until that's merged, one should be able to work around this error with a trunk
compiler by using --param=ggc-min-expand=1000 (or an even larger value) to
prevent garbage collection from occurring as often.

[Bug target/32775] init_machine_status doc bug

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32775

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||documentation
   Last reconfirmed|2010-02-21 00:52:44 |2024-4-3
 CC||pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
Note init_machine_status should really be a target hook rather than just some
random function pointer too.

The change in the function pointer type happened with r0-4-ge2500fedef1a1c
(aka PCH merge or rather use GC for most things).

[Bug target/36512] relocation overflow

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36512

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #5 from Andrew Pinski  ---
The problem here is related to intl/libintl.a getting compiled vs having
libiconv.dylib installed.

libiconv.dylib is not normally installed on darwin either.

[Bug target/114576] [14 regression]VEX-prefixed AES instruction without AVX enabled

2024-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114576

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
(In reply to Andrew Pinski from comment #2)
> Something like this should fix it (but I am not 100% sure it is correct nor
> can I test it):

This is IMHO not correct.
vaesenc etc. instructions can be used even if just -maes -mavx, not just -mvaes
-mavx512vl.
But, it is especially messy because -mvaes doesn't imply -maes, so IMHO if
somebody e.g. asks for -mvaes -mavx512vl -mno-aes and the insns don't use any
xmm16+ register, it would emit the insn using VEX encoding rather than EVEX, so
I think we need to use {evex} prefixes.

So I think we want:
--- gcc/config/i386/i386.md.jj  2024-03-18 10:33:27.983419363 +0100
+++ gcc/config/i386/i386.md 2024-04-04 00:17:48.818340648 +0200
@@ -568,13 +568,14 @@ (define_attr "unit" "integer,i387,sse,mm

 ;; Used to control the "enabled" attribute on a per-instruction basis.
 (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
-   x64_avx,x64_avx512bw,x64_avx512dq,aes,apx_ndd,
+   x64_avx,x64_avx512bw,x64_avx512dq,apx_ndd,
sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx,
   
avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,avx512f_512,
noavx512f,avx512bw,avx512bw_512,noavx512bw,avx512dq,
noavx512dq,fma_or_avx512vl,avx512vl,noavx512vl,avxvnni,
avx512vnnivl,avx512fp16,avxifma,avx512ifmavl,avxneconvert,
-   avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl"
+   avx512bf16vl,vpclmulqdqvl,avx_noavx512f,avx_noavx512vl,
+   aes_avx,vaes_avx512vl"
   (const_string "base"))

 ;; The (bounding maximum) length of an instruction immediate.
@@ -915,7 +916,6 @@ (define_attr "enabled" ""
   (symbol_ref "TARGET_64BIT && TARGET_AVX512BW")
 (eq_attr "isa" "x64_avx512dq")
   (symbol_ref "TARGET_64BIT && TARGET_AVX512DQ")
-(eq_attr "isa" "aes") (symbol_ref "TARGET_AES")
 (eq_attr "isa" "sse_noavx")
   (symbol_ref "TARGET_SSE && !TARGET_AVX")
 (eq_attr "isa" "sse2") (symbol_ref "TARGET_SSE2")
@@ -968,6 +968,10 @@ (define_attr "enabled" ""
   (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL")
 (eq_attr "isa" "apx_ndd")
   (symbol_ref "TARGET_APX_NDD")
+(eq_attr "isa" "aes_avx")
+  (symbol_ref "TARGET_AES && TARGET_AVX")
+(eq_attr "isa" "vaes_avx512vl")
+  (symbol_ref "TARGET_VAES && TARGET_AVX512VL")

 (eq_attr "mmx_isa" "native")
   (symbol_ref "!TARGET_MMX_WITH_SSE")
--- gcc/config/i386/sse.md.jj   2024-03-18 08:58:45.942772799 +0100
+++ gcc/config/i386/sse.md  2024-04-04 00:33:32.386194779 +0200
@@ -26277,75 +26277,79 @@ (define_insn "xop_vpermil23"
 

 (define_insn "aesenc"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
-   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
-  (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
+  [(set (match_operand:V2DI 0 "register_operand" "=x,x,x,v")
+   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,x,v")
+  (match_operand:V2DI 2 "vector_operand" "xja,xm,xm,vm")]
  UNSPEC_AESENC))]
   "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
   "@
aesenc\t{%2, %0|%0, %2}
vaesenc\t{%2, %1, %0|%0, %1, %2}
+   %{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}
vaesenc\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,aes,avx512vl")
+  [(set_attr "isa" "noavx,aes_avx,vaes_avx512vl,vaes_avx512vl")
(set_attr "type" "sselog1")
-   (set_attr "addr" "gpr16,*,*")
+   (set_attr "addr" "gpr16,*,*,*")
(set_attr "prefix_extra" "1")
-   (set_attr "prefix" "orig,vex,evex")
-   (set_attr "btver2_decode" "double,double,double")
+   (set_attr "prefix" "orig,vex,evex,evex")
+   (set_attr "btver2_decode" "double,double,double,double")
(set_attr "mode" "TI")])

 (define_insn "aesenclast"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,x,v")
-   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v")
-  (match_operand:V2DI 2 "vector_operand" "xja,xm,vm")]
+  [(set (match_operand:V2DI 0 "register_operand" "=x,x,x,v")
+   (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,x,v")
+  (match_operand:V2DI 2 "vector_operand" "xja,xm,xm,vm")]
  UNSPEC_AESENCLAST))]
   "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)"
   "@
aesenclast\t{%2, %0|%0, %2}
vaesenclast\t{%2, %1, %0|%0, %1, %2}
+   %{evex%} vaesenclast\t{%2, %1, %0|%0, %1, %2}

[Bug middle-end/47048] misc vect.exp failures with -fgraphite-identity enabled at -O2.

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47048

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Keywords||missed-optimization
   Assignee|spop at gcc dot gnu.org|unassigned at gcc dot gnu.org
 CC||pinskia at gcc dot gnu.org

[Bug c/28141] thread-local ptr initialized to address of thread-local misclassified as non-constant initializer

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28141

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Andrew Pinski  ---
This is a won't fix as C11's thread_local does not allow for it either.
C++'s thread_local does, but that is because it allows dynamic
initialization, just like global variables in C++.
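
A minimal C++ sketch of the accepted case, with hypothetical names (`counter`, `counter_ptr`) that are not taken from the report:

```cpp
// A thread-local object and a thread-local pointer to it. The address of
// `counter` differs per thread, so it is not a constant expression; C++
// accepts the initializer by performing dynamic initialization.
thread_local int counter = 42;
thread_local int* counter_ptr = &counter;

// Reads the current thread's value through the current thread's pointer.
int read_counter() { return *counter_ptr; }

// True if the pointer refers to this thread's own copy of `counter`.
bool points_to_own_copy() { return counter_ptr == &counter; }
```

The same two declarations in C would require a constant initializer, which is what the original report ran into.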

[Bug c/114526] ISO C does not prohibit extensions: fix misconception.

2024-04-03 Thread harald at gigawatt dot nl via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114526

--- Comment #20 from Harald van Dijk  ---
(In reply to Kaz Kylheku from comment #19)

Needless to say I still disagree, but I interpreted your comment #17 as
suggesting this aspect of the discussion is neither necessary nor useful for
this bug, and agreed with that in comment #18. So let's actually stop this
aspect of the discussion.

[Bug c++/114571] -Wzero-as-null-pointer-constant does not complain about NULL

2024-04-03 Thread ossman at cendio dot se via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114571

--- Comment #1 from Pierre Ossman  ---
Hmm.. I found bug 77513, and r9-873. So I guess this is intentional?

This makes the warning somewhat pointless. We want to make sure developers
standardise on nullptr, both for style and since the behaviour of NULL is
compiler dependent (if I'm understanding the C++ standard correctly).

It's also annoying if there is not a consensus between clang and gcc here.
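
As a hedged illustration of why a uniform nullptr matters, the sketch below uses hypothetical overloads named `which`. NULL is deliberately left out because its definition is implementation-defined, so only the literal 0 and nullptr are compared.

```cpp
// Two overloads that differ only in taking an integer or a pointer.
int which(int)   { return 0; }   // selected for integer arguments
int which(void*) { return 1; }   // selected for pointer arguments

// The literal 0 is an int first, so the integer overload is an exact match.
int with_literal_zero() { return which(0); }

// nullptr converts only to pointer types, so the pointer overload is chosen.
int with_nullptr() { return which(nullptr); }
```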

[Bug libstdc++/104606] [11/12/13/14 Regression] comparison operator resolution with std::optional and -std=c++20

2024-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104606

--- Comment #14 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:7f65d8267fbfd19cf21a3dc71d27e989e75044a3

commit r14-9771-g7f65d8267fbfd19cf21a3dc71d27e989e75044a3
Author: Jonathan Wakely 
Date:   Wed Mar 27 21:51:13 2024 +

libstdc++: Reverse arguments in constraint for std::optional's <=>
[PR104606]

This is a workaround for a possible compiler bug that causes constraint
recursion in the operator<=>(const optional&, const U&) overload.

libstdc++-v3/ChangeLog:

PR libstdc++/104606
* include/std/optional (operator<=>(const optional&, const U&)):
Reverse order of three_way_comparable_with template arguments.
* testsuite/20_util/optional/relops/104606.cc: New test.

[Bug target/114577] New: Inefficient codegen for SVE/NEON bridge

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114577

Bug ID: 114577
   Summary: Inefficient codegen for SVE/NEON bridge
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64*

The following sequence:

#include 

svint32_t f (int *a, int *b)
{
  int32x4_t va = vld1q_s32 (a);
  svint32_t za = svset_neonq_s32 (svundef_s32 (), va);
  return za;
}

-O2 -march=armv9-a

is expected to be a simple load but generates:

f:
ldr q31, [x0]
ptrue   p3.s, vl4
sel z0.s, p3, z31.s, z0.s
ret

instead of the expected (from clang):

f:  // @f
ldr q0, [x0]
ret

it looks like GCC's implementation of svset_neonq_s32 with svundef does not
become a view_convert/subreg.

[Bug target/85236] missing _mm256_atan2_ps

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85236

--- Comment #7 from Andrew Pinski  ---
clang does not implement this intrinsic either and there is no issue filed
there about it either (I am kind of shocked).

Note ICX (which is the new ICC, but based on clang/LLVM) does, and it calls
__svml_atan2f8_e9 directly.


Looks like SIMD-everywhere has implemented it though, see
https://github.com/simd-everywhere/simde/issues/40

and
https://github.com/simd-everywhere/simde/commit/5b28b3d4672a9cc0616d5d6813b8e31e9bde8148

[Bug driver/81358] libatomic not automatically linked with C11 code

2024-04-03 Thread bunk at stusta dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358

--- Comment #15 from Adrian Bunk  ---
(In reply to Tobias Burnus from comment #11)
> RFC draft patch – also to solve an offload problem with atomic and nvptx
> libgomp:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556297.html
> See reply for what still needs to be done (esp. related to building
> libraries + testsuite).

Was there any reason why this was stalled?

More and more packages in Debian need manual addition of libatomic on some
32bit vintage architectures (armv5/m68k/mips/powerpc/sh4) due to this, which is
a pain.

Is the only issue that no one has addressed the comments in the reply on the
mailing list, or have there also been other problems?

[Bug target/101865] _ARCH_PWR8 is not defined when using -mcpu=power8

2024-04-03 Thread bergner at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101865

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|willschm at gcc dot gnu.org|bergner at gcc dot gnu.org
URL||https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601825.html

--- Comment #17 from Peter Bergner  ---
I'm working on updating the patch Will submitted to take into consideration the
patch reviews plus trunk changes since it was submitted.  Mine now.

[Bug middle-end/29231] need a way to produce trampolines not on the stack

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29231

Andrew Pinski  changed:

   What|Removed |Added

 CC||iains at gcc dot gnu.org

--- Comment #6 from Andrew Pinski  ---
Most of the support was added in r14-4821-g28d8c680aaea46 .

Maybe Iain can provide more information on what else is needed to be done if
anything.

[Bug target/86027] string literals get corrupted with -O3 and gas on solaris i386

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86027

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #5 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 84017 ***

[Bug bootstrap/84017] [6/7/8 regression] Bootstrap failure on Solaris 10/x86 with gas/ld

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84017

Andrew Pinski  changed:

   What|Removed |Added

 CC||subscribe at teskor dot de

--- Comment #12 from Andrew Pinski  ---
*** Bug 86027 has been marked as a duplicate of this bug. ***

[Bug target/85236] missing _mm256_atan2_ps

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85236

--- Comment #6 from Andrew Pinski  ---
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#!=undefined=SVML=_mm256_atan2_ps_expand=393

[Bug libstdc++/93672] std::basic_istream::ignore hangs if delim MSB is set

2024-04-03 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93672

--- Comment #3 from Jonathan Wakely  ---
So maybe:

--- a/libstdc++-v3/src/c++98/istream.cc
+++ b/libstdc++-v3/src/c++98/istream.cc
@@ -112,7 +112,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 basic_istream::
 ignore(streamsize __n, int_type __delim)
 {
-  if (traits_type::eq_int_type(__delim, traits_type::eof()))
+  // If __delim is eof() we ignore up to __n chars, and for any other
+  // negative value using eq_int_type(sgetc(), __delim) will never be
true,
+  // so just treat all negative __delim values as eof().
+  if (__delim < 0)
return ignore(__n);

   _M_gcount = 0;
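
The reasoning in the patch comment can be sketched as follows. The helper names are hypothetical; only `std::char_traits<char>` and its members come from the standard library.

```cpp
#include <string>

using traits = std::char_traits<char>;

// A delimiter passed as a plain char is converted char -> int, which
// sign-extends when char is signed (this is implementation-defined).
traits::int_type delim_from_plain_char(char c) { return c; }

// Characters read from the stream buffer go through to_int_type, which
// converts via unsigned char and therefore never yields a negative value
// for a real character.
traits::int_type delim_from_traits(char c) { return traits::to_int_type(c); }

// True if a character with this value could ever compare equal to a
// delimiter that was passed as a plain char.
bool can_match(char c)
{
    return traits::eq_int_type(delim_from_traits(c), delim_from_plain_char(c));
}
```

On a target where char is signed, `can_match('\x80')` is false, which is the situation in which a negative `__delim` can never be found.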

[Bug target/86466] [X86] gcc checks the range of the immediate to _mm_blend_ps, but not _mm_blend_epi32

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86466

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Keywords||accepts-invalid
   Last reconfirmed||2024-04-03
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
Confirmed.

[Bug target/34629] cexp call broken on solaris 10 32bit code with gcc and -fPIC option

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34629

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |4.3.0

--- Comment #4 from Andrew Pinski  ---
Fixed by r0-80529-g29173496a0453f for GCC 4.3.0 .


https://gcc.gnu.org/pipermail/gcc-patches/2007-April/214873.html . I didn't
look to see if it was backported to the 4.2.x branch though which this bug
report was reported against (a few months after the fix went into the trunk at
the time). But it has been fixed for a long time now so closing as fixed.

[Bug target/114576] [14 regression] VEX-prefixed AES instruction without AVX enabled

2024-04-03 Thread thiago at kde dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114576

--- Comment #4 from Thiago Macieira  ---
(In reply to Jakub Jelinek from comment #3)
> vaesenc etc. instructions can be used even if just -maes -mavx, not just
> -mvaes -mavx512vl.

Correct, that's just VEX-prefixed AESNI instructions.

VAES added the 256-bit and 512-bit versions of those instructions. The table at
felix's website is accurate: https://www.felixcloutier.com/x86/aesenc

This is actually similar to GFNI:
* GFNI: 128-bit only, non-VEX, non-EVEX
* GFNI+AVX: VEX allowed, 128- and 256-bit; no EVEX
* GFNI+AVX512F: 128- and 256-bit with VEX, 512-bit with EVEX
* GFNI+AVX512VL: 128- and 256-bit with VEX, all with EVEX
* GFNI+AVX10 without EVEX512: 128- and 256-bit with VEX and EVEX, no 512-bit

The F-no-VL case does not exist in practice.

> But, it is especially messy because -mvaes doesn't imply -maes, so IMHO if
> somebody e.g. asks for -mvaes -mavx512vl -mno-aes and the insns don't use
> any xmm16+ register, it would emit the insn using VEX encoding rather than
> EVEX, so I think we need to use {evex} prefixes.

Would it be simpler to just imply that VAES includes AESNI? There are no
processors that have VAES without AESNI and it doesn't make sense for there to
be one.

[Bug middle-end/85620] Missing ENDBR after swapcontext

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85620

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|REOPENED|RESOLVED
   Target Milestone|9.0 |13.0

--- Comment #13 from Andrew Pinski  ---
.

[Bug target/81652] [meta-bug] -fcf-protection=full bugs

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81652
Bug 81652 depends on bug 85620, which changed state.

Bug 85620 Summary: Missing ENDBR after swapcontext
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85620

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |FIXED

[Bug target/86466] [X86] gcc checks the range of the immediate to _mm_blend_ps, but not _mm_blend_epi32

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86466

--- Comment #1 from Andrew Pinski  ---
Created attachment 57870
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57870&action=edit
testcase

Next time, please attach the testcase or place it inline instead of just linking
to godbolt; we were just lucky that the godbolt URLs are still valid after these
few years.

[Bug target/40988] incorrect code when using ..._bit macros from asm/bitops.h in a loop in userspace program

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40988

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Andrew Pinski  ---
So yes, I was correct in saying the inline-asm was incorrect. It should have
been:
```

__asm__ __volatile__(
"btrl %2,%1\n\tsbbl %0,%0"
:"=r" (oldbit), "+m" (ADDR)
:"Ir" (nr) : "memory");
return oldbit;
```

as this both reads and writes the memory.

[Bug target/40988] incorrect code when using ..._bit macros from asm/bitops.h in a loop in userspace program

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40988

--- Comment #3 from Andrew Pinski  ---
One more note: the Linux kernel sources have already been corrected.

They do now:
asm volatile(__ASM_SIZE(btr) " %2,%1"
 CC_SET(c)
 : CC_OUT(c) (oldbit)
 : ADDR, "Ir" (nr) : "memory");
return oldbit;


and 

asm volatile(__ASM_SIZE(bts) " %1,%0" : : ADDR, "Ir" (nr) : "memory");

Which describes the same thing as "+" as the memory clobber says it will update
memory too.

Linux commit 5b77e95dd7790 changed it from "+" to the above.
Linux commit 92934bcbf96bc (in 2006 which was included in 2.6.16) fixed it to
be "+" so you must have copied the linux sources from FC6 and didn't look to
see if the Linux kernel header was updated in your FC11 system.

[Bug c++/28017] lack of guard variables for explicitly instantiated template static data

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28017

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80320
  Known to fail||

--- Comment #14 from Andrew Pinski  ---
I think this and PR 80320 both have the same underlying issue.

[Bug target/40771] generated code is ~25% slower when autovectorization is enabled

2024-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40771

--- Comment #4 from Andrew Pinski  ---
AARCH64 vectorization looks decent too:
```
dup v31.8h, w0
adrp x2, .LC0
adrp x0, .LC1
adrp x1, .LANCHOR0
ldr q30, [x2, #:lo12:.LC0]
ldr q29, [x0, #:lo12:.LC1]
add v30.8h, v31.8h, v30.8h
add v29.8h, v31.8h, v29.8h
uzp2 v29.16b, v30.16b, v29.16b
str q29, [x1, #:lo12:.LANCHOR0]
```

The only improvement that can be made there is with SVE, those ldr could be
`index` instructions instead but that is PR 113328 .

[Bug rtl-optimization/93565] [11/12/13 Regression] Combine duplicates instructions

2024-04-03 Thread wilco at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565

--- Comment #31 from Wilco  ---
(In reply to Andrew Pinski from comment #29)
> Looking back at this one, I (In reply to Wilco from comment #8)
> > Here is a much simpler example:
> > 
> > void f (int *p, int y)
> > {
> >   int a = y & 14;
> >   *p = a | p[a];
> > }
> After r14-9692-g839bc42772ba7af66af3bd16efed4a69511312ae, we now get:
> f:
> .LFB0:
> .cfi_startproc
> and w2, w1, 14
> mov x1, x2
> ldr w2, [x0, x2, lsl 2]
> orr w1, w2, w1
> str w1, [x0]
> ret
> .cfi_endproc
> 
> There is an extra move still but the duplicated and is gone. (with
> -frename-registers added, the move is gone as REE is able to remove the zero
> extend but then there is a life range conflict so can't remove the move too).

Even with the mov it is better since that can be done with zero latency in
rename in most CPUs.

> So maybe this should be closed as fixed for GCC 14 and the cost changes for
> clz reverted.

The ctz costs are correct since it is a 2-instruction sequence - it only needs
adjusting for CSSC.

[Bug target/114510] [14 Regression] missed proping of multiply by 2 into address of load/stores

2024-04-03 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114510

Tamar Christina  changed:

   What|Removed |Added

 CC||tnfchris at gcc dot gnu.org

--- Comment #3 from Tamar Christina  ---
Richard's patch should fix this next year.

https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634166.html

We'll then stop relying on combine or other passes to fix this.
