[Bug c++/114946] New: [concepts] No error for invalid requires constraint in declaration

2024-05-04 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114946

            Bug ID: 114946
           Summary: [concepts] No error for invalid requires constraint in
                    declaration
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nshead at gcc dot gnu.org
  Target Milestone: ---

The following sample compiles fine with 'g++ -std=c++20 -pedantic-errors':

  template 
requires 
  struct S {};

  template 
requires 
  void foo() {}

Note that '' has not been declared or defined.  Both MSVC and Clang
complain about the undeclared identifier.  GCC does error if we attempt to
instantiate either of these specialisations, but they always (silently) lose to
a better match:

  template  struct S {};
  template  requires  struct S {};

  template  void foo() {}
  template  requires  void foo() {}

  int main() {
S x;
foo();
  }

[Bug c++/93008] Need a way to make inlining heuristics ignore whether a function is inline

2024-05-04 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93008

--- Comment #12 from Jonathan Wakely  ---
There's nothing fake about them; they are definitely inline functions as far as
the language rules are concerned. But in some cases we don't want the compiler
to use that fact as an optimisation hint.

[Bug fortran/114874] [14/15 Regression] ICE with select type, type is (character(*)), and substring

2024-05-04 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114874

Paul Thomas  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |pault at gcc dot gnu.org

--- Comment #7 from Paul Thomas  ---
Created attachment 58104
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58104&action=edit
Fix for this PR

This seems to be the best fix. I have tried several different approaches in the
last two days but it has been an uphill struggle against the state of the block
namespaces at this stage of the compilation.

I'll think about it for another day or so before submitting.

Cheers

Paul

[Bug c++/93008] Need a way to make inlining heuristics ignore whether a function is inline

2024-05-04 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93008

--- Comment #11 from Eric Gallager  ---
(In reply to Jan Hubicka from comment #8)
> Reading the discussion again, I don't think we have a way to make inline
> keyword ignored by inliner.  We can make not_really_inline attribute (better
> name would be welcome).

"fake_inline"?

[Bug tree-optimization/114945] [14/15 regression] Sporadic std::vector::resize() -Wstringop-overflow or -Warray-bounds warning with gcc 14

2024-05-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114945

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[14 regression] Sporadic    |[14/15 regression] Sporadic
                   |std::vector::resize()       |std::vector::resize()
                   |-Wstringop-overflow or      |-Wstringop-overflow or
                   |-Warray-bounds warning with |-Warray-bounds warning with
                   |gcc 14                      |gcc 14
   Target Milestone|---                         |14.0

[Bug rtl-optimization/101523] Huge number of combine attempts

2024-05-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #63 from Segher Boessenkool  ---
(In reply to Sarah Julia Kriesch from comment #62)
> (In reply to Segher Boessenkool from comment #61)
> > (In reply to Sarah Julia Kriesch from comment #60)
> > > I have to agree with Richard. This problem has been serious for a long
> > > time but has been ignored by IBM based on distribution choices.
> > 
> > What?  What does IBM have to do with this?  Yes, they are my employer, but
> > what I decide is best for combine to do is not influenced by them *at all*
> > (except sometimes they want me to spend time doing paid work, distracting
> > me from things that really matter, like combine!)
> > 
> Then, tell other reasons why my requests in the openSUSE bug report have
> been rejected in the past, and this bug report has been open for 3 years.
> Perhaps it is helpful to know that IBM has fixed memory issues in PostgreSQL
> (for openSUSE/upstream) with higher quality via my request with the support
> for Red Hat (and faster).

Once again, I have no idea what you are talking about.  It sounds like some
conspiracy theory?  Exciting!

I really have no idea what you are talking about.  I recognise some of the
words, but not enough to give me a handle on what you are on about.

> > > Anyway, we want to live within the open source community without any Linux
> > > distribution priorities (especially in upstream projects like here).
> > 
> > No idea what that means either.
> > 
> There is a reason for founding the Linux Distributions Working Group at the
> Open Mainframe Project (equality for all Linux Distributions on s390x).
> SUSE, Red Hat and Canonical have been supporting this idea also (especially
> based on my own work experience at IBM and the priorities inside).

And here I don't have any context either.

> > > Segher, can you specify the failed test cases? Then, it should be possible
> > > to reproduce and improve that all. In such a collaborative way, we can
> > > also achieve a solution.
> > 
> > What failed test cases?  You completely lost me.
> > 
> This one:
> (In reply to Segher Boessenkool from comment #57)
> > (In reply to Richard Biener from comment #56)
> > PR101523 is a very serious problem, way way way more "P1" than any of the
> > "my target was inconvenienced by some bad testcases failing now" "P1"s there
> > are now.  Please undo this!

They are in this PR.  "See Also", top right corner in the headings.

> (In reply to Segher Boessenkool from comment #61)
> > We used to do the wrong thing in combine.  Now that my fix was reverted, we
> > still do.  This should be undone soonish, so that I can commit an actual
> > UNCSE implementation, which fixes all "regressions" (quotes, because they
> > are not!) caused by my previous patch, and does a lot more too.  It also
> > will allow us to remove a bunch of other code from combine, speeding up
> > things a lot more (things that keep a copy of a set if the dest is used
> > more than once).  There has been talk of doing an UNCSE for over twenty
> > years now, so annoying me enough to get this done is a good result of this
> > whole thing :-)
> Your fixes should also work with upstream code and the used gcc versions in
> our/all Linux distributions. I recommend applying tests and merging your
> fixes to at least one gcc version.

Lol.  No.  Distributions have to sort out their own problems.  I don't have
a copy of an old version of most distros even; I haven't *heard* about the
*existence* of most distros!

I don't use a Linux distro on any of my own machines.  And I care about some
other OSes at least as much, btw.  And not just because my employer cares about
some of those.

> If you want to watch something about our reasons for creating a
> collaboration between Linux distributions (and upstream projects), you
> should watch my first presentation "Collaboration instead of Competition":
> https://av.tib.eu/media/57010
> 
> Hint: The IBM statement came from my former IBM Manager (now your CPO).

CPO?  What is a CPO?  I don't think I have any?  I do have an R2 somewhere,
does that help?

[Bug c++/114945] New: Sporadic std::vector::resize() -Wstringop-overflow or -Warray-bounds warning with gcc 14

2024-05-04 Thread nilsgladitz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114945

            Bug ID: 114945
           Summary: Sporadic std::vector::resize() -Wstringop-overflow or
                    -Warray-bounds warning with gcc 14
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nilsgladitz at gmail dot com
  Target Milestone: ---

Created attachment 58103
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58103&action=edit
Testcase

Initially seen with an Ubuntu-specific GCC 14 x86-64 snapshot but also
reproduced with a vanilla arm32 build of GCC 14.1-rc1. Not seen with GCC 13.1.

I see there are a couple of issues which sound similar or related (e.g. bugs
113664, 106185, 105823, 105746) but I can't really tell if this is a duplicate
or regression of a previously fixed issue.

The reduced attached test case compiled with
g++-14 -std=c++20 -O2 case1.cpp

Produces the warning:
/usr/include/c++/14/bits/stl_algobase.h:972:25: warning: ‘void*
__builtin_memset(void*, int, long unsigned int)’ writing 3 bytes into a region
of size 0 overflows the destination [-Wstringop-overflow=]
  972 | __builtin_memset(__first, static_cast<unsigned char>(__tmp), __len);
      | ^~~

Adding -Wall replaces the above warning with this new warning:
/usr/include/c++/14/bits/stl_algobase.h:972:25: warning: ‘void*
__builtin_memset(void*, int, long unsigned int)’ offset [0, 2] is out of the
bounds [0, 0] [-Warray-bounds=]
  972 | __builtin_memset(__first, static_cast<unsigned char>(__tmp), __len);
      | ^~~

Also recreated this on godbolt: https://godbolt.org/z/rYx9q7Ke1

[Bug target/114944] Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-04 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944

John Platts  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*, i?86-*-*

--- Comment #1 from John Platts  ---
Here is another snippet of code that has suboptimal codegen on SSE2 with GCC
13.2.0:
#include <stdint.h>
#include <emmintrin.h>

__m128i SSE2ShuffleI8(__m128i a, __m128i b) {
  alignas(16) uint8_t a_lanes[16];
  alignas(16) uint8_t b_lanes[16];

  _mm_store_si128(reinterpret_cast<__m128i*>(a_lanes), a);
  _mm_store_si128(reinterpret_cast<__m128i*>(b_lanes),
                  _mm_and_si128(b, _mm_set1_epi8(static_cast<char>(15))));

  __m128i v0 = _mm_cvtsi32_si128(a_lanes[b_lanes[0]]);
  __m128i v1 = _mm_cvtsi32_si128(a_lanes[b_lanes[1]]);
  __m128i v2 = _mm_cvtsi32_si128(a_lanes[b_lanes[2]]);
  __m128i v3 = _mm_cvtsi32_si128(a_lanes[b_lanes[3]]);
  __m128i v4 = _mm_cvtsi32_si128(a_lanes[b_lanes[4]]);
  __m128i v5 = _mm_cvtsi32_si128(a_lanes[b_lanes[5]]);
  __m128i v6 = _mm_cvtsi32_si128(a_lanes[b_lanes[6]]);
  __m128i v7 = _mm_cvtsi32_si128(a_lanes[b_lanes[7]]);
  __m128i v8 = _mm_cvtsi32_si128(a_lanes[b_lanes[8]]);
  __m128i v9 = _mm_cvtsi32_si128(a_lanes[b_lanes[9]]);
  __m128i v10 = _mm_cvtsi32_si128(a_lanes[b_lanes[10]]);
  __m128i v11 = _mm_cvtsi32_si128(a_lanes[b_lanes[11]]);
  __m128i v12 = _mm_cvtsi32_si128(a_lanes[b_lanes[12]]);
  __m128i v13 = _mm_cvtsi32_si128(a_lanes[b_lanes[13]]);
  __m128i v14 = _mm_cvtsi32_si128(a_lanes[b_lanes[14]]);
  __m128i v15 = _mm_cvtsi32_si128(a_lanes[b_lanes[15]]);

  v0 = _mm_unpacklo_epi8(v0, v1);
  v2 = _mm_unpacklo_epi8(v2, v3);
  v4 = _mm_unpacklo_epi8(v4, v5);
  v6 = _mm_unpacklo_epi8(v6, v7);
  v8 = _mm_unpacklo_epi8(v8, v9);
  v10 = _mm_unpacklo_epi8(v10, v11);
  v12 = _mm_unpacklo_epi8(v12, v13);
  v14 = _mm_unpacklo_epi8(v14, v15);

  v0 = _mm_unpacklo_epi16(v0, v2);
  v4 = _mm_unpacklo_epi16(v4, v6);
  v8 = _mm_unpacklo_epi16(v8, v10);
  v12 = _mm_unpacklo_epi16(v12, v14);

  v0 = _mm_unpacklo_epi32(v0, v4);
  v8 = _mm_unpacklo_epi32(v8, v12);

  return _mm_unpacklo_epi64(v0, v8);
}

Here is the code that is generated when the above code is compiled on x86_64
GCC 13.2.0 with the -O2 option:
SSE2ShuffleI8(long long __vector(2), long long __vector(2)):
sub     rsp, 144
pand    xmm1, XMMWORD PTR .LC0[rip]
movaps  XMMWORD PTR [rsp+120], xmm0
movd    eax, xmm1
movzx   eax, al
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp+104], xmm1
movd    xmm0, eax
movzx   eax, BYTE PTR [rsp+105]
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp+88], xmm1
movd    xmm2, eax
movzx   eax, BYTE PTR [rsp+90]
punpcklbw   xmm0, xmm2
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp+72], xmm1
movd    xmm8, eax
movzx   eax, BYTE PTR [rsp+75]
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp+56], xmm1
movd    xmm2, eax
movzx   eax, BYTE PTR [rsp+60]
punpcklbw   xmm8, xmm2
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp+40], xmm1
punpcklwd   xmm0, xmm8
movd    xmm5, eax
movzx   eax, BYTE PTR [rsp+45]
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp+24], xmm1
movd    xmm2, eax
movzx   eax, BYTE PTR [rsp+30]
punpcklbw   xmm5, xmm2
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp+8], xmm1
movd    xmm7, eax
movzx   eax, BYTE PTR [rsp+15]
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp-8], xmm1
movd    xmm2, eax
movzx   eax, BYTE PTR [rsp]
punpcklbw   xmm7, xmm2
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp-24], xmm1
punpcklwd   xmm5, xmm7
punpckldq   xmm0, xmm5
movd    xmm3, eax
movzx   eax, BYTE PTR [rsp-15]
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp-40], xmm1
movd    xmm4, eax
movzx   eax, BYTE PTR [rsp-30]
punpcklbw   xmm3, xmm4
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp-56], xmm1
movd    xmm6, eax
movzx   eax, BYTE PTR [rsp-45]
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp-72], xmm1
movd    xmm2, eax
movzx   eax, BYTE PTR [rsp-60]
punpcklbw   xmm6, xmm2
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp-88], xmm1
punpcklwd   xmm3, xmm6
movd    xmm2, eax
movzx   eax, BYTE PTR [rsp-75]
movzx   eax, BYTE PTR [rsp+120+rax]
movaps  XMMWORD PTR [rsp-104], xmm1
movd    xmm4, eax
movzx   eax, BYTE PTR [rsp-90]
punpcklbw   xmm2, xmm4

[Bug target/114944] New: Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-04 Thread john_platts at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944

            Bug ID: 114944
           Summary: Codegen of __builtin_shuffle for an 16-byte uint8_t
                    vector is suboptimal on SSE2
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: john_platts at hotmail dot com
  Target Milestone: ---

Here is a snippet of code that has suboptimal codegen on SSE2:

#include <stdint.h>
#include <emmintrin.h>

__m128i SSE2ShuffleI8(__m128i a, __m128i b) {
  typedef uint8_t GccU8M128Vec __attribute__((__vector_size__(16)));
  return reinterpret_cast<__m128i>(__builtin_shuffle(
      reinterpret_cast<GccU8M128Vec>(a), reinterpret_cast<GccU8M128Vec>(b)));
}

Here is the code that is generated when the above code is compiled on x86_64
GCC 13.2.0 with the -O2 option:
SSE2ShuffleI8(long long __vector(2), long long __vector(2)):
push    r15
movd    r11d, xmm1
push    r14
and     r11d, 15
push    r13
push    r12
push    rbp
push    rbx
sub rsp, 160
movaps  XMMWORD PTR [rsp+8], xmm1
movzx   edx, BYTE PTR [rsp+16]
movaps  XMMWORD PTR [rsp+24], xmm1
movzx   eax, BYTE PTR [rsp+31]
movaps  XMMWORD PTR [rsp+40], xmm1
mov rcx, rdx
movzx   r15d, BYTE PTR [rsp+46]
and ecx, 15
and eax, 15
movaps  XMMWORD PTR [rsp+120], xmm1
movzx   ebx, BYTE PTR [rsp+121]
mov QWORD PTR [rsp-120], rcx
and r15d, 15
movaps  XMMWORD PTR [rsp+136], xmm0
and ebx, 15
movaps  XMMWORD PTR [rsp+104], xmm1
movzx   ebp, BYTE PTR [rsp+106]
movaps  XMMWORD PTR [rsp+88], xmm1
movzx   r12d, BYTE PTR [rsp+91]
movaps  XMMWORD PTR [rsp+72], xmm1
movzx   r13d, BYTE PTR [rsp+76]
and ebp, 15
movaps  XMMWORD PTR [rsp+56], xmm1
movzx   r14d, BYTE PTR [rsp+61]
and r12d, 15
movaps  XMMWORD PTR [rsp-8], xmm1
movzx   edx, BYTE PTR [rsp+1]
and r13d, 15
movaps  XMMWORD PTR [rsp-24], xmm1
movzx   ecx, BYTE PTR [rsp-14]
and r14d, 15
movaps  XMMWORD PTR [rsp-40], xmm1
movzx   esi, BYTE PTR [rsp-29]
and edx, 15
movaps  XMMWORD PTR [rsp-56], xmm1
movzx   edi, BYTE PTR [rsp-44]
and ecx, 15
movaps  XMMWORD PTR [rsp-72], xmm1
movzx   r8d, BYTE PTR [rsp-59]
and esi, 15
movaps  XMMWORD PTR [rsp-88], xmm1
movzx   r9d, BYTE PTR [rsp-74]
and edi, 15
movaps  XMMWORD PTR [rsp-104], xmm1
movzx   r10d, BYTE PTR [rsp-89]
and r8d, 15
movzx   eax, BYTE PTR [rsp+136+rax]
movzx   r15d, BYTE PTR [rsp+136+r15]
and r9d, 15
movzx   r14d, BYTE PTR [rsp+136+r14]
sal rax, 8
movzx   ebp, BYTE PTR [rsp+136+rbp]
movzx   r13d, BYTE PTR [rsp+136+r13]
and r10d, 15
or  rax, r15
movzx   r12d, BYTE PTR [rsp+136+r12]
movzx   ebx, BYTE PTR [rsp+136+rbx]
sal rax, 8
movzx   edi, BYTE PTR [rsp+136+rdi]
movzx   r9d, BYTE PTR [rsp+136+r9]
or  rax, r14
movzx   esi, BYTE PTR [rsp+136+rsi]
movzx   r8d, BYTE PTR [rsp+136+r8]
sal rax, 8
movzx   ecx, BYTE PTR [rsp+136+rcx]
movzx   edx, BYTE PTR [rsp+136+rdx]
or  rax, r13
sal rax, 8
or  rax, r12
sal rax, 8
or  rax, rbp
sal rax, 8
or  rax, rbx
movzx   ebx, BYTE PTR [rsp+136+r11]
sal rax, 8
mov r11, rax
movzx   eax, BYTE PTR [rsp+136+r10]
sal rax, 8
or  rax, r9
sal rax, 8
or  r11, rbx
or  rax, r8
sal rax, 8
or  rax, rdi
sal rax, 8
or  rax, rsi
sal rax, 8
or  rax, rcx
mov rcx, QWORD PTR [rsp-120]
mov QWORD PTR [rsp-120], r11
sal rax, 8
or  rax, rdx
movzx   edx, BYTE PTR [rsp+136+rcx]
sal rax, 8
or  rax, rdx
mov QWORD PTR [rsp-112], rax
movdqa  xmm0, XMMWORD PTR [rsp-120]
add rsp, 160
pop rbx
pop rbp
pop r12
pop r13
pop r14
pop r15
ret

The above code allocates more stack space than necessary and stores xmm1 (the
index vector) multiple times.
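
For reference, the operation SSE2ShuffleI8 is expected to perform can be
modelled in scalar code. This is only a sketch: Bytes16 and shuffle_reference
are hypothetical names that do not appear in the report, and the vector is
modelled as a plain std::array. It assumes the single-vector form of
__builtin_shuffle, where each index is taken modulo the number of lanes (16).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical alias: a 16-lane byte vector modelled as a plain array.
using Bytes16 = std::array<std::uint8_t, 16>;

// Scalar model of the single-vector __builtin_shuffle(a, idx):
// each index is reduced modulo the lane count (16) and then selects
// the corresponding lane of a.
Bytes16 shuffle_reference(const Bytes16& a, const Bytes16& idx) {
    Bytes16 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[idx[i] & 15u];
    }
    return out;
}
```

Any codegen for the function should produce the same 16 bytes as this loop,
which makes it usable as a checking oracle when comparing the variants.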

Here is a more optimal version of SSE2ShuffleI8:
.LSSE2ShuffleI8_Element_Mask:
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15
  .byte 15

SSE2ShuffleI8:
  

[Bug rtl-optimization/101523] Huge number of combine attempts

2024-05-04 Thread sarah.kriesch at opensuse dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #62 from Sarah Julia Kriesch  ---
(In reply to Segher Boessenkool from comment #61)
> (In reply to Sarah Julia Kriesch from comment #60)
> > I have to agree with Richard. This problem has been serious for a long time
> > but has been ignored by IBM based on distribution choices.
> 
> What?  What does IBM have to do with this?  Yes, they are my employer, but
> what I decide is best for combine to do is not influenced by them *at all*
> (except sometimes they want me to spend time doing paid work, distracting
> me from things that really matter, like combine!)
> 
Then, tell other reasons why my requests in the openSUSE bug report have been
rejected in the past, and this bug report has been open for 3 years.
Perhaps it is helpful to know that IBM has fixed memory issues in PostgreSQL
(for openSUSE/upstream) with higher quality via my request with the support for
Red Hat (and faster).

> > Anyway, we want to live within the open source community without any Linux
> > distribution priorities (especially in upstream projects like here).
> 
> No idea what that means either.
> 
There is a reason for founding the Linux Distributions Working Group at the
Open Mainframe Project (equality for all Linux Distributions on s390x).
SUSE, Red Hat and Canonical have been supporting this idea also (especially
based on my own work experience at IBM and the priorities inside).

> > Segher, can you specify the failed test cases? Then, it should be possible
> > to reproduce and improve that all. In such a collaborative way, we can also
> > achieve a solution.
> 
> What failed test cases?  You completely lost me.
> 
This one:
(In reply to Segher Boessenkool from comment #57)
> (In reply to Richard Biener from comment #56)
> PR101523 is a very serious problem, way way way more "P1" than any of the
> "my target was inconvenienced by some bad testcases failing now" "P1"s there
> are now.  Please undo this!

(In reply to Segher Boessenkool from comment #61)
> We used to do the wrong thing in combine.  Now that my fix was reverted, we
> still do.  This should be undone soonish, so that I can commit an actual
> UNCSE implementation, which fixes all "regressions" (quotes, because they
> are not!) caused by my previous patch, and does a lot more too.  It also
> will allow us to remove a bunch of other code from combine, speeding up
> things a lot more (things that keep a copy of a set if the dest is used
> more than once).  There has been talk of doing an UNCSE for over twenty
> years now, so annoying me enough to get this done is a good result of this
> whole thing :-)
Your fixes should also work with upstream code and the used gcc versions in
our/all Linux distributions. I recommend applying tests and merging your fixes
to at least one gcc version.


If you want to watch something about our reasons for creating a collaboration
between Linux distributions (and upstream projects), you should watch my first
presentation "Collaboration instead of Competition":
https://av.tib.eu/media/57010

Hint: The IBM statement came from my former IBM Manager (now your CPO).

[Bug target/114943] New: X86 AVX2: inefficient code generated to convert SIMD Vectors

2024-05-04 Thread vincenzo.innocente at cern dot ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114943

            Bug ID: 114943
           Summary: X86 AVX2: inefficient code generated to convert SIMD
                    Vectors
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vincenzo.innocente at cern dot ch
  Target Milestone: ---

In the example below (see https://godbolt.org/z/qnfT4fE5G ),
covert and covert3 produce code that looks inefficient to me compared with
covert2 (and with clang's output) for target x86-64-v3.

#define VECTOR_EXT(N) __attribute__((vector_size(N)))
typedef float VECTOR_EXT(16) float32x4_t;
typedef double VECTOR_EXT(32) float64x4_t;

float32x4_t f1,f2,f3,f4,f;
float64x4_t d1,d2,d3,d4,d;


void covert() {
  for (int i=0;i<4;++i) {
    d1[i] = f1[i];
    d2[i] = f2[i];
    d3[i] = f3[i];
    d4[i] = f4[i];
  }
}

void covert2() {
  for (int i=0;i<4;++i)
    d1[i] = f1[i];
  for (int i=0;i<4;++i)
    d2[i] = f2[i];
  for (int i=0;i<4;++i)
    d3[i] = f3[i];
  for (int i=0;i<4;++i)
    d4[i] = f4[i];
}

void covert3() {
  d1 = __builtin_convertvector(f1,float64x4_t);
}
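
Since the report is about code quality rather than results, it can help to
confirm that the loop form and the builtin form compute the same values. The
following is a minimal sketch with hypothetical names (Float32x4, Float64x4,
widen_with_loop, widen_with_builtin, widenings_agree), none of which appear in
the report. It assumes a compiler that provides the vector_size attribute and
__builtin_convertvector, as recent GCC and Clang do. The vectors are passed by
reference so that the sketch does not depend on the vector calling convention.

```cpp
typedef float  Float32x4 __attribute__((vector_size(16)));
typedef double Float64x4 __attribute__((vector_size(32)));

// Element-wise widening, in the style of the loop-based functions.
void widen_with_loop(const Float32x4& in, Float64x4& out) {
    for (int i = 0; i < 4; ++i)
        out[i] = in[i];
}

// Whole-vector widening through the builtin, in the style of covert3.
void widen_with_builtin(const Float32x4& in, Float64x4& out) {
    out = __builtin_convertvector(in, Float64x4);
}

// Returns true when both forms agree on every lane.
bool widenings_agree(const Float32x4& in) {
    Float64x4 a, b;
    widen_with_loop(in, a);
    widen_with_builtin(in, b);
    for (int i = 0; i < 4; ++i)
        if (a[i] != b[i])
            return false;
    return true;
}
```

If the two forms agree, any difference between the functions in the report is
purely a matter of instruction selection.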

[Bug libstdc++/114940] std::generator relies on an optional overload of operator delete

2024-05-04 Thread arsen at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114940

--- Comment #7 from Arsen Arsenović  ---
(In reply to Jonathan Wakely from comment #6)
> What would be needed to work without it? It looks like the allocation would
> have to be larger so the size could be stored as a cookie at the start of
> the allocated block?
> 
> Tangentially, could _M_alloc_size use __ba - 1 instead of __ba?

would it even require that?  AIUI, that flag only affects global sized dealloc
functions, so it'd only require changing the ::operator new/delete dealloc fn
in the _Promise_alloc case.  but, clang appears to intend to flip the
default soon: https://github.com/llvm/llvm-project/pull/90373 so I'm not sure
it's worth it anyway

re _M_alloc_size, do you mean _Promise_alloc case or the non-void one? 
in the void case, I think so (but haven't ensured it by doing the math on paper
yet; reasoning through it, it seems fine).

in the non-void case, it might even be possible to do -2 (same disclaimer goes)

[Bug libstdc++/114940] std::generator relies on an optional overload of operator delete

2024-05-04 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114940

--- Comment #6 from Jonathan Wakely  ---
What would be needed to work without it? It looks like the allocation would
have to be larger so the size could be stored as a cookie at the start of the
allocated block?

Tangentially, could _M_alloc_size use __ba - 1 instead of __ba?
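
The cookie idea can be sketched roughly as follows. This is not libstdc++
code, only an illustration under stated assumptions: the helper names
(allocate_with_cookie, stored_size, deallocate_with_cookie) and the constant
kCookieBytes are hypothetical, and the cookie is given one max-aligned slot so
that the payload keeps the default alignment. The block is over-allocated, the
requested size is written at its start, and deallocation recovers the size
from there instead of relying on the sized overload of operator delete.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical: one max-aligned slot in front of the payload holds the size.
constexpr std::size_t kCookieBytes = alignof(std::max_align_t);
static_assert(kCookieBytes >= sizeof(std::size_t),
              "cookie slot must be able to hold a size_t");

// Allocate payload_bytes plus the cookie and record the size in the cookie.
void* allocate_with_cookie(std::size_t payload_bytes) {
    unsigned char* base = static_cast<unsigned char*>(
        ::operator new(kCookieBytes + payload_bytes));
    std::memcpy(base, &payload_bytes, sizeof payload_bytes);
    return base + kCookieBytes;
}

// Read back the size recorded in front of the payload.
std::size_t stored_size(const void* payload) {
    const unsigned char* base =
        static_cast<const unsigned char*>(payload) - kCookieBytes;
    std::size_t n;
    std::memcpy(&n, base, sizeof n);
    return n;
}

// Free the block with the unsized operator delete; returns the recovered
// size so a caller could forward it to a size-aware deallocator instead.
std::size_t deallocate_with_cookie(void* payload) noexcept {
    const std::size_t n = stored_size(payload);
    ::operator delete(static_cast<unsigned char*>(payload) - kCookieBytes);
    return n;
}
```

The cost is the extra slot per allocation, which is the overhead that makes
the approach unattractive compared with simply requiring sized deallocation.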

[Bug c++/71482] Add -Wglobal-constructors warning option

2024-05-04 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71482

--- Comment #8 from Jonathan Wakely  ---
(In reply to Eric Gallager from comment #6)
> Another reason this warning might be wanted: name mangling and demangling of
> global constructors has been buggy for a while now; see bug 54254

Looks like that's just a problem demangling the symbol name to print it in a
human-readable form. What's buggy about the mangling?

[Bug libstdc++/114940] std::generator relies on an optional overload of operator delete

2024-05-04 Thread arsen at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114940

--- Comment #5 from Arsen Arsenović  ---
imo, creating a divergent code path for this case isn't worth it, especially
for something that isn't trivial.  I'd opt for checking for sized dealloc in
version.def.

[Bug libbacktrace/114941] libbacktrace build is broken for FDPIC uclibc targets by r14-5173-g2b64e4a54042

2024-05-04 Thread jcmvbkbc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114941

--- Comment #3 from jcmvbkbc at gcc dot gnu.org ---
(In reply to Ian Lance Taylor from comment #2)
> What is the correct way to get the address at which the shared library was
> loaded when using FDPIC?

There's no single base address in the case of FDPIC. The macro
__RELOC_POINTER(ptr, loadaddr) can be used with the
elf32_fdpic_loadaddr::dlpi_addr as the last argument to translate an address
according to the load map. An example is available in
libgcc/unwind-dw2-fde-dip.c

[Bug target/114942] [14/15 Regression] ICE on valid code at -O1 with "-fno-tree-sra -fno-guess-branch-probability": in extract_constrain_insn, at recog.cc:2713

2024-05-04 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114942

--- Comment #2 from Uroš Bizjak  ---
This is the insn in question:

;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*qi_ext_1_slp"
   [(set (strict_low_part (match_operand:QI 0 "register_operand" "+Q,"))
 (any_logic:QI
   (subreg:QI
 (match_operator:SWI248 3 "extract_operator"
   [(match_operand 2 "int248_register_operand" "Q,Q")
(const_int 8)
(const_int 8)]) 0)
   (match_operand:QI 1 "nonimmediate_operand" "0,!qm")))
(clobber (reg:CC FLAGS_REG))]

When targeting alternative 1, reload should use some other register for operand
2.

[Bug other/101166] Add SPDX license identifiers

2024-05-04 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101166

--- Comment #3 from Eric Gallager  ---
The FSFE's REUSE tool could be helpful for this: 
https://github.com/fsfe/reuse-tool

[Bug tree-optimization/99475] [11 Regression] bogus -Warray-bounds accessing an array element of empty structs

2024-05-04 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99475

Eric Gallager  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection

--- Comment #8 from Eric Gallager  ---
(In reply to Siddhesh Poyarekar from comment #7)
> This doesn't appear to be reproducible on trunk anymore, should we close it?

Might be worth bisecting to find out when exactly it was fixed, but I'll leave
that decision up to someone else to make...

[Bug c++/71482] Add -Wglobal-constructors warning option

2024-05-04 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71482

Eric Gallager  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=2474,
                   |                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56009

--- Comment #7 from Eric Gallager  ---
(In reply to Eric Gallager from comment #6)
> Another reason this warning might be wanted: name mangling and demangling of
> global constructors has been buggy for a while now; see bug 54254

Some more bugs about global constructors/destructors that might lead one to
want this warning: bug 2474 and bug 56009

[Bug sanitizer/79341] Many Asan tests fail on s390

2024-05-04 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79341

--- Comment #78 from Eric Gallager  ---
(In reply to Ilya Leoshkevich from comment #77)
> Apparently fixing the message in GCC will produce maintenance overhead [1]. 
> If that's not very important to you, I'd rather leave this message as is.
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648775.html

OK, I haven't seen GCC emit the message in the wild myself yet;
I only came across it while searching for bugs related to MSan...

[Bug rtl-optimization/101523] Huge number of combine attempts

2024-05-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #61 from Segher Boessenkool  ---
(In reply to Sarah Julia Kriesch from comment #60)
> I have to agree with Richard. This problem has been serious for a long time
> but has been ignored by IBM based on distribution choices.

What?  What does IBM have to do with this?  Yes, they are my employer, but
what I decide is best for combine to do is not influenced by them *at all*
(except sometimes they want me to spend time doing paid work, distracting
me from things that really matter, like combine!)

> Anyway, we want to live within the open source community without any Linux
> distribution priorities (especially in upstream projects like here).

No idea what that means either.

> Segher, can you specify the failed test cases? Then, it should be possible
> to reproduce and improve that all. In such a collaborative way, we can also
> achieve a solution.

What failed test cases?  You completely lost me.

We used to do the wrong thing in combine.  Now that my fix was reverted, we
still do.  This should be undone soonish, so that I can commit an actual UNCSE
implementation, which fixes all "regressions" (quotes, because they are not!)
caused by my previous patch, and does a lot more too.  It also will allow us
to remove a bunch of other code from combine, speeding up things a lot more
(things that keep a copy of a set if the dest is used more than once).  There
has been talk of doing an UNCSE for over twenty years now, so annoying me
enough to get this done is a good result of this whole thing :-)