[Bug target/101922] mips: illegal instruction at -O3 with -mmsa -mloongson-mmi

2021-08-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101922

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Xi Ruoyao :

https://gcc.gnu.org/g:f93f0868919ab32bfbc24adb40158298031a4d58

commit r12-3063-gf93f0868919ab32bfbc24adb40158298031a4d58
Author: Xi Ruoyao 
Date:   Fri Aug 20 22:52:57 2021 +0800

mips: msa: truncate immediate shift amount [PR101922]

When -mloongson-mmi is enabled, SHIFT_COUNT_TRUNCATED is turned off.
This causes an untruncated immediate shift amount to be output into the asm,
and the GNU assembler refuses to assemble it.

Truncate the immediate shift amount when outputting the asm instruction to
make GAS happy again.

gcc/

PR target/101922
* config/mips/mips-protos.h (mips_msa_output_shift_immediate):
  Declare.
* config/mips/mips.c (mips_msa_output_shift_immediate): New
  function.
* config/mips/mips-msa.md (vashl3, vashr3,
  vlshr3): Call it.

gcc/testsuite/

PR target/101922
* gcc.target/mips/pr101922.c: New test.
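
As a rough illustration of the truncation this commit describes (not the actual mips_msa_output_shift_immediate code), an immediate shift count can be reduced modulo the element width before it is printed. The helper name truncate_shift_amount and the power-of-two element width are assumptions made for this sketch.

```cpp
#include <cstdint>

// Hypothetical helper, not GCC code: reduce an immediate shift count
// modulo the element width so the assembler sees an in-range value.
// Assumes element_bits is a power of two (8, 16, 32 or 64 for MSA).
constexpr std::uint64_t truncate_shift_amount(std::uint64_t amount,
                                              unsigned element_bits) {
    return amount & (element_bits - 1u);
}
```

With 32-bit elements, for example, a shift count of 35 would be emitted as 3.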

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2021-08-22 Thread arthur200126 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Mingye Wang  changed:

   What|Removed |Added

 CC||arthur200126 at gmail dot com

--- Comment #30 from Mingye Wang  ---
One of the weird, probably SEH-related, things is that the lack-of-alignment
behavior of comment 28 and attachment 1 is not reproduced on a "normal" Linux
GCC with __attribute__((ms_abi)) sprinkled all over to get the right calling
convention. The code takes the same shape, uses mostly the same registers, but
the `and rsp, -32` is just either not there or placed wrong.
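
For readers less familiar with the idiom, `and rsp, -32` rounds the stack pointer down to a 32-byte boundary. A minimal sketch of the same arithmetic, with the hypothetical name align_down and assuming the alignment is a power of two:

```cpp
#include <cstdint>

// Round an address down to a multiple of `alignment` (a power of two).
// For alignment == 32 the mask is ~31, i.e. -32 in two's complement,
// which matches the immediate in `and rsp, -32`.
constexpr std::uint64_t align_down(std::uint64_t address,
                                   std::uint64_t alignment) {
    return address & ~(alignment - 1u);
}
```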

[Bug c++/92494] ICE on function templates with placeholder return type decltype([]{}) and if return type doesn't match

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92494

--- Comment #1 from Andrew Pinski  ---
This is fixed in GCC 10+.

[Bug c++/88162] GCC does not accept non-type template parameters of class type

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88162

--- Comment #1 from Andrew Pinski  ---
ICC also rejects this at both -std=c++17 and -std=c++20:
(15): error: a nontype template parameter may not have class type
  template class T> using nttp_t = typename decltype(
f(T()) )::type;
 ^

(15): error: no instance of function template "f" matches the argument
list
argument types are: (int_constant<>)
  template class T> using nttp_t = typename decltype(
f(T()) )::type;
 ^
(6): note: this candidate was rejected because at least one template
argument could not be deduced
  template class T> id f( T )
   ^
  detected during instantiation of type "nttp_t" at line
22

(15): error: no instance of function template "f" matches the argument
list
argument types are: (char_constant<>)
  template class T> using nttp_t = typename decltype(
f(T()) )::type;
 ^
(6): note: this candidate was rejected because at least one template
argument could not be deduced
  template class T> id f( T )
   ^
  detected during instantiation of type "nttp_t" at line
22

(15): error: no instance of function template "f" matches the argument
list
argument types are: (long_constant<>)
  template class T> using nttp_t = typename decltype(
f(T()) )::type;
 ^
(6): note: this candidate was rejected because at least one template
argument could not be deduced
  template class T> id f( T )
   ^
  detected during instantiation of type "nttp_t" at line
22

(15): error: no instance of function template "f" matches the argument
list
argument types are: (voidp_constant<>)
  template class T> using nttp_t = typename decltype(
f(T()) )::type;
 ^
(6): note: this candidate was rejected because at least one template
argument could not be deduced
  template class T> id f( T )
   ^
  detected during instantiation of type "nttp_t" at
line 22

compilation aborted for  (code 2)

[Bug c++/86959] Use of a variadic alias template unexpectedly breaks compilation

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86959

--- Comment #1 from Andrew Pinski  ---
clang ICEs with -std=c++20 :).

GCC ICEs starting in GCC 9:
: In substitution of 'template template using Alias
= Outer<  >::Inner [with T = {void};
 = void]':
:18:38:   required from here
:8:11: internal compiler error: in lookup_template_class_1, at
cp/pt.c:10185
8 | using Alias = Inner;
  |   ^
0x1dd5279 internal_error(char const*, ...)
???:0
0x74caf1 fancy_abort(char const*, int, char const*)
???:0
0x9c6efe lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int, int)
???:0
0x9b1d4d tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0x9b1df7 tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0x9e8992 instantiate_template(tree_node*, tree_node*, int)
???:0
0x9b22a9 tsubst(tree_node*, tree_node*, int, tree_node*)
???:0
0x9c557d lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int, int)
???:0
0xa1e7fd finish_template_type(tree_node*, tree_node*, int)
???:0
0x97e6b5 c_parse_file()
???:0
0xb03782 c_common_parse_file()
???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug c++/86234] non-type template argument is not a constant expression

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86234

--- Comment #1 from Andrew Pinski  ---
If I place "A t;" before main(), then ICC, clang and MSVC all accept the
code. That seems a bit backwards to me, but I don't really know the C++
standard. It might point to why GCC is accepting the code, though.

[Bug c++/82947] Variadic `using` directive incorrectly compiled without base classes (with class template argument deduction)

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82947

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.0
   Keywords||accepts-invalid
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
This was fixed in GCC 11+, most likely by r11-6942.

[Bug c++/15272] lookup, dependent base

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15272

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.0

[Bug c++/82947] Variadic `using` directive incorrectly compiled without base classes (with class template argument deduction)

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82947

--- Comment #1 from Andrew Pinski  ---
In GCC 11+ we get:
: In instantiation of 'struct foo >':
:16:19:   required from here
:8:29: error: type 'main()::' is not a base type for type
'foo >'
8 | using Ts::operator()...;
  | ^~~

[Bug c++/78753] non-ambiguous overload resolution with function template partial ordering rules

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78753

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||accepts-invalid

--- Comment #2 from Andrew Pinski  ---
Only Clang rejects this code as being ambiguous.

[Bug c++/62227] [DR535] Templated move not elided

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62227

--- Comment #5 from Andrew Pinski  ---
C++17 and C++20 modes no longer print move since GCC 7.
Most likely due to the patches to implement p0135.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0135r1.html
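
A small sketch of what p0135 changes, assuming a C++17 or later mode. It does not reproduce the templated constructor from this report; Tracker and make_tracker are hypothetical names used only to make the absence of a move observable.

```cpp
// Counts move constructions so the effect of elision is observable.
struct Tracker {
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) = default;
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::moves = 0;

// Returns a prvalue; under C++17 rules it initializes the caller's
// object directly, so no move constructor call is involved.
Tracker make_tracker() { return Tracker(); }

// Returns how many moves were needed to initialize a local from the call.
int count_moves_on_return() {
    Tracker::moves = 0;
    Tracker t = make_tracker();
    (void)t;
    return Tracker::moves;
}
```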

[Bug c++/16191] Note for missing 'template' reports wrong template parameter

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16191

--- Comment #8 from Andrew Pinski  ---
Looks like the resolution of DR1710 (though it was supposed to be C++17+)
causes the code without the template to be accepted, which means this should be
rejected for C++98, C++03, C++11 and C++14.

[Bug c++/94057] [9 Regression] -std=gnu++20 causes failure naming nested templated class since r9-4536

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94057

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|9.4 |10.0

[Bug c++/16191] Note for missing 'template' reports wrong template parameter

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16191

--- Comment #7 from Andrew Pinski  ---
Hmm, this code started being accepted in GCC 10+; I suspect by the fix for PR
94057.
Was that really expected?

[Bug target/30484] INT_MIN % -1 is well defined for -fwrapv

2021-08-22 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30484

--- Comment #12 from Vincent Lefèvre  ---
(In reply to Joseph S. Myers from comment #10)
> There is still a bug for the -fwrapv case, where clearly both INT_MIN / -1
> and INT_MIN % -1 should be well defined, but probably the extra checks
> if implemented should only be enabled implicitly for -fwrapv, not for C
> standards conformance modes.

I don't understand why it is still a bug for -fwrapv. Mathematically, INT_MIN %
-1 gives 0; there is no wrapping for the modulo. So, -fwrapv shouldn't
introduce a change.
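
To make the arithmetic concrete: the remainder of any integer divided by -1 is 0, so the result can be produced without executing a division. The following is only a sketch of the kind of extra check under discussion, with the hypothetical name checked_rem; it is not what GCC emits.

```cpp
#include <climits>

// The remainder of any integer divided by -1 is 0, so the result can
// be returned without executing a division at all.
// Precondition: divisor != 0.
constexpr int checked_rem(int dividend, int divisor) {
    return (divisor == -1) ? 0 : dividend % divisor;
}
```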

[Bug c++/78223] [DR1454] struct containing default member initializer fails constexpr test in aggregate initialization

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78223

Andrew Pinski  changed:

   What|Removed |Added

Summary|struct containing default   |[DR1454] struct containing
   |member initializer fails|default member initializer
   |constexpr test in aggregate |fails constexpr test in
   |initialization  |aggregate initialization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-08-22
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
:3:8: error: modification of '' is not a constant expression
3 | } y {{}};
  |^

[Bug c++/61991] Destructors not always called for statically initialized thread_local objects

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61991

--- Comment #2 from Andrew Pinski  ---
GCC, clang and ICC all have the same behavior: if y is not used, y is
neither initialized nor destroyed.

[Bug c++/92073] references/pointers to thread_local are not constant expressions

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92073

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2021-08-22
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug c++/59994] [meta-bug] thread_local

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59994
Bug 59994 depends on bug 60673, which changed state.

Bug 60673 Summary: c++11 static thread_local members may cause a segfault when accessed via 'this->'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60673

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/60702] thread_local initialization

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60702

Andrew Pinski  changed:

   What|Removed |Added

 CC||michael at ensslin dot cc

--- Comment #26 from Andrew Pinski  ---
*** Bug 60673 has been marked as a duplicate of this bug. ***

[Bug c++/60673] c++11 static thread_local members may cause a segfault when accessed via 'this->'

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60673

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
Which makes this a dup of bug 60702.

*** This bug has been marked as a duplicate of bug 60702 ***

[Bug target/15533] Missed move to partial register

2021-08-22 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15533

Peter Cordes  changed:

   What|Removed |Added

 CC||peter at cordes dot ca

--- Comment #5 from Peter Cordes  ---
The new asm is less bad, but still not good.  PR53133 is closed, but this code-gen
is a new instance of partial-register writing with xor al,al.  Also related:
PR82940 re: identifying bitfield insert patterns in the middle-end; hopefully
Andrew Pinski's planned set of patches to improve that can help back-ends do a
better job?

If we're going to read a 32-bit reg after writing an 8-bit reg (causing a
partial-register stall on Nehalem and earlier), we should be doing

  mov  a, %al   # merge into the low byte of RAX
  ret

Haswell and newer Intel don't rename the low byte partial register separately
from the full register, so they behave like AMD and other non-P6 /
non-Sandybridge CPUs: dependency on the full register.  That's good for this
code; in this case the merging is necessary and we don't want the CPU to guess
that it won't be needed later.  The load+ALU-merge uops can micro-fuse into a
single uop for the front end.

 xor %al,%al still has a false dependency on the old value of RAX because it's
not a zeroing idiom; IIRC in my testing it's at least as good to do  mov $0,
%al.  Both instructions are 2 bytes long.

*
https://stackoverflow.com/questions/41573502/why-doesnt-gcc-use-partial-registers
 survey of the ways partial regs are handled on Intel P6 family vs. Intel
Sandybridge vs. Haswell and later vs. non-Intel and Intel Silvermont etc.
*
https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to
- details of my testing on Haswell / Skylake.



*If* we still care about  -mtune=nehalem  and other increasingly less relevant
CPUs, we should be avoiding a partial register stall for those tuning options
with something like

   movzbl   a, %edx
   and  $-256, %eax
   or   %edx, %eax

i.e. what we're already doing, but spend a 5-byte AND-immediate instead of a
2-byte xor %al,%al or mov $0, %al

(That's what clang always does, so it's missing the code-size optimization.
https://godbolt.org/z/jsE57EKcb shows a similar case of return (a&0xFF00u)
| (b&0xFFu); with two register args)
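
For reference, a C++ rendering of what the movzbl / and / or sequence computes. The name merge_low_byte is hypothetical, and the bug's original testcase is not quoted in this message, so this is a sketch of the operation only.

```cpp
#include <cstdint>

// Keep the upper 24 bits of `wide` and replace its low byte with `low`.
// 0xFFFFFF00u is the 32-bit pattern of -256, the AND immediate above.
constexpr std::uint32_t merge_low_byte(std::uint32_t wide, std::uint8_t low) {
    return (wide & 0xFFFFFF00u) | low;
}
```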

-

The penalty on Pentium-M through Nehalem is to stall for 2-3 cycles while a
merging uop is inserted.  The penalty on earlier P6 (PPro / Pentium III) is to
stall for 5-6 cycles until the partial-register write retires.

The penalty on Sandybridge (and maybe Ivy Bridge if it renames AL) is no stall,
just insert a merging uop.

On later Intel, and AMD, and Silvermont-family Intel, writing AL has a
dependency on the old RAX; it's a merge on the spot.

BTW, modern Intel does still rename AH separately, and merging does require the
front-end to issue a merging uop in a cycle by itself.  So writing AH instead
of AL would be different.

[Bug c++/60673] c++11 static thread_local members may cause a segfault when accessed via 'this->'

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60673

--- Comment #3 from Andrew Pinski  ---
(In reply to Jonathan Wakely from comment #2)
> This seems to be fixed in GCC 5 onwards (and recent Clang versions).

It was not fixed until GCC 7.5, 8.4 and 9+.
Here is a reduced testcase which shows it was not fixed until then:
extern "C" void abort(void);
struct tt
{
  int *tt1 = new int{1};
  int bucket_count() const {return *tt1;}
};
struct A{
static thread_local tt  s;

int f() {
return this->s.bucket_count();
}
int g() {
return A::s.bucket_count();
}
int h() {
return s.bucket_count();
}
};
thread_local tt A::s;
int main() {
if (A{}.f() != 1) abort();
return 0;
}

[Bug c++/81880] thread_local static member template initialisation fails

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81880

--- Comment #4 from Andrew Pinski  ---
Reduced testcase:
extern "C" void abort(void);
struct tt
{
  int *tt1 = new int{1};
  int bucket_count() const {return *tt1;}
};
struct A {
  template <typename T> thread_local static tt m;
};
template <typename T> thread_local tt A::m{};
int main() {
  if ( A::m<int>.bucket_count() != 1) abort();
  return 0;
}

[Bug ipa/101949] [11/12 Regression] git miscompiled with -flto -fipa-pta since r11-5061-g85ebbabd85e03bdc

2021-08-22 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101949

--- Comment #17 from H.J. Lu  ---
(In reply to H.J. Lu from comment #16)
> On Linux/x86-64 with -m32, r12-3059 gave
> 
> FAIL: gcc.dg/lto/pr101949 c_lto_pr101949_0.o-c_lto_pr101949_1.o execute -O2
> -fipa-pta -flto -flto-partition=1to1

It also failed with -m64.

[Bug c++/44613] Declaring an array with non-constant length inside a switch corrupts stack pointer.

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44613

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0
 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Andrew Pinski  ---
Fixed.

[Bug c/98397] C2X: pointers to arrays with qualifiers are now pointers to qualified types

2021-08-22 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98397

Martin Uecker  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Martin Uecker  ---
Fixed on master.

[Bug middle-end/82940] Suboptimal code for (a & 0x7f) | (b & 0x80) on powerpc

2021-08-22 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82940

Peter Cordes  changed:

   What|Removed |Added

 CC||peter at cordes dot ca

--- Comment #6 from Peter Cordes  ---
For a simpler test case, GCC 4.8.5 did redundantly mask before using
bitfield-insert, but GCC 9.2.1 doesn't.


unsigned merge2(unsigned a, unsigned b){
return (a&0xFF00u) | (b&0xFFu);
}

https://godbolt.org/z/froExaPxe
# PowerPC (32-bit) GCC 4.8.5
rlwinm 4,4,0,0xff # b &= 0xFF is totally redundant
rlwimi 3,4,0,24,31
blr

# power64 GCC 9.2.1 (ATI13.0)
rlwimi 3,4,0,255# bit-blend according to mask, rotate count=0
rldicl 3,3,0,32 # Is this zero-extension to 64-bit redundant?
blr

But ppc64 GCC does zero-extension of the result from 32 to 64-bit, which is
probably not needed unless the calling convention has different requirements
for return values than for incoming args.  (I don't know PPC well enough.)

So for at least some cases, modern GCC does ok.

Also, when the blend isn't split at a byte boundary, even GCC4.8.5 manages to
avoid redundant masking before the bitfield-insert.

unsigned merge2(unsigned a, unsigned b){
return (a & 0xFF80u) | (b & 0x7Fu);
}

rlwimi 3,4,0,25,31   # GCC4.8.5, 32-bit so no zero-extension
blr
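
For anyone who doesn't read PowerPC asm daily, here is a small emulation of rlwimi (rotate left word immediate then mask insert) as I understand its semantics, using the Power convention that bit 0 is the most significant bit. The function names are hypothetical and only the 32-bit form is modeled.

```cpp
#include <cstdint>

// Build a 32-bit mask with ones in bits mb..me, using the Power
// convention that bit 0 is the most significant bit. If mb > me
// the mask wraps around.
constexpr std::uint32_t ppc_mask(unsigned mb, unsigned me) {
    return (mb <= me)
        ? ((0xFFFFFFFFu >> mb) & (0xFFFFFFFFu << (31u - me)))
        : ((0xFFFFFFFFu >> mb) | (0xFFFFFFFFu << (31u - me)));
}

// Rotate left by sh (0..31) without ever shifting by 32.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned sh) {
    return (sh == 0) ? x : ((x << sh) | (x >> (32u - sh)));
}

// Emulates `rlwimi ra,rs,sh,mb,me`: rotate rs, then insert the
// masked bits into ra, leaving the other bits of ra unchanged.
constexpr std::uint32_t rlwimi(std::uint32_t ra, std::uint32_t rs,
                               unsigned sh, unsigned mb, unsigned me) {
    return (rotl32(rs, sh) & ppc_mask(mb, me)) | (ra & ~ppc_mask(mb, me));
}
```

With sh=0 and mask bits 24..31 the inserted field is the low byte (mask 0xFF); with 25..31 it is the low 7 bits (mask 0x7F), which is why no separate masking of the source register is needed first.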

[Bug c/98397] C2X: pointers to arrays with qualifiers are now pointers to qualified types

2021-08-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98397

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Martin Uecker :

https://gcc.gnu.org/g:972eab51f53d1db26864ec7d62d40c2ff83407ec

commit r12-3060-g972eab51f53d1db26864ec7d62d40c2ff83407ec
Author: Martin Uecker 
Date:   Sun Aug 22 23:47:58 2021 +0200

Correct treatment of qualifiers for pointers to arrays for C2X [PR98397]

2021-08-22  Martin Uecker  

gcc/c/
PR c/98397
* c-typeck.c (comp_target_types): Change pedwarn to pedwarn_c11
for pointers to arrays with qualifiers.
(build_conditional_expr): For C23 don't lose qualifiers for
pointers
to arrays when the other pointer is a void pointer. Update
warnings.
(convert_for_assignment): Update warnings for C2X when converting
from
void* with qualifiers to a pointer to array with the same
qualifiers.

gcc/testsuite/
PR c/98397
* gcc.dg/c11-qual-1.c: New test.
* gcc.dg/c2x-qual-1.c: New test.
* gcc.dg/c2x-qual-2.c: New test.
* gcc.dg/c2x-qual-3.c: New test.
* gcc.dg/c2x-qual-4.c: New test.
* gcc.dg/c2x-qual-5.c: New test.
* gcc.dg/c2x-qual-6.c: New test.
* gcc.dg/c2x-qual-7.c: New test.
* gcc.dg/pointer-array-quals-1.c: Remove unnecessary flag.
* gcc.dg/pointer-array-quals-2.c: Remove unnecessary flag.

[Bug c++/55885] Modulo operator crashes for int and long variables if they have minimal value

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55885

Andrew Pinski  changed:

   What|Removed |Added

 CC||Eric.Deplagne at nerim dot net

--- Comment #8 from Andrew Pinski  ---
*** Bug 29511 has been marked as a duplicate of this bug. ***

[Bug c/29511] 0x80000000/-1 causes FPE on Intel/AMD

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29511

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|INVALID |DUPLICATE

--- Comment #3 from Andrew Pinski  ---
Dup of bug 55885.

*** This bug has been marked as a duplicate of bug 55885 ***

[Bug c++/55885] Modulo operator crashes for int and long variables if they have minimal value

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55885

--- Comment #7 from Andrew Pinski  ---
Note PR 30484 is for the -fwrapv issue with %.

[Bug c++/55885] Modulo operator crashes for int and long variables if they have minimal value

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55885

Andrew Pinski  changed:

   What|Removed |Added

 CC||jens.seifert at de dot ibm.com

--- Comment #6 from Andrew Pinski  ---
*** Bug 93013 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/93013] PPC: optimization around modulo leads to incorrect result

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93013

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|INVALID |DUPLICATE

--- Comment #8 from Andrew Pinski  ---
Dup of bug 55885.

*** This bug has been marked as a duplicate of bug 55885 ***

[Bug ipa/101949] [11/12 Regression] git miscompiled with -flto -fipa-pta since r11-5061-g85ebbabd85e03bdc

2021-08-22 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101949

H.J. Lu  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #16 from H.J. Lu  ---
On Linux/x86-64 with -m32, r12-3059 gave

FAIL: gcc.dg/lto/pr101949 c_lto_pr101949_0.o-c_lto_pr101949_1.o execute -O2
-fipa-pta -flto -flto-partition=1to1

[Bug libstdc++/89979] subtract_with_carry_engine incorrect carry flag

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89979

--- Comment #3 from Andrew Pinski  ---
LLVM's libc++ does not go into the 0 loop but still does not do a good job:
4294967295 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1
0 0 0 0 0 0 4294967295 1
0 0 0 0 0 4294967295 4294967295 1
0 0 0 0 4294967295 4294967295 4294967295 1
0 0 0 4294967295 4294967295 4294967295 4294967294 0
0 0 4294967295 4294967295 4294967295 4294967294 4294967295 0
0 4294967295 4294967295 4294967295 4294967294 4294967295 4294967295 0
4294967295 4294967295 4294967295 4294967294 4294967295 4294967295 4294967294 0
4294967295 4294967295 4294967294 4294967295 4294967295 4294967294 0 0
4294967295 4294967294 4294967295 4294967295 4294967294 0 0 0
4294967294 4294967295 4294967295 4294967294 0 0 4294967295 1


While libstdc++ does seem to get into a loop:
4294967295 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 2
0 0 0 0 0 0 0 0 3
0 0 0 0 0 0 0 0 4
0 0 0 0 0 0 0 0 5
0 0 0 0 0 0 0 0 6
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 2
0 0 0 0 0 0 0 0 3
0 0 0 0 0 0 0 0 4

[Bug c++/87312] statics in lambdas should be weak not local symbols

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87312

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |10.0
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED
  Known to work||10.1.0
  Known to fail||9.4.0

--- Comment #1 from Andrew Pinski  ---
This is fixed in GCC 10+, most likely by r10-6110 (there are other changes in
the area of linkage too).

[Bug libstdc++/102015] [missed optimization] Small memory overhead in _Rb_tree_impl (fix would require ABI break)

2021-08-22 Thread kamkaz at windowslive dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102015

--- Comment #2 from Kamil Kaznowski  ---
(In reply to Andrew Pinski from comment #1)
> https://stackoverflow.com/questions/66573773/is-there-a-reason-for-8-bytes-
> of-size-overhead-in-libstdc-stdmultiset-map

This is my post, I forgot to post a bug that day.

There is also a mistake by me in the comments - MSVC _Compressed_pair works by
template specialization, not SFINAE.
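
To illustrate what "by template specialization" means here, a minimal sketch of a compressed pair that selects the empty-base layout through a partial specialization on a bool parameter. This is not MSVC's _Compressed_pair; CompressedPair and the two aliases are hypothetical names, and std::is_final assumes C++14 or later.

```cpp
#include <functional>
#include <type_traits>

// Primary template: First is stored as an ordinary member.
template <class First, class Second,
          bool UseEbo = std::is_empty<First>::value &&
                        !std::is_final<First>::value>
struct CompressedPair {
    First first_;
    Second second;
    First& first() { return first_; }
};

// Partial specialization: an empty, non-final First becomes a base
// class, so the empty base optimization lets it occupy no storage.
template <class First, class Second>
struct CompressedPair<First, Second, true> : private First {
    Second second;
    First& first() { return *this; }
};

// An empty comparator picks the specialization; int picks the primary.
using EmptyFirstPair = CompressedPair<std::less<int>, long>;
using PlainPair = CompressedPair<int, long>;
```

The choice between the two layouts is made by which specialization matches, with no enable_if-style overload removal involved.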

[Bug tree-optimization/79334] Segfault on tree loop hoisting

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79334

--- Comment #5 from Andrew Pinski  ---
(In reply to Alan Modra from comment #4)
> When you have the tree optimization bug fixed, this becomes an rtl
> optimization bug since rtl pre does the same as tree pre..

GCSE was fixed with PR 78812. So this is just a bug on the gimple level still.

[Bug c++/77312] Lambda that deletes itself accesses freed memory, but only if class is templated

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77312

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   Target Milestone|--- |8.0
 Resolution|--- |FIXED

--- Comment #8 from Andrew Pinski  ---
Fixed.

[Bug c++/77312] Lambda that deletes itself accesses freed memory, but only if class is templated

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77312

--- Comment #7 from Andrew Pinski  ---
This is fixed in GCC 8:

  if (SAVE_EXPR <(struct LambdaHolder *) this> != 0B)
{
  try
{
  LambdaHolder::~LambdaHolder (SAVE_EXPR <(struct LambdaHolder *)
this>);
}
  finally
{
  operator delete ((void *) SAVE_EXPR <(struct LambdaHolder *) this>,
8);
}
}
  else
{
  <<< Unknown tree: void_cst >>>
} >;

[Bug rtl-optimization/57448] GCSE generates incorrect code with acquire barrier

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57448

Andrew Pinski  changed:

   What|Removed |Added

 CC||lucenadeveloper at gmail dot com

--- Comment #5 from Andrew Pinski  ---
*** Bug 70889 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/70889] memory reordering across loads tagged as memory_order_seq_cst

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70889

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Andrew Pinski  ---
Dup of bug 57448 which is fixed in GCC 8.

*** This bug has been marked as a duplicate of bug 57448 ***

[Bug rtl-optimization/57448] GCSE generates incorrect code with acquire barrier

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57448

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.0
   Keywords||wrong-code

[Bug fortran/94070] Assumed-rank arrays – bounds mishandled, SIZE/SHAPE/UBOUND/LBOUND

2021-08-22 Thread sandra at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94070

sandra at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |sandra at gcc dot gnu.org

--- Comment #8 from sandra at gcc dot gnu.org ---
In gfc_desc_to_cfi_desc (in libgfortran/runtime/ISO_Fortran_binding.c):

/* Assumed size arrays have gfc ubound == 0 and CFI extent = -1.  */
if (n == GFC_DESCRIPTOR_RANK (s) - 1
&& GFC_DESCRIPTOR_LBOUND(s, n) == 1
&& GFC_DESCRIPTOR_UBOUND(s, n) == 0)
  d->dim[n].extent = -1;
else
  d->dim[n].extent = (CFI_index_t)GFC_DESCRIPTOR_UBOUND(s, n)
 - (CFI_index_t)GFC_DESCRIPTOR_LBOUND(s, n) + 1;

The comment and test are only correct if the lower bound of the array dimension
either defaults to 1 or is explicitly specified as 1.  It does appear that the
ubound == 0 part is true, but this means e.g. an array dimension specified as
-3:* is indistinguishable from -3:0.

I think this needs to be corrected at the point where the GFC descriptor is
created; perhaps set ubound = lbound - 1?  Or also set lbound = 1 as the code
snippet above checks?  Assumed-size arrays can't be pointers or allocatable so
their bounds don't need to be preserved across calls, but maybe GFC descriptors
are used for other purposes where it matters?
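
To make the ambiguity concrete, here is a standalone model of the quoted extent computation for a single dimension. It is not libgfortran code: cfi_extent is a hypothetical name, CFI_index_t is modeled as std::int64_t, and is_last_dim stands in for the n == GFC_DESCRIPTOR_RANK (s) - 1 test.

```cpp
#include <cstdint>

// Mirrors the extent computation quoted above for one dimension.
// `is_last_dim` stands in for n == GFC_DESCRIPTOR_RANK (s) - 1.
constexpr std::int64_t cfi_extent(std::int64_t lbound, std::int64_t ubound,
                                  bool is_last_dim) {
    return (is_last_dim && lbound == 1 && ubound == 0)
        ? -1
        : ubound - lbound + 1;
}
```

With this model, a last dimension of 1:* (stored as lbound 1, ubound 0) yields -1, while -3:* (stored as lbound -3, ubound 0) yields 4, exactly as -3:0 would.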

[Bug rtl-optimization/70889] memory reordering across loads tagged as memory_order_seq_cst

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70889

--- Comment #1 from Andrew Pinski  ---
Testcase:

#include <atomic>
#include <cstddef>

std::atomic<std::size_t> seq_;
std::size_t value;

auto load()
{
std::size_t copy;
std::size_t seq0;
do
{
seq0 = seq_.load();
if (!seq0) continue;
copy = value;
seq0 = seq_.load();
} while (!seq0);

return copy;
}

[Bug rtl-optimization/97836] wrong code at -O1 on x86_64-pc-linux-gnu by r11-5029

2021-08-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97836

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Jan Hubicka  ---
EAF_UNUSED is now really unused.

[Bug ipa/101257] [11/12 Regression] Maybe wrong code since IPA mod ref was introduced

2021-08-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101257

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |INVALID

--- Comment #7 from Jan Hubicka  ---
Thanks for the testcase.  This is indeed an aliasing violation.

We do:

ipa-modref: call stmt md5_single (, digest_18(D));  
ipa-modref: call to md5_single/11 does not use ref: MEM[(uint64_t *)_8] alias
sets: 3->3

which makes us optimize it away.  This is a uint64_t store from
*((uint64_t *) & buf[i - 8]) = (uint64_t) len *8;
and md5_single does uint32_t loads.

So I am marking this as invalid.
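
For completeness, one common way to write such a store without the aliasing violation is to go through memcpy instead of casting the buffer pointer to uint64_t *. This is a sketch only; store_u64 and load_u32 are hypothetical helpers, not code from the reported program.

```cpp
#include <cstdint>
#include <cstring>

// Store a 64-bit value into a byte buffer without forming a
// uint64_t lvalue, so later uint32_t reads of the same bytes are not
// assumed to be unrelated by type-based alias analysis.
inline void store_u64(unsigned char* dst, std::uint64_t value) {
    std::memcpy(dst, &value, sizeof value);
}

// Matching load helper for reading the bytes back as 32-bit words.
inline std::uint32_t load_u32(const unsigned char* src) {
    std::uint32_t word;
    std::memcpy(&word, src, sizeof word);
    return word;
}
```

The length store from the testcase would then read something like store_u64(&buf[i - 8], (uint64_t) len * 8). Compilers typically turn these fixed-size memcpy calls into plain loads and stores.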

[Bug target/101296] Addition of x86 addsub SLP patterned slowed down 433.milc by 12% on znver2 with -Ofast -flto

2021-08-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101296

--- Comment #7 from Jan Hubicka  ---
"every access" means that we no longer track individual bases+offsets+sizes and
everything matching the base/ref alias set will be considered conflicting.

I planned to implement smarter merging of accesses so we do not run out of
limits for such sequential case.  Will look into it.

[Bug libstdc++/102015] [missed optimization] Small memory overhead in _Rb_tree_impl (fix would require ABI break)

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102015

--- Comment #1 from Andrew Pinski  ---
https://stackoverflow.com/questions/66573773/is-there-a-reason-for-8-bytes-of-size-overhead-in-libstdc-stdmultiset-map

[Bug ipa/101949] [11/12 Regression] git miscompiled with -flto -fipa-pta since r11-5061-g85ebbabd85e03bdc

2021-08-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101949

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:9b08f7764cecd16cba84944f2a8b67a7f73a7ce7

commit r12-3059-g9b08f7764cecd16cba84944f2a8b67a7f73a7ce7
Author: Jan Hubicka 
Date:   Sun Aug 22 20:57:19 2021 +0200

Clear EAF_NOCLOBBER for indirect calls

gcc/ChangeLog:

2021-08-22  Jan Hubicka  
Martin Liska  

PR middle-end/101949
* ipa-modref.c (analyze_ssa_name_flags): Indirect call implies
~EAF_NOCLOBBER.

gcc/testsuite/ChangeLog:

2021-08-22  Jan Hubicka  
Martin Liska  

* gcc.dg/lto/pr101949_0.c: New test.
* gcc.dg/lto/pr101949_1.c: New test.

[Bug target/58897] Improve 128/64 division

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58897

Andrew Pinski  changed:

   What|Removed |Added

 CC||kamkaz at windowslive dot com

--- Comment #2 from Andrew Pinski  ---
*** Bug 102014 has been marked as a duplicate of this bug. ***

[Bug c/102014] [missed optimization] __uint128_t % uint64_t emits a call to __umodti3 instead of div instruction

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102014

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Dup of bug 58897.

*** This bug has been marked as a duplicate of bug 58897 ***

[Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned

2021-08-22 Thread arthur200126 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

--- Comment #6 from Mingye Wang  ---
FWIW, the ticket about aligning the stack in the prologue is bug 54412.
Apologies for the noisy emails, but I can't add a see-also entry here.

[Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned

2021-08-22 Thread arthur200126 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

Mingye Wang  changed:

   What|Removed |Added

 CC||arthur200126 at gmail dot com

--- Comment #5 from Mingye Wang  ---
I think I am bumping into the same bug with GCC 10.3.0, MinGW64 environment, in
an SIMD library at [1].
  [1]: https://github.com/google/highway/issues/332

There was a related bug at [2] showing another small (not quite minimal) test
case.
  [2]: https://osdn.net/projects/mingw/ticket/39565

The VMOVUPS idea seems cool -- can we do it?

[Bug c++/102015] New: [missed optimization] Small memory overhead in _Rb_tree_impl (fix would require ABI break)

2021-08-22 Thread kamkaz at windowslive dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102015

Bug ID: 102015
   Summary: [missed optimization] Small memory overhead in
_Rb_tree_impl (fix would require ABI break)
   Product: gcc
   Version: 11.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kamkaz at windowslive dot com
  Target Milestone: ---

Current definition of _Rb_tree_key_compare causes size overhead for all
std::(multi)set/map-s:

  template<typename _Key_compare>
    struct _Rb_tree_key_compare
    {
      _Key_compare  _M_key_compare;
      ...
    };

If it were possible to change the ABI in the future, I think this could be
improved by using the empty base optimization for comparators that are not
final classes, or by adding [[no_unique_address]] to _M_key_compare.
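
The size effect can be sketched with two simplified stand-in structs. This is not the libstdc++ definition: plain_impl and compact_impl are hypothetical names, and the single pointer member only stands in for the rest of the tree header.

```cpp
#include <functional>

// Comparator stored as an ordinary member: even an empty comparator
// occupies at least one byte, usually followed by padding.
template<typename Compare>
struct plain_impl {
    Compare cmp;
    void*   root;
};

// Comparator marked [[no_unique_address]]: an empty comparator is allowed
// to share its storage with other members, so it may add no size at all.
template<typename Compare>
struct compact_impl {
    [[no_unique_address]] Compare cmp;
    void* root;
};
```

Comparing sizeof(plain_impl<std::less<int>>) with sizeof(compact_impl<std::less<int>>) shows the difference on a given target. The exact numbers depend on the ABI, and a compiler that does not implement the attribute will report the same size for both.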

[Bug c/102014] [missed optimization] __uint128_t % uint64_t emits a call to __umodti3 instead of div instruction

2021-08-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102014

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
No, that is not a safe optimization.  The x86 128-bit by 64-bit DIV instruction
will #DE if the quotient is larger than ~(uint64_t) 0, you won't get a modulo
in that case even when the modulo is guaranteed to be representable.
Consider e.g. a == b == 0x3fffULL and n == 3.
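
To make the overflow condition concrete, here is a small sketch that checks whether the quotient of a 128-bit product fits in 64 bits. It relies on the GCC/Clang __uint128_t extension; the helper names quotient_overflows and mulmod are hypothetical, and any inputs passed to them are illustrative rather than the values quoted above.

```cpp
#include <cstdint>

// Returns true when (a * b) / n does not fit in 64 bits, which is the case
// where a single x86 128-by-64-bit DIV would raise #DE instead of returning
// a result.  n must be non-zero.
static bool quotient_overflows(std::uint64_t a, std::uint64_t b, std::uint64_t n)
{
    __uint128_t product  = static_cast<__uint128_t>(a) * b;
    __uint128_t quotient = product / n;
    return quotient > UINT64_MAX;
}

// The remainder is always representable in 64 bits because it is below n.
static std::uint64_t mulmod(std::uint64_t a, std::uint64_t b, std::uint64_t n)
{
    return static_cast<std::uint64_t>((static_cast<__uint128_t>(a) * b) % n);
}
```

For example, with a and b both equal to 2^63 and n equal to 3, the quotient needs more than 64 bits even though the remainder is tiny, so the compiler cannot use a bare div without first proving the high half of the dividend is smaller than the divisor.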

[Bug fortran/102011] Infinite loop in heron iteration when optimization is enabled with gfortran 10.3.0

2021-08-22 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102011

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from kargl at gcc dot gnu.org ---
(In reply to Ralph Trenkler from comment #0)
> Created attachment 51346 [details]
> The fortran program, which does the infinite loop with compiler version
> 10.3.0
> 
> I wrote a function in gfortran-10.3.0, which computes the square root with
> the heron iteration method. Without optimization the program is okay, but if
> I turn on optimization, then it does an infinite loop. I use Kubuntu 20.04.

If I compile your program with -Wall, I get 

gfcx -o z -fcheck=all -Wall a.f90 && ./z
a.f90:11:13:

   11 |  if (abs((x2-x1)/(x1+x2)) < epsilon) exit
  | ^
Warning: 'epsilon' may be used uninitialized [-Wmaybe-uninitialized]


Sure enough.

  real(8), parameter :: epilson = 1.0e-15
^^^

 if (abs((x2-x1)/(x1+x2)) < epsilon) exit
^^^

One of these is wrong.

[Bug c/102014] New: [missed optimization] __uint128_t % uint64_t emits a call to __umodti3 instead of div instruction

2021-08-22 Thread kamkaz at windowslive dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102014

Bug ID: 102014
   Summary: [missed optimization] __uint128_t % uint64_t emits a
call to __umodti3 instead of div instruction
   Product: gcc
   Version: 11.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kamkaz at windowslive dot com
  Target Milestone: ---

The following code:

#include <stdint.h>
extern uint64_t safe_mul(uint64_t a, uint64_t b, uint64_t n) {
return (((__uint128_t)a)*b)%n;
}

compiled with -O2 for x86_64 architecture generates following assembly:

safe_mul(unsigned long, unsigned long, unsigned long):
mov rax, rdi
mov r8, rdx
sub rsp, 8
xor ecx, ecx
mul rsi
mov rsi, rdx
mov rdi, rax
mov rdx, r8
call __umodti3
add rsp, 8
ret

with a call to __umodti3, while it could be compiled to:

safe_mul(unsigned long, unsigned long, unsigned long):
mov rax, rdx
mul rcx
div r8
mov rax, rdx
ret

The same thing happens with the division __uint128_t / uint64_t, which emits an
unnecessary call to __udivti3 instead of a div instruction.

[Bug c++/102013] New: Incorrect aggregate initialization of union

2021-08-22 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102013

Bug ID: 102013
   Summary: Incorrect aggregate initialization of union
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

In this program:
```
#include <iostream>

struct A { int x = 1; };
struct B { int x = 0; };

union U {
A a;
B b;
};

int main() {
U u{};
std::cout << u.a.x;
}
```

GCC prints `0`: https://gcc.godbolt.org/z/8Tj4Y1Pv1
But the correct result is `1`, which is the default initializer of the first
union member `a` (see Clang's result).

[Bug c++/102012] New: GCC accepts any non-bool atomic constraint type

2021-08-22 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102012

Bug ID: 102012
   Summary: GCC accepts any non-bool atomic constraint type
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

struct S { };

template<typename T>
concept C = T(true);

decltype(C) x = 0;
decltype(C) y = 0;
decltype(C) z = 0;
decltype(C) w = 0;

https://godbolt.org/z/vEEPboYcq

[Bug c/100532] ICE: tree check: expected class ‘type’, have ‘exceptional’ (error_mark) in useless_type_conversion_p, at gimple-expr.c:259

2021-08-22 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100532

H.J. Lu  changed:

   What|Removed |Added

Summary|[12 Regression] ICE: tree   |ICE: tree check: expected
   |check: expected class   |class ‘type’, have
   |‘type’, have ‘exceptional’  |‘exceptional’ (error_mark)
   |(error_mark) in |in
   |useless_type_conversion_p,  |useless_type_conversion_p,
   |at gimple-expr.c:259|at gimple-expr.c:259

--- Comment #3 from H.J. Lu  ---
It isn't a GCC 12 regression. r10-0 has the same ICE. It was hidden on release
branches.

[Bug target/43147] SSE shuffle merge

2021-08-22 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147

H.J. Lu  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2021-August/
   ||577884.html

--- Comment #13 from H.J. Lu  ---
A patch is posted at

https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577884.html

[Bug objc/101666] Objective-C frontend crashes with `-fobjc-nilcheck`

2021-08-22 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101666

Iain Sandoe  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |iains at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2021-08-22
   Target Milestone|--- |9.5

--- Comment #5 from Iain Sandoe  ---
fixed on master, should be backported to open branches.

[Bug fortran/102011] New: Infinite loop in heron iteration when optimization is enabled with gfortran 10.3.0

2021-08-22 Thread Ralph-Trenkler--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102011

Bug ID: 102011
   Summary: Infinite loop in heron iteration when optimization is
enabled with gfortran 10.3.0
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ralph-trenk...@t-online.de
  Target Milestone: ---

Created attachment 51346
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51346&action=edit
The fortran program, which does the infinite loop with compiler version 10.3.0

I wrote a function in gfortran-10.3.0, which computes the square root with the
heron iteration method. Without optimization the program is okay, but if I turn
on optimization, then it does an infinite loop. I use Kubuntu 20.04.

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2021-08-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877

Tamar Christina  changed:

   What|Removed |Added

Version|11.0|12.0

--- Comment #5 from Tamar Christina  ---
We're in the process of rewriting these intrinsics, should be fixed in GCC 12.

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877

--- Comment #4 from Andrew Pinski  ---
Here is another example where GCC messes up:

#include "arm_neon.h"
uint8x16_t g(void);
uint8x16_t fun(uint8x16_t lo, uint8x16_t hi, uint8x16_t idx) {
  uint8x16x2_t tab = { .val = {g(), g()} };
  uint8x16_t res = vqtbl2q_u8(tab, idx);
  return res;
}

Note clang/LLVM messes the above one up even worse.

[Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

Andrew Pinski  changed:

   What|Removed |Added

 Status|RESOLVED|NEW
 Resolution|FIXED   |---
   Target Milestone|9.0 |---

--- Comment #8 from Andrew Pinski  ---
Well, only this particular case was fixed.
Here is another one which is still broken:
unsigned int
adds_shift_ext ( unsigned long long a, unsigned short b, unsigned c)
{
 unsigned long long  d = (a - ((unsigned long long)b << 3));

  if (d == 0)
return a + c + b;
  else
return b + d + c;
}

Note I think there is a missed reassociation/code hoisting too.

   [local count: 536870913]:
  _3 = (unsigned int) a_11(D);
  _4 = _3 + c_13(D);
  _15 = _4 + _8;
  goto ; [100.00%]

   [local count: 536870913]:
  _7 = (unsigned int) d_12;
  _17 = _8 + c_13(D);
  _14 = _7 + _17;

c_13(D) + _8 is fully redundant here.
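
At the source level, the missed hoisting corresponds roughly to the hand-written variant below. This is only a sketch of the intended shape: adds_shift_ext_hoisted is a hypothetical name, and the equivalence relies on the result being truncated to unsigned int on both paths.

```cpp
// Original shape from the comment: b + c is added separately on each path.
unsigned int adds_shift_ext(unsigned long long a, unsigned short b, unsigned c)
{
    unsigned long long d = a - ((unsigned long long)b << 3);
    if (d == 0)
        return a + c + b;
    else
        return b + d + c;
}

// Hand-hoisted sketch: the common b + c term is computed once, and only
// the first operand differs between the two paths.
unsigned int adds_shift_ext_hoisted(unsigned long long a, unsigned short b, unsigned c)
{
    unsigned long long d = a - ((unsigned long long)b << 3);
    unsigned int common = b + c;                          // shared by both paths
    unsigned int first  = (unsigned int)(d == 0 ? a : d); // path-specific operand
    return first + common;
}
```

Both functions return the same value modulo 2^32, so a compiler performing the reassociation would only need one addition of b and c.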

[Bug rtl-optimization/64537] Aarch64 redundant sxth instruction gets generated

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64537

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |9.0
 Status|NEW |RESOLVED

--- Comment #7 from Andrew Pinski  ---
GCC9+ does:
subs x3, x0, x1, sxth 3
add w1, w2, w1, sxth
add w1, w1, w3
add w0, w2, w0
csel w0, w1, w0, ne
ret

GCC 8 produced:
sxth w1, w1
subs x3, x0, x1, sxth 3
add w1, w1, w2
add w1, w1, w3
add w0, w2, w0
csel w0, w1, w0, ne
ret

GCC 9's combine is able to do this:
Trying 3 -> 8:
3: r99:SI=sign_extend(x1:HI)
  REG_DEAD x1:HI
8: r101:DI=sign_extend(r99:SI#0)
Failed to match this instruction:
(parallel [
(set (reg:DI 101 [ b ])
(sign_extend:DI (reg:HI 1 x1 [ b ])))
(set (reg/v:SI 99 [ b ])
(sign_extend:SI (reg:HI 1 x1 [ b ])))
])
Failed to match this instruction:
(parallel [
(set (reg:DI 101 [ b ])
(sign_extend:DI (reg:HI 1 x1 [ b ])))
(set (reg/v:SI 99 [ b ])
(sign_extend:SI (reg:HI 1 x1 [ b ])))
])
Successfully matched this instruction:
(set (reg/v:SI 99 [ b ])
(sign_extend:SI (reg:HI 1 x1 [ b ])))
Successfully matched this instruction:
(set (reg:DI 101 [ b ])
(sign_extend:DI (reg:HI 1 x1 [ b ])))
allowing combination of insns 3 and 8
original costs 4 + 4 = 8
replacement costs 4 + 4 = 8

So fixed by r9-2064.

[Bug rtl-optimization/96031] suboptimal codegen for store low 16-bits value

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96031

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=57231
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2021-08-22
 Ever confirmed|0   |1

--- Comment #5 from Andrew Pinski  ---
Confirmed. There might be a few others like this too.

[Bug rtl-optimization/55549] zero_extend and vectors

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55549

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Keywords||internal-improvement
   Last reconfirmed||2021-08-22
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Still there:
#ifdef INSN_SCHEDULING
  /* If *SPLIT is a paradoxical SUBREG, when we split it, it should
 be written as a ZERO_EXTEND.  */
  if (split_code == SUBREG && MEM_P (SUBREG_REG (*split)))
{
  /* Or as a SIGN_EXTEND if LOAD_EXTEND_OP says that that's
 what it really is.  */
  if (load_extend_op (GET_MODE (SUBREG_REG (*split)))
  == SIGN_EXTEND)
SUBST (*split, gen_rtx_SIGN_EXTEND (split_mode,
SUBREG_REG (*split)));
  else
SUBST (*split, gen_rtx_ZERO_EXTEND (split_mode,
SUBREG_REG (*split)));
}
#endif

[Bug rtl-optimization/87238] Redundant Restore of $x0 when memcpy always returns the first argument.

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87238

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ra

--- Comment #4 from Andrew Pinski  ---
Note I had to change the testcase to be using size 128.

But this is interesting: IRA was able to figure out the memcpy return value:
(call_insn 14 13 17 2 (parallel [
(set (reg:DI 0 x0)
(call (mem:DI (symbol_ref:DI ("memcpy") [flags 0x41] 
) [0 __builtin_memcpy S8 A8])
(const_int 0 [0])))
(unspec:DI [
(const_int 0 [0])
] UNSPEC_CALLEE_ABI)
(clobber (reg:DI 30 x30))
]) "/app/example.cpp":10:16 47 {*call_value_insn}
 (expr_list:REG_RETURNED (reg/f:DI 95)
(expr_list:REG_DEAD (reg:DI 2 x2)
(expr_list:REG_DEAD (reg:DI 1 x1)
(expr_list:REG_UNUSED (reg:DI 0 x0)
(expr_list:REG_CALL_DECL (symbol_ref:DI ("memcpy") [flags
0x41]  )
(expr_list:REG_EH_REGION (const_int 0 [0])
(nil)))
(expr_list (clobber (reg:DI 17 x17))
(expr_list (clobber (reg:DI 16 x16))
(expr_list:DI (set (reg:DI 0 x0)
(reg:DI 0 x0))
(expr_list:DI (use (reg:DI 0 x0))
(expr_list:DI (use (reg:DI 1 x1))
(expr_list:DI (use (reg:DI 2 x2))
(nil
(insn 17 14 18 2 (set (reg:DI 0 x0)
(reg/f:DI 95)) "/app/example.cpp":10:16 53 {*movdi_aarch64}
 (expr_list:REG_DEAD (reg/f:DI 95)
(expr_list:REG_EQUAL (plus:DI (reg/f:DI 64 sfp)
(const_int -512 [0xfe00]))
(nil

But it does not actually use it:

REG_RETURNED (reg/f:DI 95)

Instead it used the REG_EQUAL note.

[Bug rtl-optimization/87238] Redundant Restore of $x0 when memcpy always returns the first argument.

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87238

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=61241,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=101995

--- Comment #3 from Andrew Pinski  ---
Related to PR 101995.  IRA already knows how to rematerialize the address
passed to memset as the return value; it is just not doing it in this case.

[Bug rtl-optimization/86901] [AArch64] Suboptimal register allocation for int/float reinterpret

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86901

Andrew Pinski  changed:

   What|Removed |Added

   Keywords|ra  |
   Last reconfirmed|2020-05-16 00:00:00 |2021-8-22
  Component|middle-end  |rtl-optimization

--- Comment #2 from Andrew Pinski  ---
The DI mode comes from combine:

Trying 7 -> 8:
7: r98:SI=r96:SF#0 0>>0x14
8: r99:SI=r98:SI&0x7ff
  REG_DEAD r98:SI
Successfully matched this instruction:
(set (subreg:DI (reg:SI 99) 0)
(zero_extract:DI (subreg:DI (reg/v:SF 96 [ y ]) 0)
(const_int 11 [0xb])
(const_int 20 [0x14])))
allowing combination of insns 7 and 8
original costs 16 + 4 = 20
replacement cost 16
deferring deletion of insn with uid = 7.
modifying insn i3 8: r99:SI#0=zero_extract(r96:SF#0,0xb,0x14)
deferring rescan insn with uid = 8.

[Bug rtl-optimization/81501] mulitple calls to __tls_get_addr() with -fPIC

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81501

Andrew Pinski  changed:

   What|Removed |Added

Summary|Unneccessary calls to   |mulitple calls to
   |__tls_get_addr() in simple  |__tls_get_addr() with -fPIC
   |thread-singleton pattern|
   Severity|normal  |enhancement

--- Comment #5 from Andrew Pinski  ---
Due to the way addresses are formed for TLS, we emit the call to the
__tls_get_addr function at each point where the address is needed.  We should
do something similar to how other PIC addresses are handled.
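
As an illustration of the pattern, the sketch below contrasts direct accesses to a thread-local variable with a source-level workaround that forms the address once. The variable and function names are hypothetical. Whether the direct form actually produces repeated __tls_get_addr calls depends on the target, the TLS model, and the optimization level.

```cpp
thread_local int tls_counter = 0;   // hypothetical thread-local variable

// Direct form: every access names the TLS variable, so the compiler may
// form its address at each use.
int bump_direct(int n)
{
    for (int i = 0; i < n; ++i)
        tls_counter += i;
    return tls_counter;
}

// Workaround sketch: take the address once and reuse the pointer, which is
// roughly what the compiler would do if the address computation were shared.
int bump_cached(int n)
{
    int *p = &tls_counter;
    for (int i = 0; i < n; ++i)
        *p += i;
    return *p;
}
```

Both functions compute the same result; the second merely makes the single address computation explicit in the source.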

[Bug rtl-optimization/81501] Unneccessary calls to __tls_get_addr() in simple thread-singleton pattern

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81501

Andrew Pinski  changed:

   What|Removed |Added

 CC||amohr at amohr dot org

--- Comment #4 from Andrew Pinski  ---
*** Bug 82803 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/82803] Wildly excessive calls to __tls_get_addr with optimizations enabled.

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82803

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #14 from Andrew Pinski  ---
Dup of bug 81501.

*** This bug has been marked as a duplicate of bug 81501 ***

[Bug target/89517] [8/9 Regression] AArch64's configure option --with-arch can silently lead to incorrectly configured compiler

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89517

Andrew Pinski  changed:

   What|Removed |Added

 CC||vladimir at bashkirtsev dot com

--- Comment #5 from Andrew Pinski  ---
*** Bug 86713 has been marked as a duplicate of this bug. ***

[Bug target/86713] 'nofp', 'nosimd', 'nocrypto' and 'nofp16' feature modifiers for Aarch64 fail to build

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86713

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
   Keywords||documentation
 Status|NEW |RESOLVED

--- Comment #4 from Andrew Pinski  ---
Dup of bug 89517 which is fixed for 8.4.0 and GCC 9+ and fixed using the C
preprocessor in GCC 10+ (r10-2124).

*** This bug has been marked as a duplicate of bug 89517 ***

[Bug libstdc++/89461] FAIL: experimental/net/timer/waitable/cons.cc

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89461

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |9.0

--- Comment #11 from Andrew Pinski  ---
Fixed.

[Bug libstdc++/89461] FAIL: experimental/net/timer/waitable/cons.cc

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89461

--- Comment #10 from Andrew Pinski  ---
*** Bug 69331 has been marked as a duplicate of this bug. ***

[Bug libstdc++/69331] FAIL: 20_util/shared_ptr/thread/default_weaktoshared.cc execution test

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69331

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #26 from Andrew Pinski  ---
Dup of bug 89461.

*** This bug has been marked as a duplicate of bug 89461 ***

[Bug middle-end/80295] [7 Regression] ICE in __builtin_update_setjmp_buf expander

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80295

--- Comment #16 from Andrew Pinski  ---
*** Bug 80266 has been marked as a duplicate of this bug. ***

[Bug target/80266] ICE in store_pairsi condition with -mabi=ilp32

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80266

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #6 from Andrew Pinski  ---
Same bug as PR 80295.

*** This bug has been marked as a duplicate of bug 80295 ***

[Bug target/80881] Implement Windows native TLS

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80881

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug target/90458] mingw64: ICE in i386_pe_seh_unwind_emit, at config/i386/winnt.c:1258 with -fstack-clash-protection

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90458

Andrew Pinski  changed:

   What|Removed |Added

 CC||vladimir.kokovic at gmail dot com

--- Comment #6 from Andrew Pinski  ---
*** Bug 97795 has been marked as a duplicate of this bug. ***

[Bug target/97795] internal compiler error: in i386_pe_seh_unwind_emit

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97795

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Dup of bug 90458.

*** This bug has been marked as a duplicate of bug 90458 ***

[Bug target/90458] mingw64: ICE in i386_pe_seh_unwind_emit, at config/i386/winnt.c:1258 with -fstack-clash-protection

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90458

Andrew Pinski  changed:

   What|Removed |Added

 CC||nightstrike at gmail dot com

--- Comment #5 from Andrew Pinski  ---
*** Bug 102010 has been marked as a duplicate of this bug. ***

[Bug c/102010] ICE in stack-check-8.c in i386_pe_seh_unwind_emit

2021-08-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102010

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski  ---
Dup of bug 90458.

*** This bug has been marked as a duplicate of bug 90458 ***

[Bug c/102010] New: ICE in stack-check-8.c in i386_pe_seh_unwind_emit

2021-08-22 Thread nightstrike at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102010

Bug ID: 102010
   Summary: ICE in stack-check-8.c in i386_pe_seh_unwind_emit
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nightstrike at gmail dot com
  Target Milestone: ---

See related PR97795 and PR90458

The testsuite test stack-check-8.c fails thusly:

spawn -ignore SIGHUP /tmp/_/build/gc/gcc/xgcc -B/tmp/_/build/gc/gcc/
/tmp/_/src/gcc-git/gcc/testsuite/gcc.dg/stack-check-8.c
-fdiagnostics-plain-output -O2 -fstack-clash-protection -Wno-psabi
-fno-optimize-sibling-calls --param stack-clash-protection-probe-interval=12 --param
stack-clash-protection-guard-size=12 -lm -o ./stack-check-8.exe
during RTL pass: final
/tmp/_/src/gcc-git/gcc/testsuite/gcc.dg/stack-check-8.c: In function 'f3':
/tmp/_/src/gcc-git/gcc/testsuite/gcc.dg/stack-check-8.c:40:1: internal compiler
error: in i386_pe_seh_unwind_emit, at config/i386/winnt.c:1274
0x7f25c7 i386_pe_seh_unwind_emit(_IO_FILE*, rtx_insn*)
/tmp/_/src/gcc-git/gcc/config/i386/winnt.c:1274
0xb3aa3b final_scan_insn_1
/tmp/_/src/gcc-git/gcc/final.c:2904
0xb3ae5b final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/tmp/_/src/gcc-git/gcc/final.c:2940
0xb3af36 final_1
/tmp/_/src/gcc-git/gcc/final.c:1997
0xb3baf4 rest_of_handle_final
/tmp/_/src/gcc-git/gcc/final.c:4285
0xb3baf4 execute
/tmp/_/src/gcc-git/gcc/final.c:4363
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
compiler exited with status 1