date:20210911

[Bug middle-end/47397] Alignment of array element is not optimal in AVX mode due to use of TARGET_MEM_REFs

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47397

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |5.5
  Known to work||5.5.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #12 from Andrew Pinski  ---
Fixed for GCC 5.5, 6.4.0 and 7+ by the patch which fixed PR 80334 (r7-7533 on
the trunk, r6-9285 for GCC 6.4.0 and r5-10370 for GCC 5.5.0).

[Bug middle-end/47397] Alignment of array element is not optimal in AVX mode due to use of TARGET_MEM_REFs

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47397

Andrew Pinski  changed:

   What|Removed |Added

  Known to work||6.4.0, 7.1.0, 7.5.0
  Known to fail||5.1.0, 6.1.0, 6.2.0, 6.3.0

--- Comment #11 from Andrew Pinski  ---
Looks fixed:

a:
(insn 9 8 10 4 (set (reg:V4DF 87 [ vect__2.5 ])
(mem:V4DF (plus:DI (reg/f:DI 86)
(reg:DI 83 [ ivtmp.12 ])) [1 MEM  [(double
*) + 16B + ivtmp.12_14 * 1]+0 S32 A128])) "/app/example.cpp":9:19 -1
 (nil))


b:
(insn 13 12 14 4 (set (mem:V4DF (plus:DI (reg/f:DI 85)
(reg:DI 83 [ ivtmp.12 ])) [1 MEM  [(double
*) + ivtmp.12_14 * 1]+0 S32 A256])
(reg:V4DF 88 [ vect__3.6 ])) "/app/example.cpp":9:12 -1
 (nil))

[Bug tree-optimization/46763] gcc 4.5: missed optimization: copy global to local, prefetch

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46763

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization

--- Comment #4 from Andrew Pinski  ---
-O3 produces:
jmp     .L6
.p2align 4,,10
.p2align 3
.L3:
leal    1(%rbx), %edx
movl    %ebx, g(%rip)
cmpl    %edx, %ebp
je      .L1
.L4:
movl    %ebx, %eax
movl    %edx, %ebx
.L6:
testl   %eax, %eax
je      .L3
addl    $1, %eax
movl    %ebx, %edi
movl    %eax, g(%rip)
call    bar(int)
leal    1(%rbx), %edx
movl    %eax, g(%rip)
cmpl    %edx, %ebp
je      .L1
movl    %eax, %ebx
jmp     .L4

[Bug middle-end/102285] New flag -ftrivial-auto-var-init=zero causes many crashes in the testsuite

2021-09-11 Thread qinzhao at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102285

--- Comment #5 from qinzhao at gcc dot gnu.org ---
With the latest GCC, across all 42647 C files under gcc/testsuite, compiled with
-c -g -O2 -Wall -ftrivial-auto-var-init=zero, there is only one failure:
 /home/opc/Install/latest/bin/gcc -c -g -ftrivial-auto-var-init=zero -O2 -Wall
./gcc.dg/graphite/pr82421.c

during RTL pass: expand
./gcc.dg/graphite/pr82421.c: In function ‘qy’:
./gcc.dg/graphite/pr82421.c:10:7: internal compiler error: in
expand_expr_addr_expr_1, at expr.c:8432
   10 |   int fb[tw];
  |   ^~
0x6b3494 expand_expr_addr_expr_1
../../latest-gcc/gcc/expr.c:8432
0xaecd9d expand_expr_addr_expr
../../latest-gcc/gcc/expr.c:8545
0xaecd9d expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../latest-gcc/gcc/expr.c:11761
0xaeac65 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier,
rtx_def**, bool)
../../latest-gcc/gcc/expr.c:8733
0xaeac65 expand_expr
../../latest-gcc/gcc/expr.h:301
0xaeac65 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
../../latest-gcc/gcc/expr.c:9053
0xaeb732 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
../../latest-gcc/gcc/expr.c:11818
0x991030 expand_expr
../../latest-gcc/gcc/expr.h:301
0x991030 get_memory_rtx
../../latest-gcc/gcc/builtins.c:1369
0x9989f5 expand_builtin_memset_args
../../latest-gcc/gcc/builtins.c:4101
0x9c4a67 expand_call_stmt
../../latest-gcc/gcc/cfgexpand.c:2749
0x9c4a67 expand_gimple_stmt_1
../../latest-gcc/gcc/cfgexpand.c:3876
0x9c4a67 expand_gimple_stmt
../../latest-gcc/gcc/cfgexpand.c:4040
0x9cac8b expand_gimple_basic_block
../../latest-gcc/gcc/cfgexpand.c:6082
0x9cc98f execute
../../latest-gcc/gcc/cfgexpand.c:6808
Please submit a full bug report,

After removing -ftrivial-auto-var-init=zero, the failure is gone.

I will study this one to fix it.

[Bug middle-end/31179] strange behaviour of floor rounding

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31179

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Andrew Pinski  ---
Actually 14070 is exactly represented but 140.7 is not.

[Bug target/58110] Useless GPR push and pop when only xmm registers are used.

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58110

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Target Milestone|--- |7.0

--- Comment #4 from Andrew Pinski  ---
Fixed by LRA in GCC 7.

[Bug bootstrap/58090] bootstrap fails comparison with --enable-gather-detailed-mem-stats

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58090

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Andrew Pinski  ---
This was fixed long ago.

[Bug target/31035] x86 GNU/Hurd should include crtfm and dfprules because it uses linux.h

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31035

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=28102
 Resolution|--- |FIXED

--- Comment #5 from Andrew Pinski  ---
Fixed by r0-90585

[Bug other/65794] Building crossback fails: No rule to make target `auto-build.h', needed by `build/genmddeps.o'

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65794

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |10.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Andrew Pinski  ---
Fixed for GCC 10 by r10-4331.

[Bug bootstrap/57125] Build not SMP safe; fails to build bconfig.h

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57125

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |8.0
 Status|UNCONFIRMED |RESOLVED
   Keywords||build

--- Comment #10 from Andrew Pinski  ---
f142b5bc2102 (Romain Geissler   2011-08-04 11:30:45 + 2872)
gengtype-lex.o build/gengtype-lex.o : gengtype-lex.c gengtype.h $(SYSTEM_H)
d6d34aa9133a (Jakub Jelinek 2014-12-03 17:44:27 +0100 2873)
CFLAGS-gengtype-lex.o += -DHOST_GENERATOR_FILE
f142b5bc2102 (Romain Geissler   2011-08-04 11:30:45 + 2874)
build/gengtype-lex.o: $(BCONFIG_H)




18406601136a (Paolo Bonzini 2010-11-13 09:42:58 + 3001) #
Generated source files for gengtype.  Prepend inclusion of
8c7dbea9f193 (Boris Kolpackov   2017-11-26 13:00:48 + 3002) #
config.h/bconfig.h because AIX requires _LARGE_FILES to be defined before
18406601136a (Paolo Bonzini 2010-11-13 09:42:58 + 3003) # any
system header is included.
03787dfd8153 (Kelley Cook   2004-01-15 04:02:24 + 3004)
gengtype-lex.c : gengtype-lex.l
8fb15466274a (Paolo Bonzini 2010-11-11 23:44:44 + 3005)
-$(FLEX) $(FLEXFLAGS) -o$@ $< && { \
8c7dbea9f193 (Boris Kolpackov   2017-11-26 13:00:48 + 3006)
  echo '#ifdef HOST_GENERATOR_FILE' > $@.tmp; \
8c7dbea9f193 (Boris Kolpackov   2017-11-26 13:00:48 + 3007)
  echo '#include "config.h"'   >> $@.tmp; \
8c7dbea9f193 (Boris Kolpackov   2017-11-26 13:00:48 + 3008)
  echo '#else' >> $@.tmp; \
8c7dbea9f193 (Boris Kolpackov   2017-11-26 13:00:48 + 3009)
  echo '#include "bconfig.h"'  >> $@.tmp; \
8c7dbea9f193 (Boris Kolpackov   2017-11-26 13:00:48 + 3010)
  echo '#endif'>> $@.tmp; \
8fb15466274a (Paolo Bonzini 2010-11-11 23:44:44 + 3011)
  cat $@ >> $@.tmp; \
8fb15466274a (Paolo Bonzini 2010-11-11 23:44:44 + 3012)
  mv $@.tmp $@; \
8fb15466274a (Paolo Bonzini 2010-11-11 23:44:44 + 3013)
}



So this was fixed with r8-4925-g8c7dbea9f193

[Bug go/101994] go1: internal compiler error: in return_statement, at go/go-gcc.cc:2318

2021-09-11 Thread ian at airs dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101994

Ian Lance Taylor  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Ian Lance Taylor  ---
Fixed on tip.

[Bug go/101994] go1: internal compiler error: in return_statement, at go/go-gcc.cc:2318

2021-09-11 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101994

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Ian Lance Taylor :

https://gcc.gnu.org/g:79513dc0b2d980bfd1b109d0d502de487c02b894

commit r12-3462-g79513dc0b2d980bfd1b109d0d502de487c02b894
Author: Ian Lance Taylor 
Date:   Fri Aug 20 11:33:29 2021 -0700

compiler: don't pad zero-sized trailing field in results struct

Nothing can take the address of that field anyhow.

Fixes PR go/101994

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/343873

[Bug c++/77565] `typdef int Int;` --> did you mean `typeof`?

2021-09-11 Thread mimomorin at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77565

--- Comment #4 from Michel Morin  ---
It seems that the reason is:

`cp_keyword_starts_decl_specifier_p` in `cp/parser.c` does not include
`RID_TYPENAME`.

Note that `typedef` is a decl-specifier ([dcl.spec] p.1 in the Standard).

[Bug fortran/67972] Substrings of arrays of unicode strings are of type DEFAULT rather than ISO_10646

2021-09-11 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67972

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

  Known to work||11.2.1, 12.0
  Known to fail||10.3.1
 Status|NEW |WAITING

--- Comment #2 from anlauf at gcc dot gnu.org ---
This appears to have been fixed at some point.  12-trunk and 11-branch
work for me as of today, while 10-branch still shows the corruption.

[Bug c/102291] [9/10/11/12 Regression] wrong overflow warning for compound expression conversion and bit_and expressions

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102291

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=32643

--- Comment #2 from Andrew Pinski  ---
Related to PR 32643.

[Bug c/102291] [9/10/11/12 Regression] wrong overflow warning for compound expression conversion and bit_and expressions

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102291

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-09-11
 Status|UNCONFIRMED |NEW

[Bug c/102291] [9/10/11/12 Regression] wrong overflow warning for compound expression conversion and bit_and expressions

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102291

Andrew Pinski  changed:

   What|Removed |Added

Summary|dubious overflow warning|[9/10/11/12 Regression]
   ||wrong overflow warning for
   ||compound expression
   ||conversion and bit_and
   ||expressions
  Known to work|9.4.0   |4.1.2
  Known to fail||4.4.7
   Target Milestone|--- |9.5

--- Comment #1 from Andrew Pinski  ---
I don't think this is really a regression from GCC 9 but rather a regression from
a long time ago.
Take:
typedef unsigned long ulong;
typedef unsigned char uchar;


ulong testera(ulong ul) {
return ((((uchar) (((void)0, ((uchar) (0x80)))|0)) & 0x3f));
}
- CUT 
This is slightly different from your testcase but it warns all the way back
until sometime before 4.4.0 (I can only test 4.1.2).

[Bug fortran/98490] Unexpected out of bounds in array constructor with implied do loop

2021-09-11 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98490

--- Comment #13 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Harald Anlauf
:

https://gcc.gnu.org/g:7ca5bcb0f12380a399c6f88f82fa01b530eb85e5

commit r11-8980-g7ca5bcb0f12380a399c6f88f82fa01b530eb85e5
Author: Harald Anlauf 
Date:   Thu Sep 9 21:34:01 2021 +0200

Fortran - out of bounds in array constructor with implied do loop

gcc/fortran/ChangeLog:

PR fortran/98490
* trans-expr.c (gfc_conv_substring): Do not generate substring
bounds check for implied do loop index variable before it actually
becomes defined.

gcc/testsuite/ChangeLog:

PR fortran/98490
* gfortran.dg/bounds_check_23.f90: New test.

(cherry picked from commit 5fe0865ab788bdc387b284a3ad57e5a95a767b18)

[Bug fortran/101327] ICE in find_array_element, at fortran/expr.c:1355

2021-09-11 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101327

--- Comment #7 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Harald Anlauf
:

https://gcc.gnu.org/g:0d09acc0d627dcc7b3d82d873ee9da2f7546414e

commit r11-8979-g0d09acc0d627dcc7b3d82d873ee9da2f7546414e
Author: Harald Anlauf 
Date:   Tue Sep 7 20:51:49 2021 +0200

Fortran - improve error recovery determining array element from constructor

gcc/fortran/ChangeLog:

PR fortran/101327
* expr.c (find_array_element): When bounds cannot be determined as
constant, return error instead of aborting.

gcc/testsuite/ChangeLog:

PR fortran/101327
* gfortran.dg/pr101327.f90: New test.

(cherry picked from commit 2a1537a19cb2fa85823cfa18ed40baa4b259b4e3)

[Bug debug/99090] gsplit-dwarf broken on riscv64-linux

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99090

Andrew Pinski  changed:

   What|Removed |Added

 CC||kilobyte at angband dot pl

--- Comment #8 from Andrew Pinski  ---
*** Bug 102290 has been marked as a duplicate of this bug. ***

[Bug debug/102290] ICE with -gsplit-dwarf

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102290

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup of bug 99090.  riscv hates leb128 .

*** This bug has been marked as a duplicate of bug 99090 ***

[Bug fortran/85130] Substrings out of range are not rejected

2021-09-11 Thread anlauf at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85130

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |anlauf at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #5 from anlauf at gcc dot gnu.org ---
We should handle the substring bounds as signed instead of unsigned.
This is obviously fixed by:

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index dfecc3012e1..604e63e6164 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -1724,8 +1724,8 @@ find_substring_ref (gfc_expr *p, gfc_expr **newp)
   *newp = gfc_copy_expr (p);
   free ((*newp)->value.character.string);

-  end = (gfc_charlen_t) mpz_get_ui (p->ref->u.ss.end->value.integer);
-  start = (gfc_charlen_t) mpz_get_ui (p->ref->u.ss.start->value.integer);
+  end = (gfc_charlen_t) mpz_get_si (p->ref->u.ss.end->value.integer);
+  start = (gfc_charlen_t) mpz_get_si (p->ref->u.ss.start->value.integer);
   if (end >= start)
 length = end - start + 1;
   else


and regtests cleanly.  :-)

@Thomas: although the wording is slightly different between F2003 and F2018, it
always results in length zero if the starting point exceeds the ending point.
Would you agree to reverting your commit r258976 after applying the above?

[Bug lto/102292] R_ARM_THM_JUMP24 incorrect link result if symbol duplicated

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102292

--- Comment #1 from Andrew Pinski  ---
Are you sure this is not a linker issue?

[Bug lto/102292] New: R_ARM_THM_JUMP24 incorrect link result if symbol duplicated

2021-09-11 Thread eason.lai at mediatek dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102292

Bug ID: 102292
   Summary: R_ARM_THM_JUMP24 incorrect link result if symbol
duplicated
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: eason.lai at mediatek dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 51439
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51439&action=edit
simple code to reproduce this issue

If a program is linked with a duplicated symbol, one in a symbol file and another
in an object file, the R_ARM_THM_JUMP24 (b.w) result will be incorrect.

Please find the simple code in the attachment.
The following are the results from the simple code.

(Correct) The veneer stays in THUMB mode if LTO is disabled.

Disassembly of section .text:

1000 :
1000:   f04f 0004   mov.w   r0, #4
1004:   f04f 0105   mov.w   r1, #5
1008:   f000 b806   b.w 1018 <__foo_veneer>
100c:   4770        bx  lr

100e :
100e:   b508        push    {r3, lr}
1010:   f7ff fff6   bl  1000 
1014:   2000        movs    r0, #0
1016:   bd08        pop {r3, pc}

1018 <__foo_veneer>:
1018:   b401        push    {r0}
101a:   4802        ldr r0, [pc, #8]    ; (1024 <__foo_veneer+0xc>)
101c:   4684        mov ip, r0
101e:   bc01        pop {r0}
1020:   4760        bx  ip
1022:   bf00        nop
1024:   10009ed1    .word   0x10009ed1

(Incorrect) The veneer switches to ARM mode if LTO is enabled.

Disassembly of section .text:

1000 :
1000:   f04f 0004   mov.w   r0, #4
1004:   f04f 0105   mov.w   r1, #5
1008:   f000 b806   b.w 1018 <__foo_veneer>
100c:   4770        bx  lr

100e :
100e:   b510        push    {r4, lr}
1010:   2000        movs    r0, #0
1012:   f7ff fff5   bl  1000 
1016:   bd10        pop {r4, pc}

1018 <__foo_veneer>:
1018:   4778        bx  pc
101a:   e7fd        b.n 1018 <__foo_veneer>
101c:   e51ff004    ldr pc, [pc, #-4]   ; 1020 <__foo_veneer+0x8>
1020:   10009ed0    .word   0x10009ed0
1024:   .word   0x


(Correct) After removing foo.o from C_OBJS in the Makefile, the veneer stays in
THUMB mode as expected when LTO is enabled.

Disassembly of section .text:

1000 :
1000:   f04f 0004   mov.w   r0, #4
1004:   f04f 0105   mov.w   r1, #5
1008:   f000 b806   b.w 1018 <__foo_veneer>
100c:   4770        bx  lr

100e :
100e:   b510        push    {r4, lr}
1010:   2000        movs    r0, #0
1012:   f7ff fff5   bl  1000 
1016:   bd10        pop {r4, pc}

1018 <__foo_veneer>:
1018:   b401        push    {r0}
101a:   4802        ldr r0, [pc, #8]    ; (1024 <__foo_veneer+0xc>)
101c:   4684        mov ip, r0
101e:   bc01        pop {r0}
1020:   4760        bx  ip
1022:   bf00        nop
1024:   10009ed1    .word   0x10009ed1

[Bug c++/102289] Concept declaration with multiple template-heads not diagnosed

2021-09-11 Thread hewillk at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102289

康桓瑋  changed:

   What|Removed |Added

 CC||hewillk at gmail dot com

--- Comment #1 from 康桓瑋  ---
dup of PR101499. ;-)

[Bug target/85915] ABI incompatibility (multiple definition of `__x86_return_thunk') for static libraries, between GCC 7.3.0 and GCC >=7.4.0, caused by -mfunction-return=thunk

2021-09-11 Thread luke-jr+gccbugs at utopios dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85915

--- Comment #16 from Luke Dashjr  ---
GCC only supports defaults now?

[Bug c/102291] New: dubious overflow warning

2021-09-11 Thread hv at crypt dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102291

Bug ID: 102291
   Summary: dubious overflow warning
   Product: gcc
   Version: 11.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hv at crypt dot org
  Target Milestone: ---

The following code warns with gcc-11.2.0 (but not with 9.2.1-17ubuntu1~18.04.1)
in testera(), but not in testerb() which differs only by the removal of an
assert. I don't understand this, and cannot see what is overflowing.

This is reduced from a report against the UTF8_ACCUMULATE macro in perl source.
This build of gcc was configured as follows (including the error in prefix):
  ../gcc/configure --prefix=/opt/gcc-12 --disable-gcov --disable-multilib
--enable-languages=c --disable-nls --disable-decimal-float


% uname -a
Linux zen2 5.4.0-73-generic #82~18.04.1-Ubuntu SMP Fri Apr 16 15:10:02 UTC 2021
x86_64 x86_64 x86_64 GNU/Linux

% gcc --version | head -1
gcc (GCC) 11.2.0

% cat test.c
/* #include <assert.h> */
extern void __assert_fail (const char *__assertion, const char *__file,
  unsigned int __line, const char *__function)
__attribute__ ((__nothrow__ , __leaf__))
__attribute__ ((__noreturn__));
#define assert(expr)  \
  ((void) sizeof ((expr) ? 1 : 0), __extension__ ({ \
  if (expr) \
; /* empty */   \
  else  \
__assert_fail (#expr, __FILE__, __LINE__, __ASSERT_FUNCTION);   \
}))
#define __ASSERT_FUNCTION __extension__ __PRETTY_FUNCTION__
/* end assert.h */

typedef unsigned long ulong;
typedef unsigned char uchar;

#define FIT8(c) assert(((sizeof(c) == 1) || (((ulong) (c)) >> 8) == 0))
#define BE8a(c) (FIT8(c), ((uchar) (c)))
#define BE8b(c) ( ((uchar) (c)))
#define NUM(c) ((c) | 0)

#define TESTER(old, new) ((((ulong)(old)) << 6) | (((uchar) NUM(new)) & 0x3f))

ulong testera(ulong ul) {
return TESTER(ul, BE8a(0x80));
}

ulong testerb(ulong ul) {
return TESTER(ul, BE8b(0x80));
}

% gcc -c test.c 
test.c: In function 'testera':
test.c:24:49: warning: overflow in conversion from 'int' to 'long unsigned int'
changes value from 'void)4, (({...}))), 128)) & 63' to '0' [-Woverflow]
   24 | #define TESTERa(old, new) ulong)(old)) << 6) | (((uchar) NUM(new))
& 0x3f))
  |  ^
test.c:27:12: note: in expansion of macro 'TESTERa'
   27 | return TESTERa(ul, BE8a(0x80));
  |^~~
%

[Bug debug/102290] New: ICE with -gsplit-dwarf

2021-09-11 Thread kilobyte at angband dot pl via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102290

Bug ID: 102290
   Summary: ICE with -gsplit-dwarf
   Product: gcc
   Version: 11.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kilobyte at angband dot pl
  Target Milestone: ---

void foo()
{
int c = 0;
do ; while (bar(, c));
}

riscv64-linux-gnu-gcc-11 -O2 -gsplit-dwarf -gdwarf-5 -c foo.i

foo.i:5:1: internal compiler error: Segmentation fault
5 | }
  | ^
0x7eff8a765eef ???
./signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x7eff8a750e49 __libc_start_main
../csu/libc-start.c:314

(The original file was arch/riscv/kernel/cpu.c in linux-5.15-rc0)

Debian unstable, gcc-11 11.2.0-5, amd64 host, riscv64 target.

[Bug c++/102289] New: Concept declaration with multiple template-heads not diagnosed

2021-09-11 Thread barry.revzin at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102289

Bug ID: 102289
   Summary: Concept declaration with multiple template-heads not
diagnosed
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: barry.revzin at gmail dot com
  Target Milestone: ---

gcc trunk accepts this:

template 
template 
concept C = true;

I'm not sure exactly what gcc does with it, but C does become a real concept,
and this leads to other strange things happening down the line (reduced from:
https://stackoverflow.com/q/69143991/2069064)

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-11 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154

--- Comment #28 from Hongtao.liu  ---
(In reply to Segher Boessenkool from comment #27)
> (In reply to Hongtao.liu from comment #22)
> > > Btw, I think this is a subreg that would be reasonable to handle.
> > > It's exactly the kind that x86 would like to allow, (subreg:HF (reg:SI 
> > > ..) 0).
> > 
> > Yes, LRA/reload can handle it correctly, like i tried in #c10.
> 
> It is fundamentally wrong to rely on reloading for non-exceptional code.
> If reloading creates good code you are very lucky.  And the whole point of
> doing any of this with subregs is to get good code.

I don't have direct evidence to prove that LRA/reload handles subregs correctly,
but x86 makes very heavy use of subregs, and the comment below in
general_operand makes me believe that LRA/reload can handle subregs
correctly.

  if (SCALAR_FLOAT_MODE_P (GET_MODE (op))
  /* LRA can use subreg to store a floating point value in an
 integer mode.  Although the floating point and the
 integer modes need the same number of hard registers, the
 size of floating point mode can be less than the integer
 mode.  */
  && ! lra_in_progress 
  && paradoxical_subreg_p (op))
return false;


Back to #c10: because I don't know much about the Power architecture, the
solution in #c10, which relies on reload to handle (subreg:SF (reg:DI) 0), is
wrong.
The real problem for rs6000 is that SFmode is represented as DFmode in the vector
and floating-point registers, and reload can't handle this.
It just so happens that validate_subreg disallows (subreg:SF (reg:DI) 0), so you
have one less special subreg to deal with. There's already
rs6000_emit_move_si_sf_subreg to handle (subreg:SF (reg:SI) 0).

[Bug c/102288] New: using .field = ({ static struct a a = {... }; }) requires following field to be initialized

2021-09-11 Thread cagney at sourceware dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102288

Bug ID: 102288
   Summary: using .field = ({ static struct a a = {... };  })
requires following field to be initialized
   Product: gcc
   Version: 11.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cagney at sourceware dot org
  Target Milestone: ---

Created attachment 51438
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51438&action=edit
more examples (hopefully the version used for the above)

gcc --version
gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)

Note the use of ({}) and static when initializing the field BA:

 1  struct a {
 2    unsigned a1;
 3  };
 4  
 5  #define A  ({ static const struct a a = { .a1 = 1, };  })
 6  
 7  struct ba01 {
 8    const struct a *ba;
 9    int b0;
10    int b1;
11  };
12  
13  struct b0a1 {
14    int b0;
15    const struct a *ba;
16    int b1;
17  };
18  
19  struct b01a {
20    int b0;
21    int b1;
22    const struct a *ba;
23  };

29    // complains that next field (.b0) isn't initialized
30  
31    {
32      struct ba01 ba01 = {
33        .ba = A,
34        // .b0 = 0,
35        // .b1 = 1,
36      };
37      use();
38    }


 gcc -Wall -Wextra -c unused.c
unused.c: In function ‘main’:
unused.c:36:5: warning: missing initializer for field ‘b0’ of ‘struct ba01’
[-Wmissing-field-initializers]
   36 | };
  | ^
unused.c:9:7: note: ‘b0’ declared here
9 |   int b0;
  |   ^~

- just the next field needs to be initialized (not all)
- so, putting BA at the end avoids the problem
- so too does not using ({})

[Bug target/45548] Add with carry - missed optimization on x86

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45548

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.0

[Bug libstdc++/98319] LFTS headers give errors if included as C++11 or C++98

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98319

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug libgcc/98952] powerpc*: __trampoline_setup inverted test for trampoline size

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98952

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c++/33661] template methods forget explicit local register asm vars

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33661

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug sanitizer/97294] ASAN "dynamic-stack-buffer-overflow" false positive with OpenMP reduction to std::vector

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97294

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug target/86877] ICE in vectorizable_load, at tree-vect-stmts.c:8038

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86877

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug rtl-optimization/100230] ASan: alloc-dealloc-mismatch in early-remat.c

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug inline-asm/98847] Miscompilation with c++17, templates, and register keyword

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98847

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug middle-end/93235] [AArch64] ICE with __fp16 in a struct

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93235

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug target/94479] NetBSD: internal compiler error: in recompute_tree_invariant_for_addr_expr

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94479

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c++/82959] g++ doesn't appreciate C++17 evaluation order rules for overloaded operators

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82959

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug target/100441] [8/9 Regression] ICE in output_constant_pool_2, at varasm.c:3955

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100441

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug jit/100096] libgccjit.so.0: Cannot write-enable text segment: Permission denied on NetBSD 9.1

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100096

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c++/97663] [c++17] Function with return type 'unsigned' in nested namespace misinterpreted as deduction guide

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97663

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug bootstrap/97163] Build error with -mcpu=power9 on ppc64

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97163

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug target/100255] Crosscompiler to ia64-hp-vms: vmsdbgout.c:368:20: error: ISO C++17 does not allow 'register' storage class specifier [-Werror=register]

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100255

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug sanitizer/100114] libasan built against latest glibc doesn't work

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100114

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c++/99745] ICE when parameter pack not expanded in bit field

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99745

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c++/92992] Side-effects dropped when decltype(nullptr) typed expression is passed to ellipsis

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92992

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.4

[Bug c/99136] ICE in gimplify_expr, at gimplify.c:14854

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99136

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug debug/99388] Invalid debug info for __fp16

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99388

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c/99588] variable set but not used warning on static _Atomic assignment

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99588

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug target/89954] missed optimization for signed extension for x86-64

2021-09-11 Thread crazylht at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89954

--- Comment #6 from Hongtao.liu  ---
(In reply to Uroš Bizjak from comment #5)
> (In reply to Hongtao.liu from comment #4)
> > It looks like there's splitter in aarch64 which combines
> > load+xor+zero_extend to zero_extend(mem) + xor, x86 doesn't have. The simple
> > way is to add corresponding define_split for x86.
> > 
> > -x86 dump---
> > Failed to match this instruction:
> > (set (reg:SI 85)
> > (sign_extend:SI (xor:QI (mem/c:QI (symbol_ref:DI ("c") [flags 0x2] 
> > ) [0 c+0 S1 A8])
> > (const_int 1 [0x1]
> > ---dump end--
> 
> Maybe I'm missing something, but I don't think this transformation is
> correct. Please consider the following analysis with the emphasis on the
> sign bit of the QImode operation:
> 
> r = sext:HI (xor:QI (a, b)); b IMM
> 
> a  0xxx
> b  0xxx
> r  0xxx
> 
> a  1xxx
> b  0xxx
> r  1xxx
> 
> a  0xxx
> b  1xxx
> r  1xxx
> 
> a  1xxx
> b  1xxx
> r  0xxx
> 
> r = xor:HI ((a, b); a ZEXT, b IMM
The movzbl may be confusing you: there is no ZEXT here, just movqi_internal.
Also, at the combine stage the pattern is xorqi_1, not xorsi_1, and b here is
const_int 1, whose most significant bit is zero.

Here is the dump before combine:

(insn 5 2 6 2 (set (reg:QI 87 [ c ])
(mem/c:QI (symbol_ref:DI ("c") [flags 0x2]  ) [0 c+0 S1 A8])) "test.c":4:14 79 {*movqi_internal}
 (nil))
(insn 6 5 7 2 (parallel [
(set (reg:QI 86)
(xor:QI (reg:QI 87 [ c ])
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
]) "test.c":4:14 556 {*xorqi_1}
 (expr_list:REG_DEAD (reg:QI 87 [ c ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_EQUAL (xor:QI (mem/c:QI (symbol_ref:DI ("c") [flags
0x2]  ) [0 c+0 S1 A8])
(const_int 1 [0x1]))
(nil)
(insn 7 6 12 2 (set (reg:SI 85)
(sign_extend:SI (reg:QI 86))) "test.c":4:14 156 {extendqisi2}
 (expr_list:REG_DEAD (reg:QI 86)


> 
> a  0xxx
> b  0xxx
> r  0xxx
> 
> a  1xxx
> b  0xxx
> r  1xxx
> 
> a  0xxx
> b  1xxx
> r  1xxx
> 
> a  1xxx
> b  1xxx
> r  0xxx
> 
> As demonstrated above, results differ when sign bit of the value a is set.
> 
> The conversion works when the value a is loaded with a sign-extend operation.
> 
> r = xor:HI ((a, b); a SEXT, b IMM
> 
> a  0xxx
> b  0xxx
> r  0xxx
> 
> a  1xxx
> b  0xxx
> r  1xxx
> 
> a  0xxx
> b  1xxx
> r  1xxx
> 
> a  1xxx
> b  1xxx
> r  0xxx

[Bug libstdc++/99181] char_traits (and thus string_view) compares strings differently in constexpr and non-constexpr contexts

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99181

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c++/99650] ICE when trying to form reference to void in structured binding

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99650

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c++/99613] Static variable destruction order race condition

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99613

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug middle-end/98205] ICE in expand_omp_for_generic, at omp-expand.c:4307

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98205

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug target/98100] ICE in expand_debug_locations, at cfgexpand.c:5610

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98100

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug target/98063] Emit R_X86_64_GOTOFF64 instead of R_X86_64_GOTPCRELX for -mcmodel=large -fno-plt

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98063

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug c/97958] ICE in build2, at tree.c:4868

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97958

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.5

[Bug middle-end/54400] recognize vector reductions

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54400

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug target/89954] missed optimization for signed extension for x86-64

2021-09-11 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89954

--- Comment #5 from Uroš Bizjak  ---
(In reply to Hongtao.liu from comment #4)
> It looks like there's splitter in aarch64 which combines
> load+xor+zero_extend to zero_extend(mem) + xor, x86 doesn't have. The simple
> way is to add corresponding define_split for x86.
> 
> -x86 dump---
> Failed to match this instruction:
> (set (reg:SI 85)
> (sign_extend:SI (xor:QI (mem/c:QI (symbol_ref:DI ("c") [flags 0x2] 
> ) [0 c+0 S1 A8])
> (const_int 1 [0x1]
> ---dump end--

Maybe I'm missing something, but I don't think this transformation is correct.
Please consider the following analysis with the emphasis on the sign bit of the
QImode operation:

r = sext:HI (xor:QI (a, b)); b IMM

a  0xxx
b  0xxx
r  0xxx

a  1xxx
b  0xxx
r  1xxx

a  0xxx
b  1xxx
r  1xxx

a  1xxx
b  1xxx
r  0xxx

r = xor:HI ((a, b); a ZEXT, b IMM

a  0xxx
b  0xxx
r  0xxx

a  1xxx
b  0xxx
r  1xxx

a  0xxx
b  1xxx
r  1xxx

a  1xxx
b  1xxx
r  0xxx

As demonstrated above, results differ when sign bit of the value a is set.

The conversion works when the value a is loaded with a sign-extend operation.

r = xor:HI ((a, b); a SEXT, b IMM

a  0xxx
b  0xxx
r  0xxx

a  1xxx
b  0xxx
r  1xxx

a  0xxx
b  1xxx
r  1xxx

a  1xxx
b  1xxx
r  0xxx
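
The sign-bit argument above can be checked exhaustively for 8-bit operands. The following is only a sketch and is not taken from the bug report: the helper names (sext8, ref_sext_of_xor, xor_after_zext, xor_after_sext, count_mismatches, check_const_int_1) are hypothetical, and it assumes the immediate b is sign-extended when the xor is done in the wider mode.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 8 bits of v to 16 bits without relying on
   implementation-defined narrowing conversions. */
int16_t sext8(unsigned v) {
    v &= 0xFFu;
    return (int16_t)((v & 0x80u) ? (int)v - 256 : (int)v);
}

/* Reference semantics: r = sext:HI (xor:QI (a, b)). */
int16_t ref_sext_of_xor(uint8_t a, uint8_t b) {
    return sext8((unsigned)(a ^ b));
}

/* Candidate 1: xor in the wide mode after zero-extending a.
   Assumption: the immediate b is sign-extended. */
int16_t xor_after_zext(uint8_t a, uint8_t b) {
    return (int16_t)((int16_t)a ^ sext8(b));
}

/* Candidate 2: xor in the wide mode after sign-extending a. */
int16_t xor_after_sext(uint8_t a, uint8_t b) {
    return (int16_t)(sext8(a) ^ sext8(b));
}

typedef int16_t (*xor_fn)(uint8_t, uint8_t);

/* Count the values of a in 0..255 for which the candidate disagrees
   with the reference, for a fixed immediate b. */
int count_mismatches(xor_fn candidate, uint8_t b) {
    int mismatches = 0;
    for (unsigned a = 0; a < 256; a++)
        if (candidate((uint8_t)a, b) != ref_sext_of_xor((uint8_t)a, b))
            mismatches++;
    return mismatches;
}

/* Self-check for b = const_int 1, the immediate in the dump. */
void check_const_int_1(void) {
    /* Sign-extending a first always agrees with the reference. */
    assert(count_mismatches(xor_after_sext, 1) == 0);
    /* Zero-extending a disagrees exactly when the sign bit of a is set. */
    assert(count_mismatches(xor_after_zext, 1) == 128);
}
```

Calling check_const_int_1() from a main function exercises both candidates; the zero-extended form fails for the 128 values of a whose sign bit is set, which is the case the analysis singles out.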

[Bug target/91103] AVX512 vector element extract uses more than 1 shuffle instruction; VALIGND can grab any element

2021-09-11 Thread peter at cordes dot ca via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91103

--- Comment #9 from Peter Cordes  ---
Thanks for implementing my idea :)

(In reply to Hongtao.liu from comment #6)
> For elements located above 128bits, it seems always better(?) to use
> valign{d,q}

TL;DR:
 I think we should still use vextracti* / vextractf* when that can get the job
done in a single instruction, especially when the VEX-encoded vextracti/f128
can save a byte of code size for v[4].

Extracts are simpler shuffles that might have better throughput on some future
CPUs, especially the upcoming Zen4, so even without code-size savings we should
use them when possible.  Tiger Lake has a 256-bit shuffle unit on port 1 that
supports some common shuffles (like vpshufb); a future Intel might add
256->128-bit extracts to that.

It might also save a tiny bit of power, allowing on-average higher turbo
clocks.

---

On current CPUs with AVX-512, valignd is about equal to a single vextract, and
better than multiple instructions.  It doesn't really have downsides on current
Intel, since I think Intel has continued to not have int/FP bypass delays for
shuffles.

We don't know yet what AMD's Zen4 implementation of AVX-512 will look like.  If
it's like Zen1 was for AVX2 (i.e. if it decodes 512-bit instructions other than
insert/extract into at least 2x 256-bit uops) a lane-crossing shuffle like
valignd probably costs more than 2 uops.  (vpermq is more than 2 uops on
Piledriver/Zen1).  But a 128-bit extract will probably cost just one uop.  (And
especially an extract of the high 256 might be very cheap and low latency, like
vextracti128 on Zen1, so we might prefer vextracti64x4 for v[8].)

So this change is good, but using a vextracti64x2 or vextracti64x4 could be a
useful peephole optimization when byte_offset % 16 == 0.  Or of course
vextracti128 when possible (x/ymm0..15, not 16..31 which are only accessible
with an EVEX-encoded instruction).

vextractf-whatever allows an FP shuffle on FP data in case some future CPU
cares about that for shuffles.

An extract is a simpler shuffle that might have better throughput on some
future CPU even with full-width execution units.  Some future Intel CPU might
add support for vextract uops to the extra shuffle unit on port 1.  (Which is
available when no 512-bit uops are in flight.)  Currently (Ice Lake / Tiger
Lake) it can only run some common shuffles like vpshufb ymm, but not including
any vextract or valign.  Of course port 1 vector ALUs are shut down when
512-bit uops are in flight, but could be relevant for __m256 vectors on these
hypothetical future CPUs.

When we can get the job done with a single vextract-something, we should use
that instead of valignd.  Otherwise use valignd.

We already check the index for low-128 special cases to use vunpckhqdq vs.
vpshufd (or vpsrldq) or similar FP shuffles.
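
The selection rule proposed above can be sketched as a small decision function. This is only an illustration of the idea in this comment, not GCC code: the enum, the function name choose_extract and the regs_below_16 parameter are hypothetical, and the low-128 case simply stands for the special cases that are already handled.

```c
#include <stdbool.h>

/* Hypothetical labels for the instruction choices discussed above. */
enum extract_choice {
    USE_LOW128_SPECIAL_CASE, /* element already in the low 128 bits */
    USE_VEX_EXTRACT128,      /* vextracti128 / vextractf128, VEX-encoded */
    USE_EVEX_EXTRACT,        /* vextracti64x2 / vextracti64x4 and similar */
    USE_VALIGN               /* valignd / valignq */
};

/* byte_offset: offset of the wanted element within the vector.
   regs_below_16: true when the registers involved are x/ymm0..15, so
   a VEX encoding is available (assumption made for this sketch). */
enum extract_choice choose_extract(unsigned byte_offset, bool regs_below_16)
{
    if (byte_offset < 16)
        return USE_LOW128_SPECIAL_CASE;

    /* Not on a 128-bit lane boundary: one extract cannot put the
       element at position 0, so fall back to valign. */
    if (byte_offset % 16 != 0)
        return USE_VALIGN;

    /* Lane 1 of the low 256 bits: the VEX form saves a byte. */
    if (byte_offset == 16 && regs_below_16)
        return USE_VEX_EXTRACT128;

    return USE_EVEX_EXTRACT;
}
```

For a vector of 4-byte elements, v[4] is at byte offset 16 and v[8] at byte offset 32, so choose_extract(16, true) picks the VEX extract and choose_extract(32, true) picks an EVEX extract, while v[5] at offset 20 falls through to valign.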

-

On current Intel, with clean YMM/ZMM uppers (known by the CPU hardware to be
zero), an extract that only writes a 128-bit register will keep them clean
(even if it reads a ZMM), not needing a VZEROUPPER.  VZEROUPPER is only
needed for dirty y/zmm0..15, not for dirty zmm16..31, so a function like

float foo(float *p) {
  some vector stuff that can use high zmm regs;
  return scalar that happens to be from the middle of a vector;
}

could vextract into XMM0, but would need vzeroupper if it used valignd into
ZMM0.

(Also related
https://stackoverflow.com/questions/58568514/does-skylake-need-vzeroupper-for-turbo-clocks-to-recover-after-a-512-bit-instruc
re reading a ZMM at all and turbo clock).

---

Having known zeros outside the low 128 bits (from writing an xmm instead of
rotating a zmm) is unlikely to matter, although for FP stuff copying fewer
elements that might be subnormal could happen to be an advantage, maybe saving
an FP assist for denormal.  We're unlikely to be able to take advantage of it
to save instructions/uops (like OR instead of blend).  But it's not worse to
use a single extract instruction instead of a single valignd.

[Bug tree-optimization/79201] missed optimization: sinking doesn't handle calls, swap PRE and sinking

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79201

Andrew Pinski  changed:

   What|Removed |Added

 Status|RESOLVED|NEW
   Severity|normal  |enhancement
 Resolution|FIXED   |---

--- Comment #6 from Andrew Pinski  ---
So this is actually still broken, in the sense that if we turn off dominator
optimizations (-fno-tree-dominator-opts), the problem with the PRE and sink
interaction still comes into play.

The improvement came in via r8-2694, which in fact added
-fno-tree-dominator-opts to gcc.dg/tree-ssa/ssa-sink-16.c.
Note DOM actually does the sinking rather than the rotating of the loop.

Here is a testcase where DOM does not mess with the loop but we should still be
able to sink the function out, and we do when adding -fno-tree-pre:

int f(int n, int t) {
  int i, j = 0;
  if (t >= 31 || t < 0) return 100;

  for (i = 0; i < t; i++) {
    j = __builtin_ffs(i);
  }
  return j;
}
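
For reference, the effect hoped for here can be written out by hand. This is only a sketch, not compiler output: f_loop restates the testcase, while f_sunk and agree_on_range are hypothetical names. Reaching this form also requires knowing the final value of i, which is assumed here to be t - 1 whenever the loop runs at all.

```c
/* The loop from the testcase; n is unused there as well. */
int f_loop(int n, int t) {
    int i, j = 0;
    (void)n;
    if (t >= 31 || t < 0) return 100;
    for (i = 0; i < t; i++)
        j = __builtin_ffs(i);
    return j;
}

/* Hand-written equivalent with the call moved out of the loop.
   Only the value from the last iteration is live at the return. */
int f_sunk(int n, int t) {
    (void)n;
    if (t >= 31 || t < 0) return 100;
    if (t == 0) return 0;         /* loop body never runs, j stays 0 */
    return __builtin_ffs(t - 1);  /* last iteration has i == t - 1 */
}

/* Returns 1 when both versions agree for every t in [lo, hi]. */
int agree_on_range(int lo, int hi) {
    for (int t = lo; t <= hi; t++)
        if (f_loop(0, t) != f_sunk(0, t))
            return 0;
    return 1;
}
```

Checking agree_on_range(-5, 40) covers both early-return paths as well as every trip count the loop can take.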

[Bug fortran/93701] ICE on associate of wrongly accessed array

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93701

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |9.5

[Bug fortran/98472] internal compiler error: in gfc_conv_expr_descriptor, at fortran/trans-array.c:7352

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98472

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |9.5

[Bug fortran/100110] Parameterized Derived Types, problems with global variable

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100110

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |9.5

[Bug tree-optimization/102269] ICE in verify_gimple_stmt and -ftrivial-auto-var-init=zero and zero size array

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102269

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug tree-optimization/102183] sccvn compare predicated result issue in vn_nary_op_insert_into

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102183

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug c++/98332] [10/11 Regression] ICE in unshare_constructor, at cp/constexpr.c:1527 since r6-7607-g52228180f1e50cbb

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98332

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
   Target Milestone|--- |10.4

[Bug tree-optimization/23902] update_ssa very very very slow

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23902

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug testsuite/101646] [12 regression] excess errors after r12-2533

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101646

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug tree-optimization/57858] AVX2: ymm used for div, not for sqrt

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |8.0

[Bug c++/102228] lookup_anon_field is quadratic

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102228

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug sanitizer/97868] warn about using fences with TSAN

2021-09-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97868

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.0

[Bug sanitizer/97868] warn about using fences with TSAN

2021-09-11 Thread glisse at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97868

--- Comment #6 from Marc Glisse  ---
(In reply to pavlick from comment #5)
> Why is there false positive and no warning about the unsupported feature
> (atomic_thread_fence)?

You are probably using an old version of gcc. With a recent one, this prints

In function 'void std::atomic_thread_fence(std::memory_order)',
inlined from 'void Test::add()' at 3.cc:14:22:
/usr/lib/gcc-snapshot/include/c++/12/bits/atomic_base.h:126:26: warning:
'atomic_thread_fence' is not supported with '-fsanitize=thread' [-Wtsan]
  126 |   { __atomic_thread_fence(int(__m)); }
  | ~^~

[Bug sanitizer/97868] warn about using fences with TSAN

2021-09-11 Thread ispavlick at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97868

pavlick  changed:

   What|Removed |Added

 CC||ispavlick at gmail dot com

--- Comment #5 from pavlick  ---
Why is there false positive and no warning about the unsupported feature
(atomic_thread_fence)?

[code]
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>
using namespace std;

class Test {
    atomic_flag m_spin_lock;
    vector<int> m_data;
public:
    void add() {
        while (m_spin_lock.test_and_set(memory_order_relaxed))
            this_thread::yield();
        atomic_thread_fence(memory_order_acquire);
        while (true) {
            this_thread::sleep_for(300ms);
            m_data.push_back(4);
        }
        m_spin_lock.clear(memory_order_release);
    }
    void read() {
        size_t sz;
        if (! m_spin_lock.test_and_set(std::memory_order_acquire)) {
            sz = m_data.size();
            m_spin_lock.clear(std::memory_order_release);
        }
        (void)sz;
    }
}test;

int main() {
    jthread rt(&Test::read, &test);
    this_thread::sleep_for(10ms);
    test.add();
}

[/code]

$ g++ 3.cc -std=c++20 -Wtsan -Wall -fsanitize=thread
