[Bug c++/105779] [12/13 Regression] static function with auto return type not being resolved correctly

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105779

--- Comment #5 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> Here is an even more reduced testcase which is rejected rather than crashes.
> It is also rejected on the trunk:

Here is the error message for the rejection (which does not even make sense
since to the eye, the type looks exactly the same):
:13:15: error: invalid conversion from 'int (*)()' to 'int (*)()'
[-fpermissive]
   13 | int t = method(struct1<1>::apply);
  | ~~^~~
  |   |
  |   int (*)()
:11:17: note:   initializing argument 1 of 'int method(int (*)())'
   11 | int method(int(*f)());
  |~^~~~

[Bug c++/105779] [12/13 Regression] static function with auto return type not being resolved correctly

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105779

--- Comment #4 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> Here is an even more reduced testcase which is rejected rather than crashes.
> It is also rejected on the trunk:
> template <int>
> struct struct1
> {
>   static auto apply()
>   {
>     return 1;
>   }
> };
> 
> int method(int(*f)());
> 
> int t = method(struct1<1>::apply);

Note if you want to reproduce the crash, just change method to:
template <typename T>
int method(T(*f)());

[Bug c++/105779] [12/13 Regression] static function with auto return type not being resolved correctly

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105779

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |rejects-valid
            Summary|[12 Regression]             |[12/13 Regression] static
                   |internal_error on passing a |function with auto return
                   |pointer to static function  |type not being resolved
                   |to another function         |correctly

--- Comment #3 from Andrew Pinski  ---
Here is an even more reduced testcase which is rejected rather than crashes. It
is also rejected on the trunk:
template <int>
struct struct1
{
  static auto apply()
  {
    return 1;
  }
};

int method(int(*f)());

int t = method(struct1<1>::apply);

[Bug tree-optimization/105643] [13 Regression] Code-Size regression for specrate 538.imagick_r

2022-05-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105643

--- Comment #8 from Hongtao.liu  ---
It looks like code size decreased after
r13-754-ga1c9f779f75283427316b5c670c1e01ff8ce9ced.

We now have a cost model for loop unswitching:

decorate.c:605:25: note: not unswitching condition, cost too big (37 insns copied to 35 and 37)

[Bug analyzer/105784] New: -Wanalyzer-use-of-uninitialized-value false positive on partly initialized array

2022-05-30 Thread eggert at cs dot ucla.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105784

Bug ID: 105784
   Summary: -Wanalyzer-use-of-uninitialized-value false positive
on partly initialized array
   Product: gcc
   Version: 12.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: eggert at cs dot ucla.edu
  Target Milestone: ---

Created attachment 53056
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53056&action=edit
False positive with -O2 -fanalyzer -Wanalyzer-use-of-uninitialized-value

I found this bug with GCC 12.1.1 20220507 (Red Hat 12.1.1-1) on x86-64. Compile
the attached program x.i (which is simplified from GNU Emacs master) with:

gcc -O2 -fanalyzer -Wanalyzer-use-of-uninitialized-value -S x.i

The GCC output is as follows. This is a false positive, since *src must point
into the initialized part of the array.

x.i: In function ‘ccl_driver’:
x.i:13:11: warning: use of uninitialized value ‘*src’ [CWE-457]
[-Wanalyzer-use-of-uninitialized-value]
   13 | i = *src++;
  | ~~^~~~
  ‘Fccl_execute_on_string’: events 1-5
|
|   19 | Fccl_execute_on_string (char *str, long str_bytes)
|  | ^~
|  | |
|  | (1) entry to ‘Fccl_execute_on_string’
|..
|   25 |   int source[1024];
|  |   ~~
|  |   |
|  |   (2) region created on stack here
|..
|   28 |   while (src_size < 1024 && p < endp)
|  |  ~~~
|  |  |
|  |  (3) following ‘false’ branch...
|..
|   31 |   ccl_driver (source, src_size);
|  |   ~
|  |   |
|  |   (4) ...to here
|  |   (5) calling ‘ccl_driver’ from ‘Fccl_execute_on_string’
|
+--> ‘ccl_driver’: events 6-11
   |
   |5 | ccl_driver (int *source, int src_size)
   |  | ^~
   |  | |
   |  | (6) entry to ‘ccl_driver’
   |..
   |   10 |   while (!quit_flag)
   |  |  ~~
   |  |  |
   |  |  (7) following ‘false’ branch...
   |   11 | {
   |   12 |   if (src < src_end)
   |  |  ~
   |  |  |
   |  |  (8) ...to here
   |  |  (9) following ‘true’ branch (when ‘src <
src_end’)...
   |   13 | i = *src++;
   |  | ~~
   |  |   | |
   |  |   | (10) ...to here
   |  |   (11) use of uninitialized value ‘*src’ here
   |

[Bug analyzer/105783] New: -Wanalyzer-null-dereference false positive with union and functions

2022-05-30 Thread kamilcukrowski at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105783

Bug ID: 105783
   Summary: -Wanalyzer-null-dereference false positive with union
and functions
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: kamilcukrowski at gmail dot com
  Target Milestone: ---

> the exact version of GCC; the system type; the options given when GCC was 
> configured/built;

```
$ gcc --version
gcc (GCC) 12.1.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ cat /etc/arch-release 
Arch Linux release
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/12.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-bootstrap
--prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit
--enable-cet=auto --enable-checking=release --enable-clocale=gnu
--enable-default-pie --enable-default-ssp --enable-gnu-indirect-function
--enable-gnu-unique-object --enable-linker-build-id --enable-lto
--enable-multilib --enable-plugin --enable-shared --enable-threads=posix
--disable-libssp --disable-libstdcxx-pch --disable-werror
--with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.1.0 (GCC) 
```

> the complete command line that triggers the bug ; the compiler output (error 
> messages, warnings, etc.);

I have the following MCVE:

```
struct ss_s {
    union out_or_counting_u {
        char *newstr;
        unsigned long long cnt;
    } uu;
    _Bool counting;
};

struct ss_s ss_init(void) {
    struct ss_s rr = { .counting = 1 };
    return rr;
}

void ss_out(struct ss_s *t, char cc) {
    if (!t->counting) {
        *t->uu.newstr++ = cc;
    }
}

int main() {
    struct ss_s ss = ss_init();
    ss_out(&ss, 'a');
}

```

Compiling with gcc12.1 with `-fanalyzer -O` results in
https://godbolt.org/z/K84Pr1zcx :

```
: In function 'ss_out':
:16:33: warning: dereference of NULL '0' [CWE-476]
[-Wanalyzer-null-dereference]
   16 | *t->uu.newstr++ = cc;
  | ^~~~
  'main': events 1-2
|
|   20 | int main() {
|  | ^~~~
|  | |
|  | (1) entry to 'main'
|   21 | struct ss_s ss = ss_init();
|   22 | ss_out(&ss, 'a');
|  | 
|  | |
|  | (2) calling 'ss_out' from 'main'
|
+--> 'ss_out': events 3-7
   |
   |   14 | void ss_out(struct ss_s *t, char cc) {
   |  |  ^~
   |  |  |
   |  |  (3) entry to 'ss_out'
   |   15 | if (!t->counting) {
   |  |~
   |  ||
   |  |(4) following 'false' branch...
   |   16 | *t->uu.newstr++ = cc;
   |  | 
   |  | | | |
   |  | | | (7) dereference of NULL
'*t.uu.newstr'
   |  | | (6) '0' is NULL
   |  | (5) ...to here
   |
```

The pointer will never be dereferenced, because `t->counting` is true. GCC
seems to take the wrong branch on line 15 `if (!t->counting) {` inside
`ss_out`. Changing seemingly random things makes the problem go away, like
changing `counting` from `bool` to `int` or changing `count` from `size_t`
to `unsigned`.

Thanks for amazing gcc!

[Bug middle-end/105781] GCC does not unroll auto-vectorized loops.

2022-05-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105781

Hongtao.liu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #4 from Hongtao.liu  ---
Just note that we can also use *#pragma GCC unroll n* to specify the unrolling
factor. The pragma must be placed immediately before the loop.

void double_elements(int* f, int* l, int v) {
#pragma GCC unroll 4
    while (f != l) {
        *f = *f + *f;
        ++f;
    }
}
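
To make the pragma usage above concrete, here is a minimal sketch in C. The
names double_range and check_double_range are hypothetical and do not come
from the bug report; the pragma only requests an unroll factor and does not
change the result, which is what the self-check verifies.

```c
#include <assert.h>
#include <stddef.h>

/* Doubles every element in [first, last).
   The pragma sits immediately before the loop it applies to. */
void double_range(int *first, int *last) {
#pragma GCC unroll 4
    while (first != last) {
        *first = *first + *first;
        ++first;
    }
}

/* Hypothetical self-check: the unroll request must not alter the values. */
void check_double_range(void) {
    int data[] = {1, 2, 3, 4, 5, 6, 7};
    double_range(data, data + 7);
    for (size_t i = 0; i < 7; ++i)
        assert(data[i] == 2 * (int)(i + 1));
}
```

Calling check_double_range() from a test driver is enough to exercise it; the
generated assembly has to be inspected to see whether the loop was actually
unrolled.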

[Bug target/89929] __attribute__((target("avx512bw"))) doesn't work on non avx512bw systems

2022-05-30 Thread elrodc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89929

--- Comment #32 from Chris Elrod  ---
Ha, I accidentally misreported my gcc version. I was already using 12.1.1.

Using x86-64-v4 worked, excellent! Thanks.

[Bug target/89929] __attribute__((target("avx512bw"))) doesn't work on non avx512bw systems

2022-05-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89929

Hongtao.liu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #31 from Hongtao.liu  ---
(In reply to Chris Elrod from comment #30)
> > #if defined(__clang__)
> > #define MULTIVERSION \
> >   __attribute__((target_clones("avx512dq", "avx2", "default")))
> > #else
> > #define MULTIVERSION \
> >   __attribute__((target_clones( \
> >       "arch=skylake-avx512,arch=cascadelake,arch=icelake-client,arch=" \
> >       "tigerlake," \
> >       "arch=icelake-server,arch=sapphirerapids,arch=cooperlake", \
> >       "avx2", "default")))
> > #endif
> 
> For example, I can do something like this, but gcc produces a ton of
> unnecessary duplicates for each of the avx512dq architectures. There must be
> a better way.

Maybe you can use __attribute__((target_clones("arch=x86-64-v4", "avx2",
"default"))). Note that it works only with GCC 12.1 and trunk, not with
GCC 11.2.
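
As a rough illustration of the suggestion above, the following C sketch
applies the attribute to a small function. The function sum_ints, the
self-check, and the preprocessor guard are all assumptions made for this
example: the guard simply restricts the attribute to GCC 12 or newer on
x86-64 with glibc (where ifunc is available) and leaves the macro empty
elsewhere.

```c
#include <assert.h>  /* on glibc this also brings in the __GLIBC__ macro */
#include <stddef.h>

/* Assumption: request clones only where the thread reports they work
   (GCC 12 or newer), on x86-64, with glibc providing ifunc support.
   Everywhere else MULTIVERSION expands to nothing. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 12 && \
    defined(__x86_64__) && defined(__GLIBC__)
#define MULTIVERSION \
  __attribute__((target_clones("arch=x86-64-v4", "avx2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical kernel: sums n ints. With the attribute active, GCC emits
   one clone per listed target and picks one at load time. */
MULTIVERSION
long sum_ints(const int *data, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

/* Hypothetical self-check: every clone must compute the same sum. */
void check_sum_ints(void) {
    int data[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(sum_ints(data, 10) == 55);
}
```

A single "arch=x86-64-v4" entry replaces the long list of per-CPU arch=
strings, which is what avoids the duplicated AVX-512 clones mentioned in
comment #30.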

[Bug tree-optimization/105780] GCC does not vectorise filling array of integers with a value on sse2

2022-05-30 Thread denis.yaroshevskij at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105780

--- Comment #3 from Denis Yaroshevskiy  ---
My bad then

[Bug target/89929] __attribute__((target("avx512bw"))) doesn't work on non avx512bw systems

2022-05-30 Thread elrodc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89929

--- Comment #30 from Chris Elrod  ---
> #if defined(__clang__)
> #define MULTIVERSION \
>   __attribute__((target_clones("avx512dq", "avx2", "default")))
> #else
> #define MULTIVERSION \
>   __attribute__((target_clones( \
>       "arch=skylake-avx512,arch=cascadelake,arch=icelake-client,arch=" \
>       "tigerlake," \
>       "arch=icelake-server,arch=sapphirerapids,arch=cooperlake", \
>       "avx2", "default")))
> #endif

For example, I can do something like this, but gcc produces a ton of
unnecessary duplicates for each of the avx512dq architectures. There must be a
better way.

[Bug middle-end/105781] GCC does not unroll auto-vectorized loops.

2022-05-30 Thread denis.yaroshevskij at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105781

--- Comment #3 from Denis Yaroshevskiy  ---
Thank you, feel free to close then

[Bug c++/105779] [12 Regression] internal_error on passing a pointer to static function to another function

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105779

--- Comment #2 from Andrew Pinski  ---
using WrappedT = typename TypeWrapperT::type;

is important ...

[Bug tree-optimization/105780] GCC does not vectorise filling array of integers with a value on sse2

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105780

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> And has been since GCC 6.

Since GCC 4.4.x. It is not unrolled but that is PR 105781.

[Bug tree-optimization/105780] GCC does not vectorise filling array of integers with a value on sse2

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105780

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Andrew Pinski  ---
It was vectorized:
        movd    xmm1, edx
        shr     rsi, 2
        pshufd  xmm0, xmm1, 0
        sal     rsi, 4
        add     rsi, rcx
.L4:
        movups  XMMWORD PTR [rax], xmm0
        add     rax, 16
        cmp     rsi, rax
        jne     .L4

And has been since GCC 6.

[Bug target/89929] __attribute__((target("avx512bw"))) doesn't work on non avx512bw systems

2022-05-30 Thread elrodc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89929

Chris Elrod  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |elrodc at gmail dot com

--- Comment #29 from Chris Elrod  ---
"RESOLVED FIXED". I haven't tried this with `target`, but avx512bw does not
work with target_clones with gcc 11.2, but it does with clang 14.

[Bug middle-end/105781] GCC does not unroll auto-vectorized loops.

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105781

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski  ---
Yes adding -funroll-loops unrolls the loops as expected.

I don't think there is anything to do here really.

[Bug target/105782] New: [sparc64] Emission of questionable movxtod/movdtox with -mvis3

2022-05-30 Thread koachan+gccbugs at protonmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105782

Bug ID: 105782
   Summary: [sparc64] Emission of questionable movxtod/movdtox
with -mvis3
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: koachan+gccbugs at protonmail dot com
  Target Milestone: ---

Created attachment 53055
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53055&action=edit
The problematic function, adapted for standalone compilation

Hello, I found out that the blake2b implementation in monocypher runs much
slower on a SPARC T4 when compiled with `-O3 -mvis3`, as opposed to plain
`-O3`:

With plain -O3:  Blake2b : 184 megabytes  per second
With -O3 -mvis3: Blake2b : 118 megabytes  per second

(Results are from monocypher's `make speed` benchmark)

Looking at the generated assembly, it seems that when the code is compiled with
-mvis3, GCC emits a lot of questionable `movxtod`/`movdtox` instructions?

I'm using sparc64-linux-gnu-gcc (GCC) 12.1.0.

[Bug target/105778] Shift by register --- unnecessary AND instruction

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105778

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
I think:
--- gcc/config/i386/i386.md.jj  2022-05-30 14:07:11.988199636 +0200
+++ gcc/config/i386/i386.md 2022-05-31 00:39:08.031757037 +0200
@@ -12708,19 +12708,21 @@ (define_expand "3"
   ""
   "ix86_expand_binary_operator (, mode, operands); DONE;")

+(define_mode_iterator SWI48A [SI (DI "TARGET_64BIT")])
+
 ;; Avoid useless masking of count operand.
-(define_insn_and_split "*3_mask"
+(define_insn_and_split "*3_mask_"
   [(set (match_operand:SWI48 0 "nonimmediate_operand")
(any_shiftrt:SWI48
  (match_operand:SWI48 1 "nonimmediate_operand")
  (subreg:QI
-   (and:SI
- (match_operand:SI 2 "register_operand" "c,r")
- (match_operand:SI 3 "const_int_operand")) 0)))
+   (and:SWI48A
+ (match_operand:SWI48A 2 "register_operand" "c,r")
+ (match_operand:SWI48A 3 "const_int_operand")) 0)))
(clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (, mode, operands)
-   && (INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode)-1))
-  == GET_MODE_BITSIZE (mode)-1
+  "ix86_binary_operator_ok (, mode, operands)
+   && (INTVAL (operands[3]) & (GET_MODE_BITSIZE (mode)-1))
+  == GET_MODE_BITSIZE (mode)-1
&& ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -12754,16 +12756,16 @@ (define_insn_and_split "*3_m
   ""
   [(set_attr "isa" "*,bmi2")])

-(define_insn_and_split "*3_doubleword_mask"
-  [(set (match_operand: 0 "register_operand")
-   (any_shiftrt:
- (match_operand: 1 "register_operand")
+(define_insn_and_split "*3_doubleword_mask_"
+  [(set (match_operand: 0 "register_operand")
+   (any_shiftrt:
+ (match_operand: 1 "register_operand")
  (subreg:QI
-   (and:SI
- (match_operand:SI 2 "register_operand" "c")
- (match_operand:SI 3 "const_int_operand")) 0)))
+   (and:SWI48
+ (match_operand:SWI48 2 "register_operand" "c")
+ (match_operand:SWI48 3 "const_int_operand")) 0)))
(clobber (reg:CC FLAGS_REG))]
-  "(INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0
+  "(INTVAL (operands[3]) & ( * BITS_PER_UNIT)) == 0
&& ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -12772,7 +12774,8 @@ (define_insn_and_split "*3_do
   (ior:DWIH (lshiftrt:DWIH (match_dup 4)
   (and:QI (match_dup 2) (match_dup 8)))
 (subreg:DWIH
-  (ashift: (zero_extend: (match_dup 7))
+  (ashift:
+(zero_extend: (match_dup 7))
 (minus:QI (match_dup 9)
   (and:QI (match_dup 2) (match_dup 8 0)))
   (clobber (reg:CC FLAGS_REG))])
@@ -12781,13 +12784,14 @@ (define_insn_and_split "*3_do
   (any_shiftrt:DWIH (match_dup 7) (match_dup 2)))
   (clobber (reg:CC FLAGS_REG))])]
 {
-  split_double_mode (mode, [0], 2, [4], [6]);
+  split_double_mode (mode, [0], 2, [4],
+[6]);

-  operands[8] = GEN_INT ( * BITS_PER_UNIT - 1);
-  operands[9] = GEN_INT ( * BITS_PER_UNIT);
+  operands[8] = GEN_INT ( * BITS_PER_UNIT - 1);
+  operands[9] = GEN_INT ( * BITS_PER_UNIT);

-  if ((INTVAL (operands[3]) & (( * BITS_PER_UNIT) - 1))
-  != (( * BITS_PER_UNIT) - 1))
+  if ((INTVAL (operands[3]) & (( * BITS_PER_UNIT) - 1))
+  != (( * BITS_PER_UNIT) - 1))
 {
   rtx tem = gen_reg_rtx (SImode);
   emit_insn (gen_andsi3 (tem, operands[2], operands[3]));
could fix this.  Wonder if it couldn't be written without the extra iterator
though...

[Bug middle-end/105781] GCC does not unroll auto-vectorized loops.

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105781

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c++                         |middle-end
           Keywords|                            |missed-optimization

--- Comment #1 from Andrew Pinski  ---
Unrolling really happens only with -funroll-loops. GCC does some unrolling by
default at -O2 and some more at -O3, but you need the extra flag to get the
most out of it.

[Bug tree-optimization/105780] GCC does not vectorise filling array of integers with a value on sse2

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105780

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

[Bug c++/105781] New: GCC does not unroll auto-vectorized loops.

2022-05-30 Thread denis.yaroshevskij at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105781

Bug ID: 105781
   Summary: GCC does not unroll auto-vectorized loops.
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: denis.yaroshevskij at gmail dot com
  Target Milestone: ---

Even when vectorizing very simple loops, GCC uses an unroll factor of 1.
Clang unrolls by 8, for context.

Example: 

void double_elements(int* f, int* l, int v) {
    while (f != l) {
        *f = *f + *f;
        ++f;
    }
}


https://godbolt.org/z/hTx84essY

In many measurements unrolling such code by a factor of 4 is beneficial.

[Bug c++/105780] New: GCC does not vectorise filling array of integers with a value on sse2

2022-05-30 Thread denis.yaroshevskij at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105780

Bug ID: 105780
   Summary: GCC does not vectorise filling array of integers with
a value on sse2
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: denis.yaroshevskij at gmail dot com
  Target Milestone: ---

The following code snippet I believe should be auto-vectorised even on sse2

```
void fill(int* f, int* l, int v) {
    while (f != l) {
        *f++ = v;
    }
}
```

Clang does it: https://godbolt.org/z/3bEeE8E8n

[Bug c++/105779] [12 Regression] internal_error on passing a pointer to static function to another function

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105779

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||12.1.0
  Known to work||11.3.0, 13.0
   Target Milestone|--- |12.2
   Last reconfirmed||2022-05-30
   Keywords||ice-on-valid-code
Summary|internal_error on passing a |[12 Regression]
   |pointer to static function  |internal_error on passing a
   |to another function |pointer to static function
   ||to another function
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
This is interesting as it works on the trunk ...

[Bug c++/105779] New: internal_error on passing a pointer to static function to another function

2022-05-30 Thread bart at bartjanssens dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105779

Bug ID: 105779
   Summary: internal_error on passing a pointer to static function
to another function
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bart at bartjanssens dot org
  Target Milestone: ---

Created attachment 53054
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53054&action=edit
Minimal example to reproduce this

Compiling the attached test case fails. Explicitly adding & on line 23 to
indicate that method takes a pointer to a function solves the problem, but I
believe both forms should compile (as they did prior to GCC 12), and the
compiler should certainly not crash.

Compile command: g++ -Wall -Wextra test-pointer.cpp
Output:
during RTL pass: expand
test-pointer.cpp: In member function ‘int
WrapSmartPointer::operator()(TypeWrapperT&&) [with TypeWrapperT = IntWrapper]’:
test-pointer.cpp:23:18: internal compiler error: Segmentation fault
   23 | return method(DereferenceSmartPointer::apply); //
method(::apply) works
  |~~^~
0x1ac4724 internal_error(char const*, ...)
???:0
0x11072d7 make_decl_rtl(tree_node*)
???:0
0x95ad77 expand_call(tree_node*, rtx_def*, int)
???:0
0xab27af expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
???:0
0xa930b4 store_expr(tree_node*, rtx_def*, int, bool, bool)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

GCC version: g++ (GCC) 12.1.0 (Arch Linux package)
System: Linux 5.15.43-1-lts #1 SMP Wed, 25 May 2022 14:08:34 + x86_64
GNU/Linux

[Bug target/105778] New: Rotate by register --- unnecessary AND instruction

2022-05-30 Thread zero at smallinteger dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105778

Bug ID: 105778
   Summary: Rotate by register --- unnecessary AND instruction
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zero at smallinteger dot com
  Target Milestone: ---

Created attachment 53053
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53053&action=edit
Sample code

With -O2, some x86 shift-by-register instructions are preceded by an
unnecessary AND instruction.  The AND instruction is unnecessary because the
shift-by-register instructions already mask the register containing the
variable shift.

In the sample code, the #if 0 branch produces the code

mov rax, rdi
mov ecx, edi
shr rax, cl
ret

but the #if 1 branch produces the code

mov rcx, rdi
mov rax, rdi
and ecx, 63
shr rax, cl
ret

even though the code has the same behavior.  Note that the and ecx, 63 is
unnecessary here because shr rax, cl will already operate on the bottom 6 bits
of ecx anyway, as per the Intel manual.

As noted in the code's comments, some explicit masks other than 0x3f may
produce even more inefficient code, e.g.:

movabs  rcx, 35184372088831
mov rax, rdi
and rcx, rdi
shr rax, cl
ret

while some other masks like 0xff and 0x eliminate the explicit and
altogether.

Found with gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0.  Verified with godbolt
for all gcc versions from 9.4.0 through trunk.

For the sake of completeness, I could not get clang to reproduce this problem. 
The latest classic ICC compiler available in Godbolt (2021.5.0) can emit code
with MOVABS as above.  However, the newer ICX Intel compiler behaves like clang
(this seems reasonable).
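
The attached sample code is not reproduced in this message, so the following
is only a sketch of the kind of source being discussed, written in C. The
function names are hypothetical. shr_masked masks the count explicitly (which
C requires whenever the count might be 64 or more), and shr_reference models
the hardware behaviour described above, where only the bottom 6 bits of the
count are used. The two always agree, which is why the explicit AND is
redundant at the instruction level.

```c
#include <assert.h>
#include <stdint.h>

/* Shift with the count masked explicitly to its low 6 bits. */
uint64_t shr_masked(uint64_t x, uint64_t count) {
    return x >> (count & 63);
}

/* Reference model of a 64-bit shift that only looks at the bottom
   6 bits of the count, performed one bit at a time. */
uint64_t shr_reference(uint64_t x, uint64_t count) {
    unsigned low6 = (unsigned)(count % 64);
    uint64_t result = x;
    for (unsigned i = 0; i < low6; ++i)
        result >>= 1;
    return result;
}

/* Hypothetical self-check over a few counts, including ones above 63. */
void check_shift_mask(void) {
    const uint64_t x = 0xF0F0F0F0F0F0F0F0ull;
    const uint64_t counts[] = {0, 1, 31, 63, 64, 65, 127, 1000};
    for (unsigned i = 0; i < sizeof counts / sizeof counts[0]; ++i)
        assert(shr_masked(x, counts[i]) == shr_reference(x, counts[i]));
}
```

Compiling a function like shr_masked with -O2 and inspecting the assembly is
one way to check whether a given compiler version still emits the extra AND.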

[Bug c++/99080] Add !TYPE_P assert to type_dependent_expression_p

2022-05-30 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99080

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #2 from Marek Polacek  ---
Done.

[Bug c++/99080] Add !TYPE_P assert to type_dependent_expression_p

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99080

--- Comment #1 from CVS Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:ff91735a5b861dd6eaf2c1e511f26a9255898e7d

commit r13-860-gff91735a5b861dd6eaf2c1e511f26a9255898e7d
Author: Marek Polacek 
Date:   Fri May 13 20:09:53 2022 -0400

c++: Add !TYPE_P assert to type_dependent_expression_p [PR99080]

As discussed here:
,
type_dependent_expression_p should not be called with a type argument.

I promised I'd add an assert so here it is.  One place needed adjusting.

PR c++/99080

gcc/cp/ChangeLog:

* pt.cc (type_dependent_expression_p): Assert !TYPE_P.
* semantics.cc (finish_id_expression_1): Handle
UNBOUND_CLASS_TEMPLATE
specifically.

[Bug fortran/91300] Wrong runtime error message with allocate and errmsg=

2022-05-30 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91300

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #9 from anlauf at gcc dot gnu.org ---
Fixed for gcc-13.  Closing.

Thanks for the report!

[Bug middle-end/105777] Failure to optimize __builtin_mul_overflow with constant operand to add+cmp check

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105777

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-05-30
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski  ---
It depends on how fast the multiply is so it is a middle-end issue where expand
should be able to do it if the multiply is slow enough.

[Bug rtl-optimization/7061] Access of bytes in struct parameters

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:1ad584d538d349db13cfa8440222d91d5e9aff3f

commit r13-859-g1ad584d538d349db13cfa8440222d91d5e9aff3f
Author: Roger Sayle 
Date:   Mon May 30 21:32:58 2022 +0100

Allow SCmode and DImode to be tieable with TARGET_64BIT on x86_64.

This patch is a form of insurance policy in case my patch for PR 7061 runs
into problems on non-x86 targets; the middle-end can add an extra check
that the backend is happy placing SCmode and DImode values in the same
register, before creating a SUBREG.  Unfortunately, ix86_modes_tieable_p
currently claims this is not allowed(?), even though the default target
hook for modes_tieable_p is to always return true [i.e. false can be
used to specifically prohibit bad combinations], and the x86_64 ABI
passes SCmode values in DImode registers!  This makes the backend's
modes_tieable_p hook a little more forgiving, and additionally enables
interconversion between SCmode and V2SFmode, and between DCmode and
V2DFmode, which opens interesting opportunities in the future.

2022-05-30  Roger Sayle  

gcc/ChangeLog
* config/i386/i386.cc (ix86_modes_tieable_p): Allow SCmode to be
tieable with DImode on TARGET_64BIT, and SCmode tieable with
V2SFmode, and DCmode with V2DFmode.

[Bug tree-optimization/105777] New: Failure to optimize __builtin_mul_overflow with constant operand to add+cmp check

2022-05-30 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105777

Bug ID: 105777
   Summary: Failure to optimize __builtin_mul_overflow with
constant operand to add+cmp check
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

int f17(unsigned x)
{
    int z;
    return __builtin_mul_overflow((int)x, 35, &z);
}

This can be optimized to `return (x + 0xFC57C57C) < 0xF8AF8AF9;` (and I'd
assume the same pattern with other constants than 35 should be optimizable in
the same way). LLVM does this transformation, but GCC does not.
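
The equivalence claimed above can be spot-checked with a small C sketch. The
function names are hypothetical, and the code assumes 32-bit int and unsigned
plus the usual wrapping conversion from unsigned to int. The constants follow
from INT_MAX / 35 = 61356675: the product fits exactly when (int)x lies in
[-61356675, 61356675], and adding 0xFC57C57C (which is -61356676 modulo 2^32)
turns that two-sided range test into a single unsigned comparison.

```c
#include <assert.h>
#include <limits.h>

/* Overflow check as written in the report. */
int mul35_overflows_builtin(unsigned x) {
    int z;
    return __builtin_mul_overflow((int)x, 35, &z);
}

/* Proposed replacement: one add and one unsigned compare. */
int mul35_overflows_range(unsigned x) {
    return (x + 0xFC57C57Cu) < 0xF8AF8AF9u;
}

/* Hypothetical self-check at the boundaries of the non-overflowing range. */
void check_mul35(void) {
    const unsigned samples[] = {
        0u, 1u, 61356675u, 61356676u,
        (unsigned)-61356675, (unsigned)-61356676,
        (unsigned)INT_MAX, (unsigned)INT_MIN, UINT_MAX
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(mul35_overflows_builtin(samples[i]) ==
               mul35_overflows_range(samples[i]));
}
```

This only samples boundary values; an exhaustive loop over all 2^32 inputs
would be needed to confirm the identity completely.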

[Bug fortran/91300] Wrong runtime error message with allocate and errmsg=

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91300

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:871dbb6112e22ff92914613c332944fd19dd39a8

commit r13-858-g871dbb6112e22ff92914613c332944fd19dd39a8
Author: Harald Anlauf 
Date:   Sat May 28 22:02:20 2022 +0200

Fortran: improve runtime error message with ALLOCATE and ERRMSG= [PR91300]

ALLOCATE: generate different STAT,ERRMSG results for failures from
allocation of already allocated objects or insufficient virtual memory.

gcc/fortran/ChangeLog:

PR fortran/91300
* libgfortran.h: Define new error code LIBERROR_NO_MEMORY.
* trans-stmt.cc (gfc_trans_allocate): Generate code for setting
ERRMSG depending on result of STAT result of ALLOCATE.
* trans.cc (gfc_allocate_using_malloc): Use STAT value of
LIBERROR_NO_MEMORY in case of failed malloc.

gcc/testsuite/ChangeLog:

PR fortran/91300
* gfortran.dg/allocate_alloc_opt_15.f90: New test.

[Bug rtl-optimization/101617] a ? -1 : 1 -> (-(type)a) | 1

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101617

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:f1652e3343b1ec47035370801d9b9aca1f8b613f

commit r13-857-gf1652e3343b1ec47035370801d9b9aca1f8b613f
Author: Roger Sayle 
Date:   Mon May 30 21:26:37 2022 +0100

PR rtl-optimization/101617: Use neg/sbb in ix86_expand_int_movcc.

This patch resolves PR rtl-optimization/101617 where we should generate
the exact same code for (X ? -1 : 1) as we do for ((X ? -1 : 0) | 1).
The cause of the current difference on x86_64 is actually in
ix86_expand_int_movcc that doesn't know that negl;sbbl can be used
to create a -1/0 result depending on whether the input is zero/nonzero.

So for Andrew Pinski's test case:

int f1(int i)
{
  return i ? -1 : 1;
}

GCC currently generates:

f1:     cmpl    $1, %edi
        sbbl    %eax, %eax      // x ? 0 : -1
        andl    $2, %eax        // x ? 0 : 2
        subl    $1, %eax        // x ? -1 : 1
        ret

but with the attached patch, now generates:

f1:     negl    %edi
        sbbl    %eax, %eax      // x ? -1 : 0
        orl     $1, %eax        // x ? -1 : 1
        ret

To implement this I needed to add two expanders to i386.md to generate
the required instructions (in both SImode and DImode) matching the
pre-existing define_insns of the same name.
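
As an illustration of why the shorter sequence is equivalent, the following C
sketch models negl/sbbl/orl step by step and compares the result with the
plain conditional.  The function names are made up for this example and the
code is not taken from GCC; it assumes a two's complement target.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: what the source code says. */
int32_t select_ref(int32_t x)
{
    return x ? -1 : 1;
}

/* Branch-free model of negl; sbbl; orl.  This mirrors the instruction
   sequence in plain C; it is not code taken from GCC. */
int32_t select_neg_sbb(int32_t x)
{
    int32_t carry = (x != 0);   /* negl sets the carry flag iff x is nonzero */
    int32_t mask = -carry;      /* sbbl %eax, %eax: -1 if carry, else 0 */
    return mask | 1;            /* orl $1, %eax: -1 stays -1, 0 becomes 1 */
}

/* Small self-check over a few representative inputs. */
void check_select(void)
{
    const int32_t samples[] = { 0, 1, -1, 2, INT32_MAX, INT32_MIN };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(select_neg_sbb(samples[i]) == select_ref(samples[i]));
}
```

Calling check_select() from a test program exercises both forms on the same
inputs.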

2022-05-30  Roger Sayle  

gcc/ChangeLog
PR rtl-optimization/101617
* config/i386/i386-expand.cc (ix86_expand_int_movcc): Add a
special case (indicated by negate_cc_compare_p) to generate a
-1/0 mask using neg;sbb.
* config/i386/i386.md (x86_neg_ccc): New define_expand
to generate an *x86_neg_ccc instruction.
(x86_movcc_0_m1_neg): Likewise, a new define_expand to
generate a *x86_movcc_0_m1_neg instruction.

gcc/testsuite/ChangeLog
PR rtl-optimization/101617
* gcc.target/i386/pr101617.c: New test case.

[Bug middle-end/89845] Consider improving division and modulo by constant if highpart multiply is cheap

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89845

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:2a12adfa8bd61e46538ebd97ae927d594843026a

commit r13-856-g2a12adfa8bd61e46538ebd97ae927d594843026a
Author: Roger Sayle 
Date:   Mon May 30 21:23:15 2022 +0100

Make the default rtx_costs of MULT/DIV variants consistent.

GCC's middle-end provides a default cost model for RTL expressions, for
backends that don't specify their own instruction timings, that can be
summarized as multiplications are COSTS_N_INSNS(4), divisions are
COSTS_N_INSNS(7) and all other operations are COSTS_N_INSNS(1).
This patch tweaks the above definition so that fused-multiply-add
(FMA) and high-part multiplications cost the same as regular
multiplications, or more importantly aren't (by default) considered
less expensive.  Likewise the saturating forms of multiplication and
division cost the same as the regular variants.  These values can
always be changed by the target, but the goal is to avoid RTL
expansion substituting a suitable operation with its saturating
equivalent because it (accidentally) looks much cheaper.  For example,
PR 89845 is about implementing division/modulus via highpart multiply,
which may accidentally look extremely cheap.

2022-05-30  Roger Sayle  

gcc/ChangeLog
* rtlanal.cc (rtx_cost) : Treat FMA, SS_MULT, US_MULT,
SMUL_HIGHPART and UMUL_HIGHPART as having the same cost as MULT.
: Likewise, SS_DIV and US_DIV have the same default as DIV.

[Bug target/70321] [10/11/12/13 Regression] STV generates less optimized code

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70321

--- Comment #25 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:43201f2c2173894bf7c423cad6da1c21567e06c0

commit r13-855-g43201f2c2173894bf7c423cad6da1c21567e06c0
Author: Roger Sayle 
Date:   Mon May 30 21:20:09 2022 +0100

PR target/70321: Split double word equality/inequality after STV on x86.

This patch resolves the last piece of PR target/70321 a code quality
(P2 regression) affecting mainline.  Currently, for HJ's testcase:

void foo (long long ixi)
{
  if (ixi != 14348907)
    __builtin_abort ();
}

GCC with -m32 -O2 generates four instructions for the comparison:

movl    16(%esp), %eax
movl    20(%esp), %edx
xorl    $14348907, %eax
orl     %eax, %edx

but with this patch it now requires only three, making better use of
x86's addressing modes:

movl    16(%esp), %eax
xorl    $14348907, %eax
orl     20(%esp), %eax
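
The idea behind both sequences can be sketched in C: a 64-bit value equals a
constant exactly when the xor of each 32-bit half with the matching half of
the constant is zero, so or-ing the two xor results gives zero only on
equality.  The function names below are hypothetical and the code is only a
model of the instructions, not GCC source.  When the high half of the constant
is zero, as with 14348907, the second xor is a no-op, which is why the high
word can be or-ed straight from memory.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero iff the 64-bit value whose halves are lo and hi differs
   from the constant c, using only 32-bit xor and or. */
int dw_not_equal(uint32_t lo, uint32_t hi, uint64_t c)
{
    uint32_t diff_lo = lo ^ (uint32_t)c;          /* xorl $const_lo, %eax */
    uint32_t diff_hi = hi ^ (uint32_t)(c >> 32);  /* no-op when high half of c is 0 */
    return (diff_lo | diff_hi) != 0;              /* orl: zero iff both halves match */
}

/* Small self-check against the ordinary 64-bit comparison. */
void check_dw_not_equal(void)
{
    const uint64_t c = 14348907;
    const uint64_t samples[] = { 0, 14348907, 14348908, UINT64_C(1) << 32,
                                 (UINT64_C(1) << 32) | 14348907, UINT64_MAX };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint64_t v = samples[i];
        assert(dw_not_equal((uint32_t)v, (uint32_t)(v >> 32), c) == (v != c));
    }
}
```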

The solution is to expand "doubleword" equality/inequality expressions
using flag setting COMPARE instructions for the early RTL passes, and
then split them during split1, after STV and before reload.
Hence on x86_64, we now see/allow things like:

(insn 11 8 12 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg/v:TI 84 [ x ])
(reg:TI 96))) "cmpti.c":2:43 30 {*cmpti_doubleword}

This allows the STV pass to decide whether it's preferable to perform
this comparison using vector operations, i.e. a pxor/ptest sequence,
or as scalar integer operations, i.e. a xor/xor/or sequence.  Alas
this required tweaking of the STV pass to recognize the "new" form of
these comparisons and split out the pxor operation itself.  To confirm
this still works as expected I've added a new STV test case:

long long a[1024];
long long b[1024];

int foo()
{
  for (int i=0; i<1024; i++)
  {
    long long t = (a[i]<<8) | (b[i]<<24);
    if (t == 0)
      return 1;
  }
  return 0;
}

where with -m32 -O2 -msse4.1 the above comparison with zero should look
like:

punpcklqdq  %xmm0, %xmm0
ptest   %xmm0, %xmm0

Although this patch includes one or two minor tweaks to provide all the
necessary infrastructure to support conversion of TImode comparisons to
V1TImode (and SImode comparisons to V4SImode), STV doesn't yet implement
these transformations, but this is something that can be considered after
stage 4.  Indeed the new convert_compare functionality is split out
into a method to simplify its potential reuse by the timode_scalar_chain
class.

2022-05-30  Roger Sayle  

gcc/ChangeLog
PR target/70321
* config/i386/i386-expand.cc (ix86_expand_branch): Don't decompose
DI mode equality/inequality using XOR here.  Instead generate a
COMPARE for doubleword modes (DImode on !TARGET_64BIT or TImode).
* config/i386/i386-features.cc (gen_gpr_to_xmm_move_src): Use
gen_rtx_SUBREG when NUNITS is 1, i.e. for TImode to V1TImode.
(general_scalar_chain::convert_compare): New function to convert
scalar equality/inequality comparison into vector operations.
(general_scalar_chain::convert_insn) [COMPARE]: Refactor. Call
new convert_compare helper method.
(convertible_comparion_p): Update to match doubleword COMPARE
of two register, memory or integer constant operands.
* config/i386/i386-features.h
(general_scalar_chain::convert_compare):
Prototype/declare member function here.
* config/i386/i386.md (cstore4): Change mode to SDWIM, but
only allow new doubleword modes for EQ and NE operators.
(*cmp_doubleword): New define_insn_and_split, to split a
doubleword comparison into a pair of XORs followed by an IOR to
set the (zero) flags register, optimizing the XORs if possible.
* config/i386/sse.md (V_AVX): Include V1TI and V2TI in mode
iterator; V_AVX is (currently) only used by ptest.
(sse4_1 mode attribute): Update to support V1TI and V2TI.

gcc/testsuite/ChangeLog
PR target/70321
* gcc.target/i386/pr70321.c: New test case.
* gcc.target/i386/sse4_1-stv-1.c: New test case.

[Bug ipa/105639] [12/13 Regression] ICE in propagate_controlled_uses, at ipa-prop.cc:4195 since r12-7936-gf6d65e803623c7ba

2022-05-30 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105639

Martin Jambor  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Martin Jambor  ---
Fixed.  Thanks for reporting.

[Bug ipa/105639] [12/13 Regression] ICE in propagate_controlled_uses, at ipa-prop.cc:4195 since r12-7936-gf6d65e803623c7ba

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105639

--- Comment #4 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Martin Jambor:

https://gcc.gnu.org/g:081c472589329fc6c58c4dfed70f1fbc029083a5

commit r12-8437-g081c472589329fc6c58c4dfed70f1fbc029083a5
Author: Martin Jambor 
Date:   Mon May 30 22:04:21 2022 +0200

ipa: Check cst type when propagating controlled uses info

PR 105639 shows that code with type-mismatches can trigger an assert
after running into a branch that was intended only for references to
variables - as opposed to references to functions.  Fixed by moving
the condition from the assert to the guarding if statement.

gcc/ChangeLog:

2022-05-25  Martin Jambor  

PR ipa/105639
* ipa-prop.cc (propagate_controlled_uses): Check type of the
constant before adding a LOAD reference.

gcc/testsuite/ChangeLog:

2022-05-25  Martin Jambor  

PR ipa/105639
* gcc.dg/ipa/pr105639.c: New test.

(cherry picked from commit f571596f8cd8fbad34305b4bec1a813620e0cbf0)

[Bug target/105624] [13 Regression] ICE in final_scan_insn_1, at final.cc:2861 (error: could not split insn)

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105624

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:3595231d9f5aec301422b152809b1322bdb525fe

commit r13-854-g3595231d9f5aec301422b152809b1322bdb525fe
Author: Uros Bizjak 
Date:   Mon May 30 21:38:16 2022 +0200

i386: Remove constraints when used with constant integer predicates, take 2

const_int_operand and other const*_operand predicates do not need
constraints when the constraint is inherited from the range of
constant integer predicate.  Remove the constraint in case all
alternatives use the same inherited constraint.

However, when an operand is commutative with a non-constant
operand, it effectively matches e.g.
nonimmediate_operand|const_int_operand rather than just
const_int_operand.  We should keep the constraint for
const_int_operands that are in a % pair.  See PR 105624.

2022-05-30  Uroš Bizjak  

gcc/ChangeLog:

* config/i386/i386.md: Remove constraints when used with
const_int_operand, const0_operand, const_1_operand,
constm1_operand, const8_operand, const128_operand,
const248_operand, const123_operand, const2367_operand,
const1248_operand, const359_operand, const_4_or_8_to_11_operand,
const48_operand, const_0_to_1_operand, const_0_to_3_operand,
const_0_to_4_operand, const_0_to_5_operand, const_0_to_7_operand,
const_0_to_15_operand, const_0_to_31_operand, const_0_to_63_operand,
const_0_to_127_operand, const_0_to_255_operand,
const_0_to_255_mul_8_operand, const_1_to_31_operand,
const_1_to_63_operand, const_2_to_3_operand, const_4_to_5_operand,
const_4_to_7_operand, const_6_to_7_operand, const_8_to_9_operand,
const_8_to_11_operand, const_8_to_15_operand, const_10_to_11_operand,
const_12_to_13_operand, const_12_to_15_operand,
const_14_to_15_operand, const_16_to_19_operand,
const_16_to_31_operand, const_20_to_23_operand,
const_24_to_27_operand and const_28_to_31_operand.
* config/i386/mmx.md: Ditto.
* config/i386/sse.md: Ditto.
* config/i386/subst.md: Ditto.
* config/i386/sync.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr105624.c: New test.

[Bug tree-optimization/105776] Failure to recognize __builtin_mul_overflow pattern

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105776

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101856

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > The second one is a target issue.
> > For f3, widening multiply pass (which is misnamed these days) detects the
> > __builtin_mul_overflow for x86_64 but not for aarch64 even though the
> > incoming IR is the same.
> 
> I will file it separately, the problem there is umulv4/mulv4 patterns are
> not defined for aarch64.

Actually I already filed it as PR 101856 :).

[Bug tree-optimization/105776] Failure to recognize __builtin_mul_overflow pattern

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105776

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> The second one is a target issue.
> For f3, widening multiply pass (which is misnamed these days) detects the
> __builtin_mul_overflow for x86_64 but not for aarch64 even though the
> incoming IR is the same.

I will file it separately, the problem there is umulv4/mulv4 patterns are not
defined for aarch64.

[Bug tree-optimization/105776] Failure to recognize __builtin_mul_overflow pattern

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105776

--- Comment #1 from Andrew Pinski  ---
Note there are two different issues here.





The second one is a target issue.
For f3, widening multiply pass (which is misnamed these days) detects the
__builtin_mul_overflow for x86_64 but not for aarch64 even though the incoming
IR is the same.

[Bug tree-optimization/105776] Failure to recognize __builtin_mul_overflow pattern

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105776

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/105776] New: Failure to recognize __builtin_mul_overflow pattern

2022-05-30 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105776

Bug ID: 105776
   Summary: Failure to recognize __builtin_mul_overflow pattern
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

int f4(unsigned x, unsigned y)
{
    if (x == 0)
        return 1;
    return ((int)(x * y) / (int)x) == y;
}

can be optimized to

int f4(unsigned x, unsigned y)
{
    int z;
    return !__builtin_mul_overflow((int)x, (int)y, &z);
}

This transformation is done by LLVM, but not by GCC.

Note that this derives from another function written as such:

int
f3 (unsigned x, unsigned y)
{
  unsigned int r = x * y;
  return !x || ((int) r / (int) x) == (int) y;
}

which does optimize correctly on x86 but not on aarch64 (where it generates
tree-optimized GIMPLE corresponding to the code above).
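
For readers unfamiliar with the builtin, here is a minimal C sketch of what
__builtin_mul_overflow reports, checked against a widening reference.  The
helper names are invented for this note, the functions take int arguments to
sidestep the unsigned-to-int conversions in the report, and the reference
assumes long long is wider than int (true on the usual GCC targets).

```c
#include <assert.h>
#include <limits.h>

/* Overflow test via the GCC/Clang builtin: 1 if a * b fits in int. */
int mul_fits_builtin(int a, int b)
{
    int z;  /* receives the (possibly wrapped) product; unused here */
    return !__builtin_mul_overflow(a, b, &z);
}

/* Portable reference: widen to long long and range-check the product. */
int mul_fits_ref(int a, int b)
{
    long long p = (long long)a * b;
    return p >= INT_MIN && p <= INT_MAX;
}

/* Small self-check: both forms must agree on every pair of samples. */
void check_mul_fits(void)
{
    const int samples[] = { 0, 1, -1, 2, 46340, 46341, 65536, INT_MAX, INT_MIN };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            assert(mul_fits_builtin(samples[i], samples[j])
                   == mul_fits_ref(samples[i], samples[j]));
}
```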

[Bug c++/105752] Template function can access private member

2022-05-30 Thread csaba_22 at yahoo dot co.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105752

--- Comment #4 from Csaba Ráduly  ---
Looks like there *was* a bug, I just wasn't able to properly reproduce it
initially:

#include 
#include 

class CB {
struct DCB {};
};

struct  NMC1 : public CB {
int meow() const { return __LINE__; }
};

struct CC {
template 
int  meow(Args&&... args) {
if constexpr(std::is_same_v<
decltype(std::declval().meow(std::forward(args)...)), CB::DCB>) {
return __LINE__;
}
else {
return __LINE__;
}
}
};

int main()
{
CC  cc;
return  cc.meow();
}

(accepted by GCC 10, rejected by clang and GCC 11).

https://godbolt.org/z/81rWdf6v8

This has been fixed between GCC 10.3 and 11.1.

[Bug libstdc++/98723] On Windows with CP936 encoding, regex compiles very slow.

2022-05-30 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98723

--- Comment #7 from cqwrteur  ---
Well, the right solution is to write the regex yourself. C++ regex might be
deprecated in the future.

[Bug c/85487] Support '#pragma region' and '#pragma endregion' to allow code folding with Visual Studio

2022-05-30 Thread bmburstein at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85487

--- Comment #15 from Baruch Burstein  ---
(In reply to Jonathan Wakely from comment #13)
> (In reply to rsand...@gcc.gnu.org from comment #12)
>  
> > I think the patch would need to wait for GCC 13 now though.
> 
> Indeed.

Now that GCC 13 is the main development trunk, can this patch be merged? If I
understood the comments in this thread correctly, the patch already exists and
was just waiting for GCC 12 to be branched.

[Bug c/105775] New: GCC uses an invalid assumption in numeric limits of char

2022-05-30 Thread dante19031999 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105775

Bug ID: 105775
   Summary: GCC uses an invalid assumption in numeric limits of char
   Product: gcc
   Version: og11 (devel/omp/gcc-11)
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dante19031999 at gmail dot com
  Target Milestone: ---

On my system char is defined as a signed char, which means that the limit is
indeed 127.
However, this does not hold in the C standard in general, therefore the
diagnostic is in error.
This affects the use of the code for cross-platform purposes.
Most probably you will find the complementary error on platforms where char is
defined as unsigned char, giving the warning on cChar >= 0.
The error can be avoided to a certain extent with #if CHAR_MIN < 0 or
a conversion to unsigned char.
The code shown here is a simplified version of the original, made for the
purpose of this bug report.
According to cppreference:
signed char - type for signed character representation.
unsigned char - type for unsigned character representation. Also used to
inspect object representations (raw memory). 
char - type for character representation. Equivalent to either signed char or
unsigned char (which one is implementation-defined and may be controlled by a
compiler command line switch), but char is a distinct type, different from both
signed char and unsigned char.

./inc/ascii.h: In function 'is_ascii':
./inc/ascii.h:13:89: error: comparison is always true due to limited range of
data type [-Werror=type-limits]
   13 | __FULL_INLINE inline bool is_ascii( char cChar){return cChar >= 0 &&
cChar <= 127;}
  |
   ^~
./inc/ascii.h: In function 'is_ascii_printable':
./inc/ascii.h:17:94: error: comparison is always true due to limited range of
data type [-Werror=type-limits]
   17 | __FULL_INLINE inline bool is_ascii_printable( char cChar){return cChar
>= 32 && cChar <= 127;}
  |
  ^~
cc1: some warnings being treated as errors

#define __FULL_INLINE __attribute__((__const__)) \
    __attribute__((__nothrow__)) __attribute__((__always_inline__))

gcc -x c -std=c17 -Wimplicit-function-declaration -pipe -Werror=format-security
-Wextra -Wall -pedantic -frounding-math -fsignaling-nans -Werror=narrowing
-fPIC -Wunused-variable -Wunused-value -Wunused-but-set-variable -Og -std=gnu17
-I./ascii.h -c ./ascii.c -o ./instdir/ascii.c.o

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin
--enable-initfini-array
--with-isl=/builddir/build/BUILD/gcc-11.3.1-20220421/obj-x86_64-redhat-linux/isl-install
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-gnu-indirect-function --enable-cet --with-tune=generic
--with-arch_32=i686 --build=x86_64-redhat-linux
--with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.1 20220421 (Red Hat 11.3.1-2) (GCC)

[Bug c++/105774] Bogus overflow in constant expression

2022-05-30 Thread klaus.doldinger64 at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105774

--- Comment #1 from Wilhelm M  ---
To make it clearer, make the type of x `signed char`.

[Bug c++/105774] New: Bogus overflow in constant expression

2022-05-30 Thread jeff at jgarrett dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105774

Bug ID: 105774
   Summary: Bogus overflow in constant expression
   Product: gcc
   Version: 12.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jeff at jgarrett dot org
  Target Milestone: ---

The following is diagnosed as ill-formed by GCC but not by Clang:

int main() {
  constexpr auto _ = [] {
    char x = 127;
    return ++x;
  }();
}

:5:5: error: overflow in constant expression [-fpermissive]

On godbolt https://godbolt.org/z/91oeGsEbh
Originally from
https://stackoverflow.com/questions/72425404/still-unsure-about-signed-integer-overflow-in-c

I believe that this is well-formed. [expr.pre.incr]/1 says ++x is equivalent to
x+=1. [expr.ass]/6 says that x+=1 is equivalent to x=x+1 except that x is only
evaluated once. That expression x=x+1 avoids overflow through integer
promotion.

The same code with x+=1 instead of ++x is allowed by GCC.
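
The promotion argument can be illustrated with a small sketch, written in C
rather than the C++ of the report and using a made-up function name.  It only
shows that the addition itself happens in int; the conversion of the result
back to char is a separate step that this example leaves out, and it says
nothing about how GCC's constant evaluator treats ++x.

```c
#include <assert.h>
#include <limits.h>

/* Integer promotion only has headroom if int is wider than char. */
static_assert(INT_MAX > CHAR_MAX, "int must be able to hold CHAR_MAX + 1");

/* x is promoted to int before the addition, so x + 1 cannot overflow
   for any value of type char. */
int widened_increment(char x)
{
    return x + 1;
}
```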

[Bug tree-optimization/105763] [13 Regression] ICE in outgoing_edge_range_p, at gimple-range-gori.cc:1253 since r13-754-ga1c9f779f7528342

2022-05-30 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105763

--- Comment #6 from Andrew Macleod  ---
yeah, from times of yore when the small set of callers made sure it was only
invoked on useful cases.  There were a lot of development asserts from initial
development.

There is no reason to trap, it can simply return false. ie


-  gcc_checking_assert (gimple_range_ssa_p (name));
+  if (!gimple_range_ssa_p (name))
+    return false;

[Bug target/105773] New: [Aarch64] Failure to optimize and+cmp to tst

2022-05-30 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105773

Bug ID: 105773
   Summary: [Aarch64] Failure to optimize and+cmp to tst
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gabravier at gmail dot com
  Target Milestone: ---

int
baz (unsigned long x, unsigned long y)
{
  return (int) (x & y) > 0;
}

With -O3, AArch64 GCC outputs this:

baz(unsigned long, unsigned long):
        and     w0, w0, w1
        cmp     w0, 0
        cset    w0, gt
        ret

whereas LLVM outputs this:

baz(unsigned long, unsigned long):
        tst     w1, w0
        cset    w0, gt
        ret

It seems to me as though using tst should be faster (unless Aarch64 processors
are extremely weird).

[Bug target/105624] [13 Regression] ICE in final_scan_insn_1, at final.cc:2861 (error: could not split insn)

2022-05-30 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105624

--- Comment #8 from Uroš Bizjak  ---
> I think it would work to keep the constraints for
> const_int_operands that are in a % pair and drop them
> elsewhere.  (So a partial reapplication, rather than a
> full reapplication.)

OK, let's throw the patch at the wall a second time and see if it sticks this
time ;)

[Bug fortran/104036] Derived type assigment to allocatable with dynamic type

2022-05-30 Thread trnka at scm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104036

Tomáš Trnka  changed:

   What|Removed |Added

 CC||trnka at scm dot com

--- Comment #1 from Tomáš Trnka  ---
This might be related to PR57696, which is about allocatable _components_. It
could even be a duplicate, but it's hard to tell without more investigation.

[Bug target/105624] [13 Regression] ICE in final_scan_insn_1, at final.cc:2861 (error: could not split insn)

2022-05-30 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105624

--- Comment #7 from rsandifo at gcc dot gnu.org ---
(In reply to Uroš Bizjak from comment #6)
> I was afraid I didn't understand the reason for the failure well, although it
> happened very rarely (actually, no failures were detected during the build
> or testsuite run). The patch obviously triggered some inconsistency in the
> infrastructure, so without some assurances, I took the safe way and reverted
> everything.
But like I say, I think it's due to the % in that particular instruction.
When % is used on operand N, the constraints for operands N and N+1
have to be tight enough to support both the predicate on operand N
and the predicate on operand N+1.  So for:

(define_insn_and_split "*anddi_1_btr"
  [(set (match_operand:DI 0 "nonimmediate_operand" "=rm")
(and:DI
 (match_operand:DI 1 "nonimmediate_operand" "%0")
 (match_operand:DI 2 "const_int_operand")))
   (clobber (reg:CC FLAGS_REG))]

The constraints on operand 2 are effectively matching
nonimmediate_operand|const_int_operand rather than just
const_int_operand.

I think it would work to keep the constraints for
const_int_operands that are in a % pair and drop them
elsewhere.  (So a partial reapplication, rather than a
full reapplication.)

[Bug target/105624] [13 Regression] ICE in final_scan_insn_1, at final.cc:2861 (error: could not split insn)

2022-05-30 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105624

--- Comment #6 from Uroš Bizjak  ---
(In reply to rsand...@gcc.gnu.org from comment #5)
> FWIW, I think the problem is specific to operands that are
> commutative with a non-constant operand.  For example,
> suppose the pre-RA instruction had a pseudo register R matching
> a register_operand and a constant C matching a const_int_operand.
> If R does not get allocated, and so gets replaced by a stack slot M,
> the % would allow the RA to try mapping C to the register_operand
> and M to the const_int_operand.  Without a constraint on the latter,
> the M mapping would seem to be valid, and reloading C into a register
> might seem less costly than reloading M into a register.
> 
> The intent of the patch seemed good otherwise (and a nice clean-up).
> I don't think the whole thing needed to be reverted.

I was afraid I didn't understand the reason for the failure well, although it
happened very rarely (actually, no failures were detected during the build or
testsuite run). The patch obviously triggered some inconsistency in the
infrastructure, so without some assurances, I took the safe way and reverted
everything.

I would gladly revert the revert. The reload is just doing unnecessary work
when multiple constraints are the same; all necessary information could be
retrieved from the predicate.

[Bug debug/105772] [debug, i386] sched2 moves get_pc_thunk call past debug_insn

2022-05-30 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105772

--- Comment #2 from Tom de Vries  ---
As background info, I'm proposing a patch for gdb to have the
architecture-specific prologue skipper skip over the get_pc_thunk call:
https://sourceware.org/pipermail/gdb-patches/2022-May/189563.html , which helps
to skip over the prologue with -O0 -pie -fPIE code.

But that causes a regression in test-case gdb/testsuite/gdb.base/break.exp,
because of this PR.

[Bug target/105624] [13 Regression] ICE in final_scan_insn_1, at final.cc:2861 (error: could not split insn)

2022-05-30 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105624

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 CC||rsandifo at gcc dot gnu.org

--- Comment #5 from rsandifo at gcc dot gnu.org ---
FWIW, I think the problem is specific to operands that are
commutative with a non-constant operand.  For example,
suppose the pre-RA instruction had a pseudo register R matching
a register_operand and a constant C matching a const_int_operand.
If R does not get allocated, and so gets replaced by a stack slot M,
the % would allow the RA to try mapping C to the register_operand
and M to the const_int_operand.  Without a constraint on the latter,
the M mapping would seem to be valid, and reloading C into a register
might seem less costly than reloading M into a register.

The intent of the patch seemed good otherwise (and a nice clean-up).
I don't think the whole thing needed to be reverted.

[Bug debug/105772] [debug, i386] sched2 moves get_pc_thunk call past debug_insn

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105772

Richard Biener  changed:

   What|Removed |Added

 CC||aoliva at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
So scheduling should either reset any debug_marker stmts it schedules stmts
over or, as in this case, choose to not cross them if that would not change the
insn position relative to non-debug insns?

[Bug tree-optimization/86725] ICE: Segmentation fault (in vect_get_vec_def_for_operand_1)

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86725

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #8 from Richard Biener  ---
.

[Bug tree-optimization/86725] ICE: Segmentation fault (in vect_get_vec_def_for_operand_1)

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86725

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |9.4

--- Comment #7 from Richard Biener  ---
Fixed in GCC 9.

[Bug c++/105751] std::array comparision does not inline memcmp

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105751

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2022-05-30
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.  We end up expanding

   [local count: 1073741824]:
  _5 = a_2(D) + 8;
  _6 = [(const struct array *)a_2(D)]._M_elems;
  _8 = _5 - _6;
  __len_9 = (const size_t) _8;
  if (__len_9 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  _4 = [(const struct array *)b_3(D)]._M_elems;
  _10 = __builtin_memcmp_eq (_6, _4, __len_9);
  _11 = _10 == 0;

   [local count: 1073741824]:
  # _12 = PHI <1(2), _11(3)>
  return _12;

So somehow the size is not known to be 8 and thus the proposed optimized code
would compare uninitialized memory?

[Bug debug/105772] New: [debug, i386] sched2 moves get_pc_thunk call past debug_insn

2022-05-30 Thread vries at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105772

Bug ID: 105772
   Summary: [debug, i386] sched2 moves get_pc_thunk call past debug_insn
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Consider the test-case source gdb/testsuite/gdb.base/break1.c (
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/testsuite/gdb.base/break1.c;h=24d0d15dc2e8bd14f6b14fabbb38fec43db9c990;hb=HEAD
) containing function marker4.

Extracting the relevant part:
...
struct some_struct
{
  int a_field;
  int b_field;
  union { int z_field; };
};

struct some_struct values[50];

void marker4 (long d) { values[0].a_field = d; } /* set breakpoint 14 here */
...

When compiling with gcc 12.1.0 and -O2 like so:
...
$ gcc -fno-stack-protector -m32 -fPIE -pie -w -c -g break1.c -O2
...
we have before sched2:
...
(note 4 1 14 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 14 4 11 2 NOTE_INSN_PROLOGUE_END)
(insn/f 11 14 3 2 (parallel [
(set (reg:SI 0 ax [82])
(unspec:SI [
(const_int 0 [0])
] UNSPEC_SET_GOT))
(clobber (reg:CC 17 flags))
]) 931 {*set_got}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_EQUIV (unspec:SI [
(const_int 0 [0])
] UNSPEC_SET_GOT)
(expr_list:REG_CFA_FLUSH_QUEUE (nil)
(nil)
(note 3 11 6 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 6 3 12 2 (debug_marker) "break1.c":59:25 -1
 (nil))
(insn 12 6 8 2 (set (reg/v:SI 1 dx [orig:83 d ] [83])
(mem/c:SI (plus:SI (reg/f:SI 7 sp)
(const_int 4 [0x4])) [5 d+0 S4 A32])) "break1.c":59:43 81
{*movsi_internal}
 (expr_list:REG_EQUIV (mem/c:SI (reg/f:SI 16 argp) [5 d+0 S4 A32])
(nil)))
...
and after:
...
(note 4 1 14 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 14 4 3 2 NOTE_INSN_PROLOGUE_END)
(note 3 14 6 2 NOTE_INSN_FUNCTION_BEG)
(debug_insn 6 3 11 2 (debug_marker) "break1.c":59:25 -1
 (nil))
(insn/f:TI 11 6 12 2 (parallel [
(set (reg:SI 0 ax [82])
(unspec:SI [
(const_int 0 [0])
] UNSPEC_SET_GOT))
(clobber (reg:CC 17 flags))
]) 931 {*set_got}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_EQUIV (unspec:SI [
(const_int 0 [0])
] UNSPEC_SET_GOT)
(expr_list:REG_CFA_FLUSH_QUEUE (nil)
(nil)
(insn 12 11 8 2 (set (reg/v:SI 1 dx [orig:83 d ] [83])
(mem/c:SI (plus:SI (reg/f:SI 7 sp)
(const_int 4 [0x4])) [5 d+0 S4 A32])) "break1.c":59:43 81
{*movsi_internal}
 (expr_list:REG_EQUIV (mem/c:SI (reg/f:SI 16 argp) [5 d+0 S4 A32])
(nil)))
...

This moves the get_pc_thunk call after the debug_insn, making it (in terms of
debug info) part of the first statement instead of the prologue.

That is, with -O1 we have insn:
...
000d :
   d:   e8 fc ff ff ff  call   e 
  12:   05 01 00 00 00          add    $0x1,%eax
  17:   8b 54 24 04             mov    0x4(%esp),%edx
  1b:   89 90 00 00 00 00       mov    %edx,0x0(%eax)
  21:   c3                      ret
...
and line info:
...
File name    Line number    Starting address    View    Stmt
break1.c              59                 0xd               x
break1.c              59                 0xd       1
break1.c              59                0x17               x
break1.c              59                0x17       1
break1.c              59                0x21
break1.c               -                0x22
...
so at 0x17 we have the start of a statement.

But with -O2 we have identical insn:
...

0030 :
  30:   e8 fc ff ff ff  call   31 
  35:   05 01 00 00 00          add    $0x1,%eax
  3a:   8b 54 24 04             mov    0x4(%esp),%edx
  3e:   89 90 00 00 00 00       mov    %edx,0x0(%eax)
  44:   c3                      ret
...
but different line info:
...
File name    Line number    Starting address    View    Stmt
break1.c              59                0x30               x
break1.c              59                0x30       1       x
break1.c              59                0x3a
break1.c              59                0x44
break1.c               -                0x45
...
so at 0x3a we don't have the start of a statement.

[Bug tree-optimization/105763] [13 Regression] ICE in outgoing_edge_range_p, at gimple-range-gori.cc:1253 since r13-754-ga1c9f779f7528342

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105763

--- Comment #5 from Richard Biener  ---
It ICEs because ranger asserts on SSA_NAME_OCCURS_IN_ABNORMAL_PHI (for whatever
reason...).  I have a fix.

[Bug tree-optimization/105763] [13 Regression] ICE in outgoing_edge_range_p, at gimple-range-gori.cc:1253 since r13-754-ga1c9f779f7528342

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105763

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Richard Biener  ---
I will have a look.

[Bug c/105771] matrix partial transposition with -O3 since r8-5159-g1cc521f1a824b591

2022-05-30 Thread franckbehaghel_gcc at protonmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105771

--- Comment #6 from Franck Behaghel  ---
Hello,

> Does adding -fno-strict-aliasing fix the issue?
Right, it does. 

> I think you have an aliasing violation here.
I cannot say whether we have an aliasing violation here. My understanding is
that an aliasing violation happens when pointers of different types refer to
the same address.
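
For illustration only (this is not taken from the attached main0.c), here is a
minimal sketch of what reading an object through a pointer of a different type
looks like, next to a well-defined alternative. The function name
bits_via_memcpy is made up, and the sketch assumes a 32-bit float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float and uint32_t have the same size.  */
_Static_assert (sizeof (float) == sizeof (uint32_t), "32-bit float expected");

/* The problematic form would be
     uint32_t u = *(uint32_t *) f;
   which reads a float object through an lvalue of type uint32_t.  With
   -fstrict-aliasing the compiler may assume such accesses never touch the
   same memory and reorder them.  */

/* Well-defined alternative: copy the object representation instead.  */
uint32_t bits_via_memcpy (const float *f)
{
  uint32_t u;
  memcpy (&u, f, sizeof u);
  return u;
}
```

If the attached code never accesses one object through pointers of two
different types, this kind of violation would not apply to it.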

>I think the way to fix the code is to do this:
>transpose_upper_to_lower (mat,);
It does not change the result. The issue is still present.

> -fno-loop-unroll-and-jam fixes it.  Can't check trunk right now whether it's 
> fixed.
I can confirm this too.


Regards,
Franck

[Bug c/105771] matrix partial transposition with -O3 since r8-5159-g1cc521f1a824b591

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105771

Martin Liška  changed:

   What|Removed |Added

Summary|matrix partial  |matrix partial
   |transposition with -O3  |transposition with -O3
   ||since
   ||r8-5159-g1cc521f1a824b591
   Keywords|needs-bisection |
 CC||marxin at gcc dot gnu.org

--- Comment #5 from Martin Liška  ---
Started with r8-5159-g1cc521f1a824b591.

[Bug tree-optimization/105769] program segmentation fault with -ftree-vectorize and nested lambdas

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105769

Martin Liška  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-05-30
 CC||marxin at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #1 from Martin Liška  ---
Started with param change in r11-4438-g686c1b70c70a8df4.

[Bug analyzer/105765] [13 Regression] ICE: Segmentation fault (in ana::region_model::deref_rvalue) since r13-514-g2402dc6b982c4dac

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105765

Martin Liška  changed:

   What|Removed |Added

   Last reconfirmed||2022-05-30
 Ever confirmed|0   |1
Summary|[13 Regression] ICE:|[13 Regression] ICE:
   |Segmentation fault (in  |Segmentation fault (in
   |ana::region_model::deref_rv |ana::region_model::deref_rv
   |alue)   |alue) since
   ||r13-514-g2402dc6b982c4dac
 Status|UNCONFIRMED |NEW
 CC||marxin at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Started with r13-514-g2402dc6b982c4dac.

[Bug tree-optimization/105763] [13 Regression] ice in outgoing_edge_range_p, at gimple-range-gori.cc:1253 since r13-754-ga1c9f779f7528342

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105763

Martin Liška  changed:

   What|Removed |Added

   Last reconfirmed||2022-05-30
Summary|[13 Regression] ice in  |[13 Regression] ice in
   |outgoing_edge_range_p, at   |outgoing_edge_range_p, at
   |gimple-range-gori.cc:1253   |gimple-range-gori.cc:1253
   ||since
   ||r13-754-ga1c9f779f7528342
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||marxin at gcc dot gnu.org

--- Comment #3 from Martin Liška  ---
Started with r13-754-ga1c9f779f7528342, let me take a look.

[Bug middle-end/105762] [12/13 Regression] -Warray-bounds false positives for integer-to-pointer casts since r12-2132-ga110855667782dac

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105762

Martin Liška  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||marxin at gcc dot gnu.org,
   ||msebor at gcc dot gnu.org
Summary|[12/13 Regression]  |[12/13 Regression]
   |-Warray-bounds false|-Warray-bounds false
   |positives for   |positives for
   |integer-to-pointer casts|integer-to-pointer casts
   ||since
   ||r12-2132-ga110855667782dac
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-05-30

--- Comment #1 from Martin Liška  ---
Started with r12-2132-ga110855667782dac.

[Bug c++/105761] [11/12/13 Regression] ICE on hidden friend definition defined in base type since r11-388-gc4bff4c230c8d341

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105761

Martin Liška  changed:

   What|Removed |Added

Summary|ICE on hidden friend|[11/12/13 Regression] ICE
   |definition defined in base  |on hidden friend definition
   |type|defined in base type since
   ||r11-388-gc4bff4c230c8d341
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2022-05-30
 CC||marxin at gcc dot gnu.org,
   ||nathan at gcc dot gnu.org

--- Comment #2 from Martin Liška  ---
Started with r11-388-gc4bff4c230c8d341.

[Bug c++/105760] ICE: in build_function_type, at tree.cc:7365

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105760

Martin Liška  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||jason at gcc dot gnu.org,
   ||marxin at gcc dot gnu.org
   Last reconfirmed||2022-05-30
 Status|UNCONFIRMED |NEW

--- Comment #1 from Martin Liška  ---
Likely started with r11-2748-gb871301f09be7061.

[Bug tree-optimization/105740] missed optimization switch transformation for conditions with duplicate conditions

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105740

Martin Liška  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2022-05-30
 Status|UNCONFIRMED |NEW
 CC||marxin at gcc dot gnu.org

--- Comment #1 from Martin Liška  ---
Yes, I can confirm that at time of if-to-switch conversion pass, the condition
'len > 3' is not extracted and is present in every condition.

[Bug c/105771] matrix partial transposition with -O3

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105771

Richard Biener  changed:

   What|Removed |Added

  Known to fail||12.1.1

--- Comment #4 from Richard Biener  ---
Not fixed on trunk.

[Bug target/105738] asan error during bootstrap

2022-05-30 Thread marxin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105738

Martin Liška  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||marxin at gcc dot gnu.org
   Last reconfirmed||2022-05-30
 Status|UNCONFIRMED |WAITING

[Bug demangler/96345] __cxa demangle fails to demangle a very long string

2022-05-30 Thread hededrk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96345

V  changed:

   What|Removed |Added

Version|10.1.0  |12.1.1
   Host||x86_64
 Target||x86_64

--- Comment #6 from V  ---
Still fails as of GCC 12.1.

I've shortened the input string; it fails with names longer than 999
characters.

[Bug c/105771] matrix partial transposition with -O3

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105771

Richard Biener  changed:

   What|Removed |Added

   Keywords||needs-bisection, wrong-code
  Known to fail||10.2.0, 11.2.0, 9.4.0
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2022-05-30

--- Comment #3 from Richard Biener  ---
Confirmed.

main0.c:28:20: optimized: applying unroll and jam with factor 2
main0.c:29:24: optimized: loop with 16 iterations completely unrolled (header
execution count 59700049)
main0.c:45:24: optimized: loop vectorized using 16 byte vectors
main0.c:45:24: optimized: loop turned into non-loop; it never loops
main0.c:41:5: optimized: loop with 3 iterations completely unrolled (header
execution count 59700049)
main0.c:44:20: optimized: loop with 16 iterations completely unrolled (header
execution count 0)

-fno-loop-unroll-and-jam fixes it.  Can't check trunk right now whether it's
fixed.

[Bug tree-optimization/105770] [13 Regression] ICE in decompose, at wide-int.h:984 since r13-754

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105770

--- Comment #4 from Jakub Jelinek  ---
I think all the case labels are required to have the same types, but the index
can be promoted (such as in this case promotion from char to int).  I've looked
at other switch handling cases and they do such conversions there.
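
As a source-level illustration of that type difference (a sketch, not the
GIMPLE representation itself): in C a character constant such as ' ' has type
int, while the switch operand argstr from the testcase has type char and only
becomes int through integer promotion. The TYPE_NAME macro and the function
names below are made up for this example.

```c
#include <assert.h>

/* Map an expression to a tag for its type; only the two types that matter
   here are listed.  Requires C11 for _Generic.  */
#define TYPE_NAME(e) _Generic ((e), char: 'c', int: 'i', default: '?')

char argstr;

/* The switch operand as written: plain char.  */
char type_of_index (void) { return TYPE_NAME (argstr); }

/* The operand after integer promotion: int.  */
char type_of_promoted_index (void) { return TYPE_NAME (+argstr); }

/* A case label constant such as ' ': int in C.  */
char type_of_label (void) { return TYPE_NAME (' '); }

/* Self-check of the three statements above.  */
void check_types (void)
{
  assert (type_of_index () == 'c');
  assert (type_of_promoted_index () == 'i');
  assert (type_of_label () == 'i');
}
```

So a comparison built directly from the unpromoted index and a label can mix
two types unless one side is converted first.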

[Bug tree-optimization/105770] [13 Regression] ICE in decompose, at wide-int.h:984 since r13-754

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105770

--- Comment #3 from Richard Biener  ---
LGTM - IIRC at some point we required the case labels to have compatible types
to the index, didn't we?

[Bug analyzer/105765] [13 Regression] ICE: Segmentation fault (in ana::region_model::deref_rvalue)

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105765

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.0

[Bug middle-end/105762] [12/13 Regression] -Warray-bounds false positives for integer-to-pointer casts

2022-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105762

Richard Biener  changed:

   What|Removed |Added

Summary|[12 Regression] |[12/13 Regression]
   |-Warray-bounds false|-Warray-bounds false
   |positives for   |positives for
   |integer-to-pointer casts|integer-to-pointer casts
   Target Milestone|--- |12.2
 Blocks||56456
   Keywords||diagnostic


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456
[Bug 56456] [meta-bug] bogus/missing -Warray-bounds

[Bug c/105771] matrix partial transposition with -O3

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105771

--- Comment #2 from Andrew Pinski  ---
I think the way to fix the code is to do this:
transpose_upper_to_lower (mat,);

[Bug rtl-optimization/53533] [10/11/12/13 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2022-05-30 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #48 from rguenther at suse dot de  ---
On Mon, 30 May 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533
> 
> --- Comment #47 from Hongtao.liu  ---
> 
> > 
> > The issue is that the re-association pass doesn't handle operations
> > with undefined overflow behavior, we do have duplicate bugreports
> > for this.
> > 
> 
> I saw below in match.pd
> 
> /* Combine successive multiplications.  Similar to above, but handling
>    overflow is different.  */
> (simplify
>  (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
>  (with {
>    wi::overflow_type overflow;
>    wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2),
>                            TYPE_SIGN (type), &overflow);
>   }
>   /* Skip folding on overflow: the only special case is @1 * @2 == -INT_MIN,
>      otherwise undefined overflow implies that @0 must be zero.  */
>   (if (!overflow || TYPE_OVERFLOW_WRAPS (type))
>    (mult @0 { wide_int_to_tree (type, mul); }))))
> 
> Can it be extended to (mult (plus_minus (mult @0 INTEGER_CST@1) INTEGER_CST@3)
> INTEGER_CST@2), so at least we can handle it under -fwrapv?

With -fwrapv the reassoc pass might do this already (not sure with
mixing multiplication and addition, you'd have to try).  But sure,
we could add a pattern for the above (with appropriate single-use
handling).

[Bug c/105771] matrix partial transposition with -O3

2022-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105771

--- Comment #1 from Andrew Pinski  ---
I think you have an aliasing violation here. Does adding -fno-strict-aliasing
fix the issue?

[Bug tree-optimization/105770] [13 Regression] ICE in decompose, at wide-int.h:984 since r13-754

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105770

--- Comment #2 from Jakub Jelinek  ---
Created attachment 53052
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53052&action=edit
gcc13-pr105770.patch

Full untested patch.

[Bug rtl-optimization/53533] [10/11/12/13 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2022-05-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #47 from Hongtao.liu  ---

> 
> The issue is that the re-association pass doesn't handle operations
> with undefined overflow behavior, we do have duplicate bugreports
> for this.
> 

I saw below in match.pd

/* Combine successive multiplications.  Similar to above, but handling
   overflow is different.  */
(simplify
 (mult (mult @0 INTEGER_CST@1) INTEGER_CST@2)
 (with {
   wi::overflow_type overflow;
   wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2),
                           TYPE_SIGN (type), &overflow);
  }
  /* Skip folding on overflow: the only special case is @1 * @2 == -INT_MIN,
     otherwise undefined overflow implies that @0 must be zero.  */
  (if (!overflow || TYPE_OVERFLOW_WRAPS (type))
   (mult @0 { wide_int_to_tree (type, mul); }))))

Can it be extended to (mult (plus_minus (mult @0 INTEGER_CST@1) INTEGER_CST@3)
INTEGER_CST@2), so at least we can handle it under -fwrapv?

[Bug tree-optimization/105770] [13 Regression] ICE in decompose, at wide-int.h:984 since r13-754

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105770

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|--- |13.0
Summary|[13 Regression] ICE in  |[13 Regression] ICE in
   |decompose, at   |decompose, at
   |wide-int.h:984  |wide-int.h:984 since
   ||r13-754
   Priority|P3  |P1

--- Comment #1 from Jakub Jelinek  ---
Started with r13-754-ga1c9f779f75283427316b5c670c1e01ff8ce9ced
--- gcc/tree-ssa-loop-unswitch.cc.jj2022-05-25 11:07:29.754185772 +0200
+++ gcc/tree-ssa-loop-unswitch.cc   2022-05-30 10:57:23.165131441 +0200
@@ -494,6 +494,7 @@ find_unswitching_predicates_for_bb (basi
 {
   unsigned nlabels = gimple_switch_num_labels (stmt);
   tree idx = gimple_switch_index (stmt);
+  tree idx_type = TREE_TYPE (idx);
   if (TREE_CODE (idx) != SSA_NAME
  || nlabels < 1)
return;
@@ -526,16 +527,18 @@ find_unswitching_predicates_for_bb (basi
  if (CASE_HIGH (lab) != NULL_TREE)
{
  tree cmp1 = fold_build2 (GE_EXPR, boolean_type_node, idx,
-  CASE_LOW (lab));
+  fold_convert (idx_type,
+CASE_LOW (lab)));
  tree cmp2 = fold_build2 (LE_EXPR, boolean_type_node, idx,
-  CASE_HIGH (lab));
+  fold_convert (idx_type,
+CASE_HIGH (lab)));
  cmp = fold_build2 (BIT_AND_EXPR, boolean_type_node, cmp1, cmp2);
  lab_range.set (CASE_LOW (lab), CASE_HIGH (lab));
}
  else
{
  cmp = fold_build2 (EQ_EXPR, boolean_type_node, idx,
-CASE_LOW (lab));
+fold_convert (idx_type, CASE_LOW (lab)));
  lab_range.set (CASE_LOW (lab));
}

fixes it for me.

[Bug tree-optimization/105770] [13 Regression] ICE in decompose, at wide-int.h:984

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105770

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2022-05-30

[Bug rtl-optimization/53533] [10/11/12/13 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2022-05-30 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #46 from rguenther at suse dot de  ---
On Mon, 30 May 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533
> 
> --- Comment #45 from Hongtao.liu  ---
> A reduced testcase.
> 
> int a[256];
> int b[256];
> 
> void foo (void)
> {
>   int i;
>   for (i = 0; i < 256; ++i)
> {
>   int tmp = a[i] + 12345;
>   tmp *= 914237;
>   tmp += 12332;
>   tmp *= 914237;
>   tmp += 12332;
>   tmp *= 914237;
>   tmp -= 13;
>   tmp *= 8000;
>   b[i] = tmp;
> }
> }
> 
> GCC now simplifies pmulld to pslld + padd + psub, and the vectorizer cost
> model looks fine, but the scalar version is additionally optimized in
> pass_combine from 4 * mult + 3 * add to 1 * mult + 2 * add, which the
> vectorizer does not take into account.
> The vectorized version is not simplified later.
> 
> mov eax, DWORD PTR a[rdx]
> add rdx, 4
> add eax, 12345
> imul eax, eax, -1564285888
> sub eax, 333519936
> mov DWORD PTR b[rdx-4], eax
> cmp rdx, 1024
> jne .L2
> 
> 
> I'm wondering whether GIMPLE could also simplify
> 
>   tmp *= 914237;
>   tmp += 12332;
>   tmp *= 914237;
>   tmp += 12332;
>   tmp *= 914237;
>   tmp -= 13;
>   tmp *= 8000;
> 
> to 
>  tmp *= -1564285888;
>  tmp -= 333519936;
> 
> refer to https://godbolt.org/z/qYMYMTxEY
> 
> Then the vectorized code would be more optimal.

The issue is that the re-association pass doesn't handle operations
with undefined overflow behavior, we do have duplicate bugreports
for this.

On the RTL level likely simplify-rtx (or the variants used by combine)
only have limited support for vector operations.

[Bug c/105771] New: matrix partial transposition with -O3

2022-05-30 Thread franckbehaghel_gcc at protonmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105771

Bug ID: 105771
   Summary: matrix partial transposition with -O3
   Product: gcc
   Version: 10.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: franckbehaghel_gcc at protonmail dot com
  Target Milestone: ---

Created attachment 53051
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53051&action=edit
source

Hello,

The attached code does not produce the same result with the -O3 flag enabled.

It seems that gcc reorders operations in the matrix transposition that should
not be reordered. The trick here is that the attached code does an in-place
partial transposition.


To reproduce : 
gcc  main0.c  && ./a.out > O0.txt ; gcc main0.c -O3 && ./a.out > O3.txt ;
md5sum O0.txt O3.txt 
0b513fb110f11f0e9b143c53d5b7a634  O0.txt
12be7305e8e96decd579a1e42d45bc46  O3.txt

This behavior is weird, as matrix sizes lower than 16 do not trigger the
suspected bug.

My gcc version is 10.3.1.
I tested with https://godbolt.org/ : it seems to have been introduced in GCC
8.1, as GCC 7.5 gives the correct output. The latest GCC 12.1 also seems
affected.

Clang is fine and gives the right output.

Can someone confirm?
Best regards,
Franck

[Bug ada/105303] Assertion_Policy (Pre => Ignore) executes precondition

2022-05-30 Thread charlet at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105303

Arnaud Charlet  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||charlet at gcc dot gnu.org
 Resolution|--- |FIXED
   Target Milestone|--- |13.0

--- Comment #3 from Arnaud Charlet  ---
fixed on master

[Bug ada/105303] Assertion_Policy (Pre => Ignore) executes precondition

2022-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105303

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Pierre-Marie de Rodat
:

https://gcc.gnu.org/g:5b7630f2f266346173eb2172a9a96e925010afc5

commit r13-826-g5b7630f2f266346173eb2172a9a96e925010afc5
Author: Yannick Moy 
Date:   Tue Apr 19 14:37:58 2022 +0200

[Ada] PR ada/105303 Fix use of Assertion_Policy in internal generics unit

The internal unit System.Generic_Array_Operations defines only generic
subprograms. Thus, pragma Assertion_Policy inside the spec has no
effect, as each instantiation is only subject to the assertion policy at
the program point of the instantiation. Remove this confusing pragma,
and add the pragma inside each generic body making use of additional
assertions or ghost code, so that running time of instantiations is not
impacted by assertions meant for formal verification.

gcc/ada/

PR ada/105303
* libgnat/s-gearop.adb: Add pragma Assertion_Policy in generic
bodies making use of additional assertions or ghost code.
* libgnat/s-gearop.ads: Remove confusing Assertion_Policy.

[Bug preprocessor/105732] [10/11 Regression] internal compiler error: unspellable token PADDING

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105732

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[10/11/12/13 Regression]|[10/11 Regression] internal
   |internal compiler error:|compiler error: unspellable
   |unspellable token PADDING   |token PADDING

--- Comment #17 from Jakub Jelinek  ---
Fixed for 12.2 and 13+ so far.

[Bug libgomp/105745] [12/13 Regression] Conditional OpenMP directive fails with GCC 12

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105745

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Jakub Jelinek  ---
Hopefully fixed.

[Bug sanitizer/105714] [12 Regression] ASan in gcc trunk missed a buffer-overflow at -Os since r12-5138-ge82c382971664d6f

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105714

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #8 from Jakub Jelinek  ---
Fixed for 12.2 too.

[Bug c/105635] [12 Regression] ICE in gimple_parm_array_size, at pointer-query.cc:592 since r12-6606-g9d6a0f388eb048f8

2022-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105635

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Jakub Jelinek  ---
Fixed for 12.2 too.

[Bug tree-optimization/105770] New: [13 Regression] ICE in decompose, at wide-int.h:984

2022-05-30 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105770

Bug ID: 105770
   Summary: [13 Regression] ICE in decompose, at wide-int.h:984
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---

gcc 13.0.0 20220529 snapshot (g:58a40e76ebadce78639644cd3d56e42b68336927) ICEs
when compiling the following testcase, reduced from
gcc/testsuite/gcc.dg/analyzer/pr103892.c, w/ -O1 -funswitch-loops
-fno-tree-forwprop:

char argstr;

void
argstr_get_word (void)
{
  while (argstr)
switch (argstr)
  {
  case ' ':
  case '\t':
return;
  }

  __builtin_unreachable ();
}

% gcc-13.0.0 -O1 -funswitch-loops -fno-tree-forwprop -c hlkdlkuv.c
during GIMPLE pass: unswitch
hlkdlkuv.c: In function 'argstr_get_word':
hlkdlkuv.c:4:1: internal compiler error: in decompose, at wide-int.h:984
4 | argstr_get_word (void)
  | ^~~
0x7e7cea wi::int_traits >
>::decompose(long*, unsigned int, generic_wide_int > const&)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/wide-int.h:984
0x7ec593 wi::int_traits >::decompose(long*,
unsigned int, generic_wide_int const&)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/tree.h:3701
0x7ec593 wide_int_ref_storage::wide_int_ref_storage
>(generic_wide_int const&, unsigned int)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/wide-int.h:1034
0x7ec593 generic_wide_int
>::generic_wide_int
>(generic_wide_int const&, unsigned int)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/wide-int.h:790
0x7ec593 wi::binary_traits
>, generic_wide_int,
wi::int_traits >
>::precision_type, wi::int_traits
>::precision_type>::result_type
wi::bit_and_not >,
generic_wide_int
>(generic_wide_int > const&,
generic_wide_int const&)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/wide-int.h:2343
0x7ec593 generic_simplify_111
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/build/gcc/generic-match.cc:8192
0x169139f generic_simplify_EQ_EXPR
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/build/gcc/generic-match.cc:56962
0xb44402 fold_binary_loc(unsigned int, tree_code, tree_node*, tree_node*,
tree_node*)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/fold-const.cc:10902
0xb4cdda fold_build2_loc(unsigned int, tree_code, tree_node*, tree_node*,
tree_node*)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/fold-const.cc:13854
0x1071bb3 find_unswitching_predicates_for_bb
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/tree-ssa-loop-unswitch.cc:537
0x1075b19 init_loop_unswitch_info
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/tree-ssa-loop-unswitch.cc:268
0x1075b19 tree_ssa_unswitch_loops(function*)
   
/var/tmp/portage/sys-devel/gcc-13.0.0_p20220529/work/gcc-13-20220529/gcc/tree-ssa-loop-unswitch.cc:332

[Bug target/105607] sh-unknown-elf: Error: opcode not valid for this cpu variant

2022-05-30 Thread judge.packham at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105607

--- Comment #10 from Chris Packham  ---
I don't know if it helps at all but it looks like we actually noticed the
difference between GCC 10 and GCC 11. The workaround we have in ct-ng was GCC
11 specific.

[Bug rtl-optimization/53533] [10/11/12/13 regression] vectorization causes loop unrolling test slowdown as measured by Adobe's C++Benchmark

2022-05-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53533

--- Comment #45 from Hongtao.liu  ---
A reduced testcase.

int a[256];
int b[256];

void foo (void)
{
  int i;
  for (i = 0; i < 256; ++i)
{
  int tmp = a[i] + 12345;
  tmp *= 914237;
  tmp += 12332;
  tmp *= 914237;
  tmp += 12332;
  tmp *= 914237;
  tmp -= 13;
  tmp *= 8000;
  b[i] = tmp;
}
}

GCC now simplifies pmulld to pslld + padd + psub, and the vectorizer cost model
looks fine, but the scalar version is additionally optimized in pass_combine
from 4 * mult + 3 * add to 1 * mult + 2 * add, which the vectorizer does not
take into account.
The vectorized version is not simplified later.

mov eax, DWORD PTR a[rdx]
add rdx, 4
add eax, 12345
imul eax, eax, -1564285888
sub eax, 333519936
mov DWORD PTR b[rdx-4], eax
cmp rdx, 1024
jne .L2


I'm wondering whether GIMPLE could also simplify

  tmp *= 914237;
  tmp += 12332;
  tmp *= 914237;
  tmp += 12332;
  tmp *= 914237;
  tmp -= 13;
  tmp *= 8000;

to 
 tmp *= -1564285888;
 tmp -= 333519936;

refer to https://godbolt.org/z/qYMYMTxEY

Then the vectorized code would be more optimal.
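
The folded constants can be cross-checked with a small sketch. It assumes
wrapping 32-bit arithmetic (the -fwrapv case), modelled here with uint32_t so
that the C code itself has no undefined overflow. The function names chain,
folded and check_fold are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* The original sequence applied to tmp = a[i] + 12345, computed modulo
   2^32.  */
uint32_t chain (uint32_t tmp)
{
  tmp *= 914237u;
  tmp += 12332u;
  tmp *= 914237u;
  tmp += 12332u;
  tmp *= 914237u;
  tmp -= 13u;
  tmp *= 8000u;
  return tmp;
}

/* The folded form: one multiplication and one subtraction.  */
uint32_t folded (uint32_t tmp)
{
  tmp *= (uint32_t) -1564285888;
  tmp -= 333519936u;
  return tmp;
}

/* Compare both forms on a few inputs, including values near the wrap-around
   points.  */
void check_fold (void)
{
  static const uint32_t samples[] = { 0u, 1u, 12345u, 0x7fffffffu,
                                      0x80000000u, 0xffffffffu };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (chain (samples[i]) == folded (samples[i]));
}
```

The equality is only guaranteed modulo 2^32. With signed int and undefined
overflow, the intermediate products of the two forms can overflow at different
points.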
