[Bug middle-end/111845] [14 regression] ICE when building pycryptodome

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111845

--- Comment #5 from Andrew Pinski  ---
Created attachment 56129
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56129&action=edit
no uninitialized variable in this testcase

This is a slightly changed version of Sam's testcase that removes the
uninitialized variable (it is now an argument).

[Bug middle-end/111845] [14 regression] ICE when building pycryptodome

2023-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111845

Sam James  changed:

   What|Removed |Added

  Attachment #56127|0   |1
is obsolete||

--- Comment #4 from Sam James  ---
Created attachment 56128
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56128&action=edit
reduced.i

how about that?

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-16 Thread jan.wassenberg at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

--- Comment #4 from Jan Wassenberg  ---
I understand the slippery slope concern. But the empty asm string is a special
case: we and others use it (with a +r output and a memory clobber) to prevent
variables from being optimized out, e.g. during tests.

It seems useful for that to work without running into inlining issues on ppc10
:)
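
For readers unfamiliar with the idiom, a minimal sketch of the empty-asm trick
described above follows. The helper names `keep_alive` and `sum_to` are made up
for illustration and are not taken from any library; the constraint string
follows the description (a `+r` output plus a `memory` clobber) and assumes GCC
or Clang extended asm and a value that fits in a general-purpose register.

```cpp
#include <cassert>

// Hypothetical helper (the name is an assumption, not a library API).
// The empty asm emits no instructions. The "+r" constraint tells the
// compiler the value is read and may be modified, and the "memory" clobber
// tells it memory may have changed, so `v` has to be materialized.
inline unsigned keep_alive(unsigned v) {
  asm volatile("" : "+r"(v) : : "memory");
  return v;
}

// Example use: every partial sum passes through keep_alive, so the loop
// cannot be folded into a closed-form constant by the optimizer.
inline unsigned sum_to(unsigned n) {
  unsigned s = 0;
  for (unsigned i = 1; i <= n; ++i) s = keep_alive(s + i);
  return s;
}
```

Since the asm string is empty, there is nothing target-specific inside it,
which is why it seems reasonable to treat it differently from asm that may
contain real instructions.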

[Bug middle-end/111845] [14 regression] ICE when building pycryptodome

2023-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111845

--- Comment #3 from Sam James  ---
Created attachment 56127
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56127&action=edit
reduced.ii

I've attached what cvise gave me but it also relies on uninitialised vars (not
tried to fix it yet)

[Bug middle-end/111845] [14 regression] ICE when building pycryptodome

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111845

Andrew Pinski  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-17

--- Comment #2 from Andrew Pinski  ---
Confirmed.

It is related to .UADDC production somehow, but I am not sure what is being
attempted here, since there are almost no detail dumps from widening_mul either.
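
For context, and as far as I understand it, .UADDC is the internal function
that the widening_mul pass emits when it recognizes an add-with-carry idiom.
The sketch below shows a generic idiom of that kind. It is NOT the reduced
testcase, and the function names `add_with_carry` and `add128` are invented
for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Computes a + b + carry_in (carry_in is 0 or 1) and stores the carry out.
inline std::uint64_t add_with_carry(std::uint64_t a, std::uint64_t b,
                                    std::uint64_t carry_in,
                                    std::uint64_t* carry_out) {
  std::uint64_t partial = a + b;
  std::uint64_t c1 = partial < a;    // carry out of a + b
  std::uint64_t sum = partial + carry_in;
  std::uint64_t c2 = sum < partial;  // carry out of adding carry_in
  *carry_out = c1 + c2;              // at most one of c1, c2 can be set
  return sum;
}

// Two-limb (128-bit) addition built from the helper; limb 0 is the low limb.
inline void add128(const std::uint64_t x[2], const std::uint64_t y[2],
                   std::uint64_t out[2]) {
  std::uint64_t carry = 0;
  out[0] = add_with_carry(x[0], y[0], 0, &carry);
  out[1] = add_with_carry(x[1], y[1], carry, &carry);
}
```

Multi-precision code such as the addmul128 routine named in the ICE message
chains many carries like this, which is presumably why the pass triggers here.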

[Bug middle-end/111845] [14 regression] ICE when building pycryptodome

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111845

--- Comment #1 from Andrew Pinski  ---
Created attachment 56126
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56126&action=edit
Reduced testcase

I am not a fan of this reduced testcase since it depends on uninitialized
variables.

[Bug middle-end/111845] [14 regression] ICE when building pycryptodome

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111845

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
   Target Milestone|--- |14.0

[Bug middle-end/111845] New: [14 regression] ICE when building pycryptodome

2023-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111845

Bug ID: 111845
   Summary: [14 regression] ICE when building pycryptodome
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 56125
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56125&action=edit
mont.i.xz

Hit this when building pycryptodome on x86_64 (recently started testing that
platform w/ 14). This package seems to be a good smoke test - see PR110271 from
a few months ago.

```
# x86_64-pc-linux-gnu-gcc -O2 -pipe -march=native -fdiagnostics-color=always
-frecord-gcc-switches -DNDEBUG -fPIC -DHAVE_STDINT_H -DPYCRYPTO_LITTLE_ENDIAN
-DSYS_BITS=64 -DLTC_NO_ASM -DHAVE_UINT128 -DHAVE_CPUID_H -DHAVE_POSIX_MEMALIGN
-DHAVE_X86INTRIN_H -DUSE_SSE2 -Isrc/ -I/usr/include/pypy3.10 -c src/mont.c -o
/var/tmp/portage/dev-python/pycryptodome-3.18.0/work/pycryptodome-3.18.0-pypy3/build/temp.linux-x86_64-pypy310/src/mont.o
-msse2
In file included from src/mont.c:43:
src/multiply_64.c: In function ‘addmul128’:
src/multiply_64.c:62:6: error: missing definition
   62 | void addmul128(uint64_t *t, uint64_t *scratchpad, const uint64_t *a,
uint64_t b0, uint64_t b1, size_t t_words, size_t a_nw)
  |  ^
for SSA_NAME: c_81 in statement:
_106 = c_81 + _87;
during GIMPLE pass: widening_mul
src/multiply_64.c:62:6: internal compiler error: verify_ssa failed
0x55e4233d3d77 verify_ssa(bool, bool) [clone .constprop.0]
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231015/gcc-14-20231015/gcc/tree-ssa.cc:1203
0x55e42478d6ca execute_function_todo
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231015/gcc-14-20231015/gcc/passes.cc:2095
0x55e4246deaa1 do_per_function
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231015/gcc-14-20231015/gcc/passes.cc:1687
0x55e4246deaa1 execute_todo
   
/usr/src/debug/sys-devel/gcc-14.0.0_pre20231015/gcc-14-20231015/gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

`x86_64-pc-linux-gnu-gcc -O2 -c mont.i -march=znver2` seems to be enough for me
to reproduce.

```
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/14/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-14.0.0_pre20231015/work/gcc-14-20231015/configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/14
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/14/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/14/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/14/python
--enable-languages=c,c++,fortran,rust --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=yes,extra,rtl
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo Hardened
14.0.0_pre20231015 p4' --with-gcc-major-version-only --enable-libstdcxx-time
--enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --enable-multilib
--with-multilib-list=m32,m64 --disable-fixed-point --enable-targets=all
--enable-libgomp --disable-libssp --disable-libada --enable-cet
--disable-systemtap --enable-valgrind-annotations --disable-vtable-verify
--disable-libvtv --with-zstd --with-isl --disable-isl-version-check
--enable-default-pie --enable-host-pie --enable-host-bind-now
--enable-default-ssp --with-build-config='bootstrap-lto bootstrap-cet'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231015 (experimental) (Gentoo Hardened 14.0.0_pre20231015
p4)
```

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread n.deshmukh at samsung dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

--- Comment #9 from n.deshmukh at samsung dot com  ---
(In reply to Andrew Pinski from comment #8)
> (In reply to n.deshm...@samsung.com from comment #7)
> > 
> > Is there a reason why the second error is not categorized under
> > -Wfloat-conversion diagnostic?
> 
> Did you read what I linked?
> I will link it again:
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1467r4.
> html#implicit
> 
> It even mentions why things won't change.
> Again the code:
> ```
> float a;
> _Float16 b = a;
> ```
> is invalid C++ code as defined by that paper (which was addopted for C++23).
> While:
> ```
> double a;
> float b = a;
> ```
> is still valid even.

Yes I went through the link and I understand that double to float conversion is
valid code. My question was regarding the use of the diagnostic flag. My
understanding was that the diagnostic flag was implementation defined. That's
why I was asking.

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

--- Comment #8 from Andrew Pinski  ---
(In reply to n.deshm...@samsung.com from comment #7)
> 
> Is there a reason why the second error is not categorized under
> -Wfloat-conversion diagnostic?

Did you read what I linked?
I will link it again:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1467r4.html#implicit

It even mentions why things won't change.
Again the code:
```
float a;
_Float16 b = a;
```
is invalid C++ code as defined by that paper (which was adopted for C++23).
While:
```
double a;
float b = a;
```
is still valid even.

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread n.deshmukh at samsung dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

--- Comment #7 from n.deshmukh at samsung dot com  ---
How about the following code: 

int f(double a) {
float b = a;
 return 0;
}

int g(double a) {
_Float16 b = a;
 return 0;
}

It generates the following errors: 

: In function 'int f(double)':
:2:15: warning: conversion from 'double' to 'float' may change value
[-Wfloat-conversion]
2 | float b = a;
  |   ^
: In function 'int g(double)':
:7:18: warning: converting to '_Float16' from 'double' with greater
conversion rank
7 | _Float16 b = a;
  |  ^
Compiler returned: 0

Is there a reason why the second error is not categorized under
-Wfloat-conversion diagnostic?

[Bug tree-optimization/95034] Failure to convert xor pattern (made out of or+and) to xor

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95034

--- Comment #7 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #6)
> So after phiopt2 we end up with:
> ```
>   _1 = a_3(D) | b_4(D);
>   if (_1 != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
>[local count: 536870912]:
>   _5 = a_3(D) & b_4(D);
>   _7 = ~_5;
> 
>[local count: 1073741824]:
>   # iftmp.0_2 = PHI <_7(3), 0(2)>
> ```
> 
> Phi-opt does not support more than one statement inside the middle BB so it
> does nothing here.
> 
> But if we rewrite it such that the 2 statements were not inside the middle
> BB by phiopt2, we get the xor as expected.
> That is:
> ```
> bool f(bool a, bool b)
> {
>   bool c = a | b;
>   bool d = a & b;
>   d = !d;
>   return c ? d : false;
> }
> ```
> Will produce:
> ```
>   _8 = a_3(D) ^ b_4(D);
>   return _8;
> ```
> From the phiopt2 dump:
> ```
> Folded into the sequence:
> _8 = a_3(D) ^ b_4(D);
> ```
> 
> I wonder if we could support more than 1 statement in the middle BBs iff the
> resulting simplifications only reference one SSA_NAME of the statements max
> ...

Or we could do for `~` what we already do for casts and push the `~` outside
of the if statement to see if that improves phi-opt.
That is, get:
```
bool f(bool a, bool b)
{
bool t = a | b;
bool t1;
if (t) t1 = a & b; else t1 = 1;
return !t1;
}

```
which GCC already knows how to optimize to `t1 = a == b;`
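
As a sanity check on the reasoning above, here is a small sketch verifying
that both source shapes compute `a ^ b` for every input. The names `f_select`,
`f_hoisted` and `all_inputs_agree` are chosen for this illustration only; it
says nothing about what phiopt actually does with either shape.

```cpp
#include <cassert>

// Shape quoted from comment #6: select between !(a & b) and false.
constexpr bool f_select(bool a, bool b) {
  bool c = a | b;
  bool d = a & b;
  d = !d;
  return c ? d : false;
}

// Shape proposed in this comment: the negation is pushed out of the branch.
constexpr bool f_hoisted(bool a, bool b) {
  bool t = a | b;
  bool t1 = true;  // initialized so the function is constexpr before C++20
  if (t) t1 = a & b; else t1 = true;
  return !t1;
}

// Exhaustive check over the four possible inputs (needs C++14 constexpr).
constexpr bool all_inputs_agree() {
  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b) {
      const bool x = a != 0, y = b != 0;
      if (f_select(x, y) != (x != y)) return false;
      if (f_hoisted(x, y) != (x != y)) return false;
    }
  return true;
}
static_assert(all_inputs_agree(), "both shapes compute a ^ b");
```

In the hoisted shape `t1` equals `a == b` on every path, so the returned
`!t1` is `a != b`, which for bool is the expected xor.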

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #6 from Andrew Pinski  ---
(In reply to n.deshm...@samsung.com from comment #5)
> The code is part of a third party library hence adding a explicit cast is
> not possible.

Well that third party library is NOT valid C++ code ... The whole point of the
warning (and the reason why -Wno-pedantic does not turn off the warning) is to
point that out and, moreover, to point out that the code should be fixed.

Even more things like:
```
template<typename T>
void f(T a) requires requires(T a) { a = 5.0;}
{

}

void g()
{
f<_Float16> (5.0);
}
```

Will not work.
Clang currently incorrectly accepts the above code even.

[Bug tree-optimization/111844] missed optimization

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111844

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-10-17
 Status|UNCONFIRMED |NEW
   Severity|normal  |enhancement
   Keywords||missed-optimization
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
I suspect LLVM is able to just optimize it to:
#include <cstddef>
#include <cstring>

void foo1(void* buf, int inc) {
    unsigned int px;
    memcpy(&px, ((char*)buf)+offsetof(P, x), sizeof(px));
    px += inc;
    memcpy(((char*)buf)+offsetof(P, x), &px, sizeof(px));

    // bar();
}

as it can see that this is the only location that is changed ...
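
To make the claimed equivalence concrete, here is a small self-contained
sketch comparing the whole-struct round trip from the report with the
member-only form suggested above. `foo_full`, `foo_narrow` and `same_effect`
are names invented for this check; `P` is copied from the original report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct P {
  unsigned int x;
  unsigned int y;
  unsigned int z[20];
};

// Shape from the report: copy the whole struct out, bump x, copy it back.
inline void foo_full(void* buf, int inc) {
  P p;
  std::memcpy(&p, buf, sizeof(p));
  p.x += inc;
  std::memcpy(buf, &p, sizeof(p));
}

// Narrowed shape: only the bytes of P::x are read and written.
inline void foo_narrow(void* buf, int inc) {
  unsigned int px;
  char* field = static_cast<char*>(buf) + offsetof(P, x);
  std::memcpy(&px, field, sizeof(px));
  px += inc;
  std::memcpy(field, &px, sizeof(px));
}

// Runs both shapes on identical buffers and compares the resulting bytes.
inline bool same_effect(int inc) {
  unsigned char a[sizeof(P)], b[sizeof(P)];
  for (std::size_t i = 0; i < sizeof(P); ++i)
    a[i] = b[i] = static_cast<unsigned char>(i * 7 + 3);
  foo_full(a, inc);
  foo_narrow(b, inc);
  return std::memcmp(a, b, sizeof(P)) == 0;
}
```

Both functions leave every byte outside `P::x` untouched, so the transformation
is valid whether or not bar() is called afterwards.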

[Bug libstdc++/110854] constructor of std::counting_semaphore is not constexpr

2023-10-16 Thread de34 at live dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110854

--- Comment #3 from Jiang An  ---
(In reply to Jiang An from comment #2)
> The constructor of the internal __platform_semaphore class currently calls
> sem_init, which make it incompatible with constexpr...

It seems doable to make the ctor constexpr, but we need to use different
strategies for different C libs.

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread n.deshmukh at samsung dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

n.deshmukh at samsung dot com  changed:

   What|Removed |Added

 Resolution|INVALID |---
 Status|RESOLVED|UNCONFIRMED

--- Comment #5 from n.deshmukh at samsung dot com  ---
(In reply to Andrew Pinski from comment #3)
> >Is there a way to disable this warning without using an explicit cast?
> 
> No and this is by design because this is how C++ defines extended floating
> point types and implict casts.

(In reply to Eric Gallager from comment #4)
> I think the part about the warning not being controlled by a specific flag
> is still valid, though? It'd fit under the bug 44209 meta-bug.

Yes this is what I meant. Sorry for not being clear in my initial description.
The warning is valid but I wish to suppress it like the other conversion
warning using -Wno flag. But it does not come under a flag. It is still printed
with -Wno-pedantic option. 

The code is part of a third party library, hence adding an explicit cast is not
possible.

[Bug tree-optimization/111844] New: missed optimization

2023-10-16 Thread 113245 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111844

Bug ID: 111844
   Summary: missed optimization
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 113245 at gmail dot com
  Target Milestone: ---

Hello,

The following code compiles and optimizes to something reasonable under -O2
-std=c++14 with gcc trunk (Oct 16, d5cfabc677b08f38ea5d5f85deeda746b4fabb88)


#include <cstring>

extern void bar();

struct P {
    unsigned int x;
    unsigned int y;
    unsigned int z[20];
};

void foo(void* buf, int inc) {
    P p;
    memcpy(&p, buf, sizeof(p));
    p.x += inc;
    memcpy(buf, &p, sizeof(p));

    // bar();
}


Results in assembly that only loads the portion of data from 'buf' that
corresponds to p.x.

foo(void*, int):
movdqu  xmm0, XMMWORD PTR [rdi]
movaps  XMMWORD PTR [rsp-104], xmm0
add DWORD PTR [rsp-104], esi
movdqa  xmm0, XMMWORD PTR [rsp-104]
movups  XMMWORD PTR [rdi], xmm0
ret

However, reintroducing the call to bar() results in significantly worse
assembly; it appears to want to copy the entire struct `p` out of buf, even
though almost all of the movaps instructions are not useful.

foo(void*, int):
movdqu  xmm0, XMMWORD PTR [rdi]
mov rax, QWORD PTR [rdi+80]
movaps  XMMWORD PTR [rsp-104], xmm0
movdqu  xmm0, XMMWORD PTR [rdi+16]
add DWORD PTR [rsp-104], esi
movaps  XMMWORD PTR [rsp-88], xmm0
movdqu  xmm0, XMMWORD PTR [rdi+32]
mov QWORD PTR [rsp-24], rax
movaps  XMMWORD PTR [rsp-72], xmm0
movdqu  xmm0, XMMWORD PTR [rdi+48]
movaps  XMMWORD PTR [rsp-56], xmm0
movdqu  xmm0, XMMWORD PTR [rdi+64]
movaps  XMMWORD PTR [rsp-40], xmm0
movdqa  xmm0, XMMWORD PTR [rsp-104]
movups  XMMWORD PTR [rdi], xmm0
jmp bar()

For comparison, several versions of clang with the same flags will optimize
this to:

foo(void*, int):
add dword ptr [rdi], esi
jmp bar()

I am not sure why the loads to the stack-local `P p` are not elided; my first
thought was that perhaps escape analysis on `&p` forces the full load in case
memcpy "saves" the address of `p` for use by bar(). I would have expected that
wrapping the {decl/memcpy/increment/memcpy} in its own scope would address
that, but it seems to have no effect.

Thanks

[Bug target/111466] RISC-V: redundant sign extensions despite ABI guarantees

2023-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111466

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:8eb9cdd142182aaa3ee39750924bc0a0491236c3

commit r14-4676-g8eb9cdd142182aaa3ee39750924bc0a0491236c3
Author: Vineet Gupta 
Date:   Mon Oct 16 21:59:09 2023 -0600

expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg
[target/111466]

RISC-V suffers from extraneous sign extensions, despite/given the ABI
guarantee that 32-bit quantities are sign-extended into 64-bit registers,
meaning incoming SI function args need not be explicitly sign extended
(so do SI return values as most ALU insns implicitly sign-extend too.)

Existing REE doesn't seem to handle this well and there are various ideas
floating around to smarten REE about it.

RISC-V also seems to correctly implement middle-end hook PROMOTE_MODE
etc.

Another approach would be to prevent EXPAND from generating the
sign_extend in the first place which this patch tries to do.

The hunk being removed was introduced way back in 1994 as
   5069803972 ("expand_expr, case CONVERT_EXPR .. clear the promotion
flag")

This survived full testsuite run for RISC-V rv64gc with surprisingly no
fallouts: test results before/after are exactly same.

|                                | # of unexpected case / # of unique unexpected case
|                                |    gcc    |   g++   | gfortran |
| rv64imafdc_zba_zbb_zbs_zicond/ | 264 / 87  |  5 / 2  | 72 / 12  |
|    lp64d/medlow                |

Granted for something so old to have survived, there must be a valid
reason. Unfortunately the original change didn't have additional
commentary or a test case. That is not to say it can't/won't possibly
break things on other arches/ABIs, hence the RFC for someone to scream
that this is just bonkers, don't do this.

I've explicitly CC'ed Jakub and Roger who have last touched subreg
promoted notes in expr.cc for insight and/or screaming.

Thanks to Robin for narrowing this down in an amazing debugging session
@ GNU Cauldron.

```
foo2:
        sext.w  a6,a1 <-- this goes away
        beq     a1,zero,.L4
        li      a5,0
        li      a0,0
.L3:
        addw    a4,a2,a5
        addw    a5,a3,a5
        addw    a0,a4,a0
        bltu    a5,a6,.L3
        ret
.L4:
        li      a0,0
        ret
```

Signed-off-by: Vineet Gupta 
Co-developed-by: Robin Dapp 

PR target/111466
gcc/
* expr.cc (expand_expr_real_2): Do not clear SUBREG_PROMOTED_VAR_P.

gcc/testsuite
* gcc.target/riscv/pr111466.c: New test.
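
For orientation, the sketch below shows a source function whose shape matches
the foo2 assembly in the commit message. It is inferred from the register
usage in that listing (a1 is the bound, a2 and a3 are added each iteration,
a0 is never read) and is an assumption; the actual pr111466.c test may differ.

```cpp
#include <cassert>

// Inferred shape, not the committed test. `n` arrives as a 32-bit int in a
// 64-bit register; the ABI already guarantees it is sign-extended, which is
// why the explicit sext.w before the unsigned compare (bltu) is redundant.
// Note: delta must be nonzero when n != 0, otherwise the loop never ends.
inline int foo2(int /*unused*/, int n, unsigned y, unsigned delta) {
  int s = 0;
  unsigned x = 0;
  for (; x < static_cast<unsigned>(n); x += delta)
    s += x + y;  // evaluated in unsigned arithmetic, then converted back
  return s;
}
```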

[Bug middle-end/111843] New: [meta-bug] wrong-code due to -fstack-reuse=

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111843

Bug ID: 111843
   Summary: [meta-bug] wrong-code due to -fstack-reuse=
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

This is a meta-bug for all the current known wrong-code due to -fstack-reuse= .

[Bug tree-optimization/111839] [12/13/14 Regression] Wrong code at -O3 on x86_64-linux-gnu since r12-2097-g9f34b780b0

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111839

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-10-17
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Partition 1: size 8 align 8
l   j

So another -fstack-reuse= issue.

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #4 from Eric Gallager  ---
I think the part about the warning not being controlled by a specific flag is
still valid, though? It'd fit under the bug 44209 meta-bug.

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

--- Comment #3 from Kewen Lin  ---
(In reply to Peter Bergner from comment #1)
> (In reply to Kewen Lin from comment #0)
> > Technically speaking we are able to parse the inline asm string and figure
> > out it's HTM related or not.  Excepting for the HTM specific instructions
> > like .tbegin etc., we also need to take care of those special registers
> > related to HTM feature. 
> 
> I don't like the idea of scanning the inline asm string for HTM
> instructions.  The problem is, the user could have used ".long 0x7c00051d"
> which is a valid HTM instruction, rather than typing "tbegin.".  There's
> just no easy way to handle all possible cases.  It's also a slippery slope
> of peeking into the inline asm string and I don't think we want to do that! 
> If we start doing it for this, there will just be more and more requests for
> doing it for other things.  Let's not.

Thanks for the comments, fair enough, I guess that's why we don't have such an
inline asm parser. Another idea is to launch a child process to invoke the
assembler to parse it without HTM support (if the assembler can support this
fine-grained feature) and see if it works. But it's still complicated.

> 
> I also believe that if the user compiles some inline asm using
> -mcpu=power10, then the compiler can assume that inline asm only uses
> features available on Power10, meaning we can assume the inline asm does not
> contain HTM or any other feature not supported on Power10.  If the user
> compiles a piece of inline asm that doesn't support the features used in
> that inline asm, then that is user error!

We can. But it's not the case that this request aims to solve. 

The motivation of this request is to do our best to let power10-attributed
code inline more power8/power9-attributed code, which likely includes some
inline asm that is not HTM related, as the quoted OSS shows. For now, for any
function that has a non-empty inline asm string, we consider it possible that
it contains HTM code, so it is unsafe to inline it.

Users usually think code with a higher cpu attribute can safely inline code
with a lower cpu attribute, but that expectation does not hold for power10 code
inlining power8/power9 code, as we drop HTM from power10. If we can support it
better, users won't need extra effort to learn about it.

[Bug c++/111841] Lookup context rejected at definition if lookup finds a namespace

2023-10-16 Thread johelegp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111841

--- Comment #3 from Johel Ernesto Guerrero Peña  ---
Thank you.
For reference, here's how I found out this bug:
.

[Bug c++/111841] Lookup context rejected at definition if lookup finds a namespace

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111841

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||98939

--- Comment #2 from Andrew Pinski  ---
GCC does not implement
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1787r6.html (PR
98939) yet.

I am 99% sure that paper is what causes this to be valid.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98939
[Bug 98939] [C++23] Implement P1787R6 "Declarations and where to find them"

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

--- Comment #3 from Andrew Pinski  ---
>Is there a way to disable this warning without using an explicit cast?

No and this is by design because this is how C++ defines extended floating
point types and implicit casts.

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Andrew Pinski  ---
As defined by the standard:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1467r4.html#implicit

You could either use f16 or add an explicit cast to avoid the (ped)warning

[Bug c++/111842] Unable to disable conversion warning in case of _Float16

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

--- Comment #1 from Andrew Pinski  ---
I think you should be using 5.0f16 instead ...

[Bug c++/111842] New: Unable to disable conversion warning in case of _Float16

2023-10-16 Thread n.deshmukh at samsung dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111842

Bug ID: 111842
   Summary: Unable to disable conversion warning in case of
_Float16
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: n.deshmukh at samsung dot com
  Target Milestone: ---

The following code when compiled with gcc-13.1 results in the following output:

int main() {
_Float16 a = 5.0;
float b = a;
 return 0;
}

: In function 'int main()':
:2:18: warning: converting to '_Float16' from 'double' with greater
conversion rank
2 | _Float16 a = 5.0;
  |  ^~~
Compiler returned: 0

There is no way to disable this warning as it is not classified under any -W
categories. Is there a way to disable this warning without using an explicit
cast?

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-16 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

--- Comment #2 from Peter Bergner  ---
(In reply to Peter Bergner from comment #1)
> If the user compiles a piece of inline asm that doesn't support the
> features used in that inline asm, then that is user error!

I meant to say: If the user compiles a piece of inline asm using options that
don't support the features used in that inline asm, then that is user error!

[Bug target/111828] rs6000: Parse inline asm string to figure out it requires HTM feature or not.

2023-10-16 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111828

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #1 from Peter Bergner  ---
(In reply to Kewen Lin from comment #0)
> Technically speaking we are able to parse the inline asm string and figure
> out it's HTM related or not.  Excepting for the HTM specific instructions
> like .tbegin etc., we also need to take care of those special registers
> related to HTM feature. 

I don't like the idea of scanning the inline asm string for HTM instructions. 
The problem is, the user could have used ".long 0x7c00051d" which is a valid
HTM instruction, rather than typing "tbegin.".  There's just no easy way to
handle all possible cases.  It's also a slippery slope of peeking into the
inline asm string and I don't think we want to do that!  If we start doing it
for this, there will just be more and more requests for doing it for other
things.  Let's not.

I also believe that if the user compiles some inline asm using -mcpu=power10,
then the compiler can assume that inline asm only uses features available on
Power10, meaning we can assume the inline asm does not contain HTM or any other
feature not supported on Power10.  If the user compiles a piece of inline asm
that doesn't support the features used in that inline asm, then that is user
error!
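
To illustrate the first objection, here is a deliberately naive sketch of a
text-based check. `naive_mentions_htm` is a hypothetical function written for
this note, not GCC code, and its short mnemonic list is illustrative rather
than complete. It recognizes the spelled-out mnemonic but has no way to see
the raw encoding mentioned above.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical, deliberately naive scanner: reports whether the asm text
// spells out one of a few HTM mnemonics. It knows nothing about raw
// encodings, macros, or included files. Requires C++17 for string_view.
inline bool naive_mentions_htm(std::string_view asm_text) {
  constexpr std::string_view mnemonics[] = {"tbegin.", "tend.", "tabort."};
  for (std::string_view m : mnemonics)
    if (asm_text.find(m) != std::string_view::npos) return true;
  return false;
}
```

With this check, `naive_mentions_htm("tbegin. 0")` is true while
`naive_mentions_htm(".long 0x7c00051d")` is false, even though the second
string is the same instruction according to the comment above.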

[Bug bootstrap/111601] [14 Regression] bootstrap fails in stagestrain in libcody on x86_64-linux-gnu and powerpc64le-linux-gnu

2023-10-16 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111601

Peter Bergner  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org

--- Comment #1 from Peter Bergner  ---
Did you see the same error on both x86_64 and powerpc64le?  ...and what
configure options did you use on powerpc64le-linux?

When I do a normal configure and make profiledbootstrap-lean on
powerpc64le-linux, I see instead:

In file included from
/home/bergner/gcc/gcc-fsf-mainline-pr111601/libstdc++-v3/include/precompiled/stdc++.h:42:
/home/bergner/gcc/build/gcc-fsf-mainline-pr111601-regtest-2/powerpc64le-linux/libstdc++-v3/include/cstdlib:
In function ‘ldiv_t std::div(long int, long int)’:
/home/bergner/gcc/build/gcc-fsf-mainline-pr111601-regtest-2/powerpc64le-linux/libstdc++-v3/include/cstdlib:181:57:
internal compiler error: tree check: expected tree that contains ‘decl common’
structure, have ‘’ in build_new_method_call, at
cp/call.cc:11630
  181 |   div(long __i, long __j) _GLIBCXX_NOTHROW { return ldiv(__i, __j); }
  | ^~
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
See  for instructions.
make[5]: *** [Makefile:1925:
powerpc64le-linux/bits/stdc++.h.gch/O2ggnu++0x.gch] Error 1
make[5]: Leaving directory
'/home/bergner/gcc/build/gcc-fsf-mainline-pr111601-regtest-2/powerpc64le-linux/libstdc++-v3/include'
make[4]: *** [Makefile:576: all-recursive] Error 1

[Bug libstdc++/78276] regex_search is slow

2023-10-16 Thread jklowden at schemamania dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78276

--- Comment #3 from James K. Lowden  ---
Here is a nonpathological example taken from a real-world problem where
std::regex_search fails.

This pattern is part of the COBOL COPY text-manipulation directive: 

([[:space:]]+(LEADING|TRAILING))?[[:space:]]+("((["]{2}|[^"])*)"|'(([']{2}|[^'])*)[']|([[:alnum:]]+([_-]+[[:alnum:]]+)*)|==((=?[^=]+)+)==)[[:space:]]+BY[[:space:]]+(("(["]{2}|[^"])*")|('([']{2}|[^'])*')|([[:alnum:]]+([_-]+[[:alnum:]]+)*)|==((=?[^=]+)*)==)([[:space:]]*[.])?

That pattern has 21 captures.  Ignoring the optional LEADING/TRAILING clause,
it accepts 1 of 3 operands on either side of the BY keyword: 

1.  a quoted string using the " double-quote
2.  a quoted string using the ' single-quote
3.  an identifier consisting of alphanumerics with hyphens or underscores

Quoted strings in this syntax may include embedded quotes by doubling them. 

By "fails", I mean "does not terminate" in a reasonable time.  Using gdb I have
seen over 1900 stack frames inside std::regex_search.  This is with gcc 11 on
Linux.  

I have recast the program using awk and regex(3) from the C standard library,
both of which succeed instantly.  I attach a tarball that includes all three
files, the input, and a Makefile to demonstrate them.
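
For readers who want to try the regex(3) route without the attachment, here is
a minimal sketch using POSIX regcomp/regexec on a reduced subset of the pattern
above (only the double-quoted operand form on each side of BY). It is not the
attached program; the function name `match_replacing` and the reduced pattern
are assumptions made for illustration.

```cpp
#include <cassert>
#include <regex.h>
#include <string>

// Matches `"..." BY "..."` where quotes inside an operand are doubled.
// On success, stores the text between the outer quotes of each operand.
inline bool match_replacing(const std::string& input, std::string* lhs,
                            std::string* rhs) {
  // Groups: 1 = left operand text, 3 = right operand text
  // (2 and 4 are the inner repeated alternatives).
  static const char pattern[] =
      R"re("((["]{2}|[^"])*)"[[:space:]]+BY[[:space:]]+"((["]{2}|[^"])*)")re";
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED) != 0) return false;
  regmatch_t m[5];
  const bool ok = regexec(&re, input.c_str(), 5, m, 0) == 0;
  if (ok) {
    *lhs = input.substr(m[1].rm_so, m[1].rm_eo - m[1].rm_so);
    *rhs = input.substr(m[3].rm_so, m[3].rm_eo - m[3].rm_so);
  }
  regfree(&re);
  return ok;
}
```

The full 21-capture pattern can be passed to regcomp the same way; only the
size of the regmatch_t array and the group indices would need to change.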

[Bug libstdc++/78276] regex_search is slow

2023-10-16 Thread jklowden at schemamania dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78276

James K. Lowden  changed:

   What|Removed |Added

 CC||jklowden at schemamania dot org

--- Comment #2 from James K. Lowden  ---
Created attachment 56124
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56124=edit
test programs and input

[Bug c++/111841] Lookup context rejected at definition if lookup finds a namespace

2023-10-16 Thread johelegp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111841

--- Comment #1 from Johel Ernesto Guerrero Peña  ---
According to my reading of
 and
,
Clang and MSVC are right.

[Bug c++/111841] New: Lookup context rejected at definition if lookup finds a namespace

2023-10-16 Thread johelegp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111841

Bug ID: 111841
   Summary: Lookup context rejected at definition if lookup finds
a namespace
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: rejects-valid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: johelegp at gmail dot com
CC: johelegp at gmail dot com
  Target Milestone: ---

See .

```C++
namespace ns { }
struct X {
  struct ns { void f(); };
  struct t : ns { };
};
void g() {
  [](auto x) {
  return x.ns::f();
  }
  // (X::t{}) // Clang and MSVC accept.
  ;
}
```

```output
: In lambda function:
:8:16: error: 'ns::f' is not a class member
8 |   return x.ns::f();
  |^~
Compiler returned: 1
```

This goes back to GCC 4.9.4: .

[Bug c++/111790] [12/13/14 Regression] Unwarranted missing template keyword warning

2023-10-16 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111790

Patrick Palka  changed:

   What|Removed |Added

   Target Milestone|--- |12.4
Summary|Unwarranted missing |[12/13/14 Regression]
   |template keyword warning|Unwarranted missing
   ||template keyword warning
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-16
 CC||ppalka at gcc dot gnu.org

--- Comment #2 from Patrick Palka  ---
Confirmed.

[Bug target/81426] [SH]: unable to find a register to spill in class 'R0_REGS' when building webkit2gtk

2023-10-16 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81426

--- Comment #12 from Oleg Endo  ---
(In reply to John Paul Adrian Glaubitz from comment #11)
> Created attachment 56123 [details]
> Preprocessed source from building GHC with gcc-13
> 
> This is still present in gcc-13, I just ran into it while cross-building the
> Haskell compiler GHC for sh4:
> 

Have you tried using the -mlra option for this build?

[Bug c/111808] [C23] constexpr with excess precision

2023-10-16 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111808

--- Comment #5 from joseph at codesourcery dot com  ---
We could add a "note: initializer represented with excess precision" or 
similar for the case where the required error might be surprising because 
the semantic types are the same.

[Bug c++/111840] New: =delete("can have a reason")?

2023-10-16 Thread ed at catmur dot uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111840

Bug ID: 111840
   Summary: =delete("can have a reason")?
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ed at catmur dot uk
  Target Milestone: ---

Since 6.1, gcc accepts the following (without warnings at any level):

int f() = delete("should have a reason");

Much as I'd love to be able to write this, gcc seems to be slightly jumping the
gun, since P2573[1][2] hasn't been accepted yet and in fact wasn't even
proposed until almost 6 years after 6.1 was released.

1. https://github.com/cplusplus/papers/issues/1229
2. https://wg21.link/p2573r0

[Bug fortran/109105] Error-prone format string building in resolve.cc

2023-10-16 Thread roland.illig at gmx dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109105

--- Comment #3 from Roland Illig  ---
Nothing has changed yet.

There is no built-in validation in the translated messages that each '%%L' from
the msgid matches a '%%L' from the msgstr.

I suggest replacing the label 'bad_op' with a function named 'bad_op', so that
the format strings contain '%L' instead of the current '%%L' and gettext
can validate the format strings in the translations.
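
A minimal sketch of why the doubled percent sign appears at all, assuming the message is formatted in two passes: the first pass substitutes the operator name and collapses '%%L' to '%L', which a later diagnostic pass then consumes. The function name and the message text below are made up for illustration and are not the code in resolve.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the first formatting pass: the operator name is
// substituted now, while "%%L" collapses to "%L" for a later pass to consume.
std::string build_inner_format(const char* op_name) {
    char buffer[128];
    const int written = std::snprintf(
        buffer, sizeof buffer,
        "Operand of operator '%s' at %%L is invalid", op_name);
    // Guard against encoding errors and truncation of the fixed-size buffer.
    assert(written >= 0 && static_cast<std::size_t>(written) < sizeof buffer);
    (void)written;  // silence the unused-variable warning when NDEBUG is set
    return std::string(buffer);
}
```

In the translatable string the location directive is therefore spelled '%%L', which a printf-style format check reads as a literal percent sign followed by 'L' rather than as a directive that the msgstr must preserve.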

[Bug bootstrap/111812] [14 regression] Can't build with gcc 4.8.5

2023-10-16 Thread seurer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111812

--- Comment #4 from seurer at gcc dot gnu.org ---
I tried a build with r14-4659-ga22eeaca5ce753 and I see the following which
looks like it might be the mentioned union issues.

g++ -std=gnu++11  -fno-PIE -c   -g -O2 -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common 
-DHAVE_CONFIG_H -fno-PIE -I. -I. -I/home/seurer/gcc/git/gcc-test/gcc
-I/home/seurer/gcc/git/gcc-test/gcc/.
-I/home/seurer/gcc/git/gcc-test/gcc/../include 
-I/home/seurer/gcc/git/gcc-test/gcc/../libcpp/include
-I/home/seurer/gcc/git/gcc-test/gcc/../libcody
-I/home/seurer/gcc/git/build/gcc-test/./gmp -I/home/seurer/gcc/git/gcc-test/gmp
-I/home/seurer/gcc/git/build/gcc-test/./mpfr/src
-I/home/seurer/gcc/git/gcc-test/mpfr/src
-I/home/seurer/gcc/git/gcc-test/mpc/src 
-I/home/seurer/gcc/git/gcc-test/gcc/../libdecnumber
-I/home/seurer/gcc/git/gcc-test/gcc/../libdecnumber/dpd -I../libdecnumber
-I/home/seurer/gcc/git/gcc-test/gcc/../libbacktrace
-I/home/seurer/gcc/git/build/gcc-test/./isl/include
-I/home/seurer/gcc/git/gcc-test/isl/include  -o cse.o -MT cse.o -MMD -MP -MF
./.deps/cse.TPo /home/seurer/gcc/git/gcc-test/gcc/cse.cc
In file included from /home/seurer/gcc/git/gcc-test/gcc/cse.cc:25:0:
/home/seurer/gcc/git/gcc-test/gcc/rtl.h:66:26: warning: 'rtx_def::code' is too
small to hold all values of 'enum rtx_code' [enabled by default]
 #define RTX_CODE_BITSIZE 8
  ^
/home/seurer/gcc/git/gcc-test/gcc/rtl.h:318:33: note: in expansion of macro
'RTX_CODE_BITSIZE'
   ENUM_BITFIELD(rtx_code) code: RTX_CODE_BITSIZE;
 ^
/home/seurer/gcc/git/gcc-test/gcc/rtl.h:66:26: warning:
'qty_table_elem::comparison_code' is too small to hold all values of 'enum
rtx_code' [enabled by default]
 #define RTX_CODE_BITSIZE 8
  ^
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:252:45: note: in expansion of macro
'RTX_CODE_BITSIZE'
   ENUM_BITFIELD(rtx_code) comparison_code : RTX_CODE_BITSIZE;
 ^
/home/seurer/gcc/git/gcc-test/gcc/cse.cc: In function 'void
add_to_set(vec*, rtx)':
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::rtl' [-Wmissing-field-initializers]
   struct set entry = {};
   ^
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src_elt' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src_hash' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::dest_hash' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::inner_dest' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src_in_memory' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src_volatile' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::mode' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src_const_hash' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src_const' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::src_const_elt' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4236:23: warning: missing initializer
for member 'set::dest_addr_elt' [-Wmissing-field-initializers]
/home/seurer/gcc/git/gcc-test/gcc/cse.cc: In function 'void
cse_insn(rtx_insn*)':
/home/seurer/gcc/git/gcc-test/gcc/cse.cc:4954:19: error: use of deleted
function 'rtx_def::rtx_def()'
struct rtx_def memory_extend_buf;
   ^
In file included from /home/seurer/gcc/git/gcc-test/gcc/cse.cc:25:0:
/home/seurer/gcc/git/gcc-test/gcc/rtl.h:313:38: note: 'rtx_def::rtx_def()' is
implicitly deleted because the default definition would be ill-formed:
  chain_prev ("RTX_PREV (&%h)"))) rtx_def {
  ^
/home/seurer/gcc/git/gcc-test/gcc/rtl.h:313:38: error: use of deleted function
'rtx_def::u::u()'
/home/seurer/gcc/git/gcc-test/gcc/rtl.h:445:9: note: 'rtx_def::u::u()' is
implicitly deleted because the default definition would be ill-formed:
   union u {
 ^

[Bug tree-optimization/111839] [12/13/14 Regression] Wrong code at -O3 on x86_64-linux-gnu since r12-2097-g9f34b780b0

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111839

Andrew Pinski  changed:

   What|Removed |Added

Summary|Wrong code at -O3 on|[12/13/14 Regression] Wrong
   |x86_64-linux-gnu since  |code at -O3 on
   |r12-2097-g9f34b780b0|x86_64-linux-gnu since
   ||r12-2097-g9f34b780b0
   Target Milestone|--- |12.4

[Bug tree-optimization/111839] New: Wrong code at -O3 on x86_64-linux-gnu since r12-2097-g9f34b780b0

2023-10-16 Thread shaohua.li at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111839

Bug ID: 111839
   Summary: Wrong code at -O3 on x86_64-linux-gnu since
r12-2097-g9f34b780b0
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: shaohua.li at inf dot ethz.ch
CC: rguenth at gcc dot gnu.org
  Target Milestone: ---

gcc at -O3 produced the wrong code.

Bisected to r12-2097-g9f34b780b0

Compiler explorer: https://godbolt.org/z/Ea1hGGnob

$ cat a.c
int printf(const char *, ...);
long a;
int b, c, e, g, i;
long *d, *h;
char f = -26;
int main() {
  long j;
  c = 0;
  for (; c != 7; ++c) {
long k=0;
long l = k;
long **m = 
for (; f + i!=0; i++)
  h = 
g = h != (*m = );
int *n = 
*n = g;
for (; e;)
  for (; a; a = a + 1)
;
  }
  printf("%d\n", b);
}
$
$ gcc -fsanitize=address,undefined a.c && ./a.out
1
$ gcc -O3 a.c && ./a.out
0
$

[Bug fortran/111837] [8/9/10/11/12/13/14 Regression] Out of bounds access with optimization inside io-implied-do-control

2023-10-16 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111837

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |anlauf at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #3 from anlauf at gcc dot gnu.org ---
Submitted: https://gcc.gnu.org/pipermail/fortran/2023-October/059832.html

[Bug c++/111785] [modules] ICE when compiling fmt lib as module

2023-10-16 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111785

Patrick Palka  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=108080
 CC||ppalka at gcc dot gnu.org

--- Comment #1 from Patrick Palka  ---
The modules streaming code doesn't yet support "GCC optimize" pragmas.  It
should work if you compile the

[Bug tree-optimization/111838] [14 Regression] wrong code at -O3 on x86_64-linux-gnu

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111838

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Keywords||needs-bisection
   Last reconfirmed||2023-10-16

--- Comment #1 from Andrew Pinski  ---
The first IR difference comes in from lsplit ...

[Bug tree-optimization/111838] [14 Regression] wrong code at -O3 on x86_64-linux-gnu

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111838

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
Summary|wrong code at -O3 on|[14 Regression] wrong code
   |x86_64-linux-gnu|at -O3 on x86_64-linux-gnu

[Bug tree-optimization/111838] New: wrong code at -O3 on x86_64-linux-gnu

2023-10-16 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111838

Bug ID: 111838
   Summary: wrong code at -O3 on x86_64-linux-gnu
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

It appears to be a recent regression.

Compiler Explorer: https://godbolt.org/z/Mx76x5h5K


[614] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk
--enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231016 (experimental) (GCC) 
[615] % 
[615] % gcctk -O2 small.c; ./a.out
[616] % 
[616] % gcctk -O3 small.c; ./a.out
small.c: In function ‘main’:
small.c:6:17: warning: iteration 4 invokes undefined behavior
[-Waggressive-loop-optimizations]
6 |   if (e ? a % e : 0)
  |   ~~^~~
small.c:5:26: note: within this loop
5 | for (char e = -17; e < 1; e += 5) {
  |~~^~~
Floating point exception
[617] % 
[617] % cat small.c
int a, b, c;
volatile char d;
int main() {
  for (; b < 1; b++)
for (char e = -17; e < 1; e += 5) {
  if (e ? a % e : 0)
d;
  for (c = 0; c < 1; c++)
;
}
  return 0;
}
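
As a cross-check on the warning text, the sketch below traces the values that `e` takes in the inner loop of small.c. It assumes `char` is signed, as on x86_64-linux-gnu, and spells it `signed char` to make that explicit. The function name is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Values taken by `e` in the inner loop of small.c, assuming `char` is signed
// as on x86_64-linux-gnu (written as `signed char` here to make that explicit).
std::vector<int> inner_loop_values() {
    std::vector<int> values;
    for (signed char e = -17; e < 1; e += 5) values.push_back(e);
    return values;
}
```

Under that assumption the loop body runs for -17, -12, -7 and -2 and exits when `e` reaches 3, so `e` is never zero and `a % e` never divides by zero in the source as written.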

[Bug target/81426] [SH]: unable to find a register to spill in class 'R0_REGS' when building webkit2gtk

2023-10-16 Thread glaubitz at physik dot fu-berlin.de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81426

--- Comment #11 from John Paul Adrian Glaubitz  ---
Created attachment 56123
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56123=edit
Preprocessed source from building GHC with gcc-13

This is still present in gcc-13, I just ran into it while cross-building the
Haskell compiler GHC for sh4:

"inplace/bin/ghc-stage1" -optc-Wall -optc-Wall -optc-Wextra
-optc-Wstrict-prototypes -optc-Wmissing-prototypes -optc-Wmissing-declarations
-optc-Winline -optc-Wpointer-arith -optc-Wmissing-noreturn
-optc-Wnested-externs -optc-Wredundant-decls
-optc-Wno-aggregate-return -optc-Wno-unused-label -optc-DNOSMP
-optc-fno-strict-aliasing -optc-fno-common
-optc-Irts/dist-install/build/./autogen
-optc-Irts/include/../dist-install/build/include -optc-Irts/include/.
-optc-Irts/. -optc-DCOMPILING_RTS -optc-DFS_NAMESPACE=rts
-optc-Werror=unused-but-set-variable -optc-Wno-error=inline -optc-O2
-optc-fomit-frame-pointer -optc-g -optc-fno-omit-frame-pointer -optc-O0
-optc-g3 -optc-DRtsWay=\"rts_thr_debug\" -optc-ffunction-sections
-optc-fdata-sections -static
-optc-DTHREADED_RTS -optc-DDEBUG  -H32m -O -lffi -optl-pthread -O0 -H64m -Wall
-this-unit-id rts -optc-DNOSMP -dcmm-lint -package-env - -i -irts
-irts/dist-install/build
-Irts/dist-install/build -irts/dist-install/build/./autogen
-Irts/dist-install/build/./autogen
-Irts/include/../dist-install/build/include -Irts/include/. -Irts/.
-optP-DCOMPILING_RTS -optP-DFS_NAMESPACE=rts-O2 -Wcpp-undef -O0  
-Wnoncanonical-monad-instances  
-c rts/ProfilerReport.c -o rts/dist-install/build/ProfilerReport.thr_debug_o
rts/sm/NonMovingMark.c: In function ‘mark_closure’:

rts/sm/NonMovingMark.c:1763:1: error:
 error: unable to find a register to spill in class ‘R0_REGS’
 1763 | }
  | ^
 |
1763 | }
 | ^

rts/sm/NonMovingMark.c:1763:1: error:  error: this is the insn:
 |
1763 | }
 | ^
(insn 1553 3768 1554 115 (parallel [
(set (subreg:SI (reg:QI 7 r7 [803]) 0)
(unspec_volatile:SI [
(mem/v:QI (reg/f:SI 3 r3 [orig:299 _160 ] [299]) [-1 
S1 A8])
(reg:QI 800 [ MEM[(struct StgStack *)_313].marking ])
(reg:QI 5 r5 [orig:802 nonmovingMarkEpoch ] [802])
] UNSPECV_CMPXCHG_1))
(set (mem/v:QI (reg/f:SI 3 r3 [orig:299 _160 ] [299]) [-1  S1 A8])
(unspec_volatile:QI [
(const_int 0 [0])
] UNSPECV_CMPXCHG_2))
(set (reg:SI 147 t)
(unspec_volatile:SI [
(const_int 0 [0])
] UNSPECV_CMPXCHG_3))
(clobber (scratch:SI))
(clobber (reg:SI 0 r0))
(clobber (reg:SI 1 r1))

]) "rts/include/stg/SMP.h":325:0: error:
5 405 {atomic_compare_and_swapqi_soft_gusa}
 (expr_list:REG_DEAD (reg:QI 5 r5 [orig:802 nonmovingMarkEpoch ] [802])
(expr_list:REG_DEAD (reg:QI 800 [ MEM[(struct StgStack
*)_313].marking ])
(expr_list:REG_DEAD (reg/f:SI 3 r3 [orig:299 _160 ] [299])
(expr_list:REG_UNUSED (reg:SI 147 t)
(expr_list:REG_UNUSED (reg:SI 1 r1)
(expr_list:REG_UNUSED (reg:SI 0 r0)
(nil

rts/sm/NonMovingMark.c:1763:0: error:
 confused by earlier errors, bailing out
 |
1763 | }
 | ^

Attaching the preprocessed source for that.

[Bug c++/109751] [13/14 Regression] boost iterator_interface fails concept check starting in gcc-13

2023-10-16 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109751

Patrick Palka  changed:

   What|Removed |Added

 CC||janezz55 at gmail dot com

--- Comment #27 from Patrick Palka  ---
*** Bug 111831 has been marked as a duplicate of this bug. ***

[Bug c++/111831] friend with requires keyword compilation error

2023-10-16 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111831

Patrick Palka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE
 CC||ppalka at gcc dot gnu.org

--- Comment #4 from Patrick Palka  ---
Looks like this is a dup of PR109751, which has been fixed for GCC 13.3 / 14.

A workaround for earlier GCC is to turn the problematic constrained hidden
friend into a template:

  template 
  friend auto operator==(list const& l, list const& r)
  ...

This prevents GCC from overeagerly checking the function's constraints.

*** This bug has been marked as a duplicate of bug 109751 ***

[Bug fortran/111837] [8/9/10/11/12/13/14 Regression] Out of bounds access with optimization inside io-implied-do-control

2023-10-16 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111837

--- Comment #2 from anlauf at gcc dot gnu.org ---
Lightly tested, probably obvious patch:

diff --git a/gcc/fortran/frontend-passes.cc b/gcc/fortran/frontend-passes.cc
index 136a292807d..536884b13f0 100644
--- a/gcc/fortran/frontend-passes.cc
+++ b/gcc/fortran/frontend-passes.cc
@@ -1326,7 +1326,7 @@ traverse_io_block (gfc_code *code, bool *has_reached,
gfc_code *prev)
   if (iters[i])
{
  gfc_expr *var = iters[i]->var;
- for (int j = i - 1; j < i; j++)
+ for (int j = 0; j < i; j++)
{
  if (iters[j]
  && (var_in_expr (var, iters[j]->start)
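
Reading only the hunk above, the two loop headers differ in which earlier iterators they inspect. The sketch below lists the indices each form visits for a given `i`. The function names are hypothetical and nothing here is taken from frontend-passes.cc beyond the two loop headers.

```cpp
#include <cassert>
#include <vector>

// Indices visited by the loop before the patch: only j = i - 1,
// which is -1 (out of bounds for an array) when i == 0.
std::vector<int> indices_before_patch(int i) {
    assert(i >= 0);
    std::vector<int> visited;
    for (int j = i - 1; j < i; j++) visited.push_back(j);
    return visited;
}

// Indices visited after the patch: every earlier iterator 0 .. i-1.
std::vector<int> indices_after_patch(int i) {
    assert(i >= 0);
    std::vector<int> visited;
    for (int j = 0; j < i; j++) visited.push_back(j);
    return visited;
}
```

For i == 2 the old form looks only at index 1 and never at index 0, while the new form looks at both. For i == 0 the old form visits index -1 and the new form visits nothing.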

[Bug rtl-optimization/111835] Suboptimal codegen: zero extended load instead of sign extended one

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-10-16
  Component|middle-end  |rtl-optimization
 Ever confirmed|0   |1
   Severity|normal  |enhancement
 Status|UNCONFIRMED |NEW
 Target||aarch64
   Keywords||missed-optimization

--- Comment #1 from Andrew Pinski  ---
So as you said it depends on the target.
Most non-x86 targets have:
/* Define if loading from memory in MODE, an integral mode narrower than
   BITS_PER_WORD will either zero-extend or sign-extend.  The value of this
   macro should be the code that says which one of the two operations is
   implicitly done, or UNKNOWN if none.  */
#define LOAD_EXTEND_OP(MODE) ZERO_EXTEND

defined.

Which causes REE to be confused beforehand:
Before REE:
(insn 7 10 9 2 (set (reg:SI 0 x0 [orig:92 _1 ] [92])
(zero_extend:SI (mem:QI (reg:DI 0 x0 [99]) [0 *src_3(D)+0 S1 A8])))
"/app/example.cpp":4:39 146 {*zero_extendqisi2_aarch64}
 (nil))
(insn 9 7 15 2 (set (mem:QI (reg:DI 1 x1 [100]) [0 *dst_5(D)+0 S1 A8])
(reg:QI 0 x0 [orig:92 _1 ] [92])) "/app/example.cpp":5:10 62
{*movqi_aarch64}
 (nil))
(insn 15 9 16 2 (set (reg/i:SI 0 x0)
(sign_extend:SI (reg:QI 0 x0 [orig:92 _1 ] [92])))
"/app/example.cpp":7:1 142 {*extendqisi2_aarch64}
 (nil))

Which means that REE does not eliminate it.


Note on x86 we get before REE:
(insn 7 4 8 2 (set (reg:QI 0 ax [orig:98 _1 ] [98])
(mem:QI (reg:DI 5 di [104]) [0 *src_3(D)+0 S1 A8]))
"/app/example.cpp":4:39 93 {*movqi_internal}
 (nil))
(insn 8 7 9 2 (set (mem:QI (reg:DI 4 si [105]) [0 *dst_5(D)+0 S1 A8])
(reg:QI 0 ax [orig:98 _1 ] [98])) "/app/example.cpp":5:10 93
{*movqi_internal}
 (nil))
(insn 9 8 15 2 (set (reg:SI 0 ax [orig:103 _1 ] [103])
(sign_extend:SI (reg:QI 0 ax [orig:98 _1 ] [98])))
"/app/example.cpp":6:12 183 {extendqisi2}
 (nil))

So REE is able to move that sign extend back to the original load.
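
The source of /app/example.cpp is not quoted in this comment, so the function below is only a guess at its shape, inferred from the RTL: a QImode load, a QImode store of the same value, and a sign extension to SImode for the return value. The name and the use of `signed char` are assumptions.

```cpp
#include <cassert>

// Assumed shape of the test function: load a byte, store it elsewhere,
// and return it widened to int (which requires a sign extension).
int copy_and_return(const signed char* src, signed char* dst) {
    assert(src != nullptr && dst != nullptr);
    signed char value = *src;  // narrow load
    *dst = value;              // narrow store of the loaded value
    return value;              // widening to int sign-extends the byte
}
```

On a target where narrow loads are defined to zero-extend, the separate sign extension is what the report calls suboptimal; a single sign-extending load would serve both the store and the return.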

[Bug modula2/111756] Re-building all-gcc after source changes fails to link

2023-10-16 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111756

Gaius Mulley  changed:

   What|Removed |Added

  Attachment #56114|0   |1
is obsolete||

--- Comment #3 from Gaius Mulley  ---
Created attachment 56122
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56122=edit
Proposed patches v4 (implement all -M flags).  No Make-lang.in changes yet
though.

This patch implements all the -M* options.  It seems to work on small projects
outside GCC.  The next patch will include the changes to gcc/m2/Make-lang.in.

[Bug target/111829] Redudant register moves inside the loop

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111829

--- Comment #5 from Andrew Pinski  ---
I am 99% sure it is a dup of bug 94663 (and others).

[Bug tree-optimization/101541] Missing ABSU detection at gimple

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101541

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |14.0

--- Comment #10 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/101541] Missing ABSU detection at gimple

2023-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101541

--- Comment #9 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:c7609acb8a8210188d21b2cd72ecc6d3b2de2ab8

commit r14-4662-gc7609acb8a8210188d21b2cd72ecc6d3b2de2ab8
Author: Andrew Pinski 
Date:   Sun Oct 15 10:36:56 2023 -0700

MATCH: Improve `A CMP 0 ? A : -A` set of patterns to use bitwise_equal_p.

This improves the `A CMP 0 ? A : -A` set of match patterns to use
bitwise_equal_p which allows a nop cast between signed and unsigned.
This allows catching a few extra cases which were not being caught before.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/101541
* match.pd (A CMP 0 ? A : -A): Improve
using bitwise_equal_p.

gcc/testsuite/ChangeLog:

PR tree-optimization/101541
* gcc.dg/tree-ssa/phi-opt-36.c: New test.
* gcc.dg/tree-ssa/phi-opt-37.c: New test.
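
For readers unfamiliar with the pattern, here is a minimal sketch of source code with the `A CMP 0 ? A : -A` shape in which the compared value is signed but the selected values are its unsigned copy, that is, separated by a nop cast. This is an illustration only and is not claimed to match the new test files. The function name is hypothetical.

```cpp
#include <cassert>
#include <climits>

// Absolute value computed in unsigned arithmetic, so INT_MIN is well defined.
// The comparison uses the signed `a`; the selected values use its unsigned copy.
unsigned abs_as_unsigned(int a) {
    unsigned ua = static_cast<unsigned>(a);  // nop cast: same bits, new type
    return a < 0 ? -ua : ua;
}
```

Because negation happens on the unsigned copy, abs_as_unsigned(INT_MIN) is well defined and equals INT_MAX + 1 as an unsigned value.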

[Bug tree-optimization/14792] ((int)b & 1) != 0 is not folded to b & 1 != 0

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14792
Bug 14792 depends on bug 31531, which changed state.

Bug 31531 Summary: A microoptimization of isnegative of signed integer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31531

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug middle-end/31531] A microoptimization of isnegative of signed integer

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31531

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |14.0

--- Comment #22 from Andrew Pinski  ---
Fixed.

[Bug middle-end/31531] A microoptimization of isnegative of signed integer

2023-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31531

--- Comment #21 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:29a4453c7b8a86d242dab89b9e4d222749fd911e

commit r14-4661-g29a4453c7b8a86d242dab89b9e4d222749fd911e
Author: Andrew Pinski 
Date:   Sun Oct 15 15:18:42 2023 -0700

[PR31531] MATCH: Improve ~a < ~b and ~a < CST, allow a nop cast inbetween ~
and a/b

Currently we are able to simplify `~a CMP ~b` to `b CMP a`, but we should
allow a nop conversion in between the `~` and the `a`, which can show up.
A similar thing should be done for `~a CMP CST`.

I had originally submitted the `~a CMP CST` case as
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585088.html;
I noticed we should do the same thing for the `~a CMP ~b` case and combined
it with that one here.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/31531

gcc/ChangeLog:

* match.pd (~X op ~Y): Allow for an optional nop convert.
(~X op C): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr31531-1.c: New test.
* gcc.dg/tree-ssa/pr31531-2.c: New test.
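
The two equivalences from the commit message can be checked by hand with a small sketch. For unsigned values `~x` equals the maximum value minus `x`, so `~a < ~b` holds exactly when `b < a`, and `~a < C` holds exactly when `a > ~C`. The function names and the constant 10 are illustrative choices, not taken from the new tests.

```cpp
#include <cassert>

// `~a < ~b` with a nop conversion (int to unsigned) between `~` and the operand.
bool not_less_original(int a, int b) {
    return ~static_cast<unsigned>(a) < ~static_cast<unsigned>(b);
}

// The equivalent comparison with the operands swapped and both `~` removed.
bool not_less_simplified(int a, int b) {
    return static_cast<unsigned>(b) < static_cast<unsigned>(a);
}

// `~a < C` for a constant C, and its simplified form `a > ~C`.
bool not_less_constant_original(unsigned a) { return ~a < 10u; }
bool not_less_constant_simplified(unsigned a) { return a > ~10u; }
```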

[Bug fortran/111837] [8/9/10/11/12/13/14 Regression] Out of bounds access with optimization inside io-implied-do-control

2023-10-16 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111837

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org
   Priority|P3  |P4
   Keywords||wrong-code
Summary|[8,9,10,11,12,13|[8/9/10/11/12/13/14
   |Regression] Out of bounds   |Regression] Out of bounds
   |access with optimization|access with optimization
   |inside  |inside
   |io-implied-do-control   |io-implied-do-control
  Known to fail||8.5.0
  Known to work||7.5.0
   Target Milestone|--- |11.5
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-16

--- Comment #1 from anlauf at gcc dot gnu.org ---
Confirmed.

This is a frontend-optimization bug.

Workaround: compile with -fno-frontend-optimize .

[Bug c++/111831] friend with requires keyword compilation error

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111831

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection,
   ||needs-reduction,
   ||rejects-valid
   Host|Linux e5-2620v2 |
   |6.5.7-arch1-1 #1 SMP|
   |PREEMPT_DYNAMIC Tue, 10 Oct |
   |2023 21:10:21 + x86_64  |
   |GNU/Linux   |

--- Comment #3 from Andrew Pinski  ---
Seems to work on the trunk ...

[Bug c++/111831] friend with requires keyword compilation error

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111831

--- Comment #2 from Andrew Pinski  ---
Created attachment 56121
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56121=edit
preprocessed source

[Bug c++/111831] friend with requires keyword compilation error

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111831

Andrew Pinski  changed:

   What|Removed |Added

URL|https://github.com/user1095 |
   |108/xl/blob/master/list.cpp |

--- Comment #1 from Andrew Pinski  ---
https://github.com/user1095108/xl/blob/master/list.cpp

[Bug tree-optimization/111833] [14 Regression] GCC: 14: hangs on a simple for loop

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111833

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug target/55522] -funsafe-math-optimizations is unexpectedly harmful, especially w/ -shared

2023-10-16 Thread o.hlinka at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522

OH  changed:

   What|Removed |Added

 CC||o.hlinka at gmail dot com

--- Comment #47 from OH  ---
Will/Could this be back-ported to the 12.x or lower versions? (Wasn't clear to
me from previous comments if this would be the case).

[Bug fortran/111837] New: [8, 9, 10, 11, 12, 13 Regression] Out of bounds access with optimization inside io-implied-do-control

2023-10-16 Thread vladimir.fuka at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111837

Bug ID: 111837
   Summary: [8,9,10,11,12,13 Regression] Out of bounds access with
optimization inside io-implied-do-control
   Product: gcc
   Version: 13.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vladimir.fuka at gmail dot com
  Target Milestone: ---

The following code causes an out-of-bounds access in array ni(1) when optimized
with -O1 or higher with GCC 8 and higher. Based on
https://stackoverflow.com/questions/77300746/how-does-gfortran-with-optimization-flags-interpret-nested-implied-do-loops

program implied_do_bug
implicit none
integer :: i,j,k
real :: arr(1,1,1)
integer, dimension(:) :: ni(1)

ni(1) = 1
arr = 1

write(*,*) (((arr(i,j,k), i=1,ni(k)), j=1,1), k=1,1)
end program



With error checker:



> gfortran-13 -O1 q77300746.f90 -fcheck=all -g
> ./a.out 
At line 10 of file q77300746.f90
Fortran runtime error: Index '0' of dimension 1 of array 'ni' below lower bound
of 1

Error termination. Backtrace:
#0  0x4006e6 in implied_do_bug
at /home/lada/f/testy/stackoverflow//q77300746.f90:10
#1  0x400717 in main
at /home/lada/f/testy/stackoverflow//q77300746.f90:11








With address sanitization:



> gfortran-13 -O1 q77300746.f90 -fsanitize=address,undefined
> ./a.out 
=
==30012==ERROR: AddressSanitizer: stack-buffer-underflow on address
0x7fdf3930002c at pc 0x0040128b bp 0x7ffe56f222b0 sp 0x7ffe56f222a8
READ of size 4 at 0x7fdf3930002c thread T0
#0 0x40128a in MAIN__ (/home/lada/f/testy/stackoverflow/a.out+0x40128a)
(BuildId: 4f112b517d93d007bc1b001caf3ac9b317046f1c)
#1 0x401358 in main (/home/lada/f/testy/stackoverflow/a.out+0x401358)
(BuildId: 4f112b517d93d007bc1b001caf3ac9b317046f1c)
#2 0x7fdf3b76e24c in __libc_start_main (/lib64/libc.so.6+0x3524c) (BuildId:
171a59c1c43a8f7b93c3dff765aae0b675fe10f6)
#3 0x400b59 in _start ../sysdeps/x86_64/start.S:120

Address 0x7fdf3930002c is located in stack of thread T0 at offset 44 in frame
#0 0x400c15 in MAIN__ (/home/lada/f/testy/stackoverflow/a.out+0x400c15)
(BuildId: 4f112b517d93d007bc1b001caf3ac9b317046f1c)

  This frame has 4 object(s):
[48, 52) 'ni' (line 5) <== Memory access at offset 44 underflows this
variable
[64, 96) 'arr' (line 4)
[128, 240) 'parm.4' (line 10)
[272, 800) 'dt_parm.3' (line 10)
HINT: this may be a false positive if your program uses some custom stack
unwind mechanism, swapcontext or vfork
  (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-underflow
(/home/lada/f/testy/stackoverflow/a.out+0x40128a) (BuildId:
4f112b517d93d007bc1b001caf3ac9b317046f1c) in MAIN__
Shadow bytes around the buggy address:
  0x7fdf392ffd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7fdf392ffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7fdf392ffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7fdf392fff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7fdf392fff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x7fdf39300000: f1 f1 f1 f1 f1[f1]04 f2 00 00 00 00 f2 f2 f2 f2
  0x7fdf39300080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f2 f2
  0x7fdf39300100: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7fdf39300180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7fdf39300200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7fdf39300280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:   fa
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb
==30012==ABORTING

[Bug c/111808] [C23] constexpr with excess precision

2023-10-16 Thread Laurent.Rineau__gcc at normalesup dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111808

Laurent Rineau  changed:

   What|Removed |Added

 CC||Laurent.Rineau__gcc@normale
   ||sup.org

--- Comment #4 from Laurent Rineau  ---
Maybe `constexpr` evaluation of floating point expressions could be computed
using MPFR, instead of using the local hardware.

[Bug c/111836] New: gcc: internal compiler error: in get_expr_operands, at tree-ssa-operands.cc

2023-10-16 Thread congli at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111836

Bug ID: 111836
   Summary: gcc: internal compiler error: in get_expr_operands, at
tree-ssa-operands.cc
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: congli at smail dot nju.edu.cn
  Target Milestone: ---

Compiler explorer: https://godbolt.org/z/v3q61PETv.

The following program `small.c` triggers a crash in gcc-14:

``` sh
% cat small.c
void __GIMPLE (ssa) t () {
  i = __PHI (__BB6: _Literal (i32) 1, __BB4: j);
}

% gcc-tk -O0 small.c
:1:6: error: '__GIMPLE' only valid with '-fgimple'
1 | void __GIMPLE (ssa) t () {
  |  ^~~~
: In function 't':
:2:3: error: 'i' undeclared (first use in this function)
2 |   i = __PHI (__BB6: _Literal (i32) 1, __BB4: j);
  |   ^
:2:3: note: each undeclared identifier is reported only once for each
function it appears in
:2:31: error: unknown type name 'i32'
2 |   i = __PHI (__BB6: _Literal (i32) 1, __BB4: j);
  |   ^~~
:2:46: error: 'j' undeclared (first use in this function)
2 |   i = __PHI (__BB6: _Literal (i32) 1, __BB4: j);
  |  ^
unhandled expression in get_expr_operands():
 

:3:1: internal compiler error: in get_expr_operands, at
tree-ssa-operands.cc:940
3 | }
  | ^
0x230184e internal_error(char const*, ...)
  ???:0
0x9fb842 fancy_abort(char const*, int, char const*)
  ???:0
0x12e301d operands_scanner::get_expr_operands(tree_node**, int)
  ???:0
0x12e35c2 operands_scanner::parse_ssa_operands()
  ???:0
0x12e43ea operands_scanner::build_ssa_operands()
  ???:0
0x12e4634 update_stmt_operands(function*, gimple*)
  ???:0
0xd62007 update_modified_stmts(gimple*)
  ???:0
0xd620e9 gsi_insert_seq_after(gimple_stmt_iterator*, gimple*,
gsi_iterator_update)
  ???:0
0xaad40e c_parser_parse_gimple_body(c_parser*, char*, c_declspec_il,
profile_count)
  ???:0
0xaa3dfd c_parse_file()
  ???:0
0xb17139 c_common_parse_file()
  ???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
```

GCC version:

```
gcc
(Compiler-Explorer-Build-gcc-d5cfabc677b08f38ea5d5f85deeda746b4fabb88-binutils-2.40)
14.0.0 20231016 (experimental)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

[Bug middle-end/111835] New: Suboptimal codegen: zero extended load instead of sign extended one

2023-10-16 Thread lis8215 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111835

Bug ID: 111835
   Summary: Suboptimal codegen: zero extended load instead of sign
extended one
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lis8215 at gmail dot com
  Target Milestone: ---

In this simplified example:

int test (const uint8_t * src, uint8_t * dst)
{
int8_t tmp = (int8_t)*src;
*dst = tmp;
return tmp;
}

GCC prefers a zero-extending load over the more natural sign-extending load.
It then needs an explicit sign extension to produce the return value.

I know there are a lot of bugs related to zero/sign extension, but I guess this
is a rare special case, and it reproduces in every GCC version available at
godbolt and on every architecture except x86-64.

[Bug fortran/110644] Error in gfc_format_decoder

2023-10-16 Thread kyle.shores44 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110644

--- Comment #3 from Kyle Shores  ---
I'll try to create a smaller example, but as y'all know this can be hard...

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794

--- Comment #10 from Robin Dapp  ---
From what I can tell with my barely working connection, there are no
regressions on x86, aarch64 or power10 with the adjusted check.

[Bug c/111834] GCC: 14: out of memory when __builtin_return_address receive a large constant

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111834

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Dup of bug 105910

*** This bug has been marked as a duplicate of bug 105910 ***

[Bug middle-end/105910] [11/12/13/14 Regression] __builtin_return_address expansion with a large # causes a compile time issues and even ICEs sometimes

2023-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105910

Andrew Pinski  changed:

   What|Removed |Added

 CC||141242068 at smail dot nju.edu.cn

--- Comment #7 from Andrew Pinski  ---
*** Bug 111834 has been marked as a duplicate of this bug. ***

[Bug c/111834] New: GCC: 14: out of memory when __builtin_return_address receive a large constant

2023-10-16 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111834

Bug ID: 111834
   Summary: GCC: 14: out of memory when __builtin_return_address
receive a large constant
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

Compiler Explorer: https://gcc.godbolt.org/z/7d63G6fWT

Testcase is pasted below:
```
void *retaddr;

void foo (void) {
  retaddr = __builtin_return_address (1084850891);
}
```

When compiled with GCC 14, it quickly uses up all 8 GB of memory on my PC.

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2023-10-16 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #13 from Mikael Morin  ---
(In reply to Tamar Christina from comment #12)
> (In reply to Mikael Morin from comment #11)
> > Created attachment 56094 [details]
> > Improved patch
> > 
> > This improved patch (still single argument only) passes the fortran
> > regression testsuite.
> > 
> 
> Awesome! Thanks! it looks like the benchmark always uses dim=1 or the mask
> argument.
> 
> Can you give a hint into what I'd need to do to add the additional params?

For the mask argument, I hope it would just need
gfc_inline_intrinsic_function_p to return true to work.

For the dim argument, it's a bit more complicated. You can have a look at how
the dim argument support for sum and product was introduced here:
https://gcc.gnu.org/pipermail/fortran/2011-October/037574.html
To describe in a few words the needed bits:
 - gfc_walk_inline_intrinsic_function needs to collect the arrays involved in
scalarization.  In the case where dim is absent, there is just the result of
minloc and it can't be decomposed further.  If dim is present on the other
hand, the arrays are the non-dim arguments, and there is one nested loop
reducing those arrays' dimension by one.  One important thing to pay attention
to: the arrays must be present in the same order in which they will be consumed
by gfc_conv_intrinsic_minmaxloc later.
 - All the calls to gfc_walk_expr in gfc_conv_intrinsic_minmaxloc should be
disabled in favor of enter_nested_loop.
 - Setting of gfc_se::ss pointers should be disabled, as they should come
correct from initialization.
 - The call to gfc_cleanup_loop should be disabled

These 4 points are very similar to the sum/product patch mentioned above.
Additionally, one has to disable the changes to support the other cases of
{min,max}loc.  Possibly it's easier to start with an unpatched master and merge
the patches afterwards.

Anyway, I should be able to help, maybe by the end of this week, or next week.

[Bug tree-optimization/36010] Loop interchange not performed

2023-10-16 Thread aagarwa at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36010

Ajit Kumar Agarwal  changed:

   What|Removed |Added

 CC||aagarwa at gcc dot gnu.org

--- Comment #5 from Ajit Kumar Agarwal  ---
Use the following flags to make loop interchange work:
-fassociative-math -fno-signed-zeros -fno-trapping-math

Following code in gcc/gimple-loop-interchange.cc
@@ -514,8 +514,8 @@ loop_cand::analyze_iloop_reduction_var (tree var)
   if (! (associative_tree_code (code)
 || (code == MINUS_EXPR
 && use_p->use == gimple_assign_rhs1_ptr (ass)))
 || (FLOAT_TYPE_P (TREE_TYPE (var))
   && ! flag_associative_math))
 return false;
 }
   else

Because of the flag_associative_math condition at line 514, the function
returns false and loop interchange is not performed. Using the above flags
makes flag_associative_math true, and loop interchange works.

[Bug c++/111272] [13/14 Regression] Truncated error messages with -std=c++23 and -std=c++26

2023-10-16 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111272

Marek Polacek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Marek Polacek  ---
Fixed.

[Bug c++/111272] [13/14 Regression] Truncated error messages with -std=c++23 and -std=c++26

2023-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111272

--- Comment #6 from CVS Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:a22eeaca5ce753a0a3c22013ee3ecde04c71c2f4

commit r14-4659-ga22eeaca5ce753a0a3c22013ee3ecde04c71c2f4
Author: Marek Polacek 
Date:   Fri Oct 13 16:47:47 2023 -0400

c++: fix truncated diagnostic in C++23 [PR111272]

In C++23, since P2448, a constexpr function F that calls a non-constexpr
function N is OK as long as we don't actually call F in a constexpr
context.  So instead of giving an error in maybe_save_constexpr_fundef,
we only give an error when evaluating the call.  Unfortunately, as shown
in this PR, the diagnostic can be truncated:

z.C:10:13: note: 'constexpr Jam::Jam()' is not usable as a 'constexpr'
function because:
   10 |   constexpr Jam() { ft(); }
  | ^~~

...because what?  With this patch, we say:

z.C:10:13: note: 'constexpr Jam::Jam()' is not usable as a 'constexpr'
function because:
   10 |   constexpr Jam() { ft(); }
  | ^~~
z.C:10:23: error: call to non-'constexpr' function 'int Jam::ft()'
   10 |   constexpr Jam() { ft(); }
  | ~~^~
z.C:8:7: note: 'int Jam::ft()' declared here
8 |   int ft() { return 42; }
  |   ^~

Like maybe_save_constexpr_fundef, explain_invalid_constexpr_fn should
also check the body of a constructor, not just the mem-initializer.

PR c++/111272

gcc/cp/ChangeLog:

* constexpr.cc (explain_invalid_constexpr_fn): Also check the body of
a constructor in C++14 and up.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-diag1.C: New test.

[Bug c/111833] GCC: 14: hangs on a simple for loop

2023-10-16 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111833

--- Comment #1 from wierton <141242068 at smail dot nju.edu.cn> ---
The Compiler Explorer link above is broken; here is the correct link:
https://gcc.godbolt.org/z/779zzjcze

[Bug c/111833] New: GCC: 14: hangs on a simple for loop

2023-10-16 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111833

Bug ID: 111833
   Summary: GCC: 14: hangs on a simple for loop
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

Compiler Explorer: [https://gcc.godbolt.org/](https://gcc.godbolt.org/)

Here is the code snippet that is causing the issue:

```c
unsigned char a[1];
unsigned char b;

void f(void) {
  unsigned char r = 0;
  int n;
  for (n = 8; n / sizeof(a); ++n) {
b += b;
r += b;
  }
}
```

This bug shares the same options as bug #111820, but the root cause seems to be
different:

While bug #111820 appears to get stuck in a loop when the initial value is zero
(as suggested by the comment `Maybe the loop should terminate when begin is
zero.`), this particular bug causes a hang starting from the initial value of
8. This suggests that the issue may not be tied to the initial loop value, but
some other aspect of the loop's iteration or the operations within it.

[Bug c/111832] New: RISC-V: ICE on dynamic LMUL

2023-10-16 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111832

Bug ID: 111832
   Summary: RISC-V: ICE on dynamic LMUL
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

#include 

#define INDEX16 int16_t

#define TEST_LOOP(DATA_TYPE, BITS)                                             \
  void __attribute__ ((noinline, noclone))                                     \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
  {                                                                            \
    for (int i = 0; i < 128; ++i)                                              \
      if (cond[i])                                                             \
        dest[i] += src[indices[i]];                                            \
  }

#define TEST_ALL(T)                                                            \
  T (int8_t, 16)                                                               \

TEST_ALL (TEST_LOOP)

ICE with --param=riscv-autovec-lmul=dynamic

dump file: auto.c.175t.vect
auto.c: In function 'f_int8_t':
auto.c:7:3: internal compiler error: in compute_nregs_for_mode, at
config/riscv/riscv-vector-costs.cc:269
7 |   f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,
   \
  |   ^~
auto.c:16:3: note: in expansion of macro 'TEST_LOOP'
   16 |   T (int8_t, 16)   
   \
  |   ^
auto.c:18:1: note: in expansion of macro 'TEST_ALL'
   18 | TEST_ALL (TEST_LOOP)
  | ^~~~
0x1ed4cc0 compute_nregs_for_mode
../../../../gcc/gcc/config/riscv/riscv-vector-costs.cc:269
0x1ed4ebc max_number_of_live_regs
../../../../gcc/gcc/config/riscv/riscv-vector-costs.cc:304
0x1ed634b riscv_vector::costs::preferred_new_lmul_p(vector_costs const*) const
../../../../gcc/gcc/config/riscv/riscv-vector-costs.cc:600
0x1ed658f riscv_vector::costs::better_main_loop_than_p(vector_costs const*)
const
../../../../gcc/gcc/config/riscv/riscv-vector-costs.cc:641
0x1bd7658 vect_better_loop_vinfo_p
../../../../gcc/gcc/tree-vect-loop.cc:3296
0x1bd7680 vect_joust_loop_vinfos
../../../../gcc/gcc/tree-vect-loop.cc:3306
0x1bd8438 vect_analyze_loop(loop*, vec_info_shared*)
../../../../gcc/gcc/tree-vect-loop.cc:3518
0x1c4b5dd try_vectorize_loop_1
../../../../gcc/gcc/tree-vectorizer.cc:1064
0x1c4baee try_vectorize_loop
../../../../gcc/gcc/tree-vectorizer.cc:1182
0x1c4bd9d execute
../../../../gcc/gcc/tree-vectorizer.cc:1296


Reference:
https://gcc.godbolt.org/z/e8oEjsa89

[Bug c++/111831] New: friend with requires keyword compilation error

2023-10-16 Thread janezz55 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111831

Bug ID: 111831
   Summary: friend with requires keyword compilation error
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: janezz55 at gmail dot com
  Target Milestone: ---

If I compile list.cpp with:

g++ -std=c++20 list.cpp -o l

friend auto operator==(list const& l, list const& r)
  noexcept(noexcept(
  std::equal(l.begin(), l.end(), r.begin(), r.end())
)
  )
  requires(requires{std::equal(l.begin(), l.end(), r.begin(), r.end());})
{
  return std::equal(l.begin(), l.end(), r.begin(), r.end());
}

gcc-13.2.1 will respond with a compilation error:

error: no match for 'operator==' (operand types are 'xl::list' and
'xl::list')
   51 |   std::cout << a.size() << " " << b.size() << " " << (a == b) <<
std::endl;
  |   ~ ^~ ~
  |   ||
  |   |list<[...]>
  |   list<[...]>

but clang-16.0.6 won't. I think the bug is related to the friend declaration. I
have prepared an example:

https://github.com/user1095108/xl/blob/master/list.cpp

The relevant requires inside list.hpp is commented out, since the requires
serves no purpose at all in this case. Please uncomment, if you decide to
investigate.

[Bug tree-optimization/111515] [14 Regression] Missed Dead Code Elimination since r14-4089-gd45ddc2c04e

2023-10-16 Thread theodort at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111515

--- Comment #4 from Theodoros Theodoridis  ---
It turns out that the unreduced test case also depends on not inlining into
main. I will be more careful with filtering out such cases in the future.

[Bug libstdc++/111726] lgamma usage in std::poisson_distribution could cause a Data race

2023-10-16 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111726

--- Comment #3 from Jonathan Wakely  ---
(In reply to Andrew Pinski from comment #1)
> I am not 100% sure but since _M_initialize does not use signgam, this is
> just 2 writes to a global variable that will not be read so the data write
> race is 100% ok.

This example might not misbehave (even though it's technically UB) but if
somebody constructs a std::poisson_distribution in one thread and uses lgamma in
another thread, the lgamma call might get the wrong result.

So in the general case, this data race can cause incorrect results (as well as
being UB).

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2023-10-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #12 from Tamar Christina  ---
(In reply to Mikael Morin from comment #11)
> Created attachment 56094 [details]
> Improved patch
> 
> This improved patch (still single argument only) passes the fortran
> regression testsuite.
> 

Awesome! Thanks! it looks like the benchmark always uses dim=1 or the mask
argument.

Can you give a hint into what I'd need to do to add the additional params?

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794

--- Comment #9 from Robin Dapp  ---
Yes, that's from pattern recog:

slp.c:11:20: note:   === vect_pattern_recog ===
slp.c:11:20: note:   vect_recog_mask_conversion_pattern: detected: _5 = _2 &
_4;
slp.c:11:20: note:   mask_conversion pattern recognized: patt_157 = patt_156 &
_4;
slp.c:11:20: note:   extra pattern stmt: patt_156 = () _2;
slp.c:11:20: note:   vect_recog_bool_pattern: detected: _6 = (int) _5;
slp.c:11:20: note:   bool pattern recognized: patt_159 = (int) patt_158;
slp.c:11:20: note:   extra pattern stmt: patt_158 = _5 ? 1 : 0;
slp.c:11:20: note:   vect_recog_mask_conversion_pattern: detected: _11 = _8 &
_10;
slp.c:11:20: note:   mask_conversion pattern recognized: patt_161 = patt_160 &
_10;
slp.c:11:20: note:   extra pattern stmt: patt_160 = () _8;
...

In vect_recog_mask_conversion_pattern we arrive at

  if (TYPE_PRECISION (rhs1_type) < TYPE_PRECISION (rhs2_type))
{
  vectype1 = get_mask_type_for_scalar_type (vinfo, rhs1_type);
  if (!vectype1)
return NULL;
  rhs2 = build_mask_conversion (vinfo, rhs2, vectype1, stmt_vinfo);
}
  else
{
  vectype1 = get_mask_type_for_scalar_type (vinfo, rhs2_type);
  if (!vectype1)
return NULL;
  rhs1 = build_mask_conversion (vinfo, rhs1, vectype1, stmt_vinfo);
}
  lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
  pattern_stmt = gimple_build_assign (lhs, rhs_code, rhs1, rhs2);


vectype1 is then e.g. vector([8,8]) .  Then
vect_recog_bool_pattern creates the COND_EXPR.

Testsuites are running with your proposed change.

[Bug tree-optimization/111830] "omp simd reduction" cannot collaborate well with “loop peeling”.

2023-10-16 Thread guojie at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111830

--- Comment #1 from Guo Jie  ---
Details in PR111403.

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794

--- Comment #8 from rguenther at suse dot de  ---
On Mon, 16 Oct 2023, rdapp at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794
> 
> --- Comment #7 from Robin Dapp  ---
>   vectp.4_188 = x_50(D);
>   vect__1.5_189 = MEM  [(int *)vectp.4_188];
>   mask__2.6_190 = { 1, 1, 1, 1, 1, 1, 1, 1 } == vect__1.5_189;
>   mask_patt_156.7_191 = VIEW_CONVERT_EXPR >(mask__2.6_190);
>   _1 = *x_50(D);
>   _2 = _1 == 1;
>   vectp.9_192 = y_51(D);
>   vect__3.10_193 = MEM  [(short int *)vectp.9_192];
>   mask__4.11_194 = { 2, 2, 2, 2, 2, 2, 2, 2 } == vect__3.10_193;
>   mask_patt_157.12_195 = mask_patt_156.7_191 & mask__4.11_194;
>   vect_patt_158.13_196 = VEC_COND_EXPR  1, 1, 1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
>   vect_patt_159.14_197 = (vector(8) int) vect_patt_158.13_196;
> 
> 
> This yields the following assembly:
> vsetivli zero,8,e32,m2,ta,ma
> vle32.v v2,0(a0)
> vmv.v.i v4,1
> vle16.v v1,0(a1)
> vmseq.vv v0,v2,v4
> vsetvli zero,zero,e16,m1,ta,ma
> vmseq.vi v1,v1,2
> vsetvli zero,zero,e32,m2,ta,ma
> vmv.v.i v2,0
> vmand.mm v0,v0,v1
> vmerge.vvm  v2,v2,v4,v0
> vse32.v v2,0(a0)
> 
> Apart from CSE'ing v4 this looks pretty good to me.  My connection is really
> poor at the moment so I cannot quickly compare what aarch64 does for that
> example.

That looks reasonable.  Note this then goes through
vectorizable_assignment as a no-op move.  The question is
if we can arrive here with signed bool : 2 vs. _Bool : 2
somehow (I wonder how we arrive with signed bool : 1 here - that's
from pattern recog, right?  why didn't that produce a
COND_EXPR for this?).

I think for more thorough testing the condition should change to

  /* But a conversion that does not change the bit-pattern is ok.  */
  && !(INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
   && INTEGRAL_TYPE_P (TREE_TYPE (op))
   && ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
   > TYPE_PRECISION (TREE_TYPE (op)))
   && TYPE_UNSIGNED (TREE_TYPE (op
   || TYPE_PRECISION (TREE_TYPE (scalar_dest))
  == TYPE_PRECISION (TREE_TYPE (op)

rather than just doing >= which would be odd (why allow
to skip sign-extending from the unsigned MSB but not allow
to skip zero-extending from it)

[Bug tree-optimization/111830] New: "omp simd reduction" cannot collaborate well with “loop peeling”.

2023-10-16 Thread guojie at loongson dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111830

Bug ID: 111830
   Summary: "omp simd reduction" cannot collaborate well with
“loop peeling”.
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Keywords: openmp
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: guojie at loongson dot cn
  Target Milestone: ---

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-10-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #20 from Kewen Lin  ---
(In reply to Richard Biener from comment #19)
> So maybe it's the same issue as PR90348 (you can verify the RTL expansion
> dump on whether the two involved decls are coalesced and see whether that's
> valid).

Thanks for the hints! Unfortunately the internal BE machine which I worked on
for this is unreachable today, will post more findings when it comes back.

[Bug target/111522] Different code path for static initialization with flto

2023-10-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111522

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 CC||rguenth at gcc dot gnu.org
 Status|WAITING |RESOLVED

--- Comment #12 from Kewen Lin  ---
(In reply to Mathieu Malaterre from comment #11)
> Here is a dead simple reduced version:
> 
> ```
> % cat pr111522.cc
> #include 
> #include 
> #pragma GCC push_options
> #pragma GCC target "cpu=power10"
> float BitCast(int in) {
>   float out;
>   memcpy(&out, &in, sizeof(out));
>   return out;
> }
> float kNearOneF = BitCast(1065353215);
> #pragma GCC pop_options
> int main() { std::cout << kNearOneF << std::endl; }
> ```
> 
> You can compare:
> 
> g++ -o works -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
> 
> vs
> 
> g++ -o fails -flto -O2 pr111522.cc -Wall -Wextra -Werror -Wfatal-errors
> 
> For some reason, `-flto` rightfully generates a `xxspltidp` instruction:
> 
> (gdb) display/i $pc
> 1: x/i $pc
> => 0x10940 <_Z7BitCasti.constprop.0>:   xxspltidp vs1,1065353215
> 
> I am not sure I understand the behavior of the non LTO case now...

I think this is a test issue. The given source code claims it wants to compile
the function BitCast with -mcpu=power10, it's valid to generate power10 insns
for it and its specialized ones.

Without LTO, no power10 insn helps the general BitCast, so the generated insns
look like:

1b10 <_Z7BitCasti>:
1b10:   c6 07 69 78 rldicr  r9,r3,32,31
1b14:   66 01 29 7c mtfprd  f1,r9
1b18:   2c 0d 20 f0 xscvspdpn vs1,vs1
1b1c:   20 00 80 4e blr

while with LTO, function versioning is able to create one specialized function
with fixed argument 1065353215, then the newly created one is able to leverage
power10 insn so we have:

// specialized with const argument propagate 
1840 <_Z7BitCasti.constprop.0>:
1840:   7f 3f 00 05 xxspltidp vs1,1065353215
1844:   ff ff 24 80
1848:   20 00 80 4e blr

while the global variable initialization still uses power8 insns:

1940 <_GLOBAL__sub_I__Z7BitCasti>:
1940:   02 10 40 3c lis r2,4098
1944:   00 7f 42 38 addi    r2,r2,32512
1948:   a6 02 08 7c mflr    r0
194c:   10 00 01 f8 std r0,16(r1)
1950:   e1 ff 21 f8 stdu    r1,-32(r1)
1954:   dd fe ff 4b bl  1830 <0184.long_branch.184:6>
1958:   18 00 41 e8 ld  r2,24(r1)
195c:   20 00 21 38 addi    r1,r1,32
1960:   00 00 00 60 nop
1964:   10 00 01 e8 ld  r0,16(r1)
1968:   5c 81 22 d0 stfs    f1,-32420(r2)
196c:   a6 03 08 7c mtlr    r0
1970:   20 00 80 4e blr

If we specify -mcpu=power10 -flto, we can see that _GLOBAL__sub_I__Z7BitCasti
directly adopts p10 insns (this implicitly indicates that with the default
-mcpu=power8, the inliner considers it unsafe to inline _Z7BitCasti.constprop.0):

1900 <_GLOBAL__sub_I__Z7BitCasti>:
1900:   7f 3f 00 05 xxspltidp vs0,1065353215
1904:   ff ff 04 80
1908:   01 00 10 06 pstfs   f0,128852   # 1002005c 
190c:   54 f7 00 d0
1910:   20 00 80 4e blr

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794

--- Comment #7 from Robin Dapp  ---
  vectp.4_188 = x_50(D);
  vect__1.5_189 = MEM  [(int *)vectp.4_188];
  mask__2.6_190 = { 1, 1, 1, 1, 1, 1, 1, 1 } == vect__1.5_189;
  mask_patt_156.7_191 = VIEW_CONVERT_EXPR>(mask__2.6_190);
  _1 = *x_50(D);
  _2 = _1 == 1;
  vectp.9_192 = y_51(D);
  vect__3.10_193 = MEM  [(short int *)vectp.9_192];
  mask__4.11_194 = { 2, 2, 2, 2, 2, 2, 2, 2 } == vect__3.10_193;
  mask_patt_157.12_195 = mask_patt_156.7_191 & mask__4.11_194;
  vect_patt_158.13_196 = VEC_COND_EXPR ;
  vect_patt_159.14_197 = (vector(8) int) vect_patt_158.13_196;


This yields the following assembly:
vsetivli zero,8,e32,m2,ta,ma
vle32.v v2,0(a0)
vmv.v.i v4,1
vle16.v v1,0(a1)
vmseq.vv v0,v2,v4
vsetvli zero,zero,e16,m1,ta,ma
vmseq.vi v1,v1,2
vsetvli zero,zero,e32,m2,ta,ma
vmv.v.i v2,0
vmand.mm v0,v0,v1
vmerge.vvm  v2,v2,v4,v0
vse32.v v2,0(a0)

Apart from CSE'ing v4 this looks pretty good to me.  My connection is really
poor at the moment so I cannot quickly compare what aarch64 does for that
example.

[Bug libstdc++/111726] lgamma usage in std::poisson_distribution could cause a Data race

2023-10-16 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111726

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-16
 Ever confirmed|0   |1

--- Comment #2 from Jonathan Wakely  ---
We should use the non-standard but thread-safe lgamma_r if available.

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794

--- Comment #6 from rguenther at suse dot de  ---
On Mon, 16 Oct 2023, rdapp at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794
> 
> --- Comment #5 from Robin Dapp  ---
> Disregarding the reasons for the precision adjustment, for this case here, we
> seem to fail at:
> 
>   /* We do not handle bit-precision changes.  */
>   if ((CONVERT_EXPR_CODE_P (code)
>|| code == VIEW_CONVERT_EXPR)
>   && ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
>&& !type_has_mode_precision_p (TREE_TYPE (scalar_dest)))
>   || (INTEGRAL_TYPE_P (TREE_TYPE (op))
>   && !type_has_mode_precision_p (TREE_TYPE (op
>   /* But a conversion that does not change the bit-pattern is ok.  */
>   && !(INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
>&& INTEGRAL_TYPE_P (TREE_TYPE (op))
>&& (TYPE_PRECISION (TREE_TYPE (scalar_dest))
>> TYPE_PRECISION (TREE_TYPE (op)))
>&& TYPE_UNSIGNED (TREE_TYPE (op
> {
>   if (dump_enabled_p ())
> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>  "type conversion to/from bit-precision "
>  "unsupported.\n");
>   return false;
> }
> 
> for the expression
>  patt_156 = () _2;
> where _2 (op) is of type _Bool (i.e. TYPE_MODE QImode) and patt_156
> (scalar_dest) is signed-boolean:1.  In that case the mode's precision (8) does
> not match the type's precision (1) for both op and _scalar_dest.
> 
> The second part of the condition I don't fully get.  When does a conversion
> change the bit pattern?  When the source has higher precision than the dest we
> would need to truncate which we probably don't want.  When the dest has higher
> precision that's considered ok?  What about equality?
> 
> If both op and dest have precision 1 the padding could differ (or rather the 1
> could be at different positions) but do we even support that?  In other words,
> could we relax the condition to TYPE_PRECISION (TREE_TYPE (scalar_dest)) >=
> TYPE_PRECISION (TREE_TYPE (op)) (>= instead of >)?
> 
> FWIW bootstrap and testsuite unchanged with >= instead of > on x86, aarch64
> and power10 but we might not have a proper test for that?

It's about sign- vs. zero-extending into padding.  What kind of code
does the vectorizer emit?

[Bug tree-optimization/111820] [13/14 Regression] Compiler time hog in the vectorizer with `-O3 -fno-tree-vrp`

2023-10-16 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820

--- Comment #6 from rguenther at suse dot de  ---
On Mon, 16 Oct 2023, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111820
> 
> --- Comment #5 from Hongtao.liu  ---
> (In reply to Richard Biener from comment #3)
> > for (unsigned i = 0; i != skipn - 1; i++)
> >   begin = wi::mul (begin, wi::to_wide (step_expr));
> > 
> > (gdb) p skipn
> > $5 = 4294967292
> > 
> > niters is 4294967292 in vect_update_ivs_after_vectorizer.  Maybe the loop
> > should terminate when begin is zero.  But I wonder why we pass in 'niters'
> Here, it wants to calculate begin * pow (step_expr, skipn); yes, we can just
> skip the loop when begin is 0.

I mean terminate it when the multiplication overflowed to zero.

As for the MASK_ thing, the skip is to be interpreted as negative (we
should either not use a 'tree' here or maybe make it have the correct
type).  Can we even handle this here?  It would need to be
a division, no?

So I think we need to disable non-linear IV or masked peeling for
niter/alignment?  But I wonder how we run into this with plain -O3.
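
For illustration, a minimal sketch of the "terminate when the
multiplication overflowed to zero" idea. It uses plain 32-bit unsigned
arithmetic instead of GCC's wide_int, and the function name
mul_pow_wrapping is hypothetical, so treat it as a model of the loop
rather than a patch.

```cpp
#include <cstdint>

// Compute begin * step^n modulo 2^32, leaving the loop as soon as the
// running product has wrapped to zero (further multiplications cannot
// change it any more).
constexpr std::uint32_t mul_pow_wrapping(std::uint32_t begin,
                                         std::uint32_t step,
                                         std::uint32_t n)
{
  for (std::uint32_t i = 0; i != n && begin != 0; ++i)
    begin *= step;  // unsigned multiplication wraps modulo 2^32
  return begin;
}

// With an even step the product reaches zero after at most 32 iterations,
// even for a huge iteration count such as the 4294967292 seen in gdb.
static_assert(mul_pow_wrapping(1, 2, 4294967292u) == 0, "wraps to zero");
static_assert(mul_pow_wrapping(3, 3, 2) == 27, "small case: 3 * 3^2");
```

Note that the early exit only helps when the step is even: with an odd
step and a nonzero begin the product never becomes zero modulo 2^32, so
this check alone does not bound the loop in that case.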

[Bug libstdc++/111826] __cpp_lib_format should be 202110, not 202106

2023-10-16 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111826

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-10-16

[Bug libstdc++/111824] [14 Regression] is invalid under -U__STRICT_ANSI__ -std=c++11

2023-10-16 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111824

--- Comment #5 from Jonathan Wakely  ---
Feel free to quote me in a bug report to monotone

[Bug libstdc++/111824] [14 Regression] is invalid under -U__STRICT_ANSI__ -std=c++11

2023-10-16 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111824

--- Comment #4 from Jonathan Wakely  ---
It's just idiotic to request strict mode and then undefine the macro that tells
the library the compiler is being strict. How can the library know if it can
use extensions if you lie to it about the compiler's (lack of) support for
those extensions?

Quite apart from the stupidity of doing it, that is not a macro that users are
allowed to define/undefine, it's in the implementation space and not documented
as one that users can/should mess with.

Just don't do this.
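
A minimal sketch of the kind of check involved (not the actual libstdc++
source). GCC predefines __STRICT_ANSI__ for -std=c++11 and leaves it
undefined for -std=gnu++11, and a library can only base its decision on
that macro. The names compiler_is_strict and library_may_use_extensions
are hypothetical.

```cpp
// What the compiler reports about its own mode.
#ifdef __STRICT_ANSI__
constexpr bool compiler_is_strict = true;   // e.g. -std=c++11
#else
constexpr bool compiler_is_strict = false;  // e.g. -std=gnu++11
#endif

// Hypothetical library-side decision: extensions are only used when the
// compiler does not claim to be in strict mode.
constexpr bool library_may_use_extensions()
{
  return !compiler_is_strict;
}
```

With -std=c++11 -U__STRICT_ANSI__ this sketch would report that extensions
may be used even though the compiler is in strict mode and may reject them.
Requesting -std=gnu++11 instead keeps the macro and the compiler mode
consistent.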

[Bug bootstrap/111812] [14 regression] Can't build with gcc 4.8.5

2023-10-16 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111812

Roger Sayle  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-10-16
  Known to work||13.0
  Known to fail||14.0
   Host|powerpc64-linux-gnu |*-linux-gnu
 Status|UNCONFIRMED |NEW
 Target|powerpc64-linux-gnu |
 CC||roger at nextmovesoftware dot com
  Build|powerpc64-linux-gnu |
