[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

--- Comment #4 from Robin Dapp  ---
I cannot reproduce it either, tried with -ftree-vectorize as well as
-fno-vect-cost-model.

[Bug target/113609] EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF.

2024-01-25 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609

--- Comment #2 from Uroš Bizjak  ---
(In reply to Hongtao Liu from comment #1)
> Since they're different modes, CCZ for cmp, but CCS for kortest, it could be
> difficult to optimize it in RA stage by adding alternatives (like we did for
> compared to 0). So the easy way could be adding a peephole to handle that.

You can use a pre-reload split for this. Please see for example how *jcc_bt
and *bt_setcqi provide a compound operation (constructed from a CCZmode
compare) that is later split to a CCCmode operation. You will have to provide
jcc and setcc patterns to fully handle the mode change.

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #3 from Richard Biener  ---
I'll note that two-lane reductions in particular (or two-lane BB vectorization
in general) are hardly profitable on modern x86 uarchs unless the vectorized
code is interleaved with other non-vectorized code that can execute at the same
time.  Vectorizing two lanes will only make them dependent on each other, while
when not vectorized, modern uarchs have no difficulty executing them in
parallel (without the tied dependences).  It's only when there's sufficient
benefit, i.e. more lanes, approaching the issue width or the number of available
ports for the ops, or the whole SLP mostly consisting of loads/stores, that BB
vectorization is going to be profitable.  Note the cost model only ever looks
at the stmts participating in the vectorization, not the "surrounding" code,
and it would be difficult to include that since the schedule on GIMPLE isn't
even close to what we get later.  The reduction op is also a serialization
point on the scalar side of course; whether that means that BB reductions
with two lanes are possibly better candidates than grouped BB stores with
two lanes is another question.

The BB reduction op itself is costed properly.

So the 525.x264_r case might be loop vectorization, OTOH the epilogue
cost is hardly ever a knob that decides whether a vectorization is profitable.

I think we need to figure out what exactly gets slower (and hope it's not
scattered all over the place)
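
For readers unfamiliar with the term, the following is a minimal sketch of
what a two-lane BB reduction looks like at the source level. The function
name `dot2` is hypothetical and the code is not taken from 525.x264_r;
whether GCC actually vectorizes it depends on the target and the cost model.

```c
/* Hypothetical illustration; not taken from 525.x264_r.  */

/* Two independent multiplies feeding one add: the shape of a two-lane
   reduction that a basic-block (SLP) vectorizer may consider.  Left
   scalar, the two multiplies do not depend on each other, so an
   out-of-order core can issue them in parallel.  */
int
dot2 (const int *a, const int *b)
{
  int lane0 = a[0] * b[0];
  int lane1 = a[1] * b[1];
  return lane0 + lane1;   /* the reduction op: a serialization point */
}
```

Vectorized, both lanes would go through one vector multiply followed by a
horizontal add, which ties the lanes together as described above.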

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #16 from Andrew Pinski  ---
Fixed.

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

--- Comment #15 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:bfd6b36f08021f023e0e9223f5aea315b74a5c56

commit r14-8443-gbfd6b36f08021f023e0e9223f5aea315b74a5c56
Author: Andrew Pinski 
Date:   Thu Jan 25 13:58:10 2024 -0800

testsuite/vect: Fix pr25413a.c expectations [PR109705]

The 2 loops in octfapg_universe can and will be vectorized now
after r14-333-g6d4b59a9356ac4 on targets that support multiplication
in the long type. But the testcase does not check vect_long_mult for
that, so this patch corrects that error and now the testcase passes
correctly on aarch64-linux-gnu (with and without SVE).

Built and tested on aarch64-linux-gnu (with and without SVE).

gcc/testsuite/ChangeLog:

PR testsuite/109705
* gcc.dg/vect/pr25413a.c: Expect 1 vectorized loops for
!vect_long_mult and 2 for vect_long_mult.

Signed-off-by: Andrew Pinski 
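
As a rough illustration of why vect_long_mult matters, here is a sketch of
a loop whose vectorization needs a vector multiply on `long` elements. The
function `scale_long` is hypothetical and is not the contents of
pr25413a.c.

```c
/* Hypothetical loop, not the contents of pr25413a.c.  Vectorizing it
   requires a vector multiply on 'long' elements, which is the target
   capability that the vect_long_mult check describes.  */
void
scale_long (long *restrict out, const long *restrict in, long k, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = in[i] * k;
}
```

On a target without such a multiply, a loop of this shape would be expected
to stay scalar, which is why the expected "vectorized N loops" count differs
between the two kinds of target.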

[Bug target/113469] RISC-V: Illegal Insn for test case 920501-8.c when make linux for rv32

2024-01-25 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113469

Li Pan  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Li Pan  ---
Fixed.

[Bug target/105479] ICE in subreg_size_lowpart_offset, at emit-rtl.cc:1673

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105479

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-26
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug c/104427] ICE with __builtin_assoc_barrier and float types which introduce excess precision

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104427

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |12.0

--- Comment #8 from Andrew Pinski  ---
Fixed.

[Bug tree-optimization/113614] New: wrong code with _BitInt() division at -O1

2024-01-25 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113614

Bug ID: 113614
   Summary: wrong code with _BitInt() division at -O1
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57220
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57220&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -O1 testcase.c
$ ./a.out
Aborted

The wrong result is -39, as if the division were with signed types.
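
The attached testcase is not reproduced here. As a sketch of how a signed
division can produce -39 where an unsigned one should not, the following
uses `uint32_t` as a stand-in for the unsigned _BitInt type; the operand
values and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical operands; the attached testcase is not reproduced here.  */

/* Divide as unsigned: the bit pattern is a large positive number.  */
uint32_t
div_unsigned (uint32_t x, uint32_t y)
{
  return x / y;
}

/* Divide the same bit patterns as signed values.  The conversion of an
   out-of-range value to int32_t is implementation-defined; GCC wraps
   modulo 2^32.  */
int32_t
div_signed (uint32_t x, uint32_t y)
{
  return (int32_t) x / (int32_t) y;
}
```

With x = 0xFFFFFFB2 (the bit pattern of -78) and y = 2, the unsigned quotient
is 0x7FFFFFD9, while the signed interpretation gives -39.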

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8419-20240125172014-gc6c2a1d79eb-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-8419-20240125172014-gc6c2a1d79eb-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240125 (experimental) (GCC)

[Bug target/102252] svbool_t with SVE can generate invalid assembly

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102252

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |12.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Andrew Pinski  ---
Fixed.

[Bug target/90155] aarch64: too much quoting in diagnostic for %d

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90155

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-26
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from Andrew Pinski  ---
Confirmed.  I will submit a patch in a little bit.

[Bug target/113469] RISC-V: Illegal Insn for test case 920501-8.c when make linux for rv32

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113469

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Pan Li :

https://gcc.gnu.org/g:d40b3c1e439db05c835b6bd4fd5bba58fda71dd6

commit r14-8442-gd40b3c1e439db05c835b6bd4fd5bba58fda71dd6
Author: Juzhe-Zhong 
Date:   Fri Jan 26 14:46:21 2024 +0800

RISC-V: Fix incorrect LCM delete bug [VSETVL PASS]

This patch fixes the recently noticed bug in RV32 glibc.

We incorrectly deleted a vsetvl:

...
and a4,a4,a3
vmv.v.i v1,0 ---> Missing vsetvl causes an illegal instruction report.
vse8.v  v1,0(a5)

The root cause is that the laterin in LCM is incorrect.

  BB 358:
avloc: n_bits = 2, set = {}
kill: n_bits = 2, set = {}
antloc: n_bits = 2, set = {}
transp: n_bits = 2, set = {}
avin: n_bits = 2, set = {}
avout: n_bits = 2, set = {}
del: n_bits = 2, set = {}

which causes LCM to let BB 360 delete the vsetvl:

  BB 360:
avloc: n_bits = 2, set = {}
kill: n_bits = 2, set = {}
antloc: n_bits = 2, set = {}
transp: n_bits = 2, set = {0 1 }
avin: n_bits = 2, set = {}
avout: n_bits = 2, set = {}
del: n_bits = 2, set = {1}

Also, remove unknown vsetvl info into local computation since it is
unnecessary.

Tested on both RV32 and RV64 with no regressions.

PR target/113469

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc
(pre_vsetvl::compute_lcm_local_properties): Fix bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr113469.c: New test.

[Bug c++/113612] [13/14 Regression] ICE: SIGSEGV in get_template_info (pt.cc:378) or tree_check (tree.h:3611) with invalid -fpreprocessed

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113612

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection
   Last reconfirmed||2024-01-26
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug c++/113612] [13/14 Regression] ICE: SIGSEGV in get_template_info (pt.cc:378) or tree_check (tree.h:3611) with invalid -fpreprocessed

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113612

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||error-recovery
   Target Milestone|--- |13.3
Summary|ICE: SIGSEGV in |[13/14 Regression] ICE:
   |get_template_info   |SIGSEGV in
   |(pt.cc:378) or tree_check   |get_template_info
   |(tree.h:3611) with invalid  |(pt.cc:378) or tree_check
   |-fpreprocessed  |(tree.h:3611) with invalid
   ||-fpreprocessed
  Known to work||13.2.0

[Bug target/113613] [14 Regression] Missing ldp/stp optimization sometimes

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113613

--- Comment #2 from Andrew Pinski  ---
Note I don't know if this shows up in real programs but it might point to
something missing that might happen in real programs.

Another testcase this time without vectors:
```
double a[4];
double b[4];
void f()
{
  b[0] += a[0];
  b[1] *= a[1];
}

```

For some reason it works with the GPRs though:
```
int a[4];
int b[4];
void f()
{
  b[0] += a[0];
  b[1] *= a[1];
}

```

[Bug target/113613] [14 Regression] Missing ldp/stp optimization sometimes

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113613

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

--- Comment #1 from Andrew Pinski  ---
Note we should really get v4sf but that is PR 95960.

[Bug target/113613] New: [14 Regression] Missing ldp/stp optimization sometimes

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113613

Bug ID: 113613
   Summary: [14 Regression] Missing ldp/stp optimization sometimes
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
CC: acoplan at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64-*-*

Take:
```
typedef float __attribute__((vector_size(8))) v2sf;
v2sf a[4];
v2sf b[4];
void f()
{
  b[0] += a[0];
  b[1] += a[1];
}

```

With -O3 on the trunk we get:
```
f:
        adrp    x1, .LANCHOR0
        add     x0, x1, :lo12:.LANCHOR0
        ldr     d31, [x1, #:lo12:.LANCHOR0]
        ldr     d30, [x0, 32]
        fadd    v30.2s, v31.2s, v30.2s
        ldr     d31, [x0, 8]
        str     d30, [x1, #:lo12:.LANCHOR0]
        ldr     d30, [x0, 40]
        fadd    v30.2s, v31.2s, v30.2s
        str     d30, [x0, 8]
        ret
```

But in GCC 13 we got:
```
f:
        adrp    x1, .LANCHOR0
        add     x0, x1, :lo12:.LANCHOR0
        ldp     d1, d0, [x0]
        ldp     d3, d2, [x0, 32]
        fadd    v1.2s, v1.2s, v3.2s
        fadd    v0.2s, v0.2s, v2.2s
        stp     d1, d0, [x0]
        ret
ret
```

[Bug c++/113612] New: ICE: SIGSEGV in get_template_info (pt.cc:378) or tree_check (tree.h:3611) with invalid -fpreprocessed

2024-01-25 Thread zsojka at seznam dot cz via Gcc-bugs
_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-8419-20240125172014-gc6c2a1d79eb-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240125 (experimental) (GCC)

[Bug tree-optimization/95960] GCC should re-vectorize vector code with larger VF

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95960

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-01-26
 Status|UNCONFIRMED |NEW

--- Comment #2 from Andrew Pinski  ---
Confirmed.

Even V2SF -> V4SF would be useful for aarch64:
```
typedef float __attribute__((vector_size(8))) v2sf;
float a[4];
float b[4];
void f()
{
  *(v2sf *)&a[0] += *(v2sf *)&b[0];
  *(v2sf *)&a[2] += *(v2sf *)&b[2];
}

```
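
To show what the V2SF -> V4SF re-vectorization would amount to, here is a
sketch comparing two 64-bit vector adds over adjacent halves with a single
128-bit add. It relies on the GNU vector extension; the function names are
hypothetical, and memcpy is used instead of pointer casts to sidestep
aliasing and alignment questions.

```c
#include <string.h>

typedef float __attribute__((vector_size(8)))  v2sf;
typedef float __attribute__((vector_size(16))) v4sf;

/* Two 64-bit vector adds over adjacent halves of a 4-float array.  */
void
add_two_v2sf (float *dst, const float *src)
{
  for (int half = 0; half < 4; half += 2)
    {
      v2sf d, s;
      memcpy (&d, dst + half, sizeof d);
      memcpy (&s, src + half, sizeof s);
      d += s;
      memcpy (dst + half, &d, sizeof d);
    }
}

/* The single 128-bit add that a re-vectorizer could produce instead.  */
void
add_one_v4sf (float *dst, const float *src)
{
  v4sf d, s;
  memcpy (&d, dst, sizeof d);
  memcpy (&s, src, sizeof s);
  d += s;
  memcpy (dst, &d, sizeof d);
}
```

Both functions leave the same values in `dst`; the second simply expresses
the work as one wider operation.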

[Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114

Andrew Pinski  changed:

   What|Removed |Added

  Known to work||12.1.0

--- Comment #12 from Andrew Pinski  ---
Starting in GCC 12 we get on arm64 (with -Ofast):
```
mult_su3_na:
        ldp     q3, q1, [x1, 16]
        ldr     q0, [x0, 32]
        ldp     q2, q4, [x0]
        fmul    v0.2d, v0.2d, v1.2d
        ldr     q1, [x1]
        fmla    v0.2d, v4.2d, v3.2d
        fmla    v0.2d, v2.2d, v1.2d
        faddp   d0, v0.2d
        ret
ret
```

Which is even better than before (similarly on x86_64 with -mfma), due to SLP
happening.

With -fno-tree-vectorize, -Ofast on x86_64 is slightly better than 13, by one
instruction.

I am not sure if this matters any more due to the SLP improvement ...

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #22 from Hongtao Liu  ---
typedef unsigned long mp_limb_t;
typedef long mp_size_t;
typedef unsigned long mp_bitcnt_t;

typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;

#define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8)

#define GMP_LIMB_MAX (~ (mp_limb_t) 0)

mp_bitcnt_t
__attribute__((noipa))
mpn_common_scan (mp_limb_t limb, mp_size_t i, mp_srcptr up, mp_size_t un,
                 mp_limb_t ux)
{
  unsigned cnt;

  while (limb == 0)
    {
      i++;
      if (i == un)
        return (ux == 0 ? ~(mp_bitcnt_t) 0 : un * GMP_LIMB_BITS);
      limb = ux ^ up[i];
    }
  return limb;
}

int main ()
{
  mp_limb_t up[1];
  for (int i = 0; i != 1; i++)
    up[i] = 1 << 8;
  up[2000] = 1;
  mp_bitcnt_t res = mpn_common_scan (0, 0, up, 1, 1 << 8);
  if (res != 257)
    __builtin_abort ();
  return 1;
}


aborted with -O3 -march=skylake-avx512.

[Bug target/89628] aarch64_vector_pcs does not use v24-v31 as temp regs

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89628

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection

--- Comment #2 from Andrew Pinski  ---
Something seems to have fixed this in GCC 10+.

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2024-01-25 Thread john.harper at vuw dot ac.nz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

--- Comment #25 from john.harper at vuw dot ac.nz ---
With that program Intel's two compilers (ifort and ifx) both print

  >.30D+01<
  >.30E+01<

If your program removes the d0.2 stuff and changes e0.2 to es0.2e0, i.e.

   character(20) :: fmt
   character(9) :: buffer
   fmt = "(1a1,es0.2e0,1a1)"
   write(buffer,fmt) ">", 3.0, "<"
   print *, buffer
   end

then both Intel compilers print what you seem to have hoped for:

  >3.00E+0<

but my gfortran, gcc version 13.1.0 (Ubuntu 13.1.0-8ubuntu1~22.04), prints

  >3.00<

I won't argue about the difference between gfortran's >0.30D+1< 
and Intel's >.30D+01< because I have been caught before by whether the 
zero before the decimal point and the zero after the D are optional. 
The f2018 standard is not easy to read on this.

I tried aocc-flang on your original program, and I ought to send them a 
bug report because it printed

  ><
  ><

I don't have access to the NAG compiler or anyone else's flang.

John


On Thu, 25 Jan 2024, jvdelisle at gcc dot gnu.org wrote:

> Date: Thu, 25 Jan 2024 22:21:01 +
> From: jvdelisle at gcc dot gnu.org 
> To: John Harper 
> Subject: [Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too
>  high.
> Resent-Date: Fri, 26 Jan 2024 11:21:15 +1300 (NZDT)
> Resent-From: 
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022
>
> --- Comment #24 from Jerry DeLisle  ---
> Currently gfortran does the following:
>
> character(20) :: fmt
> character(9) :: buffer
> fmt = "(1a1,d0.2,1a1)"
> write(buffer,fmt) ">", 3.0, "<"
> print *, buffer
> fmt = "(1a1,e0.2,1a1)"
> write(buffer,fmt) ">", 3.0, "<"
> print *, buffer
> end
>
>
> $ gfc question.f90
> $ ./a.out
> >0.30D+1<
> >0.30E+1<
>
> Why not:
>
> $ ./a.out
> >3.00D+0<
> >3.00E+0<
>
> What does Intel do?
>
> -- 
> You are receiving this mail because:
> You reported the bug.
>


-- John Harper, School of Mathematics and Statistics
Victoria Univ. of Wellington, PO Box 600, Wellington 6140, New Zealand.
e-mail john.har...@vuw.ac.nz

[Bug target/113220] [aarch64] ICE Segmentation fault with r14-6178-g8d29b7aca15133

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113220

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2024-01-03 00:00:00 |2024-1-25

--- Comment #2 from Andrew Pinski  ---
Note you can also reproduce the failure with `-fstack-clash-protection` and
this simplified testcase:
```
#pragma GCC target "+sme2"
void inout_zt0() __arm_inout("zt0") {
  __builtin_abort();
}
```

Basically any noreturn function and -fstack-clash-protection and SME2 will
cause the ICE.

[Bug target/113084] aarch64: vget_low blocks tail-call

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113084

Andrew Pinski  changed:

   What|Removed |Added

  Component|middle-end  |target

--- Comment #3 from Andrew Pinski  ---
Oh, __builtin_aarch64_get_lowv4sf is not lowered to BIT_FIELD_REF either, which
it can be now.

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2024-January
   ||/644004.html

--- Comment #14 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644004.html

It would be useful if tested on powerpc too. I tested it on aarch64 without and
with SVE enabled to make sure vect_long_mult is correct there.
And without SVE enabled we get:
PASS: gcc.dg/vect/pr25413a.c scan-tree-dump-times vect "vectorized 1 loops" 1

And with it being enabled we get:
PASS: gcc.dg/vect/pr25413a.c scan-tree-dump-times vect "vectorized 2 loops" 1


as expected now.

[Bug target/100212] UB (shift by -1) in aarch64_classify_index

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100212

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Andrew Pinski  ---
Fixed.

[Bug other/63426] [meta-bug] Issues found with -fsanitize=undefined

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426
Bug 63426 depends on bug 100212, which changed state.

Bug 100212 Summary: UB (shift by -1) in aarch64_classify_index
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100212

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug target/100212] UB (shift by -1) in aarch64_classify_index

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100212

--- Comment #3 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:0c2583dc2575f3f64e3d09e12c296eb56f01916d

commit r14-8441-g0c2583dc2575f3f64e3d09e12c296eb56f01916d
Author: Andrew Pinski 
Date:   Thu Jan 25 13:45:59 2024 -0800

aarch64: Fix/avoid undefinedness in aarch64_classify_index [PR100212]

The problem here is we don't check the return value of exact_log2
and always use that result as shifter. This fixes the issue by avoiding
the shift if the value was `-1` (which means the value was not an exact
power of 2); in this case we could either check if the value was equal
to -1 or not equal to it, because we then assign -1 to shift if the
constant value was not equal. I chose `!=` as it seemed to make it more
obvious what the code is doing.

Committed as obvious after a build/test for aarch64-linux-gnu.

gcc/ChangeLog:

PR target/100212
* config/aarch64/aarch64.cc (aarch64_classify_index): Avoid
undefined shift after the call to exact_log2.

Signed-off-by: Andrew Pinski 
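
The pattern described in the commit message can be sketched as follows. This
is not the actual aarch64_classify_index code: `exact_log2_sketch` is a
hypothetical stand-in for GCC's exact_log2, and `scale_by_shift` is an
invented caller that only shows the `!= -1` guard before the shift.

```c
/* Stand-in for GCC's exact_log2: returns log2 (x) when x is a power of
   two, and -1 otherwise.  Hypothetical reimplementation for illustration.  */
int
exact_log2_sketch (unsigned long x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Scale INDEX by FACTOR using a shift, but only when FACTOR is an exact
   power of two.  Returns 1 and stores the result on success, 0 otherwise.
   Shifting by -1 would be undefined behavior, hence the check first.  */
int
scale_by_shift (unsigned long index, unsigned long factor,
                unsigned long *result)
{
  int shift = exact_log2_sketch (factor);
  if (shift != -1)
    {
      *result = index << shift;
      return 1;
    }
  return 0;
}
```

The point is only the ordering: the result of the log2 helper is tested
before it is ever used as a shift count.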

[Bug target/113608] RISC-V: Vector spills after enabling vector abi

2024-01-25 Thread lehua.ding at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113608

--- Comment #1 from Lehua Ding  ---
(In reply to JuzheZhong from comment #0)
> https://godbolt.org/z/srdd4qhdc
> 
> #include "riscv_vector.h"
> 
> vint32m8_t
> foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c,
>  int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
>  int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
>  int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
>  int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
>  int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3,
>  int32_t *__restrict d4, int32_t *__restrict d5, int n, vint32m8_t
> vector)
> {
>   for (int i = 0; i < n; i++)
> {
>   a[i] = b[i] + c[i];
>   b5[i] = b[i] + c[i];
>   a2[i] = b2[i] + c2[i];
>   a3[i] = b3[i] + c3[i];
>   a4[i] = b4[i] + c4[i];
>   a5[i] = a[i] + a4[i];
>   d2[i] = a2[i] + c2[i];
>   d3[i] = a3[i] + c3[i];
>   d4[i] = a4[i] + c4[i];
>   d5[i] = a[i] + a4[i];
>   a[i] = a5[i] + b5[i] + a[i];
> 
>   c2[i] = a[i] + c[i];
>   c3[i] = b5[i] * a5[i];
>   c4[i] = a2[i] * a3[i];
>   c5[i] = b5[i] * a2[i];
>   c[i] = a[i] + c3[i];
>   c2[i] = a[i] + c4[i];
>   a5[i] = a[i] + a4[i];
>   a[i] = a[i] + b5[i]
>+ a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i]
>* c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i];
> }
> return vector;
> }
> 
> This case will have vector spills after enabling default vector ABI.

These vector saves and restores (spills) are reasonable since the function uses
the v1-v5 registers, which are callee-saved. Before riscv-vector-abi was
enabled, all vector registers were caller-saved. So with the vector ABI
enabled, fewer vector registers are available that do not require a
save/restore.

But the vector move insn for the argument is unnecessary; I think this is an
IRA problem that needs debugging. Here is a simple case to reproduce it:
https://godbolt.org/z/e76Ynzcx6

[Bug testsuite/113611] [14 Regression] gcc.dg/pr110279-1.c fails on cross build since gcc-14-5779-g746344dd538

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113611

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||testsuite-fail
  Component|target  |testsuite

--- Comment #1 from Andrew Pinski  ---
I suspect this is just a testcase issue.

[Bug target/113611] New: [14 Regression] gcc.dg/pr110279-1.c fails on cross build since gcc-14-5779-g746344dd538

2024-01-25 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113611

Bug ID: 113611
   Summary: [14 Regression] gcc.dg/pr110279-1.c fails on cross
build since gcc-14-5779-g746344dd538
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: thiago.bauermann at linaro dot org
CC: dizhao at os dot amperecomputing.com
  Target Milestone: ---
Target: arm-linux-gnueabihf

After commit g:746344dd5380 ("swap ops in reassoc to reduce cross backedge
FMA") the following failure started appearing on cross builds of 32 bits Arm:

Running gcc:gcc.dg/dg.exp ...
FAIL: gcc.dg/pr110279-1.c scan-tree-dump-times widening_mul "Generated FMA" 3

We're seeing it with toolchains built with --host=x86_64-linux-gnu and targets
arm-linux-gnueabihf and arm-none-eabi. Both targets with and without
--with-mode=thumb.

Interestingly, with a native compiler (with --host=arm-linux-gnueabihf and
--target=arm-linux-gnueabihf) I can't reproduce the problem.

I tested on today's trunk (commit ffeab69e1ffc) and the failures are still
present.

Here's how to reproduce on an x86_64-linux machine with
--target=arm-linux-gnueabihf:

1. Build and install GCC:

$ ~/src/gcc/configure \
SHELL=/bin/bash \
--with-gnu-as \
--with-gnu-ld \
--disable-libmudflap \
--enable-lto \
--enable-shared \
--without-included-gettext \
--enable-nls \
--with-system-zlib \
--disable-sjlj-exceptions \
--enable-gnu-unique-object \
--enable-linker-build-id \
--disable-libstdcxx-pch \
--enable-c99 \
--enable-clocale=gnu \
--enable-libstdcxx-debug \
--enable-long-long \
--with-cloog=no \
--with-ppl=no \
--with-isl=no \
--disable-multilib \
--with-float=hard \
--with-fpu=vfpv3-d16 \
--with-tune=cortex-a9 \
--with-arch=armv7-a \
--enable-threads=posix \
--enable-multiarch \
--enable-libstdcxx-time=yes \
--enable-gnu-indirect-function \
--with-sysroot=/var/tmp/sysroot-arm-linux-gnueabihf \
--enable-checking=yes \
--disable-bootstrap \
--enable-languages=default \
--prefix=/tmp/arm-linux-gnueabihf \
--build=x86_64-pc-linux-gnu \
--host=x86_64-pc-linux-gnu \
--target=arm-linux-gnueabihf \
&& make \
SHELL=/bin/bash \
-w \
-j $(nproc) \
CFLAGS_FOR_BUILD="-pipe -g -O2" \
CXXFLAGS_FOR_BUILD="-pipe -g -O2" \
LDFLAGS_FOR_BUILD="-static-libgcc" \
MAKEINFOFLAGS=--force \
BUILD_INFO="" \
MAKEINFO=echo \
&& make install

2. Finally, use it to compile the problematic .c file:

$ /tmp/arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc
/home/bauermann/src/gcc/gcc/testsuite/gcc.dg/pr110279-1.c
-fdiagnostics-plain-output -Ofast --param avoid-fma-max-bits=512 --param
tree-reassoc-width=4 -fdump-tree-widening_mul-details -S -o pr110279-1.s
$ grep "Generated FMA" pr110279-1.c.215t.widening_mul || echo FAIL
FAIL

[Bug driver/113610] Manpage could be more clear about gcc's -e flag

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113610

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||documentation

--- Comment #2 from Andrew Pinski  ---
The GNU documentation for -e can be found at:
https://sourceware.org/binutils/docs-2.41/ld/Options.html

And then references
https://sourceware.org/binutils/docs-2.41/ld/Entry-Point.html for the defaults.

Maybe it should mention this is more for embedded folks and is not talking
about main.

[Bug driver/113610] Manpage could be more clear about gcc's -e flag

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113610

Andrew Pinski  changed:

   What|Removed |Added

  Component|c   |driver

--- Comment #1 from Andrew Pinski  ---
https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Link-Options.html#index-e

[Bug middle-end/113586] ICE: RTL check: expected code 'const_int', have 'reg' in rtx_to_poly_int64, at rtl.h:2398 with -march=rv32gcv -mabi=ilp32e

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113586

--- Comment #2 from Andrew Pinski  ---
My bet is you might be able to reproduce this issue on aarch64 with SVE and
ilp32, but maybe not, since the stack alignment there is still 16 bytes.

[Bug c/113610] New: Manpage could be more clear about gcc's -e flag

2024-01-25 Thread mike at flyn dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113610

Bug ID: 113610
   Summary: Manpage could be more clear about gcc's -e flag
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mike at flyn dot org
  Target Milestone: ---

The GCC manpage states this:

-e entry
--entry=entry
Specify that the program entry point is entry. The argument is interpreted
by the linker; the GNU linker accepts either a symbol name or an address.

It might be worth noting that this refers to _start, and not main. Many
references refer to main as the "entry point" for a C program. Of course,
thinking this here fails to realize there is significant initialization that
will not happen when using -e. Either mentioning _start explicitly or noting
that changing the entry point might leave things like the heap uninitialized (I
think) might help.

The same can be said about the ld manpage.

[Bug target/113609] EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF.

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609

--- Comment #1 from Hongtao Liu  ---
Since they're different modes, CCZ for cmp, but CCS for kortest, it could be
difficult to optimize it in RA stage by adding alternatives (like we did for
compared to 0). So the easy way could be adding a peephole to handle that.

[Bug target/113609] New: EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF.

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609

Bug ID: 113609
   Summary: EQ/NE comparison between avx512 kmask and -1 can be
optimized with kxortest with checking CF.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

It's from PR113576, where there's code like
        kmovb   %k0, %edx
        cmpb    $-1, %dl
        jne     .L21

The original codegen is buggy, but it still exposed an optimization issue.
Comparing an 8/16/32/64-bit kmask to -1 and checking for equality can be
optimized with kortest by checking CF.


KORTESTW
TMP[15:0] := DEST[15:0] BITWISE OR SRC[15:0]
IF(TMP[15:0]=0)
THEN ZF := 1
ELSE ZF := 0
FI;
IF(TMP[15:0]=FFFFh)
THEN CF := 1
ELSE CF := 0
FI;
KORTESTB
TMP[7:0] := DEST[7:0] BITWISE OR SRC[7:0]
IF(TMP[7:0]=0)
THEN ZF := 1
ELSE ZF := 0
FI;
IF(TMP[7:0]==FFh)
THEN CF := 1
ELSE CF := 0
FI;
KORTESTQ
TMP[63:0] := DEST[63:0] BITWISE OR SRC[63:0]
IF(TMP[63:0]=0)
THEN ZF := 1
ELSE ZF := 0
FI;
IF(TMP[63:0]==FFFFFFFF_FFFFFFFFh)
THEN CF := 1
ELSE CF := 0
FI;
KORTESTD
TMP[31:0] := DEST[31:0] BITWISE OR SRC[31:0]
IF(TMP[31:0]=0)
THEN ZF := 1
ELSE ZF := 0
FI;
IF(TMP[31:0]=FFFFFFFFh)
THEN CF := 1
ELSE CF := 0
FI;
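
As a rough illustration of the pseudocode above, here is a minimal C model of
the 8-bit case. The names kortest_flags, model_kortestb and mask_is_all_ones
are made up for this sketch and are not GCC or intrinsic names; it only shows
why "mask == -1" is the same predicate as CF after a kortest of the mask with
itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flags produced by the modelled KORTESTB (hypothetical struct name).  */
struct kortest_flags
{
  bool zf; /* set when the OR of both masks is all zeros */
  bool cf; /* set when the OR of both masks is all ones */
};

/* Software model of KORTESTB, following the pseudocode.  */
static struct kortest_flags
model_kortestb (uint8_t dest, uint8_t src)
{
  uint8_t tmp = (uint8_t) (dest | src);
  struct kortest_flags f = { tmp == 0, tmp == 0xFF };
  return f;
}

/* Equivalent of "kmovb; cmpb $-1; je" expressed through CF.  */
static bool
mask_is_all_ones (uint8_t mask)
{
  bool cf = model_kortestb (mask, mask).cf;
  /* Cross-check against the plain compare-with--1 form.  */
  assert (cf == (mask == UINT8_MAX));
  return cf;
}
```

The wider variants would differ only in the mask type and the all-ones
constant.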

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #21 from Hongtao Liu  ---
typedef unsigned long mp_limb_t;
typedef long mp_size_t;
typedef unsigned long mp_bitcnt_t;

typedef mp_limb_t *mp_ptr;
typedef const mp_limb_t *mp_srcptr;

#define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8)

#define GMP_LIMB_MAX (~ (mp_limb_t) 0)

mp_bitcnt_t
mpn_common_scan (mp_limb_t limb, mp_size_t i, mp_srcptr up, mp_size_t un,
                 mp_limb_t ux)
{
  unsigned cnt;

  while (limb == 0)
    {
      i++;
      if (i == un)
        return (ux == 0 ? ~(mp_bitcnt_t) 0 : un * GMP_LIMB_BITS);
      limb = ux ^ up[i];
    }
  return limb;
}

This one is miscompiled in 502.gcc_r

   [local count: 862990464]:
  _34 = ivtmp.20_20 * 32;
  vect__5.15_59 = MEM  [(const mp_limb_t *)vectp.14_53 + _34 * 1];
  mask_patt_9.16_61 = vect__5.15_59 == vect_cst__60;
  ivtmp.20_32 = ivtmp.20_20 + 1;
  if (mask_patt_9.16_61 == { -1, -1, -1, -1 })
    goto ; [94.50%]
  else
    goto ; [5.50%]


is expanded to

.L18:
        movq    %rdi, %rdx
        incq    %rdi
        salq    $5, %rdx
        vpcmpeqq        (%rax,%rdx), %ymm3, %k0
        kmovb   %k0, %edx
        cmpb    $-1, %dl
        jne     .L21

[Bug c/113608] New: RISC-V: Vector spills after enabling vector abi

2024-01-25 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113608

Bug ID: 113608
   Summary: RISC-V: Vector spills after enabling vector abi
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

https://godbolt.org/z/srdd4qhdc

#include "riscv_vector.h"

vint32m8_t
foo (int32_t *__restrict a, int32_t *__restrict b, int32_t *__restrict c,
     int32_t *__restrict a2, int32_t *__restrict b2, int32_t *__restrict c2,
     int32_t *__restrict a3, int32_t *__restrict b3, int32_t *__restrict c3,
     int32_t *__restrict a4, int32_t *__restrict b4, int32_t *__restrict c4,
     int32_t *__restrict a5, int32_t *__restrict b5, int32_t *__restrict c5,
     int32_t *__restrict d, int32_t *__restrict d2, int32_t *__restrict d3,
     int32_t *__restrict d4, int32_t *__restrict d5, int n, vint32m8_t vector)
{
  for (int i = 0; i < n; i++)
    {
      a[i] = b[i] + c[i];
      b5[i] = b[i] + c[i];
      a2[i] = b2[i] + c2[i];
      a3[i] = b3[i] + c3[i];
      a4[i] = b4[i] + c4[i];
      a5[i] = a[i] + a4[i];
      d2[i] = a2[i] + c2[i];
      d3[i] = a3[i] + c3[i];
      d4[i] = a4[i] + c4[i];
      d5[i] = a[i] + a4[i];
      a[i] = a5[i] + b5[i] + a[i];

      c2[i] = a[i] + c[i];
      c3[i] = b5[i] * a5[i];
      c4[i] = a2[i] * a3[i];
      c5[i] = b5[i] * a2[i];
      c[i] = a[i] + c3[i];
      c2[i] = a[i] + c4[i];
      a5[i] = a[i] + a4[i];
      a[i] = a[i] + b5[i]
             + a[i] * a2[i] * a3[i] * a4[i] * a5[i] * c[i] * c2[i] * c3[i]
               * c4[i] * c5[i] * d[i] * d2[i] * d3[i] * d4[i] * d5[i];
    }
  return vector;
}

This case produces vector spills once the default vector ABI is enabled.

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576

--- Comment #20 from Hongtao Liu  ---

> Note that I wonder how to eliminate redundant maskings?  I suppose
> eventually combine tracking nonzero bits where obvious would do
> that?  For example for cmp:V4SI we know the bits will be zero but
> I wonder if the RTL IL is obvious enough to derive this (or whether
> there's a target hook for extra nonzero bit discovery, say for
> unspecs).

I guess we need extra patterns to let combine know; we already have those for
zero_extend.

;; Since vpcmpd implicitly clear the upper bits of dest, transform
;; vpcmpd + zero_extend to vpcmpd since the instruction
(define_insn_and_split "*_cmp3_zero_extend"
  [(set (match_operand:SWI248x 0 "register_operand")
        (zero_extend:SWI248x
          (unspec:
            [(match_operand:V48H_AVX512VL 1 "nonimmediate_operand")
             (match_operand:V48H_AVX512VL 2 "nonimmediate_operand")
             (match_operand:SI 3 "const_0_to_7_operand")]
            UNSPEC_PCMP)))]

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #2 from Hongtao Liu  ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html

Could you give it a try to see if it fixes the regression? I don't currently
have a znver4 machine for testing.

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-25 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

--- Comment #3 from JuzheZhong  ---
I ran your testcase with trunk GCC on SPIKE and still could not reproduce this
issue.

[Bug target/100638] FP16 (vector) compare missed optimization on AArch64

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100638

Andrew Pinski  changed:

   What|Removed |Added

Summary|FP16 vector compare missed  |FP16 (vector) compare
   |optimization on AArch64 |missed optimization on
   ||AArch64

--- Comment #2 from Andrew Pinski  ---
(In reply to Tamar Christina from comment #0)
> However even the lowered operations are inefficient:
> 
> ```
> fcvt    s23, h23
> fcmpe   s23, #0.0
> ```
Actually that comes from expand:
```
;; _16 = _15 < 0.0;

(insn 48 47 49 (set (reg:SF 194)
(float_extend:SF (reg:HF 102 [ _15 ]))) "/app/example.c":8:16 -1
 (nil))

(insn 49 48 50 (set (reg:HF 196)
(const_double:HF 0.0 [0x0.0p+0])) "/app/example.c":8:16 -1
 (nil))

(insn 50 49 51 (set (reg:SF 195)
(float_extend:SF (reg:HF 196))) "/app/example.c":8:16 -1
 (nil))

(insn 51 50 52 (set (reg:CCFPE 66 cc)
(compare:CCFPE (reg:SF 194)
(reg:SF 195))) "/app/example.c":8:16 -1
 (nil))
```

Which can be reproduced with just a simple:
```
void foo(_Float16 *x, unsigned short *out) {
  *out = -(*x < 0.0f16);
}
```

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-25 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

--- Comment #2 from JuzheZhong  ---
I can't reproduce this issue.

Could you test it with this patch applied ?

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643934.html

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-25 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

--- Comment #1 from JuzheZhong  ---
I can reproduce this issue.

Could you test it with this patch applied ?

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643934.html

[Bug target/113325] unnecessary byte swap for memory clear

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113325

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-26
   Severity|normal  |enhancement
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from Andrew Pinski  ---
.

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

--- Comment #1 from Hongtao Liu  ---
Guess it's same issue as PR112879?

[Bug c/29970] mixing ({...}) with VLA leads to massive breakage

2024-01-25 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29970

Gabriel Ravier  changed:

   What|Removed |Added

 CC||gabravier at gmail dot com

--- Comment #19 from Gabriel Ravier  ---
I can confirm this as well; I encountered this ICE with the following code:

#include 

#define each(item, array) \
(typeof(*(array)) *foreach_p = (array), *foreach_p2 = foreach_p, (item) = {}; \
foreach_p < &((foreach_p2)[sizeof(array)/sizeof(*array)]); \
++foreach_p)if((__builtin_memcpy(&(item), foreach_p, sizeof((item, 0){}else

#define range1(_stop) (({ \
typeof(_stop) stop = _stop; \
struct{typeof((stop)) array[stop];}p = {}; \
if(stop < 0){ \
for(size_t i = 0; i > stop; --i) \
p.array[-i] = i; \
}else{ \
for(size_t i = 0; i < stop; ++i) \
p.array[i] = i; \
} \
p; \
}).array)

int main(){
char group[][4] = {
"egg",
"one",
"two",
"moo",
};
for each(x, group){
puts(x);
}
return sizeof(range1(6));
}

which I was able to minify to:

void f()
{
  (void)({
    int x = 1;
    struct {
      int array[x];
    } p;
    p;
  });
}

which roughly matches what testcase 2 does.

[Bug target/103781] generic/cortex-a53 cost model for SLP for aarch64 is good

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-26
 Ever confirmed|0   |1

--- Comment #6 from Andrew Pinski  ---
Confirmed.

Note if sve is turned on, we get:
```
.L2:
ldr q30, [x1], 16
ldr q29, [x2], 16
mul z29.d, z30.d, z29.d
add v31.2d, v31.2d, v29.2d
cmp x1, x3
bne .L2
```
That is the inner loop on the trunk, which is exactly what you want, since it
is now vectorized.

[Bug target/93370] Aarch64 accepts but ignores target("+sm4") unless ARMv8.2-A is enabled

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93370

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |13.3
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Andrew Pinski  ---
Fixed in GCC 13.2.0 by r13-7597-g9aac37ab8a7b91 and on the trunk by
r14-2651-g73d3bc348190b5

[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877

--- Comment #6 from Andrew Pinski  ---
In the original testcase, there are still extra movs.

For the testcase in comment #4, it is fixed on the trunk and we now get:
```
fun:
stp x29, x30, [sp, -48]!
mov x29, sp
str q2, [sp, 32]
bl  g
str q0, [sp, 16]
bl  g
ldp q30, q2, [sp, 16]
mov v31.16b, v0.16b
ldp x29, x30, [sp], 48
tbl v0.16b, {v30.16b - v31.16b}, v2.16b
ret
```

Maybe the issue is only with arguments now.

[Bug tree-optimization/102066] aarch64: Suboptimal addressing modes for SVE LD1W, ST1W

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102066

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-26
 Ever confirmed|0   |1

--- Comment #4 from Andrew Pinski  ---
.

[Bug tree-optimization/102066] aarch64: Suboptimal addressing modes for SVE LD1W, ST1W

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102066

Andrew Pinski  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org
  Component|target  |tree-optimization
   Severity|normal  |enhancement

--- Comment #3 from Andrew Pinski  ---
Confirmed.

GCC does not have a promotion pass, especially when dealing with induction
variables.  There are other bug reports which have a similar issue too.

[Bug target/102055] full 128byte swap using __builtin_shuffle should produce rev64 followed by ext

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102055

--- Comment #2 from Andrew Pinski  ---
Whether ldr/tbl or rev64/ext is better is debatable and depends on whether we
are inside a loop. Inside a loop with enough free registers, TBL is better on
many (though not all) micro-architectures, as it has similar latency to rev64.

Though I should note that clang/LLVM implements it as rev64/ext.

E.g.:
```

#define vector __attribute__((vector_size(16)))

vector char g(vector char a)
{
  return __builtin_shufflevector (a,a,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);
}

vector char g1(vector char a)
{
  vector char t = __builtin_shufflevector (a,a,7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8);
  vector long long t1 = (vector long long)t;
  t1 = __builtin_shufflevector(t1,t1, 1,0);
  return (vector char)t1;
}
```

Produces:
```
rev64   v0.16b, v0.16b
ext v0.16b, v0.16b, v0.16b, #8
```

For both.
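
To see why rev64 followed by ext #8 amounts to a full 16-byte reversal, here
is a small scalar model in plain C. The helper names model_rev64, model_ext8
and reverse16 are invented for this sketch; it models the byte movement only
and says nothing about latency.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of "rev64 v.16b": reverse the bytes within each 8-byte half.  */
static void
model_rev64 (uint8_t v[16])
{
  for (int half = 0; half < 2; half++)
    for (int i = 0; i < 4; i++)
      {
        uint8_t t = v[half * 8 + i];
        v[half * 8 + i] = v[half * 8 + 7 - i];
        v[half * 8 + 7 - i] = t;
      }
}

/* Model of "ext v, v, v, #8": rotating the vector by 8 bytes swaps the
   two halves.  */
static void
model_ext8 (uint8_t v[16])
{
  uint8_t tmp[8];
  memcpy (tmp, v, 8);
  memcpy (v, v + 8, 8);
  memcpy (v + 8, tmp, 8);
}

/* Full 16-byte reversal built from the two steps.  */
static void
reverse16 (uint8_t v[16])
{
  assert (v != NULL);
  model_rev64 (v);
  model_ext8 (v);
}
```

Starting from bytes 0..15, the first step gives 7..0 followed by 15..8, and
swapping the halves then gives 15..0.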

[Bug c++/113599] [14 Regression] Wrong computation of member offset through pointer-to-member since r14-5503

2024-01-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113599

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Jakub Jelinek  ---
Fixed.

[Bug c++/113599] [14 Regression] Wrong computation of member offset through pointer-to-member since r14-5503

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113599

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:fd620bd3351c6b9821c299035ed17e655d7954b5

commit r14-8439-gfd620bd3351c6b9821c299035ed17e655d7954b5
Author: Jakub Jelinek 
Date:   Fri Jan 26 00:08:36 2024 +0100

c++: Fix up build_m_component_ref [PR113599]

The following testcase reduced from GDB is miscompiled starting with
r14-5503 PR112427 change.
The problem is in the build_m_component_ref hunk, which changed
-  datum = fold_build_pointer_plus (fold_convert (ptype, datum),
component);
+  datum = cp_convert (ptype, datum, complain);
+  if (!processing_template_decl)
+   datum = build2 (POINTER_PLUS_EXPR, ptype,
+   datum, convert_to_ptrofftype (component));
+  datum = cp_fully_fold (datum);
Component is e, (sizetype) e is 16, offset of c inside of C.
ptype is A *, pointer to type of C::c and datum is 
Now, previously the above created ((A *) ) p+ (sizetype) e which is
correct,
but in the new code cp_convert sees that C has A as base class and
instead of returning (A *) , it returns  where D.2800 is
the FIELD_DECL for the A base at offset 8 into C.
So, instead of computing ((A *) ) p+ (sizetype) e it computes
 p+ (sizetype) e, which is ((A *) ) p+ 24.

The following patch fixes it by using convert instead of cp_convert which
eventually calls build_nop (ptype, datum).

2024-01-26  Jakub Jelinek  

PR c++/113599
* typeck2.cc (build_m_component_ref): Use convert instead of
cp_convert for pointer conversion.

* g++.dg/expr/ptrmem11.C: New test.
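
The offset arithmetic described in the commit message can be modelled in plain
C. This is only a sketch of the numbers involved, not the C++ testcase: the
struct C_model layout and the two helper functions are assumptions chosen so
that an A subobject sits at offset 8 and the member c at offset 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A { int64_t x; };

/* Hypothetical flat model of C: something at offset 0, the A base at
   offset 8, and the member c (of type A) at offset 16.  */
struct C_model
{
  int64_t first;
  struct A a_base; /* stands in for the A base class subobject */
  struct A c;      /* stands in for C::c */
};

static_assert (offsetof (struct C_model, a_base) == 8, "A base at offset 8");
static_assert (offsetof (struct C_model, c) == 16, "c at offset 16");

/* Intended result: object address plus the pointer-to-member offset e.  */
static size_t
intended_offset (size_t e)
{
  return e;
}

/* Miscompiled result: the conversion to A * already moved to the A base,
   and e is then added on top of that.  */
static size_t
miscomputed_offset (size_t e)
{
  return offsetof (struct C_model, a_base) + e;
}
```

With e equal to 16 the first helper yields 16 and the second yields 24, which
matches the "p+ 24" in the description.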

[Bug target/113607] New: [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-25 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

Bug ID: 113607
   Summary: [14] RISC-V rv64gcv vector: Runtime mismatch at -O3
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Tested using r14-8438-g136a828754f

Testcase:
struct {
  signed b;
} c, d = {6};
short e, f;
int g[1000];
signed char h;
int i, j;
long k, l;

long m(long n, long o) {
  if (n < 1 && o == 0)
    return 0;
  return n;
}

static int p() {
  long q = 0;
  int a = 0;
  for (; e < 2; e += 1)
    g[e * 7 + 1] = 2637287069;
  for (; h <= 6; h += 1) {
    k = g[8] || f;
    l = m(g[f * 7 + 1], k);
    a = l;
    j = a < 0 || g[f * 7 + 1] < 0 || g[f * 7 + 1] >= 32 ? a : a << g[f * 7 + 1];
    if (j)
      ++q;
  }
  if (q)
    c = d;
  return i;
}

int main() {
  p();
  if (c.b == 0)
    return 0;
  else
    return 1;
}

Commands:
> /scratch/tc-testing/tc-jan-25-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -O3 -march=rv64gcv red.c -o user-config-o3.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true 
> /scratch/tc-testing/tc-jan-25-trunk/build-rv64gcv/bin/qemu-riscv64 
> user-config-o3.out
> echo $?
1

> /scratch/tc-testing/tc-jan-25-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -O2 -march=rv64gcv red.c -o user-config-o2.out
> QEMU_CPU=rv64,vlen=128,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true 
> /scratch/tc-testing/tc-jan-25-trunk/build-rv64gcv/bin/qemu-riscv64 
> user-config-o2.out
> echo $?
0

Found using fuzzer.

[Bug libgcc/113604] runtime SIGFPE with _BitInt() division

2024-01-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113604

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
longlong.h documents that
   HIGH_NUMERATOR must be less
   than DENOMINATOR for correct operation.  If, in addition, the most
   significant bit of DENOMINATOR must be 1, then the pre-processor symbol
   UDIV_NEEDS_NORMALIZATION is defined to 1.
While UDIV_NEEDS_NORMALIZATION is 1 only on sh32 and arches which don't define
their own udiv_qrnnd and fall back to C, the first requirement applies to all
arches.
0x''8000'03e1uwb / 0x'uwb
is 0x1''uwb and
0x''8000'03e1uwb % 0x'uwb
is 0x8000'03e1uwb, so the quotient isn't representable in a 64-bit
number, which is why SIGFPE is triggered.
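
A scaled-down sketch of that precondition, using a 64-by-32 division instead
of the 128-by-64 one in libgcc. The names quotient_fits and checked_udiv_qrnnd
are hypothetical and this is not the longlong.h macro; it only shows that the
quotient fits in one word exactly when the high numerator word is less than
the denominator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when (n1:n0) / d fits in one 32-bit word, i.e. when n1 < d.  */
static bool
quotient_fits (uint32_t n1, uint32_t d)
{
  return n1 < d;
}

/* Checked model of a two-word by one-word division.  */
static void
checked_udiv_qrnnd (uint32_t *q, uint32_t *r,
                    uint32_t n1, uint32_t n0, uint32_t d)
{
  /* n1 < d also rules out d == 0.  */
  assert (quotient_fits (n1, d));
  uint64_t n = ((uint64_t) n1 << 32) | n0;
  *q = (uint32_t) (n / d);
  *r = (uint32_t) (n % d);
}
```

On x86 the div instruction raises a divide error (reported as SIGFPE) when
the quotient does not fit, which is the situation the assert guards against
in this model.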

[Bug analyzer/113606] New: -Wanalyzer-infinite-recursion false positive on code involving strstr, memset, strnlen and -D_FORTIFY_SOURCE

2024-01-25 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113606

Bug ID: 113606
   Summary: -Wanalyzer-infinite-recursion false positive on code
involving strstr, memset, strnlen and
-D_FORTIFY_SOURCE
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: dmalcolm at gcc dot gnu.org
  Target Milestone: ---

Taking the following from this downstream bug report:
  https://bugzilla.redhat.com/show_bug.cgi?id=2260398

Create str.c:
```
#define _POSIX_C_SOURCE 200809L
#include 
#include 
#include 

static char*
strredact(char *str, const char *sub, const char c)
{
  char *p;
  if (!str) return NULL;
  if (!sub) return str;
  p = strstr(str, sub);
  if (!c || !p) return str;
  (void)memset(p, c, strnlen(sub, strlen(str)));
  return strredact(str, sub, c);
}

int
main (void)
{
  char string[] = "This_is_a_string.";
  return printf("%s\n", strredact(string, "_", ' '));
}
```

Actual Results (with trunk aka gcc 14):  

$ gcc -fanalyzer -Werror -O str.c
$ gcc -fanalyzer -Werror -O -D_FORTIFY_SOURCE=2 str.c
str.c: In function ‘strredact’:
str.c:16:10: error: infinite recursion [CWE-674]
[-Werror=analyzer-infinite-recursion]
   16 |   return strredact(str, sub, c);
  |  ^~
  ‘strredact’: events 1-9
|
|8 | strredact(char *str, const char *sub, const char c)
|  | ^
|  | |
|  | (1) entry to ‘strredact’
|..
|   11 |   if (!str) return NULL;
|  |  ~
|  |  |
|  |  (2) following ‘false’ branch (when ‘str’ is non-NULL)...
|   12 |   if (!sub) return str;
|  |  ~
|  |  |
|  |  (3) ...to here
|  |  (4) following ‘false’ branch (when ‘sub’ is non-NULL)...
|   13 |   p = strstr(str, sub);
|  |   
|  |   |
|  |   (5) ...to here
|  |   (6) when ‘strstr’ returns non-NULL
|   14 |   if (!c || !p) return str;
|  |  ~
|  |  |
|  |  (7) following ‘false’ branch...
|   15 |   (void)memset(p, c, strnlen(sub, strlen(str)));
|  | ~~~
|  | |
|  | (8) ...to here
|   16 |   return strredact(str, sub, c);
|  |  ~~
|  |  |
|  |  (9) calling ‘strredact’ from ‘strredact’
|
+--> ‘strredact’: events 10-18
   |
   |8 | strredact(char *str, const char *sub, const char c)
   |  | ^
   |  | |
   |  | (10) initial entry to ‘strredact’
   |..
   |   11 |   if (!str) return NULL;
   |  |  ~
   |  |  |
   |  |  (11) following ‘false’ branch (when ‘str’ is
non-NULL)...
   |   12 |   if (!sub) return str;
   |  |  ~
   |  |  |
   |  |  (12) ...to here
   |  |  (13) following ‘false’ branch (when ‘sub’ is
non-NULL)...
   |   13 |   p = strstr(str, sub);
   |  |   
   |  |   |
   |  |   (14) ...to here
   |  |   (15) when ‘strstr’ returns non-NULL
   |   14 |   if (!c || !p) return str;
   |  |  ~
   |  |  |
   |  |  (16) following ‘false’ branch...
   |   15 |   (void)memset(p, c, strnlen(sub, strlen(str)));
   |  | ~~~
   |  | |
   |  | (17) ...to here
   |   16 |   return strredact(str, sub, c);
   |  |  ~~
   |  |  |
   |  |  (18) calling ‘strredact’ from ‘strredact’
   |
   +--> ‘strredact’: events 19-20
  |
  |8 | strredact(char *str, const char *sub, const char c)
  |  | ^
  |  | |
  |  | (19) recursive entry to ‘strredact’; previously
entered at (10)
  |  | (20) apparently infinite recursion
  |
cc1: all warnings being treated as errors



Expected Results:  
$ gcc -fanalyzer -Werror -O str.c
$ gcc -fanalyzer -Werror -O -D_FORTIFY_SOURCE=2 str.c

(no output)

Affects trunk.
Doesn't affect gcc 13.2

Reproduced on Godbolt, see https://godbolt.org/z/ebsq7WhxG
https://godbolt.org/z/Tn7oe1EbG - a slightly more minimized example

[Bug libgcc/113604] runtime SIGFPE with _BitInt() division

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113604

--- Comment #2 from Andrew Pinski  ---
x86 in include/longlong.h defines udiv_qrnnd as:

#define udiv_qrnnd(q, r, n1, n0, dv) \
  __asm__ ("div{l} %4"  \
   : "=a" ((USItype) (q)),  \
 "=d" ((USItype) (r))   \
   : "0" ((USItype) (n0)),  \
 "1" ((USItype) (n1)),  \
 "rm" ((USItype) (dv)))

(gdb) p/x uv1
$2 = 0x
(gdb) p/x uv0
$3 = 0x83e1
(gdb) p/x vv1
$4 = 0x


I have no idea why we are getting a FP exception here though.

[Bug libgcc/113604] runtime SIGFPE with _BitInt() division

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113604

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-01-25

--- Comment #1 from Andrew Pinski  ---
Confirmed.

1869  udiv_qrnnd (qhat, rhat, uv1, uv0, vv1);

[Bug tree-optimization/113602] ICE: in vn_reference_maybe_forwprop_address, at tree-ssa-sccvn.cc:1426 with invalid _BitInt() register asm with -O2 -fno-tree-loop-optimize

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113602

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-01-25

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug libfortran/111022] ES0.0E0 format gave ES0.dE0 output with d too high.

2024-01-25 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111022

--- Comment #24 from Jerry DeLisle  ---
Currently gfortran does the following:

character(20) :: fmt
character(9) :: buffer
fmt = "(1a1,d0.2,1a1)"
write(buffer,fmt) ">", 3.0, "<"
 print *, buffer
fmt = "(1a1,e0.2,1a1)"
write(buffer,fmt) ">", 3.0, "<"
 print *, buffer
end


$ gfc question.f90 
$ ./a.out 
 >0.30D+1<
 >0.30E+1<

Why not:

$ ./a.out 
 >3.00D+0<
 >3.00E+0<

What does Intel do?

[Bug ada/113605] New: Fixed-point declaration using an integer where a real is expected causes a crash

2024-01-25 Thread rwconnelly at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113605

Bug ID: 113605
   Summary: Fixed-point declaration using an integer where a real
is expected causes a crash
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rwconnelly at hotmail dot com
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57218
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57218&action=edit
Includes source file and compiler diagnostics

I encountered this bug when declaring a fixed-point type.

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #13 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #11)
> I think the only thing left is updating the testcase to use vect_long_mult
> since both powerpc and aarch64 have updated/corrected vect_long_mult .

I am going to test this patch.

[Bug c++/113598] [11/12/13/14 Regression] GCC internal compiler error since r0-124275

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113598

--- Comment #4 from GCC Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:136a828754ff65079a83482b49d54bd5bc64

commit r14-8438-g136a828754ff65079a83482b49d54bd5bc64
Author: Jason Merrill 
Date:   Thu Jan 25 12:02:07 2024 -0500

c++: array of PMF [PR113598]

Here AGGREGATE_TYPE_P includes pointers to member functions, which is not
what we want.  Instead we should use class||array, as elsewhere in the
function.

PR c++/113598

gcc/cp/ChangeLog:

* init.cc (build_vec_init): Don't use {} for PMF.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-pmf2.C: New test.

[Bug c++/109227] coroutines: ICE in tree check: expected record_type or union_type or qual_union_type, have array_type in build_special_member_call, at cp/call.cc:11067

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109227

--- Comment #8 from GCC Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:44868e7298de5048d6f04d7fa098d5bc767c8cb8

commit r14-8437-g44868e7298de5048d6f04d7fa098d5bc767c8cb8
Author: Jason Merrill 
Date:   Thu Jan 25 14:45:35 2024 -0500

c++: co_await and initializer_list [PR109227]

Here we end up with an initializer_list of 'aa', a type with a non-trivial
destructor, and need to destroy it.  The code called
build_special_member_call for cleanups, but that doesn't work for arrays, so
use cxx_maybe_build_cleanup instead.  Let's go ahead and do that everywhere
that has been calling the destructor directly.

PR c++/109227

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Use cxx_maybe_build_cleanup.
(build_actor_fn, process_conditional, maybe_promote_temps)
(morph_fn_to_coro): Likewise.
(expand_one_await_expression): Use build_cleanup.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/co-await-initlist2.C: New test.

[Bug libgcc/113604] New: runtime SIGFPE with _BitInt() division

2024-01-25 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113604

Bug ID: 113604
   Summary: runtime SIGFPE with _BitInt() division
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
CC: jakub at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57217
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57217&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc testcase.c
$ valgrind -q ./a.out 
==6365== 
==6365== Process terminating with default action of signal 8 (SIGFPE)
==6365==  Integer divide by zero at address 0x1002C76A28
==6365==at 0x4016DF: __divmodbitint4 (libgcc2.c:1868)
==6365==by 0x4011CF: foo (in a.out)
==6365==by 0x401254: main (in a.out)
Floating point exception

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8419-20240125172014-gc6c2a1d79eb-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-8419-20240125172014-gc6c2a1d79eb-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240125 (experimental) (GCC)

[Bug tree-optimization/113603] [12/13/14 Regression] ICE Segfault during GIMPLE pass: strlen at -O3 since r12-145

2024-01-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113603

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org
Summary|[14 Regression] ICE |[12/13/14 Regression] ICE
   |Segfault during GIMPLE  |Segfault during GIMPLE
   |pass: strlen at -O3 |pass: strlen at -O3 since
   ||r12-145
   Target Milestone|--- |12.4
   Priority|P3  |P2

--- Comment #1 from Jakub Jelinek  ---
Started with r12-145-gd1d01a66012a93cc8cb7dafbe1b5ec453ec96b59 but guess that
just triggered a latent bug in the strlen pass.

[Bug target/111677] [12/13 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |NEW
Summary|[12/13/14 Regression]   |[12/13 Regression]
   |darktable build on aarch64  |darktable build on aarch64
   |fails with unrecognizable   |fails with unrecognizable
   |insn due to |insn due to
   |-fstack-protector changes   |-fstack-protector changes

[Bug target/100204] aarch64: UB evaluating J constraint

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100204

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |14.0

--- Comment #4 from Andrew Pinski  ---
AARCH64 issue is fixed.

[Bug other/63426] [meta-bug] Issues found with -fsanitize=undefined

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426
Bug 63426 depends on bug 100204, which changed state.

Bug 100204 Summary: aarch64: UB evaluating J constraint
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100204

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug target/100204] aarch64: UB evaluating J constraint

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100204

--- Comment #3 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:f03b8f595b6350732bb0a9a69557c5ed2af085b2

commit r14-8436-gf03b8f595b6350732bb0a9a69557c5ed2af085b2
Author: Andrew Pinski 
Date:   Thu Jan 25 08:30:36 2024 -0800

aarch64: Fix undefinedness while testing the J constraint [PR100204]

The J constraint can invoke undefined behavior because it takes the
negative of ival, which overflows if ival is HWI_MIN. The fix is as simple
as casting to `unsigned HOST_WIDE_INT` before negating it. This patch
does that.

Committed as obvious after build/test for aarch64-linux-gnu.

gcc/ChangeLog:

PR target/100204
* config/aarch64/constraints.md (J): Cast to `unsigned
HOST_WIDE_INT`
before taking the negative of it.

Signed-off-by: Andrew Pinski 
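
A minimal sketch of the cast-before-negate idea in standalone C, with int64_t
standing in for HOST_WIDE_INT (an assumption for illustration; this is not the
code from constraints.md). The function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Negate through unsigned arithmetic: -INT64_MIN overflows as a signed
   operation, but unsigned negation wraps modulo 2^64 and is well defined.  */
static uint64_t
negate_wrapping (int64_t ival)
{
  return -(uint64_t) ival;
}

/* Small self-check of the wrapping behaviour.  */
static void
negate_wrapping_selfcheck (void)
{
  assert (negate_wrapping (1) == UINT64_MAX);
  assert (negate_wrapping (INT64_MIN) == (uint64_t) INT64_MIN);
}
```

The result for the minimum value is 2^63, the same bit pattern as the input,
and no signed overflow takes place along the way.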

[Bug target/113526] [14 Regression] gcc.target/arm/asm-flag-1.c fails since gcc-14-7248-g76bc70387d9

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113526

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Vladimir Makarov :

https://gcc.gnu.org/g:476226290dba8cd7f3e9f4e3f0185b58903db8cd

commit r14-8435-g476226290dba8cd7f3e9f4e3f0185b58903db8cd
Author: Vladimir N. Makarov 
Date:   Thu Jan 25 14:41:17 2024 -0500

[PR113526][LRA]: Fixing asm-flag-1.c failure on ARM

My recent patch for PR113356 causes the asm-flag-1.c test to fail on arm.
After the patch, LRA treats asm operand pseudos as general regs.  There
are too many such operands, and LRA cannot assign hard regs to all
operand pseudos.  Actually we should not assign hard regs to the
operand pseudo at all.  The following patch fixes this.

gcc/ChangeLog:

PR target/113526
* lra-constraints.cc (curr_insn_transform): Change class even for
spilled pseudo successfully matched with NO_REGS.

[Bug c++/102051] [coroutines] ICE in gimplify_var_or_parm_decl, at gimplify.c:2848

2024-01-25 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102051

Patrick Palka  changed:

   What|Removed |Added

 Status|WAITING |NEW
 CC||ppalka at gcc dot gnu.org

--- Comment #6 from Patrick Palka  ---
Confirmed with trunk/13.2/12.3/11.4 on the comment #5 testcase.

[Bug c++/109227] coroutines: ICE in tree check: expected record_type or union_type or qual_union_type, have array_type in build_special_member_call, at cp/call.cc:11067

2024-01-25 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109227

Jason Merrill  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug tree-optimization/113603] New: [14 Regression] ICE Segfault during GIMPLE pass: strlen at -O3

2024-01-25 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113603

Bug ID: 113603
   Summary: [14 Regression] ICE Segfault during GIMPLE pass:
strlen at -O3
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Command:
> /scratch/tc-testing/tc-jan-8-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -O3 red.c -S -freport-bug
during GIMPLE pass: strlen
red.c: In function 'h':
red.c:7:6: internal compiler error: Segmentation fault
7 | int *h() {
  |  ^
0x12c0303 crash_signal
../../../gcc/gcc/toplev.cc:316
0x7fe82a04251f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x14f03cc contains_struct_check(tree_node*, tree_node_structure_enum, char
const*, int, char const*)
../../../gcc/gcc/tree.h:3757
0x14f03cc maybe_invalidate
../../../gcc/gcc/tree-ssa-strlen.cc:1361
0x14f0861 do_invalidate
../../../gcc/gcc/tree-ssa-strlen.cc:5730
0x150015e strlen_pass::before_dom_children(basic_block_def*)
../../../gcc/gcc/tree-ssa-strlen.cc:5780
0x23cb957 dom_walker::walk(basic_block_def*)
../../../gcc/gcc/domwalk.cc:311
0x15006e6 printf_strlen_execute
../../../gcc/gcc/tree-ssa-strlen.cc:5899
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.

Testcase:
int a, e;
char b;
int *c;
signed char *d;
short f;
char g[3];
int *h() {
  int i = 0;
  for (; i < 3; i++)
g[i] = 2;
  int j[100][100] = {{}, {4}};
  signed char *k = [1];
  do {
for (;;) {
  if (c)
break;
  return 
}
f = 0;
for (;; f++) {
  b = 0;
  for (; b < 2; b++)
*c = j[b][f];
  if (e)
d = k;
  *k = *d;
  if (*c)
break;
  if (f)
break;
}
  } while (f);
  return 0;
}

Godbolt:
https://godbolt.org/z/ax1Tzc3To

Occurs on x86, RISC-V, ARM
Found using a fuzzer.

[Bug fortran/113377] Wrong code passing optional dummy argument to elemental procedure with optional dummy

2024-01-25 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113377

--- Comment #11 from anlauf at gcc dot gnu.org ---
(In reply to GCC Commits from comment #10)
> * gfortran.dg/optional_absent_10.f90: New test.

According to gcc-testresults this new test fails on POWER BE systems:

FAIL: gfortran.dg/optional_absent_10.f90   -O0  execution test

[Bug other/113336] libatomic (testsuite) regressions on arm

2024-01-25 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113336

Roger Sayle  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|roger at nextmovesoftware dot com  |unassigned at gcc dot gnu.org
Summary|libatomic (testsuite)   |libatomic (testsuite)
   |regressions on  |regressions on arm
   |armv6-linux-gnueabihf   |

--- Comment #4 from Roger Sayle  ---
Hi Victor,
Yes, I agree your approach is better/less invasive than mine.  I simply copied
the existing idiom in Makefile.am, not noticing that this adds more
functionality to libatomic than is strictly required. Just adding the
missing/required tas_1_2_.lo is better (and hopefully more acceptable to the
maintainers/reviewers).

[Bug tree-optimization/113602] New: ICE: in vn_reference_maybe_forwprop_address, at tree-ssa-sccvn.cc:1426 with invalid _BitInt() register asm with -O2 -fno-tree-loop-optimize

2024-01-25 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113602

Bug ID: 113602
   Summary: ICE: in vn_reference_maybe_forwprop_address, at
tree-ssa-sccvn.cc:1426 with invalid _BitInt() register
asm with -O2 -fno-tree-loop-optimize
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57216
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57216&action=edit
reduced testcase (from gcc.dg/torture/pr60606-1.c)

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O2 -fno-tree-loop-optimize testcase.c 
during GIMPLE pass: fre
testcase.c: In function 'f':
testcase.c:2:1: internal compiler error: in
vn_reference_maybe_forwprop_address, at tree-ssa-sccvn.cc:1426
2 | f(void) {
  | ^
0x875c2c vn_reference_maybe_forwprop_address
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:1426
0x17176d8 valueize_refs_1
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:1718
0x171c008 valueize_shared_reference_ops_from_ref
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:1763
0x171c008 valueize_shared_reference_ops_from_ref
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:1757
0x171c008 vn_reference_lookup(tree_node*, tree_node*, vn_lookup_kind,
vn_reference_s**, bool, tree_node**, tree_node*, bool)
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:3963
0x1720437 visit_reference_op_load
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:5752
0x1720437 visit_stmt
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:6273
0x1720abb process_bb
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:8037
0x1722496 do_rpo_vn_1
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:8638
0x172400b execute
/repo/gcc-trunk/gcc/tree-ssa-sccvn.cc:8799
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8419-20240125172014-gc6c2a1d79eb-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-8419-20240125172014-gc6c2a1d79eb-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240125 (experimental) (GCC)

[Bug target/113601] avr: Wrong SRAM start for ATmega3208 and ATmega3209

2024-01-25 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113601

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |13.3

--- Comment #4 from Georg-Johann Lay  ---
Fixed in v12.4+ and v13.3+

[Bug target/113601] avr: Wrong SRAM start for ATmega3208 and ATmega3209

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113601

--- Comment #3 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:f4015b8434a506a70f941a26d563d2de1dcbcf2f

commit r12-10114-gf4015b8434a506a70f941a26d563d2de1dcbcf2f
Author: Georg-Johann Lay 
Date:   Thu Jan 25 18:51:04 2024 +0100

AVR: target/113601 - Fix wrong data start for ATmega3208 and ATmega3209.

gcc/
PR target/113601
* config/avr/avr-mcus.def (atmega3208, atmega3209): Fix
data_section_start.

(cherry picked from commit 6b678d8f96ad5ffb8de9e3f1f1694cb21d7a2c33)

[Bug target/113601] avr: Wrong SRAM start for ATmega3208 and ATmega3209

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113601

--- Comment #2 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:fe3093ca9333965ec00d43ed2e24e594901a6ff9

commit r13-8252-gfe3093ca9333965ec00d43ed2e24e594901a6ff9
Author: Georg-Johann Lay 
Date:   Thu Jan 25 18:51:04 2024 +0100

AVR: target/113601 - Fix wrong data start for ATmega3208 and ATmega3209.

gcc/
PR target/113601
* config/avr/avr-mcus.def (atmega3208, atmega3209): Fix
data_section_start.

(cherry picked from commit 6b678d8f96ad5ffb8de9e3f1f1694cb21d7a2c33)

[Bug target/113601] avr: Wrong SRAM start for ATmega3208 and ATmega3209

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113601

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:6b678d8f96ad5ffb8de9e3f1f1694cb21d7a2c33

commit r14-8433-g6b678d8f96ad5ffb8de9e3f1f1694cb21d7a2c33
Author: Georg-Johann Lay 
Date:   Thu Jan 25 18:51:04 2024 +0100

AVR: target/113601 - Fix wrong data start for ATmega3208 and ATmega3209.

gcc/
PR target/113601
* config/avr/avr-mcus.def (atmega3208, atmega3209): Fix
data_section_start.

[Bug target/113601] avr: Wrong SRAM start for ATmega3208 and ATmega3209

2024-01-25 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113601

Georg-Johann Lay  changed:

   What|Removed |Added

 Target||avr
   Priority|P3  |P4

[Bug target/113601] New: avr: Wrong SRAM start for ATmega3208 and ATmega3209

2024-01-25 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113601

Bug ID: 113601
   Summary: avr: Wrong SRAM start for ATmega3208 and ATmega3209
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

ATmega3208/9 have SRAM from 0x3000 to 0x3fff, which is 4KiB.

The hardware description in avr-mcus.def uses a start at 0x3800, which is not
correct.  This leads to a wrong -Tdata option when linking.

As a work-around, pass -Tdata 0x803000 when linking, or fix the respective
option in device-specs/specs-atmega3208/9.
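
The arithmetic behind the work-around can be checked with a short sketch. It
assumes the avr-ld convention that data-space addresses are offset by
0x800000 in the linker's address space; the macro and function names are
hypothetical and are not taken from avr-mcus.def.

```c
#include <stdint.h>

/* Values from the report: SRAM spans 0x3000..0x3fff on ATmega3208/9.  */
#define SRAM_START 0x3000u
#define SRAM_END   0x3fffu

/* Assumption: the AVR linker places data space at this offset.  */
#define AVR_DATA_OFFSET 0x800000u

/* SRAM size in bytes; the range is inclusive on both ends.  */
static uint32_t
sram_size (void)
{
  return SRAM_END - SRAM_START + 1u;
}

/* Address to pass to -Tdata for a given SRAM start address.  */
static uint32_t
tdata_address (uint32_t sram_start)
{
  return AVR_DATA_OFFSET + sram_start;
}
```

With the correct start of 0x3000 this gives the 0x803000 used in the
work-around; the wrong start of 0x3800 would give 0x803800 instead.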

[Bug target/112987] [14 Regression][aarch64] ICE in aarch64_do_track_speculation, at config/aarch64/aarch64-speculation.cc:214 since r14-5886-g426fddcbdad674

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112987

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Szabolcs Nagy :

https://gcc.gnu.org/g:305fe4f136a3a3a78377a48c55d546000a3ba529

commit r14-8432-g305fe4f136a3a3a78377a48c55d546000a3ba529
Author: Szabolcs Nagy 
Date:   Wed Jan 24 18:50:19 2024 +

aarch64: Fix eh_return for -mtrack-speculation [PR112987]

Recent commit introduced a conditional branch in eh_return epilogues
that is not compatible with speculation tracking:

  commit 426fddcbdad6746fe70e031f707fb07f55dfb405
  Author: Szabolcs Nagy 
  CommitDate: 2023-11-27 15:52:48 +

  aarch64: Use br instead of ret for eh_return

Refactor the compare zero and jump pattern and use it to fix the issue.

gcc/ChangeLog:

PR target/112987
* config/aarch64/aarch64.cc (aarch64_gen_compare_zero_and_branch):
New.
(aarch64_expand_epilogue): Use the new function.
(aarch64_split_compare_and_swap): Likewise.
(aarch64_split_atomic_op): Likewise.

[Bug libstdc++/100903] Bogus "zero as null pointer constant" warning

2024-01-25 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100903

--- Comment #14 from Jonathan Wakely  ---
Yes, that part's easy (and that's what we do in std::format for errors during
format string parsing). But accepting (a <=> b) < (1-1) and other zero-valued
constant expressions can't be solved by improving the diagnostics for
ill-formed cases.

(In reply to Jakub Jelinek from comment #6)
> So, instead of adding a new compiler extension, couldn't we just add a hack
> for this warning and temporarily disable the -Wzero-as-null-pointer-constant
> warning
> while doing convert_like_internal to std::__cmp_cat::__unspec convs->type, or
> while build_over_call to the std::__cmp_cat::__unspec ctor?
> Or add some attribute to that ctor which would cause the warning to be
> temporarily disabled while handling its argument.

This would be my preference.

[Bug other/113336] libatomic (testsuite) regressions on armv6-linux-gnueabihf

2024-01-25 Thread victor.donascimento at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113336

Victor Do Nascimento  changed:

   What|Removed |Added

 CC||victor.donascimento at arm dot com

--- Comment #3 from Victor Do Nascimento  ---
For what it's worth, I just so happened to stumble upon the same issue when
compiling and running libatomic for the armv8l-unknown-linux-gnueabihf triplet
on a Cortex-A72 machine inside a 32-bit Docker container, so the issue is
clearly prevalent on a wider range of targets than perhaps alluded to by the
bug report title.

The patch provided does appear to fix all regressions.

Here are my initial thoughts on the issue and the proposed fix.

My only concern at the moment is that if the regression is caused by
HAVE_ATOMIC_TAS now being detected as false, then perhaps a more directed
solution is called for, specific to tas, as opposed to generating _i2 variants
for *all* atomic ops via $(addsuffix _1_2_.lo,$(SIZEOBJS))

If you look at the very end of tas_n.c at the `if !DONE' clause, you'll see
that for `SIZE(libat_test_and_set)', irrespective of the SIZE value,
SIZE(libat_test_and_set) always falls back onto `libat_test_and_set_1',
explaining why tas_1_2_.lo is needed.

This unconditional dependence on the *_1 variant does not, however, appear to
be the norm.  One example of this is seen with `SIZE(libat_compare_exchange)'.

With this in mind, I notice that adding `tas_1_2_.lo' to the
`libatomic_la_LIBADD' variable in Makefile.am, i.e.

  libatomic_la_LIBADD += tas_1_2_.lo

is apparently sufficient to fix all regressions on my machine.
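
The fallback described above can be pictured with a small sketch. This is not
the code from tas_n.c: the names tas_1 and tas_4 are hypothetical stand-ins
for libat_test_and_set_1 and a wider SIZE(libat_test_and_set), and the sketch
assumes a compiler that provides the GCC builtin __atomic_test_and_set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the byte-sized test-and-set: atomically set
   the byte and return whether it was already set.  */
static bool
tas_1 (uint8_t *p)
{
  return __atomic_test_and_set (p, __ATOMIC_SEQ_CST);
}

/* Hypothetical stand-in for a wider variant.  It has no implementation of
   its own and simply delegates to the byte-sized one, which is why the
   byte-sized object must always be linked in.  */
static bool
tas_4 (uint32_t *p)
{
  return tas_1 ((uint8_t *) p);
}
```

Because tas_4 refers to tas_1 unconditionally, leaving out the object that
defines the byte-sized variant would produce an unresolved symbol, which
mirrors the need for tas_1_2_.lo.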

[Bug c++/113598] [11/12/13/14 Regression] GCC internal compiler error since r0-124275

2024-01-25 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113598

Jason Merrill  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

[Bug c++/113598] [11/12/13/14 Regression] GCC internal compiler error since r0-124275

2024-01-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113598

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug c++/113598] [11/12/13/14 Regression] GCC internal compiler error since r0-124275

2024-01-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113598

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[11/12/13/14 Regression]|[11/12/13/14 Regression]
   |GCC internal compiler error |GCC internal compiler error
   ||since r0-124275
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Started with r0-124275-g16b53405ad2baba783cf7ecf34a623fd64db2dda aka PR57402
fix.

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Keywords||missed-optimization

[Bug c++/113598] [11/12/13/14 Regression] GCC internal compiler error

2024-01-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113598

Andrew Pinski  changed:

   What|Removed |Added

Summary|GCC internal compiler error |[11/12/13/14 Regression]
   ||GCC internal compiler error
  Known to work||4.4.7, 4.8.1, 4.8.5
  Known to fail||4.9.0, 5.1.0
   Target Milestone|--- |11.5

--- Comment #2 from Andrew Pinski  ---
Reduced testcase:
```
struct Cpu
{
int op_nop();
};
typedef int(Cpu::*OpCode)();
void f()
{
  new OpCode[256]{&Cpu::op_nop};
}
```

[Bug target/111677] [12/13/14 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-25 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

--- Comment #20 from Alex Coplan  ---
I think the testcase in #c10 went latent on the 13 branch but the following
(reduced from the attachment) still ICEs on the tip of the 13 branch with
-Ofast -fopenmp -fstack-protector-strong:

typedef struct {
  long size_z;
  int width;
} dt_bilateral_t;
typedef float dt_aligned_pixel_t[4];
#pragma omp declare simd
void dt_bilateral_splat(dt_bilateral_t *b) {
  float *buf;
  long offsets[8];
  for (; b;) {
int firstrow;
for (int j = firstrow; j; j++)
  for (int i; i < b->width; i++) {
dt_aligned_pixel_t contrib;
for (int k = 0; k < 4; k++)
  buf[offsets[k]] += contrib[k];
  }
float *dest;
for (int j = (long)b; j; j++) {
  float *src = (float *)b->size_z;
  for (int i = 0; i < (long)b; i++)
dest[i] += src[i];
}
  }
}

[Bug middle-end/112971] [14] RISC-V rv64gcv_zvl256b vector -O3: internal compiler error: Segmentation fault signal terminated program cc1

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112971

--- Comment #23 from GCC Commits  ---
The master branch has been updated by Robin Dapp :

https://gcc.gnu.org/g:660e17f00658b68115282e6de38243e3c6cc1ee2

commit r14-8430-g660e17f00658b68115282e6de38243e3c6cc1ee2
Author: Robin Dapp 
Date:   Mon Jan 15 16:23:30 2024 +0100

fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].

Found in PR112971 this patch adds folding support for bitwise operations
of const duplicate zero/one vectors with stepped vectors.
On riscv we have the situation that a folding would perpetually continue
without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would
not be folded to {0, 0, 0, ...}.

gcc/ChangeLog:

PR middle-end/112971

* fold-const.cc (simplify_const_binop): New function for binop
simplification of two constant vectors when element-wise
handling is not necessary.
(const_binop): Call new function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112971.c: New test.
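
The identity that the new folding relies on can be illustrated element-wise in
plain C. This is only a sketch of the algebra (x & 0 == 0 and x & ~0 == x for
every lane), not the code in fold-const.cc; the function names and the fixed
lane count are hypothetical.

```c
#include <stdint.h>

#define LANES 8  /* hypothetical fixed lane count for the illustration */

/* Fill v with the stepped vector {base, base + step, base + 2*step, ...}.  */
static void
make_stepped (int32_t *v, int32_t base, int32_t step)
{
  for (int i = 0; i < LANES; i++)
    v[i] = base + i * step;
}

/* Element-wise AND of a stepped vector with a duplicated constant.  */
static void
and_with_dup (int32_t *out, const int32_t *stepped, int32_t dup)
{
  for (int i = 0; i < LANES; i++)
    out[i] = stepped[i] & dup;
}

/* Return 1 if every lane of v equals value, 0 otherwise.  */
static int
all_lanes_equal (const int32_t *v, int32_t value)
{
  for (int i = 0; i < LANES; i++)
    if (v[i] != value)
      return 0;
  return 1;
}
```

After make_stepped (s, 7, -1) and and_with_dup (r, s, 0), every lane of r is
zero, so the whole operation can be replaced by the duplicated zero vector
without looking at individual lanes.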

[Bug testsuite/113558] [14 regression] gcc.dg/vect/vect-outer-4c-big-array.c etc. FAIL

2024-01-25 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113558

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Robin Dapp :

https://gcc.gnu.org/g:90880e117aa70a5ecd9b7df4381410c2ea0dcfdb

commit r14-8429-g90880e117aa70a5ecd9b7df4381410c2ea0dcfdb
Author: Robin Dapp 
Date:   Tue Jan 23 12:44:20 2024 +0100

testsuite/vect: Add target checks to refined patterns.

On Solaris/SPARC several vector tests appeared to be regressing.  They
were never vectorized but the checks before r14-3612-ge40edf64995769
would match regardless if a loop was actually vectorized or not.
The refined checks only match a successful vectorization attempt
but are run unconditionally.  This patch adds target checks to them.

gcc/testsuite/ChangeLog:

PR testsuite/113558

* gcc.dg/vect/no-scevccp-outer-7.c: Add target check.
* gcc.dg/vect/vect-outer-4c-big-array.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-s16a.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-s8a.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-s8b.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-u16b.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-u8a.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-u8b.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-1a.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-1b-big-array.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-1c-big-array.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-2a.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-2b-big-array.c: Ditto.
* gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Ditto.

[Bug c++/113598] GCC internal compiler error

2024-01-25 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113598

Marek Polacek  changed:

   What|Removed |Added

   Keywords|ice-on-invalid-code |ice-on-valid-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-25
 Ever confirmed|0   |1
 CC||mpolacek at gcc dot gnu.org

--- Comment #1 from Marek Polacek  ---
All the other compilers accept this.

Even 4.9 ICEs for me though, so it looks like an old problem.

[Bug analyzer/112969] -Wanalyzer-exposure-through-uninit-copy false positive seen on Linux kernel's drivers/net/ethernet/intel/ice/ice_ptp.c

2024-01-25 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112969

--- Comment #3 from David Malcolm  ---
Should be fixed on trunk for gcc 14 by the above patch.

Keeping open to track backporting this to other branches.
